RICA: Robust Inferences Based on Commonsense Axioms | USC/ISI

Physical (30%)

A is smaller in size than B, so A is likely [MASK] to put into a box than B.

Material (30%)

A is made out of glass and B is made out of stone, so A is [MASK] transparent than B.

Social (30%)

A makes the varsity team while B does not, so A is [MASK] skilled than B.

Temporal (10%)

A was eating dinner now, so A was probably hungry [MASK] eating dinner.

Physical-Perturbed

B is smaller in size than A, so A is likely [MASK] to put into a box than B.

Material-Perturbed

A is made out of glass and B is made out of stone, so A is not [MASK] transparent than B.

Social-Perturbed

A makes the varsity team while B does not, so B is [MASK] inexperienced than A.

Temporal-Perturbed

A was eating dinner now, so A was probably not hungry [MASK] eating dinner.
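
The probes above can be treated as a two-way choice for the masked word. The following is a minimal sketch of that idea, not the benchmark's official evaluation code: the `Probe` schema, the candidate pair ("easier", "harder"), and the `always_first` baseline are all assumptions made for illustration. It shows why a predictor that ignores the perturbation scores 50% on an original/perturbed pair whose correct answer flips.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

MASK = "[MASK]"


@dataclass(frozen=True)
class Probe:
    """One masked statement with two candidate fillers (hypothetical schema)."""
    category: str
    text: str
    candidates: Tuple[str, str]  # assumed word pair, not taken from the dataset
    answer: str

    def fill(self, word: str) -> str:
        """Return the statement with the mask replaced by `word`."""
        return self.text.replace(MASK, word, 1)


def accuracy(probes: Sequence[Probe], choose: Callable[[Probe], str]) -> float:
    """Percentage of probes for which `choose` returns the gold answer."""
    correct = sum(1 for p in probes if choose(p) == p.answer)
    return 100.0 * correct / len(probes)


PAIR = ("easier", "harder")  # illustrative candidates only
probes = [
    Probe("Physical",
          "A is smaller in size than B, so A is likely [MASK] to put into a box than B.",
          PAIR, "easier"),
    Probe("Physical-Perturbed",
          "B is smaller in size than A, so A is likely [MASK] to put into a box than B.",
          PAIR, "harder"),
]


def always_first(probe: Probe) -> str:
    """Baseline that ignores the statement and picks the first candidate."""
    return probe.candidates[0]


print(probes[0].fill("easier"))
print(accuracy(probes, always_first))  # 50.0
```

A real model would replace `always_first` with a function that scores each filled statement and returns the higher-scoring candidate; the accuracy computation stays the same.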

Submit to this leaderboard: Email your predictions to peiz@usc.edu with the subject line "RICA submission (your model name)", using the same format as this example prediction file.

RICA-Zero Shot

Rank	Model	Average Accuracy (%)
	Human Performance	91.7
1	RoBERTa-Large (Liu et al., 2019)	50.3
2	ERNIE (Zhang et al., 2019)	50.2
3	BART (Lewis et al., 2019)	50.2
4	GPT-2 (Radford et al., 2019)	50.1
5	BERT-Large (Devlin et al., 2018)	49.4

RICA-Finetuned

Rank	Model	Average Accuracy (%)
	Human Performance	91.7
1	RoBERTa-Large (Liu et al., 2019)	52.3
2	BART (Lewis et al., 2019)	50.2
3	ERNIE (Zhang et al., 2019)	50.1
4	GPT-2 (Radford et al., 2019)	50.1
5	BERT-Large (Devlin et al., 2018)	49.9


@inproceedings{zhou2021rica,
  title={RICA: Evaluating Robust Inference Capabilities Based on Commonsense Axioms},
  author={Zhou, Pei and Khanna, Rahul and Lee, Seyeon and Lin, Bill Yuchen and Ho, Daniel and Pujara, Jay and Ren, Xiang},
  booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
  pages={7560--7579},
  year={2021}
}
