RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge
Bill Yuchen Lin, Ziyi Wu, Yichi Yang, Dong-Ho Lee, Xiang Ren
In Proceedings of ACL 2021 Findings
"The essence of a riddle is to express true facts under impossible combinations." - Aristotle, Poetics (350 BCE)
Abstract
Question: I have five fingers but I am not alive. What am I? Answer: a glove.
Answering such a riddle-style question is a challenging cognitive process: it requires complex commonsense reasoning, an understanding of figurative language, and counterfactual reasoning skills, all of which are important for advanced natural language understanding (NLU). However, there are currently no dedicated datasets that test these abilities. Herein, we present RiddleSense, a new multiple-choice question answering task, which comes with the first large dataset (5.7k examples) for answering riddle-style commonsense questions. We systematically evaluate a wide range of models on the challenge and find a large gap between the best supervised model and human performance, suggesting intriguing future research in the direction of higher-order commonsense reasoning and linguistic creativity towards building advanced NLU systems.
Interactive Examples
- Example 1: I am black when you buy me, red when you use me. When I turn white, you know it's time to throw me away. What am I?
- Example 2: I have a long tail that I let fly. Every time I go through a gap, I leave a bit of my tail in the trap. What am I?
- Example 3: If you take off my skin, I will not cry, but you will. What am I?
- Example 4: What is that which, though black itself, enlightens the world without burning?
- Example 5: I have hundreds of legs, but I can only lean. What am I?
Dataset Format
Please download our dataset by filling out the form here; the link will appear once you have read the disclaimer and submitted the form. There are five files:
- rs_train.jsonl (3,510 lines): The training data of RiddleSense.
- csqa_train.jsonl (9,741 lines): The training data of CommonsenseQA.
- csqa_rs_train.jsonl (13,251 lines): The training data of CommonsenseQA + RiddleSense, i.e., the combination of both.
- rs_dev.jsonl (1,021 lines): The development data of RiddleSense.
- rs_test_hidden.jsonl (1,184 lines): The test data of RiddleSense, where the true answers are hidden.
{ # a particular line in our jsonl file
  "id": "c1235zcx90023230",
  "question": {
    "stem": "My life can be measured in hours. I serve by being devoured. Thin, I am quick. Fat, I am slow. Wind is my foe. What am I?", # The riddle question.
    "choices": [
      {"label": "A", "text": "paper"},
      {"label": "B", "text": "candle"}, # the correct answer
      {"label": "C", "text": "lamp"},
      {"label": "D", "text": "clock"},
      {"label": "E", "text": "worm"}
    ]
  },
  "answerKey": "B" # this will be "hidden" in the test data.
}
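A record in this format can be read with Python's standard library. The sketch below assumes that each line of the released files is plain JSON (without the `#` annotations shown in the example above). The names `parse_riddle` and `load_riddles` are hypothetical helpers for illustration, not part of any released code.

```python
import json
from typing import Dict, Iterator, Optional


def parse_riddle(line: str) -> Dict[str, object]:
    """Flatten one JSONL line into id, stem, choices (label -> text), and answer."""
    record = json.loads(line)
    question = record["question"]
    choices: Dict[str, str] = {c["label"]: c["text"] for c in question["choices"]}
    # The answer key is hidden in the test split, so tolerate a missing field.
    answer: Optional[str] = record.get("answerKey")
    return {
        "id": record["id"],
        "stem": question["stem"],
        "choices": choices,
        "answer": answer,
    }


def load_riddles(path: str) -> Iterator[Dict[str, object]]:
    """Yield one parsed riddle per non-empty line of a .jsonl file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield parse_riddle(line)


# Demonstration on the example record from this page (no file needed).
sample_line = json.dumps({
    "id": "c1235zcx90023230",
    "question": {
        "stem": "My life can be measured in hours. I serve by being devoured. "
                "Thin, I am quick. Fat, I am slow. Wind is my foe. What am I?",
        "choices": [
            {"label": "A", "text": "paper"},
            {"label": "B", "text": "candle"},
            {"label": "C", "text": "lamp"},
            {"label": "D", "text": "clock"},
            {"label": "E", "text": "worm"},
        ],
    },
    "answerKey": "B",
})

riddle = parse_riddle(sample_line)
print(riddle["choices"][riddle["answer"]])  # candle
```

Once the dataset has been downloaded, `load_riddles("rs_train.jsonl")` would iterate over the training split in the same way.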
Leaderboard
Model | Submitter | Date | Training Data | Accuracy (%) |
---|---|---|---|---|
Humans | - | - | N/A | 91.33 |
UnifiedQA (T5-3B) | USC-INK | 5/30/2021 | RS+CSQA | 68.80 |
ALBERT-XXL | USC-INK | 5/30/2021 | RS+CSQA | 67.30 |
MHGRN (AB-XXL) | USC-INK | 5/30/2021 | RS+CSQA | 66.81 |
MHGRN (RoBERTa-Large) | USC-INK | 5/30/2021 | RS+CSQA | 63.73 |
RoBERTa-Large | USC-INK | 5/30/2021 | RS+CSQA | 59.82 |
KagNet (RoBERTa-Large) | USC-INK | 5/30/2021 | RS+CSQA | 59.72 |
UnifiedQA (T5-Large) | USC-INK | 5/30/2021 | RS+CSQA | 56.57 |
BERT-Large | USC-INK | 5/30/2021 | RS+CSQA | 54.91 |
BERT-Base | USC-INK | 5/30/2021 | RS+CSQA | 47.67 |
Random Guess | - | - | N/A | 20.00 |
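The accuracy figures in the leaderboard are presumably the percentage of riddles answered correctly; with five choices per question, uniform random guessing gives 20%. The following is a minimal sketch of that metric, assuming predictions and gold answers are dictionaries that map a question id to a choice label. The function name `accuracy` and this input layout are illustrative assumptions, not the official evaluation script.

```python
from typing import Dict


def accuracy(predictions: Dict[str, str], gold: Dict[str, str]) -> float:
    """Return the percentage of gold questions whose predicted label matches.

    A question with no prediction is counted as wrong.
    """
    if not gold:
        raise ValueError("gold answers must not be empty")
    correct = sum(1 for qid, label in gold.items() if predictions.get(qid) == label)
    return 100.0 * correct / len(gold)


# Toy data for illustration only; these ids and labels are made up.
gold = {"q1": "B", "q2": "A", "q3": "E", "q4": "C"}
preds = {"q1": "B", "q2": "D", "q3": "E"}  # q4 is missing and counts as wrong

print(accuracy(preds, gold))  # 50.0
```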
Submission Guide
This is an example submission file. Please submit your prediction file and information via this form.
Citation
@inproceedings{lin-etal-2021-riddlesense,
  title = "RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge",
  author = "Lin, Bill Yuchen and Wu, Ziyi and Yang, Yichi and Lee, Dong-Ho and Ren, Xiang",
  booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL-IJCNLP 2021): Findings",
  year = "2021",
  note = "to appear"
}