NumerSense is a new numerical commonsense reasoning probing task, with a diagnostic dataset consisting of 3,145 masked-word-prediction probes.

We propose to study whether numerical commonsense knowledge can be induced from pre-trained language models like BERT, and to what extent this knowledge is robust against adversarial examples. We hope this will benefit tasks such as knowledge base completion and open-domain question answering.

Links:   [Paper]   [Data]   [Github]   [INK Lab]  



Example probes by category (with each category's share of the dataset):

- Everyday Objects (35.2%): "A bicycle has [MASK] tires."
- Biology (13.5%): "Most ants have [MASK] legs."
- Geometry (11.7%): "A cube has [MASK] faces."
- Unit Conversion (6.3%): "A week is [MASK] days."
- Math (7.3%): "I will be [MASK] next year, as I am nine now."
- Physics (5.7%): "Water will freeze at [MASK] degrees centigrade."
- Geography (2.9%): "The world contains [MASK] continents."
- Others (17.5%): "There are [MASK] princes in the United States."
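To see how a masked language model handles such probes, one can ask it to rank number words for the [MASK] slot. Below is a minimal sketch using the Hugging Face transformers fill-mask pipeline with bert-base-uncased; the candidate answer list and the probe helper are illustrative assumptions, not the official NumerSense evaluation code.

    # Illustrative probing sketch (not the official NumerSense evaluator).
    from transformers import pipeline

    # Assumed candidate answer vocabulary: number words from "no"/"zero" to "ten".
    NUMBER_WORDS = ["no", "zero", "one", "two", "three", "four",
                    "five", "six", "seven", "eight", "nine", "ten"]

    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    def probe(sentence, k=3):
        # Score only the candidate number words for the [MASK] slot,
        # and return the top-k, highest probability first.
        preds = fill_mask(sentence, targets=NUMBER_WORDS)
        return [(p["token_str"], p["score"]) for p in preds[:k]]

    print(probe("A bicycle has [MASK] tires."))  # "two" should rank first
    print(probe("Most ants have [MASK] legs."))  # commonsense answer: "six"

Restricting scoring to a fixed answer vocabulary (via the pipeline's targets argument) keeps the comparison fair across models, since raw top predictions for [MASK] are often non-numeric words.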


Leaderboard: models are evaluated by hit@1, hit@2, and hit@3 accuracy on both the Core probes and the Core + Adversarial probes.
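Here hit@k presumably counts a probe as correct when the gold number word appears among the model's top-k predictions; a minimal sketch of that metric:

    def hit_at_k(ranked_answers, gold, k):
        # 1.0 if the gold number word is among the top-k ranked answers, else 0.0.
        return float(gold in ranked_answers[:k])

    # e.g., hit_at_k(["two", "four", "three"], gold="two", k=1) -> 1.0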



@inproceedings{lin2020numersense,
    author = {Bill Yuchen Lin and Seyeon Lee and Rahul Khanna and Xiang Ren},
    title = {NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models},
    booktitle = {Proceedings of EMNLP},
    year = {2020},
    note = {to appear}
}