DocNLI: A Large-scale Dataset for Document-level Natural Language Inference

This repo contains the data, source code, and pretrained (DocNLI) model for our ACL'21 paper.

Requirements

HuggingFace Transformers
PyTorch

DocNLI Dataset download

https://drive.google.com/file/d/16TZBTZcb9laNKxIvgbs5nOBgq3MhND5s/view?usp=sharing

In addition to the DocNLI dataset, we also release three other datasets (in the "Data" folder) used in the paper:

"binary FEVER data for NLI"
"binary MNLI"
"NLI-version of MCTest".

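For illustration, a binary NLI set is usually derived from a three-way one by keeping "entailment" and merging "neutral" and "contradiction" into "not_entailment". The sketch below shows that mapping in Python. The label strings, the "label" field name, and the to_binary helper are assumptions made for this example and may not match the format of the released files.

```python
from typing import Dict, List

# Assumed label names; the released files may use different strings.
BINARY_LABEL: Dict[str, str] = {
    "entailment": "entailment",
    "neutral": "not_entailment",
    "contradiction": "not_entailment",
}


def to_binary(examples: List[dict]) -> List[dict]:
    """Return copies of the examples with three-way labels collapsed to two.

    Each example is assumed to be a dict with a "label" key (hypothetical
    schema); all other keys are carried over unchanged.
    """
    return [{**ex, "label": BINARY_LABEL[ex["label"]]} for ex in examples]


if __name__ == "__main__":
    sample = [
        {"premise": "A man is sleeping.", "hypothesis": "A person rests.", "label": "entailment"},
        {"premise": "A man is sleeping.", "hypothesis": "A man is running.", "label": "contradiction"},
    ]
    for ex in to_binary(sample):
        print(ex["label"])
```
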
Pretrained RoBERTa Model download

https://drive.google.com/file/d/12kNONo0jgktxU0vWtV3Z2ZrCrB3DJPVj/view?usp=sharing

Contact

Wenpeng Yin (mr.yinwenpeng@gmail.com or wyin@salesforce.com)