
Reproducing the Results

We provide most of the models mentioned in the final results (Tables 6 and 7 in the paper). We used fairseq to implement and train the models. You can install the dependencies by executing:

pip install -r requirements.txt

Additionally, we used the luigi library for all the tasks related to preprocessing and postprocessing the data. Finally, we provide an evaluation script to compute the scores.

Dual-Source Models

Dual-source models (the vanilla dual-source transformer and dual-source RoBERTa) share the same preprocessing pipeline but use different vocabularies. We train a new SentencePiece model on all training data for the dual-source transformer and reuse the original RoBERTa vocabulary for dual-source RoBERTa.

The input to these models consists of two parts: the article and the requested properties. The properties are separated by the ### token. For instance:

William Costello Kennedy, PC (August 27, 1868 -- January 17, 1923) was a Canadian politician. Born in Ottawa, Ontario, he was first elected to the Canadian House of Commons in the riding of Essex North in the 1917 federal election as a Laurier-Liberal. He was re-elected as a Liberal in 1921. From 1921 until his death, he was the Minister of Railways and Canals in the government of William Lyon Mackenzie King.
family name ### position held ### date of birth ### member of political party ### instance of ### date of death ### occupation ### sex or gender ### given name ### country of citizenship

The first line is a Wikipedia article, and the second contains the requested properties.
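As a sketch, such an input can be assembled as follows. The helper function below is hypothetical and not part of the repository; the article and property names are taken from the example above:

```python
# Hypothetical helper (not part of the repository): build the two-line
# model input from an article and a list of requested properties.
def build_input(article: str, properties: list[str]) -> str:
    # Properties are joined with the " ### " separator on a single line
    # placed after the article text.
    return article + "\n" + " ### ".join(properties)

example = build_input(
    "William Costello Kennedy, PC (August 27, 1868 -- January 17, 1923) "
    "was a Canadian politician.",
    ["family name", "date of birth", "occupation"],
)
print(example)
```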

Evaluation of Models

We published three models trained on WikiReading Recycled:

  • T5,
  • vanilla dual-source transformer,
  • dual-source RoBERTa.

Moreover, we prepared a complete pipeline to reproduce our results. To evaluate a model, run:

PYTHONPATH=. luigi --local-scheduler --module tutorial_scripts EvaluateModelTask --model MODEL --split SPLIT

where MODEL is one of T5, DUAL_SOURCE_TRANSFORMER, or DUAL_ROBERTA_TRANSFORMER, and SPLIT is one of dev-0, test-A, or test-B.

For example:

PYTHONPATH=. luigi --local-scheduler --module tutorial_scripts EvaluateModelTask --model DUAL_ROBERTA_TRANSFORMER --split test-B

will evaluate the dual-source RoBERTa model on test-B.

You may restrict the evaluation to a diagnostic subset by adding the --subset SUBSET option, where SUBSET is one of unseen, rare, categorical, relational, exact-match, or long-articles. For example:

PYTHONPATH=. luigi --local-scheduler --module tutorial_scripts EvaluateModelTask --model DUAL_ROBERTA_TRANSFORMER --split test-B --subset rare
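To evaluate one model on every diagnostic subset, the commands can be generated with a small script. This is only a sketch: the model, split, and subset names are those listed above, and the script merely prints the commands rather than running them:

```python
# Sketch: generate one EvaluateModelTask command per diagnostic subset.
# Subset names are taken from the documentation above.
SUBSETS = ["unseen", "rare", "categorical",
           "relational", "exact-match", "long-articles"]

def evaluate_command(model: str, split: str, subset: str) -> str:
    # Same invocation as in the examples above, with --subset appended.
    return (
        "PYTHONPATH=. luigi --local-scheduler --module tutorial_scripts "
        f"EvaluateModelTask --model {model} --split {split} --subset {subset}"
    )

commands = [evaluate_command("DUAL_ROBERTA_TRANSFORMER", "test-B", s)
            for s in SUBSETS]
for cmd in commands:
    print(cmd)
```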

The results are written to results/DUAL_ROBERTA_TRANSFORMER/test-B.rare.

The schema of directories is as follows:

  • dataset: the WikiReading Recycled dataset directory
  • models: the models
  • processed: the preprocessed data
  • binarized: the binary version of preprocessed files (required by fairseq)
  • outputs: the models' raw predictions and their post-processed versions
  • results: the final results
  • fairseq_modules: implementation of the models in fairseq
  • tutorial_scripts: the pipeline implementation

In case of any problems, contact Tomasz Dwojak or report an issue.