amazon.science, machine learning, natural language processing
This dataset provides how-to articles from wikihow.com and their summaries, each summary written as a coherent paragraph. The dataset is available at wikisum.zip and contains, for each entry, the article, the summary, the wikiHow URL, and an official fold (train, val, or test). Human evaluation results are available separately at wikisum-human-eval.zip. They consist of human evaluations of the summaries produced by the Pegasus system, the annotators' responses regarding the difficulty of the task, and the words they marked as unknown.
Not currently being updated
Dataset is published under CC-NC-SA-3.0. Human evaluation is published under CC-SA-4.0.
https://wikisum.s3.amazonaws.com/README.txt
nachshon@amazon.com, orenk@amazon.com
WikiSum: Coherent Summarization Dataset for Efficient Human-Evaluation was accessed on DATE from https://registry.opendata.aws/wikisum.
arn:aws:s3:::wikisum
us-east-1
aws s3 ls --no-sign-request s3://wikisum/
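Beyond listing the bucket, the files can be copied locally with the same anonymous access. The following is a minimal sketch, not an official download script. It assumes the object keys match the file names mentioned in the description (README.txt, wikisum.zip, wikisum-human-eval.zip) and that they sit at the bucket root. The helper names wikisum_uri and fetch_wikisum are hypothetical. Confirm the actual keys with the listing command above before relying on them.

```shell
# Sketch: fetch the WikiSum files anonymously with the AWS CLI.
# Assumption: the keys below sit at the bucket root under these names.

WIKISUM_BUCKET="s3://wikisum"
WIKISUM_REGION="us-east-1"
WIKISUM_FILES="README.txt wikisum.zip wikisum-human-eval.zip"

# Build the S3 URI for one object key (hypothetical helper).
wikisum_uri() {
  printf '%s/%s\n' "$WIKISUM_BUCKET" "$1"
}

# Download every listed file into the directory given as $1
# (defaults to the current directory). Stops at the first failure.
fetch_wikisum() {
  local dest="${1:-.}"
  local key
  mkdir -p "$dest" || return 1
  for key in $WIKISUM_FILES; do
    aws s3 cp --no-sign-request --region "$WIKISUM_REGION" \
      "$(wikisum_uri "$key")" "$dest/" || return 1
  done
}
```

After sourcing the snippet, running fetch_wikisum ./wikisum-data would attempt to copy the three files into ./wikisum-data. Nothing is downloaded until the function is called.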