NLP - fast.ai datasets

deep learning, natural language processing, machine learning

Description

Some of the most important datasets for NLP, with a focus on classification: IMDb, AG-News, Amazon Reviews (polarity and full), Yelp Reviews (polarity and full), DBpedia, Sogou News (Pinyin), Yahoo Answers, WikiText-2 and WikiText-103, and the ACL-2010 French-English 10^9 corpus. This is part of the fast.ai datasets collection, hosted by AWS for the convenience of fast.ai students. See the documentation link for citation and license details for each dataset.

Update Frequency

As required

License

Varies by dataset; see the documentation link

Documentation

http://course.fast.ai/datasets

Managed By

fast.ai

Contact

info@fast.ai

Resources on AWS

  • Description: Datasets
    Resource type: S3 Bucket
    Amazon Resource Name (ARN): arn:aws:s3:::fast-ai-nlp
    AWS Region: us-east-1
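
The ARN and region above are enough to work out where the bucket lives. The sketch below, using only the Python standard library, extracts the bucket name from the ARN and builds a virtual-hosted-style S3 URL. The helper names (bucket_from_arn, bucket_url, list_keys) are hypothetical and not part of any fast.ai or AWS tooling. The list_keys helper additionally assumes the bucket permits anonymous listing, which this entry does not state, so treat it as something to try rather than a guaranteed access path.

```python
import urllib.request
import xml.etree.ElementTree as ET

ARN = "arn:aws:s3:::fast-ai-nlp"   # from the resource entry above
REGION = "us-east-1"               # from the resource entry above


def bucket_from_arn(arn: str) -> str:
    """Return the bucket name from an S3 ARN (arn:aws:s3:::bucket[/key])."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "s3":
        raise ValueError(f"not an S3 ARN: {arn!r}")
    # The resource part may carry an object key after the bucket name.
    return parts[5].split("/", 1)[0]


def bucket_url(bucket: str, region: str) -> str:
    """Build a virtual-hosted-style S3 endpoint URL for the bucket."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/"


def list_keys(url: str, max_keys: int = 20) -> list:
    """List object keys via the S3 ListObjectsV2 REST call.

    Assumption: the bucket allows unauthenticated listing. If it does not,
    this raises urllib.error.HTTPError (typically 403).
    """
    ns = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}
    with urllib.request.urlopen(f"{url}?list-type=2&max-keys={max_keys}") as resp:
        root = ET.fromstring(resp.read())
    return [el.text for el in root.findall("s3:Contents/s3:Key", ns)]


bucket = bucket_from_arn(ARN)
url = bucket_url(bucket, REGION)
```

Calling list_keys(url) performs a network request, so it is left out of the module-level code. Tools such as the AWS CLI or boto3 can reach the same bucket; the standard-library version is shown only to keep the example free of dependencies.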
