amazon.science, conversation data, machine learning, natural language processing
This dataset provides additional annotations on top of the publicly released Topical-Chat dataset (https://github.com/alexa/Topical-Chat) to help reproduce the results in our paper "Policy-Driven Neural Response Generation for Knowledge-Grounded Dialogue Systems" (https://arxiv.org/abs/2005.12529?context=cs.CL). The dataset contains 5 files: train.json, valid_freq.json, valid_rare.json, test_freq.json, and test_rare.json. Each file adds two kinds of annotations to the original Topical-Chat data: dialogue act annotations and knowledge sentence annotations. The annotations were computed automatically using off-the-shelf models, which are listed in the README.txt.
Not currently being updated
https://github.com/alexa/Topical-Chat/blob/master/TopicalChatEnriched/README.md
behnam@amazon.com, karthgop@amazon.com, seokhwk@amazon.com, yangliud@amazon.com, mihaeric@amazon.com, hakkanit@amazon.com
Enriched Topical-Chat Dataset for Knowledge-Grounded Dialogue Systems was accessed on DATE from https://registry.opendata.aws/topical-chat-enriched.
arn:aws:s3:::enriched-topical-chat
us-west-2
aws s3 ls --no-sign-request s3://enriched-topical-chat/
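
For readers who prefer not to install the AWS CLI, the following is a minimal Python sketch of how the bucket name and region above map to public HTTPS URLs. It assumes that the five JSON files sit at the top level of the bucket and that the bucket allows anonymous HTTPS reads, neither of which is confirmed by this listing; run the `aws s3 ls` command above to check the actual key names first. The helper names `public_url` and `fetch_json` are hypothetical and exist only for this illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Values taken from the listing above.
BUCKET = "enriched-topical-chat"
REGION = "us-west-2"

# File names from the dataset description. ASSUMPTION: they are stored at the
# bucket root; verify with `aws s3 ls` before relying on these keys.
FILES = [
    "train.json",
    "valid_freq.json",
    "valid_rare.json",
    "test_freq.json",
    "test_rare.json",
]


def public_url(key, bucket=BUCKET, region=REGION):
    """Build a virtual-hosted-style S3 URL for an object key."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


def fetch_json(key):
    """Download one object anonymously and parse it as JSON.

    ASSUMPTION: the object is publicly readable over HTTPS.
    """
    with urlopen(public_url(key)) as response:
        return json.load(response)


if __name__ == "__main__":
    # Print the candidate URLs without downloading anything.
    for name in FILES:
        print(public_url(name))
```

Calling `fetch_json("valid_freq.json")` would then return the parsed contents of that file, provided the key exists at that path. The structure of the parsed JSON is described in the README linked above and is not assumed here.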