End of Term Web Archive Dataset

archives internet natural language processing web archive


The End of Term Web Archive (EOT) captures and saves U.S. Government websites at the end of presidential administrations. The EOT has thus far preserved websites from administration changes in 2008, 2012, 2016, and 2020. Data from these web crawls have been made openly available in several formats in this dataset.

Update Frequency

Every four years, after a U.S. presidential election


There are no restrictions on the use, access, and/or download of data from the End of Term Web Archive Dataset. We request that you cite the End of Term Web Archive project when using the data provided from this dataset.

Creative Commons Zero


Managed By

End of Term Web Archive



Mark Phillips, Sawood Alam

How to Cite

End of Term Web Archive Dataset was accessed on DATE from

Usage Examples


Resources on AWS

  • Description
    Web Archive Crawl Data (WARC and ARC formats)
    Resource type
    S3 Bucket
    Amazon Resource Name (ARN)
    AWS Region
    AWS CLI Access (No AWS account required)
    aws s3 ls --no-sign-request s3://eotarchive/
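The same anonymous listing can be sketched in Python using only the standard library, for readers who do not have the AWS CLI installed. This is a minimal sketch, not an official client: it assumes the bucket allows unauthenticated listing over HTTPS (which the `--no-sign-request` command above implies) and that it answers at the generic virtual-hosted endpoint `https://eotarchive.s3.amazonaws.com/`. The helper names `build_list_url`, `parse_listing`, and `list_bucket` are illustrative and not part of any AWS library.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Bucket name taken from the CLI command above.
BUCKET = "eotarchive"
# XML namespace used by S3 list responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def build_list_url(bucket, prefix="", delimiter="/", max_keys=1000):
    """Build an unsigned S3 ListObjectsV2 URL (assumed endpoint style)."""
    query = urllib.parse.urlencode({
        "list-type": "2",
        "prefix": prefix,
        "delimiter": delimiter,
        "max-keys": str(max_keys),
    })
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


def parse_listing(xml_text):
    """Return (sub-prefixes, [(key, size_in_bytes), ...]) from a list response."""
    root = ET.fromstring(xml_text)
    prefixes = [
        el.findtext(f"{S3_NS}Prefix")
        for el in root.findall(f"{S3_NS}CommonPrefixes")
    ]
    keys = [
        (el.findtext(f"{S3_NS}Key"), int(el.findtext(f"{S3_NS}Size")))
        for el in root.findall(f"{S3_NS}Contents")
    ]
    return prefixes, keys


def list_bucket(bucket=BUCKET, prefix=""):
    """Fetch one page of the listing (at most 1000 entries, no pagination)."""
    with urllib.request.urlopen(build_list_url(bucket, prefix)) as resp:
        return parse_listing(resp.read())


if __name__ == "__main__":
    # Requires network access; prints the top-level "directories" and files.
    top_prefixes, top_keys = list_bucket()
    for p in top_prefixes:
        print("PRE", p)
    for key, size in top_keys:
        print(size, key)
```

The sketch reads only the first page of results. A full listing would need to follow the continuation token that S3 returns when a response is truncated.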
