The Public Utility Data Liberation Project (PUDL) provides analysis-ready energy system data to climate advocates,
researchers, policymakers, and journalists.
PUDL is an open source data processing pipeline that makes US energy data easier to access and use programmatically. US government agencies publish hundreds of gigabytes of valuable data, but they are often difficult to work with. PUDL takes the original spreadsheets, CSV files, and databases and turns them into a unified resource. This allows users to spend more time on novel analysis and less time on data preparation.
This information allows users to explore the operating costs of individual power plants, and see how fuel costs impact the viability of different types of generation. It can highlight the competitiveness of renewable electricity in the market today. It can show how the generation mix of different utilities has evolved over time, and how the usage of individual power plants has changed as fuel prices have changed and more renewable generation has been brought online.
The data hosted on Amazon Web Services is intended to be accessed through the PUDL Intake Catalog. The catalog gives users a uniform API for each data type (Parquet, SQL), handles local caching, and provides rich metadata about the data.
The federal agencies that publish the raw data PUDL processes release new data monthly, quarterly, and yearly. PUDL continuously improves the data and aims to release new versions monthly.
The PUDL data and documentation are published under the Creative Commons Attribution License v4.0 (CC-BY-4.0).
See all datasets managed by Catalyst Cooperative.
Public Utility Data Liberation Project was accessed on
DATE from https://registry.opendata.aws/catalyst-cooperative-pudl.
aws s3 ls --no-sign-request s3://intake.catalyst.coop/
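
The anonymous `aws s3 ls` command above can also be approximated from Python without installing the AWS CLI. The following is a minimal sketch using only the standard library and the public S3 `ListObjectsV2` REST call. It assumes the bucket permits unsigned listing (as `--no-sign-request` implies) and that it is reachable through the path-style endpoint `https://s3.amazonaws.com`; the endpoint, the function names, and the default parameters are illustrative assumptions, not part of PUDL.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by S3 list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def listing_url(bucket, prefix="", endpoint="https://s3.amazonaws.com"):
    """Build an unsigned ListObjectsV2 URL (path-style; endpoint is an assumption)."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "delimiter": "/", "prefix": prefix}
    )
    return f"{endpoint}/{bucket}?{query}"


def parse_listing(xml_text):
    """Return (prefixes, keys) found in a ListObjectsV2 XML response."""
    root = ET.fromstring(xml_text)
    prefixes = [e.text for e in root.findall("s3:CommonPrefixes/s3:Prefix", S3_NS)]
    keys = [e.text for e in root.findall("s3:Contents/s3:Key", S3_NS)]
    return prefixes, keys


def list_bucket(bucket, prefix=""):
    """Fetch and parse one page of the listing (requires network access)."""
    with urllib.request.urlopen(listing_url(bucket, prefix), timeout=30) as resp:
        return parse_listing(resp.read().decode("utf-8"))
```

Calling `list_bucket("intake.catalyst.coop")` should return the top-level prefixes and objects, roughly what the CLI command prints. If the bucket lives in a region that rejects this endpoint, a regional endpoint would need to be passed to `listing_url` instead. The sketch reads only the first page of results and does not follow continuation tokens.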