
NOAA Cloud Optimized Zarr Reference Files (Kerchunk)

Tags: climate, coastal, disaster response, environmental, meteorological, oceans, water, weather

Description

This repository contains references to datasets published to the NOAA Open Data Dissemination Program. These reference files serve as indexes to the original data by mapping it to the Zarr V2 specification. When multidimensional model output is read through Zarr, data can be loaded lazily (i.e., only the chunks needed for processing are retrieved), and reads can be scaled horizontally to improve object storage read performance.
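As a minimal sketch of what lazy loading can look like in practice, the Python below assumes the reference files are kerchunk JSON references that fsspec's `reference` filesystem can read, and that `fsspec`, `s3fs`, `xarray`, and a `zarr` release that reads the Zarr V2 format are installed. The helper names and the `reference_url` argument are hypothetical; substitute a real object URL from the bucket.

```python
def reference_fs_options(reference_url: str) -> dict:
    """Build keyword arguments for fsspec's "reference" filesystem.

    Assumption: the reference file and the source data it points to both
    live in public S3 buckets, so anonymous access is requested for each.
    """
    return {
        "fo": reference_url,                # the kerchunk reference file
        "target_protocol": "s3",
        "target_options": {"anon": True},   # how to read the reference file
        "remote_protocol": "s3",
        "remote_options": {"anon": True},   # how to read the referenced chunks
    }


def open_lazily(reference_url: str):
    """Open a reference file as an xarray Dataset without reading the data."""
    # Imported here so the option builder above works without these packages.
    import fsspec
    import xarray as xr

    fs = fsspec.filesystem("reference", **reference_fs_options(reference_url))
    # Only metadata is read at this point; array chunks are fetched from the
    # source bucket when values are actually accessed.
    return xr.open_dataset(fs.get_mapper(""), engine="zarr", consolidated=False)
```

Selecting a single time step or a small spatial subset from the returned dataset should then retrieve only the chunks that overlap the selection. Exact behavior may vary with the installed `zarr` and `xarray` versions.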

The process used to optimize the data is called kerchunk. RPS runs the workflow in their AWS cloud environment every time a new data notification is received from a relevant source data bucket.

The following datasets are currently being cloud-optimized. Refer to each dataset's page for file naming conventions and other information about the specific model implementation:

  • NOAA Operational Forecast System (OFS)
  • NOAA Global Real-Time Ocean Forecast System (Global RTOFS)
  • NOAA National Water Model Short-Range Forecast

Filenames follow the source dataset’s conventions. For example, if the source file is
nos.dbofs.fields.f024.20240527.t00z.nc

then the cloud-optimized filename is the same, with “.zarr” appended:
nos.dbofs.fields.f024.20240527.t00z.nc.zarr
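The naming rule above can be sketched in a few lines of Python. The function names are hypothetical and only illustrate the suffix convention; they say nothing about the directory layout inside the bucket.

```python
REFERENCE_SUFFIX = ".zarr"


def reference_filename(source_filename: str) -> str:
    """Return the cloud-optimized filename for a source filename."""
    return source_filename + REFERENCE_SUFFIX


def source_filename(reference_name: str) -> str:
    """Recover the source filename from a cloud-optimized filename."""
    if not reference_name.endswith(REFERENCE_SUFFIX):
        raise ValueError(f"not a reference filename: {reference_name!r}")
    return reference_name[: -len(REFERENCE_SUFFIX)]


print(reference_filename("nos.dbofs.fields.f024.20240527.t00z.nc"))
# prints nos.dbofs.fields.f024.20240527.t00z.nc.zarr
```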

Data Aggregations
We also produce virtual aggregations that group an entire forecast model run, as well as the “best” available forecast.

  • Best Forecast (continuously updated): nos.dbofs.fields.best.nc.zarr
  • Full Model Run: nos.dbofs.fields.forecast.[YYYYMMDD].t[CC]z.nc.zarr

where

  • CC is the model run cycle: 00, 06, 12, 18 or 03, 09, 15, 21, for nowcast and forecast runs
  • YYYY = year, MM = month, DD = day
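The aggregation names can be built programmatically. The sketch below follows the DBOFS pattern shown above. Treating the model identifier as a parameter is an assumption; other models may use different prefixes, so check the source dataset pages before generalizing. The function names are hypothetical.

```python
from datetime import date

# Model run cycles listed above (hours, UTC).
VALID_CYCLES = {0, 6, 12, 18, 3, 9, 15, 21}


def best_forecast_name(model: str = "dbofs") -> str:
    """Name of the continuously updated "best" forecast aggregation."""
    return f"nos.{model}.fields.best.nc.zarr"


def full_model_run_name(run_date: date, cycle: int, model: str = "dbofs") -> str:
    """Name of the aggregation covering one full model run."""
    if cycle not in VALID_CYCLES:
        raise ValueError(f"unexpected model run cycle: {cycle}")
    return f"nos.{model}.fields.forecast.{run_date:%Y%m%d}.t{cycle:02d}z.nc.zarr"


print(full_model_run_name(date(2024, 5, 27), 0))
# prints nos.dbofs.fields.forecast.20240527.t00z.nc.zarr
```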

Cloud optimization workflows are supported by [RPS Group](https://www.rpsgroup.com/services/oceans-and-coastal/metocean-science-and-technology/), a Tetra Tech Company.

Update Frequency

Optimizations run every time new data is uploaded to the source buckets, and the resulting reference files are available here within minutes.
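Consumers who want to react to new reference files as they arrive could subscribe to the SNS topic listed under Resources on AWS. The sketch below only parses a notification. It assumes the topic publishes standard Amazon S3 event notifications, which this page does not confirm, and that the caller has already extracted the SNS `Message` string from whatever delivered it (an SQS queue, Lambda, or an HTTPS endpoint). The function name and the sample key are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def new_object_urls(sns_message: str) -> list:
    """Extract s3:// URLs from an S3 event notification carried by SNS.

    `sns_message` is the JSON text found in the SNS "Message" field.
    """
    event = json.loads(sns_message)
    urls = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        urls.append(f"s3://{bucket}/{key}")
    return urls


# A hand-written sample message (not real data) in the S3 event format.
sample = json.dumps({
    "Records": [{
        "s3": {
            "bucket": {"name": "noaa-nodd-kerchunk-pds"},
            "object": {"key": "example/nos.dbofs.fields.best.nc.zarr"},
        }
    }]
})
print(new_object_urls(sample))
# prints ['s3://noaa-nodd-kerchunk-pds/example/nos.dbofs.fields.best.nc.zarr']
```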

License

Open Data. There are no restrictions on the use of this data.

Documentation

Refer to the source datasets' documentation.

Managed By

NOAA's National Ocean Service, the Integrated Ocean Observing System (IOOS)


Contact

For questions regarding data content or quality, email the Tetra Tech team.
For any questions regarding data delivery or any general questions regarding the NOAA Open Data Dissemination (NODD) Program, email the NODD Team at nodd@noaa.gov.
We also seek to identify case studies on how NOAA data is being used and will feature those stories in joint publications and at upcoming events. If you are interested in seeing your story highlighted, please share it with the NODD team by emailing nodd@noaa.gov.

How to Cite

NOAA Cloud Optimized Zarr Reference Files (Kerchunk) was accessed on DATE from https://registry.opendata.aws/noaa-nodd-kerchunk.

Resources on AWS

  • Description: Cloud-optimized Zarr Reference Files
    Resource type: S3 Bucket
    Amazon Resource Name (ARN): arn:aws:s3:::noaa-nodd-kerchunk-pds
    AWS Region: us-east-1
    AWS CLI Access (No AWS account required): aws s3 ls --no-sign-request s3://noaa-nodd-kerchunk-pds/
  • Description: New data notifications for Cloud-optimized Zarr Reference Files
    Resource type: SNS Topic
    Amazon Resource Name (ARN): arn:aws:sns:us-east-1:123901341784:NewNODDKerchunkObject
    AWS Region: us-east-1
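For readers without the AWS CLI, the bucket can also be listed anonymously over HTTPS. The sketch below uses only the Python standard library and the S3 ListObjectsV2 REST call. It assumes the bucket permits unsigned list requests, as the `--no-sign-request` command above suggests. The function names are hypothetical, and pagination beyond the first page of results is not handled.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET = "noaa-nodd-kerchunk-pds"
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def list_url(prefix: str = "", max_keys: int = 100) -> str:
    """Build an unsigned ListObjectsV2 request URL for the bucket."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"https://{BUCKET}.s3.amazonaws.com/?{query}"


def parse_keys(xml_text: str) -> list:
    """Extract object keys from a ListObjectsV2 XML response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("s3:Contents/s3:Key", S3_NS)]


def list_keys(prefix: str = "", max_keys: int = 100) -> list:
    """Fetch one page of object keys under `prefix` (requires network access)."""
    with urllib.request.urlopen(list_url(prefix, max_keys)) as response:
        return parse_keys(response.read().decode("utf-8"))
```

Calling `list_keys()` with no arguments returns up to 100 keys from the bucket root; pass a prefix to narrow the listing once the bucket layout is known.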
