HRL data lifecycle overview

HRL structures data governance and reproducible, collaborative practices around a program-specific data lifecycle. The data lifecycle outlines key phases that data pass through from initial collection in the field to final reporting and communication of results. Each phase includes defined activities, responsible parties, and governance touchpoints to ensure integrity, transparency, and usability of data and analyses throughout the HRL program.

Figure: HRL data lifecycle diagram

The HRL Science Committee oversees the data lifecycle and governance model but does not directly implement any lifecycle phase. Instead, it provides strategic direction, prioritization, and oversight to ensure that Data Producers, the Central Data Team, and Synthesis Teams are equipped to uphold HRL Science Program commitments and collaborate effectively.

Lifecycle phases at a glance

| Lifecycle phase | Summary | Responsible entity |
|---|---|---|
| Collection | Collect data and document approved protocols, metadata, and field quality assurance and quality control practices. | Data Producers |
| Static Publication | Publish cleaned datasets with metadata packages and DOIs through the HRL GitHub + EDI workflow. | Data Producers |
| Ingestion and Standardization | Harmonize static data releases, incorporate external datasets, align schemas and vocabularies, conduct program-level quality control, and capture provenance. | Central Data Team |
| Storage and Serving | Keep curated datasets durable, discoverable, and accessible via catalogs, APIs, and SDKs. | Central Data Team |
| Analysis and Synthesis | Run reproducible analyses, create models and indicators, and return model-derived data products to the Central Data Team for curation. | Synthesis Teams |
| Reporting and Communication | Translate findings into synthesis reports, decision support tools, visualization applications, and other science communication products that guide adaptive management and make HRL science outcomes accessible. | Central Data Team; Synthesis Teams |
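
For readers who want to reference the lifecycle programmatically (for example, in a catalog or validation script), the table above could be encoded as a small ordered structure. The following is only an illustrative sketch: the names `Entity`, `Phase`, `LIFECYCLE`, and `phases_for` are hypothetical and are not part of any HRL tooling; only the phase names and responsible entities are taken from the table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Entity(Enum):
    """Responsible entities named in the lifecycle table."""
    DATA_PRODUCERS = "Data Producers"
    CENTRAL_DATA_TEAM = "Central Data Team"
    SYNTHESIS_TEAMS = "Synthesis Teams"


@dataclass(frozen=True)
class Phase:
    """One lifecycle phase and the entities responsible for it (hypothetical type)."""
    name: str
    responsible: Tuple[Entity, ...]


# Phases in the order they appear in the table, from field collection to reporting.
LIFECYCLE: Tuple[Phase, ...] = (
    Phase("Collection", (Entity.DATA_PRODUCERS,)),
    Phase("Static Publication", (Entity.DATA_PRODUCERS,)),
    Phase("Ingestion and Standardization", (Entity.CENTRAL_DATA_TEAM,)),
    Phase("Storage and Serving", (Entity.CENTRAL_DATA_TEAM,)),
    Phase("Analysis and Synthesis", (Entity.SYNTHESIS_TEAMS,)),
    Phase(
        "Reporting and Communication",
        (Entity.CENTRAL_DATA_TEAM, Entity.SYNTHESIS_TEAMS),
    ),
)


def phases_for(entity: Entity) -> List[str]:
    """Return the names of the phases an entity is responsible for, in lifecycle order."""
    return [phase.name for phase in LIFECYCLE if entity in phase.responsible]
```

Calling `phases_for(Entity.SYNTHESIS_TEAMS)` would list both the analysis phase and the shared reporting phase, which reflects that Reporting and Communication has two responsible entities. The HRL Science Committee is deliberately absent from this sketch because it oversees the lifecycle rather than implementing any phase.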