NOAA Environmental Data Management Framework

"NOAA is, at its foundation, an environmental information generating organization. Fundamental to ensuring that the wealth of environmental information generated by NOAA is effectively utilized now and for the long-term is an increased focus on information management standards and strategies to improve access, interoperability, and usability."

- From NOAA's Next Generation Strategic Plan (2010)

Revision History

Version  Date         Editor                         Description
0.1      2012 Sep 7   Dr. Jeff de La Beaujardière,   First Draft sent for Committee review (EDMC, OSC, GISC, EAC) and others.
                      NOAA Data Mgmt Architect
0.2      2012 Oct 31  "                              Second draft for Committee review
0.3      2012 Nov 30  "                              Third draft for NOSC and CIO Council review
0.4      2013 Jan 24  "                              Fourth draft for NEP/NEC review
1.0      2013 Mar 14  "                              Version 1.0 for presentation to SAB

Acknowledgements

This document was improved thanks to comments received from the following reviewers:

Anne Ball, NOS/CSC
Charles Baker, NESDIS
Julie Bosch, NESDIS/NODC
Deirdre Byrne, NESDIS/NODC
Leo Carling, OMAO
Kenneth Casey, NESDIS/NODC
Jennifer Clapp, NESDIS/IIAD
Don Collins, NESDIS/NODC
Jihong Dai, NMFS/ST
Darien Davis et al., OAR
Nancy DeFrancesco, NESDIS/CID
Larry Goldberg, NMFS
Peter Grimm, NESDIS/CID
Ted Habermann, NESDIS/NGDC
Karl Hampton, NESDIS/OSPO
Michelle Hertzfeld, NESDIS/IIAD
Paul Hirschberg, PPI
Christina Horvat, NWS
Thomas Karl, NESDIS/NCDC
Eric Kihn, NESDIS/NGDC
Tony Lavoi, OCIO/GIO
Clark Lind, NESDIS/NCDC
Roy Mendelssohn, NMFS
Lewis McCulloch, NESDIS/TPIO
Ana Pinheiro Privette, NESDIS/NCDC
Nancy Ritchey, NESDIS/NCDC
Steve Rutz, NESDIS/NODC
Jim Sargent et al., NMFS
Rebecca Shuford, NMFS
Matt Seybold, NESDIS/OSPO

Executive Summary

This Environmental Data Management Framework defines and categorizes the policies, requirements, activities, and technical considerations relevant to the management of observational data and derived products by the US National Oceanic and Atmospheric Administration (NOAA). These data are an irreplaceable national resource that must be well-documented, discoverable, accessible, and preserved for future use. This Framework recommends that environmental data management (EDM) activities be coordinated across the agency, properly defined, and adequately resourced in order to ensure the usability, quality, and preservation of NOAA data.

The NOAA EDM Framework includes Principles, Governance, Resources, Standards, Architecture, and Assessment that apply broadly to many classes of data. The concept of the Data Lifecycle is introduced and separated into planning and production, data management, and data usage activities. Relevant NOAA policies, procedures, and groups are highlighted. Specific recommendations are enumerated in an Appendix.

The EDM Framework was developed in response to a recommendation from NOAA's Science Advisory Board (SAB) at their Spring 2012 meeting.* The transmittal letter from SAB Chair Raymond J. Ban to NOAA Administrator Dr. Jane Lubchenco refers to "the urgent need to establish a NOAA-wide Environmental Data Management Framework... that incorporates both access and archive elements of data management" in order to "integrate disparate environmental data management initiatives into an enterprise-wide environmental data management system meeting NOAA’s critical mission requirements as well as those of its constituents and users, over the long term."

* http://www.sab.noaa.gov/Reports/Reports.html

1. Introduction

1.1. Motivation

Accurate, timely, and comprehensive observations of the Earth and its surrounding space are critical to support government decisions and policies, scientific research, and the economic, environmental, and public health of the United States. Earth observations are typically produced for one specific purpose, sometimes at great cost, but are often useful for other purposes as well. It is important that these observations be managed and preserved such that all potential users can find, evaluate, understand, and utilize these data. The range of scientific and observation efforts at NOAA, and the resulting magnitude of data collections and diversity of data types, require a systematic approach to data management that is broadly applicable yet can be tailored to particular needs.

This document establishes a conceptual Environmental Data Management (EDM) Framework of policies, organizational practices, and technical considerations to support effective and continuing access to Earth observations and derived products. The EDM Framework clarifies the expectations and requirements for NOAA projects and personnel involved in the funding, collection, processing, stewardship, and dissemination of environmental data. The goals of the Framework are (1) to promote a common understanding of data management policies and activities across NOAA, (2) to maximize the likelihood that environmental data are discoverable, accessible, well-documented, and preserved for future use, and (3) to encourage the development and use of uniform tools and practices across NOAA for handling environmental data. This Framework should guide and inform the development of program-specific data management plans and other NOAA activities to improve data management.

Specific recommendations for activities in support of these goals are enumerated in Appendix A.

The NOAA Environmental Data Management Framework builds on ideas and recommendations from NOAA's Next Generation Strategic Plan (1), NOAA Administrative Order (NAO) 212-15 (2), the National Research Council (NRC) study Environmental Data Management at NOAA (3), the White House Office of Science and Technology Policy (OSTP) Interagency Working Group on Digital Data (IWGDD) report Harnessing the Power of Digital Data: Taking the Next Step (4), the US Group on Earth Observations (USGEO) Exchanging Data for Societal Benefit (5), the U.S. Chief Information Officer’s 25 Point Implementation Plan to Reform Federal Information Technology Management (6), and Open Government initiatives such as Data.gov. This Framework is also very well aligned with the draft US Office of Management and Budget (OMB) memorandum on "Managing Government Information as an Asset throughout its Life Cycle to Promote Interoperability and Openness."* The NOAA EDM Framework was developed in response to a recommendation from NOAA's Science Advisory Board (SAB) at their March 2012 meeting. This Framework will be used and updated by NOAA's Environmental Data Management Committee (EDMC). EDMC activities, recommendations and directives

* Draft circulated for NOAA review the week of 2012-11-26; issuance date to be determined.

1.2. Key Concepts

Note: A list of acronyms may be found in Appendix B: Abbreviations.

Environmental Data: NAO 212-15 defines environmental data as "recorded and derived observations and measurements of the physical, chemical, biological, geological, and geophysical properties and conditions of the oceans, atmosphere, space environment, sun, and solid earth, as well as correlative data, such as socioeconomic data, related documentation, and metadata." For the purposes of this document, we use the terms "data" and "environmental data" interchangeably. This Framework focuses primarily on observations and derived products rather than numerical model outputs, but the latter are mentioned in several contexts. Non-digital media such as audio recordings or photographs are discussed only in the context of data rescue (see Section 3.2.7). Published papers, preserved geological or biological samples, and non-environmental data (personnel, budget, etc.) are outside the scope of this EDM Framework.

NOAA Data: Data collected directly by a NOAA entity or directly funded by a NOAA entity are the primary focus of this Framework. However, the NOAA National Data Centers archive data from a wide range of non-NOAA sources (e.g., international partners, commercial businesses, educational institutions and other federal agencies). Furthermore, many NOAA entities use data from non-NOAA sources to develop products. Some categories of externally-produced data may therefore need to be managed in the same manner as purely NOAA data.

Observing System: Strictly speaking, an observing system is a set of one or more platforms (such as a satellite, buoy, radar, fixed instrument platform, ship, airplane, or autonomous vehicle), each containing one or more sensors. More generally, some observations may be partially or fully manual and involve human observations or sample gathering. This document uses the term "observing system" in a general sense that covers both automatic and human observations.

1.3. Data Management Target State

Figure 1 illustrates conceptually the desired target state of NOAA data management activities. Not all activities are illustrated in this diagram, but it is useful as a high-level concept. The NOAA EDM Framework is intended to help guide NOAA activities toward such a target state. The modest expectations of this target state are appropriate for the medium term, and do not reflect the possible inclusion of advanced technologies in the longer term. Some NOAA datasets are nearly at this target state, but others are not; an assessment (see Section 2.6) will assist in determining the gaps. The Directive documents mentioned here, some of which are in preparation, are discussed more fully in Section 2.2.2.

* https://www.nosc.noaa.gov/EDMC/

Figure 1: Conceptual overview of the desired target state of NOAA data management activities. Not all activities are illustrated. The numbers correspond to steps in the walk-through below.

Walk-through starting at the upper left of Figure 1:

1. Requirements for observational data are established by agency leadership and guide data producers in determining what NOAA observing systems to develop and deploy, and from what non-NOAA systems to acquire data.

2. Advanced planning based on the NOAA Data Management Planning directive addresses how the observed or acquired data will be handled and preserved.

3. Data producers generate data, and in accordance with the Data Documentation directive also ensure the creation of associated metadata that explains the nature, origin and quality of the data. This step implicitly includes quality control and product generation, which are not shown for simplicity.

4. Data are transmitted in near-real-time to operational data users.

5. Data are also made discoverable and accessible for other users via standardized online services per the Data Access directive.

6. Data and metadata are sent to a NOAA National Data Center (or other approved Archive facility) for long-term preservation.

7. Datasets are assigned a persistent identifier (ID) by the Data Center in accordance with the Data Citation directive.

8. The Data Center offers access and discovery of archived data using services compatible with those offered by the original data producers.

9. A Data Management Dashboard automatically measures statistics from metadata records and catalog holdings to enable leadership to assess the status of, and observe improvements in, data access, documentation, and preservation.

10. Data Users both in and out of NOAA can employ the software Tools of their choice to find, retrieve and decode data because NOAA metadata and services are well-defined and functional.

11. Users employ NOAA data to create a result such as a derived information product, forecast, scientific paper, decision, policy, or incident response.

12. The User can cite the data used by referencing its ID, so the agency can track usage and provide credit to data producers and managers.

13. Users have the opportunity to provide feedback regarding data quality and other attributes.

14. Finally, Users help refine the requirements for new or improved observations.
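As a rough illustration of step 9, the following Python sketch shows how a dashboard might derive summary statistics from catalog records. It is a minimal sketch under stated assumptions, not a description of any actual NOAA system: the `DatasetRecord` fields, the `dashboard_summary` function, and the sample holdings are all hypothetical names and values introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical catalog entry; field names are illustrative, not a NOAA schema."""
    title: str
    has_metadata: bool            # step 3: documentation exists
    access_url: Optional[str]     # step 5: online access service, if any
    archived: bool                # step 6: sent to an archive facility
    persistent_id: Optional[str]  # step 7: identifier assigned by the archive


def dashboard_summary(records):
    """Return the fraction of records meeting each target-state criterion."""
    total = len(records)
    if total == 0:
        # No holdings: report zero rather than dividing by zero.
        return {"documented": 0.0, "accessible": 0.0, "archived": 0.0, "citable": 0.0}
    return {
        "documented": sum(r.has_metadata for r in records) / total,
        "accessible": sum(r.access_url is not None for r in records) / total,
        "archived": sum(r.archived for r in records) / total,
        "citable": sum(r.persistent_id is not None for r in records) / total,
    }


# Made-up example holdings, for illustration only.
holdings = [
    DatasetRecord("Buoy temperatures", True, "https://example.org/buoy", True, "id:example-001"),
    DatasetRecord("Ship transects", True, None, False, None),
]

summary = dashboard_summary(holdings)
for criterion, fraction in summary.items():
    print(f"{criterion}: {fraction:.0%}")
```

Computing each statistic directly from the records, rather than from manually reported figures, reflects the walk-through's point that the dashboard measures status automatically; a real implementation would presumably read standardized metadata records instead of in-memory objects.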

2. The Environmental Data Management Framework

The basic elements of the Environmental Data Management Framework are illustrated in Figure 2. The EDM Framework includes Principles, Governance, Resources, Standards, Architecture, and Assessment that apply broadly to many classes of data, and individual Data Lifecycles for particular data collections.

