Interagency Data Stewardship/LifeCycle/Preservation Forum/TeleconNotes/2017-11-20meetingnotes

From Earth Science Information Partners (ESIP)

Meeting Notes - Data Stewardship Committee - 2017-11-20 2 p.m. EST / 12 p.m. MST / 11 a.m. PT

  • Join the meeting from your computer, tablet or smartphone.
  • You can also dial in using your phone.
  • United States: +1 (408) 650-3123
  • Access Code: 453-694-565


Attendees: Bruce Caron, Denise Hills, Mike Daniels, Nancy Hoebelheinrich, Shelley Stall, Sophie Hou, Matt Mayernik, Ruth Duerr, Paul Lemieux, Bob Downs, Justin Goldstein, Rama


Media from guest presentations:

  • Audio/slide capture from GoToMeeting (.mp4) is available upon request from the DS Committee leadership


Notes:


1) ESIP Program Committee update / budget request update (Matt M.)

  • A new procedure/policy is being put in place for travel support. Those who receive ESIP support will be expected to report back to the community, so that the community as a whole can benefit from their experiences.


2) Invited presentation: Mike Daniels (NCAR), "Cloud-Hosted Real-time Data Services for the Geosciences (CHORDS)"

  • CHORDS, funded by the NSF EarthCube program, is a real-time data services infrastructure (https://www.earthcube.org/group/chords) that will provide an easy-to-use system to acquire, navigate and distribute real-time data streams via cloud services and the Internet. It will lower the barrier to these services for small instrument teams, employ data and metadata formats that adhere to community accepted standards, and broaden access to real-time data for the geosciences community.
  • CHORDS is focused on real-time data, where “real-time” is defined as “data that needs to be dealt with”. This definition emphasizes urgency rather than a specific timeline.
  • CHORDS is the only EarthCube project that focuses on real-time data.
  • A use case from NCAR’s study of convective clouds using flights and ground-based instruments was given as an example of real-time data.
  • The cloud movement, the rate at which measurements are taken, the flight path, and the weather pattern are components of this use case that can change in real time.
  • Other real-time data could be “long tail”. These projects mainly focus on the measurements that are collected rather than on the infrastructure to support the measurement collection.
  • Yet another category of real time data relevant for this project applies to data that could be integrated over time from different disciplines (e.g. hydrology, atmosphere, and oceanography).
  • The “Internet of Things” is part of the recent technological development that is helping to enable different measurements to be connected to each other. However, researchers are more interested in how to advance science by using the infrastructure, and are less interested in the development of the infrastructure itself.
  • CHORDS is helping researchers in connecting their data with each other based on the “Internet of Things” architecture.
  • CHORDS sits between the sensors used by the investigators and the data archives, building services to help data meet standards.
  • An application example of using CHORDS is setting up 3-D printed weather stations at different locations (US, Europe, Zambia, Kenya, and many additional local African farms) and utilizing the local radio/cellular infrastructure for connectivity.
  • Another example is detecting volcanic activities in Tanzania from a remote location.
  • A current project is to verify hail estimates from radar data using the combination of CHORDS and ground instruments.
  • A demo from the user’s perspective is given using: http://portal.chordsrt.com/
  • The three major components of the CHORDS architecture are: portals; services; and workflows and EarthCube building blocks.
  • CHORDS leverages Docker so that it can be installed across different platforms/environments.
  • Key activities on the CHORDS roadmap include: allowing users to create notifications, improving security, being able to assign DOIs, and expanding service capabilities to connect to additional measurements.
  • CHORDS is not meant to be a data archive. It is a system that handles real time data.
  • CHORDS currently uses Sensor Model Language (SensorML) as its initial means of describing the data. GeoCSV and GeoJSON, as well as Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) controlled vocabularies, are also used.
  • The outreach/engagement is focused on providers first; the outreach/engagement to users has been serendipitous so far. The team would like to consider additional strategies to reach out to more users.
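
As a rough illustration of the GeoJSON format mentioned in the notes above, the sketch below wraps a single sensor reading in a GeoJSON Feature. It is not taken from the presentation and does not represent the CHORDS API: the function name, station identifier, coordinates, and variable names are all hypothetical, and only the Feature/Point layout follows the GeoJSON specification (RFC 7946).

```python
import json
from datetime import datetime, timezone


def measurement_to_geojson(station_id, lon, lat, observed_at, variables):
    """Wrap one sensor reading in a GeoJSON Feature (RFC 7946).

    All field names under "properties" are illustrative assumptions,
    not a schema defined by CHORDS.
    """
    return {
        "type": "Feature",
        # GeoJSON positions are ordered [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "station_id": station_id,
            "observed_at": observed_at.isoformat(),
            "variables": variables,
        },
    }


# Hypothetical reading from a hypothetical weather station.
reading = measurement_to_geojson(
    station_id="demo-station-01",
    lon=-105.27,
    lat=40.01,
    observed_at=datetime(2017, 11, 20, 19, 0, tzinfo=timezone.utc),
    variables={"air_temperature_degC": 4.2, "relative_humidity_pct": 61.0},
)

print(json.dumps(reading, indent=2))
```

Keeping the location in the standard "geometry" member and everything else under "properties" means that generic GeoJSON tools can map the station without knowing anything about the measurement fields.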


3) Invited Presentation: Shelley Stall (AGU), "Enabling FAIR Data - Project Update"

  • The Laura and John Arnold Foundation has awarded a grant to a coalition of groups representing the international Earth and space science community, convened by the American Geophysical Union (AGU), to develop standards that will connect researchers, publishers, and data repositories in the Earth and space sciences to enable FAIR (findable, accessible, interoperable, and reusable) data – a concept first developed by Force11.org – on a large scale. The partnership currently includes AGU, the Earth Science Information Partners, and Research Data Alliance, and has support from the Proceedings of the National Academy of Sciences, Nature, Science, AuScope, National Computational Infrastructure of Australia, the Australian National Data Service, and the Center for Open Science.
  • The four key areas for the Data Fair at AGU17 are: Emerging Sources of Scientific Data, Data Planning, Data Capacity Building, and Author and Reviewer Data Management Best Practices.
  • This is a great way for the data/information professionals to reach out to the researchers.
  • AGU17 will also feature Data Help Desk for the first time (an event suggested by Ruth Duerr).
  • Three available formats are: workshop, demo, and reference.
  • F = Findable
  • A = Accessible
  • I = Interoperable
  • R = Reusable
  • The FAIR principles are in line with AGU’s position statement on data.
  • In short, data is the world’s heritage.
  • Well managed data results in better science, but many stakeholders are part of this ecosystem, and all need to participate and contribute.
  • Other related activities include: Transparency and Openness Promotion guidelines, COPDESS.org, and Joint Declaration of Data Citation Principles.
  • AGU’s “Enabling FAIR Data” project has a few key objectives (truncated versions are shown below):
  • Earth and space science publications cite the data items that are deposited.
  • Sufficient essential documentation is included.
  • Repositories support data citation via persistent identifiers.
  • Repositories support publication peer review.
  • Leading publishers and repositories implement the recommendations and guidelines.
  • Researchers have a common experience when submitting their paper.
  • The Earth and space science community begins the culture change to support open and FAIR data.
  • This project is community driven, so it is crucial for community members to continue to participate.
  • To participate in a Targeted Adoption Group (TAG) from the AGU’s “Enabling FAIR Data” project: https://osf.io/jy4d9
  • To stay informed: http://www.copdess.org


4) Open discussion: Any DS-related sessions/activities coming up at AGU or other conferences (e.g., AGU Data Help Desk sessions)?