Sustainable Data Management/20160610 telecon notes

From Earth Science Information Partners (ESIP)

For recorded session, see: https://esip.sharefile.com/fo871c56-e32f-42e4-a08b-5d8988e0f041

Log in as a guest; search for sessions beginning with the string "SustDM"

Contact ESIP staff or committee chair for help.

Agenda

ROI progress
ESIP summer meeting prep (content due 6/21)
Poster for Summer meeting
Draft here: link tbd (ideally, Margaret will have a mockup, but will be soliciting content)
Current sessions:
last month's notes, 2 sessions
Summary Session: http://commons.esipfed.org/node/9154
plans:
ROI: http://commons.esipfed.org/node/9133
Plans:
tbd?
Landscape: http://commons.esipfed.org/node/9139
Speakers:
COPDESS-re3data, Registry of Repositories (Kerstin Lehnert, tbd -- Cyndy emailed 6/16)
LTER, Perspectives from Researchers (Margaret O'Brien, confirmed)
DataONE, Perspectives of aggregators (Matt Jones, confirmed)

Attending

  • Corinna Gries
  • Margaret O'Brien
  • Matt Jones
  • Cyndy Parr
  • Shelley Stall

Notes

Our ESIP sessions, July

http://commons.esipfed.org/taxonomy/term/2206

  • ROI: no ROI people on this call, they are organizing themselves independently.
  • Infrastructure group: Before we can do an analysis of gaps, would like to know the landscape.
  • Landscape discussion follows.

Landscape = understand what repositories are actually capturing.

Repository registries advertise repositories, but are very broad. The COPDESS approach was to expand them so that their researchers could determine which repositories best meet their requirement to publish data along with papers.

Systems like re3data have adequate schemas, but their vocabularies could use work.

Continuum of 3 perspectives, from tightly coupled to loose: LTER, DataONE, re3data.

Another organization scheme: What do registries record now? What do researchers ask for? What services does infrastructure enable?

Speakers:

  • Kerstin Lehnert: COPDESS-re3data: Registry of Repositories
  • Margaret O'Brien: Perspectives from Researchers
  • Matt Jones: DataONE: Perspectives of aggregators

5 Questions for discussion

  • How open is each of these repositories?
  • What will it take to get more data in there?
  • How to guide people to the right ones?
  • Additional fields to add to the registry to help? E.g., certifications.
  • Any obvious gaps in services that we know of?

Cyndy: Kerstin is likely to address mostly the first 3, Margaret #2 and #3, and Matt the end (the missing parts to link them together?). We suspect that researchers are not finding the right repo by going to the registry. So these questions will be partly answered by the speakers.

What outcomes do we want?

Outcome A: a set of recommendations for registries (they are open to suggestions, and will be more so as COPDESS becomes more connected to DataCite).

1. Fields to add to the repo registration. E.g., DataONE is focused on an API (machine services), not as much on descriptive metadata content about repos, but machine interoperability is not the kind of information that re3data collects. They cannot tap into re3data info (e.g., re3data mints IDs for repos, and boundaries are unclear: both LTER and HJ Andrews are listed for LTER, so re3data misses that one is a superset of the other). E.g., D1 (DataONE) would like to add to the left-hand panel in a display like this: https://search.dataone.org/#profile/KNB

There are other repository registries too, but re3data is getting bigger and gaining momentum. E.g., https://biosharing.org/databases/ has info on repos, and also on standards.

2. Transparent mechanisms for updating. Some do not appear to be community maintained. There is a strong curation element, at least at re3data, but if this doesn't happen, then entries get stale.
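
To make the two recommendations above concrete, here is a minimal Python sketch of what a registry entry could look like with a "part of" field (so a registry can record that one repository is a superset of another) and a last-curated date (so stale entries can be flagged). All field names, identifiers, dates, and the one-year staleness threshold are assumptions made up for illustration; this is not the re3data schema or the DataONE API.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class RegistryEntry:
    """One repository record in a hypothetical registry (not the re3data schema)."""
    repo_id: str                          # registry-minted identifier (made up here)
    name: str
    certifications: List[str] = field(default_factory=list)
    part_of: Optional[str] = None         # repo_id of the enclosing repository, if any
    last_curated: Optional[date] = None   # date of the last curation pass, if known


def top_level(entries: List[RegistryEntry]) -> List[RegistryEntry]:
    """Return entries that are not contained in another listed repository."""
    known = {e.repo_id for e in entries}
    return [e for e in entries if e.part_of not in known]


def members(entries: List[RegistryEntry], parent_id: str) -> List[RegistryEntry]:
    """Return entries that declare parent_id as their enclosing repository."""
    return [e for e in entries if e.part_of == parent_id]


def is_stale(entry: RegistryEntry, today: date, max_age_days: int = 365) -> bool:
    """Flag entries never curated, or last curated more than max_age_days ago."""
    if entry.last_curated is None:
        return True
    return (today - entry.last_curated).days > max_age_days


# Illustrative records only; the identifiers and dates are invented.
entries = [
    RegistryEntry("example:lter", "LTER", last_curated=date(2016, 5, 1)),
    RegistryEntry("example:hja", "HJ Andrews", part_of="example:lter"),
]

for e in top_level(entries):
    subs = ", ".join(m.name for m in members(entries, e.repo_id))
    print(f"{e.name} (contains: {subs or 'none'})")
```

With a field like this, an aggregator could display LTER once with HJ Andrews listed under it, instead of treating the two as unrelated repositories.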

Outcome B: something for the last (summary) session on Friday (10:xx, 90 min). Report out from A.


Shelley presented this visualization of the different roles that impact a data lifecycle. Very useful for organization. She is willing to prep more material. https://www.lucidchart.com/documents/edit/b08c641e-7da2-4a03-82fe-ee808796a78c/0?shared=true&


CDF: also meets on Friday, all day. There is a good argument for combining these efforts, or encouraging CDF to use the SustainableDM cluster. CDF is EarthCube; ESIP is broader. Some people will not be available for the SustDM summary session if they are at CDF.


Action items

  • Margaret: Start a poster on the cluster: 3 panels, one for each of the 3 focus groups, linked together somehow.
  • Cyndy Parr: Take over organization of the landscape session
  • Corinna: contact Tim Ahern about adding SustainableDM to CDF agenda.
  • _____: list of registries we are interested in(?)