Difference between revisions of "Documentation Cluster Minutes 2016-06-27"
From Earth Science Information Partners (ESIP)
Revision as of 11:54, June 27, 2016
Attendees
Anna Milan, John Kozimor, Lindsay Powers, Sean Gordon, Tyler Stevens, Annie Burgess
Agenda
Presentation: Materials Science Data Management Initiatives at NIST by Bob Hanisch
- Office of Data and Informatics
- Standard reference data - undertaking modernization of apps and interfaces; all the metadata goes to data.gov; some of the reference data is available for a fee
- Research data - NIST data portal
- Data Science - informatics and analytics
- Community - Research Data Alliance (RDA), work with Network of National Metrology Institutes and BIPM
- Key ODI activities
- 2 years ago - the practice was haphazard - trying to improve the data infrastructure
- Materials Genome Initiative is a major stakeholder
- Goals
- FAIR principles: Findable, Accessible, Interoperable, Reusable
- Find
- acceleratornetwork.org/mse-challenge
- would like to go to a service to find "who has data on X or Y"
- Example from Astronomy: VAO
- Materials Resource Registry (MRR)
- "not going to take over the world" with federated system
- using OAI-PMH
- challenge is defining metadata fields and terminology
- new international WG with RDA to define new metadata schema that is appropriate
- demo of keyword search with facets
- draft of metadata terms
- Federated Architecture
- will be at the next RDA plenary in Denver
- MGI Code Catalog
- will integrate 50-60 entries into the MRR
- Metadata Schema that is used to describe the software - coding language, documentation,
- Standard Reference Data (SRD)
- 1968 act
- copyright
- cost recovery
- "UI Anarchy"
- Socrata is a nice platform for displaying tabular data - APIs will help streamline
- materialsdata.nist.gov
- DSpace - communities within DSpace can be public or private (most are private)
- 20 communities currently
- Work closely with National Data Service (NDS)
- NDS Labs environment allows sharing --- Docker Containers...
- NDS Materials Data Facility
- provides capability to link data to analysis
- NIST is strict about what can be deployed by NIST and shared
- basic metadata capability - trying to ensure that it's interoperable with their other metadata
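The registry notes above mention OAI-PMH as the harvesting protocol behind the federated MRR. As a rough illustration only, the sketch below builds a standard OAI-PMH `ListRecords` request URL and pulls identifiers and Dublin Core titles out of a response. The base URL, the sample record, and the function names are assumptions made for this example; they are not taken from the MRR itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iter(OAI_NS + "record"):
        identifier = record.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        title = record.findtext(f".//{DC_NS}title")
        pairs.append((identifier, title))
    return pairs


# Hypothetical response body, hand-written for this example.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical alloy dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://example.org/oai"))
    print(parse_titles(SAMPLE))
```

A real harvester would also need to follow `resumptionToken` elements to page through large result sets; that is left out here to keep the sketch short.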
Interoperate
- Materials Data Curation System (MDCS)
- python, MongoDB, SPARQL, XML schema
- documenting actual DATA - not just collections
- HUGE challenges: stored in 140 different formats, no common schemas, proprietary in nature (e.g. vendor-specific)
- curator is breaking down these barriers
- 3 steps to curate
- Can create own templates, but try to encourage re-use of existing
- Nothing in it is specific to materials; it can be used to describe any research domain
- 3 steps to export
- REST API - supports automated capture
- some things to think about
- Quality metadata is KEY
- metadata curation is non-trivial, can be costly
- "whatever you do, you can always do more"
- Important to address Interoperability at the proper scale
- too wide vs too narrow - important to cast the net at the appropriate scale
- will often start with Dublin Core (DC) or DataCite and then add just enough to support domain-specific needs. If the schema gets too detailed, no one takes the time to develop content.
- Standards require community participation to assure take-up -- national, international..
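The "start with DC or DataCite, then add just enough" advice above can be pictured as a small core record plus an optional domain extension. The sketch below is one possible way to express that idea. The field names are only loosely modeled on DataCite's mandatory properties, and the extension keys are invented for illustration; neither reflects the MDCS templates or any NIST schema.

```python
# Small, general core: illustrative field names, loosely modeled on
# DataCite's mandatory properties (an assumption, not a schema copy).
CORE_REQUIRED = ("identifier", "creator", "title", "publisher", "publication_year")


def missing_core_fields(record):
    """Return the core fields that are absent or empty in the record."""
    return [field for field in CORE_REQUIRED if not record.get(field)]


def build_record(core, domain_extension=None):
    """Combine a validated core record with an optional domain extension.

    The extension is kept in its own namespace so the core stays
    interoperable even when domain-specific detail varies.
    """
    missing = missing_core_fields(core)
    if missing:
        raise ValueError("missing core fields: " + ", ".join(missing))
    record = dict(core)
    record["extension"] = dict(domain_extension or {})
    return record


if __name__ == "__main__":
    core = {
        "identifier": "doi:10.0000/example",  # hypothetical identifier
        "creator": "Example Lab",
        "title": "Hypothetical tensile test results",
        "publisher": "Example Repository",
        "publication_year": 2016,
    }
    # Hypothetical materials-specific additions.
    extension = {"material_class": "alloy", "measurement": "tensile strength"}
    print(build_record(core, extension))
```

Keeping the required core short means a record is still valid with an empty extension, which fits the point that overly detailed schemas discourage people from filling them in.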
Q&A
- LP: HDF has a collaborative forum with RDA on data formats. Finally trying to get a handle on who the HDF community is. Do any of these communities use HDF?
- BH: not aware of any HDF use. Trying to get requirements for standardization, transparent, open-based format of instrument...?
- LP: are you hosting the code in the code catalog? Or just encouraging publication and pointing to wherever it is hosted?
- BH: the latter, unless we have developed the code. No code validation.
- AM: what is the software metadata schema?
- BH: was kind of ad hoc and started before he came, developed in house, but aware of FORCE11 efforts.