Difference between revisions of "Documentation Cluster Minutes 2016-06-27"

From Federation of Earth Science Information Partners

Revision as of 11:55, June 27, 2016


Anna Milan, John Kozimor, Lindsay Powers, Sean Gordon, Tyler Stevens, Annie Burgess, Aaron Sweeney, Paul Lemieux


Presentation: Materials Science Data Management Initiatives at NIST by Bob Hanisch

  • Office of Data and Informatics
    • Standard reference data - undertaking modernization of apps and interfaces; all the metadata goes to data.gov; some of the reference data is available for a fee
    • Research data - NIST data portal
    • Data Science - informatics and analytics
    • Community - Research Data Alliance (RDA), work with Network of National Metrology Institutes and BIPM
  • Key ODI activities
    • 2 years ago - the practice was haphazard - trying to improve the data infrastructure
    • Materials Genome Initiative is a major stakeholder
  • Goals
    • FAIR principles: Findable, Accessible, Interoperable, Reusable
  • Find
  • acceleratornetwork.org/mse-challenge
  • would like to go to a service to find "who has data on X or Y"
  • Example from Astronomy: VAO
  • Materials Resource Registry (MRR)
    • "not going to take over the world" with federated system
    • using OAI-PMH
    • challenge is defining metadata fields and terminology
    • new international WG with RDA to define new metadata schema that is appropriate
    • demo of keyword search with facets
    • draft of metadata terms
  • Federated Architecture
  • will be at the next RDA plenary in Denver
  • MGI Code Catalog
    • will integrate 50-60 entries into the MRR
    • Metadata Schema that is used to describe the software - coding language, documentation,
  • Standard Reference Data (SRD)
    • 1968 act
    • copyright
    • cost recovery
    • "UI Anarchy"
    • Socrata is a nice platform for displaying tabular data - APIs will help streamline
  • materialsdata.nist.gov
    • DSpace - communities within DSpace can be public or private (most are private)
    • 20 communities currently
  • Work closely with National Data Service (NDS)
    • NDS Labs environment allows sharing --- Docker Containers...
  • NDS Materials Data Facility
  • provides capability to link data to analysis
  • NIST is strict about what can be deployed by NIST and shared
  • basic metadata capability - trying to ensure that it's interoperable with their other metadata
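
The notes mention that the Materials Resource Registry uses OAI-PMH for its federated architecture. As a rough illustration of what a harvest involves, here is a minimal sketch in Python. It relies only on the standard OAI-PMH `ListRecords` verb and the `oai_dc` metadata prefix; the endpoint URL, the sample response, and the function names are assumptions made up for this example, not anything shown in the presentation.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Return ([(identifier, title), ...], resumption_token_or_None)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(f"{{{OAI_NS}}}record"):
        ident = rec.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = rec.findtext(f".//{{{DC_NS}}}title")
        records.append((ident, title))
    token = root.findtext(f".//{{{OAI_NS}}}resumptionToken")
    # An empty or missing token means the list is complete.
    return records, (token or None)


# Made-up response used in place of a real network call.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example alloy dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>abc123</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(list_records_url("https://registry.example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would fetch each URL, call `parse_records` on the response, and repeat with the returned token until none is left.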


  • Materials Data Curation System (MDCS)
    • python, MongoDB, SPARQL, XML schema
    • documenting actual DATA - not just collections
    • HUGE challenges: stored in 140 different formats, no common schemas, proprietary in nature (e.g. vendor-specific)
    • curator is breaking down these barriers
    • 3 steps to curate
    • Can create own templates, but try to encourage re-use of existing
    • Nothing in it is specific to materials; it can be used to describe any research domain
    • 3 steps to export
    • REST API - supports automated capture
  • some things to think about
    • Quality metadata is KEY
      • metadata curation is non-trivial, can be costly
    • "whatever you do, you can always do more"
    • Important to address Interoperability at the proper scale
      • too wide vs too narrow - important to cast the net at the appropriate scale
      • will often start with Dublin Core (DC) or DataCite and then add just enough to support domain-specific needs. If the schema gets too detailed, no one takes the time to develop content.
    • Standards require community participation to assure take-up -- national, international..
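
To make the "start with DC or DataCite, then add a little domain-specific detail" point concrete, here is a small Python sketch that builds a record from a few Dublin Core elements plus one extension field. The Dublin Core namespace is the real one; the `mat` namespace, the `crystalStructure` field, and the `build_record` helper are hypothetical and only stand in for whatever a community schema would define.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# Hypothetical namespace standing in for a domain-specific extension.
MAT_NS = "https://example.org/ns/materials"

ET.register_namespace("dc", DC_NS)
ET.register_namespace("mat", MAT_NS)


def build_record(core, domain=None):
    """Build a record with generic DC fields plus optional domain fields."""
    root = ET.Element("record")
    for name, value in core.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    for name, value in (domain or {}).items():
        ET.SubElement(root, f"{{{MAT_NS}}}{name}").text = value
    return root


if __name__ == "__main__":
    record = build_record(
        {"title": "Example alloy dataset", "creator": "Example Lab"},
        {"crystalStructure": "fcc"},
    )
    print(ET.tostring(record, encoding="unicode"))
```

Keeping the generic core separate from the extension means a general-purpose registry can still index the record even if it ignores the domain fields.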


  • LP: HDF has a collaborative forum with RDA on data formats. Finally trying to get a handle on who the HDF community is. Do any of these communities use HDF?
    • BH: not aware of any HDF use. Trying to get requirements for standardization, transparent, open-based format of instrument...?
  • LP: are you hosting the code in the code catalog, or just encouraging publication and pointing to wherever it is hosted?
    • BH: the latter, unless we have developed the code. No code validation.
  • AM: what is software metadata schema?
    • BH: was kind of ad hoc and started before he came, developed in house, but aware of FORCE11 efforts.