Discovering data

From Earth Science Information Partners (ESIP)

Discovery (Note: this case is still being developed)

How to determine which dataset you want from among several similar-looking collections

  • distinguishing between different types of datasets
    • descriptions aren’t always helpful
    • crawling catalog links, automated harvesting
    • determining the popularity of the data for research
    • a “Yelp” for collections: ratings such as “this data was useful” or “this data is crappy”
    • maybe use:

Using DOIs for bidirectional links to research/publications and back to originating data

  • Interesting Questions:
    • What has this data been used for in research?
    • What data was used for this research, and can I reproduce the work?
  • Dryad (
    • it involves submitting actual data, not pointers to the data
  • Including DOI creation and citation as part of the process for data (not just publications)
    • how to enforce this?
    • incentives for data producers vs. data providers
    • feeds back into the rating system
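
As a rough illustration of the bidirectional links discussed above, the following is a minimal sketch of an in-memory index that answers both "interesting questions": which publications used a dataset, and which datasets a publication used. The class name CitationIndex, the helper normalize_doi, and every DOI shown are hypothetical, made up for this example; a real system would presumably sit on top of a DOI registration agency's metadata rather than a local dictionary.

```python
from collections import defaultdict

# Common ways a DOI is written; stripped so equivalent forms compare equal.
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi):
    """Return a bare, lower-cased DOI (DOIs are case-insensitive)."""
    doi = doi.strip().lower()
    for prefix in _DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


class CitationIndex:
    """Hypothetical two-way map between dataset DOIs and publication DOIs."""

    def __init__(self):
        self._pubs_by_data = defaultdict(set)
        self._data_by_pub = defaultdict(set)

    def link(self, data_doi, pub_doi):
        """Record that a publication used a dataset, in both directions."""
        data_doi = normalize_doi(data_doi)
        pub_doi = normalize_doi(pub_doi)
        self._pubs_by_data[data_doi].add(pub_doi)
        self._data_by_pub[pub_doi].add(data_doi)

    def publications_using(self, data_doi):
        """What has this data been used for in research?"""
        return sorted(self._pubs_by_data.get(normalize_doi(data_doi), ()))

    def data_used_by(self, pub_doi):
        """What data was used for this research?"""
        return sorted(self._data_by_pub.get(normalize_doi(pub_doi), ()))


if __name__ == "__main__":
    index = CitationIndex()
    # All DOIs below are invented placeholders, not real records.
    index.link("doi:10.5555/DATA.1", "https://doi.org/10.5555/paper.a")
    index.link("10.5555/data.1", "10.5555/paper.b")
    index.link("10.5555/data.2", "10.5555/paper.a")
    print(index.publications_using("10.5555/data.1"))
    print(index.data_used_by("10.5555/paper.a"))
```

The number of publications linked to a dataset could also serve as one crude input to the popularity or rating ideas listed under Discovery, although citation counts alone say nothing about whether the data was actually useful.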