FederatedSearchUseCases

From Earth Science Information Partners (ESIP)

Volcanic Eruption Dataset Search

The goal is to find all the aerosol data for the May 2009 eruption of Chaiten in Chile, in order to study the extent of the ash cloud in space and time. Several data types could be useful: gridded data, high-resolution swath data, model output, and experimental products. Candidate data sources include ASDC, CNES, GES DISC, DataFed, MRDC (aka LAADS), NCAR, and individual Principal Investigators. These datasets and sources can be located through a number of different directories:

  • GCMD
  • GEOSS
  • GOS
  • Mercury
  • DataFed

However, these directories all differ in access mode, structure, and other respects. More to the point, the file-level inventories are even more widely dispersed:

  • ECHO
  • Mercury (for some datasets)
  • DataFed
  • Individual data centers

All of these have very different modes of searching and accessing the data. With current practice, a user would first have to learn multiple user interfaces, then execute a separate search in each one, and finally integrate the results by hand. Worse, some inventories offer no search capability at all (especially experimental products distributed by individual scientists).

What we want is to execute a single space-time query against all of the data sources at once, from a single interface. In fact, we would like to integrate this query capability into our normal analysis tool, so that we need not jump back and forth between a web browser, an FTP client, and the analysis tool. In this scenario, the user:

  1. selects the spatial area and time range using the analysis tool's map selection
  2. types "aerosols" into a keyword field
  3. receives a single aggregated results list covering data from all the potentially useful sources, with the URLs to the data files
  4. selects the data to be used
  5. has the data show up in the tool
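
The scenario above can be sketched in code. The following is a minimal illustration only, not a description of any existing ESIP service: the class and function names (SpaceTimeQuery, Granule, inventory_adapter, federated_search), the example.org URLs, and the in-memory records standing in for the ECHO and DataFed inventories are all assumptions made for this example. It shows one space-time-keyword query being sent to several inventories through a common adapter, with the results merged into a single list of data file URLs. An inventory that fails is reported rather than aborting the whole search. The bounding-box test is deliberately simple and does not handle boxes that cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north) in degrees


@dataclass(frozen=True)
class SpaceTimeQuery:
    keyword: str
    bbox: BBox
    start: datetime
    end: datetime


@dataclass(frozen=True)
class Granule:
    source: str              # inventory that returned the file
    url: str                 # where the data file can be fetched
    keywords: Tuple[str, ...]
    bbox: BBox
    start: datetime
    end: datetime


def boxes_overlap(a: BBox, b: BBox) -> bool:
    # Simple rectangle intersection; no antimeridian handling.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def matches(g: Granule, q: SpaceTimeQuery) -> bool:
    keyword_hit = q.keyword.lower() in (k.lower() for k in g.keywords)
    time_hit = g.start <= q.end and q.start <= g.end
    return keyword_hit and time_hit and boxes_overlap(g.bbox, q.bbox)


Adapter = Callable[[SpaceTimeQuery], List[Granule]]


def inventory_adapter(records: List[Granule]) -> Adapter:
    # Hypothetical adapter: a real one would translate the query into
    # the inventory's own search interface instead of filtering a list.
    return lambda q: [g for g in records if matches(g, q)]


def federated_search(q: SpaceTimeQuery, adapters: Dict[str, Adapter]):
    merged: Dict[str, Granule] = {}
    failed: List[str] = []
    for name, search in adapters.items():
        try:
            results = search(q)
        except Exception:
            failed.append(name)      # report the failure, keep searching
            continue
        for g in results:
            merged.setdefault(g.url, g)   # de-duplicate by file URL
    ordered = sorted(merged.values(), key=lambda g: (g.start, g.url))
    return ordered, failed


def offline(_q: SpaceTimeQuery) -> List[Granule]:
    raise ConnectionError("inventory unreachable")


# Made-up records; only the inventory names come from the text above.
region = (-75.0, -45.0, -65.0, -40.0)
adapters = {
    "ECHO": inventory_adapter([
        Granule("ECHO", "https://example.org/echo/aerosol_swath_001.hdf",
                ("aerosols",), region,
                datetime(2009, 5, 3, 0, 0), datetime(2009, 5, 3, 0, 5)),
        Granule("ECHO", "https://example.org/echo/ozone_001.hdf",
                ("ozone",), region,
                datetime(2009, 5, 3, 0, 0), datetime(2009, 5, 3, 0, 5)),
    ]),
    "DataFed": inventory_adapter([
        Granule("DataFed", "https://example.org/datafed/aerosol_grid_001.nc",
                ("aerosols",), (-180.0, -90.0, 180.0, 90.0),
                datetime(2009, 5, 2), datetime(2009, 5, 2, 23, 59)),
        Granule("DataFed", "https://example.org/datafed/aerosol_grid_002.nc",
                ("aerosols",), (-180.0, -90.0, 180.0, 90.0),
                datetime(2009, 7, 1), datetime(2009, 7, 1, 23, 59)),
    ]),
    "PI site": offline,
}

query = SpaceTimeQuery("aerosols", (-80.0, -50.0, -60.0, -35.0),
                       datetime(2009, 5, 1), datetime(2009, 5, 31))
results, failed = federated_search(query, adapters)

for g in results:
    print(g.source, g.url)
```

In this sketch the analysis tool would only ever call federated_search; each inventory's particular search interface is hidden behind its adapter, which is what lets steps 1 through 3 happen from a single interface. Steps 4 and 5 (selecting and loading the data) would then work from the returned URLs.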