Difference between revisions of "ToolMatch"
Revision as of 13:10, January 9, 2012
Problem Statement and Use Case
For a given dataset, it is difficult to find the tools that can be used to work with it. In many cases, the information that Tool A works with Dataset B is somewhere on the Web, but not in a readily identifiable or discoverable form. In other cases, particularly for more generalized tools, the information does not exist at all until somebody tries to use the tool on a given dataset.
Thus, the simplest, most prevalent use case is for a user to search for the tools that can be used with a given dataset. A further refinement would be to specify what the tool can do with the dataset, e.g., read, visualize, map, analyze, reformat.
Proposed Solution
Two observations shape the solution. First, whether a tool is likely to work with a dataset can often be inferred through simple rules. For example, knowing that a dataset is available in netCDF/CF-1 and is on a lat/long grid is typically sufficient to infer that the data can be viewed through Panoply. Second, the problem lends itself to crowdsourcing: once one user has found a given tool to be usable with a given dataset, this holds true for all users, and so the information should be promulgated.
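The Panoply rule above can be written as a small predicate. This is only a minimal sketch: the `DatasetInfo` class, its field names, and the accepted grid labels are assumptions made for illustration, not part of any existing ToolMatch schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetInfo:
    """Hypothetical minimal dataset description; field names are illustrative."""
    identifier: str   # e.g. a GCMD DIF ID or a DOI
    data_format: str  # e.g. "netCDF"
    conventions: str  # e.g. "CF-1.0"
    grid: str         # e.g. "lat/long"


def panoply_likely_compatible(ds: DatasetInfo) -> bool:
    """Rule from the text: netCDF + CF conventions + lat/long grid
    suggests the data can be viewed through Panoply."""
    return (
        ds.data_format.lower() == "netcdf"
        and ds.conventions.upper().startswith("CF")
        and ds.grid.lower() in {"lat/long", "lat/lon"}
    )


# Placeholder dataset, not a real identifier.
example = DatasetInfo("example-dataset-id", "netCDF", "CF-1.0", "lat/long")
print(panoply_likely_compatible(example))
```

A rule like this yields a "likely works" inference only; a confirmation from a user who actually tried the tool would be stronger evidence.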
We propose the construction of RDF triples that record the fact that a tool works with a particular dataset. The triples would be based on a simple ontology, with minimal information about the dataset (enough to uniquely identify it and present it as an option in a user interface) and slightly more information about the tool. A simple user interface would allow a user to select a dataset or paste in a unique dataset identifier.
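To make the idea of "works with" triples concrete, the sketch below builds a few triples and serializes them as N-Triples using only plain Python. The namespace `http://example.org/toolmatch#`, the property names (`worksWith`, `name`, `homepage`, `identifier`), and all IRIs in the example are hypothetical placeholders; the actual ontology is not defined here.

```python
# Hypothetical namespace; the real ontology terms are not specified in this page.
TM = "http://example.org/toolmatch#"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text: str) -> str:
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def works_with_triples(tool_iri, tool_name, tool_homepage, dataset_iri, dataset_id):
    """Return (subject, predicate, object) triples recording that a tool
    works with a dataset, plus minimal descriptive information."""
    return [
        (iri(tool_iri), iri(TM + "worksWith"), iri(dataset_iri)),
        (iri(tool_iri), iri(TM + "name"), literal(tool_name)),
        (iri(tool_iri), iri(TM + "homepage"), iri(tool_homepage)),
        (iri(dataset_iri), iri(TM + "identifier"), literal(dataset_id)),
    ]


def to_ntriples(triples) -> str:
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = works_with_triples(
    tool_iri="http://example.org/tool/panoply",
    tool_name="Panoply",
    tool_homepage="http://example.org/panoply-homepage",  # placeholder URL
    dataset_iri="http://example.org/dataset/example",
    dataset_id="example-dataset-id",                      # placeholder identifier
)
print(to_ntriples(triples))
```

Note that the dataset carries only an identifier, while the tool carries a name and a homepage, mirroring the asymmetry described above.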
Requirements
- Tools can be either downloadable tools or online services.
- Datasets should be identifiable through either a GCMD DIF ID or a DOI.
- Reformatted data and reformatting services (WCS, OPeNDAP) should be considered when determining compatibility.
- A simple user interface should provide the ability to search for tools compatible with a given dataset.
- Users should be able to see a brief description of each tool.
- Users should be presented with a website for each tool in the search results.
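The search behaviour these requirements describe could look roughly like the following. It is a sketch under stated assumptions: the `Tool` record, the in-memory `COMPATIBILITY` list, the `find_tools` function, and every identifier, description, and URL in the sample data are hypothetical. The optional `capability` filter reflects the refinement mentioned in the use case (read, visualize, map, analyze, reformat).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool record covering the fields the requirements mention."""
    name: str
    description: str  # brief description shown to the user
    website: str      # link shown in the search results
    kind: str         # "downloadable" or "online service"


# Placeholder tools; names, descriptions, and URLs are invented for illustration.
viewer = Tool("ExampleViewer", "Desktop viewer for gridded data.",
              "http://example.org/viewer", "downloadable")
subsetter = Tool("ExampleSubsetter", "Web service that reformats and subsets data.",
                 "http://example.org/subsetter", "online service")

# Each entry: (dataset identifier, tool, capability). Identifiers are placeholders.
COMPATIBILITY = [
    ("doi:10.0000/example-dataset", viewer, "visualize"),
    ("doi:10.0000/example-dataset", subsetter, "reformat"),
    ("EXAMPLE_DIF_ID", viewer, "visualize"),
]


def find_tools(dataset_id: str, capability: Optional[str] = None) -> List[Tool]:
    """Return tools recorded as working with the dataset, optionally
    restricted to one capability."""
    return [
        tool
        for ds_id, tool, cap in COMPATIBILITY
        if ds_id == dataset_id and (capability is None or cap == capability)
    ]


for tool in find_tools("doi:10.0000/example-dataset"):
    print(f"{tool.name} ({tool.kind}): {tool.description} {tool.website}")
```

In a real deployment the list would presumably be replaced by a query against the triple store; the in-memory list simply keeps the example self-contained.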