Revision as of 13:10, January 9, 2012

Problem Statement and Use Case

For a given dataset, it is difficult to find the tools that can be used to work with it. In many cases, the information that Tool A works with Dataset B is somewhere on the Web, but not in a readily identifiable or discoverable form. In other cases, particularly for more general-purpose tools, the information does not exist at all until somebody tries the tool on a given dataset.

Thus, the simplest, most prevalent use case is for a user to search for the tools that can be used with a given dataset. A further refinement would be to specify what the tool can do with the dataset, e.g., read, visualize, map, analyze, reformat.

Proposed Solution

Two observations shape the solution. First, whether a tool is likely to work with a dataset can often be inferred through simple rules. For example, knowing that a dataset is available in netCDF/CF-1 and is on a lat/long grid is typically sufficient to infer that the data can be viewed through Panoply. Second, the problem lends itself to crowdsourcing: once one user has found a given tool to be usable with a given dataset, this holds true for all users, and so the information should be promulgated.
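The rule-based inference described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Dataset` fields, the `panoply_rule` function, and the `RULES` table are hypothetical names invented for this sketch, and the rule encodes nothing more than the netCDF/CF/lat-long example given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Minimal dataset description (hypothetical fields for this sketch)."""
    dataset_id: str    # e.g. a GCMD DIF ID or a DOI
    data_format: str   # e.g. "netCDF"
    convention: str    # e.g. "CF-1"
    grid: str          # e.g. "lat/long"


def panoply_rule(ds: Dataset) -> bool:
    """Assumed rule: netCDF + CF convention + lat/long grid suggests Panoply."""
    return (
        ds.data_format.lower() == "netcdf"
        and ds.convention.upper().startswith("CF")
        and ds.grid == "lat/long"
    )


# Map of tool name to the rule that predicts compatibility (assumed structure).
RULES = {"Panoply": panoply_rule}


def infer_tools(ds: Dataset) -> list:
    """Return the names of tools whose rule matches the dataset, sorted."""
    return sorted(tool for tool, rule in RULES.items() if rule(ds))


gridded = Dataset("example-id-1", "netCDF", "CF-1", "lat/long")
swath = Dataset("example-id-2", "HDF4", "", "swath")
print(infer_tools(gridded))
print(infer_tools(swath))
```

A match from such a rule indicates only that a tool is likely to work; a confirmation from a user who has actually tried the tool would be stronger evidence.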

We propose the construction of RDF triples that record the fact that a tool works with a particular dataset. The triples would be based on a simple ontology, with minimal information about the dataset (enough to uniquely identify it and present it as an option in a user interface). Slightly more information would be captured for the tool. A simple user interface would allow a user to select a dataset or paste in a unique dataset identifier.
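As a rough sketch of how such triples might be stored and queried, the Python below models each triple as a plain (subject, predicate, object) tuple rather than using an RDF library. Every identifier here is an assumption made for illustration: the `tm:` predicates, the `tool:` and `dataset:` names, and the placeholder description and URL are not part of any agreed ontology.

```python
# Hypothetical predicate names; the real ontology is not defined here.
WORKS_WITH = "tm:worksWith"
LABEL = "tm:label"
DESCRIPTION = "tm:description"
HOMEPAGE = "tm:homepage"

# Each triple is (subject, predicate, object). All values are placeholders.
TRIPLES = {
    ("tool:panoply", WORKS_WITH, "dataset:example-doi"),
    ("tool:panoply", LABEL, "Panoply"),
    ("tool:panoply", DESCRIPTION, "Placeholder description of the tool."),
    ("tool:panoply", HOMEPAGE, "https://example.org/panoply"),
    ("dataset:example-doi", LABEL, "Example gridded dataset"),
}


def value_of(subject, predicate, triples):
    """Return the first object for (subject, predicate), or None if absent."""
    for s, p, o in sorted(triples):
        if s == subject and p == predicate:
            return o
    return None


def tools_for(dataset_id, triples):
    """List the tools recorded as working with the given dataset identifier."""
    tools = sorted(
        s for s, p, o in triples if p == WORKS_WITH and o == dataset_id
    )
    return [
        {
            "tool": value_of(t, LABEL, triples),
            "description": value_of(t, DESCRIPTION, triples),
            "website": value_of(t, HOMEPAGE, triples),
        }
        for t in tools
    ]


for hit in tools_for("dataset:example-doi", TRIPLES):
    print(hit["tool"], "-", hit["description"], "-", hit["website"])
```

The dataset carries only an identifier and a label, while the tool carries a label, a description, and a website, which mirrors the proposal that slightly more information be captured for the tool than for the dataset.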

Requirements

* Tools can be either downloadable tools or online services.
* Datasets should be identifiable through either a GCMD DIF ID or a DOI.
* Reformatted data and reformatting services (WCS, OPeNDAP) should be considered when determining compatibility.
* A simple user interface should provide the ability to search for tools compatible with a given dataset.
* Users should be able to see a brief description of each tool.
* Users should be presented with a website for each tool in the search results.

@@ Line 10: / Line 10: @@
 = Requirements =
-* Datasets should be identifiable either through GCMD DIF ID or DOI
+* Tools can be either downloadable tools or online services
-* A sipmle interface
+* Datasets should be identifiable either through GCMD DIF ID or DOI.
+* Reformatted data and reformatting services (WCS, OPeNDAP) should be considered in compatibility.
+* A simple User interface should provide the ability to search for tools compatible with a certain dataset
+* Users should be able to see a brief description of the tool.
+* Users should be presented with a website for the tool in the search results.

Difference between revisions of "ToolMatch"
