Talkoot: Discover, Tag, Share, and Reuse Collaborative Science Workflows
A small but growing number of scientists and researchers are beginning to harness Web 2.0 technologies as a transformative way of doing science. Since communication is at the heart of science, these technologies give researchers easy mechanisms to critique, suggest, and share ideas, data, and algorithms. They complement formal means of sharing knowledge, such as conferences and published papers, where it is impossible to share all the research details and where negative results are rarely included. At the same time, science software developers have embraced the paradigm of Service-Oriented Architectures (SOA). Data processing, analysis, mining, and visualization algorithms are being converted into publicly available web services, giving researchers access to large suites of algorithms for data processing and science analysis. This model of chaining services to create analysis workflows gives the research community an unprecedented opportunity to collaborate: researchers can share their workflows with one another, reproduce and analyze research results, and leverage colleagues’ expertise to expedite scientific knowledge discovery. In many cases, the output of one workflow can be an input to others, leading to chained workflows with components shared by two or more researchers.
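The idea of chaining services into workflows, and of one researcher's workflow feeding another's, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Workflow` class, the owner names, and the three local functions that stand in for remote web services are hypothetical and are not part of Talkoot or of any particular service suite.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

# A "service" here is any callable that takes one input and returns one output.
# In a real SOA deployment each one would wrap a call to a remote web service.
Service = Callable[[Any], Any]


@dataclass
class Workflow:
    """An ordered chain of services; the output of each step feeds the next."""
    name: str
    owner: str
    steps: List[Service] = field(default_factory=list)

    def then(self, step: Service) -> "Workflow":
        # Return a new workflow so that a shared workflow is never mutated.
        return Workflow(self.name, self.owner, self.steps + [step])

    def __call__(self, data: Any) -> Any:
        # Run every step in order, passing each result to the next step.
        for step in self.steps:
            data = step(data)
        return data


# Hypothetical local stand-ins for remote processing and analysis services.
def subset_region(values: List[float]) -> List[float]:
    return values[1:-1]  # drop the first and last samples


def remove_missing(values: List[float]) -> List[float]:
    return [v for v in values if v >= 0]  # negative values mark missing data


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


# Researcher A shares a preprocessing workflow.
preprocess = (
    Workflow("preprocess", "researcher_a")
    .then(subset_region)
    .then(remove_missing)
)

# Researcher B reuses it as the first step of an analysis workflow.
# This works because a Workflow is itself callable, like any other service.
analysis = Workflow("regional_mean", "researcher_b").then(preprocess).then(mean)

result = analysis([9.0, 2.0, -1.0, 4.0, 6.0, 9.0])
print(result)
```

Because a workflow exposes the same call interface as a single service, a shared workflow can be dropped into another researcher's chain without modification, which is the kind of reuse the paragraph above describes.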
A crucial component still missing for this level of cooperation within the research community is a reusable, extensible, and customizable environment for building collaborative “open science” portals that manage these shared analysis workflows. Current collaborative portals have been one-time development efforts for specific science domains; they cannot easily be extended beyond their initial features or reused by other science domains. As part of a current NASA-ACCESS project, we are developing “Talkoot” (the Finnish word for “barn raising”), a customizable software appliance for building collaborative portals for Earth science services and analysis workflows. Talkoot will allow researchers, not just information technologists, to build collaborative sites around service workflows within a few hours. Talkoot leverages Drupal, an open architecture platform, to provide the core Content Management System capabilities required by an online collaborative portal. Drupal also has a vast array of specialized modules, developed by its user community, that provide additional features. Talkoot adds Earth science-specific modules that provide data searching, processing, analysis, and mining capabilities. We will demonstrate the current prototype of Talkoot, which allows users to create research experiments; select data sets; create data processing, analysis, and mining workflows; execute these workflows on remote compute resources; and share both the results and the workflows with other users. We will also present additional Talkoot features that are in development or planned.