Analytics Tools

From Federation of Earth Science Information Partners


| Analytics Tools/Techniques | Analytics | URLs | Description | Use Case | Reference |
| --- | --- | --- | --- | --- | --- |
| DISC | new start | www.pdl.cmu.edu/DISC | "We are formulating a plan ... 'Data-Intensive Super Computing' (DISC) systems." DISC systems differ from conventional supercomputers in their focus on data: they acquire and maintain continually changing data sets, in addition to performing large-scale computations over the data. | | |
| Dryad | new start | http://research.microsoft.com/en-us/projects/dryad | The Dryad Project is investigating programming models for writing parallel and distributed programs to scale from a small cluster to a large data-center. | | |
| MapReduce | parallelization | http://labs.google.com/papers/mapreduce.html | MapReduce is a programming paradigm, popularized by Google, that is widely used for processing large data sets in parallel. Its salient feature is that if a task can be formulated as a MapReduce, the user can perform it in parallel without writing any parallel code. | | |
| Hadoop | parallelization | http://hadoop.apache.org | The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures. | | |
| OpenCyc | knowledgebase | www.opencyc.org | EnterpriseCyc (or ECyc for short) is a commercial-grade, fully supported version of the knowledge base and reasoning technology, suitable for developing, deploying, and managing applications in an enterprise setting. The OpenCyc Platform is your gateway to the full power of Cyc, the world's largest and most complete general knowledge base and commonsense reasoning engine. OpenCyc contains hundreds of thousands of Cyc terms organized in a carefully designed ontology. Cycorp offers this ontology at no cost and encourages you to make use of, and extend, this ontology rather than starting your own from scratch. OpenCyc can be used as the basis of a wide variety of intelligent applications, such as rich domain modeling, semantic data integration, text understanding, domain-specific expert systems, and game AIs. | | |
| Powerset | knowledge query | www.powerset.com | Powerset was building a natural language search engine that could find targeted answers to user questions (as opposed to keyword-based search). | | |
| True Knowledge | knowledge query | www.trueknowledge.com | Evi was founded in August 2005, originally under the name of True Knowledge, with the mission of powering a new kind of search experience where users can access the world’s knowledge simply by asking for the information they need in a way that is completely natural. | | |
| WolframAlpha | knowledge query with calculations | www.wolframalpha.com | WolframAlpha introduces a fundamentally new way to get knowledge and answers: not by searching the web, but by doing dynamic computations based on a vast collection of built-in data, algorithms, and methods. | | |
| myGrid | sharing knowledge | www.mygrid.org.uk | A suite of tools designed to “help e-Scientists get on with science and get on with scientists”. The tools support the creation of e-laboratories and have been used in domains as diverse as systems biology, social science, music, astronomy, multimedia and chemistry. | | |
| UV-CDAT | visualization | http://uv-cdat.llnl.gov/ | UV-CDAT brings together two active projects, Ultrascale Visualization Climate Data Analysis Tools and Visual Data Exploration and Analysis of Ultra-large Climate Data, with the intent to deliver new capabilities to the climate-science community. This project’s vision is to provide large-scale visualization and analysis for both observational and model-generated climate data, with the goal of delivering new capabilities into the hands of the climate scientists. The integrated software product, the Ultrascale Visualization Climate Data Analysis Tools (UV-CDAT), is intended to be a powerful and complete front-end to a rich set of visual-data exploration and analysis capabilities well suited for climate-data analysis problems. UV-CDAT builds on the following key technologies: the Climate Data Analysis Tools (CDAT) framework, ParaView, VisTrails, and VisIt. | | |
| ClimatePipes | visualization | | ClimatePipes is a web-based application platform/"IDE" for science data analysis. It can be used to create and run analysis workflows and visualizations. | | |
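
The MapReduce entry in the table above notes that a task formulated as a MapReduce can run in parallel without the user writing any parallel code. A minimal single-machine sketch of that idea follows. It is not the Google or Hadoop API: `run_mapreduce`, `count_words`, and `sum_counts` are hypothetical names chosen for illustration, and the thread pool only shows where a real framework would distribute work (Python threads do not give true CPU parallelism).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_mapreduce(records, mapper, reducer, workers=4):
    """Toy driver (hypothetical helper, not a Hadoop or Google API)."""
    # Map phase: each record is handed to the mapper on a worker thread.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(mapper, records))

    # Shuffle phase: group every emitted value under its key.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)

    # Reduce phase: collapse each key's values into a single result.
    return {key: reducer(key, values) for key, values in groups.items()}


def count_words(line):
    """Mapper: emit (word, 1) for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def sum_counts(word, counts):
    """Reducer: total the occurrences recorded for one word."""
    return sum(counts)


# Illustrative input lines, invented for this sketch.
lines = ["sea ice extent", "sea surface temperature", "ice sheet"]
word_counts = run_mapreduce(lines, count_words, sum_counts)
print(word_counts["sea"], word_counts["ice"], word_counts["sheet"])
```

The user-supplied parts are only `count_words` and `sum_counts`; all scheduling lives in the driver, which is the division of labor the MapReduce description refers to.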