https://wiki.esipfed.org/w/api.php?action=feedcontributions&user=157.55.17.199&feedformat=atomEarth Science Information Partners (ESIP) - User contributions [en]2024-03-29T11:23:11ZUser contributionsMediaWiki 1.35.14https://wiki.esipfed.org/w/index.php?title=2012_AGU_ESSI_Session_Ideas&diff=403072012 AGU ESSI Session Ideas2012-07-27T22:25:27Z<p>157.55.17.199: Reverted edits by Dlm (talk) to last revision by Kmoe</p>
<hr />
<div>This wiki is intended to help the AGU Earth and Space Science Informatics (ESSI) Focus Group collaborate on themes and topics for the [http://fallmeeting.agu.org/2012/ December 2012 AGU Fall Meeting]. Please be sure to submit your session proposals to the official AGU Fall Meeting [http://agu-fm12.abstractcentral.com/ Session Proposal Site] by the April 20, 2012 deadline. Additional information on submission policies and guidelines can be found [http://fallmeeting.agu.org/2012/scientific-program/session-proposal-guide/ here]. Recent session titles and statistics from past AGU Informatics sessions can be found [[Past AGU ESSI Statistics| here]]. (add link to child page with the text below)<br />
<br />
'''How to add content:''' To contribute an idea to this page, [http://wiki.esipfed.org/index.php?title=Special:UserLogin&returnto=2012_AGU_Session_Ideas log in] or [http://wiki.esipfed.org/index.php/Special:RequestAccount request an account]. Once logged in, copy the session template below and click the edit tab on this page. Paste the template into the wiki text box and then fill out the requested information. <br />
<br />
'''Session template:''' <br />
=== Replace with Suggested Session Title=== <br />
* Description: <br />
* Name/Contact: <br />
* Others interested in similar session? If you are interested in co-convening or supporting this session, add your name here. <br />
<br />
==Session Ideas==<br />
<br />
===Earth and Space Science Informatics General Contributions===<br />
* Description: Each AGU section and focus group has a general purpose session where members can submit abstracts when their work does not appear to fit in the other session topics. <br />
* Name/Contact: Karen Moe/karen.moe@nasa.gov<br />
* Others interested in similar session?<br />
<br />
===Environmental Sensor networks and informatics===<br />
* Description: sensor network innovations and deployments plus closely related informatics<br />
*Contacts: Kirk Martinez, Jane Hart, Steve Foley<br />
<br />
===Linked Data for Earth and space science===<br />
* Description: The interdisciplinary nature of science is leading to an increasing need to integrate data from multiple sources. Linked Data is a methodology that addresses this problem and one that is becoming increasingly popular within the Earth and space sciences. This session aims to discuss the range of research approaches leveraging Linked Data. Submissions are encouraged in, but not limited to:<br />
** Outcomes of using Linked Data within the Earth and space sciences<br />
** Discussions of open source tools that aid/facilitate Linked Data<br />
** Reusable scientific vocabularies <br />
** Methods for computing similarity across linked data sets<br />
** Entity Disambiguation <br />
** User Interfaces/User interactions with Linked Data<br />
*Contacts: Tom Narock (Thomas.W.Narock@nasa.gov), Eric Rozell (rozele@rpi.edu)<br />
<br />
===Data and Service Brokering: mediating interactions across diverse resources in network-based systems=== <br />
* Description: Papers will be solicited that describe implementations of the brokering architectural style in which brokering services (mediation, services composition, distribution, etc.) facilitate the interconnection of clients and servers without the need for modifications on the part of the owners of those services.<br />
* Name/Contact: SiriJodha Khalsa (sjsk@nsidc.org), Stefano Nativi, Jay Pearlman<br />
* Others interested in similar session? If you are interested in co-convening or supporting this session, add your name here.<br />
<br />
===Knowledge Networks and Collaborative Platforms in the Earth Sciences===<br />
<br />
*Description: Increasingly interdisciplinary Earth science research requires infrastructures that can support knowledge integration on larger scales. This session will discuss novel use or case studies of cyberinfrastructure to facilitate collaboration, community building, governance, or knowledge sharing. Examples of possible submissions include:<br />
<br />
**Hubs or forges for Earth science software development, particularly if they integrate multiple projects<br />
**Knowledge sharing platforms deploying social networking tools<br />
**Portals designed to promote community participation in integrating models, data, or knowledge<br />
*Name/Contact: Sylvia Murphy (sylvia.murphy@noaa.gov), Paul Edwards (pne@umich.edu)<br />
*Others interested in similar session?<br />
<br />
=== Data Prospecting, Exploration and Mining – “big data” exploitation challenges and applications in Earth Science ===<br />
<br />
There are typically two categories of data analysis, namely, data exploration and data mining. Data exploration focuses on manual methods brought to bear on data analysis such as standard statistical analysis and visualization. Data exploration usually requires small datasets. Data mining, on the other hand, is defined as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data" (Fayyad et al, 2008). Data Mining uses automated algorithms to extract useful information. Humans guide these automated algorithms and specify algorithm parameters (training samples, clustering size, etc.). Large datasets typically require data mining.<br />
<br />
A new approach for exploiting "big data" is now possible with the availability of high performance computing and the advent of new techniques for efficient distributed file access. This new approach, termed “data prospecting,” combines methods from both data exploration and data mining. Just as prospecting focuses on locating a site within a vast area of land and determining the type of deposit located there, data prospecting focuses on finding the right subset of data among all the data files and determining the value of the information contained within that subset. Papers are also sought on Web-initiated, high-volume, computationally intensive data analysis capabilities that extract information at the source from distributed peta-scale data archives. Such papers may include non-linear dynamics in search of signals within climate archives. <br />
<br />
This session invites talks focusing on applications and challenges of exploiting “big data” using different data exploration, prospecting and mining approaches. Talks on tools addressing any of these topics are also welcome.<br />
<br />
Name/Contacts - Rahul Ramachandran (rahul.ramachandran@uah.edu), Sara Graves, Glenn K. Rutledge (glenn.rutledge@noaa.gov), and Kwo-sen Kuo<br />
<br />
=== NASA Open Source Summit for Science Data Systems === <br />
<br />
The consumption and production of open source software (OSS) is a widespread meme in the science data systems domain. These systems are long-lived, and are responsible for collecting, processing, distributing, discovering, reusing, and preserving scientific data. Open source components and software help NASA construct these systems at low cost and low risk for Earth science, planetary science, astronomy, and a number of other scientific domains. There are many opportunities for fostering broader community support, for exchanging ideas, and for collaboratively developing the next generation of science data systems in the open source domain. In this proposed session, we encourage reports of case studies, lessons learned, theories, experiences, challenges, and related topics in the development and use of OSS for scientific data systems. This session builds on last year’s widely successful oral AGU session IN30: Software Reuse and Open Source Software in Earth Science.<br />
<br />
Name/Contacts - Chris A. Mattmann and Robert R. Downs<br />
<br />
=== Data Scientists Come of Age ===<br />
<br />
A continuation of "Rise of the Data Scientist" from 2011.<br />
<br />
Many earth scientists are adapting their skills to effectively manage and curate complex digital data. Such skills are becoming the domain of data professionals, but such people rarely have an understanding of some special needs of earth science data. They are data scientists: people who are both specialists in data management and also have domain expertise on earth science data structures, formats, vocabularies, ontologies, etc. New programs are emerging in universities to develop and train such people. These experts, and those connected with them, are now developing formal professional structures to enable sharing of expertise and, more importantly, to gain formal recognition, promotion, and stable career paths. We seek contributions from these scientists.<br />
<br />
Name - Peter Fox and others welcome<br />
<br />
=== Technology Enabling Earth Science from Big Data to Small Satellites ===<br />
<br />
New technology will play a key role in enabling future Earth-observing missions. We welcome abstracts from scientists and technologists in the areas of satellite systems, advanced data processing and management that utilize information system advances to enable the scientific objectives of the NRC decadal survey for NASA and NOAA. The informatics domain stretches from techniques to capture, store, access, & analyze very large remote sensing data to technology for satellite control & dynamic sensor processing. Areas of interest include autonomy, onboard computing, data mining/fusion/assimilation, software tools & services, OSSE, uncertainty analysis, sensor webs, and community frameworks.<br />
<br />
Name/Contacts - Karen Moe, Jacqueline LeMoigne, Charles Norton</div>157.55.17.199https://wiki.esipfed.org/w/index.php?title=Agreed_Items_of_Discussion_on_Units&diff=40299Agreed Items of Discussion on Units2012-07-27T14:13:29Z<p>157.55.17.199: Reverted edits by EsipSysop (talk) to last revision by Cpufreak04</p>
<hr />
<div></div>157.55.17.199https://wiki.esipfed.org/w/index.php?title=Interagency_Data_Stewardship/Identifiers/UseCases&diff=40291Interagency Data Stewardship/Identifiers/UseCases2012-07-27T09:49:37Z<p>157.55.17.199: Reverted edits by Rduerr (talk) to last revision by Ctilmes</p>
<hr />
<div>Back to the [[Interagency_Data_Stewardship/Identifiers | Identifiers Testbed]] home page.<br />
<br />
== General Use Case ==<br />
<br />
An archive is responsible for a Data Type (in NASA EOS parlance, an ESDT). Call it Datatype '''DT'''. It has processed that data twice, each with a different version (either a major algorithm update, calibration update, or a similar update to the version of one of its inputs). Call those versions 1 and 2 (in NASA EOS parlance, Collection 1 or Collection 2). This therefore results in two Data Sets, '''DT1''' and '''DT2'''. Assume that '''DT1''' is a closed data set and will have no further changes, but '''DT2''' is an open data set, so could have additional granules added to it, or older broken granules removed or replaced.<br />
<br />
The archive maintains a database of metadata, and can present a web page of information for the data type '''DT''', as well as for each of the data sets '''DT1''' and '''DT2'''. To support that, it can produce URLs into its web site for each of those entities. That page can include textual information, structured metadata, links to download the data, etc.<br />
<br />
The curator does '''[#1]''' to construct an identifier for either the data type, or the data set, and uses '''[#2]''' mechanism to produce more specific identifiers from which the precise granule membership can be determined.<br />
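
The distinction between the closed data set '''DT1''' and the open data set '''DT2''' can be made concrete with a small model. The following is only a minimal sketch under stated assumptions: the class names, the granule names, and the idea of recording an added/removed time per granule are all hypothetical and are not part of any archive's actual system. It shows how a date/time stamp could be enough to recover the precise granule membership of an open data set.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Granule:
    name: str
    added: datetime
    removed: Optional[datetime] = None  # None means the granule is still present


@dataclass
class DataSet:
    identifier: str  # e.g. "DT2" (hypothetical identifier form)
    is_open: bool
    granules: list = field(default_factory=list)

    def membership_at(self, when: datetime) -> set:
        """Return the names of the granules that belonged to the set at `when`."""
        return {
            g.name
            for g in self.granules
            if g.added <= when and (g.removed is None or when < g.removed)
        }


def utc(*args) -> datetime:
    """Build a timezone-aware UTC datetime."""
    return datetime(*args, tzinfo=timezone.utc)


# Hypothetical history of the open data set DT2.
dt2 = DataSet(
    identifier="DT2",
    is_open=True,
    granules=[
        Granule("g001", added=utc(2009, 6, 1)),
        Granule("g002", added=utc(2009, 7, 1), removed=utc(2010, 3, 1)),  # broken granule, removed later
        Granule("g003", added=utc(2010, 2, 1)),  # added after the researcher's download
    ],
)

# The researcher's download time from the use case: Jan 1, 2010 at 2:00PM UTC.
download_time = utc(2010, 1, 1, 14, 0)
snapshot = dt2.membership_at(download_time)
print(sorted(snapshot))
```

For a closed data set such as '''DT1''' the membership never changes, so the same query returns the same answer for any time after the data set was closed.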
<br />
On Jan 1, 2010 at 2:00PM UTC, a researcher downloads some data and wants to cite it:<br />
<br />
'''[#3]''' All of the data from '''DT1'''<br />
<br />
'''[#4]''' All of the data from '''DT2'''<br />
<br />
'''[#5]''' Doesn't download any specific data, but refers to the data type in a general way.<br />
<br />
At some point in the future the data type is transferred to a new organization. This could happen in a couple of ways:<br />
<br />
'''[#6]''' the whole organization goes away, including the server that hosted the first level of identifiers.<br />
<br />
'''[#7]''' the archive remains active, but a particular data type is transferred to another archive.<br />
<br />
== DOIs ==<br />
<br />
Assume for now we use the [http://crossref.org/ CrossRef] DOIs. <br />
<br />
=== Central Registration === <br />
<br />
A central organization (ESIP Federation?) registers as a CrossRef organization (<tt>10.12345</tt>) and plays 'middle man' to register DOIs for us. The archive curator logs into the ESIP Federation site, enters some basic metadata, registers '''DT''', and is assigned a DOI. It could also register each of '''DT1''' and '''DT2''' if we want. <br />
<br />
'''[#3]''' Since '''DT1''' is closed, the DOI itself is enough to resolve the specific granule membership.<br />
<br />
Smith, John. ''Some Earth Science Data.'' DT1. DOI: 10.12345/DT1.<br />
<br />
This DOI gets pointed at a URI on the ESIP Fed site, which redirects to the Archive metadata page, something like<br />
<br />
<tt>http://archivea.agency.gov/DT1</tt><br />
<br />
'''[#4]''' Since '''DT2''' is open, the reference must be qualified by something: either a date/time stamp or some other identifier component.<br />
<br />
Smith, John. ''Some Earth Science Data.'' DT2. DOI: 10.12345/DT2, 2010-01-01T14:00:00.<br />
<br />
'''[#5]''' <br />
<br />
Smith, John. ''Some Earth Science Data.'' DT. DOI: 10.12345/DT.<br />
<br />
'''[#6]''' The curator logs into the ESIP Fed site and does a 'global' search/replace on some prefix of its identifiers for the new archive's URI scheme.<br />
<br />
'''[#7]''' The curator logs into the ESIP Fed site and updates the particular database record with the identifiers for the data type that is being moved.<br />
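
The citation forms used for cases '''[#3]''' to '''[#5]''' above differ only in whether a date/time qualifier is appended. As a rough illustration, the sketch below builds those strings; the function name <tt>format_citation</tt> and its parameters are assumptions made for this example and do not describe any existing ESIP or CrossRef tool.

```python
from datetime import datetime, timezone
from typing import Optional


def format_citation(author: str, title: str, name: str, doi: str,
                    is_open: bool = False,
                    accessed: Optional[datetime] = None) -> str:
    """Build a citation string in the style used on this page.

    Closed data sets and data types are cited by DOI alone; an open data
    set must be qualified with the (timezone-aware) access time.
    """
    citation = f"{author}. {title}. {name}. DOI: {doi}"
    if is_open:
        if accessed is None:
            raise ValueError("an open data set needs a qualifying date/time stamp")
        stamp = accessed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
        citation += f", {stamp}"
    return citation + "."


closed = format_citation("Smith, John", "Some Earth Science Data", "DT1", "10.12345/DT1")
opened = format_citation(
    "Smith, John", "Some Earth Science Data", "DT2", "10.12345/DT2",
    is_open=True,
    accessed=datetime(2010, 1, 1, 14, 0, tzinfo=timezone.utc),
)
print(closed)
print(opened)
```

Refusing to format an open data set without a time stamp is a design choice of this sketch: it makes it impossible to produce a citation from which the granule membership cannot be recovered.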
<br />
=== Distributed Registration ===<br />
<br />
Each archive individually joins CrossRef and gets assigned their own organization code and can assign their DOIs themselves. Cases '''[#3]''', '''[#4]''', and '''[#5]''' are similar to central, with one fewer redirection needed to resolve.<br />
<br />
'''[#6]''' and '''[#7]''' The curator for the archive updates the CrossRef database itself to point to the new archive.<br />
<br />
== PURLs ==<br />
<br />
This is very similar to the DOI case, and could be done centralized or distributed. We could either have each organization register for their own "purl.org" prefix, or set up an ESIP Fed. instance of the PURL resolver at something like "purl.esipfed.org" and everything would work identically. (Though we could put more policies/guidelines in place for "purl.esipfed.org" registration.)<br />
<br />
Assume the archive registers and gets the <tt>http://purl.org/net/archivea</tt> prefix.<br />
<br />
'''[#1]''' It assigns these identifiers:<br />
<br />
* <tt>http://purl.org/net/archivea/DT</tt><br />
* <tt>http://purl.org/net/archivea/DT1</tt> (or perhaps <tt>http://purl.org/net/archivea/DT/1</tt>?)<br />
* <tt>http://purl.org/net/archivea/DT2</tt><br />
<br />
'''[#3]''' <br />
Smith, John. ''Some Earth Science Data.'' DT1. <tt>http://purl.org/net/archivea/DT1</tt>.<br />
<br />
'''[#4]''' Since '''DT2''' is open, the reference must be qualified by something: either a date/time stamp or some other identifier component.<br />
<br />
Smith, John. ''Some Earth Science Data.'' DT2. <tt>http://purl.org/net/archivea/DT2/2010-01-01T14:00:00</tt>.<br />
<br />
'''[#5]''' <br />
<br />
Smith, John. ''Some Earth Science Data.'' DT. <tt>http://purl.org/net/archivea/DT</tt>.<br />
<br />
'''[#6]''' The curator logs into the PURL server and redirects the prefix as a whole to the new archive.<br />
<br />
'''[#7]''' Either the curator redirects the prefix for the data type at the PURL server level to the new archive, or maintains the redirection in its own server. In either case after redirection, the published PURLs still end up on the right pages at the new archive.<br />
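
The prefix redirection described for '''[#6]''' and '''[#7]''' can be pictured as a longest-prefix lookup table. The sketch below is a toy model only: the <tt>PurlResolver</tt> class is hypothetical and is not the API of the real PURL server software, and <tt>archiveb.example.org</tt> is a made-up host standing in for the new archive.

```python
class PurlResolver:
    """Toy longest-prefix redirect table (not the real PURL server)."""

    def __init__(self):
        self._targets = {}

    def register(self, purl_prefix: str, target_prefix: str) -> None:
        """Map a PURL prefix to a target URL prefix, replacing any old mapping."""
        self._targets[purl_prefix.rstrip("/")] = target_prefix.rstrip("/")

    def resolve(self, purl: str) -> str:
        """Rewrite `purl` using the longest prefix that matches on a path boundary."""
        matches = [
            p for p in self._targets
            if purl == p or purl.startswith(p + "/")
        ]
        if not matches:
            raise KeyError(purl)
        best = max(matches, key=len)
        return self._targets[best] + purl[len(best):]


base = "http://purl.org/net/archivea"
resolver = PurlResolver()

# Initial state: the whole prefix points at the original archive.
resolver.register(base, "http://archivea.agency.gov")
before = resolver.resolve(base + "/DT1")

# [#7]: only the data type DT (with its data sets) moves to a new archive.
for name in ("DT", "DT1", "DT2"):
    resolver.register(f"{base}/{name}", f"http://archiveb.example.org/{name}")

after = resolver.resolve(base + "/DT2/2010-01-01T14:00:00")
unmoved = resolver.resolve(base + "/OTHER")
print(before, after, unmoved, sep="\n")
```

Case '''[#6]''' corresponds to a single call that re-registers <tt>base</tt> itself with the new archive's address. Note that with the <tt>DT/1</tt> naming alternative mentioned above, one entry for <tt>DT</tt> would cover all of its data sets, whereas the <tt>DT1</tt> form needs one entry per data set.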
<br />
== DOI Use Cases ==<br />
<br />
=== Finding a data set referenced in a paper from the DOI in its citation ===<br />
<br />
While reading a journal article, a scientist comes across a reference to a data set that might be useful in their work. Since the authors of the paper had formally cited the data set, including a DOI for the data set as a whole, the user accesses the data set by either 1) using the DOI add-on to the Firefox browser; or 2) using one of the several DOI resolvers available on the web. In either case, the user is taken to the home web page for that version of the data set, a place from which data set metadata and documentation can be accessed and which also contains links to access the data, perhaps through several different mechanisms.<br />
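
Using a web resolver amounts to appending the DOI to the resolver's address. The following minimal sketch assumes the public proxy at <tt>https://doi.org/</tt>; the helper name <tt>doi_to_url</tt> is hypothetical, and <tt>10.12345</tt> is the placeholder prefix used on this page, so the resulting link is illustrative rather than a registered DOI.

```python
from urllib.parse import quote

DOI_PROXY = "https://doi.org/"  # assumed public DOI proxy resolver


def doi_to_url(doi: str) -> str:
    """Turn a DOI as printed in a citation into a resolvable URL."""
    doi = doi.strip()
    # Citations often print the DOI with a leading "DOI:" label.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode anything unsafe in a URL path, keeping the "/" separator.
    return DOI_PROXY + quote(doi, safe="/")


url = doi_to_url("DOI: 10.12345/DT1")
print(url)
```

Following that URL in a browser is what produces the redirect to the data set's home page described above.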
<br />
=== Accessing a data set discovered on the ESIP Federation data set registry ===<br />
<br />
While perusing the ESIP Federation data set registry (if we can get such a thing to exist), a user runs across a description of a data set they'd like to evaluate for use in their work. Part of the registry entry for the data set is its DOI. They click on this and are redirected to the home web page for that version of the data set.<br />
<br />
'''NOTES''': Such a registry could be at least populated with NSIDC metadata using OAI-PMH metadata harvesting...</div>157.55.17.199