Telecon 04.08.14 materials

From Earth Science Information Partners (ESIP)
Revision as of 08:33, April 11, 2014 by Wclenhardt

Below is a strawman set of slides/talking points based on our conversation today.

(Our manifesto that was presented to the BRDI is here: http://wiki.esipfed.org/index.php/File:BRDI_draft_03.11.14.docx.)


SLIDE: Our recommendation

In our manifesto presented to the BRDI we wrote:

"These challenges are potential opportunities to achieve progress in science, innovation, the economy, and broader society. To actually capture the value of our data, the Federation of Earth Science Information Partners (ESIP Federation or ESIP) calls upon the National Research Council (NRC) to conduct a study to determine strategic priorities for the scientific data enterprise. NRC surveys are considered the gold standard for advice on research programming [DSSS 2007] and offer an authoritative and unbiased assessment for strategic scientific investments. This study would inform and guide decision makers in the government, academia, and industry in helping to improve their practices and priorities for managing scientific data, giving the U.S. a boost in all impacted arenas.

This study should:

  • Synthesize and analyze prior work in data management/infrastructure, such as, what was successful, what was not successful, and why has this not been sufficient?
  • Take a broad perspective of the value of the scientific data enterprise and the infrastructure that supports it from the perspectives of societal benefit, economic competitiveness, and other important values
  • Provide a vision of what might be, then prioritize with conclusions and recommendations."

[What should we actually put here? I'm uncomfortable with this section. If we keep this entire quotation I'm afraid we'll get down in the weeds with this. -Anne]

[Noting Paul's comment about lack of convergence of these 3 points, the write up, and the title. I see that also. Any thoughts? -Anne]

SLIDE: We heard these issues from the BRDI

  • How would this study be different from past reports? Why would it be any more successful?
  • The linkage between scientific data and national interests is weak
  • The audience - who wants it? who would listen? who would pay?
  • A high level committee is difficult to achieve and would be very expensive
  • With respect to the draft, the title was not descriptive of the content


SLIDE: How would this study be different from past reports? Why would it be any more successful?

  • Will build on previous studies; synthesize existing, but go beyond
  • We argue prior studies reflect the stove-piped nature of how these problems have been addressed; not holistic in their approach
  • Focuses on cross domain and agency interoperability needed to address problems of societal interest
  • Argue we need a big[ger] leap, as incremental changes are not sufficiently meeting our needs


SLIDE: The linkage between scientific data and national interests is weak

  • Pressing major societal challenges need to be addressed by transdisciplinary science
    • Think climate change adaptation and resilience
    • Think understanding the linkages between the human genome and health
    • Think providing food for a world of 9 billion w/o using any more land, water, or energy than we currently do
  • These are cross cutting problems that span domains and agencies
    • Need to facilitate data interoperability across a variety of scales and across varied science domains
    • Need to solve the science data value add problem, i.e. metadata and semantic meaning
    • Need to facilitate longitudinal, cross disciplinary studies
    • Need to figure out alternatives to moving data around the network
    • [Data security]
  • We need a national, cross-agency, robust scientific data infrastructure, but lack the capabilities to support that


SLIDE: The audience - who wants it? who would listen? who would pay?

  • Researchers need help
    • The 80/20 rule - 80% of resources are spent on data management, 20% on science
    • Many scientists report not using a data set because it was too hard to access or understand [cite EarthCube studies]
    • Can also point to EarthCube domain workshops which show the science drivers are fundamentally cross-domain
      • e.g. the sedimentary folks need to interact with the hydrologists, the paleogeologists, the biologists, the crustal dynamics folks, etc.
  • Agencies need help
    • Agencies need the impetus and political cover to work in a more coordinated way
    • Agencies are continually being asked to manage more data with less funding
    • Funding is not sustained or reliable
    • Funding almost inherently promotes stovepipes
  • "OSTP is already working on open data"
    • Validates one point of our argument
    • Our scope also includes: discovery, understandability, citation, transparency
    • A change of administration means that effort could be tabled
    • But need to push a more comprehensive vision of how to approach the problem


SLIDE: A high level committee is difficult to achieve and would be very expensive

  • The ROI of investing in data science may be very high
    • Jisc has just published the synthesis report of the value & impact studies of the Economic and Social Data Service (ESDS), the Archaeology Data Service (ADS), and the British Atmospheric Data Centre (BADC): http://repository.jisc.ac.uk/5568/1/iDF308_-_Digital_Infrastructure_Directions_Report%2C_Jan14_v1-04.pdf
      • Quantitative analysis Results
        • The value to users exceeds the investment made in data sharing and curation via the centres in all three cases – with the benefits from 2.2 to 2.7 times the costs;
        • Very significant increases in work efficiency are realised by users as a result of their use of the data centres – with efficiency gains from 2 to 20 times the costs; and
        • By facilitating additional use, the data centres significantly increase the returns on investment in the creation/collection of the data hosted – with increases in returns from 2 to 12 times the costs.
      • Qualitative analysis Results
        • Academic users report that the centres are very or extremely important for their research. Between 53% and 61% of respondents across the three surveys reported that it would have a major or severe impact on their work if they could not access the data and services; and
        • For depositors, having the data preserved for the long-term and its dissemination being targeted to the academic community are seen as the most beneficial aspects of depositing data with the centres.
      • "An important aim of the studies was to contribute to the further development of impact evaluation methods that can provide estimates of the value and benefits of research data sharing and curation infrastructure investments. This synthesis reflects on lessons learnt and provides a set of recommendations that could help develop future studies of this type."
    • A 2013 study of the generative economic and social value of Open Government Data (OGD) lists technical connectivity as a key enabler of the OGD value generated [ECIS 2013]
    • Thus, it might make financial sense to invest the resources
  • Cost of inaction
    • Unrealized generative value of data - the “capacity to produce unanticipated change through unfiltered contributions from broad and varied audiences” [Wilbanks, 2010]
    • Lack of access to dark data, where knowledge of the data is held by only a select few: "information is available only to a small set of people and they can pervert the process." [IBE (Wilbanks) 2012]
    • Failure to act impinges on national interest, including:
      • The US's ability to respond to natural and manmade disasters, i.e. resilience and sustainability
      • Our scientific competitiveness (e.g. China is projected to outpace the world in genome science/technology)
      • A data [science] literate workforce
      • Innovation and economic development
  • Besides expense, why is it difficult to achieve such a committee? Are there points we can address there?

SLIDE: With respect to the draft, the title was not descriptive of the content

  • We welcome working with BRDI to fine-tune this
  • We have some additional ideas, but we would like to reach a final consensus on the drivers and study approach
  • This also ties back to determining the audience

SLIDE: Next steps?

  • Would like to work closely with BRDI on developing and realizing the vision; BRDI support is critical
  • Help us to fine-tune the message; partner going forward
  • Would be beneficial to find some small funding to support a student to do some synthesis research looking at prior studies
  • Finalize our workshop white paper