RDA Big Data/Analytics

From Earth Science Information Partners (ESIP)
Revision as of 14:10, May 9, 2013 by Rramachandran

Big Data Definitions

Gartner’s big data definition: “Big data” is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.

Variety: companies are digging out amazing insights from text, locations or log files (multi-sensor data for science).

Velocity is the most misunderstood data characteristic: it is frequently equated to real-time analytics. Yet velocity is also about the rate of change, about linking data sets that arrive at different speeds, and about bursts of activity (data fusion issues: time, space, resolution).

Volume is about the number of big data mentions in the press and social media.

Jim Frew:

My favorite definition is: You can't move it---if you want to use it, you have to go where it is (kind of like a pipe organ...)

For data generally, that means it has to be housed in a system that can do everything you'd want to do to it.

I.e., you send the problem to the data, not vice versa.

For science data specifically, this means the data has to live in a reasonably complex processing environment.
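
The "send the problem to the data" idea can be illustrated with a minimal Python sketch. Everything here is hypothetical and for illustration only: `DataHost`, its `run` method, and `mean_above` are invented names, and an in-memory list stands in for a dataset too large to move. The point is only that the function travels to the host and a small result travels back.

```python
from dataclasses import dataclass, field
from typing import Callable, List, TypeVar

T = TypeVar("T")


@dataclass
class DataHost:
    """Hypothetical stand-in for a system that houses a large dataset."""
    name: str
    records: List[float] = field(default_factory=list)

    def run(self, problem: Callable[[List[float]], T]) -> T:
        # The computation executes where the data lives;
        # only its (small) result is returned to the caller.
        return problem(self.records)


def mean_above(threshold: float) -> Callable[[List[float]], float]:
    """Build a small 'problem' that can be shipped to a host."""
    def problem(records: List[float]) -> float:
        selected = [r for r in records if r > threshold]
        return sum(selected) / len(selected) if selected else float("nan")
    return problem


# Assumed toy data: one million values that never leave the host.
host = DataHost("archive", [float(i) for i in range(1_000_000)])
result = host.run(mean_above(999_990.0))
print(result)
```

In a real science data system the host would be a processing environment co-located with the archive rather than a Python object, but the division of labor is the same: the caller supplies the analysis, the host supplies the data and the compute.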