Output from Visualization Summit

From Earth Science Information Partners (ESIP)
<h1>Responses to Key Visualization Questions</h1>
<p><em>Still in progress</em></p>
<div class="writeboardbody">

  <p><em>Earth Data Visualization Summit. Santa Barbara, CA. October 26-27, 2009.
Participants: Bruce Caron, Marty Landsfeld, Kevin Ward, Tommy Jasmin, Kevin Hussey, Jeff McWhirter, Robert Simmon, Marit Jentoft-Nilsen, Eric Russell, Suresh Santhanavannan, Tom Rink, John Moreland, David Nadeau, Chris Torrence.
This summit was funded by a NASA REASoN grant: NNX06AB08A.</em></p>
<h1>Overall questions</h1>

<h2>Question 1: Imagine the perfect earth data remote sensing visualization tool/system. What are the main components of this system?</h2>

<ul>
<li>Needs to closely track needs and abilities of a variety of audiences</li>
<li>Support multiple GUIs (meaning <span class="caps">API</span>-based back-end to support tools?)</li>
<li>Open Source</li>
<li>Solves data format/standards problem &#8212; new data type? derivative data types?</li>
<li>Social web, community-based: support collaborative workflow and data discovery</li>
<li>One-stop repository to support aggregate querying &#8212; e.g., What is the causality relationship between meningitis outbreaks and precipitation patterns in sub-Saharan Africa?</li>
<li>Provide ability to store and track provenance</li>
</ul>
<h2>Question 2: Visualization for understanding vs. conveying &#8212; It&#8217;s all about the user: How should tools and visualizations be tailored for distinct user groups?</h2>

<p>Two different issues:</p>

<ul>
<li>Conveying &#8212; you know the answer, need to display it in a good way. &#8220;Static&#8221;</li>
<li>Understanding &#8212; trying to find the answer via exploration. Needs more facilities. &#8220;Interactive&#8221;</li>
</ul>
<p>Hard to build a general-purpose tool that satisfies both power users and non-power users across disciplines. The same tool needs to scale with abilities.</p>

<h2>Question 3: Are there user groups that you know are under-served by the current data visualization technology? What needs to change to serve these groups?</h2>

<ul>
 
<li>Everyone is underserved, but not equally.</li>
<li>PIs/those closest to data are best served by way of their familiarity with the data.</li>
<li>Solving this problem is an argument in favor of plug-in based development.</li>
<li>Tools need to be developed to support each audience &#8212; different audiences have wildly different needs in terms of capabilities and end result.</li>
<li>Of course, serving user groups only comes after they have located the data.</li>
</ul>
<h2>Question 4: Delivering data vs. images of data. What are the sweet spots for each? What are the areas where we need to focus or change?</h2>

<ul>
<li>It depends on audience: images <strong>and</strong> data can serve both &#8220;public&#8221; and scientific audiences<br />
- In general images are great for the general public or less interested specialists (maybe 80%), data needs to be available for the interested 20%.<br />
- The <a href="http://www.ifp.illinois.edu/nabhcs/abstracts/shneiderman.html">Ben Shneiderman UI design mantra</a> &#8220;overview, zoom &amp; filter, details on demand&#8221; applies in this case.</li>
</ul>
<p>Images sweet spots</p>

<ul>
<li>As a means of discovery and monitoring production</li>
<li>Able to meet needs of wider audience [formats and general ease of use]</li>
</ul>
<p>Data sweet spots</p>

<ul>
<li>Scientist users who create their own visualizations</li>
<li>maintaining metadata (provenance)</li>
<li>can be used to derive multiple visualizations</li>
</ul>
<h2>Question 5: How can visualizers bridge the science-outreach divide? How to teach the public about science and the scientists about the public?</h2>

<p>Visualization is a compelling medium that science communicators can use to make complex scientific ideas approachable to a broad audience. Carl Sagan’s Cosmos series is the prototypical example, weaving visuals with narrative to explain astrophysics. It is crucial to define an audience: there is no such thing as a uniform general public. Concrete visualizations, such as planet walks and painted lines representing sea level rise, can be particularly effective. Effective visualization requires focus: emphasize the important elements of a dataset, and de-emphasize or eliminate the less important data. Ideally, tools would be designed by visualizers, not computer scientists. Strive for verisimilitude: make things appear the way the audience expects them to appear (for example, Google Earth’s discontinuous boundaries between scenes are very distracting).</p>

<p>(T1)<br />
Perhaps current outreach programs are too top-down; we need a tighter, iterative relationship between viz developers, scientists, and the outreach audience. We didn’t like the notion that the public can’t understand science, and considered the question: aren’t scientists part of the public? It’s important to bring scientists and their work into the community. Can effective visualization facilitate the two-way motivation needed between scientists and the community they serve?</p>

<p>Features for tools: easy to read, with embedded descriptions alongside the displays. A simple initial presentation, but one that allows progressive disclosure of information and concepts as far as the user desires.</p>

<p>(T2)</p>
 
<ul>
<li>Tune the visualization to the audience &#8212; but how?<br />
      &#8211; use of toolkits, plug-ins, component frameworks<br />
      &#8211; providing different presentations of the same data</li>
</ul>
<ul>
<li>Provide <span class="caps">HTML</span>/Flash/<span class="caps">CMS</span> for data<br />
      &#8211; easier mechanism for scientists/educators to convey info<br />
      &#8211; equivalent of wiki or “build your own web page” for general public</li>
</ul>
 
<p>Part B: How to teach the public about science and the scientists about the public?</p>

<ul>
<li>Force scientists, through <span class="caps">NSF</span>, etc., with new requirements on publishing</li>
<li>Use emerging technologies like social media</li>
<li>Enable new ways of publishing<br />
      – e.g., a data <span class="caps">CMS</span>, like <span class="caps">RAMADDA</span></li>
</ul>
<p>General comments:</p>

<ul>
<li>Visualization is conveying information</li>
<li>Teacher knows the answer, needs to find a way to convey it</li>
<li>Scientist does not know the answer, needs exploratory analysis</li>
<li>There is a declining importance of traditional journal publications</li>
</ul>
<h1>Current Issues</h1>

<h2>Question 1C: You tackle visualization tasks every day. What is the one thing that you do every day that needs to be different in 5 years for your life to improve?</h2>

<ul>
<li>data search and retrieval</li>
<li>one data format to rule them all &#8212; standards-based in all aspects (data structure, metadata)</li>
</ul>

<h2>Question 2C: Name the three (or more) top obstacles today that limit the effective development/deployment of earth data visualizations.</h2>

<h2>Question 3C: State of gridded data fusion capabilities: what are the main obstacles/opportunities?</h2>
<p>(T1)<br />
Error tracking and error propagation during data fusion have to be done and provided to the user. For example, when resampling data for fusion we need to keep track of the errors and how they affect the fused grid.</p>

<p>Knowledge embedded in the tool to facilitate data fusion: a sort of knowledge-based system that recommends to the user how the data fusion should be done.</p>

<p>Provide raw data, processed re-gridded data, and data fusion software packaged together, so that users have all the pieces needed for data fusion.</p>

<p>Algorithms/tools/gridding are data dependent. Which techniques are used depends on the data brought together.</p>

<p>Embed data units, scaling, and offsets within the data to facilitate fusion. We have data in different units; for example, we have temperature data in Fahrenheit and in Celsius. Scaling and data unit conversion have to be embedded in the data or interpreted readily by the tool.</p>

<p>Strengthen interaction between standards for data fusion: OPeNDAP to <span class="caps">OGC</span> to web services.</p>
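<p>The embedded-metadata idea above can be sketched in a few lines. The following is a minimal, illustrative Python sketch (ours, not from the summit notes); the attribute names ("units", "scale_factor", "add_offset") follow the CF/netCDF packing convention, and the small conversion table is hypothetical:</p>

```python
# Minimal sketch: a field that carries its own units/scale/offset metadata,
# so a fusion tool can convert automatically instead of guessing.
# Attribute names follow the CF/netCDF convention; the conversion table
# below is illustrative only.

def to_kelvin(values, attrs):
    """Unpack stored values and convert to Kelvin using embedded metadata."""
    # Apply packing metadata first: physical = stored * scale + offset.
    physical = [v * attrs.get("scale_factor", 1.0) + attrs.get("add_offset", 0.0)
                for v in values]
    unit = attrs["units"]
    if unit == "K":
        return physical
    if unit == "degC":
        return [v + 273.15 for v in physical]
    if unit == "degF":
        return [(v - 32.0) * 5.0 / 9.0 + 273.15 for v in physical]
    raise ValueError("unknown unit: " + unit)

# Two temperature fields in different units fuse cleanly once both are
# interpreted through their embedded metadata (0 degC == 32 degF, etc.).
celsius_field = {"values": [0.0, 25.0], "attrs": {"units": "degC"}}
fahrenheit_field = {"values": [32.0, 77.0], "attrs": {"units": "degF"}}

a = to_kelvin(celsius_field["values"], celsius_field["attrs"])
b = to_kelvin(fahrenheit_field["values"], fahrenheit_field["attrs"])
```

<p>Real tools would also propagate the error metadata discussed above alongside the values; this sketch shows only the unit side.</p>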
<p>(T2)<br />
How do we get the grid itself? Some domains end up creating gridded data from point observation data. How do the different algorithms and parametrizations affect the outcome? How do we show the provenance?</p>

<p>How to handle large-scale grids: tiling, etc. It&#8217;s not just a visualization issue; we need integrated data systems to deal with this issue, not just client applications.</p>

<p>Services include gridding (e.g., Barnes objective analysis), varying temporal and spatial resolutions, resampling, irregular and unstructured grids, and pushing analysis onto the server due to the data set size. We need to stream them, etc.</p>

<p>For example, in NCEP&#8217;s global 0.5 degree <span class="caps">GFS</span> model, a single 3D field has dimensions of 720×361×26 with 61 time steps. This results in 412,233,120 points, or about 1.6 GB of data per field. Lots of data!</p>
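<p>The arithmetic behind that GFS figure is easy to check with a back-of-the-envelope Python sketch; the 4-bytes-per-value (single-precision float) assumption is ours, not stated in the notes:</p>

```python
# Size of a single 3D field from NCEP's 0.5-degree global GFS model,
# as cited above: a 720 x 361 lat/lon grid, 26 levels, 61 time steps.
nx, ny, nz, nt = 720, 361, 26, 61

points = nx * ny * nz * nt          # 412233120 points, matching the notes

# Assuming 4 bytes per value (single-precision float) -- our assumption.
gigabytes = points * 4 / 1e9        # roughly 1.6 GB per field
```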
<p>(T3)<br />
Grids are a sampling of an underlying continuous function: reality. That sampling has aliasing artifacts, and those artifacts may vary from point to point within the data (imagine a satellite image taken at an angle across the Earth&#8217;s surface; the sample size varies from one side of the image to the other).</p>

<p>Grid-to-grid fusion often involves resampling those grids to merge them into a common grid. This introduces more aliasing artifacts. To reduce those artifacts we should use an interpolation function that models the underlying continuous function. However, we often do not. Perhaps the &#8220;correct&#8221; interpolation function is not known, is a subject of debate, or is not available in the software.</p>

<p>To get a handle on the artifacts introduced, and on the interpolation function to use (or not use), we need to track the source of the data and the propagation of error. This becomes a file format issue, because files often store the data result but not the path to that result or the error function.</p>
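<p>To make the resampling point concrete, here is a minimal, illustrative Python sketch (ours, not from the notes) comparing nearest-neighbor and linear resampling of a 1-D field. The better the interpolation models the underlying continuous function, the fewer artifacts the fused grid inherits:</p>

```python
# Minimal sketch of the resampling step in grid-to-grid fusion, and why the
# choice of interpolation function matters. Pure Python, 1-D for clarity;
# real tools work on 2-D/3-D grids, but the aliasing issue is the same.

def resample_nearest(samples, n_out):
    """Resample to n_out points by taking the nearest input sample."""
    n_in = len(samples)
    return [samples[round(i * (n_in - 1) / (n_out - 1))] for i in range(n_out)]

def resample_linear(samples, n_out):
    """Resample to n_out points with linear interpolation, which models a
    smooth underlying function better and so produces fewer blocky artifacts."""
    n_in = len(samples)
    out = []
    for i in range(n_out):
        x = i * (n_in - 1) / (n_out - 1)      # position in input coordinates
        lo = int(x)
        hi = min(lo + 1, n_in - 1)
        frac = x - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

ramp = [0.0, 1.0, 2.0, 3.0]                   # a smooth underlying field
nearest = resample_nearest(ramp, 7)
linear = resample_linear(ramp, 7)
# Linear interpolation recovers the ramp exactly; nearest-neighbor introduces
# stair-step artifacts -- a simple instance of the aliasing discussed above.
```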
<h2>Question 4C: Gridded data and <span class="caps">GIS</span>&#8230; getting these to talk to each other. How to make smart maps of <span class="caps">GIS</span> point and polygon data <span class="caps">AND</span> gridded satellite data.</h2>

<h1>Future Development Ideas</h1>

<h2>Question 1F: 2D, 3D, 4D: What&#8217;s the future of data display?</h2>
<p>(T1)<br />
- Data, software, and hardware can all have different dimensionality.</p>

<p>- How soon, if ever, will 3-spatial-dimension displays (stereo glasses, etc.) become common?</p>

<p>- Issues of user interaction/control of displays; augmented reality (such as iPhone location apps), where the user&#8217;s location in the real world drives position in a 3-spatial-dimension world shown on 2D hardware.</p>

<p>- Single-user vs. collaborative displays: for a single user you could have level-of-detail optimizations, where eye look direction drives the detail of the display.</p>
 
<p>(T2)<br />
Google Earth, but with data.<br />
Immersive technology:<br />
2D – image, contour<br />
3D – volume, surface<br />
4D – 3D + time, animated<br />
5D – multiple parameters</p>

<p>The user interacts with the data. Show data in different ways, both geographically and analytically; not just pretty pictures.<br />
Probing<br />
Transects through data<br />
Scatter plot<br />
Slice &amp; dice<br />
Time series analysis<br />
Multiple linked views into the same data (“4-up”)<br />
Geographic displays coupled with charts</p>
<p>For doing science, 3D has a problem with perspective.</p>

<p>Exploration capability. Educators are adopting 3D via Google Earth.</p>

<p>It seems like we have the tools. But on a 2D display, you can control pixel color, transparency, and glyph; that&#8217;s it.</p>

<p>(T3)<br />
Our consensus is that 3D and beyond are not a good use of resources, and that visualization efforts should concentrate on 2D and 2.5D. 3D is desirable, but offers very little return on the massive effort required.</p>
<p>There are many reasons for this:</p>

<p>Humans are really poor at perceiving depth. Studies have shown that humans do not perceive more than 6 depths at one time. Human eyes are rarely of the same strength, which contributes to depth perception problems.</p>

<p>Computer technology, at least at this point, does a very poor job of creating the illusion of depth.</p>

<p>Human perception in the Z plane is only about 10% of our 2D perception strength.</p>

<h2>Question 2F: Java/Spring/<span class="caps">AJAX</span> vs. Flex/Flash&#8230; what will rule in 5 years? What are the issues that need to be watched in terms of data visualization?</h2>
<p>(Tri)<br />
All of these technologies have problems. Some are heavyweight and work well for applications, but not on the web (Java). Some are overly complex (<span class="caps">AJAX</span>). Some are proprietary or nearly so (Flash, Silverlight). Often they lack well-defined toolkits for building effective user interfaces. Weak standards support among browsers (such as IE) complicates matters.</p>

<p>An evolving trend is <span class="caps">HTML</span> 5 and JavaScript, plus JavaScript-based toolkits. These cover up browser quirks, leverage the scheduling and sandboxing features of browser JavaScript engines, and provide a client-side <span class="caps">GUI</span>. These toolkits, as they emerge and mature, may provide a good technology for building vis tools accessed via browsers on platforms from desktops to iPhones.</p>
 
<p>(Square)<br />
The level of programming skill needed to produce rich applications would be a factor for novice developers. There&#8217;s some relationship to the previous question concerning presentation vs. interactive data analysis: Flash and Java seem well established, respectively, in dealing with these broad user categories. Given the popularity of small devices and social networking, are Flash, Java, and others amenable to these environments? With the current pace of technological acceleration, 5 years could be a little outside a useful prediction.</p>

<p>(T3)</p>
<ul>
<li>Both will be around and still viable/popular</li>
</ul>

<ul>
<li>Plug-in presence used to be an issue, but much less so now</li>
</ul>

<ul>
<li>Other technologies that may emerge as contenders<br />
      – Canvas with video/audio <span class="caps">HTML</span> 5 tags<br />
      – Silverlight (less likely to dominate <span class="caps">IOO</span>)</li>
</ul>
<ul>
<li>Issues to watch in terms of data visualization<br />
      – ability to save/export to alternate formats,<br />
      e.g., export a Flex app as an iPhone app<br />
      – capabilities that will emerge with Canvas<br />
      – <span class="caps">API</span> (specifically relating to data import) enhancements</li>
</ul>
<h2>Question 3F: Standards for viewing earth data: what is the future&#8230; <span class="caps">JPEG</span> 2000? GeoTIFF? <span class="caps">KML</span>? Where are we going?</h2>

<p>(T3)<br />
The data <strong>formats</strong> are generally adequate, but the structure of the data is often inadequate. GeoTIFF can easily be abused, <span class="caps">JPEG</span> 2000 isn&#8217;t well supported, and <span class="caps">KML</span> is useful but primitive (limited support for map projections). Hopefully we&#8217;ll refine existing standards rather than proliferating poorly supported ones.</p>
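<p>For a sense of why <span class="caps">KML</span> is both useful and primitive, here is a minimal, illustrative Python sketch (ours, not from the notes) emitting a one-placemark KML document. The simplicity is the appeal; the constraint is that coordinates must be WGS84 lon,lat, with no alternative map projections:</p>

```python
# Minimal sketch: emitting a KML point placemark from any tool.
# KML 2.2 uses the OGC namespace below; coordinates are lon,lat in WGS84.
from xml.sax.saxutils import escape

def placemark_kml(name, lon, lat):
    """Return a minimal KML document containing one point placemark."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        '  <Placemark>\n'
        f'    <name>{escape(name)}</name>\n'
        f'    <Point><coordinates>{lon},{lat}</coordinates></Point>\n'
        '  </Placemark>\n'
        '</kml>\n'
    )

# Example: the summit venue (coordinates approximate).
doc = placemark_kml("Santa Barbara, CA", -119.70, 34.42)
```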
<h2>Question 4F: Data and video: Thoughts on building animations for video distribution. Where is this going?</h2>

<p>(Red)<br />
Tools need to have more batch-processing capabilities.</p>

<p>Automated production (creation and serving from the web) vs. one-off visualizations.</p>

<p>Time series data vs. fly-throughs: which animations need human intervention?</p>

<p>There is a distinction between animations and video with voice-over, music, and closed captioning. Video takes much more time and resources.</p>

<p>(Green)<br />
We agree that if the data set is appropriate for time series distribution, applications should contain a component to output animation video.</p>
 
<p>(Blue)<br />
Video is dying. It&#8217;s been years since we produced DVDs or tape. When a video is created, it&#8217;s in MPEG-4, <span class="caps">AVI</span>, or whatever format can be played in PowerPoint or on a web page.</p>

<p>However, the trend is away from these canned video presentations and toward live demos of visualization software run on the presenter&#8217;s laptop. This is often more credible, and it allows the presenter to adapt their presentation up to the last minute before their talk, or during it.</p>
<h2>Question 5F: Open Source / <span class="caps">COTS</span> vs. custom tools &#8212; issues with not-invented-here and the ability to write plugins for existing packages (or is the future going to continue to be composed of an ever-expanding repertoire of software)?</h2>

<p>(T2)<br />
Significant factors:</p>

<ul>
<li>Politics are a major driver: motivation to take credit for tangible results, branding, etc.</li>
</ul>

<p>Assessment:</p>

<ul>
<li>The status quo will likely continue. The proliferation of semi-redundant software is probably, on balance, a positive:<br />
      – refinement<br />
      – reinforces successes.</li>
</ul>
<p>(Red)<br />
Too often scientists, etc., need to get their work done and do not (or cannot) have the luxury of time to use <span class="caps">COTS</span> or to do effective software engineering and management, i.e., to do the right thing from a software engineering perspective.</p>

<p>This is neither good nor bad; it is just the reality of the way things work.</p>

<p>(Green)<br />
Open Source:<br />
Has well-documented code and an <span class="caps">API</span>.<br />
Has a good design architecture for extension and customization, i.e., a plug-in capability, so there is no need to change or access system software.<br />
Has good community involvement. Has anyone been able to do all of this?<br />
If yes to the above, then why not use it?</p>

<p><span class="caps">COTS</span>: issues with proprietary data formats are a problem. In either case, support for software issues may be out of the user&#8217;s control.</p>

<p>Custom tools can be highly optimized to do a few tasks very well, from the UI level to the rendering level. A good visualization system would allow developers to extend the system at these different levels.</p>

</div>
 
is the future going to continue to be comprised of an ever-expanding
 
repertoire of software)?
 
 
 
(T2)
 
Significant factors:
 
 
 
  * Politics are a major driver. Motivation to take credit for
 
    tangible results, branding etc.
 
 
 
Assessment:
 
 
 
  * Status quo will likely continue. Proliferation of semi-redundant
 
    software is probably overall a positive.
 
    – refinement
 
    – reinforces successes.
 
 
 
(Red)
 
Too often scientists, etc., need to get their work done and do not (or
 
cannot) have the luxury of time to use COTS, to do effective software
 
engineering and management, i.e., to do the right thing from a
 
software engineering perspective.
 
 
 
This is neither good nor bad – it is just the reality of the
 
way things work.
 
 
 
(Green)
 
Open Source
 
Has well documented code, and API
 
Good design architecture for extension for customization ie, has a
 
plug-in capability. No need to change or access system software.
 
Good community involvement, has anyone been able to do this.
 
If above yes, then why not use?
 
 
 
COTS: issues with propriety data formats a problem, in either case,
 
support for issues with software may be out of the users control.
 
 
 
Custom tools can be very optimized to do a few tasks very well, from
 
UI level to rendering level. A good visualization system would allow
 
developers to extend the system at these different levels.
 
 
 
Book related questions
 
 
 
Question 1B: Specific content suggestions

  • Data / metadata standards and conventions.
  • Standards and conventions for visual display (color map issues / conflicts between domains).
  • Color maps – selection based upon different intents (perception of different colors; generic conventions: “blue == cold”, “red == hot”; “rainbow” is problematic, etc.).
  • Software interoperability (and its relation to file format standards and conventions).
  • Survey of existing tools: What’s the current state of the technology? What software is available now? What needs to be fixed?
  • Future directions: What do we need to do better?
  • Common (dimension-independent) issues vs. problems specific to 2D or 3D data.
 
 
 
(T3)

  • How to incentivize sharing of data in usable formats: a) require it; b) build an online community with a ranking system for contributors; c) build the best tool, and providers will want to comply (Google Earth is a good example – everyone wants to get their data working in GE now).
  • The most common current problems with the status quo: data retrieval, data formats, etc.
  • Current state-of-the-art tool summaries.
  • Post-meeting, have all attendees agree on the book chapter outline.

(Tri)

  • Should discuss file format/interoperability issues in an abstract way.
  • Examples of successes can be partial successes, such as NetCDF, which has a very flexible syntax; but conventions for using that syntax were not well established, so many variations exist.
  • Describe example visualizations, walking through the thought behind the design decisions – existing examples of this are “Sound on Sound” tutorials and visualization blogs such as http://eagereyes.org/
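The NetCDF point above can be made concrete: shared conventions such as the CF attributes `units` and `standard_name` are what let two differently named variables be recognized as the same physical quantity. A minimal sketch, with plain dicts standing in for NetCDF variables (the names and values are invented for illustration):

```python
# Sketch: CF-style attributes make a variable self-describing. Two files can
# store "the same" field under different local names; shared conventions
# (units, standard_name) let tools match and merge them automatically.

var_a = {"name": "t2m", "units": "K",
         "standard_name": "air_temperature", "data": [281.2, 283.9]}
var_b = {"name": "TEMP_SFC", "units": "degC",
         "standard_name": "air_temperature", "data": [9.1, 10.4]}

def same_quantity(a, b):
    """Variables are comparable if they declare the same standard_name."""
    return a["standard_name"] == b["standard_name"]

def to_kelvin(var):
    """Normalize units before comparing or fusing values."""
    if var["units"] == "K":
        return list(var["data"])
    if var["units"] == "degC":
        return [v + 273.15 for v in var["data"]]
    raise ValueError("unknown units: " + var["units"])

if same_quantity(var_a, var_b):
    merged = to_kelvin(var_a) + to_kelvin(var_b)
```

Without the agreed `standard_name` and `units` attributes, nothing in the file itself says these two variables can be merged, which is exactly the “many variations” problem noted above.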
 
 
 
Question 2B: Audience ideas and considerations

  • non-professional visualizers as a way of communicating the practices, but not “dumbed-down”
 
 
 
Question 3B: Website issues (links to durable URLs for product info, etc.)

  • durable links
  • Have online examples or not: examples will become dated quickly vs. interaction with examples discussed in the book that will facilitate understanding of them
 
 
 
Question 4B: Getting others involved OR not

Perhaps, but there must be a strong commitment; the number still needs to be kept small so that logistics do not overwhelm.
 
 
 
What is the equivalent of PageRank for searching geoscience data?

Criteria for computing “pagerank”:

  • Spatial and temporal resolution (assuming high frequency and high resolution = good)
  • Time distance from desired date
  • Quality of metadata (compliant with standards)
  • Includes calibration / control data
  • Provenance
  • Lowest error / trustworthiness
  • Data provider (credibility of curator)
  • Consistency (outlier detection)
  • Supported data formats (NetCDF vs. custom binary)
  • Popularity (links, citations, social ranking)
  • Frequency of updates (is the dataset current and reliably updated?)
  • Access method (direct online access is better than placing an order)
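These criteria could feed a simple weighted score, a rough analogue of PageRank for datasets. A hedged sketch (the weights, field names, and sample datasets are all invented for illustration, not a proposed standard):

```python
# Sketch: rank candidate datasets by a weighted sum of normalized criteria.
# Each criterion is pre-scored in [0, 1]; weights encode search priorities.

WEIGHTS = {
    "metadata_quality": 0.25,   # compliant with standards
    "provenance": 0.20,
    "trustworthiness": 0.20,    # low error, credible provider
    "popularity": 0.15,         # links, citations, social ranking
    "update_frequency": 0.10,
    "access_ease": 0.10,        # direct online access beats placing an order
}

def georank(dataset):
    """Weighted score; missing criteria count as zero."""
    return sum(WEIGHTS[k] * dataset.get(k, 0.0) for k in WEIGHTS)

candidates = [
    {"name": "A", "metadata_quality": 0.9, "provenance": 0.8,
     "trustworthiness": 0.9, "popularity": 0.3, "update_frequency": 1.0,
     "access_ease": 1.0},
    {"name": "B", "metadata_quality": 0.4, "provenance": 0.2,
     "trustworthiness": 0.6, "popularity": 0.9, "update_frequency": 0.5,
     "access_ease": 0.2},
]
ranked = sorted(candidates, key=georank, reverse=True)
```

A linear score is the simplest possible combiner; a real system would also have to normalize criteria like resolution and time distance onto a common scale, which is the hard part.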
 
 
 
Common Sources of Error, and How Can Visualization Tools Help?
(not absolutely certain what this actual question was)

Error or uncertainty? Uncertainty can be shown by flagging areas at certain data quality levels, showing confidence intervals, or plotting the results of model ensembles.
 
 
 
Sources of error:

  • Floating-point representation as text
  • Gridding (of points), resampling, changing resolution, reprojecting
  • Interpolation
  • Incomplete or inconsistent metadata
  • Undocumented satellite data correction
  • Incorrect math (e.g., floor vs. ceil vs. trunc)

What visualization tools can do:

  • Represent error in the visualization using error bars or color coding by uncertainty
  • Only possible when the error, or possible sources of error, are captured in the data or metadata
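Acting on that last point: when quality flags are captured alongside the data, a tool can mask low-confidence values instead of plotting them as if they were fully trustworthy. A minimal sketch (the flag values and their meanings are invented):

```python
# Sketch: use per-sample quality flags to decide what a visualization shows.
# Values with poor flags are masked (returned as None) rather than drawn
# indistinguishably from high-quality data.

GOOD, SUSPECT, BAD = 0, 1, 2   # illustrative flag levels

def mask_by_quality(values, flags, worst_allowed=GOOD):
    """Keep values whose flag is <= worst_allowed; mask the rest."""
    return [v if f <= worst_allowed else None
            for v, f in zip(values, flags)]

values = [12.1, 13.4, 99.9, 12.8]
flags  = [GOOD, GOOD, BAD, SUSPECT]

strict  = mask_by_quality(values, flags)            # only GOOD survives
lenient = mask_by_quality(values, flags, SUSPECT)   # GOOD and SUSPECT survive
```

A renderer could equally map the flag to transparency or a desaturated color instead of masking outright; the point is that the decision is only possible when the flags travel with the data.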
 

Latest revision as of 13:31, July 26, 2011

* RESPONSES TO KEY VISUALIZATION QUESTIONS

Earth Data Visualization Summit. Santa Barbara, CA. October 26-27, 2009. Participants: Bruce Caron, Marty Landsfeld, Kevin Ward, Tommy Jasmin, Kevin Hussey, Jeff McWhirter, Robert Simmon, Marit Jentoft-Nilsen, Eric Russell, Suresh Santhanavannan, Tom Rink, John Moreland, David Nadeau, Chris Torrence. This summit was funded by NASA REASoN grant NNX06AB08A.

Overall questions

Question 1: Imagine the perfect earth data remote sensing visualization tool/system. What are the main components of this system?

  • Needs to closely track needs and abilities of a variety of audiences
  • Support multiple GUIs (meaning API-based back-end to support tools?)
  • Open Source
  • Solves data format/standards problem — new data type? derivative data types?
  • Social web, community-based: support collaborative workflow and data discovery
  • One-stop repository to support aggregate querying — e.g., What is the causality relationship between meningitis outbreaks and precipitation patterns in sub-Saharan Africa?
  • Provide ability to store and track provenance

Question 2: Visualization for understanding vs. conveying — It’s all about the user: How should tools and visualizations be tailored for distinct user groups?

Two different issues:

  • Conveying — you know the answer, need to display it in a good way. “Static”
  • Understanding — trying to find the answer via exploration. Needs more facilities. “Interactive”

It is hard to build a general-purpose tool that satisfies both power users and non-power users across disciplines; the same tool needs to scale with users’ abilities.

Question 3: Are there user groups that you know are under-served by the current data visualization technology? What needs to change to serve these groups?

  • Everyone is underserved, but not equally.
  • PIs/those closest to data are best served by way of their familiarity with the data.
  • Solving this problem is an argument in favor of plug-in based development.
  • Tools need to be developed to support each audience — different audiences have wildly different needs in terms of capabilities and end result.
  • Of course, serving user groups only comes after they have located the data.

Question 4: Delivering data vs. images of data. What are the sweet spots for each? What are the areas where we need to focus or change?

  • It depends on audience: images and data can serve both “public” and scientific audiences
    - In general images are great for the general public or less interested specialists (maybe 80%), data needs to be available for the interested 20%.
    - The <a href="http://www.ifp.illinois.edu/nabhcs/abstracts/shneiderman.html">Ben Shneiderman UI design mantra</a> “overview, zoom & filter, details on demand” applies in this case.

Images sweet spots

  • As a means of discovery and monitoring production
  • Able to meet needs of wider audience [formats and general ease of use]

Data sweet spots

  • Scientist users who create their own visualizations
  • maintaining metadata (provenance)
  • can be used to derive multiple visualizations

Question 5: How can visualizers bridge the science-outreach divide? How to teach the public about science and the scientists about the public?

Visualization is a compelling medium that science communicators can use to make complex scientific ideas approachable to a broad audience. Carl Sagan’s Cosmos series is the prototypical example, weaving visuals with narrative to explain astrophysics. It is crucial to define an audience: there is no such thing as a uniform general public. Concrete visualizations, such as planet walks and painted lines representing sea level rise, can be particularly effective. Effective visualization requires focus: emphasize important elements of a dataset, and de-emphasize or eliminate less important data. Ideally, tools would be designed by visualizers, not computer scientists. Try for verisimilitude: make things appear how the audience expects them to appear (for example, Google Earth’s discontinuous boundaries between scenes are very distracting).

(T1)
Perhaps current outreach programs are too top-down; we need a tighter, iterative relationship between viz developers, scientists, and the outreach audience. We didn’t like the notion that the public can’t understand science, and consider the question: aren’t scientists part of the public? It’s important to bring scientists and their work into the community. Can effective visualization facilitate the two-way motivation needed between scientists and the community they serve?

Features for tools: easy to read, with descriptions embedded in the displays. Simple initial presentation, but allowing progressive disclosure of information and concepts as far as the user desires.

(T2)

  • Tune the visualization to the audience — but how?
    – use of toolkits, plug-ins, component frameworks
    – providing different presentations of the same data
  • Provide HTML/Flash/CMS for data
    – easier mechanism for scientists/educators to convey info
    – equivalent of wiki or “build your own web page” for general public

Part B, How to teach the public about science and the scientists about the
public?

  • Force scientists, through NSF etc., to meet new requirements on publishing
  • Use emerging technologies like social media
  • Enable new ways of publishing
    – e.g. Data CMS, like RAMADDA

General comments:

  • Visualization is conveying information
  • Teacher knows the answer, needs to find way to convey
  • Scientist does not know the answer, needs exploratory analysis
  • There is a declining importance of traditional journal publications

Current Issues

Question 1C: You tackle visualization tasks every day. What is the one thing that you do every day that needs to be different in 5 years for your life to improve?

  • data search and retrieval
  • one data format to rule them all — standards-based in all aspects (data structure, metadata)

Question 2C: Name the three (or more) top obstacles today that limit the effective development/deployment of earth data visualizations.

Question 3C: State of gridded data fusion capabilities: what are the main obstacles/opportunities?

(T1)
Error tracking and error propagation during data fusion has to be done and provided to the user. For example, when resampling data for fusion we need to keep track of the errors and how they affect the fused grid.

Knowledge embedded in the tool to facilitate data fusion: a sort of knowledge-based system that recommends to the user how the data fusion should be done.

Provide raw data, processed re-gridded data, and data fusion software packaged together, so users have all the pieces needed for data fusion.

Algorithms, tools, and gridding are data dependent. The techniques used depend on the data being brought together.

Embed data units, scaling, and offsets within the data to facilitate fusion. We have data in different units; for example, temperature data in both Fahrenheit and Celsius. Scaling and unit conversion have to be embedded in the data or readily interpreted by the tool.
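A minimal sketch of the embedded scaling/offset idea: store a scale factor, offset, and unit string next to packed integer values (the attribute names echo common NetCDF practice, but the numbers here are invented), so fusion code reads metadata instead of hard-coding conversions:

```python
# Sketch: unpack a packed-integer field and convert units using only metadata
# embedded alongside the data, so fusion code never hard-codes either step.

field = {
    "units": "degF",
    "scale_factor": 0.1,   # physical = raw * scale_factor + add_offset
    "add_offset": 32.0,
    "raw": [0, 100, 680],  # packed integer samples
}

def unpack(field):
    return [v * field["scale_factor"] + field["add_offset"]
            for v in field["raw"]]

def to_celsius(values, units):
    if units == "degC":
        return values
    if units == "degF":
        return [(v - 32.0) * 5.0 / 9.0 for v in values]
    raise ValueError("unknown units: " + units)

physical = to_celsius(unpack(field), field["units"])
# raw 680 -> 0.1 * 680 + 32 = 100 degF -> about 37.8 degC
```

Because the scale, offset, and units travel with the values, the same fusion code works unchanged on a Celsius product with different packing.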

Strengthen interaction between standards for data fusion: OPeNDAP to OGC to web services.

(T2)
How do we get the grid itself? Some domains end up creating gridded data from point observation data. How do the different algorithms and parameterizations affect the outcome? How do we show the provenance?

How to handle large-scale grids – tiling, etc. It’s not just a visualization issue – we need integrated data systems to deal with it, not just client applications.

Services include gridding (e.g., Barnes objective analysis), handling varying temporal and spatial resolutions, resampling, irregular and unstructured grids, and pushing analysis onto the server due to data set size. We need to stream results, etc.
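As an illustration of the gridding service, a Barnes-style objective analysis can be sketched as a Gaussian distance-weighted average of nearby observations at each grid node. This is a hedged, single-pass toy (a real Barnes scheme adds successive correction passes and a cutoff radius), with invented sample values:

```python
import math

# Sketch: one-pass Gaussian-weighted gridding of scattered observations.
# obs: list of (x, y, value); kappa controls the smoothing length scale.

def grid_points(obs, grid_xs, grid_ys, kappa=1.0):
    grid = []
    for gy in grid_ys:
        row = []
        for gx in grid_xs:
            wsum = vsum = 0.0
            for x, y, v in obs:
                w = math.exp(-((gx - x) ** 2 + (gy - y) ** 2) / kappa)
                wsum += w
                vsum += w * v
            row.append(vsum / wsum)  # weighted average at this node
        grid.append(row)
    return grid

obs = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
field = grid_points(obs, grid_xs=[0.0, 1.0, 2.0], grid_ys=[0.0])
# the node at x = 1 is equidistant from both obs, so it averages to 15.0
```

The choice of `kappa` is exactly the kind of parameterization the (T2) answer warns about: it changes the gridded outcome and therefore belongs in the provenance.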

For example, in NCEP’s global 0.5-degree GFS model, a single 3D field has dimensions of 720×361×26 with 61 time steps. This results in 412,233,120 points, or about 1.6 GB of data per field (at 4 bytes per value). Lots of data!
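The sizing arithmetic generalizes to any gridded field; a quick sketch that reproduces the GFS numbers above (assuming 4-byte values, which matches the 1.6 GB figure):

```python
# Sketch: estimate the in-memory size of a gridded time-varying field.
# The numbers reproduce the GFS example: 720 x 361 grid, 26 levels, 61 steps.

def field_size(nx, ny, nz, nt, bytes_per_value=4):
    points = nx * ny * nz * nt
    return points, points * bytes_per_value

points, nbytes = field_size(720, 361, 26, 61)
# points == 412_233_120; nbytes is about 1.65e9, i.e. ~1.6 GB per field
```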

(T3)
Grids are a sampling of an underlying continuous function that is reality. That sampling has aliasing artifacts. Those artifacts may vary from point to point within the data (imagine a satellite image at an angle across the Earth’s surface – the sample size varies from one side of the image to the next).

Grid to grid fusion often involves resampling those grids to merge them into a common grid. This introduces more aliasing artifacts. To reduce those artifacts we should use an interpolation function that models that underlying continuous function. However, we often do not. Perhaps the “correct” interpolation function is not known, a subject of debate, or not available in the software.

To get a handle on the artifacts introduced, and the interpolation function to use (or not use), we need to track the source of the data and the propagation of error. This becomes a file format issue because files often store the data result, but not the path to that result and the error function.
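One way to sketch the tracking idea: keep a provenance list alongside the samples and append an entry for every resampling step, so that the path to the result and the interpolation choice are recorded. A minimal stdlib sketch using linear interpolation (the field names are illustrative, not a real file format):

```python
# Sketch: 1-D linear resampling that records what was done to the data.
# A real system would persist `provenance` into the output file's metadata.

def resample_linear(samples, n_out):
    """Resample to n_out points over the same interval (n_out >= 2)."""
    n_in = len(samples)
    out = []
    for i in range(n_out):
        t = i * (n_in - 1) / (n_out - 1)   # position in input coordinates
        lo = int(t)
        hi = min(lo + 1, n_in - 1)
        frac = t - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

data = {"samples": [0.0, 10.0, 20.0], "provenance": ["source: sensor X"]}
data["samples"] = resample_linear(data["samples"], 5)
data["provenance"].append("resampled 3 -> 5 points, linear interpolation")
# samples become [0.0, 5.0, 10.0, 15.0, 20.0]
```

Swapping in a different interpolant (nearest-neighbor, cubic) would change both the samples and the appended provenance entry, which is exactly the information a downstream error analysis needs but most file formats drop.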

Question 4C: Gridded data and GIS… getting these to talk to each other. How to make smart maps of GIS point and polygon data AND gridded satellite data.

Future Development Ideas

Question 1F: 2D, 3D, 4D: What’s the future of data display?

(T1)
- Data, software, and hardware all can have different dimensionality.

- How soon, if ever will 3 spatial dimension displays (stereo glasses, etc) become common.

- Issues of user interaction/control of displays, and augmented reality (such as an iPhone location app) where the user’s location in the real world drives their position in a 3-spatial-dimension world shown on 2D hardware.

- Single user vs. collaborative displays – for single user you could have level of detail optimizations, where eye look direction drives detail of display.

(T2)
Google Earth, but with data.
Immersive technology:
2D – image, contour
3D – volume, surface
4D – 3D + time, animated
5D – multiple parameters

User interacts with data. Showing data in different ways, both geographically and analytically. Not just pretty pictures.
Probing
Transects through data
Scatter plot
Slice & dice
Time series analysis
Multiple linked views into same data (“4-up”)
Geographic displays coupled with charts

For doing science, 3D has a problem with perspective

Exploration capability. Educators are adopting 3D, via Google Earth.

Seems like we have the tools. But on a 2D display, you can control pixel color, transparency, and glyph, that’s it.

(T3)
Our consensus is that 3D and beyond are not a good use of resources and that visualizations should concentrate on 2D and 2.5D. Full 3D is desirable, but with very little return on the massive effort required.

There are many reasons for this:

Humans are really poor at perceiving depth. Studies have shown that humans do not perceive more than 6 depths at one time. Human eyes are rarely of the same strength, which contributes to depth perception problems.

Computer technology, at least at this point, does a very poor job of creating the illusion of depth.

Human perception in the Z plane is only about 10% of our 2D perception strength.

Question 2F: Java/Spring/AJAX vs. Flex/Flash… what will rule in 5 years? What are the issues that need to be watched in terms of data visualization?

(Tri)
All of these technologies have problems. Some are heavyweight and work well for applications, but not on the web (Java). Some are overly complex (AJAX). Some are proprietary or nearly so (Flash, Silverlight). Often they lack well-defined toolkits for building effective user interfaces. Weak standards support among browsers (such as IE) complicates matters.

An evolving trend is HTML 5 and JavaScript, plus JavaScript-based toolkits. These cover up browser quirks, leverage the scheduling and sandboxing features of browser JavaScript engines, and provide a client-side GUI. These toolkits, as they emerge and mature, may provide a good technology for building vis tools accessed via browsers on platforms from desktops to iPhones.

(Square)
The level of programming skill needed to produce rich applications for novice developers would be a factor. There’s some relationship to the previous question concerning presentation vs. interactive data analysis: Flash and Java seem well established, respectively, in dealing with these broad user categories. Given the popularity of small devices and social networking, are Flash, Java, and others amenable to these environments? With the current pace of technological acceleration, 5 years could be a little outside a useful prediction.

(T3)

  • Both will be around and viable/popular still
  • Plug-in presence used to be an issue, but much less so now
  • Other technologies that may emerge as contenders
    – Canvas with HTML 5 video/audio tags
    – Silverlight (less likely to dominate, in our opinion)
  • Issues to watch in terms of data visualization
    – ability to save/export to alternate formats, e.g., export a Flex app as an iPhone app
    – capabilities that will emerge with Canvas
    – API (specifically relating to data import) enhancements

Question 3F: Standards for viewing earth data: what is the future… JPEG 2000? GeoTIFF? KML? Where are we going?

(T3)
The data formats are generally adequate, but the structure of the data is often inadequate. GeoTIFF can easily be abused, JPEG 2000 isn’t well supported, KML is useful but primitive (limited support for map projections). Hopefully we’ll refine existing standards, rather than proliferating poorly supported standards.

Question 4F: Data and video: Thoughts on building animations for video distribution. Where is this going?

(Red)
Tools need to have more batch processing capabilities.

Automated production (creation and serving from web) vs one-off visualizations.

Time series data vs. fly-throughs – which animations need human intervention?

There is a distinction between animations and video with voice-over, music, and closed captioning. Video takes much more time and resources.

(Green)
We agree that if the data set is appropriate for time series distribution, then applications should contain a component to output the animation as video.

(Blue)
Question: Data and video: Thoughts on building animations for video distribution. Where is this going?

Video is dying. It’s been years since we produced DVDs or tape. When a video is created, it’s in MPEG4, AVI, or whatever format can be played in PowerPoint or on a web page.

However, the trend is away from these canned video presentations and instead towards live demos of visualization software run on the presenter’s laptop. This is often more credible, and it allows the presenter to adapt their presentation up to the last minute before their talk, or during their talk.

Question 5F: Open Source / COTS vs. custom tools — issues with not-invented-here and the ability to write plugins for existing packages (or is the future going to continue to be comprised of an ever-expanding repertoire of software)?

(T2)
Significant factors:

  • Politics are a major driver. Motivation to take credit for tangible results, branding etc.

Assessment:

  • Status quo will likely continue. Proliferation of semi-redundant software is probably overall a positive.
    – refinement
    – reinforces successes.

(Red)
Too often scientists, etc., need to get their work done and do not (or cannot) have the luxury of time to use COTS, to do effective software engineering and management, i.e., to do the right thing from a software engineering perspective.

This is neither good nor bad – it is just the reality of the way things work.

(Green)
Open Source:
  • Has well-documented code and an API.
  • Good design architecture for extension and customization, i.e., has a plug-in capability, with no need to change or access system software.
  • Good community involvement. (Has anyone been able to do all of this?)
If the above hold, then why not use it?

COTS: proprietary data formats are a problem. In either case, support for software issues may be out of the user’s control.

Custom tools can be very optimized to do a few tasks very well, from UI level to rendering level. A good visualization system would allow developers to extend the system at these different levels.