Data analysis and
visualization topics
Sergei MAURITS,
ARSC HPC Specialist
[email protected]
Content – Day 1
- Visualization of 3-D data
- basic concepts
- packages
- static graphics formats and compression
- animation formats and compression (codecs)
- practical approach for animation
- examples
- Hands-on training: compositing animation from frames with Shake
Content – Hands-on Day 2
- X Windows connection to the ARSC workstations (http://www.arsc.edu/arsc/support/howtos/usingws/)
- basic concepts of ParaView
- animation with ParaView
- making frames with ParaView
- moving files (frames) from Linux realm to Mac
- using Shake to edit the frames
- preparing stand-alone animation with QuickTime 7
All in the hands-on training mode!
3-D Data Visualization
- isosurfaces (close to natural appearance)
- cross-sections (3-D ==> 2-D)
- probes/sounding (convenience tools)
- vectors, streamlines, particle paths
- volume rendering (mix of all above)
- time dependence (animation)
- multi-variable sets (clutter problem)
- miscellaneous contexts (CFD objects, terrain, geographical maps, etc.)
- discrete (molecules) vs. continuous fields
==> NO UNIVERSAL SOLUTION
A Few Popular Visualization Packages
- AVS & AVS Express
- IDL & IDL iTools
- Matlab
- TecPlot
- NCAR Graphics + NCL (semi-free)
- ParaView/VTK (free)
- IDV (free)
- Vis5D (free)
Vis5D
- Vis5D (3D + time + multi-variable = 5D) is a full suite of volumetric visualization tools, originally created for meteorological data (which largely defines the Vis5D context)
- multiple scalars + two vector fields (at a time) + streamlines, “trajectories” + customizable map & terrain
- exists as a stand-alone GUI and/or an API suite (advanced)
- some customization (although limited in terms of new graphics) is possible through Tcl scripting (screen shots, spinning, etc.) and custom user functions
- new variables: array syntax and user functions
Vis5D (cont.)
- data I/O: Fortran or C conversion codes write data into the Vis5D-format database .v5d; commented templates are available at
/import/projects/CLASSES/Vis5D/Linux/ARSC_Vis5D_Help/convert/
- .v5d is a cross-platform, effectively compressed (down to 1 byte/node/variable) 3-D format of semi-standard status
- upon conversion, .v5d files can be rendered immediately
- …/Vis5D/doc and …/Vis5D/man: on-line documentation in PDF, SGML, HTML (in the ARSC Linux environment, search for INDEX.HTM at /usr/local/pkg/vis5d.../doc)
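The "1 byte/node/variable" figure gives a quick way to estimate storage needs. The following is only a rough sketch: it assumes the gridded values dominate the file and ignores any header overhead, the function name and the grid dimensions are hypothetical, and the comparison assumes the source data are 4-byte floats.

```python
def grid_bytes(nx, ny, nz, n_times, n_vars, bytes_per_value=1):
    """Approximate payload size of a gridded 5-D data set (header overhead ignored)."""
    return nx * ny * nz * n_times * n_vars * bytes_per_value

# Hypothetical grid: 100 x 100 x 50 nodes, 24 time steps, 5 variables
compressed = grid_bytes(100, 100, 50, 24, 5)                    # 1 byte per value
as_floats = grid_bytes(100, 100, 50, 24, 5, bytes_per_value=4)  # assumed 4-byte floats

print(f"{compressed / 1e6:.0f} MB at 1 byte/value vs "
      f"{as_floats / 1e6:.0f} MB as 4-byte floats")
```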
Vis5D: Hands-on Tutorial
1. Find directory /projects/CLASSES/Vis5D/Linux/ARSC_Vis5D_Help
2. Review README files (README.first: general; README.environment: Vis5D environment at ARSC; README.tutorial: hands-on step-by-step tutorial)
3. Copy the entire content of the directory above to local HD under a unique name (/scratch/Vis5D_YourUID) (see README.tutorial for details)
4. Proceed with the step-by-step tutorial from Section 2 in README.tutorial
ParaView (http://www.paraview.org/)
ParaView is an open-source, multi-platform data analysis and visualization application. ParaView users can build visualizations to analyze their data using qualitative and quantitative techniques. The data exploration can be done interactively in 3D or programmatically using ParaView's batch processing capabilities.
ParaView was developed to analyze extremely large datasets using distributed memory computing resources. It can be run on supercomputers to analyze terascale datasets as well as on laptops for smaller data.
The ParaView tutorial can be found at http://www.paraview.org/Wiki/The_ParaView_Tutorial
Raster Graphics Formats
- consist of “pixels”, usually square
- resolution: number of pixels, N x M
- color = R,G,B [+ Alpha]
- brute-force approach: you need N*M*3 (or, if transparency is involved, N*M*4) numbers to store a graphical image, which is a lot!
- in the majority of applications space saving is desired (or highly desired)
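The brute-force count can be checked with a few lines of arithmetic. This is a minimal sketch that assumes one byte per color channel (8-bit R, G, B and Alpha), which the slide does not state explicitly; the function name is illustrative, and the 1700x1000 resolution is the one used for the compression example later in these slides.

```python
def raw_image_bytes(width, height, with_alpha=False):
    """Bytes for an uncompressed raster image, assuming 1 byte per channel."""
    channels = 4 if with_alpha else 3  # R, G, B [+ Alpha]
    return width * height * channels

rgb_bytes = raw_image_bytes(1700, 1000)
rgba_bytes = raw_image_bytes(1700, 1000, with_alpha=True)

print(f"RGB:  {rgb_bytes:,} bytes")
print(f"RGBA: {rgba_bytes:,} bytes")
```

Under that assumption the RGB figure comes out close to the size of the uncompressed TIFF quoted in the example.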
Raster Graphics Formats
Savings are possible either
1. by limiting the number of colors (discretization of R,G,B-space to 256 or even fewer colors and introduction of color tables: R'G'B’ => color #n: 0-1, 0-3, …, 0-255), or
2. by limiting the precision of shapes, thus departing from the simple but costly NxM model, or
3. no savings at all, for the sake of quality
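The color-table idea in option 1 can be sketched as follows. This is an illustration only, not the encoder of any particular format: it assumes the image already uses few distinct colors, all function names are hypothetical, and the size estimate covers just the packed indices plus a 3-byte-per-entry table (real formats such as GIF additionally apply lossless compression to the indices).

```python
def build_color_table(pixels):
    """Return (table, indices): the unique colors and, per pixel, its table index."""
    table, lookup, indices = [], {}, []
    for rgb in pixels:
        if rgb not in lookup:
            lookup[rgb] = len(table)  # next free color number
            table.append(rgb)
        indices.append(lookup[rgb])
    return table, indices

def indexed_bytes(n_pixels, n_colors):
    """Bytes for packed indices plus the color table (3 bytes per entry)."""
    bits_per_pixel = max(1, (n_colors - 1).bit_length())  # 16 colors -> 4 bits
    index_bytes = (n_pixels * bits_per_pixel + 7) // 8    # round up to whole bytes
    return index_bytes + 3 * n_colors

# Tiny example image: white background, two black and two red pixels
pixels = [(255, 255, 255)] * 6 + [(0, 0, 0)] * 2 + [(255, 0, 0)] * 2
table, indices = build_color_table(pixels)
print(table)
print(indices)
print(indexed_bytes(1700 * 1000, 16), "bytes for a 1700x1000 image with 16 colors")
```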
Raster Graphics Formats
1. Compression by limiting colors to 2, 4, 8, 16, ..., 128, 256 colors (so-called “pseudocolor”)
- GIF (also PNG-8)
small file size, limited color space
applications on the Web and in science (where few colors are used)
supports transparency and animation (animated GIF)
Raster Graphics Formats
2. Compression by simplifying shapes
JPEG, JPEG2000
flexible compression level ==> variable file size (10% is frequently OK, 30% is good for almost everything)
“lossy” compression – loss of graphics quality, artefacts, especially at the shapes’ edges
Widely used in photo and video (TTL – Through the Lens) applications
Raster Graphics Formats
3. No lossy compression at all (image data stored uncompressed or compressed losslessly)
TIFF, PNG-24, XWD, BMP, ...
Highest image quality
LARGEST file sizes
formats of choice in professional graphics, printing, satellite imagery, archiving
Raster Graphics Formats (cont.)
4. Both types of compression (color, shape) are irreversible (keeping an uncompressed archival copy is a reasonable approach)
5. Worst-case scenario: a second pass of compression on an already compressed JPEG. This way compression artefacts are preserved (costing file size) and magnified (costing quality).
6. Better way: keep an uncompressed archival copy and compress it to different levels of quality.
Raster Graphics Formats (cont.)
7. Surprise: 30% quality for JPEG compression is not so bad; sometimes (a photo for the Web, for instance) 10% is quite sufficient.
The Photoshop Save for Web utility is very useful for visually determining the necessary compression level
Raster Graphics Formats
Example of compression levels (using Save for Web in Photoshop) for a scientific graph with a uniform background at 1700x1000 resolution (close to HDTV, 1080p), drawn with just 16 colors:
TIFF - 5 150 000 B or 5.15 MB - no compression
JPEG - 314 100 B (max = 100%) - subjectively, no artefacts
JPEG - 129 700 B (mid = 50%) - subjectively, minor artefacts
JPEG - 70 400 B (min = 10%) - subjectively, a lot of artefacts
GIF - 133 200 B (256 colors) - no artefacts, adequate colors
GIF - 113 000 B (64 colors) - no artefacts, adequate colors
GIF - 85 360 B (16 colors) - no artefacts, adequate colors
GIF - 76 760 B (8 colors) - color distortion starts here (16)
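To compare the variants on one scale, the file sizes quoted above can be turned into compression ratios relative to the uncompressed TIFF. The numbers are taken from the example; the dictionary labels and variable names are only illustrative.

```python
tiff_bytes = 5_150_000  # uncompressed reference from the example

sizes = {
    "JPEG max (100%)": 314_100,
    "JPEG mid (50%)": 129_700,
    "JPEG min (10%)": 70_400,
    "GIF 256 colors": 133_200,
    "GIF 64 colors": 113_000,
    "GIF 16 colors": 85_360,
    "GIF 8 colors": 76_760,
}

# Ratio of the TIFF size to each compressed size, rounded to one decimal
ratios = {name: round(tiff_bytes / size, 1) for name, size in sizes.items()}

for name, ratio in ratios.items():
    print(f"{name:16s} 1:{ratio}")
```

For this graph the 16-color GIF, which shows no artefacts, is smaller than the mid-quality JPEG, which does.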
Animation
From Wikipedia:
Animation is the rapid display of a sequence of images of 2-D or 3-D artwork or model positions in order to create an illusion of movement. It is an optical illusion of motion due to the phenomenon of persistence of vision, and can be created and demonstrated in a number of ways.
The standard frame rate of TV-based animation is 30 frames per second (fps). Software interfaces commonly adopt it as a base rate. By simply repeating your frames you can depart from this rate, but not dramatically: 8-10 fps is the practical limit. This means you can repeat each frame 3, 4, or maybe 5 times, but usually not much more.
(Duration example: 45 frames x 3 repeats = 135 frames; 135/30 = 4.5 sec)
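The duration arithmetic generalizes to any frame count and repeat factor. A minimal sketch, with illustrative function names and the 30 fps base rate taken from the text:

```python
def animation_seconds(n_frames, repeats, base_fps=30):
    """Playback time when every frame is shown `repeats` times at the base rate."""
    return n_frames * repeats / base_fps

def effective_fps(repeats, base_fps=30):
    """Rate at which distinct frames appear on screen."""
    return base_fps / repeats

print(animation_seconds(45, 3), "sec")   # the duration example from the slide
for repeats in (3, 4, 5):
    print(repeats, "repeats ->", effective_fps(repeats), "distinct frames per second")
```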
Animation
From the compression standpoint, an animation rate of 30 fps means that your graphics compression problem is 30 times worse than in the case of static graphics.
From http://www.apple.com/quicktime/technologies/h264/
Full HD uncompressed (4:4:4): 1920 W x 1080 H x 24 bit x 30 fps, or about 2 MP x 24 bit x 30 fps = 1424 Mbps
Compression ratio 1:200
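The quoted bitrate can be reproduced directly. In this sketch the function name is illustrative, and the conversion assumes 1 Mbit = 2^20 bits; that convention is not stated in the source but is the one that matches the 1424 Mbps figure.

```python
def uncompressed_bitrate_bps(width, height, bits_per_pixel, fps):
    """Raw video bitrate in bits per second."""
    return width * height * bits_per_pixel * fps

bps = uncompressed_bitrate_bps(1920, 1080, 24, 30)
mbps = bps / 2**20            # assumption: binary megabits, to match the slide
compressed_mbps = mbps / 200  # applying the 1:200 compression ratio

print(f"uncompressed: {mbps:.0f} Mbps")
print(f"at 1:200:     {compressed_mbps:.1f} Mbps")
```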
Animation
The good news: an animated stream has a lot of redundancy, so its compression can be dramatically more effective than compression of static graphics. Earlier coding techniques used singly predicted (P) frames, which depend only on previous independently coded (I) frames, and bi-predicted (B) frames, which depend on past and future I or P frames. Current advanced codecs (H.264) are much more flexible, which improves quality and decreases the bitrate for a given resolution, or increases the resolution for a given bitrate.
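The redundancy between consecutive frames can be illustrated with a toy delta coder: the first frame is kept whole (playing the role of an I frame) and each later frame stores only the pixels that changed relative to its predecessor (a crude stand-in for a P frame). This is a sketch for intuition only; real codecs such as H.264 use motion compensation and transform coding, and the names here are hypothetical.

```python
def encode_sequence(frames):
    """First frame stored whole ('I'); later frames as (index, value) changes ('P')."""
    encoded = [("I", list(frames[0]))]
    prev = frames[0]
    for frame in frames[1:]:
        changes = [(i, new) for i, (old, new) in enumerate(zip(prev, frame)) if old != new]
        encoded.append(("P", changes))
        prev = frame
    return encoded

def decode_sequence(encoded):
    """Rebuild every frame by applying the stored changes in order."""
    frames, current = [], None
    for kind, payload in encoded:
        if kind == "I":
            current = list(payload)
        else:
            current = list(current)  # copy so earlier frames stay intact
            for i, value in payload:
                current[i] = value
        frames.append(current)
    return frames

# A bright "dot" moving one pixel per frame along an 8-pixel row
frames = [
    [0, 0, 9, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 0, 0, 0, 0],
    [0, 0, 0, 0, 9, 0, 0, 0],
]
encoded = encode_sequence(frames)
for kind, payload in encoded:
    print(kind, payload)
```

Each predicted frame here stores two changed pixels instead of all eight, and decoding restores the original sequence exactly.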
Animation – practical approach
- Use all ARSC resources (supercomputers, Linux boxes with 3D-graphics subsystems, fine Mac OS X software, etc.)
- Make frames in sufficient quantities (remember 30 fps) with any capable viz package; use large fonts and thick lines
- To make the sequence UNIX-friendly for scripting, number it without leading zeros (f_0001.png vs. f_1001.png) if possible
- Linux: the utilities animate + convert can crop and change formats & quality; they give fast results but are quite limited in scope (high-end codecs are commercial products)
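One way to read the numbering advice is to start the sequence at 1001, so that every file name has the same width without any leading zeros. The sketch below follows that reading, which is an interpretation rather than something the slide spells out; the helper name, prefix, and extension are illustrative.

```python
def frame_name(index, prefix="f_", first_number=1001, ext=".png"):
    """File name for frame `index` (0-based), numbered from `first_number`."""
    return f"{prefix}{first_number + index}{ext}"

names = [frame_name(i) for i in range(45)]
print(names[0], "...", names[-1])

# Equal-width names keep alphabetical order identical to frame order ...
print(sorted(names) == names)

# ... whereas plain 1, 2, ..., 10 numbering does not sort correctly as text
plain = ["f_1.png", "f_2.png", "f_10.png"]
print(sorted(plain))
```

With this scheme the names stay equal in width up to f_9999.png, so shell globs and sorted listings return the frames in playback order.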
Animation – practical approach
- Mac OS (ARSC-supported):
  - QuickTime 7 (compression, timing, outputs for various media)
  - QuickTime Player (screen recording)
  - Final Cut Pro (industry standard)
  - Shake (advanced compositing)
- Windows: DivX Pro (was AVI, uses the same codec H.264), basically the same functionality as in QuickTime 7; a free DivX player is available for Mac, while QuickTime 7 is available for Windows