Academic year: 2021

Data analysis and visualization topics

Sergei MAURITS, ARSC HPC Specialist

Content – Day 1

- Visualization of 3-D data
- basic concepts
- packages
- static graphics formats and compression
- animation formats and compression (codecs)
- practical approach for animation
- examples
- Hands-on training: compositing animation from frames with Shake

Content – Hands-on Day 2

- X Windows connection to the ARSC workstations (http://www.arsc.edu/arsc/support/howtos/usingws/)
- basic concepts of ParaView
- animation with ParaView
- making frames with ParaView
- moving files (frames) from the Linux realm to the Mac
- using Shake to edit the frames
- preparing a stand-alone animation with QuickTime 7

All in hands-on training mode!

3-D Data Visualization

- isosurfaces (close to natural appearance)
- cross-sections (3-D ==> 2-D)
- probes/sounding (convenience tools)
- vectors, streamlines, particle paths
- volume rendering (mix of all of the above)
- time dependence (animation)
- multi-variable sets (clutter problem)
- miscellaneous contexts (CFD objects, terrain, geographical maps, etc.)
- discrete (molecules) vs. continuous fields

==> NO UNIVERSAL SOLUTION

A Few Popular Visualization Packages

- AVS & AVS Express
- IDL & IDL iTools
- Matlab
- Tecplot
- NCAR Graphics + NCL (semi-free)
- ParaView/VTK (free)
- IDV (free)
- Vis5D (free)

Vis5D

- Vis5D (3D + time + multi-variable = 5D) is a full suite of volumetric visualization tools, created originally for meteorological data (which somewhat defines the Vis5D context)
- multiple scalars + two vector fields (at a time) + streamlines, "trajectories" + customizable map & terrain
- exists as a stand-alone GUI and/or an API suite (advanced)
- some customization (although limited in terms of new graphics) is possible through Tcl scripting (screenshots, spinning, etc.) and custom user functions
- new variables: (array syntax) and user functions

Vis5D (cont.)

- data I/O: Fortran or C conversion codes write data into the Vis5D-format database (.v5d); commented templates are available at /import/projects/CLASSES/Vis5D/Linux/ARSC_Vis5D_Help/convert/
- .v5d is a cross-platform, effectively compressed (down to 1 byte/node/variable) 3-D format of semi-standard status
- upon conversion, .v5d files can be rendered immediately
- …/Vis5D/doc and …/Vis5D/man hold the online documentation in PDF, SGML, and HTML (in the ARSC Linux environment, search for INDEX.HTM at /usr/local/pkg/vis5d.../doc)
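
To get a feel for what 1 byte/node/variable means in practice, here is a rough size estimate. It is only a sketch: the grid dimensions are hypothetical, the uncompressed data are assumed to be 32-bit floats, and file headers are ignored.

```python
def grid_bytes(nx, ny, nz, n_vars, n_times, bytes_per_value):
    """Bytes needed for n_vars fields on an nx*ny*nz grid over n_times steps."""
    return nx * ny * nz * n_vars * n_times * bytes_per_value


# Hypothetical dataset: 100 x 100 x 50 grid, 5 variables, 24 time steps.
raw = grid_bytes(100, 100, 50, 5, 24, bytes_per_value=4)     # assumed 32-bit floats
packed = grid_bytes(100, 100, 50, 5, 24, bytes_per_value=1)  # 1 byte/node/variable

print(f"raw floats: {raw:,d} B")
print(f"1 byte per value: {packed:,d} B")
print(f"ratio: {raw // packed}:1")
```

Under these assumptions the packed data take a quarter of the space of the raw floats.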

Vis5D: Hands-on Tutorial

1. Find the directory /projects/CLASSES/Vis5D/Linux/ARSC_Vis5D_Help
2. Review the README files (README.first: general; README.environment: Vis5D environment at ARSC; README.tutorial: hands-on step-by-step tutorial)
3. Copy the entire content of the directory above to the local hard disk under a unique name (/scratch/Vis5D_YourUID) (see README.tutorial for details)
4. Proceed with the step-by-step tutorial from Section 2 in README.tutorial

ParaView (http://www.paraview.org/)

ParaView is an open-source, multi-platform data analysis and visualization application. ParaView users can build visualizations to analyze their data using qualitative and quantitative techniques. The data exploration can be done interactively in 3D or programmatically using ParaView's batch processing capabilities.

ParaView was developed to analyze extremely large datasets using distributed memory computing resources. It can be run on supercomputers to analyze datasets of terascale as well as on laptops for smaller data.

ParaView tutorial can be found at http://www.paraview.org/Wiki/The_ParaView_Tutorial

Raster Graphics Formats

- a raster image consists of "pixels", usually square
- resolution: the number of pixels, N x M
- color = R,G,B [+ Alpha]
- brute-force approach: you need N*M*3 numbers (or N*M*4 if transparency is involved) to store a graphical image, which is a lot!
- in the majority of applications, space saving is desired (or highly desired)
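
The brute-force estimate can be checked with a few lines of arithmetic. This is a minimal sketch that assumes one byte (8 bits) per color channel; the 1700x1000 resolution is the one used for the compression example later in these slides.

```python
def raw_image_bytes(width, height, channels=3, bytes_per_channel=1):
    """Uncompressed raster size: one value per channel per pixel."""
    return width * height * channels * bytes_per_channel


rgb = raw_image_bytes(1700, 1000)               # R, G, B
rgba = raw_image_bytes(1700, 1000, channels=4)  # R, G, B + Alpha

print(f"RGB:  {rgb:,d} B")
print(f"RGBA: {rgba:,d} B")
```

The RGB figure (5,100,000 B) is close to the 5,150,000 B quoted for the uncompressed TIFF of that graph.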

Raster Graphics Formats

Savings are possible either

1. by limiting the number of colors (discretization of R,G,B-space to 256 or even fewer colors and introduction of color tables: R'G'B' => color #n: 0-1, 0-3, …, 0-255), or
2. by limiting the precision of shapes, thus departing from the simple but costly NxM model, or
3. there are no savings at all, for the sake of quality

Raster Graphics Formats

1. Compression by limiting colors to 2, 4, 8, 16, ..., 128, 256 colors (so-called "pseudocolor")

GIF (also PNG-8)

small file size, limited color space

used on the Web and in science (where few colors are used)

supports transparency and animation (animated GIF)
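
The color-table idea can be sketched in a few lines: every distinct color is stored once in a table, and each pixel keeps only a small index into it. The helper names below are hypothetical and the code is an illustration of the principle, not the GIF or PNG-8 encoder itself (real GIF files additionally apply lossless LZW compression to the indices).

```python
def bits_per_pixel(n_colors):
    """Smallest whole number of bits that can index n_colors table entries."""
    return max(1, (n_colors - 1).bit_length())


def build_palette(pixels):
    """Return (palette, indices): unique colors in first-seen order,
    plus one table index per pixel."""
    palette, lookup, indices = [], {}, []
    for rgb in pixels:
        if rgb not in lookup:
            lookup[rgb] = len(palette)
            palette.append(rgb)
        indices.append(lookup[rgb])
    return palette, indices


def indexed_bytes(width, height, n_colors):
    """Size of the packed indices plus a 3-byte-per-entry color table."""
    pixel_bytes = (width * height * bits_per_pixel(n_colors) + 7) // 8
    return pixel_bytes + n_colors * 3


pixels = [(255, 255, 255), (0, 0, 0), (255, 255, 255), (255, 0, 0)]
palette, indices = build_palette(pixels)
print(palette, indices)
print(indexed_bytes(1700, 1000, 16), "B for a 16-color 1700x1000 image")
```

With 16 colors each pixel needs 4 bits instead of 24, which is where the saving comes from before any further compression.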

Raster Graphics Formats

2. Compression by simplifying shapes

JPEG, JPEG2000

flexible compression level ==> variable file size (10% is frequently OK, 30% is good for almost everything)

“lossy” compression – loss of graphics quality, artefacts, especially at the shapes’ edges

Widely used in photo and video (TTL, through-the-lens) applications

Raster Graphics Formats

3. No compression at all (or lossless compression only, as in PNG-24)

TIFF, PNG-24, XWD, BMP, ...

Highest image quality

LARGEST file sizes

formats of choice in professional graphics, printing, satellite imagery, archiving

Raster Graphics Formats (cont.)

4. Both types of compression (color, shape) are irreversible (keeping an uncompressed archival copy is a reasonable approach).

5. Worst-case scenario: a second pass of compression on an already compressed JPEG. This way compression artefacts are preserved (costing file size) and magnified (costing quality).

6. Better way: keep an uncompressed archival copy and compress it to different levels of quality.

Raster Graphics Formats (cont.)

7. Surprise: 30% quality for JPEG compression is not so bad; sometimes (a photo for the Web, for instance) 10% is quite sufficient.

The Photoshop Save for Web utility is very useful for determining the necessary compression level visually.

Raster Graphics Formats

Example of compression levels (using Save for Web in Photoshop) for a scientific graph with a uniform background at 1700x1000 resolution (close to HDTV, 1080p), drawn with just 16 colors:

TIFF
- 5 150 000 B or 5.15 MB - no compression

JPEG
- 314 100 B (max = 100%) - subjectively, no artefacts
- 129 700 B (mid = 50%) - subjectively, minor artefacts
- 70 400 B (min = 10%) - subjectively, a lot of artefacts

GIF
- 133 200 B (256 colors) - no artefacts, adequate colors
- 113 000 B (64 colors) - no artefacts, adequate colors
- 85 360 B (16 colors) - no artefacts, adequate colors
- 76 760 B (8 colors) - color distortion starts here (16)
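
The file sizes above are easier to compare as compression ratios relative to the uncompressed TIFF. The short script below only reuses the numbers quoted on this slide; the GIF sizes are assumed to be in bytes like the others.

```python
TIFF_BYTES = 5_150_000  # uncompressed reference from the slide

# File sizes in bytes, as listed above.
sizes = {
    "JPEG 100%": 314_100,
    "JPEG 50%": 129_700,
    "JPEG 10%": 70_400,
    "GIF 256 colors": 133_200,
    "GIF 64 colors": 113_000,
    "GIF 16 colors": 85_360,
    "GIF 8 colors": 76_760,
}


def ratio(uncompressed, compressed):
    """How many times smaller the compressed file is."""
    return uncompressed / compressed


for label, size in sizes.items():
    print(f"{label:15s} {size:>9,d} B   1:{ratio(TIFF_BYTES, size):.1f}")
```

For this 16-color graph, the 16-color GIF is smaller than the mid-quality JPEG and, according to the slide, shows no artefacts.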

Animation

From Wikipedia:

Animation is the rapid display of a sequence of images of 2-D or 3-D artwork or model positions in order to create an illusion of movement. It is an optical illusion of motion due to the phenomenon of persistence of vision, and can be created and demonstrated in a number of ways.

The standard frame rate of TV-based animation is 30 frames per second (fps). Animation software generally adopts it as the base rate. By simply repeating your frames you can depart from this rate, but not dramatically: 8-10 fps is the practical limit. This means you can repeat each frame 3, 4, or perhaps 5 times, but usually not much more.

(Duration example: 45 frames x 3 repeats = 135 displayed frames; 135 / 30 fps = 4.5 s)
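
The duration arithmetic generalizes to any frame count and repeat factor. A minimal sketch, assuming the 30 fps base rate stated above (function names are illustrative):

```python
BASE_FPS = 30  # base rate from the slide


def effective_fps(repeat, base_fps=BASE_FPS):
    """Rate at which distinct frames change when each is shown `repeat` times."""
    return base_fps / repeat


def duration_seconds(n_frames, repeat, base_fps=BASE_FPS):
    """Playing time of n_frames distinct frames, each repeated `repeat` times."""
    return n_frames * repeat / base_fps


for repeat in (1, 3, 4, 5):
    print(f"repeat x{repeat}: {effective_fps(repeat):4.1f} distinct fps, "
          f"45 frames last {duration_seconds(45, repeat):.1f} s")
```

Repeating each frame 3 or 4 times gives 10 or 7.5 distinct frames per second, which is about the practical limit mentioned above.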

Animation

From the compression standpoint, an animation rate of 30 fps means that the graphics compression problem is 30 times worse than for static graphics.

From http://www.apple.com/quicktime/technologies/h264/:

Full HD uncompressed (4:4:4): 1920 W x 1080 H x 24 bit x 30 fps, or roughly 2 MP x 24 bit x 30 fps = 1424 Mbps

Compression ratio 1:200
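
The bitrate figure can be reproduced directly. In this sketch the only inputs are the numbers from the slide; note that the quoted 1424 Mbps comes out when "M" is read as 2^20, while with M = 10^6 the same stream is about 1493 Mbit/s.

```python
def uncompressed_bps(width, height, bits_per_pixel, fps):
    """Raw video bitrate in bits per second."""
    return width * height * bits_per_pixel * fps


bps = uncompressed_bps(1920, 1080, 24, 30)

print(f"{bps:,d} bit/s")
print(f"{bps / 2**20:.0f} Mbit/s with M = 2**20")
print(f"{bps / 1e6:.0f} Mbit/s with M = 10**6")

# Target bitrate at the 1:200 compression ratio quoted on the slide.
compressed_bps = bps / 200
print(f"{compressed_bps / 1e6:.2f} Mbit/s after 1:200 compression")
```

At a 1:200 ratio the stream drops to roughly 7.5 Mbit/s.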

Animation

The good news: an animated stream has a lot of redundancy, so its compression can be dramatically more effective than compression of static graphics. Earlier coding techniques used independently coded (I) frames, singly predicted (P) frames, which depend only on a previous I or P frame, and bipredicted (B) frames, which depend on a past and a future I or P frame. The current advanced codecs (H.264) are much more flexible, which improves quality and decreases the bitrate for a given resolution, or increases the resolution for a given bitrate.

Animation – practical approach

- Use all ARSC resources (supercomputers, Linux boxes with 3D-graphics subsystems, fine Mac OS X software, etc.)
- Make frames in sufficient quantities (remember 30 fps) with any capable viz package; use large fonts and thick lines
- To make the sequence UNIX-friendly for scripting, number it without leading zeros if possible (f_1001.png rather than f_0001.png)
- Linux: the animate and convert utilities can crop and change formats and quality; they give fast results but are quite limited in scope (high-end codecs are commercial products)
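
One way to read the numbering advice is to start the frame counter at 1001, so that every name has the same width without any zero padding. The sketch below illustrates that reading; the function name, prefix, and extension are only examples.

```python
def frame_names(n_frames, start=1001, prefix="f_", ext=".png"):
    """Frame file names with equal-width numbers and no leading zeros."""
    last = start + n_frames - 1
    if len(str(start)) != len(str(last)):
        raise ValueError("frame numbers would change width; pick a larger start")
    return [f"{prefix}{start + i}{ext}" for i in range(n_frames)]


names = frame_names(45)
print(names[0], "...", names[-1])

# Plain alphabetical sorting (as done by `ls` or shell globs) keeps the order.
print(sorted(names) == names)

# Counter-example: unpadded names starting at 1 sort as f_1, f_10, f_11, ...
unpadded = [f"f_{i}.png" for i in range(1, 13)]
print(sorted(unpadded) == unpadded)
```

With this scheme a shell loop over the numbers 1001 to 1045 needs no special formatting, and alphabetical order matches frame order.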

Animation – practical approach

- Mac OS (ARSC-supported):
- QuickTime 7 (compression, timing, outputs for various media)
- QuickTime Player (screen recording)
- Final Cut Pro (industry standard)
- Shake (advanced compositing)
- Windows: DivX Pro (was AVI, uses the same H.264 codec), with basically the same functionality as QuickTime 7; a free DivX player is available for Mac, while QuickTime 7 is available for Windows
