Polyglot

Kenton McHenry, Rob Kooper, Michal Ondrejcek, Chris Navarro, Liana Diesendruck, Aswhini Vaidya, Peter Bajcsy
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign

The Problem

• The abundance of file formats is a problem when preserving electronic records
• Why?
  • Will there be software to load the file in the future?
  • If not, will the specification for the format still exist?
  • Was the specification ever available to begin with (closed/proprietary formats)?

Available 3D File Formats…

*.pdf (*.prc, *.u3d), *.ma, *.mb, *.mp, *.k3d, *.w3d, *.vtk, *.vtp, *.lwo, *.c4d, *.dwg, *.blend, *.iam, *.skp, *.max, *.3ds

Converting Formats

• In order to preserve content for future use, one option is to convert the file to an open/standardized format that is likely to be supported for some time.
• Store both this file and the original for provenance
• Ideally, with one file format for a particular content type, it will be easy for users to view/use the data.

Converting Formats

• How and which format!?
• Fully supporting the many available formats is an enormous undertaking
• If a file format is closed/proprietary it may be difficult to retrieve the data directly from the file
  • May be possible to reverse engineer and recover some of the content
• File formats sometimes store pieces of information specific to application features that are not supported in other formats
  • Examples: animations, physics, …
  • When converting to a format that doesn’t have a place for such information we must drop it.
  • Information loss!

3D Data Representations

• There are different ways of storing 3D content
• Faceted:
  • Comprised of vertices and faces
  • Popular within the graphics community
• Boundary Representation:
  • Comprised of vertices, edges, edge loops, and primitive surfaces
  • Popular among CAD users
• Constructive Solid Geometry:
  • Comprised of boolean operations on primitive volumes
• …

3D Data Representations

• Translating geometry representation may not be trivial
• B-Rep to Faceted:
  • Translating involves triangulating the surfaces created from the bounded primitives (tessellation)
  • The resulting sampled surface will suffer from aliasing at high viewing resolutions
  • Can accommodate by performing a finer triangulation (i.e. more triangles and a larger file)
• Faceted to B-Rep:
  • Translating in this direction is non-trivial!
  • How does one decide if a group of triangles should be grouped together as part of some larger primitive (e.g. part of a cylinder)?
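The trade-off in the B-Rep to faceted direction can be illustrated with a small sketch. It tessellates the curved side of a cylinder and compares the faceted area with the exact analytic area. The function names and the choice of a cylinder are illustrative assumptions, not part of the NCSA tools.

```python
import math

def cylinder_side_mesh(radius, height, segments):
    """Tessellate the curved side of a cylinder into 2 * segments triangles."""
    vertices = []
    for i in range(segments):
        angle = 2.0 * math.pi * i / segments
        x, y = radius * math.cos(angle), radius * math.sin(angle)
        vertices.append((x, y, 0.0))      # bottom ring, index 2*i
        vertices.append((x, y, height))   # top ring, index 2*i + 1
    faces = []
    for i in range(segments):
        j = (i + 1) % segments
        faces.append((2 * i, 2 * j, 2 * j + 1))
        faces.append((2 * i, 2 * j + 1, 2 * i + 1))
    return vertices, faces

def triangle_area(a, b, c):
    """Area of a triangle from half the length of the edge cross product."""
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(x * x for x in cross))

def mesh_area(vertices, faces):
    return sum(triangle_area(vertices[i], vertices[j], vertices[k])
               for i, j, k in faces)

exact = 2.0 * math.pi * 1.0 * 2.0  # analytic side area for radius 1, height 2
for segments in (8, 32, 128):
    vertices, faces = cylinder_side_mesh(1.0, 2.0, segments)
    error = (exact - mesh_area(vertices, faces)) / exact
    print(segments, "segments:", len(faces), "triangles, relative area error", round(error, 5))
```

The error shrinks as the triangle count grows, which is the "finer triangulation, larger file" trade-off; nothing in the resulting triangle list records that the surface was originally a cylinder.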

Information Loss Across 3D Formats

[Table: feature support across the formats 3ds, igs, lwo, obj, ply, stp, wrl, u3d, and x3d. Columns: Geometry (Faceted, Parametric, CSG, B-Rep), Appearance (Color, Material, Texture, Bump), Scene (Lights, Views, Trans., Groups), and Animation.]

• How can we empirically track information loss?
• Which is the best format in terms of preserving the data?

Conversion, Information Loss, and Optimal File Formats

• With a “universal” converter we could convert files from every format A to every other format B
• Assuming we then had a loader for both format A and format B, we could load and compare the 3D content independent of how it is stored
• Within the context of digital preservation we can define an optimal format as one that retains, on average, the most information when other formats are converted to it
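The definition of an optimal format can be written down directly. The sketch below assumes each conversion from format A to format B has already been given a score for retained information; the scores in the example are made-up placeholders, not measured results.

```python
def best_target_format(scores):
    """scores maps (source_format, target_format) to an information-retention score.

    Returns the target format with the highest average incoming score,
    together with all of the averages.
    """
    incoming = {}
    for (source, target), score in scores.items():
        incoming.setdefault(target, []).append(score)
    averages = {target: sum(values) / len(values)
                for target, values in incoming.items()}
    return max(averages, key=averages.get), averages

# Hypothetical scores, for illustration only.
example_scores = {
    ("obj", "stp"): 0.9, ("ply", "stp"): 0.7,
    ("stp", "obj"): 0.6, ("ply", "obj"): 0.5,
    ("obj", "ply"): 0.4, ("stp", "ply"): 0.8,
}
best, averages = best_target_format(example_scores)
print(best, averages)
```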

Converting 3D Files

• i.e. loading and saving 3D files
• There is no one application that can convert between any pair of 3D file formats
  • Nothing quite like ImageMagick for image conversions
  • Many formats, closed specifications
• Machine readable specifications
  • XML, EXPRESS
  • DFDL, XCL
  • Specification must be written (implemented) in this language!

NCSA Polyglot (2009)

• Conversion service based on utilizing any and all available software support
• Software reuse (analogous to code reuse)
  • Attach a programmable interface to compiled software
  • Scripted operations within software (e.g. open, save)
  • GUI scripting (e.g. AutoHotKey)
• Created a simple workflow across application open/save operations
• Compared files before/after conversion to measure information loss
  • NCSA 3D Utilities
• Distributed across multiple machines for horizontal scalability

ISDA Tools for Digital Preservation

• Software Servers
• The Conversion Software Registry (CSR)
• Polyglot 2
• Versus

Software Servers

• Provides access to software functionality over the web
• Similar to services such as: telnetd, sshd, VNC, rdesktop
• The main difference is in the interface:
  • REST interface and Java API
  • Capable of being programmed against
  • Widely accessible across a number of programming/scripting languages
  • Uniform across all software
  • Simple

http://host:8182/software/{application}/{task}/{output_format}/{input_file}

Problem

• Diverse software interfaces
  • Command Line
  • Macros/Scripts
  • API
  • Graphical User Interface (GUI)
• Characteristics of the desired interface:
  • Programmable
  • Consistent across all software
• Some sort of interface will be present
• Must be usable by something

Approach

• Wrap individual units of software functionality
  • Support GUIs!
  • Consistent
• Convert ALL software to command line interfaces
• Expose a new API
• Manage software
  • Protect environment
  • Protect non-atomic tasks
  • Maintain operation (i.e. don’t crash the system)

Wrapping Software Functionality

• Command line, API interfaces achievable in any of a variety of languages
• GUIs
  • AppleScript
    • Requires software support
  • AutoHotKey, AutoIT, Xnee
    • OS dependent, however no software support required
  • Sikuli
    • OS and software independent

Wrapper Script Conventions

• Filename:
  • alias_operation.ahk
  • e.g. A3DReviewer_open.ahk
• Arguments:
  • Input file, Output file, Scratch space
  • e.g. A3DReviewer_open C:\foo.obj C:\foo.stp C:\Temp
• Header:
  • Name (Version)
  • Type
  • Input files
  • Output files
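A minimal sketch of how a caller might apply these conventions: split a wrapper filename into its alias and operation, and assemble the three positional arguments. The helper names are hypothetical and are not taken from the Software Server code base.

```python
import ntpath  # Windows-style path handling, independent of the host OS

def parse_wrapper_name(script_path):
    """Split 'alias_operation.ext' into (alias, operation)."""
    stem, _extension = ntpath.splitext(ntpath.basename(script_path))
    alias, separator, operation = stem.rpartition("_")
    if not separator or not alias or not operation:
        raise ValueError("expected alias_operation.<ext>, got: " + script_path)
    return alias, operation

def build_command(script_path, input_file, output_file, scratch_dir):
    """Argument order follows the convention: input, output, scratch space."""
    return [script_path, input_file, output_file, scratch_dir]

print(parse_wrapper_name(r"C:\Scripts\A3DReviewer_open.ahk"))
print(build_command("A3DReviewer_open.ahk", r"C:\foo.obj", r"C:\foo.stp", r"C:\Temp"))
```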

Operations

• Input/Output:
  • open, save, convert
• Maintenance:
  • exit, monitor, kill
• Everything else:
  • modify

Modify Operation

• Added support for arbitrary operations
• Implied by any non-recognized operation
• Can take input and output or nothing (determined from header)
• Examples:
  • blur image
  • decimate mesh
  • change font to X in a document
  • record video/audio
  • make a pie chart in a spreadsheet with given data
  • anything that can be scripted…

Manage Software

• Hide direct access to software operations
  • e.g. “open” doesn’t make sense by itself
• Tasks
  • Convert:
    • convert
    • open + save
  • Modify:
    • modify
    • open + modify + save
• Files are the main means of input and output!

Manage Software

• Monitor task execution
  • Monitor desktop for rare events
  • Exit less reliable applications after each task
  • Timed execution
    • If time is exceeded, kill the operation
• Protect the state of the desktop!
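Timed execution can be sketched with the Python standard library. This is an illustration of the idea only, not the Software Server implementation; it kills just the launched process, so a wrapper that spawns further child processes would need extra handling.

```python
import subprocess
import sys

def run_with_timeout(command, seconds):
    """Run a command and kill it if it exceeds the time limit.

    Returns (finished, return_code); return_code is None if the task was killed.
    """
    process = subprocess.Popen(command)
    try:
        return True, process.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        process.kill()   # the "kill" operation
        process.wait()   # reap the process so nothing lingers
        return False, None

# A task that hangs is stopped after half a second.
hanging_task = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_with_timeout(hanging_task, 0.5))
```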

Expose an Interface

• Uses the Java Restlet framework to provide a web based interface
• Functionality can be easily accessed from many languages through URLs

http://host:8182/software/{application}/{task}/{output_format}/{input_file}

AutoHotKey

• GUI based scripting language
  • Uses the message passing system behind the Windows OS
  • Store and send messages to the appropriate buttons/menu items to carry out tasks
• Window IDs identify individual widgets
  • Tools: AutoIt3 Windows Spy, WinSpectorSpy
• Some care needs to be taken to respond to dialogue boxes that may appear if an error occurs.

AutoHotKey Scripting

;Run program if not already running
IfWinNotExist, Adobe 3D Reviewer
{
  Run, C:\Program Files\Adobe\Acrobat 9.0\Acrobat\plug_ins3d\prc\A3DReviewer.exe
  WinWait, Adobe 3D Reviewer
}

;Activate the window
WinActivate, Adobe 3D Reviewer
WinWaitActive, Adobe 3D Reviewer
…

;Open model
PostMessage, 0x111, 57601, 0,, Adobe 3D Reviewer
WinWait, Open
ControlSetText, Edit1, %1%
ControlSend, Edit1, {Enter}

;Make sure model is loaded before exiting
Loop
{
  IfWinExist, Adobe 3D Reviewer - [%name%]
  {
    break
  }

  ;Click "OK" if this was an unknown format
  ControlGetText, tmp, Static2, Adobe 3D Reviewer
  if(tmp = "Unknown Format")
  {
    ControlClick, Button1, Adobe 3D Reviewer
  }
}

Sleep, 500

AutoHotKey Scripting

;OpenOffice
;document
;ImageMagick
;doc, odt, rtf, txt
;doc, odt, pdf, rtf, txt

;Run program
Run, "C:\Program Files\OpenOffice.org 3\program\soffice.exe" -headless -norestore "-accept=socket`,host=local…"
RunWait, "C:\Program Files\OpenOffice.org 3\program\python.exe" "C:\Converters\DocumentConverter.py" "%1%" "%2%"

Problems with AutoHotKey

• Can suffer from timing issues
  • e.g. clicking on a button before the button has appeared
  • Can often be avoided with some work
• GUI interfaces are designed for ease of human use
  • Humans see!

Graphical User Interfaces

WinMenuSelectItem, Untitled - Notepad,, File, Open
ControlSetText, Edit1, %1%
ControlSend, Edit1, {Enter}

If not done carefully, the script will attempt to fill the edit box and press open before the open dialogue box is even visible!
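The usual remedy for this kind of race is to wait for a condition rather than assume the dialogue is ready. The sketch below shows the pattern in generic Python with a simulated dialogue; `wait_until` and `dialog_visible` are hypothetical names, and in AutoHotKey the same role is played by commands such as `WinWait`, which the scripts above already use.

```python
import time

def wait_until(condition, timeout=10.0, interval=0.1):
    """Poll condition() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Simulated GUI: the "Open" dialogue becomes visible 0.3 s after the menu click.
clicked_at = time.monotonic()

def dialog_visible():
    return time.monotonic() - clicked_at >= 0.3

if wait_until(dialog_visible, timeout=5.0, interval=0.05):
    print("dialogue is visible, safe to fill the edit box")
else:
    print("dialogue never appeared, abort the task")
```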

Vision Based Scripting

• Avoid the timing issues inherent to GUI scripting languages like AutoHotKey
  • Uses screen shots to see the desktop
  • Acts based on what is “seen”

Sikuli Script

• Uses Jython and the Java Robot class to allow scripting with screenshots

Sikuli Script

• Because it can “see” the desktop it doesn’t suffer from the timing issues observed with AutoHotKey
• Relatively slow compared to AutoHotKey
• Dependent on desktop theme
• Current release only allows script creation through IDE via Python and manual screen grabs

Monkey Script

• “Monkey see, monkey do”
• Uses open source TightVNC library and the Java Robot class to record user actions and screenshots before and after each action

AHKlipse

Software Server Setup

• ScriptDebugger
  • Automatically tests functionality of scripts using a small data set
  • Configures scripts to run on a given machine
  • Runs scripts a number of times using random data to evaluate robustness
• ScriptInstaller
  • Searches local system for installed software
  • Queries the Conversion Software Registry (CSR)
  • Uses debugger to configure script for local system
  • Optionally downloads test data from CSR and tests the robustness of the script on the local system

ISDA Tools for Digital Preservation

• Software Servers
• The Conversion Software Registry (CSR)
• Polyglot 2
• Versus

The Conversion Software Registry

• There is a lot of software available, each with its own unique capabilities • How can someone know what software to get for their needs?

http://isda.ncsa.illinois.edu/NARA/CSR

The Conversion Software Registry

Adobe 3D Reviewer

The Conversion Software Registry

Input/Output Graphs

Adobe 3D Reviewer

Input/Output Graphs

3DS Max Adobe 3D Reviewer AutoCAD Blender Cinema 4D K-3D LightWave 3D Maya Wings 3D

Input/Output Graphs

Shortest conversion path
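Finding the shortest conversion path in an input/output graph can be sketched as a breadth-first search over formats, where each edge is labelled with the application that performs the conversion. The edge list below is a small illustrative assumption based on conversions mentioned elsewhere in these slides, not the registry's actual contents.

```python
from collections import deque

def shortest_conversion_path(edges, source, target):
    """edges: iterable of (application, input_format, output_format).

    Returns the list of edges with the fewest conversions, or None if the
    target format cannot be reached.
    """
    graph = {}
    for application, input_format, output_format in edges:
        graph.setdefault(input_format, []).append((application, output_format))

    previous = {source: None}  # format -> (previous format, application)
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            break
        for application, following in graph.get(current, []):
            if following not in previous:
                previous[following] = (current, application)
                queue.append(following)

    if target not in previous:
        return None
    path = []
    current = target
    while previous[current] is not None:
        earlier, application = previous[current]
        path.append((application, earlier, current))
        current = earlier
    path.reverse()
    return path

example_edges = [
    ("Blender", "wrl", "stl"),
    ("A3DReviewer", "stl", "stp"),
    ("A3DReviewer", "stp", "igs"),
]
print(shortest_conversion_path(example_edges, "wrl", "stp"))
```

Breadth-first search minimizes the number of conversions; once edges carry information-loss weights, a weighted search such as Dijkstra's algorithm would be the natural replacement.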

A Two Click Configuration

• The Script Debugger and Script Installer, connected with the CSR, make possible a two click configuration of a software server.
• Modify Windows registry:

[HKEY_CLASSES_ROOT\lnkfile\Shell\Share]
@="Share Functionality"
[HKEY_CLASSES_ROOT\lnkfile\Shell\Share\command]
@="C:\\Program Files\\Polyglot2\\Share.bat %1"

• … to point to a small batch file, Share.bat:

cd "%~dp0"
call ScriptInstaller -shortcut \"%1\" SoftwareServer

Software Server REST Interface

Programming/Scripting with Software Servers

#!/bin/bash
# Convert every *.stp file in the current directory to *.igs using Adobe 3D Reviewer
host="http://141.142.224.231:8182"
application="A3DReviewer"
task="convert"
output="igs"
input="stp"
url=$host/software/$application/$task/$output

for input_file in `ls *.$input` ; do
  # Post the file to the Software Server; the reply is used as the URL of the result
  output_url=`curl -s -H "Accept:text/plain" -F "file=@$input_file" $url`
  output_file=${input_file%.*}.$output
  echo "Converting: $input_file to $output_file"

  # Retry the download until it succeeds
  while : ; do
    wget -q -O $output_file $output_url
    if [ ${?} -eq 0 ] ; then
      break
    fi
  done

  sleep 1
done

Authentication

http://{username}:{password}@{host:port}/software/{application}/...

Java API

SoftwareServerClient softwareserver = new SoftwareServerClient("localhost", 50002);
Vector applications = softwareserver.getApplications();
for(int i=0; i

Java API

SoftwareServerClient softwareserver = new SoftwareServerClient("localhost", 50002);
TaskList tasks = new TaskList(softwareserver);
tasks.add("Blender", "convert", "./heart.wrl", "heart.stl");
tasks.add("A3DReviewer", "open", "heart.stl", "");
tasks.add("A3DReviewer", "export", "", "heart.stp");
tasks.execute("./");
softwareserver.close();

Software Server Client

> SoftwareServerClient localhost:50000 -cwd data
softwareserver> ls
heart.wrl

softwareserver> tasks
task 1> Blender convert ./heart.wrl heart.stl
task 2> A3DReviewer open heart.stl ""
task 3> A3DReviewer export "" heart.stp
task 4> end
softwareserver> ls
heart.stp
heart.wrl

Scientific Workflow Systems

Cyberintegrator

Expanding Workflow Systems

Robustness

• Measure throughput of software on a Software Server
  • Change file extensions in filename
  • TRY TO MAKE IT FAIL!!!
• Experiment:
  • Software: 3D Studio Max, Adobe 3D Reviewer, Blender, Google Sketchup, ImageMagick, IrfanView, Microsoft Paint, Microsoft Word, ParaView, VTK
  • Machine: VM with 1 core and 1GB of memory
  • Case 1 - correctly named files:
    • 1395 tasks/hour with an average wait of 4.42 s
  • Case 2 - incorrectly named files:
    • 945 tasks/hour with an average wait of 11.17 s
• Server did not crash!

Robustness

• Command line software (baseline):
  • ImageMagick: 1871 tasks/hour
  • IrfanView: 3163 tasks/hour
• GUI software:
  • 3DS Max: 355 tasks/hour
  • Microsoft Word: 756 tasks/hour
• How many people would it take using this software for the same throughput?

Software Parameters

• Software Servers initially utilized software according to how they were called within the wrapper scripts
  • Fixed parameters
  • Write multiple scripts with different parameter values

Supporting Parameters

> alias_convert.ahk path1/filename1 path2/filename2
> alias_convert.ahk path1/filename1 path2/filename2 p1=v1 p2=v2

Supporting Parameters

;json
;name (version)
;type
;input formats (comma separated)
;output formats (json format)
;software parameters (comma separated)

Supporting Parameters

;json
;Image J (v1.6)
;image
;jpg
;png,tif,jpg
;{"outputFormat":[{"format":"tif"},{"format":"jpg", "paramSelections":[
  {"continuousParam":"true", "nameForDisplay":"quality", "nameInSoftware":"Save quality", "range":"true", "minRange":"1", "maxRange":"100", "possibleValues":[], "nestedParams":[[]]},
  {"discreteParam":"true", "nameForDisplay":"progressive", "nameInSoftware":"Save as progressive JPG", "possibleValues":["true", "false"], "nestedParams":[[], []]},
  {"discreteParam":"true", "nameForDisplay":"grayscale", "nameInSoftware":"Save as grayscale JPG", "possibleValues":["true", "false"], "nestedParams":[[], []]},
  {"discreteParam":"true", "nameForDisplay":"color subsampling", "nameInSoftware":"Disable color subsampling", "possibleValues":["true", "false"], "nestedParams":[[], []]},
  {"discreteParam":"true", "nameForDisplay":"original EXIF", "nameInSoftware":"Keep original EXIF data", "possibleValues":["true", "false"], "nestedParams":[[], []]},
  {"discreteParam":"true", "nameForDisplay":"original IPTC", "nameInSoftware":"Keep original IPTC data", "possibleValues":["true", "false"], "nestedParams":[[], []]},
  {"discreteParam":"true", "nameForDisplay":"Reset EXIF orientation tag", "nameInSoftware":"Reset EXIF orientation tag", "possibleValues":["true", "false"], "nestedParams":[[], []]},
  {"discreteParam":"true", "nameForDisplay":"keep comment", "nameInSoftware":"Keep original JPG-Comment", "possibleValues":["true", "false"], "nestedParams":[[], []]}]}]}
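A header like this makes it possible to check `name=value` arguments before a script is run. The sketch below parses an abbreviated copy of the JSON (two of the jpg parameters) and validates arguments against it. The helper names, and the choice to match arguments against `nameForDisplay`, are assumptions made for illustration.

```python
import json

# Abbreviated copy of the header JSON: the tif entry plus two jpg parameters.
HEADER_JSON = """
{"outputFormat": [
  {"format": "tif"},
  {"format": "jpg", "paramSelections": [
    {"continuousParam": "true", "nameForDisplay": "quality",
     "nameInSoftware": "Save quality", "range": "true",
     "minRange": "1", "maxRange": "100", "possibleValues": [], "nestedParams": [[]]},
    {"discreteParam": "true", "nameForDisplay": "grayscale",
     "nameInSoftware": "Save as grayscale JPG",
     "possibleValues": ["true", "false"], "nestedParams": [[], []]}
  ]}
]}
"""

def parameters_for(header, output_format):
    """Map display name to parameter description for one output format."""
    for entry in header["outputFormat"]:
        if entry["format"] == output_format:
            return {p["nameForDisplay"]: p for p in entry.get("paramSelections", [])}
    raise KeyError(output_format)

def is_valid(parameter, value):
    """Continuous parameters must lie in [minRange, maxRange]; discrete ones must be listed."""
    if parameter.get("continuousParam") == "true":
        try:
            number = float(value)
        except ValueError:
            return False
        return float(parameter["minRange"]) <= number <= float(parameter["maxRange"])
    return value in parameter["possibleValues"]

def check_arguments(header, output_format, tokens):
    """Validate name=value tokens and return the accepted values as a dict."""
    known = parameters_for(header, output_format)
    accepted = {}
    for token in tokens:
        name, separator, value = token.partition("=")
        if not separator or name not in known or not is_valid(known[name], value):
            raise ValueError("rejected parameter: " + token)
        accepted[name] = value
    return accepted

header = json.loads(HEADER_JSON)
print(check_arguments(header, "jpg", ["quality=85", "grayscale=true"]))
```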

Supporting Parameters in the CSR

Supporting Parameters

Software Parameter Recorder

Supporting Parameters in Software Servers

http://host/software/{application}/{task}/{output_format}/{input_file}

Supporting Parameters in Software Servers

http://host/software/{application}/{task}/{output_format}/{input_file}/{param1=x}&{param2=y}
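A client can assemble such URLs mechanically. The sketch below follows the pattern shown above, appending parameters as a final path segment joined by `&`. Percent-encoding each component is an assumption; the slides do not say how the server expects special characters to be escaped.

```python
from urllib.parse import quote

def software_server_url(host, application, task, output_format, input_file, parameters=None):
    """Build a Software Server request URL, optionally with software parameters."""
    segments = [application, task, output_format, input_file]
    url = host.rstrip("/") + "/software/" + "/".join(quote(s, safe="") for s in segments)
    if parameters:
        pairs = [quote(str(name), safe="") + "=" + quote(str(value), safe="")
                 for name, value in parameters.items()]
        url += "/" + "&".join(pairs)
    return url

print(software_server_url("http://host:8182", "ImageJ", "convert", "jpg", "photo.png",
                          {"quality": 85, "grayscale": "true"}))
```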

ISDA Tools for Digital Preservation

• Software Servers
• The Conversion Software Registry (CSR)
• Polyglot 2
• Versus

Polyglot

• Listens for Software Servers on the network
• Catalogues available input/output operations and constructs an I/O-graph
• Identifies conversion paths between input and output formats
• Carries out distributed chained conversions

Polyglot Client

> PolyglotClient image.jpg gif ./

Polyglot Panel

Scalability

Java API

PolyglotClient polyglot = new PolyglotClient("localhost", 50002);
Vector formats = polyglot.getOutputs("wrl");
for(int i=0; i

Java API

PolyglotClient polyglot = new PolyglotClient("localhost", 50002);
polyglot.convert("heart.wrl", "./", "stp");
polyglot.close();

ISDA Tools for Digital Preservation

• Software Servers
• The Conversion Software Registry (CSR)
• Polyglot 2
• Versus

Versus

• Distributed service and framework for content based comparisons
  • http:///api/v1/comparisons
  • POST: adapter, extractor, measure, file URLs
• Reusable/extensible components
  • Adapters: Parse file and load contents into a suitable data structure
  • Extractors: Extract semantically meaningful features from content and represent them as a suitable descriptor
  • Descriptors: Numerical representations of content
  • Measures: Methods taking two descriptors as input and returning a value of similarity or dissimilarity
  • Indexers: Methods which organize descriptors for the purpose of fast comparisons
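How the components compose can be sketched in a few lines: an adapter loads content, an extractor turns it into a descriptor, and a measure compares two descriptors. This is a toy illustration of the pattern, not the Versus API. The byte-histogram extractor in particular is a hypothetical stand-in for the real extractors.

```python
import hashlib
import math

def bytes_adapter(data):
    """Adapter: the simplest possible representation, the raw bytes."""
    return bytes(data)

def md5_extractor(content):
    """Extractor: MD5 digest of the content (descriptor is a hex string)."""
    return hashlib.md5(content).hexdigest()

def byte_histogram_extractor(content):
    """Extractor (hypothetical): normalized 256-bin histogram of byte values."""
    counts = [0] * 256
    for value in content:
        counts[value] += 1
    total = len(content) or 1
    return [count / total for count in counts]

def md5_measure(first, second):
    """Measure: binary, 1.0 if the digests are equal and 0.0 otherwise."""
    return 1.0 if first == second else 0.0

def euclidean_measure(first, second):
    """Measure: Euclidean distance between two numeric descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(first, second)))

def compare(first, second, adapter, extractor, measure):
    """Run adapter -> extractor on both inputs, then apply the measure."""
    return measure(extractor(adapter(first)), extractor(adapter(second)))

print(compare(b"abc", b"abd", bytes_adapter, md5_extractor, md5_measure))
print(compare(b"aaaa", b"bbbb", bytes_adapter, byte_histogram_extractor, euclidean_measure))
```

Because each stage only depends on the type of the value passed to it, components can be swapped independently, which is the reuse the slide describes.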

Adapters

Name | Package | Description
Mesh | 3D | Load 3D files content into a mesh made up of vertices and polygons connecting those vertices.
Audio | Audio | Encapsulation of audio files.
Bytes | Core | Simplest possible representation of data.
PDF | Doc2Learn | Encapsulation of the Doc2Learn PDF document.
Buffered Image | Image | Standard Java representation of image data.
Image Object | Image | Encapsulation of the Im2Learn Image Object.
SIFT GPU | GPU | Encapsulation of image data for SIFT Gpu specific processing.

Extractors

Name | Package | Description
Light Field | 3D | Surface is represented by silhouettes taken from 3 canonical positions capturing the surface shape minus any concavities (i.e. the convex hull).
Statistics | 3D | Ignores the surface and focuses on the vertices of a 3D object returning their mean and standard deviation. Simple, but fast to compute.
Surface Area | 3D | The sum of the area occupied by the polygons making up a surface. Considers surface and is still fast to compute.
Audio | Audio | Sampling of audio from existing file for histogram usage and comparison.
MD5 | Core | Creation of the MD5 hash from data.
Image Histogram | Doc2Learn | Generates a non-standard color histogram.
Line Graphics Histogram | Doc2Learn | Generates a histogram to compare vector graphics found in documents.
Text Histogram | Doc2Learn | Generates a label histogram based on word frequency.
Array Feature | Image | Generates the three-dimensional double array; a generic image container.
Color Average Vector Feature | Image | Generates an average RGB color over 9 regions taken from the image.
Grayscale Histogram | Image | Generates the histogram for grayscale images. Useful for image comparison.
Pixel Histogram | Image | Generates the multidimensional histogram for feature matching.
RGB Histogram | Image | Generates the histogram for color images. Useful for image comparison.
Signature Vector | Image | Feature vector (for an image) containing colorspace information and pixel position.
MOPS Features | Fiji | Open source implementation for the MOPS detector.
SIFT Features | Fiji | Open source implementation of Lowe’s method.
SIFT Gpu | Gpu | Gpu implementation for the SIFT detector.
Harris Corners | OpenCV | Corner detector for images.
Hough Circles | OpenCV | Circle detector for images.
Hough Lines | OpenCV | Line detector for images.
SURF Features | OpenCV | Open source implementation for the SIFT detector.

Descriptors

Name | Package | Description
Double Array | Core | A single dimensional array containing double values.
MD5 Digest | Core | A data integrity structure generated from the raw data.
Three Dimensional Double Array | Core | A three-dimensional array containing double values.
Vector | Core | A list of generic elements, allows greater storage flexibility.
Label Histogram | Doc2Learn | A histogram of labels obtained through Doc2Learn.
Keypoint | Image | Generic container for invariant feature detectors.
Pixel | Image | Generic type for various image package descriptors.
Color Layout | Image | A two dimensional grid of sub-images over the input image.
Grayscale Histogram | Image | A one-dimensional grayscale image histogram.
RGB Histogram | Image | A three-dimensional RGB color histogram.
Pixel Histogram | Image | A multidimensional histogram for a pixel’s intensity and position.
MOPS Features | Fiji | Invariant feature type used for image stitching.
SIFT Features | Fiji | Popular invariant feature type used for image comparison and object matching.
SIFT Gpu | Gpu | Same as SIFT but implemented through Gpu libraries.
Harris Corners | OpenCV | Well-known corner detector used for image inference, tracking, and recognition.
Hough Circles | OpenCV | Circles detected in an image with the Hough Transform.
Hough Lines | OpenCV | Lines detected in an image with the Hough Transform.
SURF Features | OpenCV | Invariant feature type that can be computed faster than standard SIFT.

Measures

Name | Package | Description
Chessboard Distance | Core | Also known as Chebyshev; the greatest difference along any coordinate dimension (between two vectors).
Dynamic Time Warping | Core | Similarity metric between two (possibly) varying sequences over time.
Euclidean Distance | Core | Distance between two n-dimensional points in Euclidean space.
Manhattan Distance | Core | Absolute difference of coordinates of points, distance between two points measured along right angled axes.
MD5 Hash | Core | Binary measure; either equal or not.
Bhattacharyya Distance | Image | Measures the overlap between two probability distributions.
Neyman’s χ² | Image | Tests the goodness of fit between two distributions. Variant of the standard χ² test.
Czekanowski Distance | Image | Sum of the absolute value of the difference of two distributions divided by the sum of the two distributions.
Histogram Euclidean Distance | Doc2Learn / Image | Bin-by-bin comparison using the standard Euclidean distance. Well known and widely used.
Histogram Intersection | Doc2Learn / Image | Sum of the absolute value of the difference of two distributions, scaled by one-half. Well known and widely used.
KL Divergence | Image | Non-symmetric measure of the difference between two probability distributions. Well known measure of entropy.
Jeffrey Divergence | Image | Symmetric measure of the difference between two probability distributions.
Motyka Distance | Image | Sum of the maximum of two distributions divided by the sum of the two distributions.
Normalized Cross Correlation | Image | Similar to sum of squared differences; invariant to the magnitude of two points.
Ruzicka Similarity | Image | Sum of the minimum of two distributions divided by the sum of the maximum of the two distributions.
Sum of Squared Differences | Image | Sum of squared differences between two arrays, cheaper to compute than Euclidean distance.
Tanimoto Distance | Image | Sum of the difference of the max and the min of two distributions divided by the sum of the max.
Wave Hedges Distance | Image | Sum of the absolute value of the difference between two distributions divided by their maximum.
Invariant Feature Comparison | Fiji / OpenCV / SIFT Gpu | Compares the invariant features between two images by calculating the pairwise Euclidean distance and voting for a match using a predetermined threshold.
Earth Mover’s Distance | OpenCV | Measures the similarity between two probability distributions. This is the minimum cost of transforming one distribution to the other.
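Several of these measures are short enough to write out directly. The sketch below implements five of them as described in the table, assuming two equal-length sequences of non-negative numbers with non-zero sums. It is an illustration, not the Versus code.

```python
import math

def chessboard(a, b):
    """Greatest difference along any coordinate (Chebyshev)."""
    return max(abs(x - y) for x, y in zip(a, b))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance between two n-dimensional points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def czekanowski(a, b):
    """Sum of absolute differences divided by the sum of the two distributions."""
    return sum(abs(x - y) for x, y in zip(a, b)) / sum(x + y for x, y in zip(a, b))

def ruzicka(a, b):
    """Sum of bin-wise minima divided by the sum of bin-wise maxima."""
    return sum(min(x, y) for x, y in zip(a, b)) / sum(max(x, y) for x, y in zip(a, b))

first, second = [1, 2, 3], [2, 2, 5]
for measure in (chessboard, manhattan, euclidean, czekanowski, ruzicka):
    print(measure.__name__, measure(first, second))
```

Note that the first four are distances (0 for identical inputs) while Ruzicka is a similarity (1 for identical inputs), so values from different measures are not directly comparable.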

Content Based Retrieval

Versus

• Includes or subsumes:
  • Comparisons within 3D Utilities
  • Doc2Learn
  • Comparison/Indices in Census work
  • Comparisons in LSVA work
  • OpenCV
  • ImageJ
  • …
• Versus is a Framework!
  • Distributed, scalable web service for content based comparisons
  • Consistent, reusable, repurposable components

Measuring Information Loss

We would like to assign a value to each conversion edge …

• With a “universal” converter we could convert files from every format A to every other format B
• Assuming we then had a loader for both format A and format B, we could load and compare the 3D content independent of how it is stored

Measuring 3D Information Loss

• Adobe 3D Reviewer
• Blender
• Cyberware PlyTool
• K-3D
• NIST VRML/X3D
• VTK

good… (e.g. 1.0)
not so good… (e.g. 0.1)

3D Information Loss and 3D File Loaders

• If we were able to load every file format we wouldn't have needed to build a “universal” converter.
• Implement loaders for a small number of formats that will make up our test data set
• Convert from format A along path to some format B then back to A again
• Estimate path scores by comparing before/after content and assigning scores to all edges along path
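One simple way to turn round-trip path scores into edge weights is to give every edge the average score of the round trips that pass through it. The slides do not state how scores are distributed over edges, so the averaging rule below is an assumption, and the scores are made-up placeholders.

```python
def edge_weights(scored_paths):
    """scored_paths: list of (path, score) where path is a list of
    (application, input_format, output_format) edges.

    Returns each edge's average score over all paths containing it.
    """
    totals, counts = {}, {}
    for path, score in scored_paths:
        for edge in path:
            totals[edge] = totals.get(edge, 0.0) + score
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: totals[edge] / counts[edge] for edge in totals}

# Two hypothetical round trips starting and ending at STP.
long_trip = [("A3D Reviewer", "stp", "wrl"), ("Vrml97ToX3d", "wrl", "x3d"),
             ("X3dToVrml97", "x3d", "wrl"), ("A3D Reviewer", "wrl", "stp")]
short_trip = [("A3D Reviewer", "stp", "wrl"), ("A3D Reviewer", "wrl", "stp")]
weights = edge_weights([(long_trip, 0.8), (short_trip, 0.6)])
for edge, weight in weights.items():
    print(edge, round(weight, 2))
```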

STP to X3D to STP

STP → (A3D Reviewer) → WRL → (Vrml97ToX3d) → X3D → (X3dToVrml97) → WRL → (A3D Reviewer) → STP

I/O-Graph Weights Tool

Measuring 3D Information Loss

• Data representation
  • Meshes
• Loaders
• Use 3D similarity as a means of comparing 3D models
  • Statistics
  • Surface Area [Brunnermeier, RTI 1999]
  • Spin Images [Johnson, PAMI 1999]
  • Light Fields [Chen, Eurographics 2003]

Statistics

• Use the mean and standard deviation of the vertices to represent the model
• Simple but fast to compute
• Sensitive to size and orientation of the model
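A minimal sketch of this descriptor, assuming vertices are (x, y, z) tuples and using the population standard deviation per axis (the slides do not say which variant is used):

```python
import itertools
import math

def vertex_statistics(vertices):
    """Per-axis mean and (population) standard deviation of the vertices."""
    count = len(vertices)
    mean = tuple(sum(v[axis] for v in vertices) / count for axis in range(3))
    deviation = tuple(
        math.sqrt(sum((v[axis] - mean[axis]) ** 2 for v in vertices) / count)
        for axis in range(3))
    return mean, deviation

unit_cube = [tuple(float(c) for c in corner)
             for corner in itertools.product((0, 1), repeat=3)]
doubled_cube = [(2 * x, 2 * y, 2 * z) for x, y, z in unit_cube]

print(vertex_statistics(unit_cube))
print(vertex_statistics(doubled_cube))  # differs: the descriptor is sensitive to size
```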

Surface Area

• Use the sum of face areas to represent the model
• Also simple and fast to compute
• Sensitive to size, somewhat sensitive to shape. Will detect loss of faces.
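A minimal sketch of the surface area descriptor, assuming a triangle mesh given as a vertex list and index triples (polygons with more sides would need to be triangulated first):

```python
import math

def triangle_area(a, b, c):
    """Half the length of the cross product of two edge vectors."""
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(x * x for x in cross))

def surface_area(vertices, faces):
    """Sum of the areas of all triangular faces."""
    return sum(triangle_area(vertices[i], vertices[j], vertices[k])
               for i, j, k in faces)

# A unit square made of two triangles.
square_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
square_faces = [(0, 1, 2), (0, 2, 3)]

print(surface_area(square_vertices, square_faces))      # full model
print(surface_area(square_vertices, square_faces[:1]))  # one face lost in conversion
```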

Light Fields [Chen, 2003]

• Compares silhouettes from various viewing angles around a model.

Light Fields

• Fairly fast to compute
• Sensitive to shape of convex hull, invariant to rigid transformations

Spin Images [Johnson, 1999]

• 2D histograms of the in plane and out of plane distances of vertices neighboring a given vertex.
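The construction can be sketched as follows. For a reference vertex p with normal n, each neighboring vertex x contributes its out-of-plane distance β = n · (x − p) and its in-plane distance α = sqrt(|x − p|² − β²) to a 2D histogram. The bin layout and function name are illustrative assumptions rather than the parameters used in the NCSA tools.

```python
import math

def spin_image(neighbors, point, normal, bin_size, bins):
    """Return a bins x bins histogram indexed as image[beta_bin][alpha_bin]."""
    length = math.sqrt(sum(c * c for c in normal))
    unit = tuple(c / length for c in normal)
    image = [[0] * bins for _ in range(bins)]
    half_height = bins * bin_size / 2.0  # beta is signed, so center it
    for vertex in neighbors:
        offset = tuple(vertex[k] - point[k] for k in range(3))
        beta = sum(offset[k] * unit[k] for k in range(3))
        alpha = math.sqrt(max(0.0, sum(c * c for c in offset) - beta * beta))
        row = math.floor((beta + half_height) / bin_size)
        column = math.floor(alpha / bin_size)
        if 0 <= row < bins and 0 <= column < bins:
            image[row][column] += 1
    return image

neighbors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 0.5)]
image = spin_image(neighbors, point=(0.0, 0.0, 0.0), normal=(0.0, 0.0, 1.0),
                   bin_size=0.5, bins=4)
for row in image:
    print(row)
```

The first two neighbors differ only by a rotation about the normal and land in the same bin, which is where the rotation invariance of the descriptor comes from.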

Spin Images

• Expensive to compute
• Sensitive to relative vertex position, ignores surface, invariant to rotations and translations

Which conversion preserved the most?

• Using the light fields measure:
  • Emphasizes shape through silhouettes
  • Adobe 3D Reviewer between *.pdf and *.stp (61.67)
• Using the spin image measure:
  • Emphasizes shape through relative vertex positions
  • Adobe 3D Reviewer between *.obj and *.pdf (59.07)

Which is the best format?

Within the context of digital preservation, which format retains on average the most information when other formats are converted to it?

• Using the light fields measure:
  • Emphasizes shape through silhouettes
  • *.stp (40.73)
• Using the spin image measure:
  • Emphasizes shape through relative vertex positions
  • *.stl (34.89)
  • *.stp being a CAD format has more variability in vertex positions due to tessellation

Conclusion

Image Software by Information Preservation
1. ImageMagick
2. Adobe Photoshop
3. GIMP
4. Microsoft Paint
5. …

3D Software by Information Preservation
1. Adobe 3D Reviewer
2. 3DS Max
3. Maya
4. Blender
5. …

Image Formats by Information Preservation, Image Formats by File Size (entries shown: JPG, PNG, GIF, PPM, …)
3D Formats by … (entries shown: STL, STP, PLY, OBJ, MAX, …)
Video Formats by … (entries shown: AVI, MOV, WMV, MPG, MP4, …)

1,682 formats
2,007 applications

* Rankings shown demonstrate the type of output that can be obtained and do NOT represent actual results!

Data Access Proxy

https://dap.ncsa.illinois.edu/convert/{output_format}/

Hands-on

• http://isda.ncsa.illinois.edu

• Services -> Conversion Software Registry
• Software -> Polyglot -> Documentation (Web)
  • WindowsQuickInstall.txt

https://opensource.ncsa.illinois.edu/stash/projects/POL/repos/polyglot/browse/src/main/documentation/WindowsQuickInstall.txt