AXF – Archive eXchange Format:
Interchange & Interoperability
for Operational Storage and Long-Term Preservation
Report to SMPTE Washington DC Section
Bits By the Bay Conference
21 May, 2014
S. Merrill Weiss / Merrill Weiss Group LLC
Chairman, TC-31FS30 AXF Working Group
• History
– Proprietary Systems
• Formats on Media
• Interface Protocols
• Same System Type Required for Recovery
• Media Migration Not Easy
– Danger of Orphaned Archives If System Support Ended
– High Costs of Implementation & Operation
• Individualized System Integration Requirements
• Transfer Costs Resulting from Inability to Interchange Media
• User Requirements
– Movement of Material Between Different Archive Systems
• Between Different Operations of Same Company
• Between Companies
• Retrieving Files & Metadata from Media Into Different Systems
– Flexibility in Changing Archive Management Vendors
• No Loss of Data or Metadata When Changing Vendors
• No Requirement for Native Use of Format
• User Requirements – cont’d.
– Archive Investment Protection
• Ability to Retrieve Files & Metadata In Absence of Creating System
• Ability to Read Media Into Other Archive Systems
• Ability to Retrieve Media Contents w/Simple Utility on Many OSs
– Automation Metadata Support
• Inclusion of Metadata for Systems Interacting w/Archive Systems
– E.g., Traffic, Automation, Color Correction, Editing Systems
• Allow Importing “Discovered” Items into Databases
• Additional Requirements
– Providing Unlimited Storage Capability
• File Size, Number of Files, Number of Media, etc.
– Providing an Implementation Strategy
– Providing Support for All Types of Media, Current & Future
– Providing for Storage on More Than One Medium (“Spanning”)
– Providing for Updates/Changes to Archive Objects over Time
– Providing for Information Recovery from Damaged Media
– Providing for File Exchange via Communications Channels
• Enables Use of Cloud Storage
• Underlying Assumption
– Same Type of Media Supported on Both Systems
• Source of Medium & Recipient of Medium
• Drives, Drivers, & Control Software
• Large Existing Installed Base & Large Archive Inventory
• Initially, Explicit Export & Import
– Export of Archive Objects to Media Specifically for Interchange
– Importing of Archive Objects into Receiving Systems
• By Translating to Native Formats of Receiving Systems
• By Inclusion of Interchanged Objects in Databases
• Later, Adoption of Format as Native in Archive Systems
– Eliminating Need for Separate Export/Import Steps
– Permitting Direct Transfer of Media Between Systems
• Data Recovery Utilities Available Through SMPTE
– To Be Contributed by WG Participants
– For Wide Variety of Operating Systems
– For All Media Types
– Permitting Data Recovery without an Archive System
– To Help Ensure Access to Archived Files & Associated Metadata
• SMPTE ST 2034 Part 1 Written & In Ballot
– Result of Years of Work & Refinement
– Uses XML Schema to Define Most Structures
– Expected to Be Completed This Year
• Part 2 Will Be Standard on Use of AXF Schema Upstream
– Providing Mechanism for Communicating Metadata to Archive Systems
– Most Major Functions Defined
– Writing of Document Begun
• Steps Required to Ensure Long-Term Accessibility
– Valuable File-Based Assets Stored in Digital Archives in All Industries
• Key Goals of the “Ideal Storage Format”
– Ensuring Long Term Accessibility
– Self-Describing Assets & Self-Describing Storage Media
– Encapsulation to Maintain Important Metadata & File Relationships
– Scalability for Any Number of Elements of Any Size & Type
– Standardized Regardless of Storage Media Technology
– Transportability & Compatibility between Systems
– Preservation (OAIS) Features such as Fixity, Provenance, etc.
• TAR
– Tape ARchive Format (Originally from UNIX)
• Has Been Around for Many Decades
– No Truly Universal TAR “Standard”
• Rather, Many “Customized” Implementations
– TAR Is a Legacy Format
• Cannot Support Many Core Functionalities Required in M&E Space
– TAR Misses Many of Storage Format Goals Outlined
• LTFS
– Linear Tape File System is Simple File System for Linear Data Tape
– Makes Data Tapes Appear as “Removable Storage”
– Currently No Standards Bodies Have Documented LTFS
• Often Mistakenly Referred to as a “Standard”
• Being Documented by Storage Networking Industry Association (SNIA)
– Considered By Itself, Has Some Significant Limitations
• With Respect to Storage Format Goals Outlined
– By Itself, Is Very Useful for Physical “Transport” of Content
• But Not for Long Term Storage or Preservation
– Why Not?
• LTFS cont’d.
– No Media Encapsulation
• Relies on Simple Folder Hierarchies to Form Important Asset Relationships
• Lacks Context
– Does Not Scale Well
• Due to Lack of Support for Spanning Across Storage Media
– A Significant Problem in M&E
– Only Supports Modern Data Tape Technologies
• Is Not Applicable to Any Other Storage Technologies
• TAR & LTFS
– Neither Achieves 100% of Long-Term Storage & Preservation Goals
• What other choices are there?
• Caution: Following Detailed Descriptions Are Not Adopted Yet
– Some Details Still May Change in Response to Ballot Inputs
• AXF – A File Collection “Wrapper”
– Essentially an Advanced ZIP
– Can Encapsulate Any Number of Files of Any Type & Size
• Brings Universal Transport & Interoperability to Storage
– At Same Level as MXF Brings to Content
• Designed to Support All Storage Technologies
– Now and into the Long-Term Future
• IT‐Centric and Not Tied to Media Applications Alone
• Being Standardized by SMPTE
• Considered within an IT System Stack
AXF Layered Context
[Figure: layered stack. Labels in extracted order: AXF-Aware Application; Server/Storage Stack with AXF Support; Archive eXchange Format (AXF), Including Internal File System; Block Level Addressing / File System; Medium (without File System) / Medium (with File System); Operating System, Hardware Abstraction Layer; Driver; Physical Drive]
• Unlimited Storage Support
– Any Number of Files, Any Size of Files, Plus Media Spanning
• Resilience to Media Damage and Corruption
– Redundancy in All Structures
– Payload Independently Recoverable
• Support for Media With & Without a File System
– Raw Data Tape, LTFS Data Tape, Spinning Disk, Flash Media, Optical Storage, Holographic Storage, & Anything in Future
• Support for Any Operating System and/or File System
– Embedded File System Abstracts Underlying Systems
• Self Describing Objects & Self Describing Media
– Enable Simple Transport of Objects & Media between Systems
• Object Versioning & Collection Support
– Support Complex Relationships between Objects
– Support Additions, Updates, etc.
• Support for All File Types – Not Just Media Files
– IT‐Centric Implementation
• Based on Experience Handling Big Data in M&E
• Streaming & File-Based Asset Transport & Delivery
– Support for Streaming De‐Encapsulation
– Support for In‐Path Checksums for Structures & Files
– Support for Cloud Storage & Delivery Applications
– Streaming Design Maximizes Transfer Speeds
• To/from All Media Types
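The in-path checksum idea above can be illustrated with a minimal sketch: the digest is updated on each chunk as it passes through the transfer, so no second read of the data is needed. This is not taken from the AXF specification. The function name copy_with_checksum, the chunk size, and the choice of SHA-256 are assumptions made only for illustration.

```python
import hashlib
import io


def copy_with_checksum(src, dst, chunk_size=1024 * 1024, algorithm="sha256"):
    """Copy src to dst in chunks, hashing each chunk while it is in transit.

    Hypothetical helper: the slides do not name a digest algorithm or API.
    Returns (bytes_copied, hex_digest).
    """
    digest = hashlib.new(algorithm)
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)  # checksum is computed in the transfer path
        dst.write(chunk)
        total += len(chunk)
    return total, digest.hexdigest()


# Usage with in-memory streams standing in for a source file and a medium.
source = io.BytesIO(b"example payload " * 1000)
target = io.BytesIO()
copied, checksum = copy_with_checksum(source, target, chunk_size=4096)
```

A receiver can run the same loop while reading and compare the resulting digest with the stored one, which is one common way to check fixity during streaming.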
What Is AXF?
[Figure: AXF Object, comprising Asset Components (Metadata, Payload File, Ancillary Files), Preservation Elements (Access Control, Provenance, Fixity, Context, Reference), and a Universal Storage-Agnostic File System. Other labels: Structured, Unstructured, Proprietary, Open, A, V]
Storage Technology & File Systems
AXF Object
• Spinning Disk (NTFS, FAT, etc.)
• Solid State Disk (NTFS, FAT, etc.)
• Holographic (UDF, etc.)
• Blu-Ray (BDFS, UDF, etc.)
• Flash Media (NTFS, FAT, etc.)
• DVD (UDF, etc.)
• Data Tape (Block-Based, LTFS, etc.)
• Future Storage Technologies
AXF on Storage Medium
Medium layout: AXF Medium Identifier, AXF Object 1, AXF Object 2, …, AXF Object N
Object layout: AXF Object Header, Metadata Container, Metadata Container, File 1, AXF File Footer, File 2, AXF File Footer, …, File N, AXF File Footer, AXF Object Footer
Payload regions: Metadata Payload, File Payload
• Binary Structure Containers Define Elements
• File Tree Acts as Light-Weight File System
– Contained in Object Header & Footer
– Identifies Payload Files, Locations, Sizes, & Other Characteristics
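As a rough illustration of a file tree acting as a light-weight file system, the sketch below keeps one entry per payload file with its location and size, and uses that entry to pull the file's bytes out of a flat byte sequence. The FileEntry fields and the read_payload helper are hypothetical. The actual AXF structures are defined by the standard's XML Schema and are not reproduced here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FileEntry:
    """One hypothetical file-tree record: name, location, and size."""
    path: str
    offset: int  # byte position of the payload within the object
    size: int    # payload length in bytes


def read_payload(obj: bytes, tree: Sequence[FileEntry], path: str) -> bytes:
    """Return the bytes of one payload file using only the file tree."""
    for entry in tree:
        if entry.path == path:
            end = entry.offset + entry.size
            if end > len(obj):
                raise ValueError(f"{path} extends past the end of the object")
            return obj[entry.offset:end]
    raise KeyError(path)


# Usage: a toy object with a 4-byte header, two payloads, and a footer.
toy_object = b"HEADvideoaudioFOOT"
toy_tree = [FileEntry("Video1", 4, 5), FileEntry("Audio1", 9, 5)]
audio = read_payload(toy_object, toy_tree, "Audio1")
```

Because the tree alone locates every payload, a reader of this kind does not depend on the file system (if any) of the underlying medium.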
Spanned Sets
Medium 1 (Initial Medium of a Spanned Set)
Medium Identifier: Medium.UUID = Um1
Object Header: Object.UUID = Uo1
Medium Content
Fragment Footer: FragmentPairUUID = Uf1, FragmentNumber = 1, NextMediumUUID = Um2
Medium 2 (Intermediate Medium of a Spanned Set)
Medium Identifier: Medium.UUID = Um2
Object Header: Object.UUID = Uo1
Fragment Header: FragmentPairUUID = Uf1, FragmentNumber = 2, PreviousMediumUUID = Um1
Medium Content
Fragment Footer: FragmentPairUUID = Uf2, FragmentNumber = 2, NextMediumUUID = Um3
Medium 3 (Final Medium of a Spanned Set)
Medium Identifier: Medium.UUID = Um3
Object Header: Object.UUID = Uo1
Fragment Header: FragmentPairUUID = Uf2, FragmentNumber = 3, PreviousMediumUUID = Um2
Medium Content
Object Footer: Object.UUID = Uo1
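The spanned-set linkage can be read as a doubly linked chain: each fragment footer names the next medium, and the fragment header on that medium carries the same FragmentPairUUID and names the previous medium. The sketch below orders an unsorted pile of media on that basis. The SpannedMedium class and order_spanned_set function are hypothetical names, and the validation rules are assumptions drawn from the slide example rather than from the standard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpannedMedium:
    """Hypothetical summary of the identifiers found on one medium."""
    medium_uuid: str
    object_uuid: str
    header_pair_uuid: Optional[str] = None      # FragmentPairUUID in Fragment Header
    previous_medium_uuid: Optional[str] = None  # None on the initial medium
    footer_pair_uuid: Optional[str] = None      # FragmentPairUUID in Fragment Footer
    next_medium_uuid: Optional[str] = None      # None on the final medium


def order_spanned_set(media: List[SpannedMedium]) -> List[str]:
    """Return medium UUIDs in spanning order, checking each link."""
    by_uuid = {m.medium_uuid: m for m in media}
    initial = [m for m in media if m.previous_medium_uuid is None]
    if len(initial) != 1:
        raise ValueError("expected exactly one initial medium")
    current = initial[0]
    ordered = [current.medium_uuid]
    while current.next_medium_uuid is not None:
        nxt = by_uuid.get(current.next_medium_uuid)
        if nxt is None:
            raise ValueError(f"medium {current.next_medium_uuid} is missing")
        if nxt.medium_uuid in ordered:
            raise ValueError("media links form a cycle")
        if nxt.object_uuid != current.object_uuid:
            raise ValueError("media belong to different objects")
        if nxt.previous_medium_uuid != current.medium_uuid:
            raise ValueError("previous-medium link does not match")
        if nxt.header_pair_uuid != current.footer_pair_uuid:
            raise ValueError("fragment pair UUIDs do not match")
        ordered.append(nxt.medium_uuid)
        current = nxt
    return ordered


# Usage with the three media from the slide, supplied out of order.
m1 = SpannedMedium("Um1", "Uo1", footer_pair_uuid="Uf1", next_medium_uuid="Um2")
m2 = SpannedMedium("Um2", "Uo1", "Uf1", "Um1", "Uf2", "Um3")
m3 = SpannedMedium("Um3", "Uo1", "Uf2", "Um2")
spanning_order = order_spanned_set([m3, m1, m2])
```

Because every medium carries these identifiers, a pile of unlabeled media could in principle be reassembled without an external database.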
Collected Sets
Anchor Object (Initial Object of a Collected Set)
Medium Identifier: Medium.UUID = Um1
Object Header: Object.UUID = Uo1, CollectedSetUUID = Uo1, CollectedSetSequence = 1
File: Video1 (processing=default)
File: Audio1 (processing=default)
File: Audio2 (processing=default)
Object Footer: Object.UUID = Uo1, CollectedSetUUID = Uo1, CollectedSetSequence = 1
Subsequent Object (Final Object of a Collected Set)
Medium Identifier: Medium.UUID = Um1
Object Header: Object.UUID = Uo2, CollectedSetUUID = Uo1, CollectedSetSequence = 2
File: Audio1 (processing=REPLACE)
File: Audio2 (processing=DELETE)
File: Closed Caption1 (processing=ADD)
Object Footer: Object.UUID = Uo2, CollectedSetUUID = Uo1, CollectedSetSequence = 2
Anchor Object + Subsequent Object(s) → Compile per Instructions → Product Object
Processing Instructions: ADD, REPLACE, DELETE
New Files Carried with ADD & REPLACE; No Files Carried with DELETE
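Compiling a product object from a collected set can be sketched as replaying each subsequent object's processing instructions, in sequence order, over the anchor object's file list. The compile_product function below is a hypothetical illustration. Its error handling (for example, rejecting an ADD of a file that already exists) is an assumption, not behavior stated in the slides.

```python
from typing import Dict, List, Optional, Tuple

# (file name, processing instruction, new payload or None for DELETE)
Instruction = Tuple[str, str, Optional[bytes]]


def compile_product(
    anchor: Dict[str, bytes],
    subsequents: List[Tuple[int, List[Instruction]]],
) -> Dict[str, bytes]:
    """Apply ADD / REPLACE / DELETE instructions in sequence order."""
    product = dict(anchor)  # anchor files use the default processing
    for _sequence, instructions in sorted(subsequents, key=lambda s: s[0]):
        for name, processing, payload in instructions:
            if processing == "ADD":
                if name in product:
                    raise ValueError(f"ADD of existing file {name}")
                product[name] = payload
            elif processing == "REPLACE":
                if name not in product:
                    raise ValueError(f"REPLACE of unknown file {name}")
                product[name] = payload
            elif processing == "DELETE":
                if name not in product:
                    raise ValueError(f"DELETE of unknown file {name}")
                del product[name]  # no file is carried with DELETE
            else:
                raise ValueError(f"unknown instruction {processing}")
    return product


# Usage with the anchor and subsequent objects from the slide.
anchor_files = {"Video1": b"v1", "Audio1": b"a1", "Audio2": b"a2"}
updates = [(2, [("Audio1", "REPLACE", b"a1-new"),
                ("Audio2", "DELETE", None),
                ("Closed Caption1", "ADD", b"cc1")])]
product_files = compile_product(anchor_files, updates)
```

In this example the product object holds Video1 unchanged, the replacement Audio1, and the added Closed Caption1, while Audio2 is gone. The anchor object itself is left untouched.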
• Objects Scale to Any Size & Encapsulate Any Number of Files
– With Full Support for Media Spanning
• No Need to Upgrade Existing Storage Infrastructures
• Ensures Long-Term Compatibility & Resiliency
– With Self-Describing Features for Both AXF Objects & AXF Media
• Overcomes All Technical, Operational, & Functional Limitations
– Of TAR & LTFS, While Working Harmoniously with LTFS
• An IT‐Centric Implementation Not Limited to Media Files Alone
– Supports Documents, Imaging Data, etc.
• Satisfies All of Ideal Storage Format Goals Outlined