• No results found

Sony Pictures Production Backbone. Asset Management System

N/A
N/A
Protected

Academic year: 2021

Share "Sony Pictures Production Backbone. Asset Management System"

Copied!
13
0
0

Loading.... (view fulltext now)

Full text

(1)

Sony Pictures Production Backbone

Asset Management System

Summary

The development of a studio-wide Production Backbone requires the deployment of a flexible, robust digital asset management system that can operate across multiple productions and studio production departments. This document proposes current requirements for an asset management system and reviews several potential systems with an eye towards the deployment of an asset management system as part of the Constellation project. The Production Backbone system has an ambitious future, and the current limitations of asset management systems are discussed. Finally, recommendations are provided for the next steps forward, even though this review has not yielded a clear choice for a particular system.

Introduction

The Production Backbone provides work-in-progress support for storage of all digital elements created for a motion picture, including picture, sound, titles, captions, metadata, and visual effects, as well as making packaged elements such as image proxies, digital cinema packages, MXF wrapped edit streams, Quicktimes, and Avid DNX files.

The Backbone must provide controlled access to production users of all elements for a show, and allow search and retrieval of any particular sub-element. The Production Backbone also provides automated transcoding, conversions, and packaging of assets in a user-requested format.

To be effective, Backbone assets must be tightly coupled to the organization of the Production’s naming conventions and data model so that assets can be quickly found and managed by the Editorial staff. Since many elements are in an incomplete state when stored in the Backbone, it is essential that the system manage element versions, and trace dependencies of assets upon each other. Each production often follows different standards for naming and data management, thus flexible and customizable database organization is a key requirement of an asset management system.

Future Backbone capabilities include functions to allow automated conform and assembly from edit decision lists and automated hierarchical storage policies. As more departments become capable of providing digital services in a model of distributed processing and distributed storage, it is expected that the Backbone Asset System may extend into use in other areas within the Studio, and will be an important source of generated content for both the Distribution Backbone and other content services like Cineshare, and EAGL. The needs of the

(2)

Backbone go beyond the traditional use of asset management systems as a simple store and retrieval system. The need to link digital objects with the processes that transform them are important for production media data management.

Requirements for the Production Backbone

The capabilities of the Production Backbone can be described as several sub-systems. Some of these sub-systems are provided by commercial DAM systems and some are not.

The Storage Manager

Organizes and tracks all versions and properties of assets in the system.

Implements backup and file migration policies. Maintains information on dependencies between assets, and moves, recreates or updates assets as changed by users or processes. The Metadata Manager

Organizes metadata into searchable tables and provides custom metadata storage services. Allows search and retrieval based upon metadata and/or sequential time information for all assets. Maintains status, versioning, and lineage of assets. Web-Based User Interface

Allows users to initiate work orders and processing requests and to perform ingest, search, management, and retrieval of assets.

Script-Based User Interface(s) and APIs

Used by command line users, especially in the UNIX environment. Allows users to initial processing request and perform ingest and retrieval of assets. API for

integrating other software systems. User and Security Management

Allows management of user roles and access to assets with audit trails and logging.

Processing Service Engine

Fulfills all requests for proxies, transcoding, image processing, or color processing using managed server (Ellcami) or render farm capabilities.

(3)

Packages elements for local or remote transmission, and interfaces with external delivery systems such as SmartJog, SFTP, Aspera, Netflight, DBB, etc…

It is apparent that these capabilities must be built upon a set of flexible software components. Among these software components are…

Web services components

Secure authentication and login services User interface builder

Relational database Data modeling schema Flexible metadata ingest Search and sort

Interprocess communication libraries

Work order, usage audits, production reports Queuing and scheduling

Tape systems control

Hierarchical storage manager software

Resource and bandwidth management flow control Render farm management

Image processing

Transcoding and compression systems Digital packaging

EDL/ALE interpretation and project management Verified file transfer utilities

As pitched by vendors, a Digital Asset Management system typically only provides a subset of these functions. A commonly available subset would include:

Web access interface

Secure authentication and login services Relational database

Data modeling schema Flexible metadata ingest Search and sort

DAM systems often provide a mechanism for storing arbitrary user data (metadata) as unique database rows of keyword strings attached to character or numeric data. DAMs also impose a data model on the collection of assets that must be stored, and while this data model is often done in as general a fashion as possible, the model can also constrain the methods used to access, link, or process the data in the DAM.

An integration user of a DAM must often work-around the hidden assumptions of the asset system developers. In addition, the ways in which data is organized into searchable tables within a database has a significant effect on the response time of a system. Many DAM systems have been focused on managing general catalogs of multimedia file assets, thus the focus of

(4)

development of these systems is often not in line with the motion picture production workflow. To highlight these issues with commercial DAM systems, requirements of a Digital Assets

Management system for motion picture production use would include:

1) Ability to manage file sequence-based assets: The great majority of the assets handled by the backbone will be file sequences of arbitrary length, from a few files to roughly 200,000. These file sequences will typically be handled file by file (for example in the case of proxy generation) or in sub-sequences (for example in the case of breaking up a long sequence to write it to tape).

2) Asset lineage and inheritance: Assets will often be derived from other assets, and any change in the “parent” asset (such as a dirt fix or re-render) will need to be reflected down the

lineage chain. Since work-in-progress assets change quickly, a system must allow updates in place for both basic essences such as image and sound directories, but also packaged assets. This also implies support for a ‘processing’ model in the relationships between assets.

3) Volume: The production backbone is expected to handle billions of individual files, breaking down into millions of file sequences

4) Versioning and Storage Replication: Assets may be stored in more than one place and have multiple redundant copies, some on disk and some on tape. Backbone data can be expected to live on different storage systems, and sometimes (such as when it is on tape) in a

completely different form than the original representation of the asset (for example, as part of an archive object, or as a collection of archive objects).

5) Ability to work in a heterogeneous environment: The Digital Assets Management system will need to interact with resources available on a wide variety of systems, be it UNIX/Linux, Windows, OS X, and other more proprietary systems. It is essential that it offers the necessary APIs to at least enable interfacing with these systems.

6) Well-defined API using widely available standards protocols: Wrappers will need to be written on a variety of systems to interface scripts with the Digital Assets Management system, and interface the Digital Assets Management system with services. It is preferable to have APIs for both the SOAP and XML-RPC specifications, at a minimum.

Lastly though not part of standard DAMs, the Backbone requires..

7) Robust render management: The large volumes of data (and significant sizes of individual files) require robust, distributed file handling and file processing systems – render farms. There is a need for a heavy-duty render management system, capable of processing millions of files per day through potentially hundreds of nodes, performing varied tasks such as image processing, tape operations, file moves, and digital packaging. Proper resource allocation and bandwidth estimation for distributed systems is required.

(5)

Summary of commercial Digital Asset Management systems

[ed: this section may be expanded as additional systems are examined ]

Blue Order (Avid)

Popular in the broadcast space.

Folder workspace user navigation and folder-view todo lists Timecode annotated files

Scheduling only for ingest SOA design

User configurable data models

Metadata based on catalog and annotation model Contains ‘ordering’ web GUI

Media Archive ‘Edit’ provides simple editing tools and interface to Avid/FCP Drag and drop delivery of assets

Accounting based on asset licensing model Some high-level use reporting

Artesia (OpenText Digital Media Group)

Choice for Distribution Backbone.

Confederated set of software tools Flexible metadata editor

Searching via category or keyword metadata SOA architecture with flexible APIs

JEE based application with cross-platform apps using .NET Bulk processing (attached metadata to multiple objects) Unique ID per asset

Provides user video shot list editor for viewing and playback of clips over the web. Web Toolkit for custom portal interface

Artesia setup for low-res proxies, must use adaptor to handle full-res assets. Artesia uses another product for a Storage Manager (Open Text Archive server) Has integrated desktop for management of video clips

TACTIC (Southpaw Technology)

[ quick summary pending their visit and demo ]

Designed towards production workflows including visual effects, animation, and episodic TV. Web interface with security controls, custom user views, personal user notes, and tracking

(6)

Built on top of Oracle or Postgress databases Production reports and e-mail notification systems. Python and Javascript support

Custom data fields with simple API query function

Some workflow management with triggers (project manager setups)

Front Porch Digital

Video-centric content management system for the broadcast industry

ActiveMedia (Kit Digital)

Medium to small scale video services asset manager

Review of ‘Constellation’

As of Apr 1, 2010, The Constellation team is choosing to develop their own DAM functionality on top of an Oracle database. The comments in this section focuses only on features as outlined in the data model schema that was provided to SPE. This schema may or may not be changing The ‘Constellation’ project has designed a data schema as the foundation of their workflow management system that incorporates many of the functions needed for both asset management and workflow design. It includes sections for devices and profiles, transactions, tasks and work orders, transport, processes, hierarchical storage, user access controls, and services. From the start, the Constellation design assumes a data model allowing integration of assets with processing methods -- along with file transport, notifications, etc…

The data model, as designed, however is still simple and has not had the additions to the database that an actual implementation will require. The linkages between tables are designed in a

somewhat inflexible manner, and do not allow for much user customization of metadata and essence relationships. While additional attributes can be added to tables to provide these linkages, a fair amount of code may be effected when these changes are made late in the development cycle. While metadata is permitted in a very general form, this may not allow quick searches of key subsets, and establishing structured metadata relationships may also be a challenge without a fair amount of database redesign. The task and process definitions (recipes, et.al.) are also incomplete. The schema works as a prototype example, but it is clear that a full production system would require more sub-tables, and a reorganization of existing tables to accommodate a full range of capabilities needed for asset management in the Studio

environment.

Since the focus of the project is providing a workflow manager, it is a concern that the data model does not provide a better match to the specific needs of an application like the Backbone. Changes to the data model reportedly would have significant effect on the development schedule,

(7)

yet changes to the data model are also essential or the Backbone will not be able to provide some of its needed functionality with Constellation.

Further, the Constellation data model is overly focused on certain tasks such as dailies and ingest, but fails to reflect the wide production uses of rich metadata through to post. This can be seen most clearly in the drastically limited ‘Essence’ container (which is actually a wrapper container more at the level of a ‘Dailies’ roll.) In common use, an ‘essence’ is sound, picture, or title file(s) of any length with associated metadata that can be packaged. Constellation describes an ‘essence’ as a multi-format video or film media type that can point to one or more CamRoll, LabRoll, SoundRoll with Scene and Take information. Either one of these models would be insufficient for the multi-layered and version viewpoint that the Production Backbone requires. The single attached sound assumption is also poor from the standpoint of packaging and

versioning where multiple sound ‘track’ files and multiple language versions are needed. An additional layer of data abstraction is needed in the model to handle the diverse essences needed for production.

Overall, the Constellation data model provides a useful start because it allows integration of process definition with assets and work orders. This matches the need of the Backbone to not only store assets, but transform them upon request. The Asset and Storage Manager portions of the data schema for essences and metadata are the weakest component and will need to be expanded. The Production Backbone requirements are no different than many large facilities will encounter in typical file-based production workflows, and refining the data schema to handle the diverse products of a motion-picture studio needs priority attention.

Review of Sony’s Digital Media Repository

Sony’s Digital Media Repository (D.M.R.), in one version or another, forms the basis of both Cineshare and EAGL, as an asset manager primarily for digital photos and multimedia files, and is based upon an Oracle database.

A review of this custom DAM data schema was conducted as well as examination of the Services APIs. As with many schemas, the layout and design of the data and the process relationships are optimized for the tasks that it is designed for. The schema has components for asset types, user metadata, file processing and folder management, search filters, and notification systems. Flexibility in certain key parts of the diagram could only be achieved by adding further attributes (i.e. database columns) or by creating separate sub-diagrams that have minimal

interaction with the main body of the schema.

DMR certainly is a good start in some aspects – some features, such as the user and security model that ties into Sony corporate systems, are very complete, and would be sufficient for reuse in the Backbone. These isolated elements could be re-used in one form or another as the source code is available.

However some aspects of DMR will need significant work in order for it to be suitable for use in a post-production environment. Key issues with the DMR data schema are:

(8)

- No inherent file sequence-based support. DMR offers the capability of assigning files to an asset, and therefore a possible approach would be to expand all file sequences into individual file records associated with an asset. It would lead to billions of rows in the database, and while Oracle with proper table partitioning can scale fairly high, there are many scalability risks with this approach. Organization of the Backbone schema around ‘media clips’ would reduce the size of the database, and only those files that needed specific metadata storage would have their own searchable tables. Another possible approach is to declare the sequence time intervals as part of the asset metadata, and play with one or the other table organizations according to the needs at hand. But either way, file sequence support is something that is currently not present in DMR.

- No concept of asset lineage and lineage-driven process cascading. Assets in DMR are isolated entities, where metadata is used for history information. This requires adding new fields to database assets, or creating ‘lineage’ tables.

- It is hard to picture how DMG would handle dual-instantiated assets, such as things that are (in *different* forms and sub-asset representations) both on disk and tape for example. This will be a constant fact of life on the Digital Backbone at Sony. Having levels of data abstraction and instantiation built-into the data model are essential to support these features. - The schema contains a general metadata schema for holding numerals or strings that uses

selection-value based metadata where values are associated to an asset via a metadata select ID. A user can create and associate a new metadata field on the fly, and the values are stored in such a way that allow for both keyword searching as well as advanced searches based on specific metadata fields and values. While this is a very general method of metadata that is common in many commercial DAM systems, there are times when structured metadata relationships provide superior performance in the organization of a database. These

additional relationships are particularly beneficial when creating media-centric automation systems for re-processing of work-in-progress elements.

- One of the drawbacks of the current schema is that the highest level tables (the ones with the most relationships that form the core of the system) are file-centric (an asset is only a

collection of files; metadata is a ‘value’ attached to a file; etc…) The high-level organization of the schema attempts to be flexible in adding new fields and tables, but has built-in

assumptions about how data relationships are organized. To the extent that the Backbone requires different relationships between new data objects, a large number of API services would need to be modified. Establishing optimal searches for commonly used items also usually requires renormalization of database tables.

It is pretty clear that significant additions and changes would be needed to the DMR schema. At a certain point, the number of changes in the schema needed to support the Backbone might lead to the conclusion that is better to rewrite the data model and service layers than to ‘hack’ around it.

(9)

functions well with millions of file assets, but may have difficulty scaling to a set of projects that will reach hundreds of millions of files. Although many systems are scalable with addition of more hardware, it is not clear that the complex software requirements of the Backbone could be scaled using the types of servers at the foundation of DMR. Reorganization of database tables to achieve efficient access is a more cost-effective solution to scalability issues than just adding more servers. Moving away from a file-centric database to a media-clip style database is already part of the Backbone software goals.

DMR has a simple task distribution system, but the level of render and resource management needed for the Backbone is greater than the DMR schema allows (and which it wasn’t designed for anyway). Many commercial DAMs focus solely on storage and retrieval with some add-ons that allow remote file delivery. However, integrated management of bandwidth and processing resources is an important addition to the Studio’s functionality to be provided by the Backbone. In summary, DMR provides a useful starter set of asset management tools allowing asset and metadata definition with a file-based model of relationships. Extensions to the data schema would be needed to support file-based sequences of rich multimedia content. Process automation support in the schema and focused structured metadata would be needed that might have a large impact on the design of the existing schema. There is a sense that much time would be

consumed trying to find work-arounds to the limitations of the DMR system. At a certain level of effort, the number of changes in the schema needed to support the Backbone might lead to the conclusion that is better to rewrite the data model and service layers than to ‘hack’ around it.

(10)

Discussion

There are multiple ‘givens’ in the selection of an asset manager for the Backbone. Among them are the need to support the SOA model, web-based GUIs, and a strong underlying relational database such as Oracle. While these are all present in varying degrees in commercial DAM products, the details of how they work, and the ways in which users can integrate with them matter a great deal. There are multiple development systems involved ( .Net/ODBC,

JSEE/JDBC, Oracle, XML, etc…) and the choice of these systems can lock in a long-term development path that force certain implementation decisions further down the road. For some choices of a DAM, differences between production Linux and Mac systems and database

Windows servers would have to be carefully accommodated, and are an area of potential concern for a development project. A detailed review of these choices is beyond the scope of this note, but has as much bearing on a decision for development as the other issues being discussed. The SOA model of application development requires a lot of forethought to design a long-lived solution. The interaction of multiple layers of database access, media bus operation, enterprise services, and web-access have to be carefully generalized so that the system retains a high degree of flexibility for future uses. Our internal planning for the Production Backbone using a SOA model has been dependent upon an outside development team, and it is likely that we have to spend more time analyzing and planning for our expected uses of the Backbone.

Data model

The ‘correctness’ of a data model for organizing assets, metadata, and processes takes on a surprisingly important role in creating an asset-based system. The data model (or schema) organizes the key relationships and capabilities of the system and forms the foundation of the APIs and services that are built upon it. Changes to the data model after building an entire system have a dramatic effect on the number of software components that may have to be modified to maintain the systems integrity. The data model also provides the terms and conditions for search and retrieval of both assets and metadata. Having flexible and

customizable definitions for metadata, for example, can pay a key part in maintaining a system over it’s lifetime. A system data design that fails to capture needed relationships will cause substantial rework at a later point in time.

One of the challenges in finding an appropriate solution are the need for an asset system for file-based sequences at one level, and media clip organization at another. Production file

organization is based upon a hierarchy of file, then scene, sequence, reel, and production. Yet all the larger containers incorporate a collection of smaller assets, and changes to a lower level asset mean the higher level containers need to be updated with the latest version. This is not a typical feature of any DAM. Having multiple views into the asset at different levels is almost certainly an area where customization of a package (and database model) will be required. Maintaining linkages between assets and versions of dependent elements is also key.

Data models also have a tendency to grow with time and become more complex. Because of this, it is useful to have levels of indirection for high-level tables that only serve to point other

(11)

more complete tables for actual searchable assets. DMR has some features that operate this way, while other tables don’t have full generality as can be seen in the file-centric tables linking assets to fileID to metadata. Constellations data model on the other hand does not provide this

flexibility. Comparison of the Constellation data model with that of the DMR system shows the distance between a concept level system and a system that has grown through several

implementation phases. Metadata Management

Because of the varied nature of production, allowing user extensible metadata is a major requirement for any system. This goes well beyond ‘keyword’ metadata or name-value pair metadata which is the focus of most DAM systems. It is the ability to create organized and searchable links between media assets and processing methods that make possible the workflow automation systems of the backbone. The Constellation data model captured the distinctiveness of this need in their data schema design. Updates in place and automated versioning however are not clearly defined within the Constellation model. While arbitrary metadata collection models can provide the same functionality, there may be a response time cost since the metadata is in one large collection that has to be searched for every query.

Storage Management

A Storage Manager system is also needed that allows creating a custom hierarchical storage manager that provides production-specific migration and backup policies. This if often

considered an external software package in commercial DAMs, but the lack of integration with the automated processing systems means it is difficult to build an intelligent and responsive media server. Quality of service improvements such as pre-staging of assets based upon current EDL working sets are not present in any current DAM system, and a custom resource

management layer will be needed for any solution that is chosen. Working with IBM GPFS and TSM tape systems will require a substantial control system for data management, which is already partly underway.

Processing and automation

The Processing Service Engine functions of the Backbone need scheduling and queuing software to provide resource management of distributed render servers and storage clusters. This is not a common feature in any DAM. While sometimes an API exists or can be created for specific software, this area will also involve much customization for the specific layout of the Backbone and associated department servers.

Asset Manager Implementation

Many of the other functions (web-based GUIs, scripting, user and security management, and digital delivery system integration) are standard features of commercially available DAMs. It is clear that none of the currently discussed solutions, including the Constellation data schema, satisfy the already known needs of the Production Backbone. Substantial additions, revision, or

(12)

customization is likely needed regardless of which component is chosen for the Digital Asset Manager.

The selection criteria then should reflect which software component will provide the most

flexible and customizable solution. It is apparent that a custom data model solution built directly within Oracle data structures provides the most control for long-term development. Using a pre-made package can accelerate development in the beginning up until the developers hit the

limitations of the package’s assumptions. The Sony in-house DAM would need significant work (in essence becoming another generation of the software).

It is a current concern that the Constellation model does not appear to provide adequate customization opportunities to provide the Backbone Storage Manager and Metadata manager feature sets. Since the Constellation team has decided to build their own DAM at this point in time, it is possible that we could provide sufficient feedback to make their system useful for a studio-level feature set. The time needed to build-out a fully functioning system is likely to be longer than currently advertised by the development team.

Further planning is needed to provide the most cost-effective path to the solution we ultimately want. There is currently insufficient information to yield a ‘build-vs-buy’ decision for deploying a DAM for the Backbone. Among the factors that have to be considered is the cost of an interim throwaway solution since the Backbone is already in use by Production, the use of a rapid

prototyping development model, or a stand-alone longer-term software development build from scratch approach. These factors would have to be evaluated with respect to realistic estimates of the progress of the Constellation effort and its potential for production use within the next year.

(13)

Recommendations

Since there is no clear-cut solution, we will have to continue down several paths at the same time.

Summary points:

1) Commercial systems seek a wide user base by providing general asset and metadata storage but have rarely focused on the specific needs of file-based production workflows. While they may provide fundamental store and retrieval functions needed for the

Backbone, a whole new set of data relationships would have to be maintained outside of the DAM database to handle the dependent versioning and processing methodologies needed for the Backbone. A commercial system could help at the outset of the project, but imposes significant expansion and integration issues for the rest of the Backbone development process.

2) The known needs of the Production Backbone data model do not map well to either the DMR system or the Constellation data model. The Constellation data model is closer to the full needs of the Backbone but strongly needs revision and additions. For

Constellation to succeed in the studio asset production environment, their existing data model must improve. After the NAB prototype phase is complete, it needs to be an urgent task to work with the Constellation team on modifications for Studio needs. 3) DMR has some database organization concepts that would be useful if adopted by the

Constellation team, but it’s data model is not organized appropriately enough to recommend as a solution as the DAM component of Constellation.

4) Fully defined details of the Studio’s needs for the Production Backbone have not been captured in earlier analysis. The Production Backbone System needs a clear Functional Specifications Document to guide development which should contain lists of the asset objects, lists of metadata types, and definition of individual service functions that each production department will need. The feature set needed for a Studio operation needs to be seen as an industry wide solution, not just as a Colorworks or SPE specific

implementation. The Functional Specification document should be as generalized as much as possible for use in file-based production and post-production. Having this level of detail developed by SPE will also provide greater guidance on our exact needs to the Constellation team.

5) Further planning for a Digital Asset Management system is needed that can evaluate the development cost and integration effort of various approaches leading to a “build versus buy” decision.

Jim Houston Version 1.2 April 5, 2010

References

Related documents

Child labour in agriculture.. However, it highlights a relevant societal problem of extreme urgency that needs attention. The study exposes the exploitation and abuse

However, the effect of vega on the merger-related risk effects is driven by deals completed in the post-GLBA period and by larger banks — suggesting that bank CEOs

Quantum Secure’s core offering was the SAFE software suite, a commercial off-the-shelf solution that streamlined the identity management and access provisioning processes for clients

geophysical survey identified a probable chapel constructed with buttresses.. geophysical survey identified a probable chapel constructed

( Boneschi 2003 ) of glatiramer acetate trials in patients with RRMS, based on the same trials we have included in this review ( Bornstein 1987 ; Johnson 1995 ; Comi 2001 ), has found

Pioneer Life failed to produce proof of certifications/informational filings or annual rate certifications required under Sections 627.410 and 627.9406, Florida Statutes, for 12