
HOW GEO-DISPERSED OBJECT-BASED STORAGE PROVIDES METADATA DRIVEN INTELLIGENCE TO ENABLE MORE ROBUST CONTENT DELIVERY NETWORKS

Shane Archiquette, Hitachi Data Systems, Santa Clara, California

ABSTRACT - The world of broadcasting is changing rapidly to accommodate new viewer habits, and Content Delivery Networks (CDNs) are a key part of this new model. Advanced technologies are required to keep up with the increasing demand across heterogeneous CDNs. This discussion looks at how geo-dispersed object-based storage provides metadata-driven intelligence to enable more robust CDNs. New object storage technologies that use intelligent metadata for metadata-aware content placement from the origin to the edge, and that make use of subscriber analytics and software-defined networking, can provide the future architecture that helps content delivery networks become autonomic, self-adapting, and self-healing. Heterogeneous Content Delivery Cloud Networks (CDCNs) provide the basis for adapting to increasing Over the Top (OTT), subscriber-driven media consumption.

INTRODUCTION

Consumers are rapidly shifting their video consumption from traditional broadcast television, digital versatile discs, Blu-ray discs, and cinema/theatre movies to increasingly mobile formats on multiple devices such as mobile phones, tablets, and laptops. This allows viewers to access content wherever and whenever desired and enhances the viewer's experience of selective content consumption. This change in how content is consumed is forcing a drastic change in content delivery methods, with content delivery network providers, telecommunications companies, and cloud service providers all seeing exponentially increasing video traffic (Brogan 2013).

This dramatic change to data networks will require new methods for handling content delivery in the network, storage, and compute infrastructure, along with improved intelligence on content placement. Object-based storage platforms can enable metadata-driven content movement and placement across geo-dispersed networks, providing greater efficiency and reliability and improving the end user experience. We review how object-based storage can improve content delivery networks with metadata-driven intelligence that optimizes how content is placed dynamically across heterogeneous content delivery cloud networks.

OBJECT-BASED STORAGE

The architecture of object-based storage has some unique advantages over traditional file systems and storage area networks. The object structure provides flexibility in access methods, storage medium, data placement, data preservation, multi-tenancy, and how content replication, distribution, and duplication are handled. Object storage comprises seven primary elements:

• Object – representation of data and standard metadata

• Custom Metadata – contextual information about the data that has relevance and relation

• Storage Servers – standardized servers configured in a localized compute cluster, typically interconnected over 10GbE, that control access and object distribution

• Storage Network – The data storage connectivity to one or more storage arrays to hold the object data

• Storage Medium – abstracted hard disk drive (HDD), magnetic tape, or cloud-based storage connectivity

• Policy Engine – controls all aspects of metadata relationships, content time horizon, content movement, content duplication, and tenant and namespace operations

• Access Gateways – provide file-system and edge-like interfaces such as NFS, CIFS, and sync/share applications
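The first two elements in the list above, the object and its custom metadata, can be pictured with a small data structure. The following is only an illustrative sketch: the class name `StoredObject`, its field names, and the sample metadata keys are assumptions made for this example and do not come from any particular object storage product.

```python
from dataclasses import dataclass, field
from hashlib import sha256
from time import time


@dataclass
class StoredObject:
    """Hypothetical object: the data plus standard and custom metadata."""
    name: str
    data: bytes
    # Standard metadata is derived by the store itself at ingest time.
    size: int = field(init=False)
    digest: str = field(init=False)
    ingested_at: float = field(init=False)
    # Custom metadata is free-form context supplied by applications.
    custom: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        self.size = len(self.data)
        self.digest = sha256(self.data).hexdigest()
        self.ingested_at = time()


# Example asset with made-up custom metadata keys.
clip = StoredObject(
    name="trailers/example.mp4",
    data=b"\x00\x01\x02\x03",
    custom={"genre": "documentary", "resolution": "4k", "region": "us-west"},
)
```

Keeping the custom metadata separate from the system-derived fields mirrors the distinction the list draws between an object's standard metadata and the contextual metadata that applications attach to it.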

Figure 1 represents typical distributed object storage architecture:

The symbols with HCP represent Origin Object Stores, and the servers are the distributed file and content servers that provide access to the laptops and mobile phones depicted in the diagram, all in a secure, policy-driven manner. The attributes of object storage allow content ingest, placement, and distribution to be handled in a cloud-computing-like model and orchestrated with a combination of a policy engine and relational metadata (Primmer 2010). Object-based storage can be scaled to billions of objects and multi-petabyte distributed content stores holding millions of hours of high-resolution video and associated metadata. Additionally, object-based storage is hardware abstracted, which allows very long-term storage of media assets, with autonomic hardware refreshes as technology advances. Object-based storage is considered software-defined storage because the entire framework can be abstracted from the underlying hardware and scaled dynamically.

GEO-DISPERSION

Replication of content, whether for protection copies, localized content, or content distribution, is controlled by a customizable policy engine that has a series of complex attributes and control structures. The initial purpose of geo-dispersed object stores was to provide a method of data protection across large geographic boundaries in the event of a disaster, with one or more whole copies of the content. As the technology improved, it became apparent that a localized copy of the content at any object storage site could prove useful for a variety of purposes. Distributed object stores began to come into prominence as many enterprises took advantage of advanced object storage technology for data dispersion and replication. Policies for how content is ingested, copied, replicated, and retained exist within object storage systems and are the basis for the mechanics of geo-dispersion techniques. The data protection level (DPL) of the content is a policy for how the content is protected within the local compute cluster, and the data replication level (RPL) is how the content is distributed external to the cluster. Data can also be protected by a policy-driven data-tier engine that can write inactive content to a magnetic tape library and very inactive content to a cloud storage entity. By leveraging the attributes of a distributed object store and adding a tenant security model for multi-site read and multi-site write of the data, the underpinnings of a content distribution network began to take shape (Primmer 2010).
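The interplay of DPL, RPL, and data tiering described above can be sketched as a small policy evaluation. This is a minimal illustration under stated assumptions: the names `ProtectionPolicy` and `plan_placement`, the day-based inactivity thresholds, and the "first N targets" selection rule are all hypothetical and are not taken from any real policy engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionPolicy:
    """Hypothetical policy record; field names are assumptions."""
    dpl: int               # copies kept inside the local cluster
    rpl: int               # copies replicated to sites outside the cluster
    tape_after_days: int   # inactivity before tiering to tape
    cloud_after_days: int  # inactivity before tiering to cloud storage


def plan_placement(policy, local_nodes, remote_sites, idle_days):
    """Return where copies should live and which tier holds the data."""
    if policy.dpl > len(local_nodes) or policy.rpl > len(remote_sites):
        raise ValueError("policy asks for more copies than there are targets")
    if idle_days >= policy.cloud_after_days:
        tier = "cloud"   # very inactive content
    elif idle_days >= policy.tape_after_days:
        tier = "tape"    # inactive content
    else:
        tier = "disk"    # active content
    return {
        "local_copies": local_nodes[: policy.dpl],
        "remote_copies": remote_sites[: policy.rpl],
        "tier": tier,
    }


policy = ProtectionPolicy(dpl=2, rpl=1, tape_after_days=90, cloud_after_days=365)
plan = plan_placement(policy, ["n1", "n2", "n3"], ["dc-east", "dc-eu"], idle_days=120)
```

With these made-up values the asset keeps two local copies and one remote copy, and because it has been idle for longer than the tape threshold but not the cloud threshold, it is assigned to the tape tier.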

Figure 2 shows how content geo-dispersion works (DC = Data Center, DR = Data Replication):

METADATA INTELLIGENCE

Metadata is a structure for managing data and collecting information about the data being stored. It provides a powerful architecture for adding an intelligent layer of data analytics, movement, and relational connectivity between distinct data objects. Metadata and data are stored separately (Wikipedia 2013) to provide independent scaling and operational characteristics, such as storage optimization for both data (mostly sequential I/O) and metadata (mostly random I/O), and custom or application-updatable metadata for associating contextual attributes with an object or set of objects. More recent object storage systems can push metadata out to the edge, with frequency-based access algorithms that can move an object from a primary object store out to an edge device. When applications update metadata programmatically via a REST protocol to record where a particular object or piece of content is located, the content can be geo-distributed based on dynamic location and copy quantity. This optimizes the number of copies and reduces how often content is moved between major metro areas and across the wide area.
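A frequency-based rule for pushing an object to the edge, as described above, might look like the following sketch. It is an assumption-laden illustration: the class `EdgePromoter`, the fixed request threshold, and the in-memory dictionary are hypothetical, and the dictionary merely stands in for the location metadata that a real system would update through its REST interface.

```python
from collections import Counter


class EdgePromoter:
    """Promote an object to an edge site once it is requested often enough there."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.hits: Counter = Counter()            # (object_id, edge_site) -> request count
        self.locations: dict[str, set[str]] = {}  # object_id -> sites holding a copy

    def record_access(self, object_id: str, edge_site: str) -> bool:
        """Count one request; return True if this request triggers a promotion."""
        self.hits[(object_id, edge_site)] += 1
        # Every object is assumed to start with a single copy at the origin.
        sites = self.locations.setdefault(object_id, {"origin"})
        if edge_site not in sites and self.hits[(object_id, edge_site)] >= self.threshold:
            sites.add(edge_site)  # location metadata now lists the edge copy
            return True
        return False


promoter = EdgePromoter(threshold=3)
results = [promoter.record_access("movie-1", "edge-sfo") for _ in range(4)]
```

Here the third request from the same edge site crosses the threshold and triggers the promotion; later requests find the copy already recorded at that site. A production algorithm would also age out old counts and demote cold copies, which this sketch omits.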

Figure 3 indicates an example of how metadata driven content intelligence is enabled between a core and edge site architecture:

Metadata can enable the movement of content between edge nodes without having to traverse to the primary object storage tier. This can yield significant efficiency gains and an overall infrastructure reduction compared with a traditional content delivery network, without depending on the underlying networking protocols, which are concerned only with packet routing optimization. Intelligent metadata also enables self-describing assets, which leverage their intrinsic object-based definitions so that heterogeneous content delivery networks can use similar metadata fields for content asset management and location-based tagging. The overall goal of dynamic content placement based on user access patterns can be realized by leveraging metadata intelligence at the core and at the edge. Content assets and their various movement/replication copies can be analyzed and reported on for efficiency measurement, service level agreement adherence, and ‘out-of-band’ real-time monitoring of content in a given content delivery network. The relational metadata engine acts like a distributed database sitting on top of the data and content, with dynamic and transactional updates.
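The edge-to-edge movement described above depends on knowing, from metadata alone, which sites already hold a copy. One way to picture this is a lookup that picks the cheapest holder, which may be a neighbouring edge node rather than the primary object store. The function name `choose_source`, the site names, and the `link_cost` table are hypothetical and exist only for this sketch.

```python
def choose_source(object_id, requester, locations, link_cost):
    """Pick the lowest-cost site that already holds a copy of the object.

    locations maps object_id -> set of sites holding a copy (location metadata).
    link_cost maps (requester, site) -> a relative transfer cost; a cost is
    assumed to be present for every pair that can occur.
    """
    holders = locations.get(object_id, set()) - {requester}
    if not holders:
        raise LookupError(f"no copy of {object_id} is recorded")
    # Break ties by site name so the choice is deterministic.
    return min(holders, key=lambda site: (link_cost[(requester, site)], site))


locations = {"movie-1": {"origin", "edge-oak"}}
link_cost = {("edge-sfo", "origin"): 40, ("edge-sfo", "edge-oak"): 5}
source = choose_source("movie-1", "edge-sfo", locations, link_cost)
```

With these illustrative costs the requesting edge pulls the asset from its neighbouring edge node instead of traversing to the origin.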

CONTENT DELIVERY CLOUD NETWORKS

Current content delivery networks use a series of algorithms that detect the request and delivery of various content assets within the defined delivery network, mostly using a ‘closest proxy’ method (Cahill 2005). Traditional content delivery networks have the following core components:

• Ingest and staging area – the primary landing zone for content that will be distributed via the content delivery network, where it is verified, indexed, and has its metadata extracted before being packaged for delivery

• CDN Management Platform – the orchestration and mechanics of the content delivery operation with reporting, analytics, and supplementary services

• Origin Store – the main repository for all of the content that has been ingested, transcoded, and packaged for delivery, normally placed at a geo-centralized site with connectivity to other primary replica Origin Stores

• Regional Cache – a dynamic repository that will contain current scheduled content to be delivered over a wide area and a staging area for infrequently accessed content that is pulled from the Origin Store

• Edge Cache/Video Server – the primary server that holds a smaller, targeted set of content staged for the current scheduled lineup in a given area, and that provides the actual video delivery server logic and the touch point to the end user device.

Figure 4 represents a common Content Delivery Network:

The basis of cloud computing is to provide elastic compute and storage capacity to a large number of end users and businesses, which can then focus on their primary business and worry much less about information technology management. Cloud computing provides the basis for the content delivery infrastructure to exist in a dynamically scalable (scale up and scale down) architecture (Li 2012) and can provide an object-based storage platform to enable software-defined methods of deployment. This enables a robust and secure multi-tenant content delivery network that offers heterogeneity to retail video content providers. When cloud computing is combined with a metadata-driven content delivery engine, a content delivery cloud network can be realized in several forms and across multiple geographically dispersed sites.

Figure 5 shows the architecture of a content delivery cloud network:

With the increasing shift to Over-The-Top (OTT) content consumption, the intelligence, sophistication, and precision of content delivery cloud networks will have to improve over time to dynamically increase the efficiency of content placement algorithms. Greater reliance upon data networks will require content to be placed so as to reduce both the traffic load and the content delivery storage infrastructure in the network. Object-based storage will be at the center of the technical enablement of such content delivery networks, providing metadata access and data access methods that feed content delivery analytics and decision support mechanisms on where content is placed.

INTELLIGENT VIDEO CONTENT DELIVERY

The ultimate goal of a content delivery network is to route content to the end user's streaming video request in the fastest time possible and to reduce or eliminate the buffering encountered with frequently accessed content. With infrequently accessed content, the goal is to reduce the latency encountered when the content is not on the edge cache video delivery server. The usual method relies on the nearest upstream regional cache within the content delivery network, which provides the infrequently accessed content to the edge cache server and then to the end user. If the regional cache server does not have the content (which is sometimes the case), the Origin Store will have any content that is accessible via the video delivery front-end application that a user would be browsing. The overall idea is to make all of the content look like it can be everywhere at any given point in time, without having to retain a copy of every asset at every location in a given content delivery network video library. As more types of content delivery networks come online that cater to specific user regions, themes, movie collections, and episodic television shows, the variety and size of content libraries are increasing at an exponential rate, with some video providers even caching their own content (Fitchard 2012).

Intelligent and dynamic video content delivery will be a requirement in the very near future just to keep up with the demand of over seven billion devices that are slated to drive the majority of Internet traffic growth (CORDIS 2013). To ensure that consumers can access programming that is backed by a monthly subscription fee, advertisements, or both, content delivery must be intelligently automated and content dynamically placed, which is the entire premise of infusing intelligence into content delivery networks. The ideal workflow would let a studio, broadcaster, or content aggregator upload content entirely digitally from a distribution source to an intelligent content delivery network, which would handle all dynamic transcoding, packaging, streaming, codec, and device detection methods. Another key attribute of an intelligent content delivery network is enabling offline watching of content, similar to a cable broadband personal video recorder that enables suspended streaming. Intelligent content delivery networks will improve reliability, efficiency, and cost structure using metadata-driven methods implemented on an object-based storage platform. Higher-resolution content such as 2k, 4k, and 8k will dramatically increase the need for intelligent content delivery as well. All of this serves the goal of enhancing the user experience and meeting consumer demand.
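The fallback from edge cache to regional cache to Origin Store described in this section can be sketched as a tiered lookup. This is a simplified illustration, not a description of any specific CDN: the function `fetch` and the use of plain dictionaries as caches are assumptions, and cache eviction, which a real deployment needs so that not every asset is kept everywhere, is deliberately left out.

```python
def fetch(object_id, edge, regional, origin):
    """Serve from the nearest tier that has the object, filling caches on a miss.

    Returns a (data, tier) pair, where tier names the layer that supplied the data.
    """
    if object_id in edge:
        return edge[object_id], "edge"
    if object_id in regional:
        edge[object_id] = regional[object_id]  # stage at the edge for next time
        return edge[object_id], "regional"
    data = origin[object_id]  # the Origin Store holds every published asset
    regional[object_id] = data
    edge[object_id] = data
    return data, "origin"


origin = {"episode-7": b"video-bytes"}
regional, edge = {}, {}
first = fetch("episode-7", edge, regional, origin)   # misses both caches
second = fetch("episode-7", edge, regional, origin)  # now served from the edge
```

The first request falls through to the origin and populates both caches on the way back, so the repeat request is answered at the edge without upstream traffic.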

IMPROVING THE VIDEO CONSUMER EXPERIENCE

The consumer of OTT content will increasingly want to watch content on a variety of devices, depending on the content type, resolution, level of immersion, and where they are at a given time. For example, a person at the airport who wants to watch a two-hour movie should be able to choose an option that streams and captures the entire movie on the device within the video player application so it can be watched offline, and then purged from the user's ‘stream cache’ as soon as they connect to a wireless network when they land. In the near future, consumers will choose the content delivery networks and content providers that have the most complete library of video and music at a nominal price per month, with the ability to add content ‘tiers’ based on their viewing habits and the shows they like to watch. Within five to ten years, as the move to higher-resolution content continues, telecommunications providers will become the major broadcast networks and will more than likely be the primary distribution arm for the world's major broadcasters, instead of over-the-air transmission. The consumer is driving massive change in content delivery networks and expects all content to be available according to their ever-changing desires, with advertisers promoting the newest shows to watch. The typical content consumer will consume video on several kinds of device: an ultra high definition television (for immersive theater-like movies, sports, and interactive games), a laptop (for mobile viewing of second-run ‘catch up’ movies and some television content), a tablet (for second screen and compact mobile movie viewing), a mobile phone (for some long movies, short movies, video clips, and social media video), and automotive video (for families with children watching movies in the car or sport utility vehicle).

Video delivery to each of these devices will need to draw on a user device profile and viewing habits, defined in metadata, to proactively provide the most applicable video for each of the end user's devices. This is how the future of content will be delivered. After all, content is king!

REFERENCES

[1] Brogan, Patrick, “US Data Traffic To Triple Over 5 Years” http://www.ustelecom.org/blog/us-data-traffic-triple-over-five-years August 2013

[2] Cahill, Adrian J., Sreenan, Cormac J., “An Efficient Content Placement Algorithm For High Quality TV Content” http://www.cs.ucc.ie/~cjs/docs/2005/euroimsa05.pdf April 2005

[3] Primmer, Robert, “Distributed Object Store, Principles of Operation” http://www.hds.com/assets/pdf/distributed-object-store-principles-of-operation.pdf July 2010

[4] Separation of Data and Metadata, http://en.wikipedia.org/wiki/Object_storage January 2014

[5] Li, Yale, Shen, Yushi, Liu, Yudong, “Utilizing Content Delivery Network In Cloud Computing” http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6383505 2012

[6] Fitchard, Kevin, “Forget the CDN Players, Netflix is Caching its own video” http://gigaom.com/2012/06/04/forget-the-cdn-players-netflix-is-caching-its-own-video/ June 2012

[7] CORDIS, “Efficient, Intelligent, Content Aware Networks” http://phys.org/news/2013-04-efficient-intelligent-content-aware-networks.html April 2013

AUTHOR INFORMATION

Shane Archiquette, CTO Communications, Media & Entertainment, Hitachi Data Systems, Santa Clara, California
