A Web-Based Architecture for tracking Multimedia using SCORM

P. Casillo, C. Cesarano, A. Chianese, V. Moscato

Dipartimento di Informatica e Sistemistica, University of Naples Federico II, via Claudio 21 80125 Naples, Italy

{p.casillo,carmine.cesarano,angchian,vmoscato}@unina.it

Abstract

In guided and collaborative online learning, multimedia resources, and videos in particular, are increasingly used. Because of their complex nature, there is a growing need to manage video content so as to ensure finer-grained tracking of audio-video assets and to obtain continual feedback on student activities. According to ADL’s SCORM model, a video is usually assumed to be an atomic Learning Object: this assumption restricts the interaction between the client and the Learning Management System to a mere ON/OFF tracking process and limits the reusability of this type of content. In our approach a video resource is considered as a SCO or, more precisely, as a SCO container; in this manner, each single component asset can be just a frame or a video segment, so as to achieve the desired tracking granularity. Under this hypothesis, the main focus of the paper is the design and implementation of an architecture for tracking video SCOs in web learning environments.

1 Introduction

Learning is usually seen as involving three processes [8]: acquiring skills, constructing knowledge or concepts, and developing values. These correspond roughly with:

• the use of technologies (understood not only as computer-based technologies but as a wide range of tools, including reading and writing);

• the use of knowledge resources;

• the negotiation of shared meanings and values among the members of a learning community.

In an e-learning environment such processes take place specifically through computer-based tools, access to digital knowledge resources, and participation in computer-mediated or online communities. Numerous authors have discussed the benefits of ICT course delivery for learners, tutors and institutions. Technology has been used as a means of electronically distributing course material, allowing flexibility for students' favored learning styles, pace, etc., and giving greater access to information, as well as enabling remote communication between students and tutors, and between student peer groups. On the course management side, it can allow greater communication with the course team and provides flexibility to maintain and update course material and documentation [7].

To this purpose [1], multimedia resources can make a greater contribution to the learning process, especially in the e-learning scenario, in which the web is the front-end for content presentation. In this context, videos are generally the most widespread resources for multimedia presentation, and they are assumed to be atomic objects with a set of global metadata. This assumption restricts the interaction between the client and the Learning Management System to a mere ON/OFF tracking process and limits the reusability of this type of content. The problem is to characterize the single parts of the video that can be used for its traceability: for example, it would be useful to divide the video into scenes and track how each one is learned, because these video segments can be considered as parts of a learning process. In our approach a video resource is considered as a SCO or, more precisely, as a SCO container. In this manner, each single component asset can be just a frame or a video segment, so as to achieve the desired tracking granularity. Under this hypothesis, the main focus of the paper is the design and implementation of an architecture for tracking video SCOs in web learning environments.

2 Background and Motivations

The SCORM (Sharable Content Object Reference Model) metadata Information Model is a reference to the IMS Learning Resource metadata Information Model, which itself is based on the IEEE 1484.12.1 LOM (Learning Object Metadata) standard. SCORM [2], [3] defines a Web-based learning “Content Aggregation Model” and “Run-time Environment” for learning objects.

The purpose of the SCORM Content Aggregation Model is to provide a common means for composing learning content from discoverable, reusable, sharable and interoperable sources. The SCORM Content Aggregation Model further defines how learning content can be identified and described, aggregated into a course or portion of a course, and moved between systems that may include Learning Management Systems (LMS) and repositories. The purpose of the SCORM Run-time Environment is to provide a means for interoperability between Sharable Content Object-based learning content and LMSs. A requirement of SCORM is that learning content be interoperable across multiple LMSs regardless of the tools used to create the content. For this to be possible, there must be a common way to start content, a common way for content to communicate with an LMS, and predefined data elements that are exchanged between an LMS and content during its execution.

The Content Model is a nomenclature defining the content components of a learning experience. Content Packaging defines how to represent the intended behavior of a learning experience (Content Structure) and how to package learning resources for movement between different environments (Content Packaging). The SCORM Content Model is composed of three components:

• Asset

• SCO

• Content Aggregation

A SCORM Asset is a collection of one or more resources (whole web pages, HTML fragments, JavaScript functions, Flash objects, JPEG images...) that are appropriate for sharing among SCOs; when packaged, an Asset should contain the appropriate metadata making it searchable in a SCORM repository. Sharable Resource metadata is data contained in the resource's manifest file that describes the resource. An Asset can be described with Asset metadata: a definition of metadata that can be applied to “raw media” Assets and that provides descriptive information about the Asset independent of any usage or potential usage within courseware content. This metadata is used to facilitate reuse and discoverability of such Assets, principally during content creation, within, for example, a content repository. Assets are not only resources that can be shared among SCOs; they are also resources that can be discovered in SCORM repositories.

SCOs (Sharable Content Objects) are built using resources, such as web pages, graphics files, etc. In some cases, these resources might need to be shared with other SCOs. Resources that can be shared with other SCOs are called Course Assets, or Sharable Resources. According to the SCORM documentation, a SCO has three defining characteristics:

• a SCO is the lowest level component that might be used in another course;

• a SCO should provide useful learning content by itself;

• a SCO must be designed to be launched and tracked by a SCORM-compliant LMS.

In other words, a SCO is a set of related resources that comprise a complete unit of learning content compatible with SCORM run-time requirements. With this definition, a SCO can be extracted from one learning object and used by another. This is essential to achieving the SCORM reusability goal. A SCO represents a collection of one or more Assets that includes a specific launchable asset, which uses the SCORM Run-Time Environment to communicate with Learning Management Systems (LMSs). A SCO represents the lowest level of granularity of learning resources that can be tracked by an LMS using the SCORM Run-Time Environment. A SCO is required to adhere to the SCORM Run-Time Environment: it must have a means to locate an LMS's API Adapter and must contain the minimum API calls (LMSInitialize() and LMSFinish()). There is no obligation to implement any of the other API calls, as those are optional and depend upon the nature of the content.
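The two run-time obligations of a SCO (locating the API Adapter and issuing the two mandatory calls) can be illustrated with a small simulation. The sketch below is not browser code and not part of the presented system: `MockLmsApi` and `Window` are hypothetical stand-ins for the LMS adapter and for nested browser windows, and the parent-chain search is only one plausible way to locate the adapter.

```python
from typing import Optional


class MockLmsApi:
    """Hypothetical stand-in for the LMS API Adapter (not a real LMS)."""

    def __init__(self) -> None:
        self.initialized = False
        self.finished = False

    def LMSInitialize(self, arg: str) -> str:
        # The call takes an empty string and answers "true" or "false".
        if arg != "" or self.initialized:
            return "false"
        self.initialized = True
        return "true"

    def LMSFinish(self, arg: str) -> str:
        # Finishing is only valid once, after a successful initialization.
        if arg != "" or not self.initialized or self.finished:
            return "false"
        self.finished = True
        return "true"


class Window:
    """Hypothetical model of a browser window that may expose the adapter."""

    def __init__(self, parent: Optional["Window"] = None,
                 api: Optional[MockLmsApi] = None) -> None:
        self.parent = parent
        self.api = api


def find_api(window: Window, max_depth: int = 7) -> Optional[MockLmsApi]:
    """Walk up the parent chain until a window exposes an API adapter."""
    current: Optional[Window] = window
    for _ in range(max_depth + 1):
        if current is None:
            return None
        if current.api is not None:
            return current.api
        current = current.parent
    return None


def run_minimal_sco(window: Window) -> bool:
    """Perform only the two mandatory calls; return True if both succeed."""
    api = find_api(window)
    if api is None:
        return False
    if api.LMSInitialize("") != "true":
        return False
    # The SCO content would be presented to the learner here.
    return api.LMSFinish("") == "true"
```

A SCO launched two frames below the window that holds the adapter would call `run_minimal_sco(Window(parent=Window(parent=Window(api=MockLmsApi()))))`; a SCO opened outside any LMS finds no adapter and the function returns `False`.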

A Content Aggregation is a map (content structure) that can be used to aggregate learning resources into a cohesive unit of instruction (e.g. course, chapter, module, etc.), apply structure and associate learning taxonomies. In SCORM, a Learning Object, or SCORM Content Aggregation, is a collection of Sharable Content Objects (SCOs) described by a SCORM manifest file.
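The relation between the three components can be pictured as nested records. The following sketch is purely illustrative: the class names, the fields and the helper `video_as_sco_container` are our assumptions and do not reproduce the SCORM manifest schema. It shows how a single video could be described as a container of scene-level SCOs, each backed by one video-segment Asset.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Asset:
    """A raw resource, here a video segment (times in seconds)."""
    identifier: str
    start: float
    end: float


@dataclass
class Sco:
    """A launchable, trackable unit built from one or more Assets."""
    identifier: str
    title: str
    assets: List[Asset] = field(default_factory=list)

    def duration(self) -> float:
        # Total length of all segments that make up this SCO.
        return sum(a.end - a.start for a in self.assets)


@dataclass
class ContentAggregation:
    """A content structure grouping SCOs into a unit of instruction."""
    title: str
    scos: List[Sco] = field(default_factory=list)

    def trackable_units(self) -> List[str]:
        # Each SCO is the smallest unit an LMS can track.
        return [s.identifier for s in self.scos]


def video_as_sco_container(
        title: str,
        scenes: List[Tuple[str, float, float]]) -> ContentAggregation:
    """Describe a video as one SCO per scene (hypothetical helper)."""
    scos = []
    for index, (scene_title, start, end) in enumerate(scenes, start=1):
        asset = Asset(identifier=f"asset-{index}", start=start, end=end)
        scos.append(Sco(identifier=f"sco-{index}", title=scene_title,
                        assets=[asset]))
    return ContentAggregation(title=title, scos=scos)
```

With this representation, a lecture split into two scenes yields two separately trackable units instead of a single atomic object.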

3 System architecture

3.1 Design Objectives

The main goal of this work is the design and implementation of an architecture that ensures the following features in the e-Learning process:

• Adaptivity

• Interactivity

• Openness

In the following we describe the listed features in more detail.

Adaptivity is needed to select and customize the learning resources to the learner/student and to the context in which the learning is taking place. These two aspects exhibit a wide range of variability for a web-based application. Such an application cannot make a-priori assumptions about the characteristics of the learner, such as educational background, cognitive style, etc., nor about the context and purpose of the learning process. Instead it must be able to adapt dynamically, based on explicit knowledge about these aspects, which needs to be maintained independently of the more generic learning content knowledge.

Interactivity is desirable to make the learning resources more responsive, autonomous and proactive, and to better exploit the inherently distributed nature of web-based e-Learning. Web-based learning lacks the advantages of the traditional student-tutor relationship, as it involves learners interacting on their own with the e-Learning application. Such an application needs to approximate many of the characteristics of a human tutor or coach, such as being unobtrusive, having a good sense of the status of the student with respect to the subject being learned, and being able to adopt motivational strategies, among others.

Openness is a requirement for any technology that aspires to become a global undertaking spanning many collaborating institutions, within many cultures, and across many languages. Experience has demonstrated that technology alone is not sufficient to achieve successful adoption of new solutions. Open standards have been shown to be a key factor in achieving widespread adoption, by facilitating reuse, specialization, and quality improvements.

3.2 Architectural Layout

All the above objectives point in the direction of a web-based solution. Among current web technologies, we have chosen a mixed Browser-based/Client-based implementation. To communicate with the LMS, the learner's browser only has to support JavaScript (as in ADL’s SCORM recommendation); the other active component on the learner side is a plug-in which permits the media to be transmitted to the learner and facilitates interaction with him. Reliance on the browser allows the plug-in to be a relatively thin client, but it also imposes all the browser's limitations upon the plug-in, constraining the ultimate flexibility of the approach.

As regards the server side, the proposed architecture (see Figure 1) for video tracking is made up of different modules: some of them are parts of a typical web-based learning environment, while others are components of a classical video management architecture [6]. A glue infrastructure has been introduced to enhance the tracking process and fully support trackable video streams.

Most of the components of the diagram are also common parts of commercial industry-standard LCMS and LMS products, and different background colors indicate which of these categories a block belongs to [5]. As shown in Figure 1, the proposed architectural layout is composed of the following entities:

• Lightweight Directory Access Protocol: the whole system relies on a common LDAP authentication infrastructure to provide efficient and affordable access control to the courses and per-user tracking at the video navigation level.

• Learning Management System, which stores the learning contents provided by the content experts by means of learning content authoring tools. The system receives the course contents from the LMS architecture and delivers courses through a pool of content servers, also denoted as Delivery Servers. Its other main purposes are to store per-user profiling information and course details, and to manage the course authoring phase. Progress-tracking information is also retrieved and stored by the tracking engine, which controls the content delivery.

• Learning Content Management System, which stores the learning resources provided by teachers, tutors and content experts. It supports and improves the functionalities of the traditional LMS, and it enhances the traditional process of sharing learning resources among all the actors of the system.

• Learning Authoring Tool. It is a Client/Server application which allows the Content Experts to produce Learning Objects according to the SCORM standards. The Learning Authoring Tool makes it possible to assemble Learning Objects and Assets to create SCOs, and provides simple instruments to create a structured course. It can then communicate with the LMS to deploy the package in which the course (and its metadata) is stored.

• Streaming Server. It is the component of the architecture in which the Multimedia files are stored; its job is to deliver the Multimedia files to the Learner and/or (during on-line management) to the Teacher.

• Multimedia Client. It is a Client/Server application which allows the Content Experts to catalogue, manage and logically divide the Multimedia Objects. It also allows the Content Experts to transform a Multimedia Asset into a Multimedia SCO.

• Browser and Plug-In. No more than a traditional browser: for the Learner, it only has to let him view the Multimedia files and communicate with the LMS through JavaScript; for the Content Experts and the Teachers, it allows the description of the Multimedia Object to be changed on-line.

[Figure 1: block diagram. Block labels: LDAP; LMS; LCMS; Meta Data Repository (MDR); Content Repository (CR); MultiMedia Repository (MMR); Streaming Server; Authoring Tool; MM Client; Browser + Plug-in; HTML + MEDIA; Redation. Actors: Learner, Teacher, Expertise.]

Figure 1. System Architecture - a glance

Most of the items of our layout conform to the general implementable e-Learning architectures described in the literature, as in [5] and [6], and most of these items are commercial systems (see Section 5, Current Implementation). The remaining parts (such as the LCMS) are custom-built to support deep Multimedia content tracking and guided content navigation. In the next section we concentrate on the Multimedia Client, which is the main focus of our work.

4 A Client for Multimedia Learning Content Management

The design and development of a client application able to manage multimedia resources with didactic aims is a current research challenge in the LCMS field. From a general point of view, such a client has to be considered an integrated system capable, on the one hand, of satisfying all learner requests and, on the other, of simplifying and accelerating the content expert's metadata annotation task (Figure 2).

To achieve these purposes, the didactic video is first segmented into shots according to a well-known automatic indexing process [4]. After that, the shots are grouped into scenes, which can be considered as video parts having a particular semantic meaning, by means of a semi-automatic process (Silence Detection) [9] driven and supervised by the content expert. In this manner the video is not treated as a closed black box (autonomous LO) in the learning process; instead, it can be seen as a SCO container where each SCO can be a video scene.
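The shot-to-scene grouping can be sketched in a few lines. The code below is only an illustration under strong assumptions: it replaces the adaptive detector of [9] with a plain fixed energy threshold, and it assumes that a scene boundary is placed wherever a shot boundary falls inside a detected silence. The function names, the `threshold` and `min_duration` parameters and the per-frame energy input are all hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def silent_intervals(energies: List[float], frame_seconds: float,
                     threshold: float, min_duration: float) -> List[Interval]:
    """Return runs of low-energy audio frames lasting at least min_duration."""
    intervals: List[Interval] = []
    start = None  # index of the first frame of the current silent run
    for i, energy in enumerate(energies):
        if energy < threshold:
            if start is None:
                start = i
        elif start is not None:
            if (i - start) * frame_seconds >= min_duration:
                intervals.append((start * frame_seconds, i * frame_seconds))
            start = None
    # Close a silent run that reaches the end of the track.
    if start is not None:
        if (len(energies) - start) * frame_seconds >= min_duration:
            intervals.append((start * frame_seconds,
                              len(energies) * frame_seconds))
    return intervals


def group_shots_into_scenes(shots: List[Interval],
                            silences: List[Interval]) -> List[List[Interval]]:
    """Start a new scene when a shot begins inside a silent interval."""
    scenes: List[List[Interval]] = []
    current: List[Interval] = []
    for shot in shots:
        boundary = shot[0]
        if current and any(s <= boundary <= e for s, e in silences):
            scenes.append(current)
            current = []
        current.append(shot)
    if current:
        scenes.append(current)
    return scenes
```

In a supervised workflow such as the one described above, the output of `group_shots_into_scenes` would only be a proposal that the content expert can merge or correct.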

Figure 2. Multimedia Client - general metadata

As underlined previously, a crucial point is the choice of the algorithms for video indexing. In particular, for video shot segmentation we have used a technique based on Animate Video [4], in which video cuts are determined by detecting the local minimum points of a visual attention function, which yields this logical partition. By contrast, for video scene detection the literature offers many different algorithms that could be chosen [11], [12]. Our choice has been guided by two different needs: (i) we have to manage “learning” multimedia files which are, in most cases, very poor in shot content; (ii) we have to use software which really improves and simplifies the work of the content expert. For these reasons, in the context of learning videos, scenes can be correctly detected by a Fast Silence Detection [9].
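The idea of placing cuts at local minima of an attention-based function can be illustrated on a one-dimensional signal. The sketch below is not the method of [4]: it assumes that a per-frame consistency value has already been computed by some attention model, and it adds a hypothetical `threshold` parameter so that shallow minima are ignored.

```python
from typing import List


def local_minima_cuts(signal: List[float], threshold: float) -> List[int]:
    """Return frame indices that are local minima below the threshold."""
    cuts: List[int] = []
    for i in range(1, len(signal) - 1):
        is_minimum = signal[i] < signal[i - 1] and signal[i] <= signal[i + 1]
        if is_minimum and signal[i] < threshold:
            cuts.append(i)
    return cuts
```

For example, with the values `[0.9, 0.8, 0.2, 0.85, 0.9, 0.7, 0.88, 0.1, 0.9]` and a threshold of `0.5`, frames 2 and 7 are reported as cuts, while the shallow dip at frame 5 is discarded.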

The next step is the metadata annotation of the video scenes and a correction (usually a manual merging of video scenes) of the results of the automatic algorithm, which in the majority of cases can produce false detections (Figure 3).

Figure 3. Multimedia Client - scene metadata

Using this approach it is possible to build up an annotated video database in which each multimedia content (and consequently each scene) can be indexed and retrieved by a dedicated search engine; in this manner, teachers can use it to enrich their courses. All metadata are stored in the LCMS and can be used by the LMS to dynamically offer a detailed description of multimedia contents.

As regards the SCORM tracking process, we make the following assumptions:

• What is to be tracked is always an HTML page

• This HTML page is linked to an LMS by invoking the appropriate API

• These invocations contain information about: (i) the resource (page) to be tracked, (ii) a flag indicating whether the resource has already been seen by the learner, (iii) the time (expressed in seconds) spent using the resource and the user break-point, (iv) a further flag indicating whether the use of the resource is completed.

In particular, to invoke the communication API and manage the data exchange with the LMS, for each SCO some JavaScript code is included in an HTML page that drives the learner's use of the resource. The page provides its own video start/stop buttons (see Figure 4); for this reason the control buttons of the video player are disabled. The description bar of the video player continuously shows information about the SCO, retrieved on the fly from the LCMS in which it is stored. When the stop button is pressed, or when the video ends, the usage state of the SCO is updated. In this way the teacher is always informed about the learner's progress and can monitor student activities.
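The state carried by each tracking call (resource, seen flag, time spent, break-point, completion flag) can be modelled as a small record updated by the start and stop buttons. The sketch below is a simulation, not the JavaScript actually embedded in the page. The class `SceneTracker` and its methods are hypothetical, and the mapping onto the SCORM 1.2 element names `cmi.core.lesson_status`, `cmi.core.lesson_location` and `cmi.core.session_time` is our assumption, since the text does not state which data model elements are used.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SceneTracker:
    """Tracking state of one video SCO (hypothetical structure)."""
    sco_id: str
    duration: float               # length of the scene in seconds
    seen: bool = False
    seconds_spent: float = 0.0
    break_point: float = 0.0      # last playback position in seconds
    completed: bool = False
    _started_at: Optional[float] = None

    def on_start(self, now: float) -> None:
        """Handle the start button: mark the resource as seen."""
        self.seen = True
        self._started_at = now

    def on_stop(self, now: float, position: float) -> None:
        """Handle the stop button or the end of the video."""
        if self._started_at is None:
            return  # stop without a matching start is ignored
        self.seconds_spent += now - self._started_at
        self._started_at = None
        self.break_point = position
        if position >= self.duration:
            self.completed = True

    def to_scorm12(self) -> Dict[str, str]:
        """Map the state to data model elements (assumed mapping)."""
        if self.completed:
            status = "completed"
        elif self.seen:
            status = "incomplete"
        else:
            status = "not attempted"
        hours, rest = divmod(int(self.seconds_spent), 3600)
        minutes, seconds = divmod(rest, 60)
        return {
            "cmi.core.lesson_status": status,
            "cmi.core.lesson_location": str(self.break_point),
            "cmi.core.session_time": f"{hours:04d}:{minutes:02d}:{seconds:02d}",
        }
```

In a page like the one described, the dictionary returned by `to_scorm12` would correspond to the values passed to the LMS when the stop button is pressed or the video ends.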

Figure 4. The HTML page for Multimedia Content fruition

5 Current Implementation

The proposed Multimedia tracking system, developed within the e-learning framework of the University of Naples “Federico II”, operates according to the following specifications.

• The videos are converted using the Windows Media Encoder 9 Series encoders, and the video assets are published by means of the Microsoft Windows Media Services streaming server provided in the standard Microsoft Windows Server 2003 installation. This product has been chosen because of its advanced bandwidth-adaptive streaming functionalities. We also experimented successfully with Darwin Streaming Server.

• The core element of the system is the IBM Websphere 5.0 application server, which deploys a customization of the IBM Learning Management System 1.0.5. We have also successfully tried this architecture on Oracle iLearning 4.x.

• The LMS application uses a common authentication infrastructure and a uniform naming schema based on the Oracle Internet Directory 9.0.2 LDAP server.

• The ‘timeline’ dependencies and metadata in the Multimedia-SCO are managed by J2EE servlets deployed on our running instance of the Tomcat 5.0.x application server. This application relies on the underlying database tier to store and compute information about the time intervals that the students spend browsing each single video element of the course.

• The Multimedia Client has been developed in Java and is compatible with JVM 1.4 or later. Java has been selected as the reference platform mainly for its portability.

• The standard tracking information and the additional timeline dependencies are stored in schemas provided by a large Oracle 9i database instance. Platform metadata and per-user tracking data are separated from our additional time data by means of additional database schemas.

The streaming server runs on Microsoft Windows 2003 Server. The remaining parts of our architecture run on a Linux Red Hat Advanced Server 2.1 distribution.

An additional application server tier consists of a set of Tomcat instances running on a Debian GNU/Linux 3.0r2 distribution customized with recompiled mission-critical infrastructure software components. We merged three different video compression levels into a single video stream to adapt performance to the standard bit rates of common PSTN, DSL and LAN/Fiber connections.
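The timeline application mentioned in the list above stores the time intervals a student spends on each video element. One way such intervals could be turned into a per-scene coverage figure is sketched below. It is an illustration under our own assumptions, not the servlet logic: overlapping viewing intervals are merged so that re-watched material is counted once, and the function names and the scene dictionary are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds of video time


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into disjoint ones."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def coverage_per_scene(watched: List[Interval],
                       scenes: Dict[str, Interval]) -> Dict[str, float]:
    """Return, for each scene, the fraction of its length that was watched."""
    merged = merge_intervals(watched)
    result: Dict[str, float] = {}
    for name, (scene_start, scene_end) in scenes.items():
        covered = sum(
            max(0.0, min(end, scene_end) - max(start, scene_start))
            for start, end in merged
        )
        result[name] = covered / (scene_end - scene_start)
    return result
```

For a student who watched `(0, 30)`, `(20, 60)` and `(90, 100)` of a video whose scenes span `(0, 60)` and `(60, 120)`, the first scene is fully covered and the second is covered for ten of its sixty seconds.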

6 Conclusions

A web-based system for tracking video LOs has been presented. It offers users guided browsing of video content in an e-learning environment, and gives the LMS the ability to maintain information about user progress on such educational resources. Future work will be devoted to three different directions:

• Create a Multimedia mark-up language for learning objects, which will make it possible to optimize the interaction between the Multimedia Client and the LCMS [10]. In the current implementation, for each change in a multimedia description, the MM Client communicates with the LCMS DB, so the Content Expert has to be on line to use this software. In a future release, the Multimedia Client will produce an XML file as its result, to be uploaded to the LCMS; in this way, all the changes will become effective at one time.

• The LCMS will produce the HTML pages: in this way the teacher will no longer need to install the Multimedia Client, and will be able to obtain everything needed for multimedia content at any time and anywhere on line, without direct interaction with the Content Expert or any software (except the Authoring Tool).

• Our efforts are concentrated on creating software that will automate the search for ontologies in the multimedia content.

7 Acknowledgments

This work has been carried out partially under the financial support of the Ministero dell’Istruzione, dell’Università e della Ricerca (MIUR) in the framework of the FIRB Project “Middleware for advanced services over large-scale, wired-wireless distributed systems” (WEB-MINDS).

References

[1] A. Large, J. Beheshti, A. Breuleux, A. Renaud, “The Influence of Multimedia on Learning: A Cognitive Study”, Proceedings of ACM Multimedia ’94, pages 315–320, San Francisco, CA, October 1994.

[2] www.adlnet.org

[3] “Shareable content object reference model initiative (SCORM), the xml cover pages”, October 2001, http://xml.coverpages.org/scorm.html.

[4] G. Boccignone, A. Chianese, V. Moscato, A. Picariello, “Foveated Shot Detection for Video Segmentation”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 3, March 2005.

[5] “E-Learning Applications Infrastructure”, White Paper, SUN.

[6] S. Shermann, M. Chang, Q. Li, “Architecture and Mechanisms of a Web-Based Video Data Management System”, Proceedings of the IEEE International Conference on Multimedia and Expo (ICME’00), New York City, NY, USA, Jul 30-Aug 2, 2000.

[7] eLearning Industries Group, “Developing eLearning Communities in the EU - The eLearning Industry Group Perspective”.

[8] H. Beetham, “Understanding Learning”, Skills for e-learning, Ulster, 29 July, 2002.

[9] P. De Souza, “A statistical approach to the design of an adaptive self-normalizing silence detector”, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 31, no. 3, June 1983, pp. 678–684.

[10] G. Amato, C. Gennaro, P. Savino, F. Rabitti, “Milos: a Multimedia Content Management System for Digital Library Applications”, Proceedings of the 8th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2004), Volume 3232 of Lecture Notes in Computer Science, pages 14–25. Springer, September 2004.

[11] A. Hampapur, R. Jain, and T. Weymouth, “Digital video segmentation”, Proc. ACM Multimedia ’94, 1994, pp. 357–364.

[12] H. Lu, “Model based video segmentation”, IEEE Trans. Circuits Syst. Video Technol., vol. 5, no. 5, pp. 533–544, May 1995.