Web Technologies for Advanced Applications in Audio Processing




Abstract— Modern technologies and globalization have led to media convergence and the massive use of personal media. Individuals become the source of information: they generate, record and distribute content through personal media. Web technologies are becoming a serious competitor to standard desktop applications. In this paper we present one solution for advanced sound generation and processing based on client-side and server-side web technologies: Node.js, Socket.IO, the Web Audio API and Node-Webkit. The sound source is a MIDI keyboard or a virtual keyboard in the graphical interface, and the generated sound can be managed, processed and captured from any physically distant location. The aim of this paper is to show the possibilities of advanced web technologies in working with audio files.

Index Terms—Audio files, processing, client-side programming, server-side programming, web technologies.

I. INTRODUCTION

Modern technologies have changed the way we listen to, perceive and consume sound and music. For many generations growing up in the 21st century, music is something heard primarily through speakers and headphones, and only occasionally live. This way of consuming music influences the socio-cultural and economic development of society and art practice [1]. A concert hall has physical limitations in the real world, while the Internet, as a global virtual network, has almost unlimited possibilities. A real physical space for listening to and watching a musical performance always admits a limited number of visitors, while in the virtual world the only limit is the listeners' ability to access the Internet. Consuming live music usually involves gathering people at a certain place at the same time, but in the virtual world it is also possible to gather people who listen to the same media content at the same time. We can view the Internet as a virtual version of the concert hall, which gathers listeners from around the world so that they can all enjoy a music event that is not spatially restricted. Music lovers can enjoy the most diverse audio content they choose. Extremely intensive technological development in the field of information and communication technologies enables constant changes and new opportunities for end users, which results in their personal and social transformation. In an ultra-digitized, globalized and connected network of listeners and fans, individuals become isolated from each other in space and time in their consumption of music [2].

Manuscript received Sept, 2017.

Nenad Kojić, Internet technologies, ICT College of vocational studies, Belgrade, Serbia.

Đorđe Petrović, Propulsion Apps, Belgrade, Serbia.

Natalija Vugdelija, Internet technologies, ICT College of vocational studies, Belgrade, Serbia.

Milanko Kragović, Internet technologies, ICT College of vocational studies, Belgrade, Serbia.

The convergence of the media that happens involves five separate processes [3]:

1. Technological convergence - enables converting analogous information to which we are surrounded, such as audio and video content in digital information, and expanding their processing, storage and distribution capabilities through multiplatform media.

2. Economic convergence - allows horizontal integration within the culture and entertainment industry, which are increasingly intertwined and lead to the restructuring of cultural production and transmedia exploitation of media content and products.

3. Social convergence - it allows the development of multitasking strategies and skills that allow consumers to move between the various media environments available.

4. Cultural convergence - enables the development of new forms of creativity related to different media technologies, industry and consumers.

5. Global convergence - allows the emergence of hybrid cultures created as a result of international circulation of media content.

Media convergence allows the end user to download, archive, adapt, redistribute, and modify media content. Spectators, listeners and readers are trained to manipulate multimedia content, and as a result of this process there is an increased loyalty of media consumers and massive generation of cheap media content.

Digitalization and the personal use of media technologies have destabilized the traditional distinction between mass communication and interpersonal communication, and therefore between mass media and personal media. Mass media are no longer the primary sources and providers of information, because individuals using modern media technologies create different multimedia content and distribute personal expressions through digital networks. Personal media are not institutionalized and professionalized as mass media are and, therefore, facilitate interaction. The implementation of digital technologies and the intensive development of hardware and software in the field of multimedia communications directly influence the creation and contextualization of personal media. Personal communication tools have experienced unprecedented development and increased use with the digitization of media technologies [4]. The same technology for the production and distribution of content used by the mass media has become accessible to ordinary end users, and this has enabled the massive use of personal media. Technical development in the field of information and communication technologies has also brought mass media and personal media communication closer together. The Internet is the best example, as a medium that simultaneously gives space to commercial online content generated by existing mass media and to personal web pages created by individual users.

Digital audio processing gives personal media users great opportunities for audio modifications that were not available with analogue technology. In the digital domain the audio signal becomes data, and it differs from most generic data in that it needs to be played back over a certain period of time [5].

Generating music has been an active research area for decades [6]. Modern technologies have made it possible to generate very compact musical forms [7]. This paper presents one solution for the advanced application of web technologies that is a form of personal media and aims to enable the creation, processing and recording of sound. The paper is organized in five sections. After the Introduction, the second section defines the motivation for this paper and this research. The third section presents the structure of the proposed system. The fourth section briefly describes the technologies and software solutions used for the created application, while the fifth section provides a conclusion and directions for further development of the application.

II. MOTIVATION FOR THIS RESEARCH

Web-oriented languages and environments were initially developed only for displaying data and multimedia content in the browser. The rapid increase in the number of Internet users and the growing popularity of Internet services have contributed to the intensive development of numerous web-oriented programming languages, development environments and libraries and, especially, to their use in the form of web services [8]. The next major change was the development of the client-side programming language JavaScript, which was initially used for interactive manipulation of DOM elements and for event handling [9], [10]. With the creation of JavaScript libraries such as jQuery, Angular and React, and especially with WebGL, the language became very suitable for handling sophisticated requests in real time, even for complex multimedia processing [10], [11]. All these changes contributed to the development of JavaScript as a client-side language, while its widespread use was further boosted by the appearance of Node.js, which brought the language to the server side [12]. From that moment on, a large number of different libraries, tools, software packages and databases became available for integration with JavaScript. Currently, one of the most popular technology combinations with Node.js is MEAN [13].

On the other hand, the term Internet of Things, abbreviated IoT, is becoming more and more popular as a concept of using and connecting different types of technologies in order to better automate and control different types of processes and activities [14]. One of the primary approaches of IoT is to connect various hardware devices so that they exchange data among themselves or with the applications that control or monitor them, in order to integrate physical components and the computer system [15]. With the application of the described web technologies and the Internet, the computer system becomes global, so integration and management can be viewed globally, independently of the physical location of the device itself.

In this paper, one solution for generating sound from a MIDI keyboard, for the OS X operating system, is realized using only web programming languages and technologies. The aim of this work is a complete synthesis of sound, its processing, the generation of rhythm sections, the mixing of several channels (i.e. several individual instrument sounds), and their editing, reproduction and recording. As the proposed solution is implemented as a web application, the place where the sound is interactively played or edited is independent of the place where the MIDI keyboard is located. The sound is not transferred from the hardware, but is programmatically generated and modified by changing the frequency of the sound and other parameters, thus achieving different effects in the timbre, pitch and quality of the sound. The idea for the application came from well-known music production software such as "Cubase", "Ableton" and "FL Studio". The application is written in the JavaScript programming language and aims to show the capabilities and options that it offers, on both the client and the server side. It was implemented using a number of other web-oriented technologies and solutions, which are available to all users on the Internet.
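The idea of generating sound programmatically by setting its frequency can be illustrated with a minimal sketch. The paper does not state which mapping from keys to frequencies is used; the sketch below assumes standard 12-tone equal temperament with A4 (MIDI note 69) tuned to 440 Hz, and the function name midiNoteToFrequency is hypothetical.

```javascript
// Hypothetical helper, not taken from the described application.
// Assumes 12-tone equal temperament with A4 (MIDI note 69) tuned to 440 Hz.
function midiNoteToFrequency(note) {
  // Each semitone multiplies the frequency by the twelfth root of 2.
  return 440 * Math.pow(2, (note - 69) / 12);
}

console.log(midiNoteToFrequency(69));            // A4, 440 Hz
console.log(midiNoteToFrequency(60).toFixed(2)); // middle C, about 261.63 Hz
```

A value computed in this way could, for instance, be assigned to the frequency parameter of an oscillator.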

III. STRUCTURE OF THE PROPOSED SYSTEM


The proposed solution is shown as a block diagram in Figure 1. The solution is divided into two basic components: the server part and the client part of the application. A MIDI keyboard can be connected to the server part of the application, which allows the user, a musician, to play a physical keyboard, but this is not mandatory. The keyboard is connected to the computer, i.e. to the application, via a USB port.


The server application is a Node.js HTTP server that communicates with the MIDI keyboard on one side, while on the other side the connection to the client is achieved via the Socket.io library. On the client side, the complete code is executed in the Node-Webkit container, which is a "wrapper" for the initial web application.
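To make the server's role more concrete, the following sketch shows how a raw MIDI message might be decoded into an event object before being forwarded to the client. The paper does not describe how the MIDI data is read or decoded, so the function name parseMidiMessage and the shape of the returned object are assumptions; only the status-byte layout (0x9n for note-on, 0x8n for note-off) comes from the MIDI standard.

```javascript
// Hypothetical helper: decode a raw 3-byte MIDI channel message.
// Status byte 0x9n is note-on and 0x8n is note-off, where n is the channel (0-15).
function parseMidiMessage(bytes) {
  var status = bytes[0] & 0xf0;  // upper nibble: message type
  var channel = bytes[0] & 0x0f; // lower nibble: channel number
  var note = bytes[1];
  var velocity = bytes[2];

  if (status === 0x90 && velocity > 0) {
    return { type: 'noteOn', channel: channel, note: note, velocity: velocity };
  }
  // A note-on with velocity 0 is conventionally treated as a note-off.
  if (status === 0x80 || status === 0x90) {
    return { type: 'noteOff', channel: channel, note: note, velocity: velocity };
  }
  return null; // other message types are ignored in this sketch
}

console.log(parseMidiMessage([0x90, 60, 100]));
```

An object of this kind is small enough to be sent to the client as the payload of a Socket.io event, so that only key events, and no audio, travel over the network.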

The complete user interface was made using HTML 5 and CSS 3, which allow up-to-date graphic visualizations in the browser. The user sees a virtual keyboard in the browser that mirrors every key press on the physical keyboard and whose keys can also be pressed directly. This interface is shown in Fig. 2.


In the left part of the user application, there is a section containing a synthesizer with basic sound settings and filters that control the timbre of the generated sound (oscillator, low frequency and low pass filter). On the right side there is a rhythm section with basic controls such as "STOP", "PLAY", "PAUSE" and a BPM controller. This rhythm section contains options for adjusting the rhythm speed (BPM), and a predefined basic set-up of sounds for the rhythm section, such as kick, snare, clap, and hit.

Figure 2. Graphic interface for displaying keyboards to the client
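The relation between the BPM setting of the rhythm section and the timing of individual sounds can be sketched as follows. The paper does not describe how the rhythm section is scheduled, so the function name stepStartTimes, the default of 4 beats per bar and the resolution of 4 steps per beat are assumptions made only for illustration.

```javascript
// Hypothetical helper: start time (in seconds) of each step in one bar.
// Assumes 4 beats per bar and 4 steps per beat (16th notes) by default.
function stepStartTimes(bpm, beatsPerBar, stepsPerBeat) {
  beatsPerBar = beatsPerBar || 4;
  stepsPerBeat = stepsPerBeat || 4;
  var secondsPerBeat = 60 / bpm;
  var secondsPerStep = secondsPerBeat / stepsPerBeat;
  var times = [];
  for (var i = 0; i < beatsPerBar * stepsPerBeat; i++) {
    times.push(i * secondsPerStep);
  }
  return times;
}

// Example pattern: a kick on every beat (steps 0, 4, 8, 12) at 120 BPM.
var kickSteps = [0, 4, 8, 12];
var times = stepStartTimes(120);
var kickTimes = kickSteps.map(function (step) { return times[step]; });
console.log(kickTimes); // [ 0, 0.5, 1, 1.5 ]
```

Changing the BPM value only rescales these start times, which is why a single controller is enough to speed up or slow down the whole rhythm section.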

If the block diagram of Fig. 1 is viewed functionally, then it can also be presented as a diagram of used technologies, which is shown in Fig. 3, i.e. Node.js, Socket.IO, Web Audio API and Node Web Kit.

Figure 3. Structure of the project in the form of the used web technologies

The key component that enabled the implementation of the proposed solution is the Web Audio API. This API is integrated in all of the newer WebKit browsers and its purpose is to synthesize a signal into sound, so it was used to generate the sound for the user, which is transmitted across the network. With this API, it is possible to choose audio sources, add effects, perform audio visualization, etc.

In this way, the whole system allows the user to plug in a MIDI keyboard on one side, and then to combine the keyboard signal with signals from the virtual keyboard, assign them to individual channels, define each channel's timbre and sound effects, add a rhythm section with its own effects, and merge everything into a composite signal. All these functionalities are presented graphically in the client part of the application, and the signals are accepted, transmitted, processed and edited using web technologies. At the end of the interaction, the user can save the final audio sequence, so this functionality is also supported in this interactive, real-time system.
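The merging of several channels into a composite signal can be illustrated by the arithmetic alone. In a browser this summing is normally done by the audio graph itself; the sketch below is not the application's code, and the function name mixChannels and the channel object shape are hypothetical.

```javascript
// Hypothetical helper: mix several mono channels into one composite signal.
// Each channel is { samples: array of numbers in [-1, 1], gain: number }.
function mixChannels(channels) {
  var length = 0;
  channels.forEach(function (ch) {
    length = Math.max(length, ch.samples.length);
  });
  var out = new Float32Array(length);
  // Sum the gain-scaled samples of every channel.
  channels.forEach(function (ch) {
    for (var i = 0; i < ch.samples.length; i++) {
      out[i] += ch.samples[i] * ch.gain;
    }
  });
  // Clamp to the valid range so the result does not overflow when played or saved.
  for (var j = 0; j < out.length; j++) {
    out[j] = Math.max(-1, Math.min(1, out[j]));
  }
  return out;
}

var mixed = mixChannels([
  { samples: [0.5, 0.5], gain: 1 },
  { samples: [0.25, 1, -1], gain: 0.5 }
]);
console.log(Array.from(mixed));
```

Hard clamping is the simplest way to keep the sum in range; a real mixer would more likely lower the channel gains or apply a compressor to avoid audible distortion.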

This project is a fully functional application for generating sound from a MIDI controller that can be used in real situations. Applications in which user functionality is realized through remote access, using web technologies, are becoming more and more widely used [16]. By giving the user complete control and the ability to save their work, JavaScript has confirmed that it is "up to the task" even for advanced and more demanding applications that, so far, were mostly reserved for the desktop environment.

IV. USED TECHNOLOGIES

In this project, emphasis is placed on the server-side and client-side JavaScript code and its advanced use for a specific purpose in the domain of sound generation. Fig. 3 shows the basic technologies and languages that were used, but a number of additional libraries and environments with specific purposes were used as well:

1. Node.js (Express.js framework),

2. Underscore (a JavaScript library that provides useful functions for frequent programming tasks),

3. jQuery (a JavaScript library for graphic effects and event control),

4. Socket.io,

5. Modernizr (a JavaScript library that detects HTML 5 and CSS 3 functions in different browsers), etc.

a) Node.js (Express.js framework)

Node.js is a platform for running JavaScript outside the browser, built on Chrome's V8 JavaScript engine. Its key feature is an event-driven, non-blocking I/O model, which makes it a very powerful tool for creating efficient applications. Since it is based on asynchronous events, Node.js is suitable for scalable network applications [17]. We use Node.js primarily for creating HTTP servers. The following is an example of a basic HTTP server based on Node.js.

    // load the http module
    var http = require('http');

    http.createServer(function (req, res) {
      res.writeHead(200, {'Content-Type': 'text/plain'});
      res.end('Hello World\n');
    // the server listens on port 1337
    }).listen(1337, '127.0.0.1');

    console.log('Server running at http://127.0.0.1:1337/');

One of the most popular frameworks for Node.js is the Express.js framework [18]. It enables fast and easy creation of single-page, multi-page and hybrid applications. The popularity of the framework has been positively affected by the fact that some leading companies, i.e. brands, such as MySpace and Klout, have stood behind it. The main advantage of the framework is that it has a very simple routing system and a large number of built-in features, the most common of which are sessions, POST body and query string parsing, and simple template engines such as Jade, Mustache and EJS, which speed up code development and testing. The central part of Express.js is routing. The following is an example of an HTTP server in Express.js that illustrates the routing process:

    // load the express module
    var express = require("express");
    var app = express();

    // example of the routing system
    app.get("/index", function(req, res) {
      // render the view
      res.render('index.hbs');
    });

    var port = process.env.PORT || 1337;
    app.listen(port, function() {
      console.log("Listening on " + port);
    });

b) Socket.io

Socket.io is a JavaScript library that provides event-based, real-time communication between a web client and a server. It can be used on any platform, browser or device, which gives it a wide range of applications. It consists of two parts: a client library, running in the browser, and a server library for Node.js. Socket.io primarily uses the WebSocket protocol but, in case of problems, it has a built-in fallback to techniques such as JSONP polling and AJAX long polling. Socket.io selects the best transport for real-time communication by itself, requiring the developer only to know the syntax of the Socket.io module. In this project, Socket.io was used to forward data from the server to the client, depending on the event that triggers it, in a very fast and reliable way, which was very important for the purposes of this project.

The following is an example of server-side code in which, when a user connects to Socket.io, an event named "konektovan" is emitted and an object with data about the authorized user is sent to the client side.

    io.sockets.on('connection', function (socket) {
      socket.emit('konektovan', {"ime": "Djordje"});
    });

On the client side "listening" of this event is realized by:

    // create a connection to the Node.js server where socket.io is loaded
    var socket = io('http://localhost');
    // receive data
    socket.on('konektovan', function (data) {
      console.log(data.ime);
    });

c) Node-Webkit

Node-Webkit is an application runtime based on Node.js and Chromium, designed for building applications that work equally well on operating systems such as Linux, Windows and OS X. It was created by Intel as an open-source project in 2011, with the aim of reducing the amount of work needed to create offline single-page web applications.

Because the WebKit-based browser has Node.js integrated into it, a Node-Webkit application can realize a much greater range of functionality than the HTML 5 APIs alone provide, which is why it is very popular.

d) Web Audio API

Web Audio API is a JavaScript API for processing and generating sound in web applications [19]. Its creators' goal was to provide the audio components found in modern desktop applications for music production. Although the project is still under development, the API is gradually finding its application in WebGL games. The logic of using this API is as follows: first, a developer creates a so-called "AudioContext", in which all subsequent elements and effects are defined. Next, the sound sources to be used are defined for the created "AudioContext" (most often an <audio> element or an oscillator). In parallel, various types of effects are created, such as reverb, filter, panner, compressor, etc.

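The following code detects whether the browser supports the Web Audio API, creates an AudioContext, and generates a simple sound:

    var context;
    var oscillator;

    window.addEventListener('load', init, false);

    function init() {
      try {
        // create the AudioContext (with the prefixed fallback for older WebKit browsers)
        window.AudioContext = window.AudioContext || window.webkitAudioContext;
        context = new AudioContext();
        // create the oscillator
        oscillator = context.createOscillator();
        // set the oscillator type (the legacy numeric value 2 corresponds to 'sawtooth')
        oscillator.type = 'sawtooth';
        // connect the oscillator to the AudioContext destination
        oscillator.connect(context.destination);
        // start the oscillator (start() replaces the deprecated noteOn())
        oscillator.start(0);
      }
      catch (e) {
        // reaching this block means the Web Audio API is not supported in this browser
      }
    }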

Then, for the defined "AudioContext" a final sound destination is selected (for example, the user's sound system) and finally, the defined sound source is associated with the chosen effects and the destination selected. In this way, the real signal is easily processed, or effects on it changed, and it is immediately transmitted as modified to the desired destination.
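The chain of source, effects and destination described above can be sketched as a small function. This is only an illustration under stated assumptions, not the application's code: the function name buildVoice, the sawtooth waveform, the 1000 Hz cutoff and the 0.5 gain are all arbitrary choices, and the `context` argument is expected to be an AudioContext created by the caller.

```javascript
// Hypothetical sketch: oscillator -> low-pass filter -> gain -> destination.
// `context` is an AudioContext; the name and parameter values are assumptions.
function buildVoice(context, frequency) {
  var oscillator = context.createOscillator();
  oscillator.type = 'sawtooth';
  oscillator.frequency.value = frequency;

  var filter = context.createBiquadFilter();
  filter.type = 'lowpass';
  filter.frequency.value = 1000; // cutoff in Hz, chosen arbitrarily

  var gain = context.createGain();
  gain.gain.value = 0.5; // output volume, chosen arbitrarily

  // Chain the nodes: source -> effect -> volume -> output.
  oscillator.connect(filter);
  filter.connect(gain);
  gain.connect(context.destination);

  return { oscillator: oscillator, filter: filter, gain: gain };
}
```

In a browser, a caller would create an AudioContext, call buildVoice(context, 440) and then start the returned oscillator with its start() method. Because each effect is a separate node, changing the timbre amounts to adjusting the parameters of a node or inserting another node into the chain.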


Some additional components and libraries were used in this project which did not have a significant effect on the generation or modification of the sound, but which allowed quicker and easier integration of all the technologies used. One of them is the dependency manager, a system that speeds up application setup by allowing all the packages needed for the project to be installed from one place with a single command. Node.js has its own dependency manager, NPM (Node Package Manager), which is included directly in the environment, so no additional installation is needed. Finally, the final application and its integration with the MIDI keyboard are shown in Fig. 4.

Figure 4. Real time usage of proposed solution

All the described technologies are mostly JavaScript-based and completely web-oriented. It turned out that they can provide the user, in real time, with all the functionality for controlling sound at the expected quality, because no audio was physically transmitted; instead, the sound was generated just before its delivery to the client.

V. CONCLUSION

Personal media take a significant place in contemporary media production. Multimedia content generated by personal media becomes easily accessible to a global population using the Internet. In this paper, one solution for generating, modifying, editing, recording and reproducing sound is presented. The audio signal is generated using a MIDI keyboard or a virtual keyboard in the graphical interface. The entire application is realized using web-oriented programming languages based on JavaScript, for both the client and the server components. An important feature is that advanced functionality for processing and assembling sound, usual in desktop applications, is realized with advanced web technologies, using a number of current libraries and environments. The software application can be controlled remotely, so that the audio signal source, the actual processing of the generated sound and the user who plays the final media content can be at physically different locations.

REFERENCES

[1] M. Katz, Capturing Sound: How Technology Has Changed Music, London, University of California Press, 2004.

[2] D. Levitin, V. Menon, The Rewards of Music Listening: Response and Physiological Connectivity of the Mesolimbic System, NeuroImage 28: 175–184, 2005.

[3] H. Jenkins, "The cultural logic of media convergence", International Journal of Cultural Studies, London: Sage, 2004.

[4] M. Luders, "Conceptualizing personal media", in New Media and Society, London: Sage, 2008.

[5] J. Watkinson, An Introduction to Digital Audio (second edition), Focal Press, 2002.

[6] S. Kang, S. Y. Ok, Y. M. Kang, "Automatic Music Generation and Machine Learning Based Evaluation" (pp. 436–443). Springer Berlin Heidelberg, 2012.

[7] A. Huang, R. Wu, "Deep learning for music". arXiv preprint arXiv:1606.04930, 2016.

[8] A. L. Lemos, F. Daniel, B. Benatallah, "Web service composition: a survey of techniques and tools". ACM Computing Surveys (CSUR), 48(3), 33.1, 2016.

[9] A. Nitze, "Evaluation of JavaScript quality issues and solutions for enterprise application development". In International Conference on Software Quality (pp. 108-119). Springer, Cham, January 2015.

[10] E. Brown, Learning JavaScript: "JavaScript Essentials for Modern Application Development", O'Reilly Media, Inc. 2016.

[11] A. Sahu, A. G. Singh, CRISP: "A JavaScript strategy for cloud application development" (Doctoral dissertation), 2016.

[12] J. Krause, Introduction to Node. In Programming Web Applications with Node, Express and Pug (pp. 15-46). Apress, 2017.

[13] S. Holmes, Getting MEAN with Mongo, Express, Angular, and Node. Manning Publications Co, 2015.

[14] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, M. Ayyash, Internet of things: A survey on enabling technologies, protocols, and applications. IEEE Communications Surveys & Tutorials, 17(4), 2347-2376, 2015.

[15] A. Whitmore, A. Agarwal, L. Da Xu, The Internet of Things—A survey of topics and trends. Information Systems Frontiers, 17(2), 261-274, 2015.

[16] P. Quax, J. Liesenborgs, A. Barzan, M. Croonen, W. Lamotte, B. Vankeirsbilck, M. McLin, "Remote rendering solutions using web technologies", Multimedia Tools and Applications, 75(8), 4383-4410, 2016.

[17] https://nodejs.org/en/

[18] B. Augarten, M. Kuo, E. Lin, A. Shaikh, F. P. Soriani, G. Tisserand, K. Zhang, Express. js Blueprints. Packt Publishing Ltd, 2015.

[19] H. Rawlinson, N. Segal, J. Fiala, Meyda: an audio feature extraction library for the web audio api. In The 1st Web Audio Conference (WAC). Paris, Fr., 2015.

Nenad Kojić received his B.Sc., M.Sc. and Ph.D. from the School of Electrical Engineering, University of Belgrade. His research interests include neural networks, routing algorithms, heterogeneous wireless networks, image processing, web programming and multimedia. He is the author or co-author of more than 80 papers. He was involved in the European COST292 project as a member of the Image Processing, Telemedicine and Multimedia (IPTM) group from the School of Electrical Engineering of the University of Belgrade. He now works at the ICT College of Vocational Studies in Belgrade.

Đorđe Petrović received his B.Sc. from the ICT College of Vocational Studies in Belgrade. His research interests include web programming and multimedia, with a special interest in the creation and processing of digital sound. He now works as a web programmer in Belgrade.


learning, networks and software security. She is the author or co-author of more than 20 papers. She now works at the ICT College of Vocational Studies in Belgrade.