VoIP Conference Server
Evgeny Erlihman
jenia.erlihman@gmail.com
Roman Nassimov
roman.nass@gmail.com
Supervisor
Edward Bortnikov
SHORT DESCRIPTION OF THE PROJECT
INTRODUCTION
MOTIVATION
Session Initiation Protocol (SIP)
Java Media Framework (JMF)
PROJECT GOAL
THE SOLUTION
INTRODUCTION
REQUIREMENTS LIST
SERVER PACKAGE
Architecture Overview
Use Case Scenarios
Class Diagram
USER AGENT PACKAGE
Architecture Overview
Main Classes
Class Diagram
FUTURE DEVELOPMENT
APPENDIX A
SIP Requests Codes
Short description of the project
The goal of the project is to build a remote Server that routes VoIP packets between conference participants, and a User Agent application for starting and managing conferences.
The User Agent interface allows a user to start a call between two users and extend it into a conference by inviting other users to join. Each conference has its own moderator, who can invite users to the conference, remove users from it, or end the entire conference.
The VoIP Conference Server supports multiple conferences, so several conferences can run at the same time. However, a user can be a member of only one conference at a time.
Both the Server and the User Agent (UA) are written in Java and use the JMF package for handling (playing, sending, and receiving) voice. Connection setup and messaging between the Server and the UAs are handled using the SIP protocol.
Introduction
Motivation
There are many ways to implement conference calls using VoIP technology. Consider two possible implementations: a direct connection between the participating users, where the most capable participant (powerful computer, wide bandwidth) acts as the server, or a single dedicated server that receives chunks of data from the peers and sends them to the other conference participants. Each of these implementations has its own advantages and disadvantages. The first depends strongly on the computing power of the participants. The second does not, and it can give better results, but if the server fails, all existing calls are closed. The number of connections on each computer also matters: direct connection requires about n^2 connections for n users, whereas with a single server each computer has one open connection to the server.
Our goal is therefore to build a server that is independent of the users' machines, transparent to the user, and requires a minimum number of connections during the session.
Session Initiation Protocol (SIP)
The Session Initiation Protocol (SIP) is a signaling protocol used for establishing sessions in an IP network. A session could be a simple two-way telephone call or it could be a collaborative multi-media conference session. The ability to establish these sessions means that a host of innovative services become possible, such as voice-enriched e-commerce, web page click-to-dial, instant messaging with buddy lists, and IP Centrex services.
Over the last couple of years, the Voice over IP community has adopted SIP as its protocol of choice for signaling. SIP is an RFC standard from the Internet Engineering Task Force (IETF), the body responsible for administering and developing the mechanisms that comprise the Internet. SIP is still evolving and being extended as technology matures and SIP products are socialized in the marketplace.
The IETF's philosophy is one of simplicity: specify only what you need to specify. SIP is very much of this mould; having been developed purely as a mechanism to establish sessions, it does not know about the details of a session, it just initiates, terminates and modifies sessions. This simplicity means that SIP scales, it is extensible, and it sits comfortably in different architectures and deployment scenarios.
SIP is a request-response protocol that closely resembles two other Internet protocols, HTTP and SMTP (the protocols that power the World Wide Web and email); consequently, SIP sits comfortably alongside Internet applications.
Java Media Framework (JMF)
The Java Media Framework API (JMF) enables audio, video and other time-based media to be added to applications and applets built on Java technology. This optional package, which can capture, playback, stream, and decode multiple media formats, extends the Java 2 Platform, Standard Edition (J2SE) for multimedia developers by providing a powerful toolkit to develop scalable, cross-platform technology.
Project Goal
The main goals of the project are:
• Developing an independent server that supports voice conferences between multiple users (technically, receiving RTP streams and forwarding them to the appropriate users).
• Developing a user agent application for starting and managing conferences.
As this is an educational project, a further goal is:
• Getting familiar with the Java programming language and, in particular, the JMF package for handling multimedia.
The Solution
Introduction
The VoIP conference server aims to create a SIP-established VoIP conference call between multiple users. The project is based on an off-the-shelf SIP managing module and the JMF package. The first was written for the setup and maintenance of calls between two participants, and therefore had to be altered because of some changes to the SIP protocol that were made in this project.
The second is an official Sun Microsystems multimedia handling package, used in the project for managing real-time media.
In the next part we list the requirements of the project. The requirements are then illustrated by scenarios that describe how the application should behave. Finally, we describe the design of the solution to give a good feel for the system architecture.
Functional Requirements
Server Application
1. The Server application will support the creation of multiple conference calls.
2. A conference call will support the following actions for participants:
2.1. A participant (moderator) will be able to create a conference call.
2.2. A participant (moderator) will be able to invite another participant to join the conference.
2.3. A participant (moderator) will be able to disconnect another participant from a conference.
2.4. A participant will be able to leave a conference.
2.5. A participant (moderator) will be able to end a conference.
Client Application
The Client application will be able to act either as a call moderator or as a call invitee, and to perform all the actions described above that its role allows.
Each application will have the technical means to perform the actions described above; for example, the application will be able to use a predefined protocol to accomplish a given task.
Server Package
The server package implements the server application. It is separated into two logical units: the SIP handling unit and the RTP streams handling unit. The first is responsible for the administrative side of the server functionality, e.g. call setup, call management, and call teardown. The RTP streams handling unit is responsible for low-level handling of UDP packets, e.g. receiving them and forwarding them to the other call participants.
Architecture Overview
The Model
The Server application is the main block of our project. It is divided into two modules that are connected via a common database (see the Databases section). Each module is responsible for a different part of the application's functionality. The application is described below.
The Controllers
We have two main controllers. The SIP controller handles all SIP-related operations, including adding participants to and removing them from a conference call, notifying participants of status changes, and so on. It manages hash tables that keep the data in memory; in the future, if needed, the data may also be stored on disk. The RTP controller manages all operations related to UDP packets, such as receiving them and sending them out to the network.
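The receive-and-forward step of the RTP controller could look roughly like the sketch below. It is only an illustration under stated assumptions: the names RtpRelay, fanOut, relayOnce and the peersOf lookup function are hypothetical and do not come from the project, and the sketch uses plain java.net datagram sockets rather than whatever transport classes the actual server uses.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper: copies one received UDP packet to every peer of the sender. */
class RtpRelay {

    /** Builds one outgoing packet per peer, each carrying the payload of the incoming packet. */
    static List<DatagramPacket> fanOut(DatagramPacket in, List<InetSocketAddress> peers) {
        // Copy only the bytes that were actually received (offset and length matter).
        byte[] payload = Arrays.copyOfRange(
                in.getData(), in.getOffset(), in.getOffset() + in.getLength());
        List<DatagramPacket> out = new ArrayList<>();
        for (InetSocketAddress peer : peers) {
            // The payload array is shared between packets; it is never modified afterwards.
            out.add(new DatagramPacket(payload, payload.length,
                    peer.getAddress(), peer.getPort()));
        }
        return out;
    }

    /**
     * Receives one packet and forwards it to the sender's peers.
     * peersOf is an assumed lookup from the sender's address to the other
     * participants of the same conference. Returns the number of copies sent.
     */
    static int relayOnce(DatagramSocket socket,
                         Function<InetAddress, List<InetSocketAddress>> peersOf)
            throws IOException {
        byte[] buffer = new byte[2048];              // assumed upper bound for one RTP packet
        DatagramPacket in = new DatagramPacket(buffer, buffer.length);
        socket.receive(in);                          // blocks until a packet arrives
        List<DatagramPacket> copies = fanOut(in, peersOf.apply(in.getAddress()));
        for (DatagramPacket copy : copies) {
            socket.send(copy);
        }
        return copies.size();
    }
}
```

A server thread would call relayOnce in a loop on the socket bound to the RTP port; keeping fanOut separate makes the copying logic testable without opening a socket.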
The View
[Figure: the SIP module and the RTP module, connected through the Shared Data Structures]
Main Classes
As mentioned above, the main classes that perform most of the server functionality are the controller classes. These classes are the “manager” classes that control the flow of user requests.
Databases
Call structure
Each active session (conference) has its own unique Call structure instance containing the following fields:
Call structure{
//call id
call_id;
//participants list
userList;
//moderator id
master_id; }
Instances of this structure are kept in a hash table, with call_id serving as the key of the hash table entry.
User Node structure
Each session participant will have a unique UserNode structure:
User Node{
//ip address
ip;
//port
port;
//id of participated call
call_id; }
Instances of this structure are kept in a hash table, with ip serving as the key of the hash table entry. This makes it easy to determine which conference a user is participating in (for example, after receiving an RTP stream from a particular IP address) and who the other peers in the same conference are (for example, in order to forward the stream to them).
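The two structures and the lookups described above can be sketched in Java as follows. This is a minimal illustration, not the project's actual code: the class names CallRecord and ConferenceRegistry and the methods join, peersOf and moderatorOf are assumptions, the first user to join a call is assumed to be its moderator, and java.util.HashMap stands in for the hash tables mentioned in the text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One active conference: call id, moderator id, and participants list. */
class CallRecord {
    final String callId;
    final String masterId;                             // moderator, identified by IP here
    final List<String> userList = new ArrayList<>();   // participant IPs

    CallRecord(String callId, String masterId) {
        this.callId = callId;
        this.masterId = masterId;
    }
}

/** One participant: address, port, and the id of the call it takes part in. */
class UserNode {
    final String ip;
    final int port;
    final String callId;

    UserNode(String ip, int port, String callId) {
        this.ip = ip;
        this.port = port;
        this.callId = callId;
    }
}

/** Hypothetical wrapper around the two hash tables (calls by call_id, users by ip). */
class ConferenceRegistry {
    private final Map<String, CallRecord> callsById = new HashMap<>();
    private final Map<String, UserNode> usersByIp = new HashMap<>();

    /** Adds a user to a call; returns false if the user is already in some conference. */
    boolean join(String callId, String ip, int port) {
        if (usersByIp.containsKey(ip)) {
            return false;                              // one conference per user at a time
        }
        // Assumption: whoever creates the call entry becomes its moderator.
        CallRecord call = callsById.computeIfAbsent(callId, id -> new CallRecord(id, ip));
        call.userList.add(ip);
        usersByIp.put(ip, new UserNode(ip, port, callId));
        return true;
    }

    /** Returns the other participants of the conference the sender belongs to. */
    List<UserNode> peersOf(String senderIp) {
        UserNode sender = usersByIp.get(senderIp);
        if (sender == null) {
            return Collections.emptyList();            // unknown sender: nothing to forward
        }
        List<UserNode> peers = new ArrayList<>();
        for (String ip : callsById.get(sender.callId).userList) {
            if (!ip.equals(senderIp)) {
                peers.add(usersByIp.get(ip));
            }
        }
        return peers;
    }

    /** Returns the moderator of a call, or null if the call does not exist. */
    String moderatorOf(String callId) {
        CallRecord call = callsById.get(callId);
        return call == null ? null : call.masterId;
    }
}
```

Both lookups are single hash table accesses followed by a walk over one participant list, which is what makes the per-packet forwarding decision cheap.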
Use case scenarios
In order to start a conference and maintain it, moderators and regular callees communicate with each other via the Server by sending and receiving SIP messages. The following SIP messages can be sent between the Server and the UAs (only the main requests that influence the call are listed here; see the appendix for SIP request codes):
1. Creation of a new session:
An INVITE request is received from a new IP address: the Server starts a new session with the invite source as the session moderator (master) and the invitee as a session participant, as shown below:
[Sequence diagram between the UA session master, the Server, and the UA invitee. Labels shown: invite, invite, 100, 200, 200, ACK, ACK, SESSION.]
2. Adding a new participant to an existing session:
An INVITE request is received from the master of an existing session (i.e. an existing conference).
The only difference from the previous case is that the invitee is connected to the existing session; no new session between the master and the peers is created. The diagram is omitted since it is the same as in the previous case. The participants who are already in the session (other than the moderator) do not receive any SIP request, only an additional RTP stream from the new participant.
4. Removal of a participant from a session:
An INVITE request from the session master for a participant who is already in the session removes that participant from the session.
5. Session termination:
A BYE request received from the session master terminates the entire conference: the Server sends a BYE request to all participants and closes the conference. The diagram for this case is omitted.
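[Sequence diagram between the UA session master, the Server, and the UA invitee. Labels shown: 200, BYE (from user), ACK, ACK, BYE.]
[Sequence diagram between the UA session master, the Server, and the UA invitee. Labels shown: invite, BYE, 100, 200, 200.]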
6. User unavailable:
An INVITE request received from the moderator is relayed to a user who is away or who already participates in a conference (this one or another):
[Sequence diagram between the UA session master, the Server, and the UA invitee. Labels shown: invite, invite, 100, 400, 400, ACK.]
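The scenarios above amount to a small decision procedure on the Server side. The sketch below shows one way it could be written; it is an assumption-laden illustration rather than the project's implementation. The class SessionManager, its Outcome values, and the generated call ids are all hypothetical. Where the scenarios overlap, the sketch treats an INVITE from the master for a member of the same session as a removal (case 4) and an INVITE for a member of another session as "user unavailable" (case 6); it also assumes that only the master may invite and that a BYE from a regular participant means that this participant leaves.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical server-side bookkeeping for INVITE and BYE requests. */
class SessionManager {
    enum Outcome { SESSION_CREATED, PARTICIPANT_ADDED, PARTICIPANT_REMOVED, USER_UNAVAILABLE, NOT_ALLOWED }

    private final Map<String, String> callOfUser = new HashMap<>();        // user ip -> call id
    private final Map<String, String> masterOfCall = new HashMap<>();      // call id -> master ip
    private final Map<String, Set<String>> usersOfCall = new HashMap<>();  // call id -> member ips
    private int nextCallId = 1;

    /** Decides what an INVITE from fromIp for inviteeIp means. */
    Outcome onInvite(String fromIp, String inviteeIp) {
        if (fromIp.equals(inviteeIp)) {
            return Outcome.NOT_ALLOWED;                 // assumption: self-invites are rejected
        }
        String callId = callOfUser.get(fromIp);
        String inviteeCall = callOfUser.get(inviteeIp);

        if (callId == null) {                           // case 1: sender is not in any session
            if (inviteeCall != null) {
                return Outcome.USER_UNAVAILABLE;        // case 6: invitee is busy elsewhere
            }
            String newId = "call-" + nextCallId++;
            Set<String> users = new LinkedHashSet<>();
            users.add(fromIp);
            users.add(inviteeIp);
            usersOfCall.put(newId, users);
            masterOfCall.put(newId, fromIp);            // invite source becomes the master
            callOfUser.put(fromIp, newId);
            callOfUser.put(inviteeIp, newId);
            return Outcome.SESSION_CREATED;
        }
        if (!fromIp.equals(masterOfCall.get(callId))) {
            return Outcome.NOT_ALLOWED;                 // assumption: only the master invites
        }
        if (callId.equals(inviteeCall)) {               // case 4: invitee already in this session
            usersOfCall.get(callId).remove(inviteeIp);
            callOfUser.remove(inviteeIp);
            return Outcome.PARTICIPANT_REMOVED;
        }
        if (inviteeCall != null) {
            return Outcome.USER_UNAVAILABLE;            // case 6: invitee is in another session
        }
        usersOfCall.get(callId).add(inviteeIp);         // case 2: extend the existing session
        callOfUser.put(inviteeIp, callId);
        return Outcome.PARTICIPANT_ADDED;
    }

    /** Handles a BYE; returns the users to whom the Server must send BYE in turn. */
    Set<String> onBye(String fromIp) {
        String callId = callOfUser.remove(fromIp);
        if (callId == null) {
            return Collections.emptySet();              // sender is not in any session
        }
        Set<String> users = usersOfCall.get(callId);
        users.remove(fromIp);
        if (!fromIp.equals(masterOfCall.get(callId))) {
            return Collections.emptySet();              // a regular participant just leaves
        }
        Set<String> toNotify = new LinkedHashSet<>(users);  // case 5: master ends the conference
        for (String ip : toNotify) {
            callOfUser.remove(ip);
        }
        usersOfCall.remove(callId);
        masterOfCall.remove(callId);
        return toNotify;
    }
}
```

The sketch identifies users by IP address only, matching the hash table key used for the User Node structure; detecting a user who is "away" would need information the sketch does not model.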
User Agent Package
The user agent package implements the user agent. Its purpose is to allow a user to create, manage, and join conference calls. The package is event driven, i.e. different features are available in different scenarios.
Architecture Overview
Since all RTP and SIP messaging goes through the Server, the UA does not maintain a list of the users participating in the conference, so no special data structures are needed. The SIP and RTP modules are nevertheless independent (for the RTP module, see the description below). As mentioned before, the UA's behavior is event based: it reacts to user requests or Server instructions through the following handlers:
onCallIncoming - Callback function called when a new INVITE method arrives (incoming call)
onCallRinging - Callback function that may be overloaded (extended). Called when a 180 Ringing arrives
onCallAccepted - Callback function called when a 2xx arrives (call accepted)
onCallConfirmed - Callback function called when an ACK method arrives (call confirmed)
onCallRedirection - Overloaded callback function called when the remote user (callee) is busy, i.e. participates in another conference
onCallRefused - Callback function called when a 4xx arrives (call failure)
onCallClosing - Callback function called when a BYE request arrives
onCallClosed - Callback function called when a response arrives after a BYE request (call closed)
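The mapping from incoming SIP messages to these handlers can be sketched as below. This is a simplified illustration with assumed names: the CallEvents interface reuses the handler names from the list but drops all parameters, so it is not the signature of the SIP module the project is built on, and UaDispatcher and RecordingHandler are hypothetical. The text does not say which message triggers onCallRedirection, so the dispatcher leaves it uncalled.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, parameterless version of the handlers listed above (assumed shape). */
interface CallEvents {
    void onCallIncoming();
    void onCallRinging();
    void onCallAccepted();
    void onCallConfirmed();
    void onCallRedirection();
    void onCallRefused();
    void onCallClosing();
    void onCallClosed();
}

/** Example handler that only records which callbacks fired, in order. */
class RecordingHandler implements CallEvents {
    final List<String> events = new ArrayList<>();
    public void onCallIncoming()    { events.add("incoming"); }
    public void onCallRinging()     { events.add("ringing"); }
    public void onCallAccepted()    { events.add("accepted"); }
    public void onCallConfirmed()   { events.add("confirmed"); }
    public void onCallRedirection() { events.add("redirection"); }
    public void onCallRefused()     { events.add("refused"); }
    public void onCallClosing()     { events.add("closing"); }
    public void onCallClosed()      { events.add("closed"); }
}

/** Hypothetical dispatcher: turns SIP requests and responses into handler calls. */
class UaDispatcher {
    private final CallEvents handler;
    private boolean byeSent = false;    // true while waiting for the response to our BYE

    UaDispatcher(CallEvents handler) {
        this.handler = handler;
    }

    /** Call this after the UA itself has sent a BYE request. */
    void markByeSent() {
        byeSent = true;
    }

    /** Dispatches an incoming request by its method name. */
    void onRequest(String method) {
        switch (method) {
            case "INVITE": handler.onCallIncoming(); break;
            case "ACK":    handler.onCallConfirmed(); break;
            case "BYE":    handler.onCallClosing(); break;
            default:       break;       // other methods are ignored in this sketch
        }
    }

    /** Dispatches an incoming response by its status code. */
    void onResponse(int status) {
        if (byeSent && status >= 200) {
            byeSent = false;
            handler.onCallClosed();     // final response after our BYE
        } else if (status == 180) {
            handler.onCallRinging();
        } else if (status >= 200 && status < 300) {
            handler.onCallAccepted();
        } else if (status >= 400 && status < 500) {
            handler.onCallRefused();
        }
        // 100 Trying and other codes need no handler in this sketch.
    }
}
```

In a real UA the handler would update the GUI and start or stop the RTP module instead of recording strings.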
Main Classes
The MyListener class plays a crucial role in handling SIP requests, both those coming from the user and those coming from the Server. The user communicates with this class through a GUI that gives access to all UA features.
begins. When a voice conversation is terminated sending is not needed (and will not be received by anyone) and thread is terminated.
Future Development
This project provides a good base for further conference-related extensions. Since the base is ready, a variety of features can be added, such as:
• Sending voice instead of music files (UA extension)
• NAT support – make the Server and the UA work behind NATs (both UA and Server extensions)
• Authentication and registration – add authentication and registration features (both UA and Server extensions)
• Messaging between conference users (both UA and Server extensions)
• Multimedia content – video support (UA extension).
Good luck!!!
Appendix A
Used SIP Requests Codes
1xx—Informational Responses
• 100 Trying
• 180 Ringing
2xx—Successful Responses
• 200 OK
4xx—Client Failure Responses
• 400 User is unavailable, cannot connect
For full list of SIP codes see http://en.wikipedia.org/wiki/SIP_Responses
Literature
“Internet Communications using SIP” by Henry Sinnreich & Alan B. Johnston
“Java Media Framework tutorial”, can be found at
http://java.sun.com/javase/technologies/desktop/media/jmf/1.0/guide/index.html
MJSIP documentation can be found at
http://www.mjsip.org/doc/index.html
Plenty of additional material can be found on the internet, since SIP and VoIP messaging are widely used.