Secure SIP: Do It Yourself
Rick van Rein, OpenFortress, 11 March 2008
SIP is the only digital telephony protocol that is likely to reach the same global span as POTS. But digital media are prone to undetected wiretapping, and cannot in general do without security/privacy measures. The lesson learnt below is that if we want such security, we should do it ourselves.
OpenFortress is active in areas of digital infrastructure, such as digital signing and encryption. SIP [1], being an important part of tomorrow’s infrastructure, has recently been a focal point for OpenFortress.
1 Why SIP?
Progress. There are numerous reasons why SIP is a good development for telephony, the most notable one being that POTS [2] development came to a grinding halt about the moment the first phone was plugged in. The phone system is hardly changing, while so many possibilities open up when going digital. How long have we seen predictions of video calling over POTS or ISDN [3]? And how quickly did digital phone systems add it?
Implementing new features for POTS is expensive, especially when more signals (voltage/frequency bursts) have to be incorporated into the system in a backward-compatible manner. A good example is the supposedly most demanded feature, which is to configure the number of seconds (or rings) before voice mail answers a call. This cannot be set because (in the Netherlands) KPN has supplied most subscribers with a device that flashes a light after four rings on an unanswered line, as an indication that there might be a voice mail in your box. These devices cannot tolerate any change in the number of rings before voice mail picks up. It’s the classical problem of backward compatibility on analog signalling lines.
In comparison, the digital voice mail system works much more intuitively: phones subscribe to voice mail notifications, and are told when the number of voice mails waiting changes. The phone shows this by flashing a special light or on a display. Multiple phones can signal the same voice mail box, and when the owner catches up with it, all those phones are updated so they can stop signalling the availability of voice mail. Such a system could not be designed in a backward-compatible manner with analog signals, but it is easy in SIP: just define a new method or a new header in an RFC that extends the SIP RFC and you’re done.
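To give an impression of how little the phone has to do: the notification mechanism in question is, as far as can be told, the message-summary event of RFC 3842 (an assumption; the text above does not name the RFC). The sketch below reads the body of such a notification. The function name and the sample body are made up for illustration, and only the two headers shown are handled.

```python
import re

def parse_message_summary(body: str) -> dict:
    """Parse a simplified RFC 3842 message-summary body (sketch only)."""
    summary = {"waiting": False, "new": 0, "old": 0}
    for line in body.splitlines():
        name, _, value = line.partition(":")
        name, value = name.strip().lower(), value.strip()
        if name == "messages-waiting":
            summary["waiting"] = value.lower() == "yes"
        elif name == "voice-message":
            # Format is "new/old", optionally followed by "(urgent_new/urgent_old)".
            match = re.match(r"(\d+)/(\d+)", value)
            if match:
                summary["new"] = int(match.group(1))
                summary["old"] = int(match.group(2))
    return summary

# Hypothetical NOTIFY body; a phone would keep its lamp lit while "waiting" is true.
sample = "Messages-Waiting: yes\r\nVoice-Message: 2/8 (0/2)\r\n"
print(parse_message_summary(sample))
```

Every phone subscribed to the same mailbox receives the same body, which is why they can all switch off their indicator at once.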
Interoperability. Interoperability is the main feature that makes POTS useful. When a phone is plugged into a wall socket, it can immediately reach all other phones in the world. Of all the digital systems, SIP is the only one that stands a chance of being as interoperable as POTS. The ITU-designed H.323 protocol does not seem to catch on as SIP does, and other protocols are either proprietary (Skype) or they are SIP-in-a-jacket (MSN [5], Jingle [6]). SIP is designed to be interoperable in much the same way as the email system is interoperable: new addresses and server nodes can be added by just about anyone.
[1] SIP=Session Initiation Protocol, an Internet standard for setting up media connections.
[2] POTS=Plain Old Telephone System, a nickname for PSTN=Public Switched Telephone Network.
[3] ISDN=Integrated Services Digital Network, often nicknamed ISDN=It Still Does Nothing.
SIP is also not “interoperable” in the way ISDN is, which is implemented in devices that “add features” that only work with devices from the same manufacturer. ISDN held great promise, but never delivered on it. SIP on the other hand is delivering today. Special features are added through the public RFC process of IETF [7] and manufacturers of SIP devices actually make an effort to support interoperable use of their hardware. This leads to configuration kludges such as device-bug-compensating options, but at least interoperability is happening.
SIP is different from POTS in that the phones are a lot smarter, and a lot of the functionality that is located in POTS switches now ends up in phones. To connect to another phone, the basic need is an interoperable manner of communicating and locating others. Nothing the Internet can’t handle. In addition, a publicly advertised contact address must be serviced, for which a multitude of solutions exist today.
Quality. Speech quality of a 100% digital call is superb, given a quality codec such as A-law or µ-law and a phone that cancels its own echo. Since the phone can know or learn its own acoustics, it can do just the right thing to prevent speaker output that lands in the microphone from being sent out again. Echo cancellation is also done in good-quality analog phones, albeit in an analog way.
The longer voice delays caused by buffering and travelling over the Internet are acceptable for voice communication, but the other end must not bounce back part of the voice data like a cheap analog phone does, because that can annoy the speaker, who hears himself back. It can be a burden of mixed digital/analog telephony to hear the (lack of) quality of a phone on the analog end. This is in part due to the longer transit times, because our brains stop ignoring echo after about a tenth of a second.
Reliability. Another aspect of POTS calling is its robustness. The system hardly ever suffers from problems. Rolling digital endpoints into homes is not helping that, but this robustness is definitely something to match. SIP can do that, using multiple servers to cover for one domain. Just like we are replacing mainframes with clustered PCs, we can also replace POTS with a redundant set of SIP servers.
Mobility. An advantage of SIP is that I can take my phone with me wherever I go. It is possible to answer +31.534782239 on a beach in the Bahamas. It had better not be a video chat though...
Also, multiple SIP phones can connect to the same “line” (or phone number, or SIP account). These phones need not all be located in the same place. Routing gets much more flexible if it is taken care of by the Internet, rather than by a POTS operator!
Cost. VoIP is often marketed as a cheaper way of calling. Sure, there is less infrastructure to maintain for the telco, but in practice it is still used as a mechanism to connect to the rigid bureaucracy of POTS. As a gentle form of customer lock-in, telcos often provide free calls between customers who use the same product, but there usually is no interoperability with other digital networks.
Even though we usually prepay our Internet bandwidth, even though voice traffic is not nearly as bulky as our daily downloads, telcos still force us through POTS to connect to other telcos. Why? Simply because we’ve grown up thinking of phone calls as being charged per minute. And so there is a profit to be made.
[5] MSN=Microsoft Network, a popular chat protocol outside professional circles.
[6] Jingle is the protocol of GoogleTalk.
[7] IETF=Internet Engineering Task Force, the Internet standards body.
At the end of March, ENUM [9] is being introduced in the Netherlands, but it is not likely that telcos will support this attack on their revenue. Telcos do not even offer an incoming digital access point, with the exception of xs4all and BudgetPhone. In addition, it is unlikely that telcos will be offering ENUM bypasses for the POTS calls that make them so much profit.
This is so annoying that OpenFortress is working on an easy-to-use setup that will enable home users to make increasing use of digital bypasses by simply using their home phones. To avoid an underground look and feel and to ensure longevity, we will be charging for digital connection buildup, but never any cost per minute. What we are hoping to achieve is to become such a commonplace alternative that telcos will see themselves forced to build in public-access SIP interfaces and to use ENUM for outgoing traffic. SIP was never intended to be confined! Visit 0cpm.nl for details.
2 Why Secure SIP?
Is advocating Secure SIP a helping hand to terrorists? Nah, terrorists are evil, so they probably use a much more evil OS than anything UNIX-based, so they wouldn’t appear at NLUUG. More seriously though, it is common knowledge (and common sense) that it is straightforward to conceal one’s footprints. An example technique can be found at
http://openfortress.nl/essay/crypto-abuse/
Privacy. We’ve all seen James Bond movies, where bald crooks switch on the scramble mode on their phones while they stroke a fluffy cat. With SIP, this is all possible. Since world leaders resort to things like phone tapping to give an impression that they’re on top of terrorists, privacy is increasingly becoming a concern that we should deal with ourselves. In the Netherlands, every telco must register with OPTA and support undetectable wire tapping for the Ministry of Justice, even if they only run a publicly usable SIP server.
Another situation where privacy becomes a concern is when making a phone call from the network of a (prospective) customer. It is possible to discuss what discounts that customer can get, and be tapped by the network admin who informs his boss how far he can go in subsequent negotiations. We may like that telephony is moving into our digital realm, but it also moves into the realm of improper conduct.
Aside from that, if we wish to provide good service to customers, we could do them the service of encrypted phone calls. An international customer in, say, the banking business may not be inclined to accept our country’s wire tapping facilities when he wants to discuss site security. Or, to a lesser extent, when trade secrets are being exchanged.
Authenticity. Another matter of concern is knowing the remote party is the proper party. A caller wants to know that his call ends up with the intended contact, and the called party wants to know if the caller ID popping up in the display is reliable. In this respect, POTS is much more reliable than SIP without added security: anyone can claim any phone number with SIP, while POTS uses its hierarchical structure to control where caller ID values may be injected on the network.
[9] ENUM=Mapping from international phone numbers to contact info.
3 What forms of Secure SIP?
There is a wide range of secure extensions to SIP, but they are all difficult in practice, as well as unappealing for a SIP service provider. In practice, you will end up doing it yourself if you want to employ Secure SIP.
SIP and RTP. SIP is most often used to initiate RTP [11] sessions. As an alternative, SRTP [12] sessions can be set up, using AES [13]. But to get the two end points in line, they must exchange an AES key beforehand. This is usually done through SIP, while setting up the RTP session.
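One common way to carry that key in the session description is the a=crypto attribute of RFC 4568 (SDP Security Descriptions). That this is the mechanism meant above is an assumption, as the text does not name one. A minimal sketch, with made-up function names, of building and reading such a line for the AES_CM_128_HMAC_SHA1_80 suite:

```python
import base64
import secrets

def make_crypto_attribute(tag: int = 1) -> str:
    """Build an SDP a=crypto line carrying a fresh SRTP master key and salt."""
    # AES_CM_128_HMAC_SHA1_80 uses a 128-bit master key and a 112-bit master salt.
    key_and_salt = secrets.token_bytes(16 + 14)
    inline = base64.b64encode(key_and_salt).decode("ascii")
    return f"a=crypto:{tag} AES_CM_128_HMAC_SHA1_80 inline:{inline}"

def read_crypto_attribute(line: str) -> tuple[bytes, bytes]:
    """Recover (master_key, master_salt) from a line built above."""
    key_params = line.split(" ")[2]
    if not key_params.startswith("inline:"):
        raise ValueError("only inline keys are handled in this sketch")
    # Optional lifetime and MKI fields after "|" are ignored here.
    raw = base64.b64decode(key_params[len("inline:"):].split("|")[0])
    return raw[:16], raw[16:]

offer_line = make_crypto_attribute()
key, salt = read_crypto_attribute(offer_line)
print(offer_line)
print(len(key), len(salt))
```

Note that the key travels in the clear inside the session description, so the media is only as private as the signalling that carries this line.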
Login. When trying to register a phone or to initiate a call, it is common to reject the attempt with a challenge. After that, a second attempt is made, this time adding a response to the challenge. This challenge/response scheme verifies if the initiator of the request has the right to make the call.
This form of login is a pragmatic attempt to shield off very crude attacks, but it is not good enough to thwart men in the middle who modify the message to suit their needs. Since money is involved, this means that more security is no luxury.
That being said, at least the password itself is well enough protected under RFC 3261 to travel safely in unencrypted packets. That is, replay attacks and account hijacking are not very likely.
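The challenge/response scheme of RFC 3261 is the Digest scheme borrowed from HTTP (RFC 2617). A minimal sketch of the computation follows; the function names are made up, and the user, password and nonce in the example call are hypothetical values.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Hex MD5 digest of a string, as used throughout Digest authentication."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def digest_response(username, realm, password, method, uri, nonce,
                    nc=None, cnonce=None, qop=None) -> str:
    """Compute the Digest 'response' value for a challenge (RFC 2617)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # the password never leaves as-is
    ha2 = md5_hex(f"{method}:{uri}")                 # only method and URI are covered
    if qop:
        return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")

# Hypothetical REGISTER attempt; every value below is made up for illustration.
print(digest_response("alice", "example.com", "secret", "REGISTER",
                      "sip:example.com", "84a4cc6f3082121f32b42a2187831a9e"))
```

The server picks a fresh nonce for each challenge, which is what makes replaying an old response unattractive. At the same time only the method and the request URI enter the hash in this form, so the rest of the message can still be altered by a man in the middle.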
TLS. Most SIP URLs [14] look like sip:[email protected] and are connected through either UDP or TCP. The secure alternative is sips:[email protected], with the great advantage of visible security. Unfortunately, the constraints are not very stringent.
The sips: protocol implies TLS encryption between hops, at least to the domain name being addressed. What happens behind that domain name’s SIP server is the responsibility of the recipient, and may actually not be encrypted. The idea is that LAN routing could perhaps be considered secure enough, and if the receiving server resides on the same LAN as the eventual phone it would be possible to receive secure calls on a phone that does not support such calls. The catch, however, is that it is the hop-to-hop links that are encrypted by TLS. Although the authentication is two-way, it will rely on lists of trusted root certificates, which may vary from domain to domain. Furthermore, the authentication is not based on end-user credentials, so the reliability waters down when the message hops up- or downstream.
Most notably, the security of sips: URLs does not imply security of the user at the other end, as its format intuitively suggests (to me at least). It validates servers and domains, not users. So a URL like sips:[email protected] does not add any security.
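The point that sips: says something about the domain and nothing about the user can be made concrete with a small sketch. The helper names and the addresses (alice, mallory, example.com) are hypothetical, and the parsing is deliberately crude: ports and parameters are stripped, IPv6 literals are not handled.

```python
from typing import NamedTuple, Optional

class SipUrl(NamedTuple):
    scheme: str
    user: str
    host: str

def parse_sip_url(url: str) -> SipUrl:
    """Split a basic sip:/sips: URL into scheme, user part and host."""
    scheme, _, rest = url.partition(":")
    scheme = scheme.lower()
    if scheme not in ("sip", "sips"):
        raise ValueError("not a SIP URL: " + url)
    user, _, host = rest.rpartition("@")
    host = host.split(";")[0].split(":")[0].lower()
    return SipUrl(scheme, user, host)

def name_checked_by_tls(url: str) -> Optional[str]:
    """Name a server certificate would be matched against; None for plain sip:."""
    parsed = parse_sip_url(url)
    return parsed.host if parsed.scheme == "sips" else None

print(name_checked_by_tls("sips:alice@example.com"))
print(name_checked_by_tls("sips:mallory@example.com"))
print(name_checked_by_tls("sip:alice@example.com"))
```

Both sips: addresses lead to the same check, because the user part plays no role in it.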
S/MIME. It is possible to establish end-to-end security by encrypting the attachment that describes the upcoming connection, including SRTP keys. This works through S/MIME attachments instead of plain SDP attachments.
In an ideal world, proxies between the end points should only need the SIP headers. In a more pragmatic world such as ours, proxies tend to help out with NAT problems and may need to investigate SDP as well. This may lead to voice connection problems when calling securely from behind a NAT that needs such help. (The aforementioned ideal world would run IPv6 and would have gotten rid of NAT altogether.)
[11] RTP=Real-time Transport Protocol, a mechanism to exchange time-critical data such as interactive voice.
[12] SRTP=Secure Real-time Transport Protocol, an encrypted form of RTP.
[13] AES=Advanced Encryption Standard, the generally advised symmetric encryption algorithm.
[14] URL=Uniform Resource Locator. SIP locates rather than names, so URL is the most specific subclass of URI and is proper to use here.
4 Public Support for Secure SIP
It is highly unlikely that public parties (such as commercial SIP-providers) are going to embrace Secure SIP. And it is unlikely that you would be interested, given the motivations for Secure SIP in the foregoing.
TLS. First off, Dutch telecom laws do not mind encryption, as long as it is removed before a tapped conversation is shipped to the Ministry of Justice. Other countries probably have similar laws. Luckily, the law doesn’t go so far overboard as to demand key escrow, but the telco has no way of hiding from its responsibilities. This means that sips: encryption does not really advance privacy. As discussed before, it does not do much for authenticity either.
TLS is also a much heavier protocol than plain SIP, which normally does not even run on TCP, but on UDP. Although SIP sends several packets of a few kB for every signalling change, the use of UDP makes it a very efficient protocol. TLS takes a lot more traffic, and a lot more computing power. Some optimisation is possible with caching, or by sharing connections in one SCTP-over-TLS connection, but it is still a futile attempt at privacy.
Having said that, sips: is not an optional part of RFC 3261:
Proxy servers, redirect servers, and registrars MUST implement TLS, and MUST support both mutual and one-way authentication. It is strongly RECOMMENDED that UAs be capable initiating TLS; UAs MAY also be capable of acting as a TLS server. Proxy servers, redirect servers, and registrars SHOULD possess a site certificate whose subject corresponds to their canonical hostname. UAs MAY have certificates of their own for mutual authentication with TLS, but no provisions are set forth in this document for their use. All SIP elements that support TLS MUST have a mechanism for validating certificates received during TLS negotiation; this entails possession of one or more root certificates issued by certificate authorities (preferably well-known distributors of site certificates comparable to those that issue root certificates for web browsers).
S/MIME. A pressing concern for use of S/MIME is that many proxies between end points tend to constrain the length of a SIP message to a fairly tight maximum. This maximum does not always leave a lot of spare room, certainly not for bulky attachments in S/MIME. This means that all sorts of things can start to fall down when using this approach.
Direct connections between end points would circumvent such light-headed intermediates, at the expense of the advantages that intermediate proxies bring in terms of flexibility, NAT support and so on. But when setting up a direct connection, it is perhaps much simpler just to use a sips: URL.
DNS. An important concern to take into account with Secure SIP is DNS. As we all know, this system can easily be spoofed on a local scale. DNSsec would be the solution, and as SIDN tells us, they only need to roll out the final form of an experimental setup for NLnet Labs that has worked well in the past. This must be a good argument, because SIDN has been using it for years.
SIP uses SRV records to point to a server, so there is a layer of dynamicity to get to a server. Dynamicity over an easy-to-spoof protocol spells disaster, so this is definitely a place where DNSsec would aid in getting some security.
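To show what that layer of dynamicity looks like to a client, here is a sketch of ordering SRV answers along the lines of RFC 2782: lowest priority first, weighted random choice within one priority. The record values and host names are hypothetical, and the function is a simplification rather than a full resolver.

```python
import random
from typing import NamedTuple

class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str

def order_srv(records, rng=random):
    """Return SRV records in the order a client would try them."""
    ordered = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            # Zero-weight records go first so they keep a small chance of selection.
            group.sort(key=lambda r: r.weight != 0)
            pick = rng.randint(0, sum(r.weight for r in group))
            running = 0
            for record in group:
                running += record.weight
                if running >= pick:
                    ordered.append(record)
                    group.remove(record)
                    break
    return ordered

# Hypothetical answer to an SRV query for a SIP domain.
answers = [
    SrvRecord(20, 0, 5061, "backup.example.com"),
    SrvRecord(10, 60, 5061, "sip1.example.com"),
    SrvRecord(10, 40, 5061, "sip2.example.com"),
]
for record in order_srv(answers):
    print(record.target, record.port)
```

Nothing in this procedure authenticates the answer: whoever can forge the SRV reply decides which host the call is sent to.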
Device support. Not all devices support Secure SIP. Some professional phones do (Cisco/Linksys, Mitel) but not all (Innovaphone, GrandStream, Siemens, Aastra, Polycom) and consumer devices certainly don’t (SMC, Tiptel, GrandStream).
5 How to get Secure SIP
Getting Secure SIP is not as easy as it would seem. But everything changes when you start doing it yourself.
The first thing to do is to ask SIDN about DNSsec, and as soon as it is implemented, to add it to your domain. For now you will have to settle for an insecure SRV record pointing to your server(s) from _sips._tls under your domain, and possibly also for _sip._tcp and _sip._udp.
With SIDN hard at work on DNSsec, set up a server with SER from iptel.org or its fork OpenSER from openser.org. The reason why these two forked is a bit of a mystery. It is not because of licences such as we normally see, merely a matter of splitting who controls the code. But at the same time the code is kept fairly consistent between the two.
SER/OpenSER is a proxy, meaning a relay station for SIP messages. In doing its work, it can pass information from one transport protocol to another. SER/OpenSER supports all of the UDP, TCP and TLS transports. So if you want to link to Asterisk (which only supports SIP over UDP), you could use SER to translate traffic to UDP. Doing this as locally as possible is all it takes. A basic setup for SER would suffice; all it needs to support is registrations from phones and passing incoming calls to those phones. Example scripts are provided as part of the package. Just beware that SER assumes you understand SIP and its routing implications.
If your phone can handle TLS traffic directly, and you open up a hole in any NAT from the outside to your phone (for RTP, and possibly also for SIP), you could even bypass the SER proxy server and have the DNS SRV record point directly to your phone’s external location.
Finally, you will need to install a certificate for your domain name. Get one for free from CAcert or purchase one elsewhere. You should install it in your SER proxy or, with a directly connected phone, in the phone itself.
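The DNS side of this setup can be sketched as a few zone-file lines. The helper below only renders text; the domain, target host, TTL, priority and weight are hypothetical. The owner names follow RFC 3263, which uses _sips._tcp for SIP over TLS; that is an assumption about what your clients look up, so adjust the table if your software expects another label.

```python
def srv_zone_lines(domain: str, target: str, ttl: int = 3600) -> list[str]:
    """Render SRV records for a SIP domain in zone-file syntax (sketch only)."""
    services = [
        ("_sips._tcp", 5061),  # SIP over TLS
        ("_sip._tcp", 5060),   # plain SIP over TCP
        ("_sip._udp", 5060),   # plain SIP over UDP
    ]
    # Fields after SRV are: priority, weight, port, target.
    return [
        f"{label}.{domain}. {ttl} IN SRV 10 0 {port} {target}."
        for label, port in services
    ]

for line in srv_zone_lines("example.com", "sip.example.com"):
    print(line)
```

The target should be the host name that also appears in the certificate, since that is the name a calling party ends up connecting to.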
6 Conclusion
Secure SIP is going to be just like secure DNS and secure whatever: Everybody needs it, but users are clueless about the risks they run, so nobody is asking for it, and nobody is selling it.
Given telecom laws such as the Dutch ones, Secure SIP cannot even be sold as a hosted service, but it is possible for any company to hire a contractor for a dedicated setup.
As so often with privacy and security matters, we will have to take care of it ourselves, or accept the situation as it is.
Offering a Secure SIP address is not difficult. Aside from a domain-named certificate you only need a DNS SRV record pointing to a registration server that can translate traffic between TLS and TCP/UDP protocols.