Appendix A. More About SIP

The SIP protocol

SIP (Session Initiation Protocol), defined in RFC 3261 (with various extensions), handles creation, modification and
termination of various media stream sessions over an IP network. It is used, for example, for Internet telephone calls
and distribution of video streams.
SIP also supports user mobility by allowing registration of a user and proxying or redirecting requests to the user’s
current location. This is performed by the user registering their presence at a machine with the central registrar. The
SIP registrar keeps track of the user, but doesn’t hold any information about which media streams the computers or
clients can manage. This is negotiated between the parties when initiating a SIP session.
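
The registrar's role described above can be illustrated with a minimal sketch. It is not a real SIP registrar: the class name Registrar, its methods and the example URIs are hypothetical, and it assumes a single contact per user, whereas an actual registrar may hold several. It only shows the idea of mapping a user's public address to a current location with a limited lifetime, without storing anything about media capabilities.

```python
import time


class Registrar:
    """Toy location service: maps a user's public address to one current contact."""

    def __init__(self):
        # address-of-record -> (contact URI, absolute expiry time in seconds)
        self._bindings = {}

    def register(self, aor, contact, expires=3600, now=None):
        """Record that `aor` is currently reachable at `contact` for `expires` seconds."""
        now = time.time() if now is None else now
        self._bindings[aor] = (contact, now + expires)

    def lookup(self, aor, now=None):
        """Return the current contact for `aor`, or None if unknown or expired."""
        now = time.time() if now is None else now
        binding = self._bindings.get(aor)
        if binding is None or binding[1] <= now:
            return None
        return binding[0]


registrar = Registrar()
registrar.register("sip:alice@example.com", "sip:alice@192.0.2.10:5060", expires=60, now=1000)
current = registrar.lookup("sip:alice@example.com", now=1030)
```

A proxy or redirect server would consult such a lookup to forward or redirect a request to the user's current location. Which media streams the two endpoints can handle is still negotiated between them when the session is set up.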

Why use SIP?

Today, two protocols for IP telephony exist: SIP and H.323. The H.323 protocol was originally
designed for video conferences over ISDN and is a mix of several protocols and standards for performing the
various phases of a connection. The SIP protocol was designed for general session initialization over the Internet.
Both protocols have the disadvantage (from a firewall point of view) of needing dynamically allocated ports for the
data transmission, but today no protocol supports tunneling arbitrary media streams.
When comparing the two protocols, there is one major drawback to the H.323 protocol: its lack of scalability. H.323
is mostly used in small LANs. When extending to world-wide IP networks, SIP has many advantages:
Loop detection
When trying to locate a user over several domains, loops can occur. H.323 has no support for loop detection,
which can cause network overload.
Loops are easily detected using SIP headers (the Via headers), as they list all proxies that have handled the SIP packet.
Distributed control
H.323 uses gatekeepers, which are devices used for handling call states and redirecting calls to aliases. As
every call is carried out statefully, the gatekeepers must keep a call state during the entire call. This
makes the gatekeepers a major bottleneck in the system.
There is also a need for a central point when performing multi-user calls, which means that someone must
provide this central point, and that this machine must be dimensioned for the size of the call.
SIP sessions are completely distributed, which removes the need for these central points.
Small connection overhead
Establishing a connection using H.323 takes about three times the data and turnarounds compared to
using SIP.
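
The loop detection mentioned above can be sketched as follows. This is a simplified illustration, not a complete implementation: the function names via_hosts and looks_like_loop and the host names are hypothetical, header parsing is reduced to the common single-line form, and a real proxy following RFC 3261 also compares the branch parameter to tell a genuine loop from a legitimate spiral. The sketch only checks whether the proxy's own address already appears among the Via headers of a request.

```python
def via_hosts(message):
    """Return the host[:port] of every Via header in a SIP message, in order."""
    hosts = []
    for line in message.split("\r\n"):
        if not line:
            break  # a blank line ends the header section
        name, sep, value = line.partition(":")
        # "v" is the compact form of the Via header name
        if sep and name.strip().lower() in ("via", "v"):
            for entry in value.split(","):
                # An entry looks like "SIP/2.0/UDP host:port;branch=..."
                parts = entry.strip().split(None, 1)
                if len(parts) == 2:
                    hosts.append(parts[1].split(";", 1)[0].strip().lower())
    return hosts


def looks_like_loop(message, own_sent_by):
    """True if this proxy's own address is already listed in the Via headers."""
    return own_sent_by.lower() in via_hosts(message)


request = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP proxy2.example.com:5060;branch=z9hG4bK2\r\n"
    "Via: SIP/2.0/UDP proxy1.example.com:5060;branch=z9hG4bK1\r\n"
    "To: <sip:bob@example.com>\r\n"
    "\r\n"
)
seen_before = looks_like_loop(request, "proxy1.example.com:5060")
```

Because every proxy adds its own Via entry before forwarding, a request that returns to a proxy it has already passed through carries the evidence with it, so no shared state between proxies is needed for this check.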
Apart from this, H.323 has some further disadvantages. As it uses many protocols, more ports need to be
opened in a firewall to let H.323 traffic through. SIP is a single protocol, which means that only one port has to
be opened for SIP traffic. For both protocols, however, more ports must be opened for the data streams.
SIP runs on both TCP and UDP (and, in fact, can be extended to run on almost any transport protocol), making it
possible to use UDP for large servers, thereby enabling stateless sessions. H.323 only runs on TCP, which as
already stated loads the servers by requiring state management.

SIP and firewalls

Using SIP through a firewall raises some problems.
SIP initiates sessions of other protocols. This means that when a SIP session has been started, various other
protocols are used as well, usually transmitted over TCP or UDP on some port. For a firewall, this is a problem, as it