Session Initiation Protocol
1 author:
Tuomas Nurmela
Aalto University
All content following this page was uploaded by Tuomas Nurmela on 22 March 2014.
Abstract: Session Initiation Protocol, SIP, provides control-plane signaling for IP networks. SIP enables initiating, modifying and terminating sessions for a user, while maintaining neutrality to physical media capabilities and using other protocols to negotiate these. SIP assumes that the transport layer is inherently unreliable and as such provides transport layer mechanisms. For target device discovery SIP requires the use of application layer routing. Besides these, the protocol is extensible and has already been extended to support the IETF presence framework and instant messaging. However, in order to perform in its core area, IP telephony call signaling, in regards to PSTN-IP Telephony integration, the protocol requires further work especially in the area of emergency calls. 3GPP has decided to use SIP for signaling and work is ongoing to meet 3GPP network and IP multimedia system requirements.

Keywords: SIP, SIPPING, state of standardization of SIP, SDP, session application layer routing, emergency calls

I. INTRODUCTION

New multimedia application needs drive towards new functionalities in the IP network. This is coupled with continuing pressure to enable IP-based Internet Telephony in order to avoid having network providers replace the aging telephony networks with new dedicated hardware. All this coincides with huge amounts of Internet fiber overcapacity for which demand is hard to find.

IETF has been trying to adjust to the new conditions. As an answer to these challenges IETF has been developing a new multimedia architecture. The architecture aims to be flexible enough to support various application needs as well as deployable, in order to enable incremental transfer to production by standardizing interoperability mechanisms.

The multimedia architecture capabilities in Internet and Wireless networks are closely linked to IETF efforts to develop a flexible Quality of Service architecture [6] as well as ongoing research efforts in multicast protocols [7]. However, intra-company networks or close to core networks with high bandwidth overcapacity allow limited deployment without requiring either.

Multimedia handling, with its soft real-time or hard real-time requirements depending on target of use, is complicated to do on packet switching networks, since Quality of Service cannot inherently be determined. However, packet switching is essential in order to maintain a scalable, fault-tolerant system without need of resorting to expensive special-purpose equipment.

The new architecture is an evolutionary step extending the current TCP/IP protocol family. It could be said it is somewhat of a compromise, forcing middleware and systems engineers to choose correct combinations of the whole stack instead of just safely choosing one of the alternative transport layer protocols and being done with it, as with the basic Internet TCP/IP-architecture.

The Session Initiation Protocol, SIP [19], is one of the protocols used in the IETF multimedia architecture. The architecture includes a number of other protocols, such as the Real-time Transport Protocol (RTP) [1] for transporting real-time data and providing QoS feedback, the Real-Time Streaming Protocol (RTSP) [2] for controlling delivery of streaming media, the Media Gateway Control Protocol (MGCP) [3] or the joint ITU-T and IETF developed Megaco, also called H.248 [4] [5], used for controlling gateways to the Public Switched Telephone Network (PSTN) and the Session Description Protocol (SDP) [10] for describing multimedia sessions.

SIP is a text-based application-layer control protocol that is mainly used to establish, modify, and terminate multimedia sessions e.g. Internet Telephony calls. However, SIP is limited neither to specific devices, supporting also pagers, laptops etc, nor to a specific call type, supporting one-to-one as well as multiparty conferences. While applications of SIP involve mainly human-to-human communication, SIP design clearly addresses the needs of a generic user, enabling anything with an address to participate.

The establishing phase supports locating the user, negotiating whether the party wishes to accept the call and what the supported and required features of the communicating parties and communication media are. SIP does not define the actual session attributes, treating them as opaque payload data in order to remain independent of the communication media capabilities. Modification of session includes changing parameters of
the session, inviting additional participants to conference calls and invoking available data-plane services.

Currently SIP is intended to address multiple needs: these include IP telephony special needs such as supporting caller availability status change, emergency call connectivity as well as supporting the IETF presence framework and new applications such as instant messaging. Originally this was not the case: the primary focus was audio and video conferencing over the Internet prior to carrier-grade signaling needs [55].

To manage SIP development in a controlled manner IETF currently has two main working groups (WGs): the SIP WG concentrates on basic functionality of SIP and its extensions to ensure the protocol suitability is considered in areas where it will be applied, while the Session Initiation Proposal Investigation (SIPPING) WG concentrates on evaluating and prioritizing SIP special needs and multimedia requirements, documenting SIP extension requirements and forwarding these to the SIP WG for standardization. While SIP is currently at its version 2 defined in 2002 in [19], three years after the initial RFC [15], with over 100 documents (RFCs, internet-drafts and working papers) the question of managing the development still lingers in the air.

In addition to the basic SIP development, there is work done in the IP Telephony WG to integrate SIP and PSTN signaling, in the Geographic Location Privacy (GEOPRIV) WG to extend user location-based service to cover geographical location and in the Authentication, Authorization and Accounting (AAA) WG to support and finalize SIP security issues.

Besides IETF activities, the 3rd Generation Partnership Project, 3GPP, has adopted SIP as a mandatory protocol for handling signaling in IP multimedia services provided to 3G devices. This assures SIP deployment to millions of phones.

Two active forums are associated to SIP: the SIP Forum [8] promotes general awareness by providing information about SIP whereas SIP Center [9] promotes [...]

[...] including description of APIs, key SIP extensions, key differences to related protocols such as ITU H.323 and Cisco SCCP and short summaries on work done to integrate SIP to PSTN and 3GPP IP multimedia systems. Section VI draws conclusions regarding SIP.

As a clarification to terminology, the paper uses call, session and conference (multiparty session) interchangeably, mostly depending on context at a given time. When not otherwise referred, [19] is used as the source.

II. BASICS OF SESSION INITIATION PROTOCOL

This section describes the typical participants to a SIP infrastructure. This is followed with an introduction to SIP messaging, routing of requests in session establishing and SIP transport mechanisms to e.g. provide reliability to UDP-based messaging. The section concludes with basics of SDP and ends with an example of SIP flow.

A. SIP components

SIP has four logical entity types (user agents, registrars, redirect servers, proxies) and an abstract service known as the location service. SIP doesn’t define how logical entities are implemented or deployed: a SIP element can include multiple entity types. Basic network services such as DHCP (for boot-strapping) and DNS (for name-to-IP address, port and transport protocol resolution) [21] are also required. Each entity that actively participates is said to have a core, an identity. The abstract location service on the other hand is used by SIP but not defined by it. Figure 1 provides a possible configuration of a SIP capable network.

[Figure 1: possible configuration of a SIP capable network. Surviving labels: Redirect Server / Proxy Server / Registrar; DNS / DHCP]
User agents (UA) have two roles: a Client (UAC) that issues requests and receives responses and a Server (UAS) that receives requests directed to it and issues responses by either accepting, rejecting or redirecting the request. A SIP user can be represented by multiple SIP addresses, each of which can point to multiple devices. A device can be accessible through multiple SIP addresses.

The SIP address is similar to an email address and is assumed to remain stable in relation to how it is defined: it can be given by a network provider (e.g. tuomas@inet.fi), be in relation to one’s work role (e.g. admins@cs.helsinki.fi) or affiliated organization (e.g. tuomas@sonera.com). The address changes when e.g. the user changes the network provider, moves to another job or changes organization, not necessarily when the user switches location. For e.g. temporary change of location purposes, the user can have multiple SIP addresses and redirect calls to the current location.

As a SIP address can concurrently relate to multiple devices, a SIP request has to be able to fork. This is something that no other signaling protocol currently does. User agents have to be implemented in a way that they can manage multiple responses to a single request, although under normal one-to-one circumstances, they receive only a single response to requests.

Registrars are responsible for maintaining User Agent access information based on User Agents informing on modification needs with a specific request containing the SIP address and the contact addresses i.e. IP-addresses of the devices bound to the SIP address. The registrars accept requests that are targeted to SIP addresses within their managed domain and communicate these onward to the Location Service that maintains this information.

Proxy servers are intermediaries used mainly for routing requests to another target that must be closer to the final target than the proxy. Proxies also allow policy enforcement and rerouting of requests.

One way of classifying the proxies is by the location of the proxy in the path from the UA to the target UA. The closest proxy to the UAC is the outbound proxy, while the closest proxy to the target UAS is the inbound proxy. All proxies in between these two are the intermediate proxies.

Another way of classifying the proxies is the statefulness. Stateless proxies simply forward requests and responses without actively generating new types of request and response messages. Stateful proxies on the other hand act as UASs: they respond to UAC requests with the best response out of the possible UASs’, which is closest to the UAC’s requirements. To find multiple answers the stateful proxy can fork the original UA request to two or more destinations. The forking can be done in unicast or in multicast to e.g. provide better support for automatic call distribution (ACD) systems. The stateful proxy groups “best” responses (i.e. responses that allow the UAC to continue the session establishing process) in a response context from which it chooses the final response based on its response precedence rule-set. The proxy can cancel all non-suitable responses, i.e. errors or responses that were not selected due to a better response, in order to keep state management down to a minimum in the SIP network.

Stateful proxies can be further divided into call stateful proxies, which maintain state of the entire call, from the establishing to the termination of the call, and transactional stateful proxies that maintain state of at least a single request of the UAC. As such all call stateful proxies are transactional, but the reverse doesn’t apply. The concept of transaction in SIP is explained later in Section III.C.

Redirect servers manage redirecting contacts to UAs that are out of the registrar’s domain. Redirect servers can be used to redirect callers to another SIP address in order to avoid having a SIP user know all SIP addresses of the target user. Redirections are done by a specific status-code in the reply, like in HTTP/1.1 [63].

In addition to user availability, it must be remembered that proxies are the centralized component in the SIP architecture. One server is likely to handle a small-to-medium deployment, but multiple proxies are likely to be needed in large domains. To alleviate possible problems, the redirect servers can provide another SIP address for the UAS in order to direct the UAC to use e.g. another proxy path.

Location Service is a database that contains the bindings from a SIP address to a list of contact IP addresses. The location service is used by proxy and redirect services to locate the UAS and by the Registrar to update UA location information. The location service also maintains user level availability and preferences as well as contact address-specific capabilities. Contact-address specific state of the device (e.g. whether turned on mute, not connected etc) is not maintained in the location service.

B. SIP Messages

SIP request and response message form resembles HTTP/1.1, consisting of a request or status line, header fields and an optional entity body.

The basic SIP 2.0 defines six request methods:

REGISTER-request is used to provide location information by the UA. The method is passed
periodically to a Registrar that updates the Location Service.

INVITE-request is used to establish a session. Because the invitation can lead to a long pause before e.g. the target party answers the phone, the method is linked to a separate additional reliability mechanism, provided by the ACK-request.

ACK-method is used by the caller to confirm reliable INVITE-request exchange to the UAS, somewhat analogous to the TCP three-way handshake. The use of the method is independent of the transport protocol used.

OPTIONS-request enables negotiating session options without requiring establishing of a session. This enables both caller preferences (e.g. if in a shower and a phone with a video-capability rings, one may want to turn the video-transfer off despite phone capabilities. Likewise choosing a language is typically something that can be useful for text-based communication or when calling a work role-based SIP address) and device preferences (e.g. which authentication protocol should be used, what algorithm is used for payload encoding compression etc).

CANCEL-request is used to terminate requests due to e.g. request forking. The use of the method doesn’t affect an ongoing session.

BYE-request is used to terminate a session. The request is valid if the requestor has already established the session or is negotiating the establishing.

Extensions to SIP describe a SUBSCRIBE-request used to indicate interest in knowing when the party is available and a NOTIFY-request for informing of status changes [16]. Implementations supporting the additional messages can automatically handle informing a person when the called party becomes available by sending a NOTIFY-request after state modification by a REGISTER-request. In order to seamlessly work with the PSTN system, a separate PSTN-Internet Networking (PINT) Server logical entity is defined to communicate the methods to and from the PSTN and VoIP networks.

Additional extensions describe the UPDATE-request [23] that is used to modify the session either during INVITE-request exchange prior to final ACK or after the INVITE-request has resolved. However, since the UPDATE-request is not allowed to affect dialogue state (see below), specific rules apply to how it must be used.

Header fields manage device caller id’s, content type, loop-prevention, packet-order handling, party identifiers, and SIP routing. Header fields can often be expressed in a compact form and don’t require a specific order (excluding internal order of stackable header fields like those used in routing). Certain header fields can contain parameters that identify extensions to SIP. While standardization of parameters provides support for interoperability between vendors, it also provides means for proprietary functionality.

For now, the most common parameter used is the tag parameter that is contained in the From- and To-header fields as a random local session identifier, that can, with the globally unique identifier in the Call-ID-header field, identify a peer-to-peer relationship called a dialogue. Since a flow state cannot be established with UDP, dialogue identifiers with a separate message ordering mechanism are used to help in message sequencing and proper routing, as intermediate SIP elements can distinguish SIP dialogue state [19, pp.69].

Section II.C contains further information on how application-layer routing is performed with header fields. Section II.D provides information on the header field used for message ordering.

Response messages are divided into provisional and final responses based on the status-code. Provisional responses (status-code 1xx) are used to indicate that the request was received and is being processed. Provisional responses enable the requesting party to e.g. know that the VoIP-phone is ringing on the other end of the line. There are no reliability mechanisms in the basic SIP for provisional responses. Final responses (status-code 2xx-6xx) indicate a resolution of the UAC request. Final responses are divided into successes (2xx), redirects (3xx) and different types of failures (4xx-6xx) that e.g. suggest trying again (possibly later) or indicate global inability to provide service.

An optional entity body enables carrying (control-plane) data. This can be used to create additional functionality by defining another protocol that conforms to the SIP request-response model. As such, the entity body enables extending application of SIP beyond parameter-based extensions to SIP itself. The entity body can use Multipurpose Internet Mail Extensions (MIME) [65] encoding. The MIME message type has to be indicated through separate header fields in the INVITE-request or with separate OPTIONS-requests. Likewise, SIP can tunnel itself in the entity body. In this case it can e.g. use encryption with Secure MIME [66] to ensure privacy, yet some of the header fields used for routing still need to be clear-text. The main use for the entity body is carrying SDP, discussed in Section II.E.

C. SIP application-layer routing and SIP mobility

SIP application layer routing includes basic routing, loop-prevention, and mobility support. The last issue is dealt with from MobileIP and SIP perspectives.
SIP application layer routing is required mainly in the call establish-phase. As the application layer routing forms an overlay network, SIP entities have no knowledge of the actual network layer topology or even adjacent link strain. As such, the path to a device that is, in the network layer, very close can become burdened with extra hops.

The application layer routing is independent of the network layer protocol. SIP is not tied to IP addressing in any way, supporting both IPv4 and IPv6. When the parties have located each other through call establishing, the contact addresses (IP addresses) are known and no application layer routing is required.

The SIP response route is created in the request path. Each SIP element adds a Via-header field, forming a sequential list of hops on the route the request has passed. The response message is routed back based on these header fields with each SIP element removing the Via-header field it inserted before forwarding the response to the next hop. Compared to many peer-to-peer application layer routing algorithms (e.g. Chord [49], CAN [50]), SIP similarly doesn’t try to do peer-to-peer for all transfer. The UAs only use it for session establishing, or more specifically for service discovery, as direct contact addresses are shared during session invitation. Contact addresses can be cached by UAs and stateful proxies based on the expiration information.

In case the UA (original client, a redirect server or stateful proxy) wants to force a specific request path, it can define a list of Route-header fields, called a route set, that explicitly indicate the target and intermediate systems. Proxies can request to remain in the path by using the Record-Route header field.

Symmetric response routing [30] is a critical extension to the routing. It allows the UAs’ addresses to be NATted. According to the basic SIP the UAs are expected to use public IP addresses, which are recorded with ports to the SIP message. NATted private IP addresses are not a problem, as (outside) IP-address is [...] same port it sent the request from. Additionally, in UDP cases, the NAT may need an additional fix to maintain the NAT-binding, since it is maintained for a minute or so as it creates no flow. Alternatively the UAC should retransmit the INVITE-request with e.g. 20s frequency.

Loop-prevention [19, pp.173] in the application-layer is an optional feature in SIP. This is done with a hop count kept in the Max-Forwards-header field that is decremented by one by each entity in the path from the UAC to the target. The default value for Max-Forwards is 70, which is estimated to cover large SIP deployments.

Mobility support is limited in SIP as contact address negotiation is not meant to be ongoing or be done during a call. To further look at mobility in SIP context we divide the needs into Mobile IP managed mobility and session layer SIP mobility.

Mobile IPv4 [67] [68] uses a home agent to represent the user to the network. The home agent tunnels IP packets sent to the mobile node to a foreign agent that is located in the visited network. The foreign agent forwards these to the care-of-address allocated for the mobile node. Packets from the mobile node are routed through the foreign agent directly towards the target host, creating what is called triangle routing.

Triangle routing adds additional complexity to SIP application layer routing. Since the UAS device (e.g. laptop) has been registered to the home agent and the home agent manages the Mobile IP connection, the location service directs all incoming calls to the home agent, which in turn tunnels the connection. The SIP UAC will always see the Home Agent as the UAS address. Likewise, the SIP UAS would direct INVITE-request replies to the home agent proxy. Mobile IPv4 might be required if e.g. the visited network firewalls only permit tunneling to a foreign agent address in the network or the visited address has no SIP entities at all, similar to UA C in Figure 2.

[Figure 2: SIP entities and MobileIP enabled SIP host. Surviving labels: HomeAgent (C), UA (C), MobileNode, Redirect Server / Proxy Server]
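The request and response routing described earlier in this section (stacking Via-header fields on the request, spending one Max-Forwards hop per forwarding element, and retracing the Via list for the response) can be sketched roughly as below. This is a minimal illustration and not an implementation of SIP: the names `Request`, `originate`, `forward`, `response_path` and `TooManyHops` are hypothetical, hops are plain strings rather than real Via parameters, and the hop budget used in the example is arbitrary.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    max_forwards: int                         # remaining hop budget (Max-Forwards)
    via: list = field(default_factory=list)   # via[0] is the most recently added hop


class TooManyHops(Exception):
    """Raised when the hop budget is exhausted before the target is reached."""


def originate(sender: str, max_forwards: int) -> Request:
    # The originating UA records itself as the first Via entry.
    return Request(max_forwards=max_forwards, via=[sender])


def forward(request: Request, element: str) -> None:
    # Each forwarding element spends one hop and stacks its own Via entry on top.
    if request.max_forwards <= 0:
        raise TooManyHops(element)
    request.max_forwards -= 1
    request.via.insert(0, element)


def response_path(request: Request) -> list:
    # The response retraces the request path: each element removes the Via entry
    # it added and passes the response on towards the element named by the next entry.
    remaining = list(request.via)
    visited = []
    while remaining:
        visited.append(remaining.pop(0))
    return visited


if __name__ == "__main__":
    req = originate("alice-ua", max_forwards=3)
    for proxy in ("outbound-proxy", "inbound-proxy"):
        forward(req, proxy)
    print(req.via, req.max_forwards)
    print(response_path(req))
```

The sketch keeps the hop budget as an explicit argument so that a looping request is visibly cut off once the budget reaches zero, which is the whole purpose of the hop count.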
SIP mobility [40] [37] [38] can be divided into roaming mobility, personal mobility, session mobility and service mobility. Some of these mobility scenarios can be divided into pre-call mobility, which is the situation when mobility happens prior to the session being established, and mid-call mobility, where the session has been established and the user has to be able to maintain the session during mobility. Mid-call mobility is currently an open issue under investigation. Pre-call mobility in each of the cases is done with SIP redirection. All approaches require that the visited network is SIP aware.

Roaming mobility is the situation when a user is not in a home network. To enable pre-call mobility, the UA, after address resolution through DHCP, registers to the visited network registrar. After this, it registers to the home network registrar through the visited network outbound proxy and the home network proxy. To avoid this double registration, the visited domain registrar could do the registering on behalf of the UA or the administrative domains of the Registrars could be combined. The first option requires further specification of SIP.

Personal mobility refers to the user’s ability to redirect calls to any user device. This basically refers to how SIP deals with addressing as discussed in II.A.

Session mobility is about e.g. changing from a VoIP phone to a SIP capable mobile phone because one is leaving the office. This can be done in a preplanned manner or occur due to an outage e.g. because the battery of the phone died out, which would require automatically initiated recovery, which SIP doesn’t support. There are at least three ways in which session mobility can be worked into SIP: the original sender issues a new INVITE-request to the same address and it is transferred to the new device’s contact addresses and negotiated normally. This requires sending party intervention and of course the apps must be able to send to multiple destinations. Another approach is third party call control (3pcc), in which the receiver sends an INVITE-request to the new device indicating the other party’s parameters. A third approach could be a REFER-request to the other UA, indicating the new target to which a session should be negotiated. These require full use of the current device until the tear down of the connection.

Service mobility is about keeping one’s personal services (e.g. calendar, buddy list etc). Service mobility is related to extending SIP to cover signaling between different types of services and maintaining these. REGISTER and NOTIFY messages are one part of the solution. Service signaling is still under investigation.

Session re-establishing by quick re-negotiation created with a proactive make-before-break mechanism is not supported in SIP or SDP. Disconnection could happen for a number of reasons including temporary local network failure and temporary device (e.g. cell phone) failure. SIP and SDP probably see this as an implementation issue, since the protocols avoid making assumptions regarding UA capabilities.

D. SIP application-layer transport mechanisms

SIP is typically transported on top of UDP to avoid TCP handshake delay although TCP support is also mandatory. The default port for UDP and TCP is 5060 with TLS [69] [70] encrypted TCP in 5061. SCTP support is still in draft state [33], though it is not currently further developed. It should be noted that due to application layer routing, SIP transport-layer protocol choice is not an end-to-end but a per application layer hop decision. Even though the original UA uses UDP, proxies may use another transport protocol.

Basic SIP has multiple mechanisms to provide additional application-layer transport mechanisms to overcome UDP problems. These include message reliability, congestion management and message ordering. SIP has no mechanisms for fast session re-establishing to support recovery from connectivity failure. Additionally SIP supports message-level multi-homing; however, this requires taking into account connection reuse and symmetric response routing issues and is as such scoped out.

Reliability is handled in two different ways, depending on whether the protocol is using an INVITE-message or other than INVITE-request.

For the INVITE-message, since it can take a while before the phone is answered, the entities send provisional responses to notify the UAC that the call is being processed. In addition to this, the final response is separately ACKed by the requestor. The UAC has a retransmission timer that is initially the estimated RTT, defaulting to 500ms. This grows exponentially. Besides the UAC retransmission timer, the overall INVITE-request resolution has a 3 minute timer, after which proxies automatically time out the connection.

For other than INVITE-request on top of UDP, every request is retried if no answer is received within the UAC retransmission timer. For other than INVITE-requests, the timer is 64*estimated RTT, also grown exponentially with retransmission needs.

While the reliability mechanism seems to be necessary, it should be noted that if used in mobility
supporting middleware like e.g. Wireless CORBA [51], the middleware may itself be equipped with an adaptation layer that in a similar manner adds reliable transport properties when communicating over UDP.

Provisional response reliability [20] is not guaranteed over UDP in basic SIP. Since this could lead to interoperability issues when integrating SIP with PSTN-signalling, a reliability mechanism has been created, which simply mirrors the ACK-approach by defining a PRACK-method message that is used to acknowledge provisional responses. In this case the UAS must wait for all acknowledgements to the provisional responses it has sent prior to sending the final response with a status-code indicating success to the INVITE.

While these provide reliability for end-to-end communication, they do not really help in case any of the proxies in the path loses connectivity to the next hop proxy in the path. SIP’s approach to this problem is DNS-based: by giving multiple proxy-addresses, in case of proxy failure the requestor can use another proxy, although statefulness may be required in this case [21].

Congestion management is done by exponential backoff of the reliability-providing retransmission timer as well as packet size limiting. Packet size limiting states that if the packet size is (path MTU - 200 bytes) or more, or if it is 1300 bytes or more when the path MTU is unknown (an implicit Ethernet assumption), the packet should be sent using a reliable transport [19, pp.141].

While these congestion management approaches are a good start, it has to be remembered that the transport is still UDP with no mechanisms to deal with congestion avoidance or adaptation. Paths between proxies can be unnecessarily stressed due to inappropriate UA behaviour.

Assured congestion safety [31] is a planned extension that is meant to counter this effect. The extension defines a congestion-safe request as one that either uses a reliable transport or uses the extension, in which case requests are paced and managed by proxies that are able to reject UDP packets that would require fragmentation.

SIP request UDP pacing requires waiting for a response before resending the request. This way all the entities in the path (not just the UA) are waiting for the timer to time out prior to resending requests.

Proxy UDP packet rejection is used in a situation where the proxy is given a large message to be forwarded over UDP that would require fragmenting the message to multiple UDP packets. The fragmentation need is based on the Path MTU that the proxy knows or estimates (using its local network MTU). In this case the proxy rejects the packet using a basic SIP error code and two header fields extending SIP: the Proxy-Max-Size that indicates the maximum UDP packet size and the Proxy-Seen-Size that expresses the size the packet had when received from the UA.

A congestion-safe response is something that is either no larger in size than the request or that is a response to a request that was congestion-safe. As such, the mechanism is based on request control.

Message ordering of requests is done through the CSeq-header field that provides a locally unique sequence value containing an integer and the request method used. The header field distinguishes new requests from retransmissions: a retransmission carries the same integer as the original request, while each new request within a dialogue increases it.

Keepalive of a session is obvious to the dialogue members (UAs), but the intermediate SIP entities have no knowledge of the session state in case a UDP packet is lost. This can lead to stateful proxies ending up maintaining dialogue state indefinitely.

Session timers [32] have been proposed to correct this. The proposal suggests that UACs supporting the mechanism express this in the INVITE-request. The SIP elements in the path can insert a Session-Expires-header field that contains the desired interval for a refresh message. Each of the stateful proxies can evaluate whether this interval is suitable and if not, reject the INVITE message with an error status while indicating with another header field what its refresh requirements are. After possibly multiple iterations the INVITE arrives at the target UAS which finalizes the Session-Expires interval. After the caller UAC receives this, it sends an INVITE- or UPDATE-request within the timer period to refresh the proxies in the path.

E. Session Description Protocol (SDP)

The Session Description Protocol (SDP) [10] is developed by the Multiparty Multimedia Session Control WG (MMUSIC WG). The primary focus originally was the announcement of multimedia conferences, but SIP, MGCP and RTSP have presented new needs that the protocol has been adapted to. This effort has required stretching the syntax and semantics, which in the long run is not a viable solution. The MMUSIC WG is currently actively drafting [14] a new version of the protocol, called SDPng (SDP next generation) that would be able to express the wider variety of needs.

SDP is required to provide information about multimedia capabilities so that the parties involved can decide whether or not the session will be established. This information includes most importantly media streams, which define the content (e.g. audio, video,
application, data, control) of the stream, in a manner similar to MIME content. In addition, each stream’s payload type (i.e. media format), destination address and port number are provided. The encryption mechanism can also be described. Basic SDP assumes each stream will be independent and have a dedicated connection. In addition to the media stream description, issues such as start, stop and repeat times for e.g. an Internet radio program can be indicated. Contrary to SIP, SDP header fields need to be in a specific order.

Basic SDP only describes the parameters used. A simple offer/answer model [11] was required to describe how the actual negotiation of SDP parameters between parties happens. The model describes how the initial offer and answer are generated, how the media stream description is updated and how the UAC and the UAS iterate from the initial offer to the final acceptance of SDP parameters. All this is covered both in terms of unicast messages and multicast messages.

Media stream grouping [12] enables describing how multiple media streams relate to each other, forming a media flow, contrary to basic SDP, which handles every media stream independently. This can be used e.g. to express lip synchronization requirements with video. For wireless networks the extension is especially useful, since multiple codecs can exist in multiple ports and all are used for the speech. Without this extension, SDP could not describe this relation.

Mapping of media streams to resource reservation flows [13] extends the media stream grouping by enabling the group to make a joint resource reservation. Other applications that require dedicated media streams can make these alongside grouped reservations.

F. An example of a basic SIP flow

[Figure 3: SIP flow between Alice and Bob. Participants: UA vo1.hq.buffalo.com (199.121.1.203), proxy server hq.buffalo.com (199.121.1.204), DNS, proxy server south.metro.com (203.168.11.203), location service, UA mp1.south.metro.com (203.168.11.207). Messages: (1) INVITE, (2) 100 Trying, (3) DNS query, (4) DNS response, (5) INVITE, (6) 100 Trying, (7) location service query, (8) location service response, (9) INVITE, (10)-(12) 180 Ringing, (13)-(15) 200 Success, (16) ACK, (17) non-SIP data transfer (e.g. RTP media), (18) BYE, (19) 200 Success.]

Figure 3 illustrates the messages of a SIP session, based on the example in the SIP RFC [19, pp. 213-219]. SDP session attribute negotiation and the Via-header field parameters involving SIP transaction identification (indicated by the branch-parameter) and sender IP addresses and ports (indicated by the received- and sent-by-parameters) were left out. Transactions are described in Section II.A.

First off, as Alice picks up her VoIP phone and dials the SIP address bob@metro.com, the UA sends an INVITE-request (1) to the outbound proxy for delivery to Bob, recording its address in the Via-header field, a globally unique SIP session identifier in the Call-ID-header field and direct contact information in the Contact-header field. To assist proxies, the UA adds a tag identifier, which allows Bob’s UA to eventually finalize the dialogue by adding another tag:

    INVITE sip:bob@metro.com SIP/2.0
    Via: SIP/2.0/UDP vo1.hq.buffalo.com
    To: Bob <sip:bob@metro.com>
    From: Alice <sip:alice@buffalo.com>; tag=18271
    Call-ID: 3223842@vo1.hq.buffalo.com
    CSeq: 12921 INVITE
    Contact: <sip:alice@vo1.hq.buffalo.com>

The outbound proxy, once receiving this, replies with a provisional response, indicating that the specific session is being processed (2):

    SIP/2.0 100 Trying
    To: Bob <sip:bob@metro.com>
    From: Alice <sip:alice@buffalo.com>; tag=18271
    Call-ID: 3223842@vo1.hq.buffalo.com
    CSeq: 12921 INVITE

In addition to this, the outbound proxy makes DNS queries (3) for the metro.com SIP service and receives the IP address, protocol and port of Bob’s inbound proxy server as replies (4), as there are no stateless intermediary proxies. The reply indicates that e.g. UDP is preferred, but TCP is also available. IP address 203.168.11.203 prefers UDP and has the service in port 5060. Alice’s outbound proxy forwards the request (5), adding only its own Via-header field:

    INVITE sip:bob@south.metro.com SIP/2.0
    Via: SIP/2.0/UDP hq.buffalo.com
    Via: SIP/2.0/UDP vo1.hq.buffalo.com
    To: Bob <sip:bob@metro.com>
    From: Alice <sip:alice@buffalo.com>; tag=18271
    Call-ID: 3223842@vo1.hq.buffalo.com
    CSeq: 12921 INVITE
    Contact: <sip:alice@vo1.hq.buffalo.com>

Bob’s inbound proxy receives this and sends a provisional response (6), similar to (2) but with the Via-header field of Alice’s outbound proxy included. The outbound proxy decides not to forward this to Alice’s UAC since it has already sent one to it. The inbound proxy’s
first priority is locating Bob’s contact address. It therefore queries the server providing the location service for Bob’s contact address (7). The location server responds (8) with the IP address (either the only address or the preferred address that Bob defined through a registrar), protocol and port of mp1.south.metro.com. Bob’s inbound proxy adds its Via-header field, rewrites the INVITE SIP URI and forwards the request (9):

    INVITE sip:bob@mp1.south.metro.com SIP/2.0
    Via: SIP/2.0/UDP south.metro.com
    Via: SIP/2.0/UDP hq.buffalo.com
    Via: SIP/2.0/UDP vo1.hq.buffalo.com
    To: Bob <sip:bob@metro.com>
    From: Alice <sip:alice@buffalo.com>; tag=18271
    Call-ID: 3223842@vo1.hq.buffalo.com
    CSeq: 12921 INVITE
    Contact: <sip:alice@vo1.hq.buffalo.com>

As the request arrives at Bob’s VoIP phone, it starts ringing. The phone sends a provisional response. This now includes the locally unique tag in the To-header field, establishing an early dialogue prior to the session being established, and a direct contact address for Bob (10):

    SIP/2.0 180 Ringing
    Via: SIP/2.0/UDP south.metro.com
    Via: SIP/2.0/UDP hq.buffalo.com
    Via: SIP/2.0/UDP vo1.hq.buffalo.com
    To: Bob <sip:bob@metro.com>; tag=129991
    From: Alice <sip:alice@buffalo.com>; tag=18271
    Call-ID: 3223842@vo1.hq.buffalo.com
    CSeq: 12921 INVITE
    Contact: <sip:bob@mp1.south.metro.com>

The response is sent (11) by the inbound proxy to Alice’s outbound proxy with one Via-header field (Bob’s proxy) removed and onward (12) to Alice’s UA with one more Via-header field (Alice’s proxy server) removed.

While the provisional response is being routed on the application layer, Bob picks up the phone. Bob’s phone sends a final response (13) to Bob’s proxy as a notification of success:

    SIP/2.0 200 Success
    Via: SIP/2.0/UDP south.metro.com
    Via: SIP/2.0/UDP hq.buffalo.com
    Via: SIP/2.0/UDP vo1.hq.buffalo.com
    To: Bob <sip:bob@metro.com>; tag=129991
    From: Alice <sip:alice@buffalo.com>; tag=18271
    Call-ID: 3223842@vo1.hq.buffalo.com
    CSeq: 12921 INVITE
    Contact: <sip:bob@mp1.south.metro.com>

The response is forwarded (14) by Bob’s inbound proxy to Alice’s outbound proxy with one Via-header field (Bob’s proxy) removed and onward (15) to Alice’s UA with one more Via-header field (Alice’s proxy server) removed.

To provide reliability, Alice’s phone acknowledges the final response. This is sent directly to Bob’s UA (16). The ACK carries the same CSeq number as the INVITE it acknowledges; only the method name differs:

    ACK sip:bob@mp1.south.metro.com SIP/2.0
    Via: SIP/2.0/UDP vo1.hq.buffalo.com
    To: Bob <sip:bob@metro.com>; tag=129991
    From: Alice <sip:alice@buffalo.com>; tag=18271
    Call-ID: 3223842@vo1.hq.buffalo.com
    CSeq: 12921 ACK

After this, the media session begins (17). If no modifications are made to the session or the SDP session attributes, Alice finally terminates the session with the following message (18):

    BYE sip:bob@mp1.south.metro.com SIP/2.0
    Via: SIP/2.0/UDP vo1.hq.buffalo.com
    To: Bob <sip:bob@metro.com>; tag=129991
    From: Alice <sip:alice@buffalo.com>; tag=18271
    Call-ID: 3223842@vo1.hq.buffalo.com
    CSeq: 12922 BYE
    Contact: <sip:alice@vo1.hq.buffalo.com>

Modifications to the session affect the termination message only in terms of CSeq, although its uniqueness is decided by both the integer and the method name. As such, the integer could still have the same value.

If Bob were to terminate the session, the integer could be anything, since the sequence number would depend on the sequencing of Bob’s UA. Also in this case, the From-header field would contain Bob’s SIP address with the same tag and the To-header field would contain Alice’s SIP address with the same tag.

To finalize the session termination, Bob’s terminal provides a response (19) to Alice’s BYE request:

    SIP/2.0 200 Success
    To: Bob <sip:bob@metro.com>; tag=129991
    From: Alice <sip:alice@buffalo.com>; tag=18271
    Call-ID: 3223842@vo1.hq.buffalo.com
    CSeq: 12922 BYE

The effects of user mobility would depend on how it is done. If Bob were on the move with a Mobile IP-based terminal, the tunnel from the Home Agent to the Foreign Agent and from there to Bob’s terminal would be responsible for the session maintenance, and basic routing would carry the message from Bob’s terminal to the inbound proxy in Bob’s home network. SIP would be totally oblivious to this, as it is done in the network layer. Alice’s UAC would assume the Home Agent address is Bob’s terminal.

On the other hand, if Bob were e.g. visiting a subsidiary of metro.com and had a possibility to register there, he would have used the Registrar to change the metro.com location service to redirect all calls to his SIP address to a new SIP address, e.g. bob@minibus.com. In this case Alice’s SIP INVITE
would have proceeded as in Figure 3 up until the location server response (8), which would indicate the new address. The proxy would generate the redirect response, also indicating how long the stateful proxies and the UA can cache the information (in this case 2 hours, i.e. 7200 seconds) for new session establishing purposes, prior to renewing the information:

    SIP/2.0 302 Moved Temporarily
    Via: SIP/2.0/UDP hq.buffalo.com
    Via: SIP/2.0/UDP vo1.hq.buffalo.com
    To: Bob <sip:bob@metro.com>
    From: Alice <sip:alice@buffalo.com>; tag=18271
    Call-ID: 3223842@vo1.hq.buffalo.com
    CSeq: 12921 INVITE
    Contact: <sip:bob@minibus.com>; expires=7200

Alice’s UA would then initiate a new INVITE-request to the new SIP address in the Contact-header field and follow the steps in Figure 3.

III. SIP DESIGN AND PROTOCOL PROPERTIES

This section describes the layer model that guides SIP design, a design-by-layers principle that reflects the features of the previous sections. In addition, protocol properties such as security, quality of service, performance and limitations to usage are discussed. Finally, the section describes the usage considerations in the specific context of emergency calls during disasters.

A. SIP design by layers

To avoid restricting usage to specific types of session initiation, modification and termination purposes, SIP is layered into five separate logical layers that enable making generalizations regarding the behavior of a specific layer. This helps remind people working on the protocol of the basic ideology behind it. The layers are, from the lowest to the highest [19, pp.18]:

Syntax and encoding layer is specified using an augmented Backus-Naur Form grammar (BNF) [71] for SIP messages, which are based on the UTF-8 character set. This forms three areas that need to be addressed: protocol performance, security and parser implementation. Performance and security issues regarding the plain-text format are covered later in this section.

The parser implementations require a lot of effort: the parser cannot be auto-generated from the syntactic definition in augmented BNF, and with each specification change the parser must be modified. Either a parser needs to be manually implemented or a separate translation to another grammar, supplemented by additional code, is needed to generate the parser.

Transport layer defines how a client sends requests and receives responses and how a server receives requests and sends responses over the network. The layer is contained in all SIP entities and is mostly concerned with reliability and transport properties, as discussed in I.D.

Transaction layer provides a concept of transaction that is “a request sent by a client transaction (using the transport layer) to a server transaction, along with all responses to that request sent from the server transaction back to the client” [19, pp.19]. More concretely, all provisional responses are included in a transaction, the ACK-method sent for a successful (2xx) final response to an INVITE-request is its own transaction, and retransmissions should not be visible to the next higher layer [19, pp.121-123]. The transaction layer is not used by stateless proxies.

In some cases the output of this layer will not be a SIP message but the expiry of a transaction timer. In these cases the output is modified to look like a SIP-specific error message, depending on the layer error type [19, pp.42].

Transactions are identified uniquely with the branch-parameter in the Via-header field. The current SIP specification indicates a magic cookie sequence (z9hG4bK) that has to be used as the beginning of the parameter value, in order to separate the implementation from the prior proposed standard.

Transaction user (TU) is any SIP entity other than a stateless proxy. Basically the TU is the owner of a specific transaction: it creates the transaction with the required parameters and can cancel it if it sees fit.

Core is the identity of a SIP entity. Whereas the transaction user owns a specific transaction, the core establishes or participates in dialogues.

B. Security

Since SIP is very much about user presence communicated through a device (a user agent), the nature of SIP makes security particularly important: privacy and trust issues become paramount. Of course, the basic AAA issues are already taken into account in the link layer in mobile networks, independent of SIP.

SIP provides security services based on HTTP and S/MIME, including authentication (both user to user and proxy to user), message integrity protection and confidentiality. In addition, privacy-supporting UA behavior has been defined, as well as a logical privacy service for intermediaries that helps UAs. This is further enhanced by the security agreement mechanism.
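As an illustration of the HTTP-derived challenge-response authentication mentioned above, the following minimal Python sketch computes a response in the style of HTTP digest authentication without the optional qop parameter. The user name, realm, password, nonce and URI are hypothetical values chosen for this example; a real UA would take the realm and nonce from the challenge it receives, and this sketch is not taken from any particular SIP implementation.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute a digest response without qop: MD5(HA1:nonce:HA2)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # secret shared by both sides
    ha2 = md5_hex(f"{method}:{uri}")                 # binds the response to the request
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical credentials and nonces, for illustration only.
response = digest_response(
    "alice", "buffalo.com", "secret", "INVITE", "sip:bob@metro.com", "nonce-1"
)
replayed = digest_response(
    "alice", "buffalo.com", "secret", "INVITE", "sip:bob@metro.com", "nonce-2"
)
print(response != replayed)
```

Because the nonce is part of the hashed input, a response captured for one challenge does not match a later challenge with a fresh nonce, which is the basis of the replay protection.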
Authentication is done on a request-by-request basis and is based on the HTTP basic and digest authentication [64]: challenge-response mechanisms like CHAP [43] can be used, whereby the target of the request initiates the authentication with a challenge, which the requestor responds to. This ensures one-way authentication (as authenticating the target is another issue) and provides protection against replay attacks (by use of a nonce).

If the request has been forked by a stateful proxy, intermediate and inbound proxies may require multiple UAC authentications. To alleviate UA strain, the proxies forking requests are required to aggregate authentication challenges to the minimum and send these to the UA.

Integrity of messages can be guaranteed by using PGP-based digital signatures. The other possibility is to use S/MIME and tunneling of SIP messages.

Confidentiality is not supported with any new mechanisms. Due to this, TLS/SSL would typically be used if TCP is used as the transport. UDP on the other hand would require the use of IPSec or link-layer encryption. Another possibility is to use S/MIME to encrypt SIP header field information and use SIP tunneling to guarantee confidentiality and integrity. However, some header fields (e.g. the ones required for the dialogue) are still required to be in plaintext to enable application layer routing [19, pp.207].

Denial-of-Service (DoS), i.e. attacks bombarding the SIP servers with requests in order to prevent actual users from using them, is a complicated issue for SIP, as the protocol offers multiple ways to mount such an attack. There are no straightforward answers to counter this beyond the other security services and sound network design. Possible exploits include e.g. Via-header field misuse, Record-Route-header field information misuse and REGISTER-request misuse [19, pp.236].

Via-header fields can be used with spoofed addresses to harness multiple UAs and proxies to generate an amplified denial-of-service attack if authentication is not required. This is analogous to a smurf DoS attack [72].

The Record-Route-header fields expose the proxies to a DoS attack, since the mobile terminal user can become aware of all the proxies in the path to the service route. In 3GPP networks this basically means that the network operator proxies required by the architecture become exposed, and a terminal user can use the information to launch an attack with separate equipment and cause unavailability of the service.

If authentication is not required, the REGISTER-request offers multiple possibilities: a contact address of the intruder can be added to enable joining multiparty conferences, contact addresses can be deleted to effectively deny user invitation to anything, and multiple contact addresses could be registered for a given SIP address to be used as an amplifier in a DoS attack.

Privacy Service [24] is provided by the intermediate entities, typically proxies. The service is responsible for supplying privacy functions that are unavailable to the UAs, such as withholding the identity and personal information of a SIP user.

The UAs activate the privacy service by attaching a Privacy-header field to the requests and responses. For legacy clients, messaging at the application level without a Privacy-header field is supported. The intermediate entity evaluates the request and conforms to it if it is allowed to pass anonymous requests. The receiver must accept anonymous sessions.

The Security Mechanism Agreement [25] procedure enables the UAs to securely agree on arbitrary security mechanisms with the next hop entity. The basis for this mechanism is the reality that, as SIP deployment will consist of millions of phones, one really cannot make assumptions about what security mechanisms are available to the phones and the network.

C. Quality of Service

Quality of Service architectures based on resource reservation pose a problem for SIP: in which order should the reservation of resources (with e.g. RSVP [73]) and the initiation of the session be done? If the reservation of resources is made before the actual session initiation, the target might be unreachable and the capacity would have been reserved for no purpose. On the other hand, if the session is initiated before resource reservation, failure of the latter may lead to disconnection or inability to conform to the required QoS parameters.

To alleviate the problem, preconditions [18] were introduced to both SDP and SIP. Preconditions separate the current and the desired state, enabling all parties to express their state and requirements for each multimedia stream. Resource reservations are unidirectional. The SDP precondition is sent in the INVITE-request, which is not processed further before the target and the caller both agree on the resource reservation. As such, resource reservation becomes interleaved with the session initiation.

D. Performance

Additional protocol features can improve SIP performance. However, some issues may require the use of external specialized equipment. This section discusses
connection reuse, compression of messages and load balancing.

Connection reuse [34] in SIP is aimed at reliable transports. While responses to a request are returned to the correct port (e.g. the whole INVITE dialogue described in the I.E example flow), requests from the target UA are unlikely to use the same connection. In order to reduce latency, especially if doing TLS over TCP, which requires additional round trips to set up the encryption just to e.g. send a BYE-request, a connection reuse mechanism is in the works.

The draft defines an additional parameter, alias, for the Via-header field, which is used to recognize that the UAC allows reuse of the existing connection for the requests from the target UA. The ephemeral port of the connection, i.e. the high port that needs to be reused, must be found by the target UA itself. It will not be provided by e.g. DNS manipulation or by the server of the connection. To avoid connection hijacking by an intruder, additional mechanisms are used for authentication of the connection.

Compression [28] of SIP messages is currently at its draft stage. The compression will be based on basic Signaling Compression, as defined by the Signaling Compression (SIGCOMP) WG.

Use of compression is expressed as a comp=sigcomp parameter in either the SIP URI of the request or the Via-header field of the response. The other option would have been to use DNS records, but, considering that there are three transports (if SCTP is included), of which two (TCP and SCTP) can have TLS used on top of them, a single server would require multiple entries to express compression support for all the possible choices.

If SIP is tunneled, e.g. when S/MIME is used to encrypt the original SIP message, payload compression is available in basic SIP and can be negotiated with entities in the path [19, pp.169].

Load balancing in SIP is based on DNS records. Since one outbound proxy probably does not ask for the target proxy address that many times, rather caching the response in a local resolver, corporate sessions to another corporation probably travel the same route. This problem is analogous to forwarding HTTP proxies.

Corporations are currently moving to a portal infrastructure architecture. The inbound proxies could in principle be moved behind external load balancer middleboxes to guarantee more transparent fail-over and load-based request distribution. SIP URI-based hash persistence could be used, analogous to the persistence of HTTP forwarding proxy farms. However, SIP proxy failure recovery by another SIP proxy would depend on dialogue state. Also, keepalive checking of SIP proxies using UDP is somewhat difficult, since there is no feedback channel with UDP. Likewise, Via-header fields and Record-Route-header fields with private IP addresses complicate the matter, since load balancers currently do not interpret SIP header fields at all.

E. Limitations to usage

SIP and SDP are extremely extensible protocols. Still, there are areas where they are not appropriate. Some of these areas are discussed below [39].

SIP is not an application-layer data transfer protocol. Even though it may look like HTTP, it is a control-plane signaling protocol that is used to carry session attributes but not actual application data.

SIP is not a routing protocol. It does not transfer routing information or in any way interleave its functionality with routing protocols.

While SIP is used for session initiation setup, it does not provide signaling for resource reservation, nor would there be any point in extending SIP to provide such features, given all the extensions that are already available.

SIP does not offer conference control services such as member ejection, feedback, virtual microphone passing, chair control, voting or polling. SIP also does not have any protocol features that would make assumptions on how conferences should be managed. SIP can be used to initiate a session that uses some other conference control protocol. This is an important difference when comparing it to the likes of the H.323 protocol.

SIP does not have mechanisms to control user device state (e.g. to mute SIP devices). In terms of implementing this device control, the implementation may or may not use the location service to discover the contact addresses of the devices to be managed.

SIP request methods, responses and header fields each have a basic primitive task that they are trying to accomplish, which needs to be obvious to the recipient rather than something that needs to be evaluated. Therefore requests and responses should not be extended with header fields or parameters that break this primitive rule, and header fields should not be extended with parameters that break it.

F. Emergency Calls During Disasters

Emergency calls form a problem in a packet-switched network, due to best-effort-only service, the probability of congestion during a rise in demand, the lack of admission control and the absence of central control. SIP further complicates this compared to traditional PSTN signaling, since if SIP were to carry
resource reservation information, it would do this in the call setup phase, which uses the SIP application layer route to the target. This can naturally be different from the actual route between the two communicating parties. Also, basic SIP call setup itself cannot be prioritized over other traffic. While under normal circumstances signaling of emergency calls can be done over a packet network, disaster events require additional support.

Support for existing priority schemes is critical. While many national PSTN networks have their own, the US Government Emergency Telecommunications Service (GETS) and the Multilevel Precedence and Pre-emption (MLPP) scheme used by the US military are used here as examples.

GETS [57] uses multiple levels of preference, with emergency calls as the highest. This allows for multiple paths to the destination, call setup priority over normal calls, call queue priority over lower-preference calls and the ability to override network management restrictions to guarantee connectivity in the PSTN. GETS is activated from normal phones by dialing specific numbers.

MLPP [29] is used in non-public military networks to secure call resources during war and national emergencies. It allows the selection of a call precedence that is used to decide which calls are pre-empted (disconnected) in case of resource limits. The precedence information is communicated during call setup with the signaling protocol. MLPP and its variants are typically pre-configured in special equipment.

The whole problem of emergency calls was initially approached by creating a specific user identity [41], sos, and then supporting the 911 and 112 codes in the SIP URI string [42]. After this narrow approach, the requirements definition was moved to the recently created Internet Emergency Preparedness WG (ieprep WG), which looks at enabling PSTN-to-IP-telephony emergency calls in a wider scope, among other things.

Requirements to Resource Priority Mechanism [29] describes requirements for four SIP-related resources that can be constrained. These are the SIP proxies, the IP network, the possible VoIP gateway and the sender and/or receiver, depending on the environment. The approach the requirements take aims at enabling seamless usage of the current PSTN system by mapping the priority information to PSTN emergency call system (ECS) specific codes, each in their own name space, to avoid creating a global prioritization scheme. In addition to this, it is central that a SIP signal can safely arrive at the PSTN gateway. This approach has to support PSTN-to-PSTN signaling where SIP is used in an intermediate VoIP network, and vice versa, and work with any SIP method. In addition to these, several security issues, IP transparency as well as the discovery of available ECSs become critical.

Geographical location support [61] is a critical additional requirement for the emergency services in a wireless network. The work is currently ongoing; the draft only requires support for SIP, as well as allowing emergency information to be passed despite emergency center authentication failure. Cases where user authentication fails, or where both the center and the user fail in authentication, are still open, among other things.

IV. RELATED WORK

While control protocols are an old idea dating back to FTP, the extension ideas around SIP make it hard to say just what it should be compared to. However, in order to be a viable choice for signaling, the basic premise is that it can perform this function well.

This section covers APIs for developing with SIP, the most visible SIP extensions and alternatives to SIP. In addition, basic PSTN signaling requirements and the approach in 3GPP IP multimedia systems are commented on.

A. SIP software development

Currently, there are multiple APIs for developers, mainly due to differing requirements. To further help application development, there are a few deployed SIP test servers [52] that can be used.

Parlay [54] provides a similar API for both SIP and PSTN call handling. While suitable for those with a background in PSTN applications engineering, it does not allow access to all the features provided by the extensibility of SIP.

Call Processing Language (CPL) [35] is a dedicated language, suitable for handling SIP transactions. CPL abstracts the specifics of the signaling protocol, providing support also for H.323 (see below, Section IV.C). CPL is intended also for advanced UAs.

Cgi-scripts [17] are also a good approach, especially for quick development and freedom in choosing the programming language. Compared to HTTP, the difference is the dependence on proxies and dialogue state. Therefore the cgi application can be in the path to the final UAS, instead of being the actual stateless target for the service.

oSIP [53] is a GNU C library issued by the Free Software Foundation. It provides an API for SIP and SDP message parsing as well as for managing SIP transactions. The library is based on the newest SIP RFC 3261 [19].
The oSIP page also contains links to SIP applications that can be used to speed up development and testing.

The Java Community Process (JCP) has standardized a low-level Java API for SIP in JSR-32 [58]. A reference implementation is available. Besides this, JSR 116 [59] defines a higher-level Servlets API for server use, supporting sessions, data storage and retrieval and other matters similar to HTTP servlets. However, both of these are based on the older SIP RFC 2543 [15].

B. SIP extensions

SIP Event Notification [22] is a basic notification framework that allows for asynchronous messaging. It is based on the NOTIFY and SUBSCRIBE messages. In order to represent the user, the framework defines a subscriber that subscribes to certain kinds of target state changes and a state agent that is responsible for sending the notifications. The framework can be fully exploited by extending it with Event Packages, which describe the states and the content exchanged in the SUBSCRIBE and NOTIFY message payloads. This enables defining e.g. buddy lists and message waiting services.

The Event Package approach lacks hierarchy, since Event Packages are in themselves independent of each other. Event Template Packages allow defining state and information exchange that affects all Event Packages.

SIP Instant Messaging [27], developed by the SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE) WG, is one of the possible approaches to Instant Messaging standardization over the Internet. SIMPLE conforms to the IETF Presence Framework [26].

SIMPLE defines an Event Package whereby Presence Agents (PAs), which are extended SIP UAs, receive and subscribe to notifications as well as notify the user on presence state changes. A user can have multiple PAs, as users can have multiple UAs in SIP. Each PA has a unique SIP address. The presence information is targeted towards one or many of them explicitly. In addition there are Presence User Agents (PUAs) that are independent of the SIP UA. These enable producing and manipulating presence information

protocol. Instant Messaging basically uses it on the data plane to send the actual messages. Of course, for text messages this might be fine, but with multimedia messaging and, perhaps in the future, video or sound clips, it makes one wonder just how far SIP will be stretched. However, SIP for instant messaging is but one of the proposals for the IETF instant messaging standard.

C. SIP alternatives

H.323 [55] [56] is the ITU alternative to SIP that has over ten years of development history behind it, if one considers it to have evolved from H.324, which was developed to improve H.320. Contrary to typical remarks, H.323 is not a legacy protocol that would need to be replaced; the newest release, H.323v4, was approved in 2000. In addition to its long development period, H.323 has been the signaling protocol deployed with non-proprietary VoIP solutions.

There are many fundamental differences when comparing H.323 to SIP, as H.323 is binary encoded, uses ASN.1 syntax in message definition, includes features to handle intermediate network entity failure and has inbuilt conference call management messages. Also, devices that are H.323-based use a reliable transport protocol, TCP, although UDP is supported as well. Compared to the IETF SIP development effort, H.323 also has a more hierarchical and layered specification structure, which helps in understanding how individual specifications relate to the whole.

Skinny Client Control Protocol (SCCP) [60], also called Skinny, is a half-proprietary VoIP terminal control protocol defined by Cisco Systems, Inc. It is used to control Cisco 7960 and 7960 voice over IP phones. Cisco mainly promotes Skinny as a lightweight implementation of H.323. Its operational use may result in typical proprietary protocol problems (e.g. 3rd-party firewalls cannot NAT correctly or otherwise function incorrectly, and debugging support is dependent on vendor support engineers).

In addition to VoIP phones and media gateways, companies like Symbol Technologies and SocketIP have
without having to have other SIP properties. As in PAs, implemented this protocol also in softswitches that
a user can have multiple PUAs. However, PUAs cannot enable interaction of devices that use different signaling
be sent presence information. As it is, if you e.g. want to protocols.
sent Christmas Greetings Instant Messages to a bunch of
D. PSTN integration
friends, the PUA entity allows for that quick-and-dirty
SIP client, since you most probably don’t want it to
receive anything. SIP for Telephones (SIP-T) [62], created by the IP
While all this is neat, it’s perhaps good to recall that Telephony WG, provides architecture for SIP usage over
SIP was first and foremost a control-plane signaling PSTN connections. It describes four different scenarios
involving SIP bridging gateways that carry SIP UA messages over the PSTN to another SIP UA, PSTN signaling protocols carried over a SIP-based VoIP network to another PSTN phone, and SIP-to-PSTN and PSTN-to-SIP interworking.

SIP and PSTN signaling protocols such as ISUP are not totally compatible. Possibilities to enable interworking include carrying ISUP messages as MIME data in the SIP payload or adding messages that can be used to carry ISUP messages without affecting SIP dialogue state, similarly to the UPDATE request.

Third Party Call Control (3pcc) [37], currently under investigation for SIP, allows a controlling party to manage the session. Besides possible use for session mobility, 3pcc is usable for conferencing, call center call transfers and similar typical PSTN services. With SIP, however, there is a multitude of new applications for 3pcc, including a click-to-dial approach whereby the user simply clicks a link on a web page, which initiates the call on the user's behalf to the given SIP address. As with mobility, managing this becomes non-trivial when taking into account the resource reservations that could have already been established for the call.

E. 3GPP SIP usage and modifications

In order to be used in the 3GPP network architecture [48], and more specifically in the 3GPP IP multimedia system (IMS) [46], SIP has to conform to the system requirements. Besides IMS, 3GPP has a packet-switched network domain. This domain, however, lacks real-time data, error correction and header compression. The 3GPP requirements [36] and dependencies [44] are under active update.

3GPP SIP defines three SIP proxies called Call/Session Control Functions (CSCF). Each has a distinct role with specific responsibilities in the IMS, taking into account the fact that the user is often located in a non-home domain. The responsibilities vary in the areas of registration, session management, and charging and resource utilization [45].

The interoperation at the protocol message level for each SIP entity and IMS component has been defined in a step-by-step example that also explains the interleaving with 3GPP network issues [45]. To supplement this, the conformance requirements for supporting SIP in IP multimedia systems have been further refined [47].

V. CONCLUSIONS

While SIP may be the IETF protocol for doing signaling over the Internet, one has to wonder about SIP's design priorities and goals. First off, due to e.g. the unresolved issue of emergency calls, PSTN integration will remain an open issue for some time. Secondly, symmetric response routing obviously signals (no pun intended) a lack of appreciation for real-world networks that thankfully do include NAT and private networks. Thirdly, SDP, a protocol intended for announcing multimedia conferences, was adopted to do the multimedia parameter negotiation. Except, of course, it lacked the negotiation part.

While the IETF still has some work to do, the 3GPP, with its intention to put SIP into tens of millions of phones, can expect to likewise encounter challenges on the road: even though adoption will be gradual, a piloting phase with pure VoIP solutions prior to 3GPP deployment would enable getting production experience with the protocol. However, with H.323 on the market with its maturity, we might be forced to miss out on this opportunity on many deployment occasions. From a protocol adoption point of view, the Megaco/H.248 approach of joint development might have at least reduced the competitive spirit in the H.323 and SIP camps.

Whatever the goals for SIP are, whatever the future for SIP deployment holds, everything in SIP screams of evolutionary design. If one looks at the design-by-layers approach, it does enable handling SIP concepts such as entity identity, transactions and dialogue in its own cute little way. However, it really doesn't help one bit when thinking about extending SIP. And to be honest, that's all that has been thought about ever since the protocol gained the proposed standard status.

The question remains whether the next proposed standard of SIP will just have some of the new features integrated into it or whether it will actually make changes to the protocol core features. There is still time for a revision prior to 3GPP deployment. Today it can be done. Tomorrow, on the other hand, is another story completely. Then again, maybe the whole signaling issue really is a small thing in the big picture, and as long as the emergency call issue is resolved, it really doesn't matter that much.
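As a concrete illustration of the SUBSCRIBE/NOTIFY framework described under B. SIP extensions, the following is a minimal in-memory Python sketch. It is not an implementation of [22]: the class `StateAgent`, its methods, the `Notify` callback type and the example URI are all hypothetical names chosen for illustration, and no SIP messages are actually built or sent. It only shows the idea that a subscriber registers for one Event Package on a resource and is notified of state changes in that package alone.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical callback type standing in for a NOTIFY request:
# (resource URI, event package name, payload).
Notify = Callable[[str, str, str], None]


@dataclass
class StateAgent:
    """Hypothetical state agent: holds state per (resource, event package)."""
    subscribers: Dict[Tuple[str, str], List[Notify]] = field(default_factory=dict)
    state: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def subscribe(self, resource: str, package: str, notify: Notify) -> None:
        """Stand-in for SUBSCRIBE: register interest in one event package."""
        key = (resource, package)
        self.subscribers.setdefault(key, []).append(notify)
        # This sketch reports the current state right away.
        notify(resource, package, self.state.get(key, "unknown"))

    def change_state(self, resource: str, package: str, payload: str) -> None:
        """Record a state change and notify subscribers of that package only."""
        key = (resource, package)
        self.state[key] = payload
        for notify in self.subscribers.get(key, []):
            notify(resource, package, payload)


received: List[Tuple[str, str, str]] = []
agent = StateAgent()
agent.subscribe("sip:alice@example.com", "presence",
                lambda r, p, body: received.append((r, p, body)))
agent.change_state("sip:alice@example.com", "presence", "open")
# A change in another event package does not reach the presence subscriber.
agent.change_state("sip:alice@example.com", "message-summary", "2 new")
print(received)
```

Keying both subscriptions and state by the (resource, package) pair mirrors the observation that Event Packages are independent of each other: nothing in the sketch lets one package's state change leak into another's notifications.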
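The option mentioned under D. PSTN integration, carrying ISUP messages as MIME data in the SIP payload, can be sketched as follows. This is a simplified illustration under stated assumptions, not the encoding mandated by any specification: the function names, the boundary string and the bare `application/ISUP` content type without parameters are assumptions, and the ISUP bytes are placeholders rather than a valid ISUP message. The sketch builds a multipart body holding an SDP part and a raw ISUP part, then recovers the ISUP bytes on the receiving side.

```python
from typing import Tuple

CRLF = b"\r\n"


def encapsulate_isup(sdp: str, isup: bytes,
                     boundary: str = "sketch-boundary") -> Tuple[str, bytes]:
    """Return (Content-Type header value, multipart body) with SDP and ISUP parts."""
    delim = b"--" + boundary.encode("ascii")
    parts = [
        delim,
        b"Content-Type: application/sdp",
        b"",
        sdp.encode("utf-8"),
        delim,
        b"Content-Type: application/ISUP",  # assumed media type, no parameters
        b"",
        isup,                               # binary ISUP carried unmodified
        delim + b"--",
        b"",
    ]
    return f'multipart/mixed; boundary="{boundary}"', CRLF.join(parts)


def extract_isup(content_type: str, body: bytes) -> bytes:
    """Recover the ISUP part; assumes the payload never contains the delimiter."""
    boundary = content_type.split('boundary="')[1].rstrip('"')
    delim = b"--" + boundary.encode("ascii")
    for part in body.split(delim):
        headers, sep, payload = part.partition(CRLF + CRLF)
        if sep and b"application/isup" in headers.lower():
            return payload[:-len(CRLF)]  # drop the CRLF before the next delimiter
    raise ValueError("no ISUP part found")


sdp_text = "v=0\r\ns=sketch\r\n"      # placeholder session description
isup_bytes = b"\x01\x00\x0a\x03"      # placeholder bytes, not real ISUP
content_type, body = encapsulate_isup(sdp_text, isup_bytes)
print(content_type)
print(extract_isup(content_type, body) == isup_bytes)
```

Because the ISUP part travels as an opaque body part, a SIP entity that does not understand it can still process the SDP part, which is the appeal of the encapsulation approach compared with defining new SIP messages.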
REFERENCES

[1] H. Schulzrinne et al., “RTP: A Transport Protocol for Real-Time Applications”, RFC 3550, July 2003
[2] H. Schulzrinne et al., “Real Time Streaming Protocol (RTSP)”, RFC 2326, April 1998
[3] R. Arango, “Media Gateway Control Protocol (MGCP) Version 1.0”, RFC 2705, October 1999
[4] F. Cuervo, “Megaco Protocol Version 1.0”, RFC 3015, November 2000
[5] Tom Taylor, “Megaco/H.248: A new standard for Media Gateway Control”, IEEE Communications, October 2000
[6] G. Huston, “Next Steps for the IP Quality of Service Architecture”, RFC 2990, November 2000
[7] IEEE Network, “Multicasting: an enabling technology”, Special Issue, January/February 2003
[8] SIP Forum, www.sipforum.org
[9] SIP Center, www.sipcenter.com
[10] M. Handley, V. Jacobson, “SDP: Session Description Protocol”, RFC 2327, April 1998
[11] J. Rosenberg, H. Schulzrinne, “An Offer/Answer Model with the Session Description Protocol (SDP)”, RFC 3264, June 2002
[12] G. Camarillo et al., “Grouping of Media Lines in the Session Description Protocol (SDP)”, RFC 3388, December 2002
[13] G. Camarillo, A. Monrad, “Mapping of Media Streams to Resource Reservation Flows”, RFC 3524, April 2003
[14] M. Handley, “SDP: Session Description Protocol”, Internet-Draft, September 4, 2003, draft-ietf-mmusic-sdp-new-14.txt
[15] M. Handley et al., “SIP: Session Initiation Protocol”, RFC 2543, June 1999
[16] S. Petrack et al., “The PINT Service Protocol: Extensions to SIP and SDP for IP Access to Telephone Call Services”, RFC 2848, June 2000
[17] J. Lennox et al., “Common Gateway Interface for SIP”, RFC 3050, January 2001
[18] G. Camarillo, Ed., W. Marshall, Ed., J. Rosenberg, “Integration of Resource Management and SIP”, RFC 3312, October 2002
[19] J. Rosenberg et al., “SIP: Session Initiation Protocol”, RFC 3261, June 2002
[20] J. Rosenberg, H. Schulzrinne, “Reliability of Provisional Responses in the Session Initiation Protocol (SIP)”, RFC 3262, June 2002
[21] J. Rosenberg, H. Schulzrinne, “SIP – locating servers”, RFC 3263, June 2002
[22] A. B. Roach, “Session Initiation Protocol (SIP)-Specific Event Notification”, RFC 3265, June 2002
[23] J. Rosenberg, “The session initiation protocol (SIP) UPDATE method”, RFC 3311, September 2002
[24] J. Peterson, “Privacy Mechanism for SIP”, RFC 3323, November 2002
[25] J. Arkko et al., “Security Mechanism Agreement for the Session Initiation Protocol”, RFC 3329, January 2003
[26] M. Day et al., “A Model for Presence and Instant Messaging”, RFC 2778, February 2000
[27] B. Campbell et al., “Session Initiation Protocol (SIP) Extension for Instant Messaging”, RFC 3428, December 2002
[28] G. Camarillo, “Compressing the Session Initiation Protocol (SIP)”, RFC 3486, February 2003
[29] H. Schulzrinne, “Requirements for Resource Priority Mechanisms for the Session Initiation Protocol (SIP)”, RFC 3487, February 2003
[30] J. Rosenberg et al., “An Extension to the Session Initiation Protocol (SIP) for Symmetric Response Routing”, RFC 3581, August 2003
[31] D. Willis, B. Campbell, “Session Initiation Protocol Extension to Assure Congestion Safety”, Internet-Draft, February 12, 2003, draft-ietf-sip-congestsafe-01.txt
[32] S. Donovan et al., “Session Timers in the Session Initiation Protocol (SIP)”, Internet-Draft, July 1, 2003, draft-ietf-sip-session-timer-11.txt
[33] S. Donovan, J. Rosenberg, “The Stream Control Transmission Protocol as a transport for the Session Initiation Protocol”, Internet-Draft, expired, 2002, draft-ietf-sip-sctp-03.txt
[34] R. Mahy, “Connection Reuse in the Session Initiation Protocol (SIP)”, Internet-Draft, August 2003, draft-ietf-sip-connect-reuse-00.txt
[35] J. Lennox et al., “CPL: A language for user control of internet telephony services”, Internet-Draft, August 2003, draft-ietf-iptel-cpl-08.txt
[36] M. Garcia-Martin, “3rd Generation Partnership Project (3GPP) Release 5 requirements for Session Initiation Protocol (SIP)”, Internet-Draft, draft-ietf-sipping-3gpp-r5-requirements-00.txt
[37] J. Rosenberg et al., “Best Current Practices for Third Party Call Control in the Session Initiation Protocol”, Internet-Draft, June 30, 2003, draft-ietf-sipping-3pcc-04.txt
[38] G. Camarillo, P. Kyzivat, “Interactions of Preconditions with Session Mobility in the Session Initiation Protocol (SIP)”, Internet-Draft, August 28, 2003, draft-camarillo-sip-rfc3312-update-00.txt
[39] Henning Schulzrinne et al., “Session Initiation Protocol: Internet-Centric Signalling”, IEEE Communications Magazine, October 2000
[40] Henning Schulzrinne, Elin Wedlund, “Application-Layer Mobility using SIP”, ACM Mobile Computing and Communications Review, Volume 4, Number 3, July 2000
[41] H. Schulzrinne, “Universal Emergency Address for SIP-based Internet Telephony”, expired Internet-Draft, February 2002, draft-schulzrinne-sipping-sos-01.txt
[42] H. Schulzrinne, “Requirements for Session Initiation Protocol (SIP)-based Emergency Calls”, expired Internet-Draft, February 21, 2003, draft-schulzrinne-sipping-emergency-req-00.txt
[43] W. Simpson, “PPP Challenge Handshake Authentication Protocol (CHAP)”, RFC 1994, August 1996
[44] Stephen Hayes, “3GPP IETF Dependencies and Priorities”, www.3gpp.org/TB/Other/IETF.htm
[45] 3GPP TS 24.228, “Signalling flows for the IP multimedia call control based on SIP and SDP” (Release 5), v.5.5.0, 2003
[46] 3GPP TS 23.228, “IP Multimedia Subsystem (IMS)” (Release 5), v.6.2.0, 2003
[47] 3GPP TS 24.229, “IP Multimedia Call Control Protocol based on SIP and SDP”, Stage 3 (Release 5), v.5.5.0, 2003
[48] 3GPP TS 23.002, “Network architecture” (Release 5), v.6.1.0, 2003
[49] Ion Stoica et al., “Chord: A Scalable Peer-to-Peer lookup service for internet applications”, SIGCOMM01, 2001
[50] Sylvia Ratnasamy et al., “A Scalable Content-Addressable Network”, SIGCOMM01, 2001
[51] OMG, Wireless CORBA adopted specification, 2003, www.omg.org/technology/documents/formal/telecom_wireless.htm
[52] List of Public SIP Servers web page, www.cs.columbia.edu/sip/servers.html
[53] GNU oSIP library, www.fsf.org/software/osip/
[54] Parlay, http://www.parlay.org/
[55] Hong Liu and Petros Mouchtaris, “Voice over IP Signalling: H.323 and beyond”, IEEE Communications, October 2000
[56] Packetizer Inc, “H.323 versus SIP: a comparison”, August 11, 2003, www.packetizer.com/H_323 versus SIP A Comparison.htm
[57] OMNCS, “GETS planning guide”, July 2003, gets.ncs.gov
[58] JSR-32 JAIN(TM) SIP API Specification, Final Release, 5.8.2003, www.jcp.org/en/jsr/detail?id=32
[59] JSR-116 SIP Servlet API, Final Release, 27.1.2003, www.jcp.org/en/jsr/detail?id=116
[60] Wikipedia encyclopedia, www.wikipedia.org/wiki/Skinny_Client_Control_Protocol
[61] Jorge Cuellar et al., “Geopriv requirements”, Internet-Draft, March 2003, draft-ietf-geopriv-reqs-03
[62] A. Vemuri, J. Peterson, “SIP-T: Context and architecture”, RFC 3372, September 2002
[63] R. Fielding et al., “Hypertext Transfer Protocol -- HTTP/1.1”, RFC 2616, June 1999
[64] J. Franks et al., “HTTP authentication: Basic and Digest Access Authentication”, RFC 2617, June 1999
[65] N. Freed, N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types”, RFC 2046, November 1996
[66] B. Ramsdell, “S/MIME Version 3 Message Specification”, RFC 2633, June 1999
[67] Charles Perkins (ed), “IP Mobility Support for IPv4”, RFC 3344, August 2002
[68] Charles Perkins, “Mobile IP”, Landmark 10 IEEE articles, 50th Anniversary Commemorative Issue, IEEE Communications Magazine, May 2002
[69] T. Dierks, C. Allen, “The TLS Protocol version 1.0”, RFC 2246, January 1999
[70] S. Blake-Wilson et al., “Transport Layer Security (TLS) Extensions”, RFC 3546, June 2003
[71] D. Crocker (ed), “Augmented BNF for Syntax Specification: ABNF”, RFC 2234, November 1997
[72] CERT, “Advisory on Smurf Attacks”, CA-1998-01, 1998, www.cert.org/advisories/CA-1998-01.html
[73] Lixia Zhang et al., “RSVP: A new Resource ReSerVation Protocol”, Landmark 10 IEEE articles, 50th Anniversary Commemorative Issue, IEEE Communications Magazine, May 2002