This chapter provides technical background information about the protocols and components used in IP telephony. It introduces the relevant component types, describes H.323, SIP and RTP in detail, and gives some information about Media Gateway Control and vendor-specific protocols.
An IP telephony infrastructure usually consists of different types of components. This section gives an overview of typical components without describing them in a protocol-specific context.
A terminal is a communication endpoint that terminates calls and their media streams. Most commonly this is a hardware or software telephone or videophone, possibly enhanced with data capabilities.
Some terminals are intended for user interaction, while others are automated, such as answering machines.
An IP telephony terminal is reachable at one or more IP addresses. Multiple terminals may well share the same IP address, but they are treated independently.
In most cases a terminal has been assigned one or more addresses (see Section 2.1.5), which others use to call it. If IP telephony servers are used, a terminal registers these addresses with its server.
Placing an IP telephony call requires at least two terminals, and the caller must know the IP address and port number of the terminal to call. Obviously, forcing the user to remember and use IP addresses for placing calls is not ideal, and dynamic IP addressing schemes (DHCP) make this requirement even more intolerable.
As mentioned before, terminals usually register their addresses with a server. The server stores these telephone addresses along with the IP addresses of the respective terminals, and is thus able to map a telephone address to a host.
When a telephone user dials an address, the server tries to resolve the given address into a network address. To do so, the server may interact with other telephony servers or services. It may also provide further call routing mechanisms such as CPL (Call Processing Language) scripts or skill-based routing (e.g. routing calls for "WWW-Support" to a list of persons who are tagged as responsible for this subject).
Finally, a telephony server is responsible for authenticating registrations, authorizing callers and performing the accounting.
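The registration and resolution steps described above can be sketched as a simple lookup table. This is only a minimal illustration, not the behavior of any particular H.323 or SIP server: the class name `Registrar`, its methods and the example addresses are hypothetical, and authentication, authorization, accounting and expiry of registrations are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# A contact is the network location of a terminal: (IP address, port).
Contact = Tuple[str, int]


@dataclass
class Registrar:
    """Hypothetical in-memory registration table of a telephony server."""

    bindings: Dict[str, Contact] = field(default_factory=dict)

    def register(self, address: str, ip: str, port: int) -> None:
        # A new registration for the same telephone address replaces the
        # old one, e.g. after the terminal received a new address via DHCP.
        self.bindings[address] = (ip, port)

    def resolve(self, address: str) -> Optional[Contact]:
        # Map a dialed telephone address to a host, or None if unknown.
        return self.bindings.get(address)


registrar = Registrar()
registrar.register("alice@example.org", "192.0.2.10", 5060)
registrar.register("alice@example.org", "192.0.2.77", 5060)  # re-registration

print(registrar.resolve("alice@example.org"))
print(registrar.resolve("bob@example.org"))
```

The caller only ever needs the telephone address. The server hides the fact that the terminal's IP address changed between the two registrations.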
Gateways are telephony endpoints that facilitate calls between endpoints that usually would not interoperate. Usually this means that a gateway translates one signaling protocol into another (e.g. SIP/ISDN signaling gateways), but translating between different network addresses (IPv4/IPv6) or codecs (media gateways) can be considered gatewaying as well. It is of course possible for a single gateway to combine several of these functions.
Finding gateways between VoIP and a traditional PBX is usually quite simple. Gateways that translate between different VoIP protocols are harder to find, and most of them are limited to basic call functionality.
Conference bridges provide a means of holding three-party or multipoint conferences that can be either ad hoc or scheduled. Because of the high resource requirements, conference bridges are usually dedicated servers with special media hardware.
A user who wants to use a communication service needs an identifier to describe himself and the called party. Ideally, such an identifier should be independent of the user's physical location. The network should then be responsible for finding the current location of the called party. A user may choose to be reachable under multiple contact addresses.
Regular telephony systems use E.164 numbers, defined by the international public telecommunication numbering plan. An identifier is composed of up to 15 digits with a leading plus sign, for example +1234565789123. The plus sign is followed by a country code and a subscriber number. When dialing, the leading plus is normally replaced by the international access code, usually double zero (00).
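As a small illustration of the E.164 format, the following sketch checks the "plus sign followed by up to 15 digits" rule and replaces the plus with an access code for dialing. The function names are hypothetical, the default access code `00` is just the common case mentioned in the text, and the check does not validate country codes.

```python
import re

# A plus sign followed by 1 to 15 digits (format check only; country codes
# and national numbering rules are not validated).
E164_PATTERN = re.compile(r"\+[0-9]{1,15}")


def is_e164(number: str) -> bool:
    """Return True if the string has the basic E.164 shape."""
    return E164_PATTERN.fullmatch(number) is not None


def to_dial_string(number: str, access_code: str = "00") -> str:
    """Replace the leading plus with an international access code."""
    if not is_e164(number):
        raise ValueError(f"not an E.164 number: {number!r}")
    return access_code + number[1:]


print(is_e164("+1234565789123"))         # True
print(to_dial_string("+1234565789123"))  # 001234565789123
```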
The first IP telephony systems used the IP addresses of end-point devices as user identifiers, and these are sometimes still used today. However, IP addresses are not location independent (even if IPv6 is used) and are hard to remember (especially if IPv6 is used), and are therefore not suitable as user identifiers.
Current IP telephony systems use two kinds of identifiers: URIs and telephone numbers.
A Uniform Resource Identifier (URI) uses a registered naming space to describe a resource in a location-independent way. Resources are available under a variety of naming schemes and access methods, including e-mail addresses (mailto), SIP identifiers (sip), H.323 identifiers (h323, RFC3508) or telephone numbers (draft-ietf-iptel-rfc2806bis-02). E-mail-like identifiers have several advantages: they are easy to remember, nearly every Internet user already has an e-mail address, and a new service can be added using the same identifier. The user's location can be found using the Domain Name System (DNS). The disadvantage of URIs is that they are difficult or impossible to dial on some user devices (phones).
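To illustrate how one e-mail-like identifier can serve several services, the sketch below splits such a URI into scheme, user and domain. It is a deliberately simplified parser for the `scheme:user@domain` shape only: it ignores ports, parameters and other syntax allowed by the individual URI schemes, and the names `Identifier` and `parse_identifier` as well as the example addresses are hypothetical.

```python
from typing import NamedTuple


class Identifier(NamedTuple):
    scheme: str  # e.g. "sip", "h323", "mailto"
    user: str
    domain: str  # the part that would be looked up in DNS


def parse_identifier(uri: str) -> Identifier:
    """Split a simple 'scheme:user@domain' URI into its three parts."""
    scheme, colon, rest = uri.partition(":")
    if not colon or not scheme:
        raise ValueError(f"missing scheme in {uri!r}")
    user, at, domain = rest.partition("@")
    if not at or not user or not domain:
        raise ValueError(f"expected user@domain in {uri!r}")
    return Identifier(scheme.lower(), user, domain.lower())


for uri in ("mailto:alice@example.org",
            "sip:alice@example.org",
            "h323:alice@example.org"):
    print(parse_identifier(uri))
```

Only the scheme differs between the three results. The user and domain parts stay the same, which is what allows a new service to be added under an existing identifier.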
If we want to integrate a regular telephony system with IP telephony, we must deal with phone number identifiers even on the IP telephony side. Phone numbers are not well suited to the Internet world, which relies on domain names. Therefore, the ENUM system was invented, which uses adapted phone numbers as domain names. We will describe ENUM in Chapter 7.
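As a brief preview of what "adapted phone numbers" means, ENUM reverses the digits of an E.164 number, separates them with dots and appends the suffix `e164.arpa`. The following sketch shows only this name construction, with a hypothetical function name. The DNS lookup itself and the interpretation of the returned records are not covered here.

```python
def enum_domain(number: str, suffix: str = "e164.arpa") -> str:
    """Build the ENUM domain name for an E.164 number such as '+4312345'."""
    # Keep only the digits; the plus sign and any separators are dropped.
    digits = [ch for ch in number if ch in "0123456789"]
    if not digits:
        raise ValueError(f"no digits in {number!r}")
    # Reverse the digit order so that the country code ends up nearest the
    # suffix, matching the hierarchical structure of DNS names.
    return ".".join(reversed(digits)) + "." + suffix


print(enum_domain("+1234565789123"))
# 3.2.1.9.8.7.5.6.5.4.3.2.1.e164.arpa
```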