Endpoint modules are critically important and add some of the key features that make FreeSWITCH the powerful platform it is. The primary role of endpoint modules is to take certain common communication technologies and normalize them into a common abstract entity, which we refer to as a session. A session represents a connection between FreeSWITCH and a particular protocol. There are several Endpoint modules that come with FreeSWITCH, which implement several protocols such as SIP, H.323, Jingle, Verto, WebRTC, and some others. We will spend some time examining one of the more popular modules named mod_sofia.
Sofia-SIP (http://sofia-sip.sourceforge.net) is an open source project originally developed by Nokia, which provides a programming interface for the Session Initiation Protocol (SIP). FreeSWITCH developers have worked a lot evolving and fixing its original code base, further enhancing its robustness and features. We use our version of this library in FreeSWITCH (/usr/src/freeswitch/libs/sofia-sip) in a module named mod_sofia. This module registers to all the hooks in FreeSWITCH necessary to make an Endpoint module, and translates the native FreeSWITCH constructs into Sofia SIP constructs and vice versa. Configuration information is taken from FreeSWITCH configuration files, which makes for mod_sofia to load user-defined preferences and connection details. This allows FreeSWITCH to accept registrations from SIP phones and devices, register itself to other SIP servers such as service providers, send notifications, and provide services to the phones such as blinking lights and voicemail.
When a SIP audio/video call is established between FreeSWITCH and another SIP device, it will show up in FreeSWITCH as an active session. If the call is inbound, it can be transferred or bridged to IVR menus, hold music (and/or video), or to one or more extensions. Or, it can be bridged to a newly created outbound call, toward a PSTN subscriber, or a WebRTC browser, for example. Let's examine a typical scenario where an internal SIP phone registered as extension 2000 dials extension 2001 with the hope of establishing a call.
First, the SIP phone sends a call setup message to FreeSWITCH over the network (mod_sofia is listening for such messages). After receiving the message, mod_sofia in turn parses the relevant details, builds the abstract call session data structure the core understands, and passes that call into the core state machine in FreeSWITCH. The state machine (in the FreeSWITCH core) then puts the call into the ROUTING state (meaning looking for a destination).
Core's next step is to locate the Dialplan module based on the configuration data for the calling Endpoint. The default and most widely used Dialplan module is the XML Dialplan module. This module is designed to look up a list of instructions from the XML tree within FreeSWITCH's memory. The XML Dialplan module will parse a series of XML objects using regular expression pattern-matching.
As we are trying to call 2001, we are looking around for an XML extension where the destination_number field matches 2001. Also, let's remember the Dialplan is not limited to matching only a single extension. An incoming call can match more than one extension in the Dialplan, and in Chapter 6, XML Dialplan, you will get an expanded definition of the term extension. The XML Dialplan module builds a TODO list for the call. Each matched extension will have its actions added to the call's TODO list.
Assuming FreeSWITCH finds at least one extension with a matching condition, the XML Dialplan will insert instructions into the session object with the information it needs to try and connect the call to 2001 (the TODO list for this call). Once these instructions are in place, the state of the call session changes from ROUTING to EXECUTE, where the core drills down the list and executes the instructions piled during the ROUTING state. This is where the API comes into the picture.
Each instruction is added to the session in the form of an application name and a data argument that will be passed to that application. The one we will use in this example is the bridge application. The purpose of this application is to create another session with an outbound connection, then connect the two sessions for direct audio exchange. The argument we will supply to bridge will be user/2001, which is the easiest way to generate a call to an internal registered phone at extension 2001. A Dialplan entry for 2001 might look like this:
<extension name="example">
<condition field="destination_number"
expression="^2001$">
<action application="bridge" data="user/2001"/>
</condition>
</extension>
This extension is named example, and it has a single condition to match. If that condition is matched, it has a single application to execute. It can be understood as: If the caller dialed 2001, then establish a connection between the caller and the endpoint (that is, the phone) at 2001.
Once we have inserted the instructions into the TODO list for the session, the session's state will change to EXECUTE, and the FreeSWITCH core will start to use the data collected to perform the desired action(s). First, it will parse the list and find it must execute bridge on user/2001, then it will look up the bridge application and pass the user/2001 data to it. This will cause the FreeSWITCH core to create a new outbound session (that is, a call) of the desired type. Let's assume that user 2001 is registered to FreeSWITCH using a SIP phone. So user/2001 will resolve into a SIP dialstring, which will be passed to mod_sofia to ask it to create a new outbound session (toward user 2001's SIP phone).
If the setup for that new session is successful, there will be two sessions in the FreeSWITCH core, the new session and the original session (from the caller phone). The bridge application will take those two sessions and call the bridge function on it. This will make the audio and/or the video stream to flow in both directions once the person at extension 2001 answers the phone. If that user is unable to answer or is busy, a timeout (that is, a failure) would occur and a failure message will be sent back to the caller's phone. If a call is unanswered or an extension is busy, many reactions can be built in the Dialplan, including call forwarding or voicemail.
FreeSWITCH takes all the complexity of SIP and reduces it to a common (internal to its core) denominator. Then it reduces the complexity further by allowing us to use a single instruction in the Dialplan for connecting the phone at 2000 to the phone at 2001. If we also want to allow the phone at 2001 to be able to call the phone at 2000, we can add another entry in the Dialplan going the other way:
<extension name="example 2">
<condition field="destination_number" expression="^2000$">
<action application="bridge" data="user/2000"/>
</condition>
</extension>
In this scenario, the Endpoint module (mod_sofia) turned incoming SIP call into a FreeSWITCH session and the Dialplan module (mod_dialplan_xml) turned XML into an extension. The bridge application (from mod_dptools) turned into a simple application/data pair the complex code of creating an outbound call and connecting its media streams. Both the Dialplan module and the application module interface are designed around FreeSWITCH sessions. Not only does the abstraction make life easier for us at the user level, it also simplifies the design of the application and the Dialplan because they can be made agnostic of the actual endpoint technology involved in the call. It is because of this abstraction that when tomorrow we will write a new Endpoint module to the 3D holographic communication network we will be able to reuse all the same applications and Dialplan modules. To the FreeSWITCH core, and to all of the FreeSWITCH modules, an holographic 3D call will then appear as just one more standard abstract session.
It is possible that you may want to work with some specific data provided by the Endpoint's native protocol. In SIP, for instance, there are several arbitrary headers as well as several other bits of interesting data inside the SIP packets. You may need or want to access those specific bits of information. We solve this problem by adding variables to the channel (that is, the part of the session structure that interact with the endpoint). Using channel variables, mod_sofia can create these arbitrary values as they are encountered in the SIP data, and you can retrieve them from the channel by variable name, in your Dialplan or application. Those were originally specific (and arbitrary) parts of the low-level SIP messages. However, the FreeSWITCH core just sees them as arbitrary channel variables that the core can ignore. There are also several special reserved channel variables that can influence the behavior of FreeSWITCH in many useful ways. If you have ever used a scripting language or configuration engine that uses variables (sometimes called attribute-value pairs or AVP, channel variables are pretty much the same concept. There is simply a name and a value that are passed to the channel, and the data is set.
The application interface for this, the set application, lets you set your own variables from the Dialplan:
<extension name="example 3">
<condition field="destination_number" expression="^2000$">
<action application="set" data="foo=bar"/>
<action application="bridge" data="user/2000"/>
</condition>
</extension>
This example is almost identical to the previous example, but instead of just placing the call, we first set the variable foo equal to the value bar. This variable will remain set throughout the call and can even be referenced at the end of the call in the detail logs (CDRs).
The more we build things by small pieces, the more the same underlying resources can be reused, making the whole system simpler to use. For example, the codec interface knows nothing else about the core, other than its own isolated world of encoding and decoding audio and video packets. Once a proper codec module has been written, it becomes usable by any Endpoint interface capable of carrying that codec in one of its media streams. This means that if we get a text-to-speech module working, we can generate synthesized speech on any and all Endpoints that FreeSWITCH supports. The TTS module becomes more useful because it can use more codecs; the codecs have become more useful because we added a new function that can take advantage of them. The same idea applies to applications. If we write a new application module, the existing endpoints will immediately be able to run and use that application.