Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
WebRTC Integrator's Guide
WebRTC Integrator's Guide

WebRTC Integrator's Guide: Successfully build your very own scalable WebRTC infrastructure quickly and efficiently

eBook
€22.99 €32.99
Paperback
€41.99
Subscription
Free Trial
Renews at €18.99p/m

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
OR
Modal Close icon
Payment Processing...
tick Completed

Billing Address

Table of content icon View table of contents Preview book icon Preview Book

WebRTC Integrator's Guide

Chapter 1. Running WebRTC with and without SIP

WebRTC lets us make calls right from a web page without any plugin. This was made possible using media APIs of the browser to fetch user media, WebSocket for transportation, and HTML5 to render the media on the web page. Thus, WebRTC is an evolved form of WebSocket communication. WebSocket is a Transport Layer protocol that carries data. The WebSocket API is an Application Programming Interface (API) that enables web pages to use the WebSocket protocol for (duplex) communication with a remote host.

In this chapter, we will study how WebRTC really works. We will also demonstrate the use of WebRTC media APIs to capture and render input from a user's microphone and camera onto a web page. In the later part of chapter, we will find out how to build a simple standalone WebRTC client using the plain WebSocket protocol as the signaling mechanism.

JavaScript Session Establishment Protocol (JSEP)

The communication model between a client and remote host is based on the JSEP architecture, which differentiates the signaling and media transaction into different layers.

The differentiation is shown in the following figure:

JavaScript Session Establishment Protocol (JSEP)

JSEP signaling and media

As an example, let's consider two peers, A and B, where A initiates communication with B. Initially, in the first case, A being the offerer will have to call the createOffer function to begin a session. A also mentions details such as codecs through a setLocalDescription function, which sets up its local config. The remote party, B, reads the offer and stores it using the setRemoteDescription function. The remote party, B, calls the createAnswer function to generate an appropriate answer, applies it using the setLocalDescription function, and sends the answer back to the initiator over the signaling channel. When A gets the answer, it also stores it using the setRemoteDescription function, and the initial setup is complete. This is repeated for multiple offers and answers. The latest on JSEP specifications can be read from the Internet Engineering Task Force (IETF) site at http://datatracker.ietf.org/doc/draft-ietf-rtcweb-jsep/.

Signal and media flows

The differentiation between signal and media flows is an important aspect of the WebRTC call setup.

The signaling mechanism can be any among HTTP/REST, JavaScript Object Notation (JSON) via XMLHttpRequest (XHR), Session Initiation Protocol (SIP) over websockets, XMPP, or any custom or proprietary protocol. The media (audio/video) is defined through the Session Description Protocol (SDP) and flows from peer to peer.

A few instances of end-to-end signaling and media flow variants are shown in the following screenshot:

Signal and media flows

The preceding figure depicts signaling over the WebRTC API in the JSON format via XHR.

Now, the following figure depicts signaling over the WebRTC API in eXtensible Messaging and Presence Protocol (XMPP):

Signal and media flows

While it's very popular to use the WebRTC API with SIP support through JavaScript libraries such as JSSIP, SIPML5, PJSIP, and so on, these libraries cater to the SIP/IMS (IP Multimedia Subsystem) world and are not mandatory for setting up enterprise-level WebRTC Infrastructure. In fact, it is a misconception that WebRTC is coupled with SIP in itself; it isn't.

Note

IP Multimedia System (IMS) is part of the Next Generation Network (NGN) model for IP-based communication.

Running WebRTC without SIP

HTML5 websockets can be defined by ws:// followed by the URL in the server field while readying a WebRTC client for registration. This enables bidirectional, duplex communications with server-side processes, that is, server-side push events to the client. It also enables the handshake after sharing media metadata such as ports, codecs, and so on.

It should be noted that WebRTC works in an offer/answer mode and has ways of traversing the Network Address Translation (NAT) and firewalls by means of Interactive Connectivity Establishment (ICE). ICE makes use of the Session Traversal Utilities for NAT (STUN) protocol and its extension, Traversal Using Relay NAT (TURN). This is covered later in the chapter.

Sending media over WebSockets

WebRTC mainly comprises three operations: fetching user media from a camera/microphone, transmitting media over a channel, and sending messages over the channel. Now, let's take a look at the summarized description of every operation type.

getUserMedia

The JavaScript getUserMedia function (also known as MediaStream) is used to allow the web page to access users' media devices such as camera and microphone using the browser's native API, without the need of any other third-party plugins such as Adobe Flash and Microsoft Silverlight.

Tip

For simple demos of these methods, download the WebRTC read-only by executing the following command:

svn checkout http://webrtc.googlecode.com/svn/trunk/ webrtc-read-only

The following is the code to access the IP camera in the Google Chrome browser and display the local video in a <video/> element:

/*The HTML to define a button to begin the capture and HTML5 video element on web page body */
<video id="vid" autoplay="true"></video>
<button id="btn" onclick="start()">Start</button>

/*The JavaScript block contains the following function call to start the media capture using Chrome browser's getUserMedia function*/
video = document.getElementById("vid");
function start() {
  navigator.webkitGetUserMedia({video:true}, gotStream, function() {});
  btn.disabled = true;
}

/*The function to add the media stream to a video element on a page*/
  function gotStream(stream) {
  video.src = webkitURL.createObjectURL(stream);
}

Note

When the browser tries to access media devices such as a camera and mic from users, there is always a browser notification that asks for the user's permission.

Tip

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you

The following screenshot depicts the user notification for granting permission to access the camera in Google Chrome:

getUserMedia

The following screenshot depicts the user notification for granting permission to access the camera in Mozilla Firefox:

getUserMedia

The following screenshot depicts the user notification for granting permission to access the camera in Opera:

getUserMedia

RTCPeerConnection

In WebRTC, media traverses in a peer-to-peer fashion and is necessary to exchange information prior to setting up a communication path such as public IP and open ports. It is also necessary to know about the peer's codecs, their settings, bandwidth, and media types.

To make the peer connection, we will need a function to populate the values of the RTCPeerConnection, getUserMedia, attachMediaStream, and reattachMediaStream parameters. Due to the fact that the WebRTC standard is currently under development, the JavaScript API can change from one implementation to another. So, a web developer has to configure the RTCPeerConnection, getUserMedia, attachMediaStream, and reattachMediaStream variables in accordance to the browser on which we are running the HTML content.

Note

It is noted that WebRTC standards are in rapid evolution. The API that was used for the first version of WebRTC was the PeerConnection API, which had distinct methods for media transmission. As of now, the old PeerConnection API has been deprecated and a new enhanced version is under process. The new Media API has replaced the media streams handling in the old PeerConnection API.

The browser APIs of different browsers have different names. The criterion is to determine the browser on which the web page is opened and then call the appropriate function for the WebRTC operation. The identity of the browser can be determined by extracting a friendly name or checking for a match with a specific library name of the different browser. For example, when navigator.webkitGetUserMedia is true, then WebRTCDetectedBrowser = "chrome", and when navigator.mozGetUserMedia is true, then WebRTCDetectedBrowser = "firefox". The following table shows the W3C standard elements in Google Chrome and Mozilla Firefox:

W3C Standard

Chrome

Firefox

getUserMedia

webkitGetUserMedia

mozGetUserMedia

RTCPeerConnection

webkitRTCPeerConnection

mozRTCPeerConnection

RTCSessionDescription

RTCSessionDescription

mozRTCSessionDescription

RTCIceCandidate

RTCIceCandidate

mozRTCIceCandidate

Such methods also exist for Opera, which is a new addition to the WebRTC suite. Hopefully, Internet Explorer, in the future, would have native support for WebRTC standards. For other browsers such as Safari that don't support WebRTC as yet, there are temporary plugins that help capture and display the media elements, which can be used until these browsers release their own enhanced WebRTC supported versions. Creating WebRTC-compatible clients in Internet Explorer and Safari is discussed in Chapter 9, Native SIP Application and Interaction with WebRTC Clients.

The following code snippet is used to make an RTC peer connection and render videos from one HTML video frame to another on the same web page. The library file, adapter.js, is used, which renders the polyfill functionality to different browsers such as Mozilla Firefox and Google Chrome.

The HTML body content that includes two video elements for the local and remote videos, the text status area, and three buttons to start capturing, sending, and stop receiving the stream are given as follows:

<video id="vid1" autoplay="true" muted="true"></video>
<video id="vid2" autoplay></video>
<button id="btn1" onclick="start()">Start</button>
<button id="btn2" onclick="call()">Call</button>
<button id="btn3" onclick="hangup()">Hang Up</button>
<xtextarea id="ta1"></textarea>
<xtextarea id="ta2"></textarea>

The JavaScript program to transmit media from the video element to another at the click of the Start button, using the WebRTC API is given as follows:

/* setting the value of start, call and hangup to false initially*/
btn1.disabled = false;
btn2.disabled = true;
btn3.disabled = true;

/* declaration of global variables for peerconecection 1 and 2, local streams, sdp constrains */
var pc1,pc2;
var localstream;
var sdpConstraints = {'mandatory': {
                        'OfferToReceiveAudio':true,
                        'OfferToReceiveVideo':true }};

The following code snippet is the definition of the function that will get the user media for the camera and microphone input from the user:

function start() {
  btn1.disabled = true;
  getUserMedia({audio:true, video:true},
  /* get audio and video capture */
  gotStream, function() {});
}

The following code snippet is the definition of the function that will attach an input stream to the local video section and enable the call button:

function gotStream(stream){
  attachMediaStream(vid1, stream);
  localstream = stream;/* ready to call the peer*/
btn2.disabled = false;
}

The following code snippet is the function call to stream the video and audio content to the peer using RTCPeerConnection:

function call() {
  btn2.disabled = true;
  btn3.disabled = false;
  videoTracks = localstream.getVideoTracks();
  audioTracks = localstream.getAudioTracks();
  var servers = null;

  pc1 = new RTCPeerConnection(servers);/* peer1 connection to server */
  pc1.onicecandidate = iceCallback1;
  pc2 = new RTCPeerConnection(servers);/* peer2 connection to server */

  pc2.onicecandidate = iceCallback2;
  pc2.onaddstream = gotRemoteStream;
  pc1.addStream(localstream);
  pc1.createOffer(gotDescription1);
}

function gotDescription1(desc){/* getting SDP from offer by peer2 */
pc1.setLocalDescription(desc);
pc2.setRemoteDescription(desc);
pc2.createAnswer(gotDescription2, null, sdpConstraints);
}

function gotDescription2(desc){/* getting SDP from answer by peer1 */
pc2.setLocalDescription(desc);
pc1.setRemoteDescription(desc);
}

On clicking the Hang Up button, the following function closes both of the peer connections:

function hangup() {
  pc1.close();
  pc2.close();
  pc1 = null; /* peer1 connection to server closed */
  pc2 = null; /* peer2 connection to server closed */
  btn3.disabled = true; /* disables the Hang Up button */
  btn2.disabled = false; /*enables the Call button */

}

function gotRemoteStream(e){
  vid2.src = webkitURL.createObjectURL(e.stream);
}

function iceCallback1(event){
  if (event.candidate) {
    pc2.addIceCandidate(new RTCIceCandidate(event.candidate));
  }
}

function iceCallback2(event){
  if (event.candidate) {
    pc1.addIceCandidate(new RTCIceCandidate(event.candidate));
  }
}

In the preceding example, JSON/XHR (XMLHttpRequest) is the signaling mechanism. Both the peers, that is, the sender and receiver, are present on the same web page; this is represented by the two video elements shown in the following screenshot. They are currently in the noncommunicating state.

RTCPeerConnection

As soon as the Start button is hit, the user's microphone and camera begin to capture. The first peer is presented with the browser request to use their camera and microphone. After allowing the browser request, the first peer's media is successfully captured from their system and displayed on the screen. This is demonstrated in the following screenshot:

RTCPeerConnection

As soon as the user hits the Call button, the captured media stream is shared in the session with the second peer, who can view it on their own video element. The following screenshot depicts the two peers sharing a video stream:

RTCPeerConnection

The session can be discontinued by clicking on the Hang Up button.

RTCDataChannel

The DataChannel function is used to exchange text messages by creating a bidirectional data channel between two peers. The following is the code to demonstrate the working of RTCDataChannel.

The following code snippet is the HTML body of the code for the DataChannel function. It consists of a text area for the two peers to view the messages and three buttons to start the session, send the message, and stop receiving messages.

<div id="left">
<br>
<h2>Send data</h2>
<textarea id="dataChannelSend" rows="5" cols="15" disabled="true">
</textarea>
<br>
<button id="startButton" onclick="createConnection()">
  Start</button>
<button id="sendButton" onclick="sendData()">Send Data</button>
<button id="closeButton" onclick="closeDataChannels()">Stop Send Data
</button>
<br>
</div>
<div id="right">
<br>
<h2>Received Data</h2>
<textarea id="dataChannelReceive" rows="5" cols="15" disabled="true">
</textarea><br>
</div>

The style script for the text area is given as follows; to differentiate between the two peers, we place one text area aligned to the right and another to the left:

#left { position: absolute; left: 0; top: 0; width: 50%; }
#right { position: absolute; right: 0; top: 0; width: 50%; }

The JavaScript block that contains the functions to make the session and transmit the data is given as follows:

/*Declaring global parameters for both sides' peerconnection, sender, and receiver channel*/
var pc1, pc2, sendChannel, receiveChannel;

/*Only enable the Start button, keep the send data and stop send data button off*/
startButton.disabled = false;
sendButton.disabled = true;
closeButton.disabled = true;

The following code snippet is the script to create PeerConnection in Google Chrome, that is, webkitRTCPeerConnection that was seen in the previous table. It is noted that a user needs to have Google Chrome Version 25 or higher to test this code. Some old Chrome versions are also required to set the --enable-data-channels flag to the enabled state before using the DataChannel functions.

function createConnection() {
  var servers = null;
  pc1 = new webkitRTCPeerConnection(servers,{
    optional: [{RtpDataChannels: true}]});
    try {
      sendChannel = pc1.createDataChannel("sendDataChannel", {
        reliable: false});
    } catch (e) {
      alert('Failed to create data channel.' + 'You need Chrome M25 or later with --enable-data-channels flag'););
    }

  pc1.onicecandidate = iceCallback1;
  sendChannel.onopen = onSendChannelStateChange;
  sendChannel.onclose = onSendChannelStateChange;
  pc2 = new webkitRTCPeerConnection(servers,{
    optional: [{RtpDataChannels: true}]});
  
  pc2.onicecandidate = iceCallback2;
  pc2.ondatachannel = receiveChannelCallback;

  pc1.createOffer(gotDescription1);
  startButton.disabled = true; /*since session is up, disable start button */
  closeButton.disabled = false;  /*enable close button */
}

The following function is used to invoke the sendChannel.send function along with user text to send data across the data channel:

function sendData() {
  var data = document.getElementById("dataChannelSend").value;
  sendChannel.send(data);
}

The following function calls the sendChannel.close() and receiveChannel.close() functions to terminate the data channel connection:

function closeDataChannels() {
sendChannel.close();
receiveChannel.close();
pc1.close();      /* peer1 connection to server closed */
pc2.close();      /* peer2 connection to server closed */

  pc1 = null;
  pc2 = null;
startButton.disabled = false;
sendButton.disabled = true;
closeButton.disabled = true;
document.getElementById("dataChannelSend").value = "";
document.getElementById("dataChannelReceive").value = "";
document.getElementById("dataChannelSend").disabled = true;
}

Peer connection 1 sets the local description, and peer connection 2 sets the remote description from the SDP exchanged, and the answer is created:

function gotDescription1(desc) {
  pc1.setLocalDescription(desc);
  pc2.setRemoteDescription(desc);
  pc2.createAnswer(gotDescription2);
}

function gotDescription2(desc) {
  pc2.setLocalDescription(desc);
  trace('Answer from pc2 \n' + desc.sdp);
  pc1.setRemoteDescription(desc);
}

The following is the function to get the local ICE call back:

function iceCallback1(event) {
  if (event.candidate) {
    pc2.addIceCandidate(event.candidate);
  }
}

The following is the function for the remote ICE call back:

function iceCallback2(event) {
  if (event.candidate) {
    pc1.addIceCandidate(event.candidate);
  }
}

The function that receives the control when a message is passed back to the user is as follows:

function receiveChannelCallback(event) {
  receiveChannel = event.channel;
  receiveChannel.onmessage = onReceiveMessageCallback;
  receiveChannel.onopen = onReceiveChannelStateChange;
  receiveChannel.onclose = onReceiveChannelStateChange;
}

function onReceiveMessageCallback(event) {
  document.getElementById("dataChannelReceive").value = event.data;
}

function onReceiveChannelStateChange() {
  var readyState = receiveChannel.readyState;
}

function onSendChannelStateChange() {
  var readyState = sendChannel.readyState;
  if (readyState == "open") {
    document.getElementById("dataChannelSend").disabled = false;
    sendButton.disabled = false;
    closeButton.disabled = false;
  } else {
    document.getElementById("dataChannelSend").disabled = true;
    sendButton.disabled = true;
    closeButton.disabled = true;
  }
}

The following screenshot shows that Peer 1 is prepared to send text to Peer 2 using the DataChannel API of WebRTC:

RTCDataChannel

Empty text areas before beginning the exchange of text

On clicking on the Start button, as shown in the following screenshot, a session is established between the peers and the server:

RTCDataChannel

Putting in text from one's peers after hitting the Start button

As Peer 1 keys in the message and hits the Send button, the message is passed on to Peer 2. The preceding snapshot is taken before sending the message, and the following picture is taken after sending the message:

RTCDataChannel

Text is exchanged on DataChannel on the click of the Send button

However, right now, you are only sending data from one localhost to another. This is because the system doesn't know any other peer IP or port. This is where socket-based servers such as Node.js come into the picture.

Media traversal in WebRTC clients

Real-time Transport Protocol (RTP) is the way for media to flow between end points. Media could be audio and/or video based.

Note

Media stream uses SRTP and DTLS protocols.

RTP in WebRTC is by default peer-to-peer as enforced by the Interactive Connectivity Establishment (ICE) protocol candidates, which could be either STUN or TURN. ICE is required to establish that firewalls are not blocking any of the UDP or TCP ports. The peer-to-peer link is established with the help of the ICE protocol. Using the STUN and TURN protocols, ICE finds out the network architecture and provides some transport addresses (ICE candidates) on which the peer can be contacted.

An RTCPeerConnection object has an associated ICE, comprising the RTCPeerConnection signaling state, the ICE gathering state, and the ICE connection state. These are initialized when an object is first created. The flow of signals through these nodes is depicted in the following call flow diagram:

Media traversal in WebRTC clients

ICE, STUN, and TURN are defined as follows:

  • ICE: This is the framework to allow your web browser to connect with peers. ICE uses STUN or TURN to achieve this.
  • STUN: This is the protocol to discover your public address and determine any restrictions in your router that would prevent a direct connection with a peer. It presents the outside world with a public IP to the WebRTC client that can be used by the peers to communicate to it.
  • TURN: This is meant to bypass the Symmetric NAT restriction by opening a connection with a TURN server and relaying all information through this server.

Note

STUN/ICE is built-in and mandatory for WebRTC.

WebRTC through WebSocket signaling servers

Signaling is a crucial activity to establish any kind of network-based communication. It lets the endpoints share the session description and media information before setting up the path to actually exchange media. For a simple WebRTC client, there are JavaScript-based WebSocket servers that can provide such signaling in a permanent, full duplex, real-time manner. Node.js is one such server.

Node.js

Node.js is an asynchronous, server-side JavaScript server powered by Chrome's V8 JS engine. There are many WebSocket libraries, such as Socket.io and SockJS, that can run over it. Why are they used? They are used because the WebSocket server will do the WebSocket signaling between WebRTC clients and the server without using other protocols such as XMPP or SIP.

Let's see how we can use Node.js signaling server through the following simple steps:

  1. On a Windows machine, install nodejs.exe from the official download site, http://www.nodejs.org.
  2. To check whether Node.js is properly installed and working, check the version using the following command lines
    node -v
    

    The output in my case is v0.10.26.

  3. Open the command prompt, and type node <name of the JS file> in the window. Consider the following command line as an example:
    node signaler.js
    

To write and run a simple server-side program, open Notepad, make a sample JS file with a name, say, console, and add some content to the console.log('node.js running fine') file. Run this file using the following Node.js command from the command prompt:

node console.js

The following screenshot shows the output of the preceding command line:

Node.js

Let's now look at the overview of steps using Node.js to set up the signaling environment for a WebRTC client.

  1. First, we need a JavaScript library to support WebRTC signaling operations. We can use signaller.js for this. Download signaller.js from https://github.com/muaz-khan/WebRTC-Experiment/blob/master/websocket-over-nodejs/signaler.js.
  2. Next, we should run the JavaScript library using the Node.js server. We can do so by executing the following command in the terminal window:
    node signaler.js
    
  3. Specify the address of the Node.js server machine in the WebRTC client.

    Now, we can make inter-browser WebRTC audio/video calls, where the signaling is handled by the Node.js WebSocket signaling server. The following diagram depicts how Node.js is used as a signaling server:

    Node.js

The preceding diagram denotes signaling across WebRTC clients over the Node.js WebSocket-based server. The media flows from peer to peer.

Making a peer-to-peer audio call using Node.js for signaling

We have seen how a JavaScript program is hosted on a Node.js signaling server. Now, let's study the process of making an audio/video call using this setup. The following code references Muaz Khan WebRTC experiments, which is under the MIT license. The library used is PeerConnection.js. The following are the CSS descriptions for the audio and video content on a page:

audio, video {
  vertical-align: top;
}
.setup {
  border-bottom-left-radius: 0;
  border-top-left-radius: 0;
  margin-left: -9px;
  margin-top: 8px;
  position: absolute;
}
.highlight { color: rgb(0, 8, 189); }

Next, we will look at the JavaScript functions that define the behavior of the WebRTC client. This is a modified version of code from one-to-one-peerconnection.html under the WebRTC experiments master from Muaz Khan. For better clarity and easy understanding, I have removed the functions of unique ID, rotate video, and scale video, and have minimal CSS styling.

The following code defines the websocket.onopen and websocket.send operations:

var channel = location.href.replace( /\/|:|#|%|\.|\[|\]/g, '');
var websocket = new WebSocket('ws://' + document.domain + ':12034');
websocket.onopen = function() {
  websocket.push(JSON.stringify({
    open: true, channel: channel
  }));
};
websocket.push = websocket.send;
websocket.send = function(data) {
  websocket.push(JSON.stringify({
    data: data, channel: channel
  }));
};

The following code is for the creation of a new peer connection and for every user who joins a session:

  var peer = new PeerConnection(websocket);
  peer.onUserFound = function(userid) {

    if (document.getElementById(userid)) return;

    /* adding the name of room to room list */
    var tr = document.createElement('tr');
    var td1 = document.createElement('td');
    var td2 = document.createElement('td');
    td1.innerHTML = userid + ' video call';

    /* creating element button to room list */
    var button = document.createElement('button');
    button.innerHTML = 'Join';
    button.id = userid;
    button.style.float = 'right';

    /* add the user to session on button click */
    button.onclick = function() {
      button = this;
      getUserMedia(function(stream) {
      // get user media
      peer.addStream(stream);
      // add the stream

      peer.sendParticipationRequest(button.id);
      });
      button.disabled = true;
    };

    td2.appendChild(button);
    tr.appendChild(td1);
    tr.appendChild(td2);
    roomsList.appendChild(tr);
  };

The following code adds streaming to the video element of HTML and sets its characteristics:

    peer.onStreamAdded = function(e) {
      if (e.type == 'local')
        document.querySelector('#start-broadcasting').disabled = false;
      var video = e.mediaElement;
      video.setAttribute('width', 400);
      video.setAttribute('height', 400);
      video.setAttribute('controls', true);
      videosContainer.insertBefore(video, videosContainer.firstChild);
      video.play();
    };

The following code is to close the streaming session:

    peer.onStreamEnded = function(e) {
    var video = e.mediaElement;
    if (video) {
      video.style.opacity = 0;
      setTimeout(function() {
        video.parentNode.removeChild(video);
        scaleVideos();
      }, 1000);
    }
  };

  document.querySelector('#start-broadcasting').onclick = function() {
    this.disabled = true;
    getUserMedia(function(stream) {
    peer.addStream(stream);
    peer.startBroadcasting();
    });
  };
  document.querySelector('#your-name').onchange = function() {
    peer.userid = this.value;
  };

  var videosContainer = document.getElementById('videos-container') || document.body;
  var btnSetupNewRoom = document.getElementById('setup-new-room');
  var roomsList = document.getElementById('rooms-list');

  if (btnSetupNewRoom) btnSetupNewRoom.onclick = setupNewRoomButtonClickHandler;

The following code is to capture the user media:

  function getUserMedia(callback) {
    var hints = {
      audio: true,
      video: {
        optional: [],
        mandatory: {
          minWidth: 200, minHeight:200, maxWidth: 400, maxHeight: 400, minAspectRatio: 1.77
        }
      }
    };
    navigator.getUserMedia(hints, function(stream) {
      var video = document.createElement('video');
      video.src = URL.createObjectURL(stream);
      video.controls = true;
      video.muted = true;
      peer.onStreamAdded({
        mediaElement: video,
        userid: 'self',
        stream: stream
      });
      callback(stream);
  });
}

The following is the web page's HTML content to add a button to start transmitting the media, a video element to display the media, a text field to add a user's name, and a table to list the existing available sessions:

<input type="text" id="your-name" placeholder="your-name">
<button id="start-broadcasting" class="setup">
Start Transmitting Yourself!</button>

<!-- list of all available conferencing rooms -->
<table id="rooms-list" style="width: 100%;"></table>

<!-- local/remote videos container -->
<div id="videos-container"></div>

The following screenshot depicts a user, Alice, creating a new session named alice. Here, the user Alice creates a session for broadcasting video, which will be added to the room list.

Making a peer-to-peer audio call using Node.js for signaling

Alice's media is streamed on the session space, as shown in the next screenshot:

Making a peer-to-peer audio call using Node.js for signaling

A new user, Bob, views the list of ongoing sessions from his remote computer, and clicks on the Join button, as shown in the following screenshot, to join Alice's session:

Making a peer-to-peer audio call using Node.js for signaling

The following screenshot displays a two-way audio and video session in progress between Bob and Alice:

Making a peer-to-peer audio call using Node.js for signaling

Bob and Alice are in an audio/video sharing session. Using other WebRTC APIs, we can also add file sharing, text chat, screen sharing capabilities, and so on to this simple demonstration to turn it into a multifeatured communication tool.

Running WebRTC with SIP

This section introduces the approach to use the SIP signaling mechanism with WebRTC. Like any other VoIP protocol, SIP also provides the signaling framework before setting up an actual media path. However, the foundation of open standard and industry-adopted signaling protocol such as SIP is recommended, as it provides the first and most crucial step to a strong, scalable architecture.

Session Initiation Protocol (SIP)

As we already know, SIP is a signaling protocol that is used to establish an RTP between two endpoints.

Note

As per the official document, RFC 3261, SIP is an application-layer control protocol that can establish, modify, and terminate multimedia sessions (conferences) such as Internet telephony calls.

The SIP stack defines the Request and Response methods. These methods are used to gather the information about endpoints that wish to participate in a communication so that the device-specific information such as IP, port, availability, media understanding, and audio-video device compatibility can be sorted out before establishing a flowing media connection.

However, it should be noted that traditional SIP is a bit different from SIP over WebSocket (SIPWS), which is used in case of WebRTC with SIP signaling. It is not by default that every SIP server would understand SIPWS. Only those SIP servers that have WebSocket support, or state that they are WebRTC compliant, will be able to proxy or understand the SIP messages sent from a WebRTC client.

Why do we use SIPWS? This protocol allows the development of Convergent applications, that is, applications that support SIP for communication, HTTP for web components, and WebRTC for media. SIPWS can be transformed into plain SIP signal through a gateway, which can then interact with the IMS network. Also, SIP can be used to integrate application logic such as call screening and call rerouting, with the help of SIP Servlets or other kinds of SIP programming. More of this is given in Chapter 3, WebRTC with SIP and IMS.

SIPWS is explained in detail in the IETF draft, The WebSocket Protocol as a Transport for the Session Initiation Protocol (SIP) draft-ietf-sipcore-sip-websocket-10 and can be found at http://tools.ietf.org/html/draft-ietf-sipcore-sip-websocket-10.

The following figure depicts the use of SIPWS signaling plane with WebRTC media plane:

Session Initiation Protocol (SIP)

The following figure provides the call flow of the SIPWS signaling mechanism. Any SIP request is preceded by a one-time WebSocket handshake.

Session Initiation Protocol (SIP)

Alice loads a web page using her web browser and retrieves the JavaScript code that implements the SIP WebSocket subprotocol. The JavaScript code (SIP WebSocket Client) establishes a secure WebSocket connection with a SIP proxy/registrar (SIP WebSocket Server) at proxy.example.com.

The following is an example of a WebSocket handshake in which the Client requests the WebSocket SIP subprotocol support from the Server:

ws://ns313841.ovh.net:10060/
Request Method:
GET
Status Code:
101 Switching Protocols

Request Headers:
Provisional headers are shown.
Cache-Control:no-cache
Connection:Upgrade
Host:ns313841.ovh.net:10060
Origin:http://sipml5.org
Pragma:no-cache
Sec-WebSocket-Extensions:permessage-deflate; client_max_window_bits, x-webkit-deflate-frame
Sec-WebSocket-Key:4aUpDOwtSWPaLmXKzQefJQ==
Sec-WebSocket-Protocol:sip
Sec-WebSocket-Version:13
Upgrade:websocket
User-Agent:Mozilla/5.0 (X11; Linux i686 (x86_64)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.102 Safari/537.36
Response Headersview source
Connection:Upgrade
Content-Length:0
Sec-WebSocket-Accept:5l6iqk2+moekkwZsqlXo4cewzcw=
Sec-WebSocket-Protocol:sip
Sec-WebSocket-Version:13
Upgrade:websocket

The following diagram shows the call between Alice and Bob through the SIP proxy server over WebSocket signaling:

Session Initiation Protocol (SIP)

Every SIP endpoint is registered with the SIP Server by a unique callable ID. This is referred to as the SIP URI and is denoted by the sip:<username>@<domainname> format. When a user, Alice, calls another user, Bob, through Bob's SIP URI, then the SIP WebSocket Server at proxy.example.com acts as a SIP proxy node and routes the INVITE call to Bob's contact. Bob answers the call to start a conversation and then terminates it with a BYE request when the communication is over.

JavaScript-based SIP libraries

There are many popular JavaScript libraries that offer easy-to-integrate support for WebRTC communication using SIP signaling:

  • SIPJS: This is an SIP stack in JavaScript to implement SIP-based audio and video user agents in the browser. You can find a running demo at http://theintencity.com/sip-js/phone.html?network_type=WebRTC. The demo application has the option to switch between WebRTC capabilities and Flash for browsers that support and do not support WebRTC.
  • JSSIP: This is an SIP over WebSocket transport API for audio/video calls and instant messaging. It works with all SIPWS-compatible SIP servers such as OverSIP, Kamailio, and Asterisk servers. You can find a running demo at http://tryit.jssip.net/.
  • sipML5: This is an open source JavaScript library with a provision for RTCWeb Breaker (audio and video transcoding when the endpoints do not support the same codecs or the remote server is not RTCWeb compliant). For example, features such as audio/video call, instant messaging, presence, call hold/resume, explicit call transfer, and Dual-tone multi-frequency (DTMF) signaling using SIP INFO are present. You can find a running example at http://sipml5.org/call.htm.
  • QuoffeSIP: This is another WebRTC SIP library to establish real-time communication between browsers. This is developed in CoffeeScript (simple syntax). It features video/audio call capabilities using SIP over the Websockets protocol and also uses the SIP Outbound and GRUU protocols. You can find a running example athttp://talksetup.quobis.com/.

The implementation of the sipML5 and JSSIP libraries to constitute a simple WebRTC browser client that is able to communicate to a similar peer in any WebRTC-supported browser is covered in the next chapter.

Summary

In this chapter, we learned that a WebRTC communication process is divided into two parts: signaling, where the session setup and teardown is agreed to, and media transactions, which deals with the actual RTP streams that contain voice/video/data that the user has sent. We saw how to program the three basic APIs of WebRTC media stack, namely, getUserMedia, RTCPeerConnection, and DataChannel. The Running WebRTC without SIP section described signaling done over JSON via XMLHttpRequest using Node.js as the intermediately signaling server to connect the peers and prepare for the media flow. The next section, Running WebRTC with SIP, listed the libraries or WebRTC clients that use SIP over WebSocket to take care of the signaling between WebRTC peers. In the following chapters, we will see how to use WebRTC media APIs over the SIP WebSocket protocol in detail.

Left arrow icon Right arrow icon

Description

This book is for programmers who want to learn about real-time communication and utilize the full potential of WebRTC. It is assumed that you have working knowledge of setting up a basic telecom infrastructure as well as basic programming and scripting knowledge.

What you will learn

  • Understand the purpose of Media APIs in the WebRTC media stack
  • Discover more about WebRTC and next generation communication networks
  • Learn how to run simple WebRTC clients with the default peertopeer behavior
  • Run WebRTC calls over a WebSocket protocol using a WebSocket signaling server
  • Integrate WebRTC with old networks
  • Learn best practices to build up a dynamic web project with support for WebRTC calls
  • Explore the usefulness of the interconversion of protocols and the transcoding of codecs with WebRTC

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Oct 31, 2014
Length: 382 pages
Edition : 1st
Language : English
ISBN-13 : 9781783981274
Tools :

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
OR
Modal Close icon
Payment Processing...
tick Completed

Billing Address

Product Details

Publication date : Oct 31, 2014
Length: 382 pages
Edition : 1st
Language : English
ISBN-13 : 9781783981274
Tools :

Packt Subscriptions

See our plans and pricing
Modal Close icon
€18.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
€189.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
Feature tick icon Exclusive print discounts
€264.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total 102.97
WebRTC Blueprints
€35.99
WebRTC Integrator's Guide
€41.99
Learning WebRTC
€24.99
Total 102.97 Stars icon

Table of Contents

11 Chapters
1. Running WebRTC with and without SIP Chevron down icon Chevron up icon
2. Making a Standalone WebRTC Communication Client Chevron down icon Chevron up icon
3. WebRTC with SIP and IMS Chevron down icon Chevron up icon
4. WebRTC Integration with Intelligent Network Chevron down icon Chevron up icon
5. WebRTC Integration with PSTN Chevron down icon Chevron up icon
6. Basic Features of WebRTC over SIP Chevron down icon Chevron up icon
7. WebRTC with Industry Standard Frameworks Chevron down icon Chevron up icon
8. WebRTC and Rich Communication Services Chevron down icon Chevron up icon
9. Native SIP Application and Interaction with WebRTC Clients Chevron down icon Chevron up icon
10. Other WebRTC Use Cases Chevron down icon Chevron up icon
Index Chevron down icon Chevron up icon

Customer reviews

Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.8
(4 Ratings)
5 star 75%
4 star 25%
3 star 0%
2 star 0%
1 star 0%
Yahamid Dec 31, 2014
Full star icon Full star icon Full star icon Full star icon Full star icon 5
I chose this book after going thoroughly through its ToC and comparing it with many other books on WebRTC. This book is quite comprehensive and covers a lot more topics in quite less pages. The good point is that it starts from basics and goes until PSTN, etc. (although I did not read that stuff). I think it makes a good read. You must know programming very well, though.
Amazon Verified review Amazon
udit mital Jul 09, 2016
Full star icon Full star icon Full star icon Full star icon Full star icon 5
this book is good and filled with easy to do setups ans excercizes to get stated with webrtc for setting my inoffice communication. saved the cost of setting physical phones and vpn etc
Amazon Verified review Amazon
Tsahi Levent-Levi Feb 25, 2015
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This book is suitable for those who are looking to integrate WebRTC into a SIP infrastructure, be it native SIP, IMS, RTC, VoLTE or any other of the VoIP curses that use SIP as their signaling protocol.Altanai does a great job at covering many of the aspects associated with this fusion of WebRTC and SIP, taking the time to go over a rich set of use cases.
Amazon Verified review Amazon
rahul jain Jan 05, 2015
Full star icon Full star icon Full star icon Full star icon Empty star icon 4
I bought this book after a good research on the list of books available on this topic. It really helps me a lot in developing my VOIP tool. Has in depth information.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

How do I buy and download an eBook? Chevron down icon Chevron up icon

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else
How can I make a purchase on your website? Chevron down icon Chevron up icon

If you want to purchase a video course, eBook or Bundle (Print+eBook) please follow below steps:

  1. Register on our website using your email address and the password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Cart, or PayPal)
Where can I access support around an eBook? Chevron down icon Chevron up icon
  • If you experience a problem with using or installing Adobe Reader, the contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us
What eBook formats do Packt support? Chevron down icon Chevron up icon

Our eBooks are currently available in a variety of formats such as PDF and ePubs. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks? Chevron down icon Chevron up icon
  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower price than print
  • They save resources and space
What is an eBook? Chevron down icon Chevron up icon

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply login to your account and click on the link in Your Download Area. We recommend you saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.