Telestax Blog

Using Wireshark RTP captures for Mobicents Media Server testing

The Mobicents Media Server has been coming along nicely for the past year or so. We started building it almost five years ago. Admittedly, real-time media processing turned out to be a harder problem than we anticipated. About a year ago we announced full compliance with JSR 309 (Java API for Media Server Control). That was a major milestone which set a quality baseline that we have been adding to ever since. Many folks from the Mobicents core team and the broader Mobicents community pitched in, most notably Oleg Kulikoff, who was in charge of MMS 1.x and 2.x, and Yulian Oifa, who took over the media server project leadership near the end of the JSR 309 certification and is still the acting project lead.

As the project matured we naturally saw more deployments in real-world scenarios. With that came a whole set of interesting problems to solve. I will focus on one of them here. We added the solution that came out of it to our ever-growing automated regression test suite. Maybe our experience will inspire you to contribute to the open discussion of testing real-time media processing code.

We have a customer who has been building a critical life-saving call system based on RestcommONE, SIP Servlets and Media Server. If RestcommONE is a new term for you, it is just the brand under which we sell commercial quality versions of Mobicents, similar to the way Red Hat sells RHEL around the community Fedora project.

The customer in question has a legacy telephony system that is a mix of proprietary hardware and software plugged into the PSTN (T1 lines). This system was reaching its limits and was due for an upgrade or replacement. The company chose to replace it with a new VoIP system using SIP gateways to the PSTN. Enter RestcommONE.

Writing the integration code between the backend operational systems and RestcommONE was a relatively straightforward exercise. The real pain did not commence until we started testing against a legacy auto-dialer system that was connected to the PSTN. The traffic to RestcommONE went through the PSTN, then to a SIP gateway service provider and then to RestcommONE.

We had quite a bit of fun testing with different SIP gateway providers and observing how each introduced a different range of issues related to translating media from legacy signal to RTP packets. One of these issues was related to touch tones.

When a user presses a button on their dial pad you would think that a simple deterministic event travels through the network all the way to the recipient. Well, that is the case with relatively modern equipment. However, much of the telephony infrastructure out there was deployed 20-30 years ago or even earlier. As a result it is quite typical for touch tones to be converted at some point to an audio signal and merged with voice on their way to the receiving end. Much of the old equipment was designed for rotary phones. It was not designed to handle touch tones entered in the middle of a call. Believe it or not, there are still many rotary phones out there. So it's not a simple matter all around.

Touch tones traveling through the PSTN are encoded in-band as pairs of audio tones, with frequencies chosen so that they rarely occur together in normal speech. That works most of the time. Sometimes side noise, high-pitched voices and other artifacts appear in the voice channel, which makes it hard to tell touch tones apart from everything else. So what happens when PSTN lines connect to a VoIP (SIP+RTP) gateway? At this point we would like to think that we are transitioning to a modern world and touch tones (also known as DTMF, for dual-tone multi-frequency) become discrete events. There are two popular standards for that: SIP INFO (IETF RFC 2976) and RTP out-of-band event packets (IETF RFC 2833, since obsoleted by RFC 4733). The latter, RTP out-of-band (OOB) events, is more widely used, so we will focus on it.
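To make the "discrete event" idea concrete, here is a minimal sketch of decoding the 4-byte telephone-event payload laid out in RFC 4733. The class name TelephoneEvent and its methods are made up for this post. They are not part of the Mobicents Media Server API, and the sketch only covers the DTMF event codes 0 to 15.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper that decodes an RFC 4733 telephone-event payload. */
final class TelephoneEvent {
    final int event;     // 0-9 are digits, 10 is '*', 11 is '#', 12-15 are A-D
    final boolean end;   // E bit: set on the packets that close the event
    final int volume;    // tone power in -dBm0, range 0..63
    final int duration;  // in RTP timestamp units

    private TelephoneEvent(int event, boolean end, int volume, int duration) {
        this.event = event;
        this.end = end;
        this.volume = volume;
        this.duration = duration;
    }

    /** Parses the RTP payload (the bytes after the RTP header). */
    static TelephoneEvent parse(byte[] payload) {
        if (payload.length < 4) {
            throw new IllegalArgumentException("telephone-event payload needs 4 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(payload); // network byte order by default
        int event = buf.get() & 0xFF;
        int flags = buf.get() & 0xFF;              // E(1) R(1) volume(6)
        boolean end = (flags & 0x80) != 0;
        int volume = flags & 0x3F;
        int duration = buf.getShort() & 0xFFFF;
        return new TelephoneEvent(event, end, volume, duration);
    }

    /** Maps the event code to its dial pad symbol, or "?" for non-DTMF events. */
    String tone() {
        String symbols = "0123456789*#ABCD";
        return event < symbols.length() ? String.valueOf(symbols.charAt(event)) : "?";
    }
}
```

For example, the payload bytes 0x0B 0x8A 0x03 0x20 decode to the "#" tone with the end bit set, a volume of 10 and a duration of 800 timestamp units.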

In order for a VoIP gateway to convert a PSTN signal carrying DTMF to OOB events, it first has to recognize that the tones are present. In order for that to happen, the gateway has to follow a few rules. First it needs to maintain a buffer of the audio flowing through and analyze it continuously for the presence of DTMF frequencies. This analysis takes processing cycles and it needs to act on a time slice of audio. The minimum recommended duration for a DTMF event is 70ms. So the gateway needs to wait that long before it can acknowledge the beginning of an OOB event. While it waits, the audio flows through and out as regular RTP audio packets. So the result could be a slice of DTMF “leaked” as audio RTP packets followed by OOB RTP event packets. This leak effect causes problems with tone detection later and it causes more problems if the call has to be routed out to the PSTN again. It is not unusual for a phone call to cross PSTN and SIP/RTP network borders several times. We should mention that there are multiple flavors of PSTN networks with their own set of transcoding problems.
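The continuous analysis described above is often done with the Goertzel algorithm, which measures the energy at a handful of known frequencies more cheaply than a full FFT. The sketch below only illustrates that idea. It is not the detector used by any particular gateway or by the Mobicents Media Server. The names are made up for this post, it assumes 16-bit PCM samples, and it skips the thresholds and sanity checks that a production detector needs.

```java
/** Hypothetical Goertzel-based DTMF sketch; simply picks the strongest row and column tone. */
final class GoertzelSketch {
    private static final double[] ROWS = {697, 770, 852, 941};
    private static final double[] COLS = {1209, 1336, 1477, 1633};
    private static final String[] KEYS = {"123A", "456B", "789C", "*0#D"};

    /** Squared magnitude of one target frequency within a block of samples. */
    static double power(short[] samples, double targetHz, double sampleRateHz) {
        double coeff = 2.0 * Math.cos(2.0 * Math.PI * targetHz / sampleRateHz);
        double s1 = 0.0;
        double s2 = 0.0;
        for (short sample : samples) {
            double s0 = sample + coeff * s1 - s2;
            s2 = s1;
            s1 = s0;
        }
        return s1 * s1 + s2 * s2 - coeff * s1 * s2;
    }

    /** Returns the key whose row and column frequencies carry the most energy. */
    static char detect(short[] samples, double sampleRateHz) {
        int row = strongest(samples, ROWS, sampleRateHz);
        int col = strongest(samples, COLS, sampleRateHz);
        return KEYS[row].charAt(col);
    }

    private static int strongest(short[] samples, double[] freqs, double sampleRateHz) {
        int best = 0;
        double bestPower = -1.0;
        for (int i = 0; i < freqs.length; i++) {
            double p = power(samples, freqs[i], sampleRateHz);
            if (p > bestPower) {
                bestPower = p;
                best = i;
            }
        }
        return best;
    }

    /** Generates a synthetic two-tone test signal of n samples. */
    static short[] dualTone(double f1, double f2, int n, double sampleRateHz) {
        short[] out = new short[n];
        for (int i = 0; i < n; i++) {
            double t = i / sampleRateHz;
            double v = Math.sin(2 * Math.PI * f1 * t) + Math.sin(2 * Math.PI * f2 * t);
            out[i] = (short) Math.round(8000 * v);
        }
        return out;
    }
}
```

A 70ms slice at 8 kHz is 560 samples, so GoertzelSketch.detect(GoertzelSketch.dualTone(941, 1477, 560, 8000), 8000) is the kind of call a gateway would have to repeat on every slice before it can commit to an OOB event.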

To make things more interesting, let's add a few variables from the IP world. Media traffic travels as RTP packets, which are encapsulated in UDP packets over the IP layer. UDP delivery is intrinsically unreliable: any number of UDP packets may be dropped by routers along the way. Nowadays that is not a great problem as a number of service providers offer high quality bandwidth for a relatively low price. But IP network quality is a factor that should in no way be ignored when it comes to real-time media. Other than potentially dropped UDP packets, there is also network latency that plays a role. On top of that there is the jitter effect, which is the variation in latency between packets.
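Jitter can be put into numbers. RFC 3550 defines a running interarrival jitter estimate that receivers report in RTCP, and the sketch below follows that formula. The class name is hypothetical, both arguments are assumed to be expressed in RTP timestamp units already, and timestamp wraparound is ignored for brevity.

```java
/** Hypothetical running jitter estimate following the RFC 3550 formula J += (|D| - J) / 16. */
final class JitterEstimator {
    private double jitter;
    private long previousTransit;
    private boolean first = true;

    /**
     * Feeds one packet into the estimate.
     *
     * @param arrival      local arrival time, in RTP timestamp units
     * @param rtpTimestamp timestamp carried in the RTP header
     * @return the updated jitter estimate, in RTP timestamp units
     */
    double update(long arrival, long rtpTimestamp) {
        long transit = arrival - rtpTimestamp;
        if (!first) {
            // D is the change in relative transit time between consecutive packets.
            long d = Math.abs(transit - previousTransit);
            jitter += (d - jitter) / 16.0;
        }
        first = false;
        previousTransit = transit;
        return jitter;
    }
}
```

With packets sent every 160 units (20ms at 8 kHz), a stream that arrives perfectly spaced keeps the estimate at zero. A single packet that shows up 80 units late moves it to 5.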

All the fun issues mentioned above only scratch the surface, but they should give you a flavor of what media server vendors are dealing with. So how is it possible to write automated repeatable regression tests so that the software gets increasingly better over time at some meaningful pace?

One idea we wanted to explore is the ability to somehow replay the media that comes to the media server in a live environment. If we can repeat a realistic scenario in a programmatic way and include it in our regression testsuite and continuous build system, then there would be several benefits:

  • If there is a bug in the media server, we can capture the media traffic once, potentially with customer’s sysadmin/devops help
  • Reproduce the bug on a developer machine as many times as needed, without involving the customer
  • Add a functional test to the testsuite that will prevent regression in the future

These benefits seemed sufficient to give this idea a try and it seemed to work out fine with about 2-3 days of coding and testing.

Following is a step by step description of the process that we now use. It may or may not work for you. Please post your feedback in the comments section below this blog post.

1. Capture network traffic on the production machine

This is usually done with a tool that uses libpcap. For example:

tcpdump -s 0 -vvv -i any -w ~/capture.cap

The resulting capture.cap file can be viewed with great tools like Wireshark. The content would look similar to this:

Wireshark screenshot

2. Filter RTP packets that contain the issue we are trying to solve

There is a lot more than RTP packets in this view, because of the way the network traffic was captured. The next step is to narrow down the view to RTP traffic only. One way to do it is to go to Wireshark’s Telephony menu > RTP > Show All Streams.

Wireshark Screen Shot - RTP menu

At this point we may see multiple RTP streams to and from this particular machine.

Wireshark Screen Shot - RTP Streams


Using the source IP address:port and destination IP address:port we can usually pick the one RTP stream that is causing trouble. So let's select that stream and save it as a separate file. Advanced Wireshark users can type in the filter directly in the main window. Another option is to click on the stream in the RTP Streams window and then click on Prepare Filter. The filter window will display something like:

(ip.src== && udp.srcport==65534 &&
    ip.dst== && udp.dstport==52220)

The next step is to apply the filter. Click on the Apply button next to the filter expression. At this point only the RTP packets matching the filter should be displayed in the main Wireshark window. No TCP, SIP, MySQL or any other protocols should be visible.

Since the capture file usually covers a period of several minutes, it is normal to include a lot of noise that is not essential to the issue at hand. Since we are looking at a DTMF detection problem, let’s find the first RTP packet related to DTMF and then select the RTP packets range shortly before and after the DTMF events are sent. One way to do this is to simply scroll down the packet list and look for event type packets.

RTP Event packets Screen Shot

We can clearly see the event packets. They start at packet number 6130. In the pcap file that I am using the event ends at 6150.

Let’s save this range so we can use it for our regression test. Go to File > Save As. In the Save Capture File As dialog, select the Packet Range. In my case that's 6130-6150. For the file name, I will use oob-event-test.cap.

To check that we saved the right information, let’s open the file with Wireshark. The packets may be displayed as raw UDP data. In order to see them as RTP, you may need to select Analyze > Decode As and pick RTP. Now you should see the RTP event packets as they were shown in the original file.
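What Decode As does here is interpret the first 12 bytes of each UDP payload as the fixed RTP header from RFC 3550. Here is a minimal sketch of the same interpretation in Java. The class is hypothetical and it ignores the optional header extension. The payload type field is what separates audio packets from event packets. The number used for telephone events is dynamic and negotiated in SDP, so the 101 in the example is an assumption, not a constant.

```java
import java.nio.ByteBuffer;

/** Hypothetical parser for the fixed 12-byte RTP header (RFC 3550). */
final class RtpHeader {
    final int version;
    final boolean marker;
    final int payloadType;
    final int sequenceNumber;
    final long timestamp;
    final long ssrc;
    final int headerLength; // fixed header plus CSRC list, extension not counted

    private RtpHeader(int version, boolean marker, int payloadType,
                      int sequenceNumber, long timestamp, long ssrc, int headerLength) {
        this.version = version;
        this.marker = marker;
        this.payloadType = payloadType;
        this.sequenceNumber = sequenceNumber;
        this.timestamp = timestamp;
        this.ssrc = ssrc;
        this.headerLength = headerLength;
    }

    /** Parses the header from the start of a UDP payload. */
    static RtpHeader parse(byte[] udpPayload) {
        if (udpPayload.length < 12) {
            throw new IllegalArgumentException("RTP header needs at least 12 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(udpPayload); // network byte order by default
        int b0 = buf.get() & 0xFF;                    // V(2) P(1) X(1) CC(4)
        int b1 = buf.get() & 0xFF;                    // M(1) PT(7)
        int sequenceNumber = buf.getShort() & 0xFFFF;
        long timestamp = buf.getInt() & 0xFFFFFFFFL;
        long ssrc = buf.getInt() & 0xFFFFFFFFL;
        int csrcCount = b0 & 0x0F;
        return new RtpHeader(b0 >> 6, (b1 & 0x80) != 0, b1 & 0x7F,
                sequenceNumber, timestamp, ssrc, 12 + 4 * csrcCount);
    }
}
```

A packet starting with the bytes 0x80 0xE5 parses as RTP version 2 with the marker bit set and payload type 101.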

The original file size was 3.3MB, which is typical for a few minutes of network capture. The new file size is only 4K. After compressing it with gzip it goes down to 2K. That’s not a bad file size to use in a test suite. I am not worried about filling up the disk space anytime soon even after writing thousands of test cases that use unique pcap files.

3. Writing the test case

Now that we have gathered the evidence which exposes a DTMF detection issue in the Mobicents Media Server, it is time to write an automated test that reproduces it repeatably and reliably. We want to be able to work on a fix without asking the user who reported it to test intermediate changes. That could be a long guessing cycle with lots of time wasted in back and forth messages and wait downtime.

We also want to be able to include the test in our regression testsuite which runs continuously on our QA server farm. We want to ensure that once the bug is fixed, it will never reoccur in our code base.

As it turns out the libpcap file format is relatively simple and there is a small (around 25K) pure Java library that reads the format. The library is named Hadoop pcap and it is evolving actively. I contributed several small enhancements (#7 and #10) which were quickly reviewed and accepted by the project lead Wolfgang Nagele.
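To show just how simple the format is, here is a hand-rolled sketch of a reader for the classic libpcap layout: a 24-byte global header followed by records, each with a 16-byte header carrying seconds, microseconds and the captured length. This is not the Hadoop pcap API, whose signatures I will not try to reproduce from memory. The names below are made up for illustration and only the microsecond-resolution magic number is handled.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal reader for classic libpcap capture files. */
final class PcapSketch {
    /** One captured frame: capture time in microseconds plus the raw captured bytes. */
    static final class Record {
        final long timestampMicros;
        final byte[] data;

        Record(long timestampMicros, byte[] data) {
            this.timestampMicros = timestampMicros;
            this.data = data;
        }
    }

    /** Parses the complete (uncompressed) contents of a capture file. */
    static List<Record> parse(byte[] file) {
        if (file.length < 24) {
            throw new IllegalArgumentException("truncated pcap global header");
        }
        ByteBuffer buf = ByteBuffer.wrap(file); // big-endian until proven otherwise
        int magic = buf.getInt(0);
        if (magic == 0xD4C3B2A1) {
            buf.order(ByteOrder.LITTLE_ENDIAN); // file was written on a little-endian host
        } else if (magic != 0xA1B2C3D4) {
            throw new IllegalArgumentException("not a classic microsecond pcap file");
        }
        buf.position(24); // skip version, timezone, sigfigs, snaplen and link type

        List<Record> records = new ArrayList<>();
        while (buf.remaining() >= 16) {
            long seconds = buf.getInt() & 0xFFFFFFFFL;
            long micros = buf.getInt() & 0xFFFFFFFFL;
            int capturedLength = buf.getInt();
            buf.getInt(); // original length on the wire, unused here
            if (capturedLength < 0 || capturedLength > buf.remaining()) {
                throw new IllegalArgumentException("truncated packet record");
            }
            byte[] data = new byte[capturedLength];
            buf.get(data);
            records.add(new Record(seconds * 1_000_000L + micros, data));
        }
        return records;
    }
}
```

To feed it the gzipped file from step 2, decompress the file with java.util.zip.GZIPInputStream first and pass the resulting bytes to parse. Keep in mind that each record still begins with the link-layer, IP and UDP headers, which have to be skipped to reach the RTP packet.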

Having a base library that reads pcap files is a big help. The only remaining work is to write some code that reads these files and replays them over UDP sockets in a way that is as close as possible to the original traffic. Fortunately each pcap packet carries a timestamp which tells, with microsecond precision, when the packet was captured on the network. With this information we can ensure that the packets are spaced out at approximately the same intervals as originally received, thus preserving the original latency and jitter effects. Of course, it is by no means a 100% accurate replay of the original timing, but it gets us closer than any other way I could think of. If you have suggestions for improvements, I would love to hear them. Please write back in the comments section.
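The timing idea can be sketched in a few lines. The code below is an illustration, not the replay utility from the MMS test framework. All names are hypothetical, and it assumes the UDP payloads and their capture timestamps have already been extracted from the pcap records. Each send is scheduled relative to the start of the replay rather than to the previous packet, so small sleep overruns do not pile up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical replay helper that preserves the spacing between captured packets. */
final class PcapReplaySketch {
    /** A UDP payload together with the time it was captured, in microseconds. */
    static final class TimedPacket {
        final long captureMicros;
        final byte[] payload;

        TimedPacket(long captureMicros, byte[] payload) {
            this.captureMicros = captureMicros;
            this.payload = payload;
        }
    }

    /** Hands each payload to the sender at its original offset from the first packet. */
    static void replay(List<TimedPacket> packets, Consumer<byte[]> sender) {
        if (packets.isEmpty()) {
            return;
        }
        long firstCapture = packets.get(0).captureMicros;
        long start = System.nanoTime();
        for (TimedPacket packet : packets) {
            long dueNanos = TimeUnit.MICROSECONDS.toNanos(packet.captureMicros - firstCapture);
            long waitNanos = dueNanos - (System.nanoTime() - start);
            if (waitNanos > 0) {
                try {
                    TimeUnit.NANOSECONDS.sleep(waitNanos);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // stop replaying if interrupted
                    return;
                }
            }
            sender.accept(packet.payload);
        }
    }

    /** Builds a sender that writes each payload to the given UDP destination. */
    static Consumer<byte[]> udpSender(DatagramSocket socket, InetSocketAddress target) {
        return payload -> {
            try {
                socket.send(new DatagramPacket(payload, payload.length, target));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }
}
```

Passing the sender in as a Consumer keeps the timing logic testable without a network: a unit test can collect the payloads in a list, while the real run uses udpSender pointed at the media server's RTP port.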

Once the replay code is implemented, the next step is to write a test case. We already have a base test framework for the Media Server based on JUnit. The setup portion of the test has to bootstrap the pcap replay utility and the embedded media server with the components under test: the IVR endpoint and the DTMF detector. Finally we can write the test case itself, which passes the pcap test file as a parameter and asserts that the DTMF event (in this case it's a “#” sign) is detected properly.

We will skip the boilerplate code here and show just the test case body:

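public void testDetectionOneHashTone() throws Exception {
    // play pcap file and wait no more than 3 seconds
    //    for the playback to finish
    testOobDetection("oob-event-test.cap.gz", 3);
    // NOTE: the ends of the next two statements were cut off in the published
    // listing; size(), get(0) and getTone() are assumed completions.
    assertEquals("Expected one detected event", 1,
            detectedDtmfEvents.size());
    String tone = ((DtmfEvent) detectedDtmfEvents.get(0)).getTone();
    assertEquals("Expected # tone as event", "#", tone);
}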

If you are interested in seeing the code of the test framework and the full test case, follow this link to the MMS git repo.

Hopefully going forward it will be easier for Mobicents Media Server users and contributors to report problems and supply test cases that expose them using real RTP traffic captures. That would greatly reduce the bug fixing cycle and allow more time for adding new value to the project.

If you happen to have experience with testing real time media processing products and would like to share it, please post a comment below. Constructive criticism is more than welcome.

Thank you!

— Ivelin Ivanov

