
August 27, 2005

Fax facts

As VoIP becomes more prevalent as a replacement for traditional PSTN connections, the question of how to handle fax traffic becomes more and more pressing. Fax imposes special demands on VoIP networks because the standards used for fax transmission are designed to exploit features of switched circuits that do not fit within the parameters of most VoIP equipment.

A fax terminal works by scanning the source image and converting it into a stream of pixels that are transmitted to the remote terminal using a data protocol called T.30. This protocol defines how the pixel data is compressed, and how to attach additional meta-data to the transmission like the source terminal number, resolution, number of pages etc. T.30 is an interactive protocol that is composed of commands and responses that allow the terminals to negotiate capabilities at run-time.

T.30 is transmitted over a switched circuit voice line by converting the bitstream into modulated tones. This is the familiar warbling that you hear when you accidentally pick up on a fax call. But these tones are merely the modulation technique used to transmit the T.30 data over a fixed bandwidth communications channel. The actual payload is the underlying T.30 fax commands transmitted at rates between 300 and 19200 baud - the tones themselves are not really important. In this regard, a fax machine is exactly the same as a data modem, except that T.30 defines the higher protocol layers rather than just providing a raw bit stream like a data modem.

T.30 tone modulation is designed to work over a circuit switched voice connection, which these days is usually a 64 kbps voice channel sampled 8000 times a second. This approximates the previous generation analogue phone technology that used a copper wire pair with an approximate bandwidth of 3100 Hz. These circuit switched connections have low (and constant) latency and no jitter, but may experience loss or dropouts. The T.30 modulation is designed for this environment - it exploits the full bandwidth of the circuit switched connection (well, as far as the modulation technology of the era allowed), and the T.30 protocol implements error detection and error correction to solve the drop-out and signal noise problems.

Given all of this, it might seem that T.30 could be used over a G.711 VoIP connection, because these are intended to emulate ye olde analogue copper wire circuit connection just like a BRI. After all, fax is just audio tones and G.711 is (by definition) the same audio encoding used on the PSTN for BRI or PRI lines. But as many people will tell you, fax over G.711 only works reliably over a LAN, and becomes very unreliable if used over the public Internet.

So why is this?

Nobody uses G.711 for long haul VoIP because it is very susceptible to jitter due to the heavy load on the underlying network. Voice calls tend to break up unless every hop in the underlying network can easily handle the raw 64 kbps data rate, which is actually closer to 80 kbps if a frame size of 30 msec is used.
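
To see where the "closer to 80 kbps" figure comes from, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes RTP over UDP over IPv4 with no header compression, and the Ethernet framing figure (14 byte header plus 4 byte checksum) is my assumption about the link layer, not something every network will match.

```python
SAMPLE_RATE_HZ = 8000                     # G.711 samples per second
BYTES_PER_SAMPLE = 1                      # G.711 uses 8 bits per sample
RTP_UDP_IPV4_HEADER_BYTES = 12 + 8 + 20   # RTP + UDP + IPv4, no options
ETHERNET_OVERHEAD_BYTES = 14 + 4          # assumption: header + frame check sequence


def g711_bitrate_bps(frame_ms, link_overhead_bytes=0):
    """Bits per second on the wire for one direction of a G.711 call."""
    payload_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * frame_ms // 1000
    packet_bytes = payload_bytes + RTP_UDP_IPV4_HEADER_BYTES + link_overhead_bytes
    packets_per_second = 1000 / frame_ms
    return packet_bytes * 8 * packets_per_second


ip_rate = g711_bitrate_bps(30)
eth_rate = g711_bitrate_bps(30, ETHERNET_OVERHEAD_BYTES)
print(f"IP level:       {ip_rate / 1000:.1f} kbps")
print(f"Ethernet level: {eth_rate / 1000:.1f} kbps")
```

With 30 msec frames each packet carries 240 bytes of audio, so the headers add roughly 17% at the IP level and a bit more once link framing is counted. Smaller frames make the overhead proportionally worse.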

The circuit switched networks for which T.30 was designed don't have jitter, and the modulation techniques and protocols specified by T.30 contain no provision for dealing with it. Unsurprisingly then, T.30 simply can't deal with the variable jitter of a VoIP connection, although it tries (and fails) valiantly.

The problem is made worse if a codec other than G.711 is used, because most compressed codecs are based on a psycho-acoustical analysis of human speech and remove portions of the signal that, although not important to human voice communication, are vital for modulated data signals. This is the same reason why in-band DTMF tones don't work over compressed codecs, by the way.

The obvious solution to this problem is to ensure the underlying network can support G.711 calls that are of sufficient quality to allow fax calls to work. Presumably this is what the Vonage network does, as they used to use G.711 (and may still) and they also advertised full fax functionality.

At this point, it's worth reviewing what happens when sending a fax over a VoIP connection using T.30 audio tone modulation.

The source image is converted into a stream of bits which is encoded into a stream of T.30 commands. This bit stream is then converted into an analog waveform that corresponds to an audio frequency tone with specific modulation characteristics. An approximation of this waveform is then created by sampling it 8000 times per second using a non-linear encoding (G.711), and the samples are grouped into 30ms chunks of audio data and then transmitted over an IP network. The receiver re-assembles the received samples to recreate the analogue waveform and then demodulates it to retrieve the original T.30 bit stream. The image data is then extracted from this bit stream.
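
The sampling and chunking step in the middle of that chain can be sketched in a few lines. This is an illustration only: it uses the continuous mu-law companding formula rather than the segmented table that G.711 actually specifies, the 8-bit code mapping is my own simplification, and the 1100 Hz test tone simply stands in for a real modulated fax signal.

```python
import math

MU = 255                 # companding constant for mu-law
SAMPLE_RATE_HZ = 8000
FRAME_MS = 30


def compress(x):
    """Non-linear mapping of a linear sample in [-1, 1] to [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Inverse of compress()."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def encode(x):
    """Quantise the companded sample to an 8-bit code (simplified mapping)."""
    return round((compress(x) + 1) / 2 * 255)


def decode(code):
    return expand(code / 255 * 2 - 1)


def frames(codes):
    """Group 8-bit codes into 30 ms chunks, one per IP packet payload."""
    n = SAMPLE_RATE_HZ * FRAME_MS // 1000
    return [bytes(codes[i:i + n]) for i in range(0, len(codes), n)]


# 90 ms of a test tone, sampled 8000 times per second
tone = [0.5 * math.sin(2 * math.pi * 1100 * t / SAMPLE_RATE_HZ)
        for t in range(SAMPLE_RATE_HZ * 90 // 1000)]
packets = frames([encode(s) for s in tone])

# The receiver re-assembles the samples and rebuilds the waveform
recovered = [decode(c) for p in packets for c in p]
max_error = max(abs(a - b) for a, b in zip(tone, recovered))
print(len(packets), "packets of", len(packets[0]), "bytes, max error", max_error)
```

Note that the recovered waveform is only an approximation of the original, and that is with no lost or late packets at all.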

In other words, the analog image is converted into T.30 digital data which is converted into an analogue waveform, which is then digitised, sent as digital data over an IP network, converted *back* into an analogue waveform, and converted back yet again into a T.30 digital bit stream to display an approximation of the original analog image. Analog to digital to analog to digital to analog to digital to analog again. How crazy is that? I think it is amazing that it works at all!!

Fortunately, other engineers have also looked at this Rube Goldberg chain of technology and quickly understood that there is an obvious solution. Given that the T.30 command stream is already digital data, it can be sent over an IP network as is. No need to convert it into analogue tones - just send the raw T.30 data inside IP packets. The standard that defines how this is done is the well-known, but often misunderstood, ITU T.38 standard.

In one stroke, T.38 solves most of the problems with sending fax over an IP network. A T.38 fax call uses roughly 20% of the bandwidth of the modulated audio approach, because it is now a stream of bits at an average speed of 14400 bps rather than a stream of audio samples at 64000 bps.

Reliability is now greatly increased because the portion of the decode chain that was susceptible to jitter (the analogue demodulator) is no longer needed. To increase it even further, T.38 allows for the inclusion of redundant data to prevent errors caused by the occasional loss of packets when using a transport with non-guaranteed delivery, like UDP.
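
The redundancy idea is easy to model: each packet carries the current message plus copies of the few messages sent before it, so a later packet can fill in for an earlier one that went missing. The sketch below is a toy model of that idea only. It is not the T.38 wire format, and the function names and tuple layout are made up for illustration.

```python
def build_packets(messages, redundancy=2):
    """Attach up to `redundancy` earlier messages to each packet.

    Each packet is (sequence number, primary message, earlier messages),
    with the earlier messages listed newest first.
    """
    packets = []
    for seq, primary in enumerate(messages):
        earlier = [messages[seq - k]
                   for k in range(1, redundancy + 1) if seq - k >= 0]
        packets.append((seq, primary, earlier))
    return packets


def receive(packets):
    """Rebuild the message sequence, using redundant copies to fill gaps."""
    recovered = {}
    for seq, primary, earlier in packets:
        recovered[seq] = primary
        for k, message in enumerate(earlier, start=1):
            recovered.setdefault(seq - k, message)
    if not recovered:
        return []
    return [recovered.get(i) for i in range(max(recovered) + 1)]


messages = [b"m0", b"m1", b"m2", b"m3", b"m4", b"m5"]
sent = build_packets(messages, redundancy=2)

# Pretend the network dropped packets 2 and 3
survived = [p for p in sent if p[0] not in (2, 3)]
print(receive(survived) == messages)
```

In this model, two levels of redundancy survive the loss of two consecutive packets, at the cost of sending every message three times. With only one level, the same loss would leave a hole.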

As far as implementation is concerned, T.38 does not require any analog modulation or demodulation, because it is only concerned with encapsulating and de-encapsulating a raw T.30 bit stream. As such, it requires much less CPU horsepower than the audio tone approach.

T.38 should not be confused with the similar-sounding T.37, which uses a totally different approach. Whereas T.38 is intended for realtime fax transmission using an encapsulated T.30 data stream, T.37 is intended for "store and forward" applications. It requires the fax data to be converted into TIFF format, encoded using base64 into a text message, and then transmitted using SMTP.
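
The store and forward approach looks roughly like this when sketched with Python's standard email library. The addresses, subject line and file name are placeholders, the "TIFF" bytes are fake, and nothing here tries to follow the exact TIFF profile or addressing rules that T.37 lays down. It only shows image data being base64 encoded into a mail message that could then be handed to an SMTP server.

```python
from email.message import EmailMessage


def fax_to_email(tiff_bytes, sender, recipient):
    """Wrap fax image data in a mail message ready for SMTP delivery."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Fax"
    msg.set_content("Fax image attached.")
    # Binary attachments are base64 encoded by default
    msg.add_attachment(tiff_bytes, maintype="image", subtype="tiff",
                       filename="fax.tif")
    return msg


fake_tiff = b"II*\x00" + bytes(range(32))   # placeholder, not a real image
msg = fax_to_email(fake_tiff, "fax@example.com", "user@example.com")
attachment = next(msg.iter_attachments())

print(attachment.get_content_type(), attachment["Content-Transfer-Encoding"])
print(msg.as_string())
```

Delivery would then be a matter of passing the message to something like `smtplib.SMTP(...).send_message(msg)`, which is why this approach cannot be realtime: the fax arrives whenever the mail does.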

Hopefully this article has helped dispel some of the mystery surrounding fax over IP. Efficient fax transmission over an IP network is achieved by understanding that a fax call is not about sending and receiving modulated audio tones - these are simply the legacy of the old analog phone network, and it is far more efficient to simply deal with the underlying bit stream and send that over the IP data network instead.

Posted by CraigS at 11:20 PM | Comments (3)

August 19, 2005

Believing the impossible

Time for one of my pet peeves.

Three times in the past week I have been faced with blatantly incorrect behaviour in someone else's code. Something really obvious, like a segmentation fault, or a return value set to something stupid, or a function that reports an incorrect input value when the offending argument is obviously within the valid range.

In each of these cases I spent the time to isolate the problem and create a simple test program that demonstrated the failure condition.

In all three cases the author, when presented with this information, uttered those four hated words:

"It works for me"

If this happens again, I may just have to do serious damage to someone or something.

I've written on this subject in my blog before, but it still staggers me that supposedly experienced software authors still make this most fundamental of mistakes.

Let's just imagine you are a software author, and somebody has come to you with a problem that they think is in your code. And let's say that the user is not one of the clueless "why is my cursor stuck" kind - they are a fellow software developer who is highly motivated, who wants the software to work, and has done all of the kinds of things you would do if you were looking for this kind of problem. Things like isolating the bug to a specific set of failure modes, making sure it is reproducible, looking at alternate environments to see if that affects the failure mode - the usual stuff.

Given this information, you look at the report and think "Hey, that is impossible. There is no way the code can do that, and I should know because I wrote it". You might even try and reproduce the problem yourself and of course, it "works for you".

This means you have to make a choice on how to proceed. Do you:

a) Assume the person has got it wrong and there is no problem.

b) Assume that there is a problem but you can't see it for some reason.

Of these two choices, a) seems like the most attractive because it is the one that involves the least work for you. But really, how realistic is this? For this to be the correct course of action, the poor sod reporting the bug would have to be so deluded that they have managed to concoct a whole story with no basis in reality simply for your benefit. Are you really so self-centred as to believe that people have nothing better to do than create works of fiction in order to bother you? Isn't there even the smallest chance that perhaps, just perhaps, there is a bug in your code that under some circumstances can exhibit the behaviour being seen? Or are you that perfect that this is simply not possible...

So really, the only sensible choice is b) - they actually have found a problem which for some reason does not occur on your system. So, you are going to have to try and find it, or at least, provide them with some more information so they can try and isolate the problem further. Differential diagnosis can be very helpful in these kinds of situations.

The reverse often happens as well: you are using someone else's code and you can't even get it to work. Supposedly the code works, but damn it, you can't even get off first base. In this situation, you once again have two choices:

a) Decide that the code is fundamentally flawed if you can't get it to work, so it must be truly broken. Report this as a bug and do something else until it is fixed.

b) Realise that other people have used the code, and so you have failed to understand something that perhaps is not obvious.

You'd be surprised how many people choose a)...

Here is a real-world example: I was asked recently to evaluate a video codec from a vendor who had created a highly optimised codec for a particular hardware platform. I wrote a benchmark program, and integrated the codec as per the documentation provided by the vendor. I could push frames into the encoder, which created a nice bitstream that was supposedly compliant with the codec specification. The bitstream had the right bitrate, but I could not decode the bitstream using a reference implementation of the same codec. However, the decoder from the same vendor recreated the source images just fine.

It was tempting to decide that the code was just crap because, although it was self-consistent, it was demonstrably non-compliant with the specification. But I also knew that the vendor had delivered this code to other companies, and that it had been used before. So even though it was not compliant, it must be possible to make it work. I just had to find out how.

It took me two days, but I eventually discovered that if I reversed each byte in the bitstream end for end, and then reversed three bits in the first byte that specified the codec mode, then the bitstream magically became compliant and worked just fine. This was not documented anywhere, and the vendor was unable (or unwilling) to explain why this was needed. By assuming that the vendor did actually have somewhat of a clue, I was able to find a way to make it work. Of course it would have been nice if the code had done the right thing, or even if the documentation had been correct, but at least I did not embarrass myself by claiming the codec was garbage and then having the vendor prove me wrong.
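
For the curious, the fix-up looked something like the sketch below. This is a reconstruction of the idea rather than the real code: I am assuming here that the three codec mode bits are the top three bits of the first byte, which is a guess made purely for illustration, and the function names are made up.

```python
MODE_SHIFT = 5                  # hypothetical: mode field is the top 3 bits
MODE_MASK = 0b111 << MODE_SHIFT


def reverse_bits(value, width=8):
    """Reverse the order of the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def fix_bitstream(data):
    """Reverse every byte end for end, then reverse the 3-bit mode field."""
    out = bytearray(reverse_bits(b) for b in data)
    if out:
        mode = (out[0] & MODE_MASK) >> MODE_SHIFT
        fixed_mode = reverse_bits(mode, width=3)
        out[0] = (out[0] & ~MODE_MASK & 0xFF) | (fixed_mode << MODE_SHIFT)
    return bytes(out)


print(fix_bitstream(bytes([0b00000001, 0b00001111])).hex())
```

Two short functions, two days to find. That is usually how it goes.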

So please, the next time a user reports a bug that seems impossible, remember what Sherlock Holmes said:

"How often have I said to you that when you have eliminated the impossible, whatever remains, however improbable, must be the truth?"

Posted by CraigS at 01:51 AM | Comments (2)

August 16, 2005

History shock

If you are reading this, the chances are that you are one of those lucky people who are completely familiar with the Internet, computers and all of the hi-tech gadgets and goodies that go along with them. Let's put it another way - I'd be willing to bet that the only reason your VCR is flashing "12:00" right now is because you've not used it since the last time the power went off. And that's because you use a DVD (when you rent a movie) and you use a PVR for recording those rare shows you can't get via BitTorrent.

Obviously, I'm completely into the tech thing to the extent that my budget and time allow. My house is covered by WiFi, all three of my kids have their own computers (my eldest son has two on their own subnet) and everyone has the level of Internet access deemed appropriate to their age and experience.

We have a shared media server with all of our music on it, which we seriously need because my kids have extremely eclectic tastes. As an example: I picked up a CD burnt by one of my kids a few weeks ago which contained tracks by Rammstein, some Bach, a few tracks from "The Secret Garden", and some songs by "Flanders and Swann".

As far as TV is concerned, we don't have satellite or cable (I refuse to pay $80 per month for a stack of sports channels I will never use in order to get the few channels I do want). Our family policy is that nobody has TVs in their bedrooms (my wife and I included) and the one TV with an antenna is in the family room where anybody can use it. We watch the few free-to-air programs we like and then get what else we want via BitTorrent. Between us, we watch Stargate Atlantis, Stargate SG-1, Alias, the various elements of the CSI franchise, ER and The West Wing. We've also grabbed TV series that we all like that can't be had on free to air, such as "Scrapheap Challenge" (which we call "Skrothog" because that is what it is called in Swedish and most of the BitTorrent files for Scrapheap Challenge have Swedish subtitles), Daria and Reboot.

It's very, very easy to get used to being surrounded by this warm glow of technology. So much so, that I was taken by surprise the other day when someone was astonished that I had already seen the latest episodes of just about anything worth watching on Australian free to air TV, even before the series had been advertised (let alone shown).

In that instant, I realised (again) that I am a member of a very small subset of the community that has access to this kind of technology. Most people think they have to wait two years for US TV shows to be shown in Australia. Most people still pay full-price for long distance and international phone calls. Lots of people have iPods, but very few of them realise that devices like this have been around since 1999. Most people see telephones and computers as being completely different devices that don't really have anything to do with each other.

Some of you might remember a book from the 70s by Alvin Toffler called "Future Shock". The book title came from the name Toffler gave to the disorientation experienced by people who felt overwhelmed by the relentless arrival of new technologies that they did not understand.

I keep feeling the reverse of this (can we call it "history shock"?) when I hear someone complaining about how there is nothing on TV (why not use BitTorrent and get nearly anything you want?). I get it when someone complains about the cost of international phone calls (international phone calls are cheaper for me than long distance). I get it when someone says they can't find a phone number (why not use one of any number of Internet resources?).

I'm not saying that "history shock" is a bad thing. Mostly, it is a reality check for the techno-geek that lets them know that they are living in a world that is detached from the "reality" that most people experience. This can be useful as a prompt to be more understanding of those who are not paid-up members of the digerati. It can also be a pointer to a possible opportunity to make money by converting a hard-to-understand technology into something everyone can use.

It's also good to remember that even uber-geeks can get so wrapped up in their toys that they miss good stuff going on elsewhere. History shock helps me know when to take time off. That, and my kids starting to roll their eyes :)

Time to go and sleep.

Posted by CraigS at 01:14 AM | Comments (0)

August 06, 2005

Quick report from ClueCon

Sorry that I've not had time to keep the posts flowing, but I've been at ClueCon in Chicago, and things have been more than a little hectic.

First of all, I arrived on Tuesday evening at 5pm local Chicago time after 24 hours of continuous travel. I was dehydrated and jetlagged, so of course I went out with the other delegates and had a few drinks. I can remember until about 9:30pm, and then I woke up in my hotel room at 3:30am. Everyone says I had a great time, but damned if I can remember anything. I remember reading somewhere that everyone should give up drinking once they turn 40 - I think that time may have come.

But first of all, let me say that ClueCon seriously rocks. Not only have we been having a great time, but we've been getting some great work done. The talks have been interesting, and the speakers even more so. Because this is the US, there is a very definite Asterisk flavour to everything, but regardless there have been lots of really smart people who are ready to talk about all aspects of VoIP. I've learnt more about PRIs and TDM in the last few days than I managed to absorb in the previous 5 years.

Brian and Tony from AsterLink have done a fantastic job in looking after everyone and doing their best to keep everyone happy. I know I'll be doing my level best to come to the next ClueCon.

The big highlight for me was spending time with the guys from Sangoma, especially Gideon and Nenand. I'd forgotten how much fun it is to work with smart, motivated people who love their work and really care about their products.

I announced the release of Derek's IAX2 code during my talk, and that got a round of applause. As it should - he has done some great work and I know that lots of people will be looking at that code very closely over the next weeks.

There is no way I will be able to remember everything that has happened so expect to keep hearing about various stuff over the next few weeks. I'll try and do some more posts as and when I remember stuff.

Oh yeah - Net connectivity is really good (most of the time) so I've been able to catch up on OpenH323 and OPAL patches. I'm going to try and get some coding done this weekend - or at least I'm going to try.

Posted by CraigS at 02:32 AM | Comments (4)