VCX. Automatic byte order detection

Endianness
PCM audio samples in audio streams
Automatic byte order detection in uncompressed audio stream
The way it works in VC components/VCX library
Limitations

Endianness

The term endianness usually refers to the order in which the bytes of a data type larger than one byte are stored in memory. For example, the two-byte (word) hexadecimal value 0x1873 may be stored in adjacent memory bytes in one of the following ways:

1) least significant byte (LSB) followed by most significant byte (MSB), also called little-endian layout:

memory address value

base+0000 0x73

base+0001 0x18

2) MSB followed by LSB, also called big-endian layout:

memory address value

base+0000 0x18

base+0001 0x73

The same applies to larger data types. For example, the four-byte integer value 0x8E160A3D may be stored as:

1) little-endian layout:

memory address value

base+0000 0x3D

base+0001 0x0A

base+0002 0x16

base+0003 0x8E

2) big-endian layout:

memory address value

base+0000 0x8E

base+0001 0x16

base+0002 0x0A

base+0003 0x3D

Other layouts, such as middle-endian, are also possible, but they are outside the scope of this article. For more information about endianness, please refer to http://en.wikipedia.org/wiki/Endianness
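The two layouts above can be reproduced with a few lines of Python. This is only an illustrative sketch: the variable names and the `dump` helper are made up for this example, and `int.to_bytes` is used simply because it lets us pick the byte order explicitly.

```python
# Store the two example values from the text in both byte orders.
word = 0x1873          # two-byte value
dword = 0x8E160A3D     # four-byte value

word_le = word.to_bytes(2, "little")    # LSB first
word_be = word.to_bytes(2, "big")       # MSB first
dword_le = dword.to_bytes(4, "little")
dword_be = dword.to_bytes(4, "big")

def dump(label, data):
    """Print every byte together with its offset from the base address."""
    print(label)
    for offset, value in enumerate(data):
        print(f"  base+{offset:04d} 0x{value:02X}")

dump("0x1873, little-endian:", word_le)
dump("0x1873, big-endian:", word_be)
dump("0x8E160A3D, little-endian:", dword_le)
dump("0x8E160A3D, big-endian:", dword_be)
```

The printed offsets and byte values should match the tables above line by line.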

PCM audio samples in audio streams

In this article we define a PCM sample as a signed 16-bit integer value, ranging from -32768 to +32767 decimal (bit patterns 0x8000 to 0x7FFF hexadecimal). An audio stream consists of adjacent PCM samples:

[sample N] [sample N+1] [sample N+2] [sample N+3] ...

A stream may contain more than one channel, in which case the channels are interleaved in the stream. In this article we deal with mono streams only, because automatic detection of byte order in non-mono streams using the algorithm described below may not always produce good results.

Let us say we want to store the following PCM samples in memory: 0x0234, 0x0123, 0xFEDC and 0xF987. Samples in a mono stream are always stored one after another, without interleaving. Depending on the memory layout, they could be stored as:

1) little-endian layout:

memory address value

base+0000 0x34

base+0001 0x02

base+0002 0x23

base+0003 0x01

base+0004 0xDC

base+0005 0xFE

base+0006 0x87

base+0007 0xF9

2) big-endian layout:

memory address value

base+0000 0x02

base+0001 0x34

base+0002 0x01

base+0003 0x23

base+0004 0xFE

base+0005 0xDC

base+0006 0xF9

base+0007 0x87

Note the sample boundaries: each sample occupies two adjacent bytes, starting at an even offset. Depending on the layout, the bytes may be stored in a different order within a boundary, but they never "cross" it. This is important for the later discussion.
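As a small illustration, the same four samples can be packed into both layouts with Python's standard `struct` module. This is a sketch only: `to_signed16` is a helper introduced here to turn the hexadecimal bit patterns from the text into signed values, because `struct` expects signed numbers for the `h` (16-bit signed) format.

```python
import struct

# Bit patterns of the four PCM samples from the text.
patterns = [0x0234, 0x0123, 0xFEDC, 0xF987]

def to_signed16(value):
    """Map a 16-bit pattern (0x0000..0xFFFF) to the range -32768..32767."""
    return value - 0x10000 if value & 0x8000 else value

samples = [to_signed16(p) for p in patterns]

# '<' selects little-endian, '>' selects big-endian; 'h' is a signed 16-bit integer.
little = struct.pack("<4h", *samples)
big = struct.pack(">4h", *samples)

print("samples:      ", samples)
print("little-endian:", little.hex(" "))
print("big-endian:   ", big.hex(" "))
```

Each pair of bytes in the output belongs to exactly one sample; only the order inside the pair changes between the two layouts.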

Automatic byte order detection in uncompressed audio stream

When dealing with uncompressed audio streams, especially when they are transferred over a network, care must be taken to preserve the proper byte order of the audio samples in the stream.

Not only may the byte order differ, but the sample boundaries may also be unknown. This may happen when data is transferred over an unreliable protocol such as UDP.

Let us assume we have received the following sequence of bytes from the network:

stream offset value

+0000 0x02

+0001 0x34

+0002 0x01

+0003 0x23

+0004 0xFE

+0005 0xDC

+0006 0xF9

+0007 0x87

+0008 0xF0

(Yes, these are the PCM samples in big-endian layout from the previous example plus one additional byte, but our algorithm must work correctly without this hint :)

We know there are two possible interpretations of this sequence (big- and little-endian), but if we do not know the sample boundaries, two more combinations are possible, making four different interpretations in total:

1) little-endian, boundaries as is (the last sample is ignored, since we have only one byte of it; that byte will be prepended to the next sequence of bytes received from the network):

sample ## byte values sample value (decimal)

00 0x02 0x34 0x3402 (13 314)

01 0x01 0x23 0x2301 (8 961)

02 0xFE 0xDC 0xDCFE (-8 962)

03 0xF9 0x87 0x87F9 (-30 727)

04 0xF0 ?? 0x??F0 (??)

2) big-endian, boundaries as is (the last sample is ignored, since we have only one byte of it; that byte will be prepended to the next sequence of bytes received from the network):

sample ## byte values sample value (decimal)

00 0x02 0x34 0x0234 (564)

01 0x01 0x23 0x0123 (291)

02 0xFE 0xDC 0xFEDC (-292)

03 0xF9 0x87 0xF987 (-1 657)

04 0xF0 ?? 0xF0?? (??)

3) little-endian, adjusted boundaries (we simply ignore the first byte):

sample ## byte values sample value (decimal)

-- 0x02 (ignored)

00 0x34 0x01 0x0134 (308)

01 0x23 0xFE 0xFE23 (-477)

02 0xDC 0xF9 0xF9DC (-1 572)

03 0x87 0xF0 0xF087 (-3 961)

4) big-endian, adjusted boundaries (we simply ignore the first byte):

sample ## byte values sample value (decimal)

-- 0x02 (ignored)

00 0x34 0x01 0x3401 (13 313)

01 0x23 0xFE 0x23FE (9 214)

02 0xDC 0xF9 0xDCF9 (-8 967)

03 0x87 0xF0 0x87F0 (-30 736)

Notice that the sequence of bytes is always the same; only the way it is interpreted as PCM samples differs.

Now our task is to decide which interpretation (byte order and boundaries) should be chosen as the proper representation of the audio signal.

Let us take a look at a sine wave with PCM samples taken periodically. Each vertical line represents one PCM sample:
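The four interpretations can be reproduced with a short Python sketch. The `interpret` function and its parameters are illustrative names chosen for this example (they are not part of the VCX library); the decoding itself relies on the standard `struct` module.

```python
import struct

# The nine bytes received from the network in the example above.
received = bytes([0x02, 0x34, 0x01, 0x23, 0xFE, 0xDC, 0xF9, 0x87, 0xF0])

def interpret(data, fmt, skip):
    """Decode signed 16-bit samples from raw bytes.

    fmt  -- '<' for little-endian, '>' for big-endian (struct notation)
    skip -- 0 to keep the boundaries as is, 1 to ignore the first byte
    """
    usable = data[skip:]
    count = len(usable) // 2          # an incomplete last sample is ignored
    return list(struct.unpack(f"{fmt}{count}h", usable[:count * 2]))

little_as_is = interpret(received, "<", 0)
big_as_is = interpret(received, ">", 0)
little_adjusted = interpret(received, "<", 1)
big_adjusted = interpret(received, ">", 1)

print("1) little-endian, boundaries as is:   ", little_as_is)
print("2) big-endian, boundaries as is:      ", big_as_is)
print("3) little-endian, adjusted boundaries:", little_adjusted)
print("4) big-endian, adjusted boundaries:   ", big_adjusted)
```

The same nine bytes go into every call; only the byte order and the starting offset change.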

As you can see, adjacent samples do not differ much, but rather tend to change slowly in one direction. This assumes that the audio signal consists mostly of frequencies that are low relative to the selected sampling rate.

We can use that tendency as the basis of our algorithm. For each of the four interpretations we calculate the sum of absolute differences between adjacent samples:

1) little-endian, boundaries as is:

sample ## sample value difference

00 13 314 0

01 8 961 13 314 - 8 961 = 4 353

02 -8 962 8 961 + 8 962 = 17 923

03 -30 727 -8 962 + 30 727 = 21 765

Sum: 44 041

2) big-endian, boundaries as is:

sample ## sample value difference

00 564 0

01 291 273

02 -292 583

03 -1 657 1 365

Sum: 2 221

3) little-endian, adjusted boundaries:

sample ## sample value difference

00 308 0

01 -477 785

02 -1 572 1 095

03 -3 961 2 389

Sum: 4 269

4) big-endian, adjusted boundaries:

sample ## sample value difference

00 13 313 0

01 9 214 4 099

02 -8 967 18 181

03 -30 736 21 769

Sum: 44 049

The final step is to select the interpretation with the minimal sum of differences between samples. In our case it is interpretation number two: big-endian, boundaries as is (exactly as the hint suggested :).
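The whole procedure fits in a few lines of Python. The following is a minimal sketch of the idea described above, not the code used inside the VC components or VCX library; all function names are made up for this example. One assumption is added here: every interpretation is scored over the same number of samples, so that no candidate gets a lower sum merely because it contains fewer samples.

```python
import struct

def decode(data, fmt, skip, count):
    """Decode `count` signed 16-bit samples starting at byte offset `skip`."""
    return struct.unpack(f"{fmt}{count}h", data[skip:skip + 2 * count])

def sum_of_differences(samples):
    """Sum of absolute differences between adjacent samples."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))

def detect_byte_order(data):
    """Return (sums, best): the four sums and the winning (byte order, skip) pair."""
    if len(data) < 5:
        raise ValueError("need at least 5 bytes to compare all four interpretations")
    count = (len(data) - 1) // 2      # samples available at both offsets 0 and 1
    candidates = [("little", "<", 0), ("big", ">", 0),
                  ("little", "<", 1), ("big", ">", 1)]
    sums = {(name, skip): sum_of_differences(decode(data, fmt, skip, count))
            for name, fmt, skip in candidates}
    best = min(sums, key=sums.get)    # interpretation with the smallest sum
    return sums, best

received = bytes([0x02, 0x34, 0x01, 0x23, 0xFE, 0xDC, 0xF9, 0x87, 0xF0])
sums, best = detect_byte_order(received)
for (name, skip), total in sums.items():
    print(f"{name}-endian, skip {skip} byte(s): {total}")
print("selected:", best)
```

For the nine example bytes the four sums equal the totals in the tables above, and the selected pair is big-endian with no bytes skipped.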

Notice that the sums for interpretations 2) and 3) are close to each other, as are the sums for interpretations 1) and 4). That is because shifting the boundaries by one byte has nearly the same effect as switching between little- and big-endian when the data are audio samples, i.e. when adjacent samples do not differ much from each other.

As you can see, even for such a short sequence of bytes it is possible to detect the proper byte order of PCM audio samples. The longer the sequence, the better the guess should be. In real applications it is usually enough to analyze about 1/20 sec of audio signal (400 samples at an 8000 Hz sampling rate).

Please also note that if you are using a reliable protocol (like TCP), the boundaries and byte order need to be guessed only once and can then be applied to all subsequent data. When using an unreliable protocol (like UDP), it may be necessary to repeat the guess for each sequence (data packet).
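The difference between guessing once and guessing on every packet can be sketched as follows. `StreamDecoder` and its `detect_every_packet` flag are hypothetical names used only in this example and do not reflect how the VC components or VCX library are implemented. The sketch also keeps an odd trailing byte and prepends it to the next packet, as described for interpretations 1) and 2) above.

```python
import struct

def score(data, fmt, skip, count):
    """Sum of absolute differences between adjacent decoded samples."""
    samples = struct.unpack(f"{fmt}{count}h", data[skip:skip + 2 * count])
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))

def detect(data):
    """Return the (fmt, skip) pair with the smallest sum of differences."""
    count = (len(data) - 1) // 2
    candidates = [("<", 0), (">", 0), ("<", 1), (">", 1)]
    return min(candidates, key=lambda c: score(data, c[0], c[1], count))

class StreamDecoder:
    """Hypothetical helper: guess once, or guess again on every packet."""

    def __init__(self, detect_every_packet=False):
        self.detect_every_packet = detect_every_packet
        self.fmt = None        # '<' or '>' once a guess has been made
        self.pending = b""     # odd trailing byte kept for the next packet

    def feed(self, packet):
        data = self.pending + packet
        if self.fmt is None or self.detect_every_packet:
            if len(data) < 5:  # too short to compare all four interpretations
                self.pending = data
                return []
            self.fmt, skip = detect(data)
            data = data[skip:]             # drop a stray leading byte, if any
        count = len(data) // 2
        self.pending = data[count * 2:]    # zero or one leftover byte
        return list(struct.unpack(f"{self.fmt}{count}h", data[:count * 2]))

decoder = StreamDecoder()                  # guess once, as for a TCP stream
first = decoder.feed(bytes([0x02, 0x34, 0x01, 0x23, 0xFE, 0xDC, 0xF9, 0x87, 0xF0]))
second = decoder.feed(bytes([0x12, 0xEE, 0x00]))
print(first, second)
```

With `detect_every_packet=True` the same class repeats the guess for each packet, which corresponds to the UDP case; a real implementation would probably also want to analyze more than a handful of samples per guess.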

The way it works in VC components/VCX library

Our VC components and VCX library products include the byte order and sample boundary auto-detection algorithm described above. It is turned off by default, so you have to choose one of the following methods if you wish to enable it for incoming and/or outgoing streams:

method | description | property value

Don't care (default) | Auto-detection is disabled. | unasbo_dontCare

Swap | Always swap the bytes. This converts the audio stream from big-endian to little-endian and vice versa. Useful for output streams, or for input streams with known byte order and boundaries transferred over reliable protocols. | unasbo_swap

Auto-detect once | Enable the byte order and boundaries auto-detection algorithm. It analyzes the first data packet only and applies the same order and boundaries to all subsequent packets. Useful for reliable protocols (like TCP). | unasbo_autoDetectOnce

Auto-detect continuously | Enable the byte order and boundaries auto-detection algorithm. It analyzes all data packets as they arrive. Useful for unreliable protocols (like UDP). | unasbo_autoDetectCont

Assign the selected property value to the streamByteOrderInput and/or streamByteOrderOutput property to specify which method is to be used for incoming or outgoing data.

Limitations

IPServer or IPClient must be working in RAW streaming mode
16-bit uncompressed mono audio streams only