Creating a conferencing system

In addition to peer-to-peer communication and one-to-many streaming, VC components also include a high-performance audio conference server with multi-room support (provided with the Enterprise license only).

An audio conference is a multi-party communication process in which every participant can hear all the other parties. As in everyday meetings, people connected to a conference server can collaborate on a joint topic.

Each client records a voice stream and sends it to the server, and receives a specially mixed stream for playback. Since the server supports many rooms, upon successful connection the client must specify the room it wants to enter by sending a special packet to the server. Each room has a unique name and an optional password, usually provided by the end user.

In the screenshot below, the client has been allocated seat #1 in the room "test@server.com". Notice the "Room Name" and "Room Password" input fields.

Multi-room audio conference client

High performance server considerations

The heavy work of decoding, mixing and encoding audio is done on the server. This way we sacrifice CPU cycles to save bandwidth. Two important considerations follow from this:

  • the more powerful the server PC, the more clients it can handle. Increasing the number of processors (cores) matters more than increasing the CPU clock frequency.
  • the server software must be written in a scalable way, so that it can take advantage of multi-core and/or multi-processor architectures.

While the first issue is under your control, we control the second one. Here is what was done about it in the January 2008 release of VC components.

First, our underlying IP server component was completely rewritten to adapt to a new model of operation. This model is based on I/O completion ports (IOCP) and asynchronous (overlapped) socket operations. The advantage of this model is the scalability of the server software: it can effectively put every processor (core) to use when handling a heavy load.

(NOTE: for our server to run in the new mode, Windows NT Workstation 3.5, Windows 2000 Professional or later is required.)

Second, we have rewritten the mode in which our ACM codec component decodes and encodes audio data. In the new mode, multiple codec components can perform data processing from one thread, which reduces unnecessary context switching. Before the January 2008 release, 100 codec components would require 100 threads. Assuming two codecs are needed per client connection (one for decoding and one for encoding), 200 threads would be created for 100 clients, which prevented the previous version of the server from handling a large number of clients.
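The thread arithmetic above, and the idea of servicing many codecs from a single thread, can be illustrated with a minimal Python sketch. It is not the VC components implementation: `Codec`, `process_all` and `threads_needed` are hypothetical names, and the byte reversal only stands in for real encode/decode work.

```python
from dataclasses import dataclass, field


def threads_needed(clients: int, codecs_per_client: int = 2,
                   codecs_per_thread: int = 1) -> int:
    """Threads required when each thread drives `codecs_per_thread` codecs."""
    total_codecs = clients * codecs_per_client
    # Ceiling division: a partly filled thread still counts as a thread.
    return -(-total_codecs // codecs_per_thread)


@dataclass
class Codec:
    """Hypothetical stand-in for a codec component with queued input chunks."""
    name: str
    pending: list = field(default_factory=list)
    done: list = field(default_factory=list)

    def process_one(self) -> bool:
        """Process a single queued chunk; return False when idle."""
        if not self.pending:
            return False
        chunk = self.pending.pop(0)
        self.done.append(chunk[::-1])  # placeholder for real encode/decode work
        return True


def process_all(codecs: list) -> int:
    """Service every codec from the calling thread until all queues are empty."""
    processed = 0
    busy = True
    while busy:
        busy = False
        for codec in codecs:  # round-robin, no thread switch between codecs
            if codec.process_one():
                processed += 1
                busy = True
    return processed
```

With the defaults, `threads_needed(100)` gives the 200 threads of the old model, while a larger `codecs_per_thread` shows how sharing threads brings the count down.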

One server, many rooms

In addition, we have extended our server so that it can now handle several isolated rooms while still listening on one port. Rooms may be created automatically, when a new client connection is accepted and the client provides the name of a room that does not exist yet, or they may be created in advance by the administrator.
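The two ways of creating a room can be sketched as follows. This is only an illustration in Python: `RoomRegistry` and its methods are hypothetical, and the rule that the first client to name an unknown room also sets its password is an assumption of this sketch, not something the server is documented to do.

```python
class RoomRegistry:
    """Hypothetical room table; not the VC components API."""

    def __init__(self, auto_create: bool = True):
        self.auto_create = auto_create
        self.rooms = {}  # room name -> password ("" means no password)

    def create(self, name: str, password: str = "") -> None:
        """Administrator path: create a room ahead of time."""
        self.rooms.setdefault(name, password)

    def enter(self, name: str, password: str = "") -> bool:
        """Client path: return True if the client may enter the room."""
        if name not in self.rooms:
            if not self.auto_create:
                return False
            # Assumption: the first client to name an unknown room
            # also defines its password.
            self.rooms[name] = password
        return self.rooms[name] == password
```

All rooms live in one table behind a single listening port, so isolation comes from the room name each client supplies rather than from separate sockets.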

In the screenshot below, the room "test.room@conf.server" was created by the administrator and has no participants so far.

Multi-room audio conference server

Technical details: Client

The conference client and server sources can be found in the "<VC_root>\demos\ConfRoom\" folder.

The client is implemented as the unaConfClientClass class, defined in the classes\unaConfClient.pas unit. It is basically a collection of WaveIn, Codec, IPClient and WaveOut components put together to perform live audio communication with the server.

When the client receives a cmd_inOutIPPacket_hello packet from the server, it replies with a filled-in unaConfRoomPacket_clnIDKey record, defined in the classes\unaConfCommon.pas unit. This record specifies the room name and room password provided to the client by the end user or by other means. The server then tries to create the room (if it does not already exist and room auto-creation is allowed) and to assign a seat to the client. When done, the server either replies with a punaConfRoomPacket_srvReply record, so that the client can analyze the outcome of the request, or sends the cmd_inOutIPPacket_outOfSeats command if there is no room for the new client connection.
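The exchange described above can be modelled as a small client-side state machine. The Python sketch below is illustrative only: the packet kinds are plain strings loosely named after the records and commands mentioned in the text, and the `ok` and `seat` payload fields are hypothetical, since the actual record layouts live in unaConfCommon.pas and are not shown here.

```python
from enum import Enum, auto


class State(Enum):
    CONNECTED = auto()   # socket is up, waiting for the server's hello
    KEY_SENT = auto()    # room name and password sent, waiting for the verdict
    SEATED = auto()      # server accepted the request and assigned a seat
    REJECTED = auto()    # request refused, or no seats left


class ClientHandshake:
    """Hypothetical model of the join sequence; not the real wire protocol."""

    def __init__(self, room, password=""):
        self.room = room
        self.password = password
        self.state = State.CONNECTED
        self.seat = None

    def on_packet(self, kind, payload=None):
        """Feed one server packet; return the reply to send, or None."""
        if self.state is State.CONNECTED and kind == "hello":
            self.state = State.KEY_SENT
            return ("clnIDKey", {"room": self.room, "password": self.password})
        if self.state is State.KEY_SENT and kind == "srvReply":
            if payload and payload.get("ok"):
                self.state = State.SEATED
                self.seat = payload.get("seat")
            else:
                self.state = State.REJECTED
            return None
        if self.state is State.KEY_SENT and kind == "outOfSeats":
            self.state = State.REJECTED
            return None
        return None  # anything unexpected is ignored in this sketch
```

Keeping the sequence in an explicit state machine makes it easy to see that the client sends the room name and password only once, and only in response to the server's hello.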

Please refer to the client\ConfRoomCln.dpr sample for more information.

Technical details: Server

The server is implemented as the unaConfServerClass class, defined in the classes\unaConfServer.pas unit. It holds the client connections, rooms, codecs and mixer, as well as the IPServer component.

The following global constants specify the server limits and must be set according to your needs for optimal performance:

c_maxClients - maximum number of clients per server;
c_maxRooms - maximum number of rooms per server;
c_maxClientsPerRoom - maximum number of clients per room.
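How the three limits interact when a new client asks for a seat can be sketched as below. This is a Python illustration with hypothetical names (`Limits`, `can_admit`), and the order of the checks is an assumption, not the server's documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    max_clients: int           # mirrors c_maxClients
    max_rooms: int             # mirrors c_maxRooms
    max_clients_per_room: int  # mirrors c_maxClientsPerRoom


def can_admit(limits, room_sizes, room):
    """Decide whether one more client fits.

    room_sizes maps room name -> current number of clients in that room.
    """
    if sum(room_sizes.values()) >= limits.max_clients:
        return False  # the server as a whole is full
    if room not in room_sizes:
        # The client would need a new room to be created.
        return len(room_sizes) < limits.max_rooms
    return room_sizes[room] < limits.max_clients_per_room
```

For example, with `Limits(4, 2, 2)` a third client is refused in a room that already holds two, but is still admitted to a second room.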

The main job is done in the unaConfServerClass.timer() method, where the server decodes the audio streams received from clients, mixes them and encodes the result back.
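The mixing step can be sketched as follows, leaving decoding and encoding aside. This is a minimal Python illustration, not the code of timer(): it assumes 16-bit PCM samples, equal frame lengths, and a "mix-minus" scheme in which each client receives the sum of everyone else's audio. The function name `mix_for_each` is hypothetical.

```python
def mix_for_each(frames):
    """Build one output frame per client.

    frames maps client id -> list of 16-bit PCM samples (equal lengths).
    Returns client id -> mix of all other clients, clipped to the int16 range.
    """
    if not frames:
        return {}
    length = len(next(iter(frames.values())))
    # Sum all inputs once, then subtract each client's own contribution.
    total = [0] * length
    for samples in frames.values():
        for i, s in enumerate(samples):
            total[i] += s
    mixed = {}
    for client, samples in frames.items():
        mixed[client] = [max(-32768, min(32767, t - s))
                         for t, s in zip(total, samples)]
    return mixed
```

Summing once and subtracting per client keeps the work proportional to the number of clients rather than to its square, which matters when the server, not the clients, pays for mixing.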

Note that if the server is compiled with the "DEBUG" symbol defined, it will send a client's audio back to that client when it is the only participant in a room.

Final notes

It is strongly recommended to use UDP sockets for audio streaming with the conference server, as they dramatically reduce its CPU load. As noted above, the server sacrifices CPU cycles to save bandwidth.
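Because UDP neither retransmits nor reorders datagrams, a sender typically tags each audio frame so that the receiver can detect loss. The Python sketch below shows one way to do that; the header layout (32-bit sequence number plus 16-bit payload length) and the function names are hypothetical and are not the packet format used by VC components.

```python
import socket
import struct

# Hypothetical header: sequence number (uint32) and payload length (uint16),
# both in network byte order.
HEADER = struct.Struct("!IH")


def pack_frame(seq, payload):
    """Prefix an encoded audio frame with the header."""
    return HEADER.pack(seq & 0xFFFFFFFF, len(payload)) + payload


def unpack_frame(datagram):
    """Return (sequence number, payload); raise ValueError if truncated."""
    if len(datagram) < HEADER.size:
        raise ValueError("datagram shorter than header")
    seq, size = HEADER.unpack_from(datagram)
    payload = datagram[HEADER.size:HEADER.size + size]
    if len(payload) != size:
        raise ValueError("truncated datagram")
    return seq, payload


def send_frame(sock, address, seq, payload):
    """Fire-and-forget send: no connection state, no retransmission."""
    sock.sendto(pack_frame(seq, payload), address)
```

A caller would create the socket with `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` and pass increasing sequence numbers to `send_frame`; a gap in the numbers seen by `unpack_frame` on the receiving side indicates a lost frame.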

Please note that Windows 2000 Professional supports a maximum of 2 processors and is not aware of Hyper-Threading technology. Refer to this article for more information.

It is not recommended to run the server on Windows 98/Me unless you have a small number of clients. In that case the server will operate in the old "select()" sockets mode.

Download

A precompiled binary of this sample is included in the Demos package.