Learning Sound for Game Programming using SDL

When we think of video games, we often think of the music and sounds that accompany them. Providing those special effects used to be very difficult. Keep reading to learn how SDL makes this important task very easy.

A game without audio is like a buffet without spice. Games can be played without sound, but they fail to provide an immersive environment. Before the coming of SDL, sound effects were either very complex to implement or very limited in output.

Then came SDL with its core and extension libraries. The core library provides the ability to work with WAV files. Using the extension libraries, sound formats such as MIDI, MPEG-1 audio and others can be integrated into the gaming environment. In the first three sections I will discuss the core audio library. The final section will use the APIs introduced in those sections to create an application that can be extended for future projects.

Playing the Sound with SDL

Sound is one of the sub-systems of SDL. Unlike other sub-systems, sound not only needs to be initialized but also opened, in a way that is akin to setting up the video mode. Even then, sound is produced only through the playing routines. In essence there are three steps to using sound within an application. They are:

  1. Initializing audio
  2. Opening the audio
  3. Playing the sound

It is in the second step that the format, sample rate and more come into the picture. The following are the details of each step.

Initializing audio

The first step in using audio in an application is initializing the audio subsystem. This is done by passing SDL_Init() the flag that refers to the audio subsystem, i.e. SDL_INIT_AUDIO. To put it in code:

SDL_Init(SDL_INIT_AUDIO);

This is no different than initialization of any other sub-system.

Opening the audio

To open anything, be it a file or a socket, certain data has to be passed to the environment, such as the file name, mode and so forth. Opening the audio is no different. The data to be passed includes the frequency, format and more. SDL_OpenAudio() is used to provide these to the environment. This function takes two parameters; both are pointers to SDL_AudioSpec, which is a structure. The first describes the desired format and the second receives the format actually obtained. The members of this structure are:

The freq member is an integer representing the sample rate of the sound to be played. It is measured in samples per second. The common values are 11025, 22050 and 44100. The higher the value, the better the quality.

The format of the audio is represented by format. The data type is Uint16. The format describes the size and type of the samples being sent. A common value is AUDIO_S16. Other acceptable values include AUDIO_U16 and AUDIO_U8. The U stands for unsigned, the S stands for signed and the number is the number of bits per sample. Hence the value AUDIO_S16 represents samples of 16 bits which are signed.
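To make the difference between the unsigned 8-bit and signed 16-bit layouts concrete, here is a minimal sketch in plain C that converts single samples between the two. It does not use SDL; the function names u8_to_s16 and s16_to_u8 are made up for this illustration, and the only assumption is the usual convention that silence sits at 128 for unsigned 8-bit data and at 0 for signed 16-bit data.

```c
#include <stdint.h>

/* Hypothetical helper: convert one unsigned 8-bit sample
   (0..255, silence at 128) to a signed 16-bit sample
   (-32768..32767, silence at 0). */
int16_t u8_to_s16(uint8_t sample)
{
    return (int16_t)(((int)sample - 128) * 256);
}

/* Hypothetical helper: convert one signed 16-bit sample
   back to unsigned 8-bit, dropping the low 8 bits. */
uint8_t s16_to_u8(int16_t sample)
{
    return (uint8_t)(((int)sample + 32768) / 256);
}
```

In a real program this kind of conversion is what SDL_ConvertAudio takes care of; the sketch only shows what the U and S in the format names mean for the sample values.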

The channels member is the number of separate channels to be used. A value of 1 selects mono (a single channel) and a value of 2 selects stereo.
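Stereo data is stored interleaved: the left and right samples of each moment in time sit next to each other in the buffer. The following sketch, in plain C without SDL, duplicates a mono signal into an interleaved stereo buffer. The name mono_to_stereo is invented for this example, and the caller is assumed to provide an output buffer with room for twice as many samples as the input.

```c
#include <stdint.h>

/* Hypothetical helper: copy `frames` mono samples into an
   interleaved stereo buffer (left, right, left, right, ...).
   `stereo` must have room for 2 * frames samples. */
void mono_to_stereo(const int16_t *mono, int frames, int16_t *stereo)
{
    int i;
    for (i = 0; i < frames; ++i) {
        stereo[2 * i]     = mono[i];  /* left channel  */
        stereo[2 * i + 1] = mono[i];  /* right channel */
    }
}
```

This is also why a stereo buffer takes twice the memory of a mono buffer holding the same duration of sound.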

Samples refers to the size of the audio buffer in samples.

The callback member takes a pointer to the function that is called to fill the audio buffer. The function takes the user data, the stream to fill and the length of the stream in bytes as parameters.
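As a rough illustration of what such a callback does, the sketch below has the same three parameters but is written in plain C so that it can be tried without SDL. The Uint8 typedef is a local stand-in for SDL's type, and struct playback is a hypothetical user-data block invented for this example. It assumes a signed sample format, where silence is 0.

```c
#include <string.h>

typedef unsigned char Uint8;  /* local stand-in for SDL's Uint8 */

/* Hypothetical user data: a block of sound and a read position. */
struct playback {
    const Uint8 *data;  /* audio bytes to play     */
    int len;            /* total number of bytes   */
    int pos;            /* bytes already delivered */
};

/* Copy as much sound as is left into the stream and pad the
   rest of the stream with silence (0 for signed formats). */
void fill_audio(void *udata, Uint8 *stream, int len)
{
    struct playback *pb = (struct playback *)udata;
    int remaining = pb->len - pb->pos;
    int n = remaining < len ? remaining : len;

    memcpy(stream, pb->data + pb->pos, (size_t)n);
    pb->pos += n;
    memset(stream + n, 0, (size_t)(len - n));
}
```

With SDL, a pointer to such a structure would be stored in the userdata member and handed back to the callback as its first argument each time the buffer needs refilling.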

Apart from these, the other members include silence, an unsigned 8-bit integer holding the silence value for the audio buffer; size, a Uint32 holding the size of the buffer in bytes; and userdata, a void pointer to the user data.

In code it would look like this:

  SDL_AudioSpec wanted;
  void fill_audio(void *udata, Uint8 *stream, int len);

  /* Set the audio format */
  wanted.freq = 22050;
  wanted.format = AUDIO_S16;
  wanted.channels = 2; /* 1 = mono, 2 = stereo */
  wanted.samples = 1024; /* Good low-latency value for callback */

  wanted.callback = fill_audio;
  wanted.userdata = NULL;

Here the frequency is 22050 Hz, the format is 16-bit signed, the output is stereo, the audio buffer holds 1024 samples and the callback function is fill_audio. There is no user data to be passed. The next step is playing the audio.
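To see what these numbers mean in practice, the following sketch computes the size of the audio buffer in bytes and roughly how much time one buffer covers. It is plain C with no SDL calls; buffer_bytes and buffer_ms are names invented for this illustration, and the arithmetic assumes that samples counts one sample per channel.

```c
/* Hypothetical helper: size in bytes of a buffer holding
   `samples` samples for each of `channels` channels. */
unsigned long buffer_bytes(int samples, int channels, int bits_per_sample)
{
    return (unsigned long)samples * channels * (bits_per_sample / 8);
}

/* Hypothetical helper: approximate playing time of one
   buffer in milliseconds at the given sample rate. */
double buffer_ms(int samples, int freq)
{
    return 1000.0 * samples / freq;
}
```

For the values used here, buffer_bytes(1024, 2, 16) gives 4096 bytes and buffer_ms(1024, 22050) gives a little over 46 milliseconds. A smaller buffer lowers the delay between requesting a sound and hearing it, at the cost of calling the callback more often.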

Playing the Audio

Playing the audio means not only filling the buffer with the required data but also loading the audio file to be played. The functions required are the callback function and the file playing functions. The file playing functions include SDL_LoadWAV, SDL_BuildAudioCVT, SDL_ConvertAudio and SDL_FreeWAV.

SDL_LoadWAV loads a WAV file and returns the given SDL_AudioSpec with the corresponding data filled in. The first parameter is the name of the WAV file. The second is a pointer to an SDL_AudioSpec. If the call is successful, the third parameter will point to a malloc'd buffer that contains the audio data and the last parameter will hold the length of that buffer in bytes. In code it would be:

SDL_AudioSpec wave;
Uint8 *data;
Uint32 dlen;
const char *file = "test.wav";
SDL_LoadWAV(file, &wave, &data, &dlen);


The above code would load the file represented by file into the data and set its specifications into wave and the length of the buffer into dlen.

To actually use the data it must be converted, for which the SDL_AudioCVT structure is used. This structure must be initialized. The function to initialize the structure is SDL_BuildAudioCVT. Its parameters are a pointer to the SDL_AudioCVT structure, followed by the format (Uint16), number of channels (Uint8) and sample rate (int) of the source, and then the format (Uint16), number of channels (Uint8) and sample rate (int) of the destination. In code:

SDL_BuildAudioCVT(&cvt, wave.format, wave.channels, wave.freq, AUDIO_S16,2, 22050);

Here cvt is the SDL_AudioCVT structure; wave.format, wave.channels and wave.freq are the format, channels and frequency of the source; and AUDIO_S16, 2 and 22050 are the format, channels and frequency of the destination. Discussing SDL_AudioCVT in detail is beyond the scope of this article. I will be discussing it in the near future.

The SDL_ConvertAudio function converts audio from one format to another. It takes only one parameter, a pointer to the previously initialized SDL_AudioCVT. It converts the data pointed to by the buf member of that structure. To understand it fully let’s have a look at some detailed code. The comments are self-explanatory:

SDL_AudioSpec *desired, *obtained;
SDL_AudioSpec wav_spec;
SDL_AudioCVT wav_cvt;
Uint32 wav_len;
Uint8 *wav_buf;
int ret;

/* Allocate audio specs */
desired=(SDL_AudioSpec *)malloc(sizeof(SDL_AudioSpec));
obtained=(SDL_AudioSpec *)malloc(sizeof(SDL_AudioSpec));

/* Set desired format */
desired->freq=22050;
desired->format=AUDIO_S16LSB;
desired->channels=2; /* 1 = mono, 2 = stereo */
desired->samples=8192;
desired->callback=my_audio_callback;
desired->userdata=NULL;

/* Open the audio device */
if ( SDL_OpenAudio(desired, obtained) < 0 ){
  fprintf(stderr, "Couldn't open audio: %s\n", SDL_GetError());
  exit(-1);
}
free(desired);

/* Load the test.wav */
if( SDL_LoadWAV("test.wav", &wav_spec, &wav_buf, &wav_len) == NULL ){
  fprintf(stderr, "Could not open test.wav: %s\n", SDL_GetError());

  SDL_CloseAudio();
  free(obtained);
  exit(-1);
}

/* Build AudioCVT */
ret = SDL_BuildAudioCVT(&wav_cvt,
wav_spec.format, wav_spec.channels, wav_spec.freq,
obtained->format, obtained->channels, obtained->freq);

/* Check that the converter was built */
if(ret==-1){
  fprintf(stderr, "Couldn't build converter!\n");
  SDL_CloseAudio();
  free(obtained);
  SDL_FreeWAV(wav_buf);
  exit(-1);
}

/* Setup for conversion */
wav_cvt.buf=(Uint8 *)malloc(wav_len*wav_cvt.len_mult);
wav_cvt.len=wav_len;
memcpy(wav_cvt.buf, wav_buf, wav_len);

/* We can free the original WAV data now */

SDL_FreeWAV(wav_buf);

/* And now we’re ready to convert */
SDL_ConvertAudio(&wav_cvt);

/* do whatever */

Once the converter has been built and the data copied into the conversion buffer, the buffer allocated by SDL_LoadWAV has to be released, as it is no longer required. The conversion makes the audio available to the application through the buf member of SDL_AudioCVT. To release the memory allocated by SDL_LoadWAV, SDL_FreeWAV has to be used.

In code:

SDL_FreeWAV(wav_buf);
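The setup code above allocates wav_len*wav_cvt.len_mult bytes because the conversion happens in place and the converted audio can be larger than the original. SDL works out len_mult itself; the sketch below is not SDL's calculation, only a plain C estimate of how the size changes with the sample width, channel count and sample rate. The function name converted_len and its parameter names are invented for this illustration.

```c
/* Hypothetical estimate: number of bytes that src_len bytes of
   audio occupy after conversion to the destination layout.
   Sample widths are given in bits (8 or 16). */
unsigned long converted_len(unsigned long src_len,
                            int src_bits, int src_channels, int src_freq,
                            int dst_bits, int dst_channels, int dst_freq)
{
    /* Bytes used by one sample across all channels. */
    unsigned long src_frame = (unsigned long)(src_bits / 8) * src_channels;
    unsigned long dst_frame = (unsigned long)(dst_bits / 8) * dst_channels;

    /* Resampling changes the number of frames; round up. */
    unsigned long frames = src_len / src_frame;
    unsigned long out_frames = (frames * dst_freq + src_freq - 1) / src_freq;

    return out_frames * dst_frame;
}
```

For example, 1000 bytes of 8-bit mono sound at 11025 Hz grow to about 8000 bytes as 16-bit stereo at 22050 Hz, which is why the conversion buffer has to be allocated larger than the loaded WAV data.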

That covers the functions. The next section will show how to use them to play the sound.

Playing the sound in the real world

Up to now I have shown you code snippets. Now it’s time for a full-fledged application. So here goes.

First the includes:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "SDL.h"
#include "SDL_audio.h"

Then comes the main part and opening the audio:

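int main(int argc, char *argv[])
{
  extern void mixaudio(void *unused, Uint8 *stream, int len);
  extern void PlaySound(char *file);
  SDL_AudioSpec fmt;

  /* Initialize the audio subsystem */
  if ( SDL_Init(SDL_INIT_AUDIO) < 0 ) {
    fprintf(stderr, "Unable to initialize SDL: %s\n", SDL_GetError());
    exit(1);
  }

  /* Set 16-bit stereo audio at 22kHz */
  fmt.freq = 22050;
  fmt.format = AUDIO_S16;
  fmt.channels = 2;
  fmt.samples = 512; /* A good value for games */
  fmt.callback = mixaudio;
  fmt.userdata = NULL;

  /* Open the audio device and start playing sound! */
  if ( SDL_OpenAudio(&fmt, NULL) < 0 ) {
    fprintf(stderr, "Unable to open audio: %s\n", SDL_GetError());
    exit(1);
  }
  SDL_PauseAudio(0); /* the device is paused until this call */

  /* Other functions, such as the mixing and playing
     functions, can be called here */
  /* ... */
  PlaySound("start.wav");
  /* ... */

  SDL_CloseAudio(); /* close the audio device */
  SDL_Quit();
  return 0;
}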

The next part is playing the wav file. For this we need a structure that keeps track of the current sound data, position and length. The following is the structure:

#define NUM_SOUNDS 2
struct sample {
  Uint8 *data;
  Uint32 dpos;
  Uint32 dlen;
} sounds[NUM_SOUNDS];
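The structure doubles as a record of which slots are free: a slot whose position has reached its length has nothing left to play. Because the array has static storage it starts out zeroed, so every slot is initially free. The sketch below shows that bookkeeping on its own in plain C; struct slot, slots and find_free_slot are stand-in names chosen for this example so that it does not depend on SDL's types.

```c
#define NUM_SLOTS 2

/* Stand-in for the sample structure, using plain C types. */
struct slot {
    unsigned char *data;  /* converted audio data            */
    unsigned long dpos;   /* current playing position, bytes */
    unsigned long dlen;   /* total length, bytes             */
} slots[NUM_SLOTS];

/* Return the index of the first empty or finished slot,
   or -1 if every slot is still playing. */
int find_free_slot(void)
{
    int i;
    for (i = 0; i < NUM_SLOTS; ++i) {
        if (slots[i].dpos == slots[i].dlen) {
            return i;
        }
    }
    return -1;
}
```

Raising the number of slots allows more sounds to play at the same time, at the cost of a little more mixing work per callback.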

The next part is playing the file. The function goes like this:

void PlaySound(char *file)
{
  int index;
  SDL_AudioSpec wave;
  Uint8 *data;
  Uint32 dlen;
  SDL_AudioCVT cvt;

  /* Look for an empty (or finished) sound slot */
  for ( index=0; index<NUM_SOUNDS; ++index ) {
    if ( sounds[index].dpos == sounds[index].dlen ) {
      break;
    }
  }
  if ( index == NUM_SOUNDS )
    return;

  /* Load the sound file and convert it to 16-bit stereo at 22kHz */
  if ( SDL_LoadWAV(file, &wave, &data, &dlen) == NULL ) {
    fprintf(stderr, "Couldn't load %s: %s\n", file, SDL_GetError());
    return;
  }
  SDL_BuildAudioCVT(&cvt, wave.format, wave.channels, wave.freq,
                    AUDIO_S16, 2, 22050);
  cvt.buf = malloc(dlen*cvt.len_mult);
  memcpy(cvt.buf, data, dlen);
  cvt.len = dlen;
  SDL_ConvertAudio(&cvt);
  SDL_FreeWAV(data);

  /* Put the sound data in the slot (it starts playing immediately) */
  if ( sounds[index].data ) {
    free(sounds[index].data);
  }
  SDL_LockAudio();
  sounds[index].data = cvt.buf;
  sounds[index].dlen = cvt.len_cvt;
  sounds[index].dpos = 0;
  SDL_UnlockAudio();
}

Finally we have to define the callback function for the SDL_AudioSpec which is:

void mixaudio(void *unused, Uint8 *stream, int len)
{
  int i;
  Uint32 amount;

  for ( i=0; i<NUM_SOUNDS; ++i ) {
    amount = (sounds[i].dlen-sounds[i].dpos);
    if ( amount > (Uint32)len ) {
      amount = len;
    }
    SDL_MixAudio(stream, &sounds[i].data[sounds[i].dpos],
                 amount, SDL_MIX_MAXVOLUME);
    sounds[i].dpos += amount;
  }
}
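The callback relies on SDL_MixAudio to combine each playing sound with what is already in the stream. The sketch below is not SDL's implementation; it is a plain C illustration of the basic idea behind mixing signed 16-bit audio, which is to add the samples and clamp the result so that loud passages do not wrap around. The name mix_s16 is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: add `count` signed 16-bit samples from
   src into dst, clamping each sum to the 16-bit range. */
void mix_s16(int16_t *dst, const int16_t *src, int count)
{
    int i;
    for (i = 0; i < count; ++i) {
        int32_t sum = (int32_t)dst[i] + (int32_t)src[i];
        if (sum > 32767) {
            sum = 32767;
        } else if (sum < -32768) {
            sum = -32768;
        }
        dst[i] = (int16_t)sum;
    }
}
```

Without the clamp, two loud sounds added together would overflow and produce harsh clicks instead of merely sounding louder.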

That brings us to the end of this article. I have left several aspects unexplained. The reason is that explaining them as standalone functions wouldn’t do any good. They have to be understood in the context of rendering and scenes. SDL_MixAudio for mixing and the APIs for timers, threading, networking and CD-ROM access are among such functions. In the next article in this ongoing series I will be moving towards rendering using OpenGL with SDL as the base framework. In rendering and animation, the real utility of the above-mentioned APIs will be revealed. Till next time…
