Loading Sound Files - Programming Linux Games - Free Computer, Programming, Mathematics, Techni

SDL provides a convenientSDL LoadWAV function, but it is of little use to programs that don’t use SDL, and it reads only the wave (.wav) file format. A better option is the libsndfile library maintained by Erik de Castro Lopo, which contains routines for loading nearly every common sound format. This library is easy to use, and it is available under the GNU LGPL (formerly the more

restrictive GPL). You might also consider Michael Pruett’s libaudiofile library, a free implementation of an API originally developed by Silicon Graphics. It is a bit less straightforward than libsndfile, however.

Using libsndfile

The libsndfile library makes it easy to read sample data from a large number of sound file formats. Sound files in any supported format are opened with a single library function, and individual samples can then be read with a common interface, regardless of the file format’s encoding or compression. Your program must still take the size and signedness of the sample data into account, but libsndfile provides the necessary information.

The sf open readfunction opens a sound file for reading. This function accepts a filename and a pointer to an SF INFOstructure and returns a pointer to a

SNDFILE structure. TheSF INFOstructure represents the format of the sound data: its sample rate, sample size, and signedness. sf open read fills in this structure; your program does not need to supply this information in most cases. The SNDFILEstructure returned by sf open read is a file handle; its contents are used internally by libsndfile, and they are not important to us. sf open read

returns NULLif the requested file cannot be opened. For the curious, libsndfile does provide a sf open writefacility for writing sound files, but this facility is beyond our present needs.

After opening a sound file and obtaining a SNDFILEpointer, your program can read samples from it. libsndfile provides functions for reading samples as short integers, integers, or doubles. Each sample (8- or 16-bit) is called an item, and a pair of stereo samples (or one mono sample) is a frame. libsndfile allows you to read individual items or complete frames. Thesf readf short,sf readf int, and sf readf doublefunctions read frames from a SNDFILE into buffers of various types. sf readf doublecan optionally scale samples to the interval

[−1..1], which is convenient for advanced audio processing. We will demonstrate this function in the next example.

When your program is finished reading sound data from aSNDFILE, it should close the file with thesf close function.

Function sf open read(filename, info)

Synopsis Opens a sound file, such as a.wav or.aufile, for reading.

Returns Pointer to aSNDFILE structure that represents the open file. Fills in the providedSF INFOstructure with information about the sound data.

Parameters filename—The name of the file to open.

info—Pointer to theSF INFO structure that should receive information about the sound data.

Structure SF INFO

Synopsis Information about the data contained in a sound file.

Members samplerate—Sample rate of the sound data, in hertz. samples—Number of samples contained in each channel of the sound file. (The total number of samples is channels * samples.)

channels—Number of channels contained in the sound file. This is usually 1 for mono or 2 for stereo.

pcmbitwidth—Number of bits per sample.

format—Sample format. Format constants are defined insndfile.h. This chapter only uses a small portion of libsndfile’s capabilities—it can understand a wide variety of encoding formats, not just raw PCM.

Function sf readf type(sndfile, buffer, frames)

Synopsis Reads PCM data from an open sound file and returns it as a particular data type (regardless of the sample’s original format). Possible types areshort,int, and

double. Thedoubleversion of this function takes an extra boolean flag that indicates whether libsndfile should normalize sample values to the range [−1..1].

Returns Number of frames successfully read.

Parameters sndfile—Pointer to an openSNDFILE handle.

buffer—Pointer to a buffer for the data.

frames—Number of frames to read. (A frame consists of one sample from each channel.)

Function sf close(sndfile)

Synopsis Closes a sound file.

Parameters sndfile—Pointer to theSNDFILE to close.

Code Listing 5–1 (mp-loadsound.c)

/* Sound loader for Multi-Play. */ #include <stdio.h>

#include <stdlib.h> #include <sndfile.h>

/* Loads a sound file from disk into a newly allocated buffer. Sets *rate to the sample rate of the sound, *channels to the number of channels (1 for mono, 2 for stereo), *bits to the sample size in bits (8 or 16), *buf to the address of the buffer, and *buflen to the length of the buffer in bytes. 16-bit samples will be stored using the host machine’s endianness (little endian on Intel-based machines, big endian on PowerPC, etc.)

8-bit samples will always be unsigned. 16-bit will always be signed.

Channels are interleaved.

Requires libsndfile to be linked into the program.

Returns 0 on success and nonzero on failure. Prints an error message on failure. */

int LoadSoundFile(char *filename, int *rate, int *channels, int *bits, u_int8_t **buf, int *buflen) {

SNDFILE *file; SF_INFO file_info;

short *buffer_short = NULL; u_int8_t *buffer_8 = NULL; int16_t *buffer_16 = NULL; int i;

/* Open the file and retrieve sample information. */ file = sf_open_read(filename, &file_info);

if (file == NULL) {

printf("Unable to open ’%s’.\n", filename); return -1;

}

/* Make sure the format is acceptable. */

if ((file_info.format & 0x0F) != SF_FORMAT_PCM) { printf("’%s’ is not a PCM-based audio file.\n",

filename); sf_close(file); return -1; } if ((file_info.pcmbitwidth != 8) && (file_info.pcmbitwidth != 16)) {

printf("’%s’ uses an unrecognized sample size.\n", filename);

sf_close(file); return -1; }

/* Allocate buffers. */

buffer_short = (short *)malloc(file_info.samples * file_info.channels * sizeof (short)); buffer_8 = (u_int8_t *)malloc(file_info.samples *

file_info.channels *

file_info.pcmbitwidth / 8); buffer_16 = (int16_t *)buffer_8;

if (buffer_short == NULL || buffer_8 == NULL) {

printf("Unable to allocate enough memory for ’%s’.\n", filename); fclose(file); free(buffer_short); free(buffer_8); return -1; }

/* Read the entire sound file. */

if (sf_readf_short(file,buffer_short,file_info.samples) == (size_t)-1) {

printf("Error while reading samples from ’%s’.\n", filename); fclose(file); free(buffer_short); free(buffer_8); return -1; }

/* Convert the data to the correct format. */

for (i = 0; i < file_info.samples * file_info.channels; i++) { if (file_info.pcmbitwidth == 8) {

/* Convert the sample from a signed short to an unsigned byte */ buffer_8[i] = (u_int8_t)((short)buffer_short[i] + 128); } else { buffer_16[i] = (int16_t)buffer_short[i]; } }

/* Return the sound data. */ *rate = file_info.samplerate; *channels = file_info.channels; *bits = file_info.pcmbitwidth; *buf = buffer_8;

*buflen = file_info.samples * file_info.channels * file_info.pcmbitwidth / 8;

/* Close the file and return success. */ sf_close(file);

free(buffer_short); return 0;

}

This routine serves as Multi-Play’s file loader. It begins by opening a sound file withsf open, checking that the samples are in plain PCM format (libsndfile does support other sample types), and verifying that the samples are either 8-bit or 16-bit (again, other sample types are possible but less common). It then allocates memory for the samples. Since the size of shortintegers can vary from platform to platform, this routine takes a safe approach of allocating a

temporary buffer of typeshort for loading the file and then copying the samples to a buffer of either int16 toru int8 t. This is a bit wasteful, and a

production-quality sound player should probably take a different approach (such as using sizeofto determine the size of shortand handling particular cases). With the sample data now in memory, the routine passes the relevant data back to its caller and returns zero.

libsndfile treats all PCM samples as signed and reads them as short,int, or

double. The actual number of bits used by each sample is indicated in the

pcmbitwidth field of theSF INFOstructure. Since our sound player will treat all 8-bit samples as unsigned and all 16-bit samples as signed (the usual convention for sound playback), our loader converts 8-bit samples to unsigned by adding 128 (bringing them into the range [0..255]). Sixteen-bit samples need no special handling, since libsndfile returns samples in the host machine’s endianness (little endian on Intel-based machines and big endian on many others).

Other Options

If for some reason libsndfile doesn’t appeal to you, there are several other options. You could use the libaudiofile library, which has a slightly different API and similar capabilities. libaudiofile has been around for a long time, and it is widely used. You could implement your own wave file loader, but this is extra work. The original wave file format has a very simple structure, consisting of little more than a RIFF header and a block of raw sample data. Unfortunately, some wave files are encoded in strange ways, and these must be either

specifically handled or rejected. Libraries such as libsndfile and libaudiofile handle these quirks automatically.

Another option is to convert your sound files into a simpler format that is trivial to load. The SoX (Sound eXchange) utility is handy for this purpose. SoX can rewrite sound files as raw, headerless data, which you can then load with the C library’sfopen andfread functions. The main drawbacks to this approach are that the sound will be completely uncompressed (assuming that it was

compressed to begin with) and that you will have to keep track of the sound’s sample format by hand. This is a general nuisance, and so it’s better to use a library for loading sound files.

Using OSS

The OSS API is based on device files and the ioctlfacility. UNIX device files

are files that represent devices in the system rather than ordinary data storage space. To use a device file, an application typically opens it by name with the C library’sopen function and then proceeds to exchange data with it using the standardread and writefunctions. The kernel recognizes the file as a device and intercepts the data. In the instance of a sound card, this data would consist of raw PCM samples. Device files are traditionally located in the/dev directory on UNIX systems, but they can technically exist anywhere in the filesystem. The main OSS device files are/dev/dsp and/dev/audio.

This explains how a program can get samples into the sound card, but what about the sound card’s other attributes, such as the sample format and sampling rate? This is whereioctl comes in. Theioctl function (a system call, just like

configure the device before sending data with write. ioctl is a low-level

interface, but there is really nothing difficult about it, so long as you have plenty of documentation about the device file you are working with (hence this book). OSS provides a header file (soundcard.h) with symbols for the various ioctl

calls it supports. Be careful withioctl; its data goes directly to the kernel, and you could possibly throw a wrench into the system by sending unexpected data (though this would be considered a bug in the kernel; if you find such a bug, please report it).

Before we discuss ways to improve OSS’s performance for gaming, let’s examine the code for Multi-Play’s OSS back end.

Code Listing 5–2 (mp-oss.c)

/* Basic sound playback with OSS. */ #include <stdio.h> #include <unistd.h> #include <fcntl.h> #include <sys/ioctl.h> #include <sys/soundcard.h> #include <sys/mman.h>

/* Plays a sound with OSS (/dev/dsp), using default options. samples - raw 8-bit unsigned or 16-bit signed sample data bits - 8 or 16, indicating the sample size

channels - 1 or 2, indicating mono or stereo rate - sample frequency

bytes - length of sound data in bytes

Returns 0 on successful playback, nonzero on error. */ int PlayerOSS(u_int8_t *samples, int bits, int channels,

int rate, int bytes) {

/* file handle for /dev/dsp */ int dsp = 0;

/* Variables for ioctl’s. */

unsigned int requested, ioctl_format; unsigned int ioctl_channels, ioctl_rate;

/* Playback status variables. */ int position;

/* Attempt to open /dev/dsp for playback (writing). */ dsp = open("/dev/dsp",O_WRONLY);

/* This could very easily fail, so we must handle errors. */ if (dsp == -1) {

perror("OSS player: error opening /dev/dsp for playback"); return -1;

}

/* Select the appropriate sample format. */ switch (bits) {

case 8: ioctl_format = AFMT_U8; break; case 16: ioctl_format = AFMT_S16_NE; break;

default: printf("OSS player: unknown sample size.\n"); return -1;

}

/* We’ve decided on a format. We now need to pass it to OSS. ioctl is a very generalized interface. We always pass data to it by reference, not by value, even if the data is a simple integer. */

requested = ioctl_format;

if (ioctl(dsp,SNDCTL_DSP_SETFMT,&ioctl_format) == -1) { perror("OSS player: format selection failed"); close(dsp);

return -1; }

/* ioctl’s usually modify their arguments. SNDCTL_DSP_SETFMT sets its integer argument to the sample format that OSS actually gave us. This could be different than what we requested. For simplicity, we will not handle this situation. */

if (requested != ioctl_format) {

printf("OSS player: unsupported sample format.\n"); close(dsp);

return -1; }

/* We must inform OSS of the number of channels

(mono or stereo) before we set the sample rate. This is due to limitations in some (older) sound cards. */ ioctl_channels = channels;

if (ioctl(dsp,SNDCTL_DSP_CHANNELS,&ioctl_channels) == -1) { perror("OSS player: unable to set the number of channels"); close(dsp);

return -1; }

/* OSS might not have granted our request, even if the ioctl succeeded. */

if (channels != ioctl_channels) {

printf("OSS player: unable to set the number of channels.\n"); close(dsp);

return -1; }

/* We can now set the sample rate. */ ioctl_rate = rate;

if (ioctl(dsp,SNDCTL_DSP_SPEED,&ioctl_rate) == -1) { perror("OSS player: unable to set sample rate"); close(dsp);

return -1; }

/* OSS sets the SNDCTL_DSP_SPEED argument to the actual

sample rate, which may be different from the requested rate. In this case, a production-quality player would upsample or downsample the sound data. We’ll simply report an error. */ if (rate != ioctl_rate) {

printf("OSS player: unable to set the sample rate.\n"); close(dsp);

return -1; }

/* Feed the sound data to OSS. */ position = 0;

while (position < bytes) { int written, blocksize;

/* We’ll send audio data in 4096-byte chunks. This is arbitrary, but it should be a power of two if possible. This conditional just makes sure we properly handle the last chunk in the buffer. */

if (bytes-position < 4096) blocksize = bytes-position; else

blocksize = 4096;

/* Write to the sound device. */

written = write(dsp,&samples[position],blocksize); if (written == -1) {

perror("\nOSS player: error writing to sound device"); close(dsp);

return -1; }

/* Update the position. */ position += written;

/* Print some information. */

WritePlaybackStatus(position, bytes, channels, bits, rate); printf("\r"); fflush(stdout); } printf("\n"); close(dsp); return 0; }

This is the first of five Multi-Play player back ends, and one of three relating to OSS. These players are self-contained and handle all necessary initialization and cleanup internally. Their parameters are self-explanatory: the sound buffer to play is passed as a u int8 tpointer, and various information about the sample format and playback rate is passed in as integers.

Our player begins by opening the/dev/dsp device file. A production-quality player would likely provide some sort of command-line option to specify a

different OSS device, but/dev/dsp is valid on most systems. If the file is opened successfully, the player begins to set the appropriate playback parameters with theioctl interface.

The first step in configuring the driver is to set the sample size. Our loader recognizes both 8-bit unsigned and 16-bit signed samples, and these are the only cases our player needs to account for. After selecting the appropriate constant (codeAFMT S16 NE for 16-bit signed with default endianness, or AFMT U8for 8-bit unsigned), it callsioctlwith SNDCTL DSP SETFMT. Thisioctl call could fail if the driver does not support the requested format, but this is unlikely to happen unless the user has an extremely old sound card. We could instead use

AFMT S16 LEorAFMT S16 BEto explicitly set little or big endianness,

respectively. If thisioctl succeeds, the player sets the number of channels and the sample rate in a similar fashion. It is important to set the number of channels before the sample rate, because some sound cards have different limitations in mono or stereo mode.

The sound driver is now configured, and our player can begin sending samples to the device. Our player uses the standard UNIX writefunction to transfer the samples in 4,096-byte increments. Since sound cards play samples at a limited rate (specified by the sampling frequency), it is possible that these writecalls will block (delay until the sound card is ready for more samples) and thereby limit the speed of the rest of the program. This would not be acceptable in a game, since games typically have better things to do than wait on the sound card. We will discuss ways to avoid blocking in the next section.

When the entire set of samples has been transferred to the sound card, our player closes the/dev/dsp device and returns zero.

Reality Check

The OSS Web site has two main documents about OSS programming: Audio Programming and Making Audio Complicated. The former describes the basic method of OSS programming and is fairly simple to understand. The latter is aptly named; sound programming can quickly devolve into a messy subject. We will now cover some of the ugly details that are necessary for real-world game development. Unfortunately, the simple example we just discussed is woefully inadequate.

Perhaps the stickiest issue in game sound programming (particularly with OSS) is keeping the sound playback stream synchronized with the rest of the game. SDL provided a nice callback that would notify your program whenever the sound card was hungry for more samples, but OSS doesn’t have a callback system. If you try to write too many samples before the card has a chance to play them, thewritefunction will block (that is, it will not return until it can complete). As a result, the timing of your entire program will depend on OSS’s playback speed. This is not good.

A closely related issue is latency. It is inevitable that a game’s sound will lag slightly behind the action on the screen; all games experience this, and it is not a problem so long as the latency is only a small fraction of a second (preferably under ₁₀1 second). Unfortunately, OSS has quite a bit of internal buffer space, and if you were to fill it to capacity by blindly calling write, your latency would go through the roof. Imagine playing Quake and hearing a gunshot sound several seconds after firing the gun—you would probably laugh and switch to a different game!

OSS stores samples in an internal buffer that is divided intofragments of equal size. Each fragment stores approximately 1₂ second of audio data (resulting in a potential latency of 1₂ second). When the sound card is finished playing a fragment, it jumps to the next. If the next fragment has not been filled with samples, the sound card will play whatever happens to be there, which results in audio glitches. Ideally, you want to stay just a few fragments ahead of the sound card, to avoid skipping while keeping latency to a minimum. You can, within limits, specify the number of fragments and the size of each.

To set OSS’s internal fragment size, send the SNDCTL DSP SETFRAGMENT ioctl

In document Programming Linux Games - Free Computer, Programming, Mathematics, Technical Books, Lecture Notes and Tutorials (Page 182-200)