.. _devguide-coding-audio:

#####
Audio
#####

.. contents::
  :local:
  :backlinks: none
  :depth: 2

This chapter describes how to use the Pepper audio API to play an audio
stream. The Pepper audio API provides a low-level means of playing a stream of
audio samples generated by a Native Client module. The API generally works as
follows: A Native Client module creates an audio resource that represents an
audio stream, and tells the browser to start or stop playing the audio
resource. The browser calls a function in the Native Client module to fill a
buffer with audio samples every time it needs data to play from the audio
stream.

The code examples in this chapter describe a simple Native Client module that
generates audio samples using a sine wave with a frequency of 440 Hz. The module
starts playing the audio samples as soon as it is loaded into the browser. For a
slightly more sophisticated example, see the ``audio`` example (source code in
the SDK directory ``examples/api/audio``), which lets users specify a frequency
for the sine wave and click buttons to start and stop audio playback.

Reference information
=====================

For reference information related to the Pepper audio API, see the following
documentation:

* `pp::AudioConfig class
  <https://developers.google.com/native-client/peppercpp/classpp_1_1_audio_config>`_

* `pp::Audio class
  <https://developers.google.com/native-client/peppercpp/classpp_1_1_audio>`_

* `audio_config.h
  <https://developers.google.com/native-client/peppercpp/audio__config_8h>`_

* `audio.h <https://developers.google.com/native-client/peppercpp/audio_8h>`_

* `PP_AudioSampleRate
  <https://developers.google.com/native-client/pepperc/group___enums.html#gaee750c350655f2fb0fe04c04029e0ff8>`_

About the Pepper audio API
==========================

The Pepper audio API lets Native Client modules play audio streams in a
browser. To play an audio stream, a module generates audio samples and writes
them into a buffer. The browser reads the audio samples from the buffer and
plays them using an audio device on the client computer.

.. image:: /images/pepper-audio-buffer.png

This mechanism is simple but low-level. If you want to play plain sound files in
a web application, you may want to consider higher-level alternatives such as
using the HTML ``<audio>`` tag, JavaScript, or the new `Web Audio API
<http://chromium.googlecode.com/svn/trunk/samples/audio/index.html>`_.

The Pepper audio API is a good option for playing audio data if you want to do
audio processing in your web application. You might use the audio API, for
example, if you want to apply audio effects to sounds, synthesize your own
sounds, or do any other type of CPU-intensive processing of audio
samples. Another likely use case is gaming applications: you might use a gaming
library to process audio data, and then simply use the audio API to output the
processed data.

The Pepper audio API is straightforward to use:

#. Your module creates an audio configuration resource and an audio resource.

#. Your module implements a callback function that fills an audio buffer with
   data.

#. Your module invokes the ``StartPlayback`` and ``StopPlayback`` methods of the
   audio resource (e.g., when certain events occur).

#. The browser invokes your callback function whenever it needs audio data to
   play. Your callback function can generate the audio data in a number of
   ways; e.g., it can generate new data, or it can copy pre-mixed data into the
   audio buffer.

This basic interaction is illustrated below, and described in detail in the
sections that follow.

.. image:: /images/pepper-audio-api.png

Digital audio concepts
======================

Before you use the Pepper audio API, it's helpful to understand a few concepts
that are fundamental to how digital audio is recorded and played back:

sample rate
  the number of times an input sound source is sampled per second;
  correspondingly, the number of samples that are played back per second

bit depth
  the number of bits used to represent a sample

channels
  the number of input sources recorded in each sampling interval;
  correspondingly, the number of outputs that are played back simultaneously
  (typically using different speakers)

The higher the sample rate and bit depth used to record a sound wave, the more
accurately the sound wave can be reproduced, since it will have been sampled
more frequently and stored using a higher level of quantization. Common sampling
rates include 44,100 Hz (44,100 samples/second, the sample rate used on CDs),
and 48,000 Hz (the sample rate used on DVDs and Digital Audio Tapes). A common
bit depth is 16 bits per sample, and a common number of channels is 2 (left and
right channels for stereo sound).

.. _pepper_audio_configurations:

The Pepper audio API currently lets Native Client modules play audio streams
with the following configurations:

* **sample rate**: 44,100 Hz or 48,000 Hz
* **bit depth**: 16
* **channels**: 2 (stereo)

Setting up the module
=====================

The code examples below describe a simple Native Client module that generates
audio samples using a sine wave with a frequency of 440 Hz. The module starts
playing the audio samples as soon as it is loaded into the browser.

The Native Client module is set up by implementing subclasses of the
``pp::Module`` and ``pp::Instance`` classes, as normal.

.. naclcode::

  class SineSynthInstance : public pp::Instance {
   public:
    explicit SineSynthInstance(PP_Instance instance);
    virtual ~SineSynthInstance() {}

    // Called by the browser once the NaCl module is loaded and ready to
    // initialize. Creates a Pepper audio context and initializes it. Returns
    // true on success. Returning false causes the NaCl module to be deleted
    // and no other functions to be called.
    virtual bool Init(uint32_t argc, const char* argn[], const char* argv[]);

   private:
    // Function called by the browser when it needs more audio samples.
    static void SineWaveCallback(void* samples,
                                 uint32_t buffer_size,
                                 void* data);

    // Audio resource.
    pp::Audio audio_;

    ...

  };

  class SineSynthModule : public pp::Module {
   public:
    SineSynthModule() : pp::Module() {}
    ~SineSynthModule() {}

    // Create and return a SineSynthInstance object.
    virtual pp::Instance* CreateInstance(PP_Instance instance) {
      return new SineSynthInstance(instance);
    }
  };

Creating an audio configuration resource
========================================

Resources
---------

Before the module can play an audio stream, it must create two resources: an
audio configuration resource and an audio resource. Resources are handles to
objects that the browser provides to module instances. An audio resource is an
object that represents the state of an audio stream, including whether the
stream is paused or being played back, and which callback function to invoke
when the samples in the stream's buffer run out. An audio configuration resource
is an object that stores configuration data for an audio resource, including the
sampling frequency of the audio samples, and the number of samples that the
callback function must provide when the browser invokes it.

Sample frame count
------------------

Prior to creating an audio configuration resource, the module should call
``RecommendSampleFrameCount`` to obtain a *sample frame count* from the
browser. The sample frame count is the number of samples that the callback
function must provide per channel each time the browser invokes the callback
function. For example, if the sample frame count is 4096 for a stereo audio
stream, the callback function must provide 8192 samples (4096 for the left
channel and 4096 for the right channel).

The module can request a specific sample frame count, but the browser may return
a different sample frame count depending on the capabilities of the client
device. At present, ``RecommendSampleFrameCount`` simply bounds-checks the
requested sample frame count (see ``include/ppapi/c/ppb_audio_config.h`` for the
minimum and maximum sample frame counts, currently 64 and 32768). In the future,
``RecommendSampleFrameCount`` may perform a more sophisticated calculation,
particularly if there is an intrinsic buffer size for the client device.

Selecting a sample frame count for an audio stream involves a tradeoff between
latency and CPU usage. If you want your module to have short audio latency so
that it can rapidly change what's playing in the audio stream, you should
request a small sample frame count. That could be useful in gaming applications,
for example, where sounds have to change frequently in response to game
action. However, a small sample frame count results in higher CPU usage, since
the browser must invoke the callback function frequently to refill the audio
buffer. Conversely, a large sample frame count results in higher latency but
lower CPU usage. You should request a large sample frame count if your module
will play long, uninterrupted audio segments.
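
To make the tradeoff concrete, the arithmetic can be sketched with a few helper
functions. This is only an illustration: the function and constant names below
are hypothetical and not part of the Pepper API, and the sketch assumes the
16-bit stereo format described earlier. The buffer duration is a rough lower
bound on the added latency, not a measurement of what a given device delivers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration; none of these names are Pepper APIs.
constexpr uint32_t kChannels = 2;      // stereo, as assumed throughout
constexpr size_t kBytesPerSample = 2;  // 16-bit samples

// Milliseconds of audio covered by one buffer (rough lower bound on latency).
inline double BufferDurationMs(uint32_t sample_frame_count,
                               uint32_t sample_rate_hz) {
  assert(sample_rate_hz > 0);
  return 1000.0 * sample_frame_count / sample_rate_hz;
}

// How many times per second the callback must run to keep the stream fed.
inline double CallbacksPerSecond(uint32_t sample_frame_count,
                                 uint32_t sample_rate_hz) {
  assert(sample_frame_count > 0);
  return static_cast<double>(sample_rate_hz) / sample_frame_count;
}

// Number of bytes the callback has to write on each invocation.
inline size_t BufferSizeBytes(uint32_t sample_frame_count) {
  return static_cast<size_t>(sample_frame_count) * kChannels * kBytesPerSample;
}
```

At 44,100 Hz, a sample frame count of 4096 works out to about 92.9 ms of audio
per buffer, roughly 10.8 callbacks per second, and 16,384 bytes per callback.
The minimum count of 64 gives about 1.45 ms per buffer but roughly 689
callbacks per second, which is where the extra CPU cost comes from.
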

Supported audio configurations
------------------------------

After the module obtains a sample frame count, it can create an audio
configuration resource. Currently the Pepper audio API supports audio streams
with the configuration settings shown :ref:`above<pepper_audio_configurations>`.
C++ modules can create a configuration resource by instantiating a
``pp::AudioConfig`` object. Check ``audio_config.h`` for the latest
configurations that are supported.

.. naclcode::

  bool SineSynthInstance::Init(uint32_t argc,
                               const char* argn[],
                               const char* argv[]) {

    // Ask the browser/device for an appropriate sample frame count size.
    sample_frame_count_ =
        pp::AudioConfig::RecommendSampleFrameCount(PP_AUDIOSAMPLERATE_44100,
                                                   kSampleFrameCount);

    // Create an audio configuration resource.
    pp::AudioConfig audio_config = pp::AudioConfig(this,
                                                   PP_AUDIOSAMPLERATE_44100,
                                                   sample_frame_count_);

    // Create an audio resource.
    audio_ = pp::Audio(this,
                       audio_config,
                       SineWaveCallback,
                       this);

    // Start playback when the module instance is initialized.
    return audio_.StartPlayback();
  }

Creating an audio resource
==========================

Once the module has created an audio configuration resource, it can create an
audio resource. To do so, it instantiates a ``pp::Audio`` object, passing in a
pointer to the module instance, the audio configuration resource, a callback
function, and a pointer to user data (data that is used in the callback
function). See the example above.

Implementing a callback function
================================

The browser calls the callback function associated with an audio resource every
time it needs more samples to play. The callback function can generate new
samples (e.g., by applying sound effects), or copy pre-mixed samples into the
audio buffer. The example below generates new samples by computing values of a
sine wave.

The last parameter passed to the callback function is generic user data that the
function can use in processing samples. In the example below, the user data is a
pointer to the module instance, which includes member variables
``sample_frame_count_`` (the sample frame count obtained from the browser) and
``theta_`` (the last angle that was used to compute a sine value in the previous
callback; this lets the function generate a smooth sine wave by starting at that
angle plus a small delta).

.. naclcode::

  class SineSynthInstance : public pp::Instance {
   public:
    ...

   private:
    static void SineWaveCallback(void* samples,
                                 uint32_t buffer_size,
                                 void* data) {

      // The user data in this example is a pointer to the module instance.
      SineSynthInstance* sine_synth_instance =
          reinterpret_cast<SineSynthInstance*>(data);

      // Delta by which to increase theta_ for each sample.
      const double delta = kTwoPi * kFrequency / PP_AUDIOSAMPLERATE_44100;
      // Amount by which to scale up the computed sine value.
      const int16_t max_int16 = std::numeric_limits<int16_t>::max();

      int16_t* buff = reinterpret_cast<int16_t*>(samples);

      // Make sure we can't write outside the buffer.
      assert(buffer_size >= (sizeof(*buff) * kChannels *
                             sine_synth_instance->sample_frame_count_));

      for (size_t sample_i = 0;
           sample_i < sine_synth_instance->sample_frame_count_;
           ++sample_i, sine_synth_instance->theta_ += delta) {

        // Keep theta_ from going beyond 2*Pi.
        if (sine_synth_instance->theta_ > kTwoPi) {
          sine_synth_instance->theta_ -= kTwoPi;
        }

        // Compute the sine value for the current theta_, scale it up,
        // and write it into the buffer once for each channel.
        double sin_value(std::sin(sine_synth_instance->theta_));
        int16_t scaled_value = static_cast<int16_t>(sin_value * max_int16);
        for (size_t channel = 0; channel < kChannels; ++channel) {
          *buff++ = scaled_value;
        }
      }
    }

    ...
  };

Application threads and real-time requirements
----------------------------------------------

The callback function runs in a background application thread. This allows audio
processing to continue even when the application is busy doing something
else. If the main application thread and the callback thread access the same
data, you may be tempted to use a lock to control access to that data. You
should avoid the use of locks in the callback thread, however, as attempting to
acquire a lock may cause the thread to get swapped out, resulting in audio
dropouts.

In general, you must program the callback thread carefully, as the Pepper audio
API is a very low-level API that needs to meet hard real-time requirements. If
the callback thread spends too much time processing, it can easily miss the
real-time deadline, resulting in audio dropouts. One way the callback thread can
miss the deadline is by taking too much time doing computation. Another way the
callback thread can miss the deadline is by executing a function call that swaps
out the callback thread. Unfortunately, such function calls include just about
all C Run-Time (CRT) library calls and Pepper API calls. The callback thread
should therefore avoid ``malloc``, ``gettimeofday``, mutexes, condition
variables, critical sections, and so forth; any such calls could attempt to take
a lock and swap out the callback thread, which would be disastrous for audio
playback. Similarly, the callback thread should avoid Pepper API calls. Audio
dropouts due to thread swapping can be very rare and very hard to track down and
debug, so it's best to avoid making system/Pepper calls in the first place. In
short, the audio (callback) thread should use "lock-free" techniques and avoid
making CRT library calls.

One other issue to be aware of is that the ``StartPlayback`` function (discussed
below) is an asynchronous RPC; i.e., it does not block. That means that the
callback function may not be called immediately after the call to
``StartPlayback``. If it's important to synchronize the callback thread with
another thread so that the audio stream starts playing simultaneously with
another action in your application, you must handle such synchronization
manually.

Starting and stopping playback
==============================

To start and stop audio playback, the module simply reacts to JavaScript
messages.

.. naclcode::

  const char* const kPlaySoundId = "playSound";
  const char* const kStopSoundId = "stopSound";

  void SineSynthInstance::HandleMessage(const pp::Var& var_message) {
    if (!var_message.is_string()) {
      return;
    }
    std::string message = var_message.AsString();
    if (message == kPlaySoundId) {
      audio_.StartPlayback();
    } else if (message == kStopSoundId) {
      audio_.StopPlayback();
    } else if (...) {
      ...
    }
  }
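
If a message handler on the main thread also needs to change what the callback
plays (for example, a new sine frequency), the "lock-free" advice from the
section on application threads could be followed along these lines. This is a
minimal sketch, not the SDK example's implementation: ``ToneState``,
``SetFrequency``, and ``FillSineBuffer`` are hypothetical names, the Pepper
types are left out so the snippet stands alone, and it assumes
``std::atomic<uint32_t>`` is lock-free on the target (which can be checked with
``is_lock_free()``).

```cpp
#include <atomic>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

constexpr double kTwoPi = 6.283185307179586;
constexpr uint32_t kSampleRateHz = 44100;
constexpr size_t kChannels = 2;

// Hypothetical shared state; not a Pepper type.
struct ToneState {
  std::atomic<uint32_t> frequency_hz{440};  // written by the main thread
  double theta = 0.0;                       // used only by the audio thread
};

// Main-thread side: publish a new frequency without taking a lock.
inline void SetFrequency(ToneState* state, uint32_t hz) {
  state->frequency_hz.store(hz, std::memory_order_relaxed);
}

// Audio-thread side: same shape as SineWaveCallback, minus the Pepper types.
// It reads the shared frequency once per buffer and never blocks.
inline void FillSineBuffer(int16_t* buff, size_t sample_frame_count,
                           ToneState* state) {
  assert(buff != nullptr && state != nullptr);
  const uint32_t hz = state->frequency_hz.load(std::memory_order_relaxed);
  const double delta = kTwoPi * hz / kSampleRateHz;
  const int16_t max_int16 = std::numeric_limits<int16_t>::max();
  for (size_t i = 0; i < sample_frame_count; ++i, state->theta += delta) {
    // Keep theta from growing beyond 2*Pi.
    if (state->theta > kTwoPi) {
      state->theta -= kTwoPi;
    }
    const int16_t value =
        static_cast<int16_t>(std::sin(state->theta) * max_int16);
    // Write the same sample to every channel.
    for (size_t channel = 0; channel < kChannels; ++channel) {
      *buff++ = value;
    }
  }
}
```

In a real module, ``HandleMessage`` would call something like ``SetFrequency``
and the audio callback would do the work of ``FillSineBuffer``. A single atomic
value is enough here because the frequency is the only shared datum; sharing
larger structures between the two threads needs a more careful design.
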