.. _devguide-coding-audio:

#####
Audio
#####

.. contents::
  :local:
  :backlinks: none
  :depth: 2

This chapter describes how to use the Pepper audio API to play an audio
stream. The Pepper audio API provides a low-level means of playing a stream of
audio samples generated by a Native Client module. The API generally works as
follows: A Native Client module creates an audio resource that represents an
audio stream, and tells the browser to start or stop playing the audio
resource. The browser calls a function in the Native Client module to fill a
buffer with audio samples every time it needs data to play from the audio
stream.

The code examples in this chapter describe a simple Native Client module that
generates audio samples using a sine wave with a frequency of 440 Hz. The module
starts playing the audio samples as soon as it is loaded into the browser. For a
slightly more sophisticated example, see the ``audio`` example (source code in
the SDK directory ``examples/api/audio``), which lets users specify a frequency
for the sine wave and click buttons to start and stop audio playback.

Reference information
=====================

For reference information related to the Pepper audio API, see the following
documentation:

* `pp::AudioConfig class
  <https://developers.google.com/native-client/peppercpp/classpp_1_1_audio_config>`_

* `pp::Audio class
  <https://developers.google.com/native-client/peppercpp/classpp_1_1_audio>`_

* `audio_config.h
  <https://developers.google.com/native-client/peppercpp/audio__config_8h>`_

* `audio.h <https://developers.google.com/native-client/peppercpp/audio_8h>`_

* `PP_AudioSampleRate
  <https://developers.google.com/native-client/pepperc/group___enums.html#gaee750c350655f2fb0fe04c04029e0ff8>`_

About the Pepper audio API
==========================

The Pepper audio API lets Native Client modules play audio streams in a
browser. To play an audio stream, a module generates audio samples and writes
them into a buffer. The browser reads the audio samples from the buffer and
plays them using an audio device on the client computer.

.. image:: /images/pepper-audio-buffer.png

This mechanism is simple but low-level. If you want to play plain sound files in
a web application, you may want to consider higher-level alternatives such as
using the HTML ``<audio>`` tag, JavaScript, or the new `Web Audio API
<http://chromium.googlecode.com/svn/trunk/samples/audio/index.html>`_.

The Pepper audio API is a good option for playing audio data if you want to do
audio processing in your web application. You might use the audio API, for
example, if you want to apply audio effects to sounds, synthesize your own
sounds, or do any other type of CPU-intensive processing of audio
samples. Another likely use case is gaming applications: you might use a gaming
library to process audio data, and then simply use the audio API to output the
processed data.

The Pepper audio API is straightforward to use:

#. Your module creates an audio configuration resource and an audio resource.

#. Your module implements a callback function that fills an audio buffer with
   data.

#. Your module invokes the ``StartPlayback`` and ``StopPlayback`` methods of
   the audio resource (e.g., when certain events occur).

#. The browser invokes your callback function whenever it needs audio data to
   play. Your callback function can generate the audio data in a number of
   ways---e.g., it can generate new data, or it can copy pre-mixed data into the
   audio buffer.

This basic interaction is illustrated below, and described in detail in the
sections that follow.

.. image:: /images/pepper-audio-api.png

Digital audio concepts
======================

Before you use the Pepper audio API, it's helpful to understand a few concepts
that are fundamental to how digital audio is recorded and played back:

sample rate
  the number of times an input sound source is sampled per second;
  correspondingly, the number of samples that are played back per second

bit depth
  the number of bits used to represent a sample

channels
  the number of input sources recorded in each sampling interval;
  correspondingly, the number of outputs that are played back simultaneously
  (typically using different speakers)

The higher the sample rate and bit depth used to record a sound wave, the more
accurately the sound wave can be reproduced, since it will have been sampled
more frequently and quantized at a finer resolution. Common sampling
rates include 44,100 Hz (44,100 samples/second, the sample rate used on CDs)
and 48,000 Hz (the sample rate used on DVDs and Digital Audio Tapes). A common
bit depth is 16 bits per sample, and a common number of channels is 2 (left and
right channels for stereo sound).

.. _pepper_audio_configurations:

The Pepper audio API currently lets Native Client modules play audio streams
with the following configurations:

* **sample rate**: 44,100 Hz or 48,000 Hz
* **bit depth**: 16
* **channels**: 2 (stereo)
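
As a rough illustration of what these settings imply, the sketch below computes
the raw data rate of an uncompressed stream from its sample rate, bit depth, and
channel count. ``BytesPerSecond`` is a hypothetical helper written for this
example; it is not part of the Pepper API.

.. naclcode::

  #include <cstdint>

  // Hypothetical helper (not a Pepper function): bytes of uncompressed audio
  // consumed per second of playback.
  constexpr uint32_t BytesPerSecond(uint32_t sample_rate_hz,
                                    uint32_t bits_per_sample,
                                    uint32_t channels) {
    return sample_rate_hz * (bits_per_sample / 8) * channels;
  }

For the configurations listed above, ``BytesPerSecond(44100, 16, 2)`` evaluates
to 176,400 and ``BytesPerSecond(48000, 16, 2)`` to 192,000.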

Setting up the module
=====================

The code examples below describe a simple Native Client module that generates
audio samples using a sine wave with a frequency of 440 Hz. The module starts
playing the audio samples as soon as it is loaded into the browser.

The Native Client module is set up by implementing subclasses of the
``pp::Module`` and ``pp::Instance`` classes, as normal.

.. naclcode::

  class SineSynthInstance : public pp::Instance {
   public:
    explicit SineSynthInstance(PP_Instance instance);
    virtual ~SineSynthInstance() {}

    // Called by the browser once the NaCl module is loaded and ready to
    // initialize.  Creates a Pepper audio context and initializes it. Returns
    // true on success.  Returning false causes the NaCl module to be deleted
    // and no other functions to be called.
    virtual bool Init(uint32_t argc, const char* argn[], const char* argv[]);

   private:
    // Function called by the browser when it needs more audio samples.
    static void SineWaveCallback(void* samples,
                                 uint32_t buffer_size,
                                 void* data);

    // Audio resource.
    pp::Audio audio_;

    ...

  };

  class SineSynthModule : public pp::Module {
   public:
    SineSynthModule() : pp::Module() {}
    ~SineSynthModule() {}

    // Create and return a SineSynthInstance object.
    virtual pp::Instance* CreateInstance(PP_Instance instance) {
      return new SineSynthInstance(instance);
    }
  };

Creating an audio configuration resource
========================================

Resources
---------

Before the module can play an audio stream, it must create two resources: an
audio configuration resource and an audio resource. Resources are handles to
objects that the browser provides to module instances. An audio resource is an
object that represents the state of an audio stream, including whether the
stream is paused or being played back, and which callback function to invoke
when the samples in the stream's buffer run out. An audio configuration resource
is an object that stores configuration data for an audio resource, including the
sampling frequency of the audio samples, and the number of samples that the
callback function must provide when the browser invokes it.

Sample frame count
------------------

Prior to creating an audio configuration resource, the module should call
``RecommendSampleFrameCount`` to obtain a *sample frame count* from the
browser. The sample frame count is the number of samples that the callback
function must provide per channel each time the browser invokes the callback
function. For example, if the sample frame count is 4096 for a stereo audio
stream, the callback function must provide 8192 samples (4096 for the left
channel and 4096 for the right channel).
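
The arithmetic in that example can be sketched as follows. The helper names are
hypothetical and exist only for illustration; the sketch assumes the 16-bit
samples described earlier in this chapter.

.. naclcode::

  #include <cstddef>
  #include <cstdint>

  // Hypothetical helpers (not Pepper functions).

  // Total samples the callback must write: one per channel for every frame.
  constexpr size_t SamplesPerCallback(size_t sample_frame_count,
                                      size_t channels) {
    return sample_frame_count * channels;
  }

  // Size in bytes of a buffer holding that many 16-bit samples.
  constexpr size_t BufferBytes(size_t sample_frame_count, size_t channels) {
    return SamplesPerCallback(sample_frame_count, channels) * sizeof(int16_t);
  }

For a sample frame count of 4096 and 2 channels, ``SamplesPerCallback`` yields
8192 and ``BufferBytes`` yields 16,384. The bounds check in the callback example
later in this chapter compares ``buffer_size`` against a byte count of this
form.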

The module can request a specific sample frame count, but the browser may return
a different sample frame count depending on the capabilities of the client
device. At present, ``RecommendSampleFrameCount`` simply bound-checks the
requested sample frame count (see ``include/ppapi/c/ppb_audio_config.h`` for the
minimum and maximum sample frame counts, currently 64 and 32768). In the future,
``RecommendSampleFrameCount`` may perform a more sophisticated calculation,
particularly if there is an intrinsic buffer size for the client device.
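
To make the bound-checking behavior concrete, here is a minimal sketch of what
clamping a request to that range could look like. This is not the browser's
implementation: ``ClampSampleFrameCount`` and the two constants are
hypothetical names, and the limits are simply the values quoted above.

.. naclcode::

  #include <algorithm>
  #include <cstdint>

  // Assumed limits, copied from the text above; the authoritative values
  // live in include/ppapi/c/ppb_audio_config.h.
  const uint32_t kMinSampleFrameCount = 64;
  const uint32_t kMaxSampleFrameCount = 32768;

  // Hypothetical stand-in for a simple bound check: requests below the
  // minimum or above the maximum are pulled back into range.
  uint32_t ClampSampleFrameCount(uint32_t requested) {
    return std::min(std::max(requested, kMinSampleFrameCount),
                    kMaxSampleFrameCount);
  }

Under these assumptions a request for 16 frames would come back as 64, a
request for 4096 would be returned unchanged, and a request for 100,000 would
come back as 32768. A module should always use the value the browser returns
rather than the value it asked for.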

Selecting a sample frame count for an audio stream involves a tradeoff between
latency and CPU usage. If you want your module to have short audio latency so
that it can rapidly change what's playing in the audio stream, you should
request a small sample frame count. That could be useful in gaming applications,
for example, where sounds have to change frequently in response to game
action. However, a small sample frame count results in higher CPU usage, since
the browser must invoke the callback function frequently to refill the audio
buffer. Conversely, a large sample frame count results in higher latency but
lower CPU usage. You should request a large sample frame count if your module
will play long, uninterrupted audio segments.
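
The tradeoff can be estimated with simple arithmetic: one buffer holds
``sample_frame_count / sample_rate`` seconds of audio, and the callback runs
roughly ``sample_rate / sample_frame_count`` times per second. The helpers below
are hypothetical and only model this arithmetic; actual end-to-end latency also
depends on the browser and the audio device.

.. naclcode::

  #include <cstdint>

  // Hypothetical helpers (not Pepper functions).

  // Duration of audio held by one buffer, in milliseconds.
  double BufferDurationMs(uint32_t sample_frame_count,
                          uint32_t sample_rate_hz) {
    return 1000.0 * sample_frame_count / sample_rate_hz;
  }

  // Approximate number of callback invocations per second.
  double CallbacksPerSecond(uint32_t sample_frame_count,
                            uint32_t sample_rate_hz) {
    return static_cast<double>(sample_rate_hz) / sample_frame_count;
  }

At 44,100 Hz, a sample frame count of 4096 corresponds to about 92.9 ms of audio
per buffer and about 11 callbacks per second, while a sample frame count of 512
corresponds to about 11.6 ms per buffer and about 86 callbacks per second.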

Supported audio configurations
------------------------------

After the module obtains a sample frame count, it can create an audio
configuration resource. Currently the Pepper audio API supports audio streams
with the configuration settings shown :ref:`above<pepper_audio_configurations>`.
C++ modules can create a configuration resource by instantiating a
``pp::AudioConfig`` object. Check ``audio_config.h`` for the latest
configurations that are supported.

.. naclcode::

  bool SineSynthInstance::Init(uint32_t argc,
                               const char* argn[],
                               const char* argv[]) {

    // Ask the browser/device for an appropriate sample frame count.
    sample_frame_count_ =
        pp::AudioConfig::RecommendSampleFrameCount(PP_AUDIOSAMPLERATE_44100,
                                                   kSampleFrameCount);

    // Create an audio configuration resource.
    pp::AudioConfig audio_config = pp::AudioConfig(this,
                                                   PP_AUDIOSAMPLERATE_44100,
                                                   sample_frame_count_);

    // Create an audio resource.
    audio_ = pp::Audio(this,
                       audio_config,
                       SineWaveCallback,
                       this);

    // Start playback when the module instance is initialized.
    return audio_.StartPlayback();
  }

Creating an audio resource
==========================

Once the module has created an audio configuration resource, it can create an
audio resource. To do so, it instantiates a ``pp::Audio`` object, passing in a
pointer to the module instance, the audio configuration resource, a callback
function, and a pointer to user data (data that is used in the callback
function).  See the example above.

Implementing a callback function
================================

The browser calls the callback function associated with an audio resource every
time it needs more samples to play. The callback function can generate new
samples (e.g., by applying sound effects), or copy pre-mixed samples into the
audio buffer. The example below generates new samples by computing values of a
sine wave.

The last parameter passed to the callback function is generic user data that the
function can use in processing samples. In the example below, the user data is a
pointer to the module instance, which includes member variables
``sample_frame_count_`` (the sample frame count obtained from the browser) and
``theta_`` (the last angle that was used to compute a sine value in the previous
callback; this lets the function generate a smooth sine wave by starting at that
angle plus a small delta).

.. naclcode::

  class SineSynthInstance : public pp::Instance {
   public:
    ...

   private:
    static void SineWaveCallback(void* samples,
                                 uint32_t buffer_size,
                                 void* data) {

      // The user data in this example is a pointer to the module instance.
      SineSynthInstance* sine_synth_instance =
          reinterpret_cast<SineSynthInstance*>(data);

      // Delta by which to increase theta_ for each sample.
      const double delta = kTwoPi * kFrequency / PP_AUDIOSAMPLERATE_44100;
      // Amount by which to scale up the computed sine value.
      const int16_t max_int16 = std::numeric_limits<int16_t>::max();

      int16_t* buff = reinterpret_cast<int16_t*>(samples);

      // Make sure we can't write outside the buffer.
      assert(buffer_size >= (sizeof(*buff) * kChannels *
                             sine_synth_instance->sample_frame_count_));

      for (size_t sample_i = 0;
           sample_i < sine_synth_instance->sample_frame_count_;
           ++sample_i, sine_synth_instance->theta_ += delta) {

        // Keep theta_ from going beyond 2*Pi.
        if (sine_synth_instance->theta_ > kTwoPi) {
          sine_synth_instance->theta_ -= kTwoPi;
        }

        // Compute the sine value for the current theta_, scale it up,
        // and write it into the buffer once for each channel.
        double sin_value(std::sin(sine_synth_instance->theta_));
        int16_t scaled_value = static_cast<int16_t>(sin_value * max_int16);
        for (size_t channel = 0; channel < kChannels; ++channel) {
          *buff++ = scaled_value;
        }
      }
    }

    ...
  };
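
The sample-generation logic can also be exercised without a browser. The sketch
below is a standalone variant of the loop above that depends only on the C++
standard library. ``FillSineWave`` and its parameters are hypothetical names
chosen for this example, and the phase is passed in and returned explicitly
instead of being stored in a ``theta_`` member.

.. naclcode::

  #include <cmath>
  #include <cstddef>
  #include <cstdint>
  #include <limits>

  const double kTwoPi = 2.0 * 3.141592653589793;

  // Hypothetical standalone version of the callback's inner loop. Writes
  // frame_count frames of interleaved 16-bit samples (the same value on every
  // channel) and returns the phase at which the next call should start.
  double FillSineWave(int16_t* buffer,
                      size_t frame_count,
                      size_t channels,
                      double frequency_hz,
                      double sample_rate_hz,
                      double theta) {
    const double delta = kTwoPi * frequency_hz / sample_rate_hz;
    const int16_t max_int16 = std::numeric_limits<int16_t>::max();
    for (size_t frame = 0; frame < frame_count; ++frame) {
      const int16_t value =
          static_cast<int16_t>(std::sin(theta) * max_int16);
      for (size_t channel = 0; channel < channels; ++channel) {
        *buffer++ = value;
      }
      // Advance the phase and keep it from growing beyond 2*Pi.
      theta += delta;
      if (theta > kTwoPi) {
        theta -= kTwoPi;
      }
    }
    return theta;
  }

Feeding the returned phase into the next call is what keeps the waveform
continuous across buffers: filling 8 frames in one call produces the same
samples as filling 4 frames twice, as long as the second call starts from the
phase returned by the first.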

Application threads and real-time requirements
----------------------------------------------

The callback function runs in a background application thread. This allows audio
processing to continue even when the application is busy doing something
else. If the main application thread and the callback thread access the same
data, you may be tempted to use a lock to control access to that data. You
should avoid the use of locks in the callback thread, however, as attempting to
acquire a lock may cause the thread to get swapped out, resulting in audio
dropouts.

In general, you must program the callback thread carefully, as the Pepper audio
API is a very low-level API that needs to meet hard real-time requirements. If
the callback thread spends too much time processing, it can easily miss the
real-time deadline, resulting in audio dropouts. One way the callback thread can
miss the deadline is by taking too much time doing computation. Another way the
callback thread can miss the deadline is by executing a function call that swaps
out the callback thread. Unfortunately, such function calls include just about
all C Run-Time (CRT) library calls and Pepper API calls. The callback thread
should therefore avoid calls to ``malloc`` and ``gettimeofday``, and should not
use mutexes, condition variables, critical sections, and so forth; any such
calls could attempt to take a lock and swap out the callback thread, which would
be disastrous for audio playback. Similarly, the callback thread should avoid
Pepper API calls. Audio dropouts due to thread swapping can be very rare and
very hard to track down and debug---it's best to avoid making system/Pepper
calls in the first place. In short, the audio (callback) thread should use
"lock-free" techniques and avoid making CRT library calls.
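
One possible lock-free technique is to share simple parameters between the main
thread and the callback thread through atomic variables. The sketch below is
only an illustration under stated assumptions: it assumes a C++11 toolchain and
a target on which ``std::atomic<uint32_t>`` is lock-free, and
``SharedAudioParams``, ``SetFrequency``, and ``ReadFrequency`` are hypothetical
names that do not appear in the Pepper API or the SDK examples.

.. naclcode::

  #include <atomic>
  #include <cstdint>

  // Hypothetical state shared between the main thread and the audio callback.
  struct SharedAudioParams {
    std::atomic<uint32_t> frequency_hz{440};
  };

  // Main thread: publish a new frequency without taking a lock.
  void SetFrequency(SharedAudioParams& params, uint32_t frequency_hz) {
    params.frequency_hz.store(frequency_hz, std::memory_order_relaxed);
  }

  // Callback thread: read the latest frequency without taking a lock.
  uint32_t ReadFrequency(const SharedAudioParams& params) {
    return params.frequency_hz.load(std::memory_order_relaxed);
  }

In this arrangement the callback would call ``ReadFrequency`` once at the start
of each invocation and use that value for the whole buffer. Relaxed ordering is
enough here only because a single independent value is shared; publishing
several related values together would need a different scheme.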

One other issue to be aware of is that the ``StartPlayback`` function (discussed
below) is an asynchronous RPC; i.e., it does not block. That means that the
callback function may not be called immediately after the call to
``StartPlayback``. If it's important to synchronize the callback thread with
another thread so that the audio stream starts playing simultaneously with
another action in your application, you must handle such synchronization
manually.

Starting and stopping playback
==============================

To start and stop audio playback, the module simply reacts to JavaScript
messages.

.. naclcode::

  const char* const kPlaySoundId = "playSound";
  const char* const kStopSoundId = "stopSound";

  void SineSynthInstance::HandleMessage(const pp::Var& var_message) {
    if (!var_message.is_string()) {
      return;
    }
    std::string message = var_message.AsString();
    if (message == kPlaySoundId) {
      audio_.StartPlayback();
    } else if (message == kStopSoundId) {
      audio_.StopPlayback();
    } else if (...) {
      ...
    }
  }