• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1 /* -----------------------------------------------------------------------------
2 Software License for The Fraunhofer FDK AAC Codec Library for Android
3 
4 © Copyright  1995 - 2021 Fraunhofer-Gesellschaft zur Förderung der angewandten
5 Forschung e.V. All rights reserved.
6 
7  1.    INTRODUCTION
8 The Fraunhofer FDK AAC Codec Library for Android ("FDK AAC Codec") is software
9 that implements the MPEG Advanced Audio Coding ("AAC") encoding and decoding
10 scheme for digital audio. This FDK AAC Codec software is intended to be used on
11 a wide variety of Android devices.
12 
13 AAC's HE-AAC and HE-AAC v2 versions are regarded as today's most efficient
14 general perceptual audio codecs. AAC-ELD is considered the best-performing
15 full-bandwidth communications codec by independent studies and is widely
16 deployed. AAC has been standardized by ISO and IEC as part of the MPEG
17 specifications.
18 
19 Patent licenses for necessary patent claims for the FDK AAC Codec (including
20 those of Fraunhofer) may be obtained through Via Licensing
21 (www.vialicensing.com) or through the respective patent owners individually for
22 the purpose of encoding or decoding bit streams in products that are compliant
23 with the ISO/IEC MPEG audio standards. Please note that most manufacturers of
24 Android devices already license these patent claims through Via Licensing or
25 directly from the patent owners, and therefore FDK AAC Codec software may
26 already be covered under those patent licenses when it is used for those
27 licensed purposes only.
28 
29 Commercially-licensed AAC software libraries, including floating-point versions
30 with enhanced sound quality, are also available from Fraunhofer. Users are
31 encouraged to check the Fraunhofer website for additional applications
32 information and documentation.
33 
34 2.    COPYRIGHT LICENSE
35 
36 Redistribution and use in source and binary forms, with or without modification,
37 are permitted without payment of copyright license fees provided that you
38 satisfy the following conditions:
39 
40 You must retain the complete text of this software license in redistributions of
41 the FDK AAC Codec or your modifications thereto in source code form.
42 
43 You must retain the complete text of this software license in the documentation
44 and/or other materials provided with redistributions of the FDK AAC Codec or
45 your modifications thereto in binary form. You must make available free of
46 charge copies of the complete source code of the FDK AAC Codec and your
47 modifications thereto to recipients of copies in binary form.
48 
49 The name of Fraunhofer may not be used to endorse or promote products derived
50 from this library without prior written permission.
51 
52 You may not charge copyright license fees for anyone to use, copy or distribute
53 the FDK AAC Codec software or your modifications thereto.
54 
55 Your modified versions of the FDK AAC Codec must carry prominent notices stating
56 that you changed the software and the date of any change. For modified versions
57 of the FDK AAC Codec, the term "Fraunhofer FDK AAC Codec Library for Android"
58 must be replaced by the term "Third-Party Modified Version of the Fraunhofer FDK
59 AAC Codec Library for Android."
60 
61 3.    NO PATENT LICENSE
62 
63 NO EXPRESS OR IMPLIED LICENSES TO ANY PATENT CLAIMS, including without
64 limitation the patents of Fraunhofer, ARE GRANTED BY THIS SOFTWARE LICENSE.
65 Fraunhofer provides no warranty of patent non-infringement with respect to this
66 software.
67 
68 You may use this FDK AAC Codec software or modifications thereto only for
69 purposes that are authorized by appropriate patent licenses.
70 
71 4.    DISCLAIMER
72 
73 This FDK AAC Codec software is provided by Fraunhofer on behalf of the copyright
74 holders and contributors "AS IS" and WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES,
75 including but not limited to the implied warranties of merchantability and
76 fitness for a particular purpose. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
77 CONTRIBUTORS BE LIABLE for any direct, indirect, incidental, special, exemplary,
78 or consequential damages, including but not limited to procurement of substitute
79 goods or services; loss of use, data, or profits, or business interruption,
80 however caused and on any theory of liability, whether in contract, strict
81 liability, or tort (including negligence), arising in any way out of the use of
82 this software, even if advised of the possibility of such damage.
83 
84 5.    CONTACT INFORMATION
85 
86 Fraunhofer Institute for Integrated Circuits IIS
87 Attention: Audio and Multimedia Departments - FDK AAC LL
88 Am Wolfsmantel 33
89 91058 Erlangen, Germany
90 
91 www.iis.fraunhofer.de/amm
92 amm-info@iis.fraunhofer.de
93 ----------------------------------------------------------------------------- */
94 
95 /**************************** AAC encoder library ******************************
96 
97    Author(s):   M. Lohwasser
98 
99    Description:
100 
101 *******************************************************************************/
102 
103 /**
104  * \file   aacenc_lib.h
105  * \brief  FDK AAC Encoder library interface header file.
106  *
107 \mainpage  Introduction
108 
109 \section Scope
110 
111 This document describes the high-level interface and usage of the ISO/MPEG-2/4
112 AAC Encoder library developed by the Fraunhofer Institute for Integrated
113 Circuits (IIS).
114 
115 The library implements encoding on the basis of the MPEG-2 and MPEG-4 AAC
116 Low-Complexity standard, and depending on the library's configuration, MPEG-4
117 High-Efficiency AAC v2 and/or AAC-ELD standard.
118 
119 All references to SBR (Spectral Band Replication) are only applicable to HE-AAC
120 or AAC-ELD versions of the library. All references to PS (Parametric Stereo) are
121 only applicable to HE-AAC v2 versions of the library.
122 
123 \section encBasics Encoder Basics
124 
125 This document can only give a rough overview about the ISO/MPEG-2 and ISO/MPEG-4
126 AAC audio coding standard. To understand all the terms in this document, you are
127 encouraged to read the following documents.
128 
129 - ISO/IEC 13818-7 (MPEG-2 AAC), which defines the syntax of MPEG-2 AAC audio
130 bitstreams.
131 - ISO/IEC 14496-3 (MPEG-4 AAC, subparts 1 and 4), which defines the syntax of
132 MPEG-4 AAC audio bitstreams.
133 - Lutzky, Schuller, Gayer, Krämer, Wabnik, "A guideline to audio codec
134 delay", 116th AES Convention, May 8, 2004
135 
136 MPEG Advanced Audio Coding is based on a time-to-frequency mapping of the
137 signal. The signal is partitioned into overlapping portions and transformed into
138 frequency domain. The spectral components are then quantized and coded. \n An
139 MPEG-2 or MPEG-4 AAC audio bitstream is composed of frames. Contrary to MPEG-1/2
140 Layer-3 (mp3), the length of individual frames is not restricted to a fixed
141 number of bytes, but can take on any length between 1 and 768 bytes.
142 
143 
144 \page LIBUSE Library Usage
145 
146 \section InterfaceDescription API Files
147 
148 All API header files are located in the folder /include of the release package.
149 All header files are provided for usage in C/C++ programs. The AAC encoder
150 library API functions are located in aacenc_lib.h.
151 
152 \section CallingSequence Calling Sequence
153 
154 For encoding of ISO/MPEG-2/4 AAC bitstreams the following sequence is mandatory.
155 Input read and output write functions as well as the corresponding open and
156 close functions are left out, since they may be implemented differently
157 according to the user's specific requirements. The example implementation uses
158 file-based input/output.
159 
160 -# Call aacEncOpen() to allocate encoder instance with required \ref encOpen
161 "configuration". \code HANDLE_AACENCODER hAacEncoder = NULL; if ( (ErrorStatus =
162 aacEncOpen(&hAacEncoder,0,0)) != AACENC_OK ) { \endcode
163 -# Call aacEncoder_SetParam() for each parameter to be set. AOT, samplingrate,
164 channelMode, bitrate and transport type are \ref encParams "mandatory". \code
165 ErrorStatus = aacEncoder_SetParam(hAacEncoder, parameter, value);
166 \endcode
167 -# Call aacEncEncode() with NULL parameters to \ref encReconf "initialize"
168 encoder instance with present parameter set. \code ErrorStatus =
169 aacEncEncode(hAacEncoder, NULL, NULL, NULL, NULL); \endcode
170 -# Call aacEncInfo() to retrieve a configuration data block to be transmitted
171 out of band. This is required when using RFC3640 or RFC3016 like transport.
172 \code
173 AACENC_InfoStruct encInfo;
174 aacEncInfo(hAacEncoder, &encInfo);
175 \endcode
176 -# Encode input audio data in loop.
177 \code
178 do
179 {
180 \endcode
181 Feed \ref feedInBuf "input buffer" with new audio data and provide input/output
182 \ref bufDes "arguments" to aacEncEncode(). \code ErrorStatus =
183 aacEncEncode(hAacEncoder, &inBufDesc, &outBufDesc, &inargs, &outargs); \endcode
184 Write \ref writeOutData "output data" to file or audio device.
185 \code
186 } while (ErrorStatus==AACENC_OK);
187 \endcode
188 -# Call aacEncClose() and destroy encoder instance.
189 \code
190 aacEncClose(&hAacEncoder);
191 \endcode
192 
193 
194 \section encOpen Encoder Instance Allocation
195 
196 The assignment of the aacEncOpen() function is very flexible and can be used in
197 the following way.
198 - If the amount of memory consumption is not an issue, the encoder instance can
199 be allocated for the maximum number of possible audio channels (for example 6 or
200 8) with the full functional range supported by the library. This is the default
201 open procedure for the AAC encoder if memory consumption does not need to be
202 minimized. \code aacEncOpen(&hAacEncoder,0,0) \endcode
203 - If the required MPEG-4 AOTs do not call for the full functional range of the
204 library, encoder modules can be allocated selectively. \verbatim
205 ------------------------------------------------------
206  AAC | SBR |  PS | MD |         FLAGS         | value
207 -----+-----+-----+----+-----------------------+-------
208   X  |  -  |  -  |  - | (0x01)                |  0x01
209   X  |  X  |  -  |  - | (0x01|0x02)           |  0x03
210   X  |  X  |  X  |  - | (0x01|0x02|0x04)      |  0x07
211   X  |  -  |  -  |  X | (0x01          |0x10) |  0x11
212   X  |  X  |  -  |  X | (0x01|0x02     |0x10) |  0x13
213   X  |  X  |  X  |  X | (0x01|0x02|0x04|0x10) |  0x17
214 ------------------------------------------------------
215  - AAC: Allocate AAC Core Encoder module.
216  - SBR: Allocate Spectral Band Replication module.
217  - PS: Allocate Parametric Stereo module.
218  - MD: Allocate Meta Data module within AAC encoder.
219 \endverbatim
220 \code aacEncOpen(&hAacEncoder,value,0) \endcode
221 - Specifying the maximum number of channels to be supported in the encoder
222 instance can be done as follows.
223  - For example allocate an encoder instance which supports 2 channels for all
224 supported AOTs. The library itself may be capable of encoding up to 6 or 8
225 channels but in this example only 2 channel encoding is required and thus only
226 buffers for 2 channels are allocated to save data memory. \code
227 aacEncOpen(&hAacEncoder,0,2) \endcode
228  - Additionally the maximum number of supported channels in the SBR module can
229 be denoted separately.\n In this example the encoder instance provides a maximum
230 of 6 channels out of which up to 2 channels support SBR. This encoder instance
231 can produce for example 5.1 channel AAC-LC streams or stereo HE-AAC (v2)
232 streams. HE-AAC 5.1 multi channel is not possible since only 2 out of 6 channels
233 support SBR, which saves data memory. \code aacEncOpen(&hAacEncoder,0,6|(2<<8))
234 \endcode \n
235 
236 \section bufDes Input/Output Arguments
237 
238 \subsection allocIOBufs Provide Buffer Descriptors
239 In the present encoder API, the input and output buffers are described with \ref
240 AACENC_BufDesc "buffer descriptors". This mechanism allows a flexible handling
241 of input and output buffers without impact to the actual encoding call. Optional
242 buffers are necessary e.g. for ancillary data, meta data input or additional
243 output buffers describing superframing data in DAB+ or DRM+.\n At least one
244 input buffer for audio input data and one output buffer for bitstream data must
245 be allocated. The input buffer size can be a user defined multiple of the number
246 of input channels. PCM input data will be copied from the user defined PCM
247 buffer to an internal input buffer and so input data can be less than one AAC
248 audio frame. The output buffer size should be 6144 bits per channel excluding
249 the LFE channel. If the output data does not fit into the provided buffer, an
250 AACENC_ERROR will be returned by aacEncEncode(). \code static INT_PCM
251 inputBuffer[8*2048]; static UCHAR            ancillaryBuffer[50]; static
252 AACENC_MetaData  metaDataSetup; static UCHAR            outputBuffer[8192];
253 \endcode
254 
255 All input and output buffer must be clustered in input and output buffer arrays.
256 \code
257 static void* inBuffer[]        = { inputBuffer, ancillaryBuffer, &metaDataSetup
258 }; static INT   inBufferIds[]     = { IN_AUDIO_DATA, IN_ANCILLRY_DATA,
259 IN_METADATA_SETUP }; static INT   inBufferSize[]    = { sizeof(inputBuffer),
260 sizeof(ancillaryBuffer), sizeof(metaDataSetup) }; static INT   inBufferElSize[]
261 = { sizeof(INT_PCM), sizeof(UCHAR), sizeof(AACENC_MetaData) };
262 
263 static void* outBuffer[]       = { outputBuffer };
264 static INT   outBufferIds[]    = { OUT_BITSTREAM_DATA };
265 static INT   outBufferSize[]   = { sizeof(outputBuffer) };
266 static INT   outBufferElSize[] = { sizeof(UCHAR) };
267 \endcode
268 
269 Allocate buffer descriptors
270 \code
271 AACENC_BufDesc inBufDesc;
272 AACENC_BufDesc outBufDesc;
273 \endcode
274 
275 Initialize input buffer descriptor
276 \code
277 inBufDesc.numBufs            = sizeof(inBuffer)/sizeof(void*);
278 inBufDesc.bufs              = (void**)&inBuffer;
279 inBufDesc.bufferIdentifiers = inBufferIds;
280 inBufDesc.bufSizes          = inBufferSize;
281 inBufDesc.bufElSizes        = inBufferElSize;
282 \endcode
283 
284 Initialize output buffer descriptor
285 \code
286 outBufDesc.numBufs           = sizeof(outBuffer)/sizeof(void*);
287 outBufDesc.bufs              = (void**)&outBuffer;
288 outBufDesc.bufferIdentifiers = outBufferIds;
289 outBufDesc.bufSizes          = outBufferSize;
290 outBufDesc.bufElSizes        = outBufferElSize;
291 \endcode
292 
293 \subsection argLists Provide Input/Output Argument Lists
294 The input and output arguments of an aacEncEncode() call are described in
295 argument structures. \code AACENC_InArgs     inargs; AACENC_OutArgs    outargs;
296 \endcode
297 
298 \section feedInBuf Feed Input Buffer
299 The input buffer should be handled as a modulo buffer. New audio data in the
300 form of pulse-code- modulated samples (PCM) must be read from external and be
301 fed to the input buffer depending on its fill level. The required sample bitrate
302 (represented by the data type INT_PCM which is 16, 24 or 32 bits wide) is fixed
303 and depends on library configuration (usually 16 bit). \code inargs.numInSamples
304 += WAV_InputRead ( wavIn, &inputBuffer[inargs.numInSamples],
305                                        FDKmin(encInfo.inputChannels*encInfo.frameLength,
306                                               sizeof(inputBuffer) /
307                                               sizeof(INT_PCM)-inargs.numInSamples),
308                                        SAMPLE_BITS
309                                      );
310 \endcode
311 
312 After the encoder's internal buffer is fed with incoming audio samples, and
313 aacEncEncode() processed the new input data, update/move remaining samples in
314 input buffer, simulating a modulo buffer: \code if (outargs.numInSamples>0) {
315     FDKmemmove( inputBuffer,
316                 &inputBuffer[outargs.numInSamples],
317                 sizeof(INT_PCM)*(inargs.numInSamples-outargs.numInSamples) );
318     inargs.numInSamples -= outargs.numInSamples;
319 }
320 \endcode
321 
322 \section writeOutData Output Bitstream Data
323 If any AAC bitstream data is available, write it to output file or device as
324 follows. \code if (outargs.numOutBytes>0) { FDKfwrite(outputBuffer,
325 outargs.numOutBytes, 1, pOutFile);
326 }
327 \endcode
328 
329 \section cfgMetaData Meta Data Configuration
330 
331 If the present library is configured with Metadata support, it is possible to
332 insert meta data side info into the generated audio bitstream while encoding.
333 
334 To work with meta data the encoder instance has to be \ref encOpen "allocated"
335 with meta data support. The meta data mode must be be configured with the
336 ::AACENC_METADATA_MODE parameter and aacEncoder_SetParam() function. \code
337 aacEncoder_SetParam(hAacEncoder, AACENC_METADATA_MODE, 0-3); \endcode
338 
339 This configuration indicates how to embed meta data into bitstrem. Either no
340 insertion, MPEG or ETSI style. The meta data itself must be specified within the
341 meta data setup structure AACENC_MetaData.
342 
343 Changing one of the AACENC_MetaData setup parameters can be achieved from
344 outside the library within ::IN_METADATA_SETUP input buffer. There is no need to
345 supply meta data setup structure every frame. If there is no new meta setup data
346 available, the encoder uses the previous setup or the default configuration in
347 initial state.
348 
349 In general the audio compressor and limiter within the encoder library can be
350 configured with the ::AACENC_METADATA_DRC_PROFILE parameter
351 AACENC_MetaData::drc_profile and and AACENC_MetaData::comp_profile.
352 \n
353 
354 \section encReconf Encoder Reconfiguration
355 
356 The encoder library allows reconfiguration of the encoder instance with new
357 settings continuously between encoding frames. Each parameter to be changed must
358 be set with a single aacEncoder_SetParam() call. The internal status of each
359 parameter can be retrieved with an aacEncoder_GetParam() call.\n There is no
360 stand-alone reconfiguration function available. When parameters were modified
361 from outside the library, an internal control mechanism triggers the necessary
362 reconfiguration process which will be applied at the beginning of the following
363 aacEncEncode() call. This state can be observed from external via the
364 AACENC_INIT_STATUS and aacEncoder_GetParam() function. The reconfiguration
365 process can also be applied immediately when all parameters of an aacEncEncode()
366 call are NULL with a valid encoder handle.\n\n The internal reconfiguration
367 process can be controlled from extern with the following access. \code
368 aacEncoder_SetParam(hAacEncoder, AACENC_CONTROL_STATE, AACENC_CTRLFLAGS);
369 \endcode
370 
371 
372 \section encParams Encoder Parametrization
373 
374 All parameteres listed in ::AACENC_PARAM can be modified within an encoder
375 instance.
376 
377 \subsection encMandatory Mandatory Encoder Parameters
378 The following parameters must be specified when the encoder instance is
379 initialized. \code aacEncoder_SetParam(hAacEncoder, AACENC_AOT, value);
380 aacEncoder_SetParam(hAacEncoder, AACENC_BITRATE, value);
381 aacEncoder_SetParam(hAacEncoder, AACENC_SAMPLERATE, value);
382 aacEncoder_SetParam(hAacEncoder, AACENC_CHANNELMODE, value);
383 \endcode
384 Beyond that is an internal auto mode which preinitizializes the ::AACENC_BITRATE
385 parameter if the parameter was not set from extern. The bitrate depends on the
386 number of effective channels and sampling rate and is determined as follows.
387 \code
388 AAC-LC (AOT_AAC_LC): 1.5 bits per sample
389 HE-AAC (AOT_SBR): 0.625 bits per sample (dualrate sbr)
390 HE-AAC (AOT_SBR): 1.125 bits per sample (downsampled sbr)
391 HE-AAC v2 (AOT_PS): 0.5 bits per sample
392 \endcode
393 
394 \subsection channelMode Channel Mode Configuration
395 The input audio data is described with the ::AACENC_CHANNELMODE parameter in the
396 aacEncoder_SetParam() call. It is not possible to use the encoder instance with
397 a 'number of input channels' argument. Instead, the channelMode must be set as
398 follows. \code aacEncoder_SetParam(hAacEncoder, AACENC_CHANNELMODE, value);
399 \endcode The parameter is specified in ::CHANNEL_MODE and can be mapped from the
400 number of input channels in the following way. \code CHANNEL_MODE chMode =
401 MODE_INVALID;
402 
403 switch (nChannels) {
404   case 1:  chMode = MODE_1;          break;
405   case 2:  chMode = MODE_2;          break;
406   case 3:  chMode = MODE_1_2;        break;
407   case 4:  chMode = MODE_1_2_1;      break;
408   case 5:  chMode = MODE_1_2_2;      break;
409   case 6:  chMode = MODE_1_2_2_1;    break;
410   case 7:  chMode = MODE_6_1;        break;
411   case 8:  chMode = MODE_7_1_BACK;   break;
412   default:
413     chMode = MODE_INVALID;
414 }
415 return chMode;
416 \endcode
417 
418 \subsection peakbitrate Peak Bitrate Configuration
419 In AAC, the default bitreservoir configuration depends on the chosen bitrate per
420 frame and the number of effective channels. The size can be determined as below.
421 \f[
422 bitreservoir = nEffChannels*6144 - (bitrate*framelength/samplerate)
423 \f]
424 Due to audio quality concerns it is not recommended to change the bitreservoir
425 size to a lower value than the default setting! However, for minimizing the
426 delay for streaming applications or for achieving a constant size of the
427 bitstream packages in each frame, it may be necessaray to limit the maximum bits
428 per frame size. This can be done with the ::AACENC_PEAK_BITRATE parameter. \code
429 aacEncoder_SetParam(hAacEncoder, AACENC_PEAK_BITRATE, value);
430 \endcode
431 
432 To achieve acceptable audio quality with a reduced bitreservoir size setting at
433 least 1000 bits per audio channel is recommended. For a multichannel audio file
434 with 5.1 channels the bitreservoir reduced to 5000 bits results in acceptable
435 audio quality.
436 
437 
438 \subsection vbrmode Variable Bitrate Mode
439 The variable bitrate (VBR) mode coding adapts the bit consumption to the
440 psychoacoustic requirements of the signal. The encoder ignores the user-defined
441 bit rate and selects a suitable pre-defined configuration based on the provided
442 AOT. The VBR mode 1 is tuned for HE-AACv2, for VBR mode 2, HE-AACv1 should be
443 used. VBR modes 3-5 should be used with Low-Complexity AAC. When encoding
444 AAC-ELD, the best mode is selected automatically.
445 
446 The bitrates given in the table are averages over time and different encoder
447 settings. They strongly depend on the type of audio signal. The VBR
448 configurations can be adjusted with the ::AACENC_BITRATEMODE encoder parameter.
449 \verbatim
450 -----------------------------------------------
451  VBR_MODE | Approx. Bitrate in kbps for stereo
452           |     AAC-LC    |      AAC-ELD
453 ----------+---------------+--------------------
454     VBR_1 | 32 (HE-AACv2) |         48
455     VBR_2 | 72 (HE-AACv1) |         56
456     VBR_3 |      112      |         72
457     VBR_4 |      148      |        148
458     VBR_5 |      228      |        224
459 --------------------------------------------
460 \endverbatim
461 Note that these figures are valid for stereo encoding only. VBR modes 2-5 will
462 yield much lower bit rates when encoding single-channel input. For
463 configurations which are making use of downmix modules the AAC core channels
464 respectively downmix channels shall be considered.
465 
466 \subsection encQual Audio Quality Considerations
467 The default encoder configuration is suggested to be used. Encoder tools such as
468 TNS and PNS are activated by default and are internally controlled (see \ref
469 BEHAVIOUR_TOOLS).
470 
471 There is an additional quality parameter called ::AACENC_AFTERBURNER. In the
472 default configuration this quality switch is deactivated because it would cause
473 a workload increase which might be significant. If workload is not an issue in
474 the application we recommended to activate this feature. \code
475 aacEncoder_SetParam(hAacEncoder, AACENC_AFTERBURNER, 0/1); \endcode
476 
477 \subsection encELD ELD Auto Configuration Mode
478 For ELD configuration a so called auto configurator is available which
479 configures SBR and the SBR ratio by itself. The configurator is used when the
480 encoder parameter ::AACENC_SBR_MODE and ::AACENC_SBR_RATIO are not set
481 explicitly.
482 
483 Based on sampling rate and chosen bitrate a reasonable SBR configuration will be
484 used. \verbatim
485 ------------------------------------------------------------------
486  Sampling Rate |   Total Bitrate | No. of | SBR |       SBR Ratio
487      [kHz]     |      [bit/s]    |  Chan  |     |
488                |                 |        |     |
489 ---------------+-----------------+--------+-----+-----------------
490      ]min, 16[ |    min -    max |      1 | off |             ---
491 ---------------+-----------------+--------------+-----------------
492           [16] |    min -  27999 |      1 |  on | downsampled SBR
493                |  28000 -    max |      1 | off |             ---
494 ---------------+-----------------+--------------+-----------------
495      ]16 - 24] |    min -  39999 |      1 |  on | downsampled SBR
496                |  40000 -    max |      1 | off |             ---
497 ---------------+-----------------+--------------+-----------------
498      ]24 - 32] |    min -  27999 |      1 |  on |    dualrate SBR
499                |  28000 -  55999 |      1 |  on | downsampled SBR
500                |  56000 -    max |      1 | off |             ---
501 ---------------+-----------------+--------------+-----------------
502    ]32 - 44.1] |    min -  63999 |      1 |  on |    dualrate SBR
503                |  64000 -    max |      1 | off |             ---
504 ---------------+-----------------+--------------+-----------------
505    ]44.1 - 48] |    min -  63999 |      1 |  on |    dualrate SBR
506                |  64000 -  max   |      1 | off |             ---
507                |                 |        |     |
508 ---------------+-----------------+--------+-----+-----------------
509      ]min, 16[ |    min -    max |      2 | off |             ---
510 ---------------+-----------------+--------------+-----------------
511           [16] |    min -  31999 |      2 |  on | downsampled SBR
512                |  32000 -  63999 |      2 |  on | downsampled SBR
513                |  64000 -    max |      2 | off |             ---
514 ---------------+-----------------+--------------+-----------------
515      ]16 - 24] |    min -  47999 |      2 |  on | downsampled SBR
516                |  48000 -  79999 |      2 |  on | downsampled SBR
517                |  80000 -    max |      2 | off |             ---
518 ---------------+-----------------+--------------+-----------------
519      ]24 - 32] |    min -  31999 |      2 |  on |    dualrate SBR
520                |  32000 -  67999 |      2 |  on |    dualrate SBR
521                |  68000 -  95999 |      2 |  on | downsampled SBR
522                |  96000 -    max |      2 | off |             ---
523 ---------------+-----------------+--------------+-----------------
524    ]32 - 44.1] |    min -  43999 |      2 |  on |    dualrate SBR
525                |  44000 - 127999 |      2 |  on |    dualrate SBR
526                | 128000 -    max |      2 | off |             ---
527 ---------------+-----------------+--------------+-----------------
528    ]44.1 - 48] |    min -  43999 |      2 |  on |    dualrate SBR
529                |  44000 - 127999 |      2 |  on |    dualrate SBR
530                | 128000 -  max   |      2 | off |             ---
531                |                 |              |
532 ------------------------------------------------------------------
533 \endverbatim
534 
535 \subsection encDsELD Reduced Delay (Downscaled) Mode
536 The downscaled mode of AAC-ELD reduces the algorithmic delay of AAC-ELD by
537 virtually increasing the sampling rate. When using the downscaled mode, the
538 bitrate should be increased for keeping the same audio quality level. For common
539 signals, the bitrate should be increased by 25% for a downscale factor of 2.
540 
541 Currently, downscaling factors 2 and 4 are supported.
542 To enable the downscaled mode in the encoder, the framelength parameter
543 AACENC_GRANULE_LENGTH must be set accordingly to 256 or 240 for a downscale
544 factor of 2 or 128 or 120 for a downscale factor of 4. The default values of 512
545 or 480 mean that no downscaling is applied. \code
546 aacEncoder_SetParam(hAacEncoder, AACENC_GRANULE_LENGTH, 256);
547 aacEncoder_SetParam(hAacEncoder, AACENC_GRANULE_LENGTH, 128);
548 \endcode
549 
550 Downscaled bitstreams are fully backwards compatible. However, the legacy
551 decoder needs to support high sample rate, e.g. 96kHz. The signaled sampling
552 rate is multiplied by the downscale factor. Although not required, downscaling
553 should be applied when decoding downscaled bitstreams. It reduces CPU workload
554 and the output will have the same sampling rate as the input. In an ideal
555 configuration both encoder and decoder should run with the same downscale
556 factor.
557 
558 The following table shows approximate filter bank delays in ms for common
559 sampling rates(sr) at framesize(fs), and downscale factor(dsf), based on this
560 formula: \f[ 1000 * fs / (dsf * sr) \f]
561 
562 \verbatim
563 --------------------------------------
564       | 512/2 | 512/4 | 480/2 | 480/4
565 ------+-------+-------+-------+-------
566 22050 | 17.41 |  8.71 | 16.33 |  8.16
567 32000 | 12.00 |  6.00 | 11.25 |  5.62
568 44100 |  8.71 |  4.35 |  8.16 |  4.08
569 48000 |  8.00 |  4.00 |  7.50 |  3.75
570 --------------------------------------
571 \endverbatim
572 
573 \section audiochCfg Audio Channel Configuration
574 The MPEG standard refers often to the so-called Channel Configuration. This
575 Channel Configuration is used for a fixed Channel Mapping. The configurations
576 1-7 and 11,12,14 are predefined in MPEG standard and used for implicit
577 signalling within the encoded bitstream. For user defined Configurations the
578 Channel Configuration is set to 0 and the Channel Mapping must be explecitly
579 described with an appropriate Program Config Element. The present Encoder
580 implementation does not allow the user to configure this Channel Configuration
581 from extern. The Encoder implementation supports fixed Channel Modes which are
582 mapped to Channel Configuration as follow. \verbatim
583 ----------------------------------------------------------------------------------------
584  ChannelMode           | ChCfg | Height | front_El      | side_El  | back_El  |
585 lfe_El
586 -----------------------+-------+--------+---------------+----------+----------+---------
587 MODE_1                 |     1 | NORM   | SCE           |          |          |
588 MODE_2                 |     2 | NORM   | CPE           |          |          |
589 MODE_1_2               |     3 | NORM   | SCE, CPE      |          |          |
590 MODE_1_2_1             |     4 | NORM   | SCE, CPE      |          | SCE      |
591 MODE_1_2_2             |     5 | NORM   | SCE, CPE      |          | CPE      |
592 MODE_1_2_2_1           |     6 | NORM   | SCE, CPE      |          | CPE      |
593 LFE MODE_1_2_2_2_1         |     7 | NORM   | SCE, CPE, CPE |          | CPE
594 | LFE MODE_6_1               |    11 | NORM   | SCE, CPE      |          | CPE,
595 SCE | LFE MODE_7_1_BACK          |    12 | NORM   | SCE, CPE      |          |
596 CPE, CPE | LFE
597 -----------------------+-------+--------+---------------+----------+----------+---------
598 MODE_7_1_TOP_FRONT     |    14 | NORM   | SCE, CPE      |          | CPE      |
599 LFE |       | TOP    | CPE           |          |          |
600 -----------------------+-------+--------+---------------+----------+----------+---------
601 MODE_7_1_REAR_SURROUND |     0 | NORM   | SCE, CPE      |          | CPE, CPE |
602 LFE MODE_7_1_FRONT_CENTER  |     0 | NORM   | SCE, CPE, CPE |          | CPE
603 | LFE
604 ----------------------------------------------------------------------------------------
605 - NORM: Normal Height Layer.     - TOP: Top Height Layer.  - BTM: Bottom Height
606 Layer.
607 - SCE: Single Channel Element.   - CPE: Channel Pair.      - LFE: Low Frequency
608 Element. \endverbatim
609 
610 The Table describes all fixed Channel Elements for each Channel Mode which are
611 assigned to a speaker arrangement. The arrangement includes front, side, back
612 and lfe Audio Channel Elements in the normal height layer, possibly followed by
613 front, side, and back elements in the top and bottom layer (Channel
614 Configuration 14). \n This mapping of Audio Channel Elements is defined in MPEG
615 standard for Channel Config 1-7 and 11,12,14.\n In case of Channel Config 0 or
616 writing matrix mixdown coefficients, the encoder enables the writing of Program
617 Config Element itself as described in \ref encPCE. The configuration used in
618 Program Config Element refers to the denoted Table.\n Beside the Channel Element
619 assignment the Channel Modes are resposible for audio input data channel
620 mapping. The Channel Mapping of the audio data depends on the selected
621 ::AACENC_CHANNELORDER which can be MPEG or WAV like order.\n Following table
622 describes the complete channel mapping for both Channel Order configurations.
623 \verbatim
624 ---------------------------------------------------------------------------------------
625 ChannelMode            |  MPEG-Channelorder            |  WAV-Channelorder
626 -----------------------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---
627 MODE_1                 | 0 |   |   |   |   |   |   |   | 0 |   |   |   |   |   |
628 | MODE_2                 | 0 | 1 |   |   |   |   |   |   | 0 | 1 |   |   |   |
629 |   | MODE_1_2               | 0 | 1 | 2 |   |   |   |   |   | 2 | 0 | 1 |   |
630 |   |   | MODE_1_2_1             | 0 | 1 | 2 | 3 |   |   |   |   | 2 | 0 | 1 | 3
631 |   |   |   | MODE_1_2_2             | 0 | 1 | 2 | 3 | 4 |   |   |   | 2 | 0 | 1
632 | 3 | 4 |   |   | MODE_1_2_2_1           | 0 | 1 | 2 | 3 | 4 | 5 |   |   | 2 | 0
633 | 1 | 4 | 5 | 3 |   | MODE_1_2_2_2_1         | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2
634 | 6 | 7 | 0 | 1 | 4 | 5 | 3 MODE_6_1               | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
635 | 2 | 0 | 1 | 4 | 5 | 6 | 3 | MODE_7_1_BACK          | 0 | 1 | 2 | 3 | 4 | 5 | 6
636 | 7 | 2 | 0 | 1 | 6 | 7 | 4 | 5 | 3 MODE_7_1_TOP_FRONT     | 0 | 1 | 2 | 3 | 4 |
637 5 | 6 | 7 | 2 | 0 | 1 | 4 | 5 | 3 | 6 | 7
638 -----------------------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---
639 MODE_7_1_REAR_SURROUND | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2 | 0 | 1 | 6 | 7 | 4 |
640 5 | 3 MODE_7_1_FRONT_CENTER  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2 | 6 | 7 | 0 | 1
641 | 4 | 5 | 3
642 ---------------------------------------------------------------------------------------
643 \endverbatim
644 
645 The denoted mapping is important for correct audio channel assignment when using
646 MPEG or WAV ordering. The incoming audio channels are distributed MPEG like
647 starting at the front channels and ending at the back channels. The distribution
648 is used as described in Table concering Channel Config and fix channel elements.
649 Please see the following example for clarification.
650 
651 \verbatim
652 Example: MODE_1_2_2_1 - WAV-Channelorder 5.1
653 ------------------------------------------
654  Input Channel      | Coder Channel
655 --------------------+---------------------
656  2 (front center)   | 0 (SCE channel)
657  0 (left center)    | 1 (1st of 1st CPE)
658  1 (right center)   | 2 (2nd of 1st CPE)
659  4 (left surround)  | 3 (1st of 2nd CPE)
660  5 (right surround) | 4 (2nd of 2nd CPE)
661  3 (LFE)            | 5 (LFE)
662 ------------------------------------------
663 \endverbatim
664 
665 
666 \section suppBitrates Supported Bitrates
667 
668 The FDK AAC Encoder provides a wide range of supported bitrates.
669 The minimum and maximum allowed bitrate depends on the Audio Object Type. For
670 AAC-LC the minimum bitrate is the bitrate that is required to write the most
671 basic and minimal valid bitstream. It consists of the bitstream format header
672 information and other static/mandatory information within the AAC payload. The
673 maximum AAC framesize allowed by the MPEG-4 standard determines the maximum
674 allowed bitrate for AAC-LC. For HE-AAC and HE-AAC v2 a library internal look-up
675 table is used.
676 
677 A good working point in terms of audio quality, sampling rate and bitrate, is at
678 1 to 1.5 bits/audio sample for AAC-LC, 0.625 bits/audio sample for dualrate
679 HE-AAC, 1.125 bits/audio sample for downsampled HE-AAC and 0.5 bits/audio sample
680 for HE-AAC v2. For example for one channel with a sampling frequency of 48 kHz,
681 the range from 48 kbit/s to 72 kbit/s achieves reasonable audio quality for
682 AAC-LC.
683 
684 For HE-AAC and HE-AAC v2 the lowest possible audio input sampling frequency is
685 16 kHz because then the AAC-LC core encoder operates in dual rate mode at its
686 lowest possible sampling frequency, which is 8 kHz. HE-AAC v2 requires stereo
687 input audio data.
688 
689 Please note that in HE-AAC or HE-AAC v2 mode the encoder supports much higher
690 bitrates than are appropriate for HE-AAC or HE-AAC v2. For example, at a bitrate
691 of more than 64 kbit/s for a stereo audio signal at 44.1 kHz it usually makes
692 sense to use AAC-LC, which will produce better audio quality at that bitrate
693 than HE-AAC or HE-AAC v2.
694 
695 \section reommendedConfig Recommended Sampling Rate and Bitrate Combinations
696 
697 The following table provides an overview of recommended encoder configuration
698 parameters which we determined by virtue of numerous listening tests.
699 
700 \subsection reommendedConfigLC AAC-LC, HE-AAC, HE-AACv2 in Dualrate SBR mode.
701 \verbatim
702 -----------------------------------------------------------------------------------
703 Audio Object Type  |  Bit Rate Range  |            Supported  | Preferred  | No.
704 of |         [bit/s]  |       Sampling Rates  |    Sampl.  |  Chan. |
705 |                [kHz]  |      Rate  | |                  |
706 |     [kHz]  |
707 -------------------+------------------+-----------------------+------------+-------
708 AAC LC + SBR + PS  |   8000 -  11999  |         22.05, 24.00  |     24.00  | 2
709 AAC LC + SBR + PS  |  12000 -  17999  |                32.00  |     32.00  | 2
710 AAC LC + SBR + PS  |  18000 -  39999  |  32.00, 44.10, 48.00  |     44.10  | 2
711 AAC LC + SBR + PS  |  40000 -  64000  |  32.00, 44.10, 48.00  |     48.00  | 2
712 -------------------+------------------+-----------------------+------------+-------
713 AAC LC + SBR       |   8000 -  11999  |         22.05, 24.00  |     24.00  | 1
714 AAC LC + SBR       |  12000 -  17999  |                32.00  |     32.00  | 1
715 AAC LC + SBR       |  18000 -  39999  |  32.00, 44.10, 48.00  |     44.10  | 1
716 AAC LC + SBR       |  40000 -  64000  |  32.00, 44.10, 48.00  |     48.00  | 1
717 -------------------+------------------+-----------------------+------------+-------
718 AAC LC + SBR       |  16000 -  27999  |  32.00, 44.10, 48.00  |     32.00  | 2
719 AAC LC + SBR       |  28000 -  63999  |  32.00, 44.10, 48.00  |     44.10  | 2
720 AAC LC + SBR       |  64000 - 128000  |  32.00, 44.10, 48.00  |     48.00  | 2
721 -------------------+------------------+-----------------------+------------+-------
722 AAC LC + SBR       |  64000 -  69999  |  32.00, 44.10, 48.00  |     32.00  |
723 5, 5.1 AAC LC + SBR       |  70000 - 239999  |  32.00, 44.10, 48.00  |     44.10
724 | 5, 5.1 AAC LC + SBR       | 240000 - 319999  |  32.00, 44.10, 48.00  |
725 48.00  | 5, 5.1
726 -------------------+------------------+-----------------------+------------+-------
727 AAC LC             |   8000 -  15999  | 11.025, 12.00, 16.00  |     12.00  | 1
728 AAC LC             |  16000 -  23999  |                16.00  |     16.00  | 1
729 AAC LC             |  24000 -  31999  |  16.00, 22.05, 24.00  |     24.00  | 1
730 AAC LC             |  32000 -  55999  |                32.00  |     32.00  | 1
731 AAC LC             |  56000 - 160000  |  32.00, 44.10, 48.00  |     44.10  | 1
732 AAC LC             | 160001 - 288000  |                48.00  |     48.00  | 1
733 -------------------+------------------+-----------------------+------------+-------
734 AAC LC             |  16000 -  23999  | 11.025, 12.00, 16.00  |     12.00  | 2
735 AAC LC             |  24000 -  31999  |                16.00  |     16.00  | 2
736 AAC LC             |  32000 -  39999  |  16.00, 22.05, 24.00  |     22.05  | 2
737 AAC LC             |  40000 -  95999  |                32.00  |     32.00  | 2
738 AAC LC             |  96000 - 111999  |  32.00, 44.10, 48.00  |     32.00  | 2
739 AAC LC             | 112000 - 320001  |  32.00, 44.10, 48.00  |     44.10  | 2
740 AAC LC             | 320002 - 576000  |                48.00  |     48.00  | 2
741 -------------------+------------------+-----------------------+------------+-------
742 AAC LC             | 160000 - 239999  |                32.00  |     32.00  |
743 5, 5.1 AAC LC             | 240000 - 279999  |  32.00, 44.10, 48.00  |     32.00
744 | 5, 5.1 AAC LC             | 280000 - 800000  |  32.00, 44.10, 48.00  |
745 44.10  | 5, 5.1
746 -----------------------------------------------------------------------------------
747 \endverbatim \n
748 
749 \subsection reommendedConfigLD AAC-LD, AAC-ELD, AAC-ELD with SBR in Dualrate SBR
750 mode. Unlike to HE-AAC configuration the SBR is not covered by ELD audio object
751 type and needs to be enabled explicitly. Use ::AACENC_SBR_MODE to configure SBR
752 and its samplingrate ratio with ::AACENC_SBR_RATIO parameter. \verbatim
753 -----------------------------------------------------------------------------------
754 Audio Object Type  |  Bit Rate Range  |            Supported  | Preferred  | No.
755 of |         [bit/s]  |       Sampling Rates  |    Sampl.  |  Chan. |
756 |                [kHz]  |      Rate  | |                  |
757 |     [kHz]  |
758 -------------------+------------------+-----------------------+------------+-------
759 ELD + SBR          |  18000 -  24999  |        32.00 - 44.10  |     32.00  | 1
760 ELD + SBR          |  25000 -  31999  |        32.00 - 48.00  |     32.00  | 1
761 ELD + SBR          |  32000 -  64000  |        32.00 - 48.00  |     48.00  | 1
762 -------------------+------------------+-----------------------+------------+-------
763 ELD + SBR          |  32000 -  51999  |        32.00 - 48.00  |     44.10  | 2
764 ELD + SBR          |  52000 - 128000  |        32.00 - 48.00  |     48.00  | 2
765 -------------------+------------------+-----------------------+------------+-------
766 ELD + SBR          |  78000 - 160000  |        32.00 - 48.00  |     48.00  | 3
767 -------------------+------------------+-----------------------+------------+-------
768 ELD + SBR          | 104000 - 212000  |        32.00 - 48.00  |     48.00  | 4
769 -------------------+------------------+-----------------------+------------+-------
770 ELD + SBR          | 130000 - 246000  |        32.00 - 48.00  |     48.00  |
771 5, 5.1
772 -------------------+------------------+-----------------------+------------+-------
773 LD, ELD            |  16000 -  19999  |        16.00 - 24.00  |     16.00  | 1
774 LD, ELD            |  20000 -  39999  |        16.00 - 32.00  |     24.00  | 1
775 LD, ELD            |  40000 -  49999  |        22.05 - 32.00  |     32.00  | 1
776 LD, ELD            |  50000 -  61999  |        24.00 - 44.10  |     32.00  | 1
777 LD, ELD            |  62000 -  84999  |        32.00 - 48.00  |     44.10  | 1
778 LD, ELD            |  85000 - 192000  |        44.10 - 48.00  |     48.00  | 1
779 -------------------+------------------+-----------------------+------------+-------
780 LD, ELD            |  64000 -  75999  |        24.00 - 32.00  |     32.00  | 2
781 LD, ELD            |  76000 -  97999  |        24.00 - 44.10  |     32.00  | 2
782 LD, ELD            |  98000 - 135999  |        32.00 - 48.00  |     44.10  | 2
783 LD, ELD            | 136000 - 384000  |        44.10 - 48.00  |     48.00  | 2
784 -------------------+------------------+-----------------------+------------+-------
785 LD, ELD            |  96000 - 113999  |        24.00 - 32.00  |     32.00  | 3
786 LD, ELD            | 114000 - 146999  |        24.00 - 44.10  |     32.00  | 3
787 LD, ELD            | 147000 - 203999  |        32.00 - 48.00  |     44.10  | 3
788 LD, ELD            | 204000 - 576000  |        44.10 - 48.00  |     48.00  | 3
789 -------------------+------------------+-----------------------+------------+-------
790 LD, ELD            | 128000 - 151999  |        24.00 - 32.00  |     32.00  | 4
791 LD, ELD            | 152000 - 195999  |        24.00 - 44.10  |     32.00  | 4
792 LD, ELD            | 196000 - 271999  |        32.00 - 48.00  |     44.10  | 4
793 LD, ELD            | 272000 - 768000  |        44.10 - 48.00  |     48.00  | 4
794 -------------------+------------------+-----------------------+------------+-------
795 LD, ELD            | 160000 - 189999  |        24.00 - 32.00  |     32.00  |
796 5, 5.1 LD, ELD            | 190000 - 244999  |        24.00 - 44.10  |     32.00
797 | 5, 5.1 LD, ELD            | 245000 - 339999  |        32.00 - 48.00  |
798 44.10  | 5, 5.1 LD, ELD            | 340000 - 960000  |        44.10 - 48.00  |
799 48.00  | 5, 5.1
800 -----------------------------------------------------------------------------------
801 \endverbatim \n
802 
803 \subsection reommendedConfigELD AAC-ELD with SBR in Downsampled SBR mode.
804 \verbatim
805 -----------------------------------------------------------------------------------
806 Audio Object Type  |  Bit Rate Range  |            Supported  | Preferred  | No.
807 of |         [bit/s]  |       Sampling Rates  |    Sampl.  |  Chan. |
808 |                [kHz]  |      Rate  | |                  |
809 |     [kHz]  |
810 -------------------+------------------+-----------------------+------------+-------
811 ELD + SBR          |  18000 - 24999   |        16.00 - 22.05  |     22.05  | 1
812 (downsampled SBR)  |  25000 - 31999   |        16.00 - 24.00  |     24.00  | 1
813                    |  32000 - 47999   |        22.05 - 32.00  |     32.00  | 1
814                    |  48000 - 64000   |        22.05 - 48.00  |     32.00  | 1
815 -------------------+------------------+-----------------------+------------+-------
816 ELD + SBR          |  32000 - 51999   |        16.00 - 24.00  |     24.00  | 2
817 (downsampled SBR)  |  52000 - 59999   |        22.05 - 24.00  |     24.00  | 2
818                    |  60000 - 95999   |        22.05 - 32.00  |     32.00  | 2
819                    |  96000 - 128000  |        22.05 - 48.00  |     32.00  | 2
820 -------------------+------------------+-----------------------+------------+-------
821 ELD + SBR          |  78000 -  99999  |        22.05 - 24.00  |     24.00  | 3
822 (downsampled SBR)  | 100000 - 143999  |        22.05 - 32.00  |     32.00  | 3
823                    | 144000 - 159999  |        22.05 - 48.00  |     32.00  | 3
824                    | 160000 - 192000  |        32.00 - 48.00  |     32.00  | 3
825 -------------------+------------------+-----------------------+------------+-------
826 ELD + SBR          | 104000 - 149999  |        22.05 - 24.00  |     24.00  | 4
827 (downsampled SBR)  | 150000 - 191999  |        22.05 - 32.00  |     32.00  | 4
828                    | 192000 - 211999  |        22.05 - 48.00  |     32.00  | 4
829                    | 212000 - 256000  |        32.00 - 48.00  |     32.00  | 4
830 -------------------+------------------+-----------------------+------------+-------
831 ELD + SBR          | 130000 - 171999  |        22.05 - 24.00  |     24.00  |
832 5, 5.1 (downsampled SBR)  | 172000 - 239999  |        22.05 - 32.00  |     32.00
833 | 5, 5.1 | 240000 - 320000  |        32.00 - 48.00  |     32.00  | 5, 5.1
834 -----------------------------------------------------------------------------------
835 \endverbatim \n
836 
837 \subsection reommendedConfigELDv2 AAC-ELD v2, AAC-ELD v2 with SBR.
838 The ELD v2 212 configuration must be configured explicitly with
839 ::AACENC_CHANNELMODE parameter according MODE_212 value. SBR can be configured
840 separately through ::AACENC_SBR_MODE and ::AACENC_SBR_RATIO parameter. Following
841 configurations shall apply to both framelengths 480 and 512. For ELD v2
842 configuration without SBR and framelength 480 the supported sampling rate is
843 restricted to the range from 16 kHz up to 24 kHz. \verbatim
844 -----------------------------------------------------------------------------------
845 Audio Object Type  |  Bit Rate Range  |            Supported  | Preferred  | No.
846 of |         [bit/s]  |       Sampling Rates  |    Sampl.  |  Chan. |
847 |                [kHz]  |      Rate  | |                  |
848 |     [kHz]  |
849 -------------------+------------------+-----------------------+------------+-------
850 ELD-212            |  16000 -  19999  |        16.00 - 24.00  |     16.00  | 2
851 (without SBR)      |  20000 -  39999  |        16.00 - 32.00  |     24.00  | 2
852                    |  40000 -  49999  |        22.05 - 32.00  |     32.00  | 2
853                    |  50000 -  61999  |        24.00 - 44.10  |     32.00  | 2
854                    |  62000 -  84999  |        32.00 - 48.00  |     44.10  | 2
855                    |  85000 - 192000  |        44.10 - 48.00  |     48.00  | 2
856 -------------------+------------------+-----------------------+------------+-------
857 ELD-212 + SBR      |  18000 -  20999  |                32.00  |     32.00  | 2
858 (dualrate SBR)     |  21000 -  25999  |        32.00 - 44.10  |     32.00  | 2
859                    |  26000 -  31999  |        32.00 - 48.00  |     44.10  | 2
860                    |  32000 -  64000  |        32.00 - 48.00  |     48.00  | 2
861 -------------------+------------------+-----------------------+------------+-------
862 ELD-212 + SBR      |  18000 -  19999  |        16.00 - 22.05  |     22.05  | 2
863 (downsampled SBR)  |  20000 -  24999  |        16.00 - 24.00  |     22.05  | 2
864                    |  25000 -  31999  |        16.00 - 24.00  |     24.00  | 2
865                    |  32000 -  64000  |        24.00 - 24.00  |     24.00  | 2
866 -------------------+------------------+-----------------------+------------+-------
867 \endverbatim \n
868 
869 \page ENCODERBEHAVIOUR Encoder Behaviour
870 
871 \section BEHAVIOUR_BANDWIDTH Bandwidth
872 
873 The FDK AAC encoder usually does not use the full frequency range of the input
874 signal, but restricts the bandwidth according to certain library-internal
875 settings. They can be changed in the table "bandWidthTable" in the file
876 bandwidth.cpp (if available).
877 
878 The encoder API provides the ::AACENC_BANDWIDTH parameter to adjust the
879 bandwidth explicitly. \code aacEncoder_SetParam(hAacEncoder, AACENC_BANDWIDTH,
880 value); \endcode
881 
882 However it is not recommended to change these settings, because they are based
883 on numerous listening tests and careful tweaks to ensure the best overall
884 encoding quality. Also, the maximum bandwidth that can be set manually by the
885 user is 20kHz or fs/2, whichever value is smaller.
886 
887 Theoretically a signal of for example 48 kHz can contain frequencies up to 24
888 kHz, but to use this full range in an audio encoder usually does not make sense.
889 Usually the encoder has a very limited amount of bits to spend (typically 128
890 kbit/s for stereo 48 kHz content) and to allow full range bandwidth would waste
891 a lot of these bits for frequencies the human ear is hardly able to perceive
892 anyway, if at all. Hence it is wise to use the available bits for the really
893 important frequency range and just skip the rest. At lower bitrates (e. g. <= 80
894 kbit/s for stereo 48 kHz content) the encoder will choose an even smaller
895 bandwidth, because an encoded signal with smaller bandwidth and hence less
896 artifacts sounds better than a signal with higher bandwidth but then more coding
897 artefacts across all frequencies. These artefacts would occur if small bitrates
898 and high bandwidths are chosen because the available bits are just not enough to
899 encode all frequencies well.
900 
901 Unfortunately some people evaluate encoding quality based on possible bandwidth
902 as well, but it is a double-edged sword considering the trade-off described
903 above.
904 
905 Another aspect is workload consumption. The higher the allowed bandwidth, the
906 more frequency lines have to be processed, which in turn increases the workload.
907 
908 \section FRAMESIZES_AND_BIT_RESERVOIR Frame Sizes & Bit Reservoir
909 
910 For AAC there is a difference between constant bit rate and constant frame
911 length due to the so-called bit reservoir technique, which allows the encoder to
912 use less bits in an AAC frame for those audio signal sections which are easy to
913 encode, and then spend them at a later point in time for more complex audio
914 sections. The extent to which this "bit exchange" is done is limited to allow
915 for reliable and relatively low delay real time streaming. Therefore, for
916 AAC-ELD, the bitreservoir is limited. It varies between 500 and 4000 bits/frame,
917 depending on the bitrate/channel.
918 - For a bitrate of 12kbps/channel and below, the AAC-ELD bitreservoir is 500
919 bits/frame.
920 - For a bitrate of 70kbps/channel and above, the AAC-ELD bitreservoir is 4000
921 bits/frame.
922 - Between 12kbps/channel and 70kbps/channel, the AAC-ELD bitrervoir is increased
923 linearly.
924 - For AAC-LC, the bitrate is only limited by the maximum AAC frame length. It
925 is, regardless of the available bit reservoir, defined as 6144 bits per channel.
926 
927 Over a longer period in time the bitrate will be constant in the AAC constant
928 bitrate mode, e.g. for ISDN transmission. This means that in AAC each bitstream
929 frame will in general have a different length in bytes but over time it
930 will reach the target bitrate.
931 
932 
933 One could also make an MPEG compliant
934 AAC encoder which always produces constant length packages for each AAC frame,
935 but the audio quality would be considerably worse since the bit reservoir
936 technique would have to be switched off completely. A higher bit rate would have
937 to be used to get the same audio quality as with an enabled bit reservoir.
938 
939 For mp3 by the way, the same bit reservoir technique exists, but there each bit
940 stream frame has a constant length for a given bit rate (ignoring the
941 padding byte). In mp3 there is a so-called "back pointer" which tells
942 the decoder which bits belong to the current mp3 frame - and in general some or
943 many bits have been transmitted in an earlier mp3 frame. Basically this leads to
944 the same "bit exchange between mp3 frames" as in AAC but with virtually constant
945 length frames.
946 
947 This variable frame length at "constant bit rate" is not something special
948 in this Fraunhofer IIS AAC encoder. AAC has been designed in that way.
949 
950 \subsection BEHAVIOUR_ESTIM_AVG_FRAMESIZES Estimating Average Frame Sizes
951 
952 A HE-AAC v1 or v2 audio frame contains 2048 PCM samples per channel.
953 
954 The number of HE-AAC frames \f$N\_FRAMES\f$ per second at 44.1 kHz is:
955 
956 \f[
957 N\_FRAMES = 44100 / 2048 = 21.5332
958 \f]
959 
960 At a bit rate of 8 kbps the average number of bits per frame
961 \f$N\_BITS\_PER\_FRAME\f$ is:
962 
963 \f[
964 N\_BITS\_PER\_FRAME = 8000 / 21.5332 = 371.52
965 \f]
966 
967 which is about 46.44 bytes per encoded frame.
968 
969 At a bit rate of 32 kbps, which is quite high for single channel HE-AAC v1, it
970 is:
971 
972 \f[
973 N\_BITS\_PER\_FRAME = 32000 / 21.5332 = 1486
974 \f]
975 
976 which is about 185.76 bytes per encoded frame.
977 
978 These bits/frame figures are average figures where each AAC frame generally has
979 a different size in bytes. To calculate the same for AAC-LC just use 1024
980 instead of 2048 PCM samples per frame and channel. For AAC-LD/ELD it is either
981 480 or 512 PCM samples per frame and channel.
982 
983 
984 \section BEHAVIOUR_TOOLS Encoder Tools
985 
986 The AAC encoder supports TNS, PNS, MS, Intensity and activates these tools
987 depending on the audio signal and the encoder configuration (i.e. bitrate or
988 AOT). It is not required to configure these tools manually.
989 
990 PNS improves encoding quality only for certain bitrates. Therefore it makes
991 sense to activate PNS only for these bitrates and save the processing power
992 required for PNS (about 10 % of the encoder) when using other bitrates. This is
993 done automatically inside the encoder library. PNS is disabled inside the
994 encoder library if an MPEG-2 AOT is choosen since PNS is an MPEG-4 AAC feature.
995 
996 If SBR is activated, the encoder automatically deactivates PNS internally. If
997 TNS is disabled but PNS is allowed, the encoder deactivates PNS calculation
998 internally.
999 
1000 */
1001 
1002 #ifndef AACENC_LIB_H
1003 #define AACENC_LIB_H
1004 
1005 #include "machine_type.h"
1006 #include "FDK_audio.h"
1007 
1008 #define AACENCODER_LIB_VL0 4
1009 #define AACENCODER_LIB_VL1 0
1010 #define AACENCODER_LIB_VL2 1
1011 
1012 /**
1013  *  AAC encoder error codes.
1014  */
1015 typedef enum {
1016   AACENC_OK = 0x0000, /*!< No error happened. All fine. */
1017 
1018   AACENC_INVALID_HANDLE =
1019       0x0020, /*!< Handle passed to function call was invalid. */
1020   AACENC_MEMORY_ERROR = 0x0021,          /*!< Memory allocation failed. */
1021   AACENC_UNSUPPORTED_PARAMETER = 0x0022, /*!< Parameter not available. */
1022   AACENC_INVALID_CONFIG = 0x0023,        /*!< Configuration not provided. */
1023 
1024   AACENC_INIT_ERROR = 0x0040,     /*!< General initialization error. */
1025   AACENC_INIT_AAC_ERROR = 0x0041, /*!< AAC library initialization error. */
1026   AACENC_INIT_SBR_ERROR = 0x0042, /*!< SBR library initialization error. */
1027   AACENC_INIT_TP_ERROR = 0x0043, /*!< Transport library initialization error. */
1028   AACENC_INIT_META_ERROR =
1029       0x0044, /*!< Meta data library initialization error. */
1030   AACENC_INIT_MPS_ERROR = 0x0045, /*!< MPS library initialization error. */
1031 
1032   AACENC_ENCODE_ERROR = 0x0060, /*!< The encoding process was interrupted by an
1033                                    unexpected error. */
1034 
1035   AACENC_ENCODE_EOF = 0x0080 /*!< End of file reached. */
1036 
1037 } AACENC_ERROR;
1038 
1039 /**
1040  *  AAC encoder buffer descriptors identifier.
1041  *  This identifier are used within buffer descriptors
1042  * AACENC_BufDesc::bufferIdentifiers.
1043  */
1044 typedef enum {
1045   /* Input buffer identifier. */
1046   IN_AUDIO_DATA = 0,    /*!< Audio input buffer, interleaved INT_PCM samples. */
1047   IN_ANCILLRY_DATA = 1, /*!< Ancillary data to be embedded into bitstream. */
1048   IN_METADATA_SETUP = 2, /*!< Setup structure for embedding meta data. */
1049 
1050   /* Output buffer identifier. */
1051   OUT_BITSTREAM_DATA = 3, /*!< Buffer holds bitstream output data. */
1052   OUT_AU_SIZES =
1053       4 /*!< Buffer contains sizes of each access unit. This information
1054              is necessary for superframing. */
1055 
1056 } AACENC_BufferIdentifier;
1057 
1058 /**
1059  *  AAC encoder handle.
1060  */
1061 typedef struct AACENCODER *HANDLE_AACENCODER;
1062 
1063 /**
1064  *  Provides some info about the encoder configuration.
1065  */
1066 typedef struct {
1067   UINT maxOutBufBytes; /*!< Maximum number of encoder bitstream bytes within one
1068                           frame. Size depends on maximum number of supported
1069                           channels in encoder instance. */
1070 
1071   UINT maxAncBytes; /*!< Maximum number of ancillary data bytes which can be
1072                        inserted into bitstream within one frame. */
1073 
1074   UINT inBufFillLevel; /*!< Internal input buffer fill level in samples per
1075                           channel. This parameter will automatically be cleared
1076                           if samplingrate or channel(Mode/Order) changes. */
1077 
1078   UINT inputChannels; /*!< Number of input channels expected in encoding
1079                          process. */
1080 
1081   UINT frameLength; /*!< Amount of input audio samples consumed each frame per
1082                        channel, depending on audio object type configuration. */
1083 
1084   UINT nDelay; /*!< Codec delay in PCM samples/channel. Depends on framelength
1085                   and AOT. Does not include framing delay for filling up encoder
1086                   PCM input buffer. */
1087 
1088   UINT nDelayCore; /*!< Codec delay in PCM samples/channel, w/o delay caused by
1089                       the decoder SBR module. This delay is needed to correctly
1090                       write edit lists for gapless playback. The decoder may not
1091                       know how much delay is introdcued by SBR, since it may not
1092                       know if SBR is active at all (implicit signaling),
1093                       therefore the decoder must take into account any delay
1094                       caused by the SBR module. */
1095 
1096   UCHAR confBuf[64]; /*!< Configuration buffer in binary format as an
1097                         AudioSpecificConfig or StreamMuxConfig according to the
1098                         selected transport type. */
1099 
1100   UINT confSize; /*!< Number of valid bytes in confBuf. */
1101 
1102 } AACENC_InfoStruct;
1103 
1104 /**
1105  *  Describes the input and output buffers for an aacEncEncode() call.
1106  */
1107 typedef struct {
1108   INT numBufs;            /*!< Number of buffers. */
1109   void **bufs;            /*!< Pointer to vector containing buffer addresses. */
1110   INT *bufferIdentifiers; /*!< Identifier of each buffer element. See
1111                              ::AACENC_BufferIdentifier. */
1112   INT *bufSizes;          /*!< Size of each buffer in 8-bit bytes. */
1113   INT *bufElSizes;        /*!< Size of each buffer element in bytes. */
1114 
1115 } AACENC_BufDesc;
1116 
1117 /**
1118  *  Defines the input arguments for an aacEncEncode() call.
1119  */
1120 typedef struct {
1121   INT numInSamples; /*!< Number of valid input audio samples (multiple of input
1122                        channels). */
1123   INT numAncBytes;  /*!< Number of ancillary data bytes to be encoded. */
1124 
1125 } AACENC_InArgs;
1126 
1127 /**
1128  *  Defines the output arguments for an aacEncEncode() call.
1129  */
1130 typedef struct {
1131   INT numOutBytes;  /*!< Number of valid bitstream bytes generated during
1132                        aacEncEncode(). */
1133   INT numInSamples; /*!< Number of input audio samples consumed by the encoder.
1134                      */
1135   INT numAncBytes;  /*!< Number of ancillary data bytes consumed by the encoder.
1136                      */
1137   INT bitResState;  /*!< State of the bit reservoir in bits. */
1138 
1139 } AACENC_OutArgs;
1140 
1141 /**
1142  *  Meta Data Compression Profiles.
1143  */
1144 typedef enum {
1145   AACENC_METADATA_DRC_NONE = 0,          /*!< None. */
1146   AACENC_METADATA_DRC_FILMSTANDARD = 1,  /*!< Film standard. */
1147   AACENC_METADATA_DRC_FILMLIGHT = 2,     /*!< Film light. */
1148   AACENC_METADATA_DRC_MUSICSTANDARD = 3, /*!< Music standard. */
1149   AACENC_METADATA_DRC_MUSICLIGHT = 4,    /*!< Music light. */
1150   AACENC_METADATA_DRC_SPEECH = 5,        /*!< Speech. */
1151   AACENC_METADATA_DRC_NOT_PRESENT =
1152       256 /*!< Disable writing gain factor (used for comp_profile only). */
1153 
1154 } AACENC_METADATA_DRC_PROFILE;
1155 
1156 /**
1157  *  Meta Data setup structure.
1158  */
1159 typedef struct {
1160   AACENC_METADATA_DRC_PROFILE
1161   drc_profile; /*!< MPEG DRC compression profile. See
1162                   ::AACENC_METADATA_DRC_PROFILE. */
1163   AACENC_METADATA_DRC_PROFILE
1164   comp_profile; /*!< ETSI heavy compression profile. See
1165                    ::AACENC_METADATA_DRC_PROFILE. */
1166 
1167   INT drc_TargetRefLevel;  /*!< Used to define expected level to:
1168                                 Scaled with 16 bit. x*2^16. */
1169   INT comp_TargetRefLevel; /*!< Adjust limiter to avoid overload.
1170                                 Scaled with 16 bit. x*2^16. */
1171 
1172   INT prog_ref_level_present; /*!< Flag, if prog_ref_level is present */
1173   INT prog_ref_level;         /*!< Programme Reference Level = Dialogue Level:
1174                                    -31.75dB .. 0 dB ; stepsize: 0.25dB
1175                                    Scaled with 16 bit. x*2^16.*/
1176 
1177   UCHAR PCE_mixdown_idx_present; /*!< Flag, if dmx-idx should be written in
1178                                     programme config element */
1179   UCHAR ETSI_DmxLvl_present;     /*!< Flag, if dmx-lvl should be written in
1180                                     ETSI-ancData */
1181 
1182   SCHAR centerMixLevel; /*!< Center downmix level (0...7, according to table) */
1183   SCHAR surroundMixLevel; /*!< Surround downmix level (0...7, according to
1184                              table) */
1185 
1186   UCHAR
1187   dolbySurroundMode; /*!< Indication for Dolby Surround Encoding Mode.
1188                           - 0: Dolby Surround mode not indicated
1189                           - 1: 2-ch audio part is not Dolby surround encoded
1190                           - 2: 2-ch audio part is Dolby surround encoded */
1191 
1192   UCHAR drcPresentationMode; /*!< Indicatin for DRC Presentation Mode.
1193                                   - 0: Presentation mode not inticated
1194                                   - 1: Presentation mode 1
1195                                   - 2: Presentation mode 2 */
1196 
1197   struct {
1198     /* extended ancillary data */
1199     UCHAR extAncDataEnable; /*< Indicates if MPEG4_ext_ancillary_data() exists.
1200                                 - 0: No MPEG4_ext_ancillary_data().
1201                                 - 1: Insert MPEG4_ext_ancillary_data(). */
1202 
1203     UCHAR
1204     extDownmixLevelEnable;   /*< Indicates if ext_downmixing_levels() exists.
1205                                  - 0: No ext_downmixing_levels().
1206                                  - 1: Insert ext_downmixing_levels(). */
1207     UCHAR extDownmixLevel_A; /*< Downmix level index A (0...7, according to
1208                                 table) */
1209     UCHAR extDownmixLevel_B; /*< Downmix level index B (0...7, according to
1210                                 table) */
1211 
1212     UCHAR dmxGainEnable; /*< Indicates if ext_downmixing_global_gains() exists.
1213                              - 0: No ext_downmixing_global_gains().
1214                              - 1: Insert ext_downmixing_global_gains(). */
1215     INT dmxGain5;        /*< Gain factor for downmix to 5 channels.
1216                               -15.75dB .. -15.75dB; stepsize: 0.25dB
1217                               Scaled with 16 bit. x*2^16.*/
1218     INT dmxGain2;        /*< Gain factor for downmix to 2 channels.
1219                               -15.75dB .. -15.75dB; stepsize: 0.25dB
1220                               Scaled with 16 bit. x*2^16.*/
1221 
1222     UCHAR lfeDmxEnable; /*< Indicates if ext_downmixing_lfe_level() exists.
1223                             - 0: No ext_downmixing_lfe_level().
1224                             - 1: Insert ext_downmixing_lfe_level(). */
1225     UCHAR lfeDmxLevel;  /*< Downmix level index for LFE (0..15, according to
1226                            table) */
1227 
1228   } ExtMetaData;
1229 
1230 } AACENC_MetaData;
1231 
1232 /**
1233  * AAC encoder control flags.
1234  *
1235  * In interaction with the ::AACENC_CONTROL_STATE parameter it is possible to
1236  * get information about the internal initialization process. It is also
1237  * possible to overwrite the internal state from extern when necessary.
1238  */
1239 typedef enum {
1240   AACENC_INIT_NONE = 0x0000, /*!< Do not trigger initialization. */
1241   AACENC_INIT_CONFIG =
1242       0x0001, /*!< Initialize all encoder modules configuration. */
1243   AACENC_INIT_STATES = 0x0002, /*!< Reset all encoder modules history buffer. */
1244   AACENC_INIT_TRANSPORT =
1245       0x1000, /*!< Initialize transport lib with new parameters. */
1246   AACENC_RESET_INBUFFER =
1247       0x2000,              /*!< Reset fill level of internal input buffer. */
1248   AACENC_INIT_ALL = 0xFFFF /*!< Initialize all. */
1249 } AACENC_CTRLFLAGS;
1250 
1251 /**
1252  * \brief  AAC encoder setting parameters.
1253  *
1254  * Use aacEncoder_SetParam() function to configure, or use aacEncoder_GetParam()
1255  * function to read the internal status of the following parameters.
1256  */
1257 typedef enum {
1258   AACENC_AOT =
1259       0x0100, /*!< Audio object type. See ::AUDIO_OBJECT_TYPE in FDK_audio.h.
1260                    - 2: MPEG-4 AAC Low Complexity.
1261                    - 5: MPEG-4 AAC Low Complexity with Spectral Band Replication
1262                  (HE-AAC).
1263                    - 29: MPEG-4 AAC Low Complexity with Spectral Band
1264                  Replication and Parametric Stereo (HE-AAC v2). This
1265                  configuration can be used only with stereo input audio data.
1266                    - 23: MPEG-4 AAC Low-Delay.
1267                    - 39: MPEG-4 AAC Enhanced Low-Delay. Since there is no
1268                  ::AUDIO_OBJECT_TYPE for ELD in combination with SBR defined,
1269                  enable SBR explicitely by ::AACENC_SBR_MODE parameter. The ELD
1270                  v2 212 configuration can be configured by ::AACENC_CHANNELMODE
1271                  parameter.
1272                    - 129: MPEG-2 AAC Low Complexity.
1273                    - 132: MPEG-2 AAC Low Complexity with Spectral Band
1274                  Replication (HE-AAC).
1275 
1276                    Please note that the virtual MPEG-2 AOT's basically disables
1277                  non-existing Perceptual Noise Substitution tool in AAC encoder
1278                  and controls the MPEG_ID flag in adts header. The virtual
1279                  MPEG-2 AOT doesn't prohibit specific transport formats. */
1280 
1281   AACENC_BITRATE = 0x0101, /*!< Total encoder bitrate. This parameter is
1282                               mandatory and interacts with ::AACENC_BITRATEMODE.
1283                                 - CBR: Bitrate in bits/second.
1284                                 - VBR: Variable bitrate. Bitrate argument will
1285                               be ignored. See \ref suppBitrates for details. */
1286 
1287   AACENC_BITRATEMODE = 0x0102, /*!< Bitrate mode. Configuration can be different
1288                                   kind of bitrate configurations:
1289                                     - 0: Constant bitrate, use bitrate according
1290                                   to ::AACENC_BITRATE. (default) Within none
1291                                   LD/ELD ::AUDIO_OBJECT_TYPE, the CBR mode makes
1292                                   use of full allowed bitreservoir. In contrast,
1293                                   at Low-Delay ::AUDIO_OBJECT_TYPE the
1294                                   bitreservoir is kept very small.
1295                                     - 1: Variable bitrate mode, \ref vbrmode
1296                                   "very low bitrate".
1297                                     - 2: Variable bitrate mode, \ref vbrmode
1298                                   "low bitrate".
1299                                     - 3: Variable bitrate mode, \ref vbrmode
1300                                   "medium bitrate".
1301                                     - 4: Variable bitrate mode, \ref vbrmode
1302                                   "high bitrate".
1303                                     - 5: Variable bitrate mode, \ref vbrmode
1304                                   "very high bitrate". */
1305 
1306   AACENC_SAMPLERATE = 0x0103, /*!< Audio input data sampling rate. Encoder
1307                                  supports following sampling rates: 8000, 11025,
1308                                  12000, 16000, 22050, 24000, 32000, 44100,
1309                                  48000, 64000, 88200, 96000 */
1310 
1311   AACENC_SBR_MODE = 0x0104, /*!< Configure SBR independently of the chosen Audio
1312                                Object Type ::AUDIO_OBJECT_TYPE. This parameter
1313                                is for ELD audio object type only.
1314                                  - -1: Use ELD SBR auto configurator (default).
1315                                  - 0: Disable Spectral Band Replication.
1316                                  - 1: Enable Spectral Band Replication. */
1317 
1318   AACENC_GRANULE_LENGTH =
1319       0x0105, /*!< Core encoder (AAC) audio frame length in samples:
1320                    - 1024: Default configuration.
1321                    - 512: Default length in LD/ELD configuration.
1322                    - 480: Length in LD/ELD configuration.
1323                    - 256: Length for ELD reduced delay mode (x2).
1324                    - 240: Length for ELD reduced delay mode (x2).
1325                    - 128: Length for ELD reduced delay mode (x4).
1326                    - 120: Length for ELD reduced delay mode (x4). */
1327 
1328   AACENC_CHANNELMODE = 0x0106, /*!< Set explicit channel mode. Channel mode must
1329                                   match with number of input channels.
1330                                     - 1-7, 11,12,14 and 33,34: MPEG channel
1331                                   modes supported, see ::CHANNEL_MODE in
1332                                   FDK_audio.h. */
1333 
1334   AACENC_CHANNELORDER =
1335       0x0107, /*!< Input audio data channel ordering scheme:
1336                    - 0: MPEG channel ordering (e. g. 5.1: C, L, R, SL, SR, LFE).
1337                  (default)
1338                    - 1: WAVE file format channel ordering (e. g. 5.1: L, R, C,
1339                  LFE, SL, SR). */
1340 
1341   AACENC_SBR_RATIO =
1342       0x0108, /*!<  Controls activation of downsampled SBR. With downsampled
1343                  SBR, the delay will be shorter. On the other hand, for
1344                  achieving the same quality level, downsampled SBR needs more
1345                  bits than dual-rate SBR. With downsampled SBR, the AAC encoder
1346                  will work at the same sampling rate as the SBR encoder (single
1347                  rate). Downsampled SBR is supported for AAC-ELD and HE-AACv1.
1348                     - 1: Downsampled SBR (default for ELD).
1349                     - 2: Dual-rate SBR   (default for HE-AAC). */
1350 
1351   AACENC_AFTERBURNER =
1352       0x0200, /*!< This parameter controls the use of the afterburner feature.
1353                    The afterburner is a type of analysis by synthesis algorithm
1354                  which increases the audio quality but also the required
1355                  processing power. It is recommended to always activate this if
1356                  additional memory consumption and processing power consumption
1357                    is not a problem. If increased MHz and memory consumption are
1358                  an issue then the MHz and memory cost of this optional module
1359                  need to be evaluated against the improvement in audio quality
1360                  on a case by case basis.
1361                    - 0: Disable afterburner (default).
1362                    - 1: Enable afterburner. */
1363 
1364   AACENC_BANDWIDTH = 0x0203, /*!< Core encoder audio bandwidth:
1365                                   - 0: Determine audio bandwidth internally
1366                                 (default, see chapter \ref BEHAVIOUR_BANDWIDTH).
1367                                   - 1 to fs/2: Audio bandwidth in Hertz. Limited
1368                                 to 20kHz max. Not usable if SBR is active. This
1369                                 setting is for experts only, better do not touch
1370                                 this value to avoid degraded audio quality. */
1371 
1372   AACENC_PEAK_BITRATE =
1373       0x0207, /*!< Peak bitrate configuration parameter to adjust maximum bits
1374                  per audio frame. Bitrate is in bits/second. The peak bitrate
1375                  will internally be limited to the chosen bitrate
1376                  ::AACENC_BITRATE as lower limit and the
1377                  number_of_effective_channels*6144 bit as upper limit.
1378 
1379                    Setting the peak bitrate equal to ::AACENC_BITRATE does not
1380                  necessarily mean that the audio frames will be of constant
1381                  size. Since the peak bitate is in bits/second, the frame sizes
1382                  can vary by one byte in one or the other direction over various
1383                  frames. However, it is not recommended to reduce the peak
1384                  pitrate to ::AACENC_BITRATE - it would disable the
1385                  bitreservoir, which would affect the audio quality by a large
1386                  amount. */
1387 
1388   AACENC_TRANSMUX = 0x0300, /*!< Transport type to be used. See ::TRANSPORT_TYPE
1389                                in FDK_audio.h. Following types can be configured
1390                                in encoder library:
1391                                  - 0: raw access units
1392                                  - 1: ADIF bitstream format
1393                                  - 2: ADTS bitstream format
1394                                  - 6: Audio Mux Elements (LATM) with
1395                                muxConfigPresent = 1
1396                                  - 7: Audio Mux Elements (LATM) with
1397                                muxConfigPresent = 0, out of band StreamMuxConfig
1398                                  - 10: Audio Sync Stream (LOAS) */
1399 
1400   AACENC_HEADER_PERIOD =
1401       0x0301, /*!< Frame count period for sending in-band configuration buffers
1402                  within LATM/LOAS transport layer. Additionally this parameter
1403                  configures the PCE repetition period in raw_data_block(). See
1404                  \ref encPCE.
1405                    - 0xFF: auto-mode default 10 for TT_MP4_ADTS, TT_MP4_LOAS and
1406                  TT_MP4_LATM_MCP1, otherwise 0.
1407                    - n: Frame count period. */
1408 
1409   AACENC_SIGNALING_MODE =
1410       0x0302, /*!< Signaling mode of the extension AOT:
1411                    - 0: Implicit backward compatible signaling (default for
1412                  non-MPEG-4 based AOT's and for the transport formats ADIF and
1413                  ADTS)
1414                         - A stream that uses implicit signaling can be decoded
1415                  by every AAC decoder, even AAC-LC-only decoders
1416                         - An AAC-LC-only decoder will only decode the
1417                  low-frequency part of the stream, resulting in a band-limited
1418                  output
1419                         - This method works with all transport formats
1420                         - This method does not work with downsampled SBR
1421                    - 1: Explicit backward compatible signaling
1422                         - A stream that uses explicit backward compatible
1423                  signaling can be decoded by every AAC decoder, even AAC-LC-only
1424                  decoders
1425                         - An AAC-LC-only decoder will only decode the
1426                  low-frequency part of the stream, resulting in a band-limited
1427                  output
1428                         - A decoder not capable of decoding PS will only decode
1429                  the AAC-LC+SBR part. If the stream contained PS, the result
1430                  will be a a decoded mono downmix
1431                         - This method does not work with ADIF or ADTS. For
1432                  LOAS/LATM, it only works with AudioMuxVersion==1
1433                         - This method does work with downsampled SBR
1434                    - 2: Explicit hierarchical signaling (default for MPEG-4
1435                  based AOT's and for all transport formats excluding ADIF and
1436                  ADTS)
1437                         - A stream that uses explicit hierarchical signaling can
1438                  be decoded only by HE-AAC decoders
1439                         - An AAC-LC-only decoder will not decode a stream that
1440                  uses explicit hierarchical signaling
1441                         - A decoder not capable of decoding PS will not decode
1442                  the stream at all if it contained PS
1443                         - This method does not work with ADIF or ADTS. It works
1444                  with LOAS/LATM and the MPEG-4 File format
1445                         - This method does work with downsampled SBR
1446 
1447                     For making sure that the listener always experiences the
1448                  best audio quality, explicit hierarchical signaling should be
1449                  used. This makes sure that only a full HE-AAC-capable decoder
1450                  will decode those streams. The audio is played at full
1451                  bandwidth. For best backwards compatibility, it is recommended
1452                  to encode with implicit SBR signaling. A decoder capable of
1453                  AAC-LC only will then only decode the AAC part, which means the
1454                  decoded audio will sound band-limited.
1455 
1456                     For MPEG-2 transport types (ADTS,ADIF), only implicit
1457                  signaling is possible.
1458 
1459                     For LOAS and LATM, explicit backwards compatible signaling
1460                  only works together with AudioMuxVersion==1. The reason is
1461                  that, for explicit backwards compatible signaling, additional
1462                  information will be appended to the ASC. A decoder that is only
1463                  capable of decoding AAC-LC will skip this part. Nevertheless,
1464                  for jumping to the end of the ASC, it needs to know the ASC
1465                  length. Transmitting the length of the ASC is a feature of
1466                  AudioMuxVersion==1, it is not possible to transmit the length
1467                  of the ASC with AudioMuxVersion==0, therefore an AAC-LC-only
1468                  decoder will not be able to parse a LOAS/LATM stream that was
1469                  being encoded with AudioMuxVersion==0.
1470 
1471                     For downsampled SBR, explicit signaling is mandatory. The
1472                  reason for this is that the extension sampling frequency (which
1473                  is in case of SBR the sampling frequqncy of the SBR part) can
1474                  only be signaled in explicit mode.
1475 
1476                     For AAC-ELD, the SBR information is transmitted in the
1477                  ELDSpecific Config, which is part of the AudioSpecificConfig.
1478                  Therefore, the settings here will have no effect on AAC-ELD.*/
1479 
1480   AACENC_TPSUBFRAMES =
1481       0x0303, /*!< Number of sub frames in a transport frame for LOAS/LATM or
1482                  ADTS (default 1).
1483                    - ADTS: Maximum number of sub frames restricted to 4.
1484                    - LOAS/LATM: Maximum number of sub frames restricted to 2.*/
1485 
1486   AACENC_AUDIOMUXVER =
1487       0x0304, /*!< AudioMuxVersion to be used for LATM. (AudioMuxVersionA,
1488                  currently not implemented):
1489                    - 0: Default, no transmission of tara Buffer fullness, no ASC
1490                  length and including actual latm Buffer fullnes.
1491                    - 1: Transmission of tara Buffer fullness, ASC length and
1492                  actual latm Buffer fullness.
1493                    - 2: Transmission of tara Buffer fullness, ASC length and
1494                  maximum level of latm Buffer fullness. */
1495 
1496   AACENC_PROTECTION = 0x0306, /*!< Configure protection in transport layer:
1497                                    - 0: No protection. (default)
1498                                    - 1: CRC active for ADTS transport format. */
1499 
1500   AACENC_ANCILLARY_BITRATE =
1501       0x0500, /*!< Constant ancillary data bitrate in bits/second.
1502                    - 0: Either no ancillary data or insert exact number of
1503                  bytes, denoted via input parameter, numAncBytes in
1504                  AACENC_InArgs.
1505                    - else: Insert ancillary data with specified bitrate. */
1506 
1507   AACENC_METADATA_MODE = 0x0600, /*!< Configure Meta Data. See ::AACENC_MetaData
1508                                     for further details:
1509                                       - 0: Do not embed any metadata.
1510                                       - 1: Embed dynamic_range_info metadata.
1511                                       - 2: Embed dynamic_range_info and
1512                                     ancillary_data metadata.
1513                                       - 3: Embed ancillary_data metadata. */
1514 
1515   AACENC_CONTROL_STATE =
1516       0xFF00, /*!< There is an automatic process which internally reconfigures
1517                  the encoder instance when a configuration parameter changed or
1518                  an error occured. This paramerter allows overwriting or getting
1519                  the control status of this process. See ::AACENC_CTRLFLAGS. */
1520 
1521   AACENC_NONE = 0xFFFF /*!< ------ */
1522 
1523 } AACENC_PARAM;
1524 
1525 #ifdef __cplusplus
1526 extern "C" {
1527 #endif
1528 
1529 /**
1530  * \brief  Open an instance of the encoder.
1531  *
1532  * Allocate memory for an encoder instance with a functional range denoted by
1533  * the function parameters. Preinitialize encoder instance with default
1534  * configuration.
1535  *
1536  * \param phAacEncoder  A pointer to an encoder handle. Initialized on return.
1537  * \param encModules    Specify encoder modules to be supported in this encoder
1538  * instance:
1539  *                      - 0x0: Allocate memory for all available encoder
1540  * modules.
1541  *                      - else: Select memory allocation regarding encoder
1542  * modules. Following flags are possible and can be combined.
1543  *                              - 0x01: AAC module.
1544  *                              - 0x02: SBR module.
1545  *                              - 0x04: PS module.
1546  *                              - 0x08: MPS module.
1547  *                              - 0x10: Metadata module.
1548  *                              - example: (0x01|0x02|0x04|0x08|0x10) allocates
1549  * all modules and is equivalent to default configuration denotet by 0x0.
1550  * \param maxChannels   Number of channels to be allocated. This parameter can
1551  * be used in different ways:
1552  *                      - 0: Allocate maximum number of AAC and SBR channels as
1553  * supported by the library.
1554  *                      - nChannels: Use same maximum number of channels for
1555  * allocating memory in AAC and SBR module.
1556  *                      - nChannels | (nSbrCh<<8): Number of SBR channels can be
1557  * different to AAC channels to save data memory.
1558  *
1559  * \return
1560  *          - AACENC_OK, on succes.
1561  *          - AACENC_INVALID_HANDLE, AACENC_MEMORY_ERROR, AACENC_INVALID_CONFIG,
1562  * on failure.
1563  */
1564 AACENC_ERROR aacEncOpen(HANDLE_AACENCODER *phAacEncoder, const UINT encModules,
1565                         const UINT maxChannels);
1566 
1567 /**
1568  * \brief  Close the encoder instance.
1569  *
1570  * Deallocate encoder instance and free whole memory.
1571  *
1572  * \param phAacEncoder  Pointer to the encoder handle to be deallocated.
1573  *
1574  * \return
1575  *          - AACENC_OK, on success.
1576  *          - AACENC_INVALID_HANDLE, on failure.
1577  */
1578 AACENC_ERROR aacEncClose(HANDLE_AACENCODER *phAacEncoder);
1579 
1580 /**
1581  * \brief Encode audio data.
1582  *
1583  * This function is mainly for encoding audio data. In addition the function can
1584  * be used for an encoder (re)configuration process.
1585  * - PCM input data will be retrieved from external input buffer until the fill
1586  * level allows encoding a single frame. This functionality allows an external
1587  * buffer with reduced size in comparison to the AAC or HE-AAC audio frame
1588  * length.
1589  * - If the value of the input samples argument is zero, just internal
1590  * reinitialization will be applied if it is requested.
1591  * - At the end of a file the flushing process can be triggerd via setting the
1592  * value of the input samples argument to -1. The encoder delay lines are fully
1593  * flushed when the encoder returns no valid bitstream data
1594  * AACENC_OutArgs::numOutBytes. Furthermore the end of file is signaled by the
1595  * return value AACENC_ENCODE_EOF.
1596  * - If an error occured in the previous frame or any of the encoder parameters
1597  * changed, an internal reinitialization process will be applied before encoding
1598  * the incoming audio samples.
1599  * - The function can also be used for an independent reconfiguration process
1600  * without encoding. The first parameter has to be a valid encoder handle and
1601  * all other parameters can be set to NULL.
1602  * - If the size of the external bitbuffer in outBufDesc is not sufficient for
1603  * writing the whole bitstream, an internal error will be the return value and a
1604  * reconfiguration will be triggered.
1605  *
1606  * \param hAacEncoder           A valid AAC encoder handle.
1607  * \param inBufDesc             Input buffer descriptor, see AACENC_BufDesc:
1608  *                              - At least one input buffer with audio data is
1609  * expected.
1610  *                              - Optionally a second input buffer with
1611  * ancillary data can be fed.
1612  * \param outBufDesc            Output buffer descriptor, see AACENC_BufDesc:
1613  *                              - Provide one output buffer for the encoded
1614  * bitstream.
1615  * \param inargs                Input arguments, see AACENC_InArgs.
1616  * \param outargs               Output arguments, AACENC_OutArgs.
1617  *
1618  * \return
1619  *          - AACENC_OK, on success.
1620  *          - AACENC_INVALID_HANDLE, AACENC_ENCODE_ERROR, on failure in encoding
1621  * process.
1622  *          - AACENC_INVALID_CONFIG, AACENC_INIT_ERROR, AACENC_INIT_AAC_ERROR,
1623  * AACENC_INIT_SBR_ERROR, AACENC_INIT_TP_ERROR, AACENC_INIT_META_ERROR,
1624  * AACENC_INIT_MPS_ERROR, on failure in encoder initialization.
1625  *          - AACENC_UNSUPPORTED_PARAMETER, on incorrect input or output buffer
1626  * descriptor initialization.
1627  *          - AACENC_ENCODE_EOF, when flushing fully concluded.
1628  */
1629 AACENC_ERROR aacEncEncode(const HANDLE_AACENCODER hAacEncoder,
1630                           const AACENC_BufDesc *inBufDesc,
1631                           const AACENC_BufDesc *outBufDesc,
1632                           const AACENC_InArgs *inargs, AACENC_OutArgs *outargs);
1633 
1634 /**
1635  * \brief  Acquire info about present encoder instance.
1636  *
1637  * This function retrieves information of the encoder configuration. In addition
1638  * to informative internal states, a configuration data block of the current
1639  * encoder settings will be returned. The format is either Audio Specific Config
1640  * in case of Raw Packets transport format or StreamMuxConfig in case of
1641  * LOAS/LATM transport format. The configuration data block is binary coded as
1642  * specified in ISO/IEC 14496-3 (MPEG-4 audio), to be used directly for MPEG-4
1643  * File Format or RFC3016 or RFC3640 applications.
1644  *
1645  * \param hAacEncoder           A valid AAC encoder handle.
1646  * \param pInfo                 Pointer to AACENC_InfoStruct. Filled on return.
1647  *
1648  * \return
1649  *          - AACENC_OK, on succes.
1650  *          - AACENC_INVALID_HANDLE, AACENC_INIT_ERROR, on failure.
1651  */
1652 AACENC_ERROR aacEncInfo(const HANDLE_AACENCODER hAacEncoder,
1653                         AACENC_InfoStruct *pInfo);
1654 
1655 /**
1656  * \brief  Set one single AAC encoder parameter.
1657  *
1658  * This function allows configuration of all encoder parameters specified in
1659  * ::AACENC_PARAM. Each parameter must be set with a separate function call. An
1660  * internal validation of the configuration value range will be done and an
1661  * internal reconfiguration will be signaled. The actual configuration adoption
1662  * is part of the subsequent aacEncEncode() call.
1663  *
1664  * \param hAacEncoder           A valid AAC encoder handle.
1665  * \param param                 Parameter to be set. See ::AACENC_PARAM.
1666  * \param value                 Parameter value. See parameter description in
1667  * ::AACENC_PARAM.
1668  *
1669  * \return
1670  *          - AACENC_OK, on success.
1671  *          - AACENC_INVALID_HANDLE, AACENC_UNSUPPORTED_PARAMETER,
1672  * AACENC_INVALID_CONFIG, on failure.
1673  */
1674 AACENC_ERROR aacEncoder_SetParam(const HANDLE_AACENCODER hAacEncoder,
1675                                  const AACENC_PARAM param, const UINT value);
1676 
1677 /**
1678  * \brief  Get one single AAC encoder parameter.
1679  *
1680  * This function is the complement to aacEncoder_SetParam(). After encoder
1681  * reinitialization with user defined settings, the internal status can be
1682  * obtained of each parameter, specified with ::AACENC_PARAM.
1683  *
1684  * \param hAacEncoder           A valid AAC encoder handle.
1685  * \param param                 Parameter to be returned. See ::AACENC_PARAM.
1686  *
1687  * \return  Internal configuration value of specifed parameter ::AACENC_PARAM.
1688  */
1689 UINT aacEncoder_GetParam(const HANDLE_AACENCODER hAacEncoder,
1690                          const AACENC_PARAM param);
1691 
1692 /**
1693  * \brief  Get information about encoder library build.
1694  *
1695  * Fill a given LIB_INFO structure with library version information.
1696  *
1697  * \param info  Pointer to an allocated LIB_INFO struct.
1698  *
1699  * \return
1700  *          - AACENC_OK, on success.
1701  *          - AACENC_INVALID_HANDLE, AACENC_INIT_ERROR, on failure.
1702  */
1703 AACENC_ERROR aacEncGetLibInfo(LIB_INFO *info);
1704 
1705 #ifdef __cplusplus
1706 }
1707 #endif
1708 
1709 #endif /* AACENC_LIB_H */
1710