-
Notifications
You must be signed in to change notification settings - Fork 514
Wave Formats
Audio formats are described by the WAVEFORMATEX
structure, but the polymorphic nature of this format encompasses several other structures as well.
The 'generic' structure of a WAVEFORMATEX
is as follows (18 bytes), and is the modern header structure used to encode audio data. This header can be interpreted as other structures based on the wFormatTag
value, and can encode variable-length header extensions.
WORD wFormatTag;
WORD nChannels;
DWORD nSamplesPerSec;
DWORD nAvgBytesPerSec;
WORD nBlockAlign;
WORD wBitsPerSample;
WORD cbSize;
XAudio2 supports WAVE_FORMAT_PCM
, WAVE_FORMAT_IEEE_FLOAT
, WAVE_FORMAT_ADPCM
, WAVE_FORMAT_WMAUDIO2
, and WAVE_FORMAT_WMAUDIO3
. On Xbox One, it also supports WAVE_FORMAT_XMA2
.
The simplest kind of wave format is a partial version of WAVEFORMATEX
(it's only 14 instead of 18 bytes) that was the original polymorphic header for audio. This structure is largely considered deprecated in favor of WAVEFORMATEX
, and is only used as part of the structure definition for some older formats.
WORD wFormatTag;
WORD nChannels;
DWORD nSamplesPerSec;
DWORD nAvgBytesPerSec;
WORD nBlockAlign;
The next simplest form is another partial WAVEFORMATEX
version using an older structure (it's only 16 bytes instead of 18 bytes). This is commonly used to encode PCM format data using WAVE_FORMAT_PCM
or WAVE_FORMAT_IEEE_FLOAT
, although it's recommended you use a complete WAVEFORMATEX
instead.
WAVEFORMAT wf;
WORD wBitsPerSample;
XAudio2 only supports wBitsPerSample
values of 8, 16, 24, or 32 for WAVE_FORMAT_PCM.
XAudio2 only supports a wBitsPerSample
of 32 for WAVE_FORMAT_IEEE_FLOAT.
Generally for handling legacy
.wav
files, you allow the header format structure to be only 16 bytes long for theWAVE_FORMAT_PCM
orWAVE_FORMAT_IEEE_FLOAT
tags and just assumecbSize
is 0.
A common compressed encoding for audio data is ADPCM which is indicated with WAVE_FORMAT_ADPCM
.
WAVEFORMATEX wfx;
WORD wSamplesPerBlock;
WORD wNumCoef;
ADPCMCOEFSET aCoef[];
XAudio2 only supports Microsoft ADPCM which specifically contains 32 extra bytes (MSADPCM_FORMAT_EXTRA_BYTES
) in the wfx.cbSize
. It has a wSamplesPerBlock
of 4 (MSADPCM_BITS_PER_SAMPLE
), and uses 7 (MSADPCM_NUM_COEFFICIENTS
) fixed coefficient pairs:
{ 256, 0 }, { 512, -256 }, { 0, 0 }, { 192, 64 }, { 240, 0 }, { 460, -208 }, { 392, -232 }
A more complicated version of WAVEFORMATEX
is this extensible structure indicated by WAVE_FORMAT_EXTENSIBLE
with a cbSize
of 22 for a total of 40 bytes. It encodes the format as a GUID allowing for custom audio formats to be easily encoded.
WAVEFORMATEX Format;
union {
WORD wValidBitsPerSample;
WORD wSamplesPerBlock;
WORD wReserved;
} Samples;
DWORD dwChannelMask;
GUID SubFormat;
The most common use of this format is for encoding multi-channel audio since it provides an explicit dwChannelMask
. The standard WAVE_FORMAT_
tags are encoded in a WAVEFORMATEXTENSIBLE
using 'well-known' GUIDs of the following form:
{ formatTag, 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71 }
Here's example code for finding the standard format tag that handles WAVEFORMATEXTENSIBLE
:
inline uint32_t GetFormatTag(const WAVEFORMATEX* wfx)
{
if (wfx->wFormatTag == WAVE_FORMAT_EXTENSIBLE)
{
if (wfx->cbSize < (sizeof(WAVEFORMATEXTENSIBLE) - sizeof(WAVEFORMATEX)))
return 0;
static const GUID s_wfexBase = { 0x00000000, 0x0000, 0x0010, 0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71 };
auto wfex = reinterpret_cast<const WAVEFORMATEXTENSIBLE*>(wfx);
if (memcmp(reinterpret_cast<const BYTE*>(&wfex->SubFormat) + sizeof(DWORD),
reinterpret_cast<const BYTE*>(&s_wfexBase) + sizeof(DWORD), sizeof(GUID) - sizeof(DWORD)) != 0)
{
return 0;
}
return wfex->SubFormat.Data1;
}
else
{
return wfx->wFormatTag;
}
}
The Xbox One supports a custom hardware-compressed audio format known as XMA and is indicated using a custom variant of WAVEFORMATEX
with the WAVE_FORMAT_XMA2
(0x166). It has a wfx.cbSize
value of 34 for a total of 52 bytes.
WAVEFORMATEX wfx;
WORD NumStreams;
DWORD ChannelMask;
DWORD SamplesEncoded;
DWORD BytesPerBlock;
DWORD PlayBegin;
DWORD PlayLength;
DWORD LoopBegin;
DWORD LoopLength;
BYTE LoopCount;
BYTE EncoderVersion;
WORD BlockCount;
Older versions of the Xbox supported XMA v1 indicated with
WAVE_FORMAT_XMA
(0x165).
All content and source code for this package are subject to the terms of the MIT License.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
- Universal Windows Platform apps
- Windows desktop apps
- Windows 11
- Windows 10
- Windows 8.1
- Xbox One
- x86
- x64
- ARM64
- Visual Studio 2022
- Visual Studio 2019 (16.11)
- clang/LLVM v12 - v18
- MinGW 12.2, 13.2
- CMake 3.20