ACM File Format
ACM is Interplay's compressed audio format. Fallout and Fallout 2 use it for music, speech, and sound effects. It is not related to Microsoft's Windows Audio Compression Manager despite sharing the same extension.
An ACM file contains a 14-byte header followed by a little-endian bitstream. Decoding produces signed 16-bit PCM samples. The common Fallout rate is 22050 Hz, with mono and stereo files both appearing in Interplay-era games. Modern decoders should read the channel count and sample rate from the header, but Fallout's own playback paths also make contextual assumptions about how the sound is used.
Browser ACM Player
The player below decodes ACM directly in the browser. The decoder mode controls how strict the parser is about Fallout-style standalone files, WAVC wrappers, and the Fallout 2 CE raw-block extension; the playback-channel mode controls how the decoded sample stream is grouped into audio frames. This matters because Fallout's playback context can override what the ACM header appears to say.
If narration or speech sounds too fast, try Speech / narration mono. That mode treats each decoded sample value as one mono frame instead of pairing values as stereo; merely downmixing a stereo interpretation to mono still keeps the wrong duration.
Where Fallout Uses It
| Folder | Use | Notes |
|---|---|---|
| sound\sfx\ | Interface, combat, ambient, scenery, weapon, and item sound effects. | Often loaded through the sound effects cache. |
| sound\music\ | Background music. | Map music names come from maps.txt or engine calls and are loaded as .ACM. |
| sound\speech\ | Spoken dialogue. | Dialogue MSG entries reference speech base names without the extension. |
ACM files may be loose in the data tree or stored in DAT archives. The DAT layer is independent: a DAT2 entry may be stored compressed with zlib, and the entry payload may itself be an ACM compressed audio file.
Relationship to MSG and LIP
Dialogue lines can reference an ACM file in the second field of an MSG entry:
{104}{hak001}{Greetings, Chosen.}
The engine resolves the speech name through the speech sound path and appends .ACM. If a talking-head line has lip sync, the corresponding LIP file stores mouth-shape timing; the ACM stores only audio. The two files must agree in duration closely enough for the animation to remain synchronized.
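As a rough illustration of the lookup described above, the sketch below pulls the speech base name out of an MSG line and appends .ACM. The names parse_msg_entry, speech_file_name, and the speech_dir default are hypothetical, and the real engine builds the directory part from its own speech path settings, so treat the prefix as a placeholder.

```python
import re

# Matches one MSG entry of the form {id}{speech}{text}.
MSG_ENTRY = re.compile(r"\{(\d+)\}\{([^}]*)\}\{([^}]*)\}")


def parse_msg_entry(line):
    """Return (message_id, speech_base_name or None, text) for one MSG line."""
    match = MSG_ENTRY.search(line)
    if match is None:
        raise ValueError("not an MSG entry: %r" % line)
    message_id, speech, text = match.groups()
    # An empty second field means the line has no speech file.
    return int(message_id), (speech or None), text


def speech_file_name(speech_base_name, speech_dir="sound\\speech\\"):
    """Append .ACM to the base name; speech_dir is only a placeholder prefix."""
    return speech_dir + speech_base_name + ".ACM"


entry = parse_msg_entry("{104}{hak001}{Greetings, Chosen.}")
print(entry)
print(speech_file_name(entry[1]))
```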
Header
All multi-byte header fields are little-endian. The packed stream after the header is read least-significant bit first.
| Offset | Size | Field | Description |
|---|---|---|---|
| 0x00 | 4 | signature | Bytes 97 28 03 01. Often written as FourCC/value 0x01032897 in little-endian order. |
| 0x04 | 4 | sample_count | Total number of decoded 16-bit sample values. For stereo, this counts interleaved left+right sample values, not just sample frames. |
| 0x08 | 2 | channels | Usually 1 or 2. Some decoders warn that some ACMs can lie about this value, so callers should also know the intended playback context. |
| 0x0A | 2 | sample_rate | Commonly 22050 Hz in Fallout data. |
| 0x0C | 2 | attributes | Low 4 bits are levels; high 12 bits are rows / subblocks. |
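// On-disk layout: 14 bytes, little-endian, no padding.
// With natural alignment sizeof(AcmHeader) is typically 16, so read the
// fields one by one or pack the struct instead of reading sizeof bytes.
struct AcmHeader {
    uint32_t signature;    // 0x01032897, bytes 97 28 03 01
    uint32_t sample_count; // decoded int16 sample values
    uint16_t channels;
    uint16_t sample_rate;
    uint16_t attributes;   // levels in low nibble, rows in high 12 bits
};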
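The header layout above can be read with a fixed little-endian unpack. This is a minimal sketch, not the engine's code; the names parse_acm_header, ACM_HEADER, and the AcmHeader tuple are illustrative.

```python
import struct
from collections import namedtuple

ACM_SIGNATURE = 0x01032897            # bytes 97 28 03 01 on disk
ACM_HEADER = struct.Struct("<IIHHH")  # 4 + 4 + 2 + 2 + 2 = 14 bytes, no padding

AcmHeader = namedtuple(
    "AcmHeader", "signature sample_count channels sample_rate attributes"
)


def parse_acm_header(data):
    """Unpack the 14-byte header and check the signature."""
    if len(data) < ACM_HEADER.size:
        raise ValueError("file is shorter than the 14-byte ACM header")
    header = AcmHeader(*ACM_HEADER.unpack_from(data, 0))
    if header.signature != ACM_SIGNATURE:
        raise ValueError("bad ACM signature: 0x%08X" % header.signature)
    return header


# A hand-built header for demonstration only, not taken from a real file.
sample = bytes.fromhex("97280301") + struct.pack("<IHHH", 44100, 1, 22050, 0x0207)
print(parse_acm_header(sample))
```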
Derived Values
The attributes word drives block shape and transform depth:
levels = attributes & 0x000F
rows = attributes >> 4
cols = 1 << levels
block_sample_values = rows * cols
transform_state_values = levels == 0 ? 0 : (3 * cols / 2 - 2)
Older Fallout documentation calls levels packAttrs and rows packAttrs2. Both names describe the same header bits. A decoded block contains rows * (1 << levels) 16-bit sample values before channel grouping. Decoder implementations name and size the previous-block transform buffer differently; the formula above matches the classic Fallout/CE-style description.
Fallout 2 CE additionally processes the inverse transform in slices. It computes:
block_rows_per_step = 2048 / cols - 2
if block_rows_per_step < 1:
    block_rows_per_step = 1
This limits how much previous-block state is needed while reconstructing the waveform.
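The derived values can be collected in one helper. The sketch below restates the formulas from this section using integer division; derive_block_shape and the dictionary keys are illustrative names rather than identifiers from any decoder.

```python
def derive_block_shape(attributes):
    """Compute block shape and transform sizes from the attributes word."""
    levels = attributes & 0x000F
    rows = attributes >> 4
    cols = 1 << levels
    return {
        "levels": levels,
        "rows": rows,
        "cols": cols,
        "block_sample_values": rows * cols,
        "transform_state_values": 0 if levels == 0 else (3 * cols // 2 - 2),
        # Fallout 2 CE slice height, clamped to at least one row per step.
        "block_rows_per_step": max(1, 2048 // cols - 2),
    }


shape = derive_block_shape(0x0207)
print(shape["levels"], shape["rows"], shape["cols"])  # 7 32 128
print(shape["block_sample_values"])                   # 4096
print(shape["transform_state_values"])                # 190
print(shape["block_rows_per_step"])                   # 14
```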
High-Level Decode Flow
- Read and validate the 14-byte header.
- Allocate a coefficient block of rows * cols integer values.
- Allocate a wrap/previous-sample buffer if levels != 0.
- For each encoded block, read a 20-bit scale header: 4-bit pwr and 16-bit val.
- Build the amplitude lookup table around index zero using multiples of val.
- For each column/subband, read a 5-bit filler code and decode coefficient values into the block.
- If levels != 0, apply the inverse transform using the wrap buffer from previous blocks.
- Output each reconstructed value shifted right by levels as signed 16-bit PCM.
- Stop after sample_count sample values have been produced.
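Every step that reads an N-bit field depends on an LSB-first bit reader. The class below is a small sketch of such a reader, assuming bytes are consumed in file order and the oldest unread bit is always the lowest one; BitReader is an illustrative name, and real decoders usually buffer whole words for speed.

```python
class BitReader:
    """Reads unsigned fields from a byte string, least-significant bit first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0    # index of the next unread byte
        self.bits = 0   # buffered bits, oldest bit in bit 0
        self.count = 0  # number of valid buffered bits

    def read(self, n):
        """Return the next n bits as an unsigned integer."""
        while self.count < n:
            if self.pos >= len(self.data):
                raise EOFError("bitstream exhausted")
            # New bytes are stacked above the bits that are still buffered.
            self.bits |= self.data[self.pos] << self.count
            self.pos += 1
            self.count += 8
        value = self.bits & ((1 << n) - 1)
        self.bits >>= n
        self.count -= n
        return value


reader = BitReader(bytes([0xA5, 0x3C, 0x0F]))
pwr = reader.read(4)    # 4-bit scale header field
val = reader.read(16)   # 16-bit scale header field
print(pwr, hex(val))    # 5 0xf3ca
```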
Block Scale Header
Every encoded block starts with:
| Bits | Field | Description |
|---|---|---|
| 4 | pwr | Amplitude table power. count = 1 << pwr. |
| 16 | val | Amplitude step value. |
The decoder fills the entries on both sides of the midpoint of a 65536-entry amplitude table, so signed indexes can be used directly relative to that midpoint (written mid here):
mid[0] = 0
mid[1] = val
mid[2] = 2 * val
...
mid[count - 1] = (count - 1) * val
mid[-1] = -val
mid[-2] = -2 * val
...
mid[-count] = -count * val
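The table fill can be sketched with a flat list and a fixed midpoint offset. The constants and function names here (TABLE_SIZE, MIDPOINT, fill_amplitude_table, amplitude) are illustrative; entries outside the filled range simply keep whatever an earlier block left there.

```python
TABLE_SIZE = 0x10000          # 65536 entries
MIDPOINT = TABLE_SIZE // 2    # signed index 0 lives here


def fill_amplitude_table(table, pwr, val):
    """Write multiples of val on both sides of the midpoint."""
    count = 1 << pwr
    for i in range(count):            # mid[0] .. mid[count - 1]
        table[MIDPOINT + i] = i * val
    for i in range(1, count + 1):     # mid[-1] .. mid[-count]
        table[MIDPOINT - i] = -i * val


def amplitude(table, index):
    """Look up a signed index relative to the midpoint."""
    return table[MIDPOINT + index]


table = [0] * TABLE_SIZE
fill_amplitude_table(table, pwr=2, val=5)
print([amplitude(table, i) for i in range(-4, 4)])
# [-20, -15, -10, -5, 0, 5, 10, 15]
```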
Filler Codes
After the scale header, the decoder reads one 5-bit filler code per column. The column count is cols = 1 << levels. Each filler writes values down that column across all rows.
| Code range | Name | Purpose |
|---|---|---|
| 0 | zero | Fill the column with zero coefficients. |
| 1, 2, 25, 28, 30, 31 | invalid / fail in strict decoders | Classic notes called these Ret0 or failed fillers. Standard ACMs should not need them. |
| 3..16 | linear | Read as many bits per row as the code number, center the value around zero, and use it as an amplitude-table index. |
| 17 | k13 | Sparse coding for values 0, -1, +1, with repeated-zero shortcuts. |
| 18 | k12 | Sparse coding for 0, -1, +1, without the double-zero shortcut. |
| 19 | t15 | 5 bits encode three base-3 digits, mapping to -1, 0, +1. |
| 20 | k24 | Sparse coding for 0, -2, -1, +1, +2, with repeated-zero shortcuts. |
| 21 | k23 | Sparse coding similar to k24, without the double-zero shortcut. |
| 22 | t27 | 7 bits encode three base-5 digits, mapping to -2..+2. |
| 23 | k35 | Sparse coding for 0, -3, -2, -1, +1, +2, +3, with repeated-zero shortcuts. |
| 24 | k34 | Sparse coding similar to k35, without the double-zero shortcut. |
| 26 | k45 | Sparse coding for 0 and three-bit nonzero values -4..-1, +1..+4, with repeated-zero shortcuts. |
| 27 | k44 | Sparse coding similar to k45, without the double-zero shortcut. |
| 29 | t37 | 7 bits encode two base-11 digits, mapping to -5..+5. |
Fallout 2 CE also contains a handler for filler code 31 used by some Russian localizations. That handler reads raw 16-bit values for the entire block and acts as both filler and transformer. Standard Fallout ACM files and most general decoders treat code 31 as invalid.
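Two of the simpler filler families can be illustrated without the full decoder. The sketch below shows how a linear field is centered and how the t15, t27, and t37 fields split into base-N digits. The function names are illustrative, and emitting the least-significant digit first is an assumption based on how libacm-style decoders are usually written; check it against the reference decoder you target. The resulting values are amplitude-table indexes, not final coefficients.

```python
def decode_linear(raw_bits, code):
    """Map a raw code-bit field (code in 3..16) to a signed table index."""
    return raw_bits - (1 << (code - 1))


def unpack_digits(packed, base, digits):
    """Split a packed field into base-N digits centered on zero.

    Assumption: the least-significant digit belongs to the first row.
    """
    half = base // 2
    out = []
    for _ in range(digits):
        out.append(packed % base - half)
        packed //= base
    return out


# linear, code 3: raw values 0..7 become indexes -4..3
print([decode_linear(raw, 3) for raw in range(8)])

# t15: three base-3 digits; t27: three base-5 digits; t37: two base-11 digits
print(unpack_digits(13, 3, 3))     # [0, 0, 0]
print(unpack_digits(0, 5, 3))      # [-2, -2, -2]
print(unpack_digits(120, 11, 2))   # [5, 5]
```

Packed values beyond the digit range (for example 27..31 in a 5-bit t15 field) do not correspond to any digit triple, so a strict decoder would treat them as an error.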
Inverse Transform
The filled block is not always PCM yet. If levels is nonzero, the decoder applies a recursive inverse transform over subbands. Implementations often call this step juggle, untransform, or unpacking.
The transform uses previous-block state, which is why a decoder must keep the wrap buffer between blocks. It also adds 1 to a subset of reconstructed values during processing before later output scaling. The final PCM sample value is the reconstructed integer shifted right by levels and written as signed little-endian 16-bit audio.
For seeking backward in ACM audio, CE resets the decoder and decodes forward until the requested byte position is reached. A robust standalone decoder can use the same strategy for arbitrary seeks, but it is not cheap because the bitstream has no independent seek table.
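The reset-and-decode-forward strategy can be sketched around any decoder that yields sample values in order. Here make_decoder is a hypothetical factory that returns a fresh iterator over decoded sample values, and positions are counted in sample values (multiply by 2 for a byte position); this is not CE's actual interface.

```python
class ForwardOnlySeeker:
    """Seeks by restarting the decoder and discarding samples up to the target."""

    def __init__(self, make_decoder):
        self._make_decoder = make_decoder  # hypothetical: returns a fresh iterator
        self._decoder = make_decoder()
        self.position = 0                  # sample values consumed so far

    def seek(self, target):
        """Move to sample index target; returns the position actually reached."""
        if target < self.position:
            # Backward seek: there is no seek table, so start over.
            self._decoder = self._make_decoder()
            self.position = 0
        while self.position < target:
            try:
                next(self._decoder)
            except StopIteration:
                break                      # target lies beyond the end of the stream
            self.position += 1
        return self.position

    def read(self, n):
        """Return up to n sample values from the current position."""
        out = []
        for _ in range(n):
            try:
                out.append(next(self._decoder))
            except StopIteration:
                break
        self.position += len(out)
        return out


# A stand-in "decoder" that yields 0..99 instead of real audio.
seeker = ForwardOnlySeeker(lambda: iter(range(100)))
seeker.seek(10)
print(seeker.read(2))   # [10, 11]
seeker.seek(5)          # backward: resets, then skips five values
print(seeker.read(1))   # [5]
```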
PCM Output
| Property | Decoded output |
|---|---|
| Sample format | Signed 16-bit PCM |
| Byte order | Little-endian when written to WAV/raw files on PC |
| Sample count | sample_count 16-bit values from the header |
| Byte count | sample_count * 2 |
| Frame count | sample_count / channels, if the channel count is trusted |
| Duration | sample_count / (channels * sample_rate) seconds |
Fallout 2 CE's audio wrapper asks the ACM decoder for sample_count and then doubles it to get the decoded byte size. This is another useful confirmation that the header count is a count of 16-bit sample values, not bytes.
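The table rows reduce to a few lines of arithmetic. This sketch assumes the header values are trusted; pcm_metrics is an illustrative name.

```python
def pcm_metrics(sample_count, channels, sample_rate):
    """Return (byte_count, frame_count, duration_seconds) for decoded output."""
    byte_count = sample_count * 2                 # 16-bit sample values
    frame_count = sample_count // channels        # only valid if channels is trusted
    duration = sample_count / (channels * sample_rate)
    return byte_count, frame_count, duration


print(pcm_metrics(441000, 2, 22050))   # (882000, 220500, 10.0)
print(pcm_metrics(441000, 1, 22050))   # (882000, 441000, 20.0)
```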
Channel Interpretation and Playback Context
The ACM header has a channels field, but Fallout-era playback code should not be treated as a pure "decode header, play header" pipeline. The compressed stream decodes to a flat sequence of signed 16-bit sample values. Grouping those values into frames is a playback decision: mono uses one sample value per frame, while stereo uses two interleaved sample values per frame.
This distinction matters because some narration or speech-like files can play at the wrong speed if a tool trusts a stereo interpretation where the game context expects mono. In that failure mode, every two decoded values are incorrectly paired into one stereo frame, so the frame count is halved and the audio plays roughly twice as fast. Downmixing that mistaken stereo stream to mono does not fix the duration; the stream must be reinterpreted as mono before frame construction.
| Interpretation | Frame construction | Duration formula | Typical use |
|---|---|---|---|
| Mono stream | Each decoded sample value is one frame. | sample_count / sample_rate | Speech, narration, many effects. |
| Stereo stream | Pairs of decoded sample values form left/right frames. | sample_count / (2 * sample_rate) | Music and stereo assets. |
| Header-driven | Uses channels from the ACM header. | sample_count / (channels * sample_rate) | Useful default for well-behaved standalone tools. |
For a format viewer, it is useful to expose both operations separately: reinterpret the sample stream as mono or stereo, and optionally downmix stereo to mono for listening. Reinterpretation changes duration; downmixing changes only channel count after the frame grouping has already happened.
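The difference between the two operations is easy to see on a flat sample list. The sketch below uses plain Python lists and illustrative function names; the downmix is a simple floor average, which is one choice among several.

```python
def as_mono_frames(samples):
    """Reinterpret: every decoded value is its own frame."""
    return [(s,) for s in samples]


def as_stereo_frames(samples):
    """Reinterpret: consecutive pairs become (left, right) frames."""
    usable = len(samples) - (len(samples) % 2)  # drop a dangling last value
    return [(samples[i], samples[i + 1]) for i in range(0, usable, 2)]


def downmix_to_mono(stereo_frames):
    """Downmix: average left and right; the frame count does not change."""
    return [((left + right) // 2,) for left, right in stereo_frames]


def duration_seconds(frames, sample_rate):
    return len(frames) / sample_rate


samples = list(range(22050))                       # one second if read as mono
mono = as_mono_frames(samples)
stereo = as_stereo_frames(samples)
print(duration_seconds(mono, 22050))               # 1.0
print(duration_seconds(stereo, 22050))             # 0.5
print(duration_seconds(downmix_to_mono(stereo), 22050))  # still 0.5
```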
WAVC Wrapper
Some Interplay/BioWare-era files use a WAVC wrapper before an ACM stream. This is not the normal Fallout 1/2 standalone .ACM file layout, but it appears in broader Interplay ACM documentation and tooling.
struct WavcHeader {
    char signature[4]; // "WAVC"
    char version[4]; // "V1.0"
    uint32_t uncompressed_size;
    uint32_t compressed_size;
    uint32_t header_size; // often 28
    uint16_t channels;
    uint16_t bits;
    uint16_t sample_rate;
    uint16_t unknown;
};
If a file starts with WAVC, a general-purpose tool should skip to the embedded ACM header using the wrapper's header size rather than expecting 97 28 03 01 at offset 0.
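A tool can locate the embedded ACM header with a short check like the one below. It is a sketch built on the struct layout above; find_acm_offset is an illustrative name, and rejecting a header_size smaller than the 28-byte wrapper is an assumption of this example rather than documented engine behavior.

```python
import struct

WAVC_HEADER = struct.Struct("<4s4sIIIHHHH")  # 28 bytes, matches WavcHeader


def find_acm_offset(data):
    """Return the offset where the ACM header should start."""
    if data[:4] != b"WAVC":
        return 0  # no wrapper: a standalone ACM starts at offset 0
    if len(data) < WAVC_HEADER.size:
        raise ValueError("truncated WAVC header")
    header_size = WAVC_HEADER.unpack_from(data, 0)[4]
    # Assumption: the ACM data cannot start inside the wrapper or past the file end.
    if header_size < WAVC_HEADER.size or header_size > len(data):
        raise ValueError("implausible WAVC header_size: %d" % header_size)
    return header_size


# Hand-built wrapper for demonstration only, not taken from a real file.
wrapped = struct.pack("<4s4sIIIHHHH", b"WAVC", b"V1.0", 100, 50, 28, 1, 16, 22050, 0)
wrapped += bytes.fromhex("97280301") + bytes(10)
offset = find_acm_offset(wrapped)
print(offset, wrapped[offset:offset + 4].hex())   # 28 97280301
```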
Validation Checklist
- File should contain at least 14 bytes.
- Standalone Fallout ACM signature should be bytes 97 28 03 01.
- sample_count should be nonzero for normal playable files.
- channels should normally be 1 or 2.
- sample_rate should usually be 22050 for Fallout data, though the format can carry other rates.
- levels should produce a reasonable cols = 1 << levels; very large values can overflow allocations.
- rows should be nonzero and rows * cols should fit the decoder's block buffer limits.
- Decode should stop after exactly sample_count sample values even if the last block has padding or unused bitstream data.
- If a speech or narration file has the right pitch but plays too fast, test mono stream interpretation before assuming the ACM bitstream decode is wrong.
- Reject or explicitly handle filler codes not supported by the target engine.
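The header-level checks from this list can be gathered into one function that reports problems instead of raising. This is a sketch: validate_acm_header is an illustrative name, and max_block_values is a hypothetical limit standing in for whatever block buffer size a given decoder supports.

```python
import struct


def validate_acm_header(data, max_block_values=1 << 20):
    """Return a list of human-readable problems; an empty list means no findings."""
    if len(data) < 14:
        return ["file is shorter than 14 bytes"]
    signature, sample_count, channels, sample_rate, attributes = struct.unpack_from(
        "<IIHHH", data, 0
    )
    problems = []
    if signature != 0x01032897:
        problems.append("signature is not 97 28 03 01")
    if sample_count == 0:
        problems.append("sample_count is zero")
    if channels not in (1, 2):
        problems.append("channels is %d, expected 1 or 2" % channels)
    if sample_rate != 22050:
        problems.append("sample_rate is %d, unusual for Fallout data" % sample_rate)
    rows = attributes >> 4
    cols = 1 << (attributes & 0x000F)
    if rows == 0:
        problems.append("rows is zero")
    elif rows * cols > max_block_values:
        problems.append("rows * cols exceeds the block buffer limit")
    return problems


good = struct.pack("<IIHHH", 0x01032897, 1000, 1, 22050, 0x0207)
print(validate_acm_header(good))   # []
```

The sample-rate finding is a warning rather than a hard failure, since the format can carry other rates.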
Authoring Notes
- For Fallout 1/2 compatibility, encode to 16-bit PCM ACM and prefer 22050 Hz.
- Use mono or stereo according to the playback path. Speech, narration, and many SFX are effectively mono; music is commonly treated as stereo by playback code.
- When converting or previewing audio, keep "reinterpret stream as mono/stereo" separate from "downmix to mono". Reinterpretation changes duration; downmixing does not.
- Dialogue speech file names are usually referenced from MSG without path or extension.
- When replacing talking-head speech, update or regenerate the matching LIP timing file if the duration or phrasing changes.
- Do not wrap Fallout replacement speech/music in WAV unless the target engine/tool explicitly supports that wrapper.
- When storing ACM in DAT2, remember that DAT compression is separate and optional. Recompressing already-compressed ACM data may not save much space.
Related Formats
- MSG File Format - dialogue lines can reference speech ACM base names.
- LIP File Format - talking-head mouth timing paired with speech audio.
- DAT File Format - archive container that often stores ACM resources.
- MAP File Format - maps select background music indirectly through map indexes and world-map configuration.
- PRO File Format - item, scenery, and weapon sound IDs feed SFX name construction.
- World-map Text Configuration Files - map music and ambient sound effects are referenced from text configuration.
Source References
- Fallout 2 CE sound_decoder.cc - ACM header parsing, bitstream reader, filler table, inverse transform, and PCM output.
- Fallout 2 CE audio_file.cc - compressed audio file wrapper, decoded byte-size handling, and seek behavior.
- Fallout 2 CE game_sound.cc - Fallout music, speech, and SFX paths and ACM playback integration.
- FFmpeg interplayacm.c - independent Interplay ACM decoder, filler functions, block transform, and output model.
- MultimediaWiki Interplay ACM - concise header summary, broader game usage, and WAVC wrapper note.
- Vault-Tec Labs ACM File Format - Abel's original Fallout-oriented ACM notes.
Tools
- TeamX sound utilities
- Game Audio Player
- libacm
- FFmpeg with the interplayacm decoder
- SND2ACM
Credits
The original Fallout ACM write-up was written by Abel in 2000. Later decoder work by Marko Kreen, Adam Gashlin, Paul B Mahol, Fallout 2 CE contributors, and other open-source authors makes the format much easier to verify today.