- The ACF format
- Versions and compatibility
- "Format" chunk
- "Palette" chunk
- "Frame" chunks
- Tile Correction
- Tile Decoding
- ACF Extractor
Now that I've given some information about the way levels were designed and scripted, time to talk about the Xtra 3D Motion [1]!
The ACF format
Let's start with the credits for the "Adeline Software Motion Compressor" [2].
- Serge Plagnol for most of the hard work such as compression methods, color reduction, x86 depacker...
- Olivier Lhermite for the GIF and LZMIT code
- Frantz Cournil for the FLC loading code
- Mickaël Pointier [3] for the FLI loading code as well as the PSX specific code, including the R3000 depacker.
ACF stands for "Adeline Chunk(ed) Format", and is basically a container format similar to Electronic Arts' IFF [4].
Each section ("Chunk") starts with a 12-byte header: 8 bytes with the name of the chunk, and 4 bytes for the length:
uint8 name[8]
uint32 length
Using this information it is possible to iterate over the entire file even if you don't know some of the chunks.
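A minimal sketch of such an iteration is shown here. It is not the code from the extractor, and it rests on assumptions the text does not confirm: the length is taken to be little-endian and to count only the payload (not the 12-byte header), and names shorter than 8 characters are assumed to be padded with zeros or spaces. `ChunkInfo`, `ReadU32LE` and `ListChunks` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One entry of the chunk directory built by ListChunks() (hypothetical helper type).
struct ChunkInfo
{
    std::string name;    // Up to 8 characters, padding removed
    size_t      offset;  // Position of the payload inside the file buffer
    uint32_t    length;  // Payload size in bytes
};

// Assumption: the 4-byte length is stored little-endian.
static uint32_t ReadU32LE(const uint8_t* p)
{
    return uint32_t(p[0]) | (uint32_t(p[1]) << 8) | (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

// Walks the file chunk by chunk without interpreting any payload.
// Assumption: the length counts the payload only, not the 12-byte header.
std::vector<ChunkInfo> ListChunks(const std::vector<uint8_t>& file)
{
    std::vector<ChunkInfo> chunks;
    size_t pos = 0;
    while (file.size() - pos >= 12)
    {
        const char* raw = reinterpret_cast<const char*>(&file[pos]);
        size_t nameLength = 0;
        while (nameLength < 8 && raw[nameLength] != '\0' && raw[nameLength] != ' ')
            nameLength++;
        ChunkInfo info{ std::string(raw, nameLength), pos + 12, ReadU32LE(&file[pos + 8]) };
        if (info.length > file.size() - info.offset)
            break;                       // Truncated or corrupted chunk: stop here
        chunks.push_back(info);
        if (info.name == "End")
            break;                       // "End" tells us to stop processing
        pos = info.offset + info.length; // Skip the payload, known or not
    }
    return chunks;
}
```

Unknown chunks are simply listed and skipped, so a decoder only has to look at the names it understands.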
A basic ACF decoder that only extracts the images from the stream just needs to know the following chunks:
- "Format", which gives information about the file itself
- "Palette", containing the 256-color palette
- "KeyFrame", which encodes complete images
- "DltFrame", which encodes partial images
- "End", which tells you to stop processing the file
Versions and compatibility
Some of these chunks are platform specific, and the format was somewhat altered to become more efficient on the slightly underpowered PSX console.
One important thing to know is that the format evolved over time, but unfortunately there is no version field in the format [5], so some of the earlier demo releases and prototypes will result in corrupted output, or even crash the decoder because of out-of-bounds buffer accesses.
I validated the code I'm sharing here [6] on the ACF files available in the GOG PC version of Time Commando (released in July 1996), and I can definitely say it does crash on some earlier ACF files I had in my archives from earlier the same year (April and May), but I have no record of what the changes were (we did not use source control back in the day).
"Format" chunk
The Format chunk contains information about the size of the encoded images, the framerate, etc., including which compressor was used.
If all you want is to extract the images, you just need to access width and height!
uint32 struct_size 44 bytes
uint32 width 320 pixels
uint32 height 192, 200 or 240 pixels
uint32 frame_size 15000 bytes
uint32 key_size 64000 bytes
uint32 key_rate
uint32 play_rate 12 fps
uint32 sampling_rate
uint32 sample_type
uint32 sample_flags
uint32 compressor 0 for ACF, 1 for XCF
The differences between XCF and ACF are mainly statistical; internally the encoding is the same.
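The field list above maps naturally to a structure of eleven 32-bit values. The following is only a sketch of how the payload could be read: `AcfFormat` and `ReadFormat` are hypothetical names, and the values are assumed to be stored little-endian, which the text does not state.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Mirror of the field list above: eleven 32-bit values, 44 bytes in total.
struct AcfFormat
{
    uint32_t struct_size;
    uint32_t width;
    uint32_t height;
    uint32_t frame_size;
    uint32_t key_size;
    uint32_t key_rate;
    uint32_t play_rate;
    uint32_t sampling_rate;
    uint32_t sample_type;
    uint32_t sample_flags;
    uint32_t compressor;     // 0 for ACF, 1 for XCF
};
static_assert(sizeof(AcfFormat) == 44, "The Format structure is expected to be 44 bytes");

// Decodes the payload of a "Format" chunk field by field.
// Assumption: values are little-endian. Returns false if the payload is too
// short or if struct_size does not hold the expected 44.
bool ReadFormat(const uint8_t* payload, size_t length, AcfFormat& format)
{
    if (length < sizeof(AcfFormat))
        return false;
    uint32_t fields[11];
    for (int i = 0; i < 11; i++)
    {
        const uint8_t* p = payload + i * 4;
        fields[i] = uint32_t(p[0]) | (uint32_t(p[1]) << 8) | (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
    }
    std::memcpy(&format, fields, sizeof(format));
    return format.struct_size == sizeof(AcfFormat);
}
```

For plain image extraction only `width` and `height` are needed afterwards.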
"Palette" chunk
Nothing special, it's basically just 768 bytes: the standard VGA palette made of 256 Red, Green, Blue triplets.
uint8 red
uint8 green
uint8 blue
// And 255 more times of the same
"Frame" chunks
There are actually two of these ("KeyFrame" and "DltFrame"), encoded in exactly the same way:
uint32 offset_to_color_stream
uint8 tile_method[(width/8)*(height/8)*6/8]
uint8 aligned_stream[...] 32 bit aligned
uint8 unaligned_stream[...]
The two streams of data are separated for performance reasons. Some processors are not able to load unaligned data [7], and when they can, this comes with some form of performance penalty.
By having two separate streams, we were able to optimize the code by making sure some of the data would always be properly aligned on a multiple of 4, thus being optimal on all the architectures we supported.
A typical example would be to load a 32 bit mask from the aligned stream, and some pixel color data from the unaligned one.
"KeyFrame" chunks contain complete pictures, which can be practical to use if you want to jump forward and backward in the video stream, while "DltFrame" chunks require the previous picture content as a decoding source.
As you can see in this picture of the tunnel, some parts of the image do not change at all (the gray areas on the difference view), and since the tunnel is slowly zooming, most of the content of the n+1 frame is just made of rearranged elements from the previous frame.
This is one of the reasons why the background moves relatively slowly: If it moved faster, it would be much harder to compress efficiently.
When starting the decoding process, we need to acquire pointers to all these sections, and also two pointers to the previously decoded image.
The compressed images are split into small tiles of 8 by 8 pixels, and each of these tiles is encoded using one of 64 different combinations of methods.
These methods contain two phases, the decoding phase and the correction phase, and are defined in the opcode section.
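With 64 combinations, one method code fits in 6 bits per tile. The sketch below shows how such codes could be fetched. The packing order is an assumption (codes stored back to back, least significant bit first), and `TileCount`, `TileMethodBytes` and `ReadTileMethod` are hypothetical names, not taken from the original code.

```cpp
#include <cstddef>
#include <cstdint>

// Number of 8x8 tiles in a picture (width and height are multiples of 8).
constexpr size_t TileCount(size_t width, size_t height)
{
    return (width / 8) * (height / 8);
}

// Bytes needed to store one 6-bit code per tile.
constexpr size_t TileMethodBytes(size_t width, size_t height)
{
    return (TileCount(width, height) * 6 + 7) / 8;
}

// Returns the 6-bit method code of the given tile.
// Assumption: codes are packed back to back, least significant bit first.
uint32_t ReadTileMethod(const uint8_t* tile_method, size_t tileIndex)
{
    const size_t   bitPos = tileIndex * 6;
    const size_t   byte   = bitPos >> 3;
    const uint32_t shift  = uint32_t(bitPos & 7);
    uint32_t window = tile_method[byte];
    if (shift > 2)                                       // The code straddles two bytes
        window |= uint32_t(tile_method[byte + 1]) << 8;
    return (window >> shift) & 63;
}
```

Under these assumptions a 320x200 picture has 1000 tiles, so the method table takes 750 bytes.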
Tile Correction
I'm going to start with the correction part, because it's the simplest to explain.
At the core, any efficient compression method is destructive (lossy), which means the decoded version of the image is not identical to the original. The fewer differences there are, the better the quality, which is why we are using a post-decoding correction.
All this does is "touch up" a few pixels on the tile to reduce the loss of quality.
We have three update methods:
- Update4, which updates 4 of the 64 pixels in the current tile
- Update8, which updates 8 of the 64 pixels in the current tile
- Update16, which updates 16 of the 64 pixels in the current tile
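void Update4()
{
    // Four pixel positions packed as 6 bits each (3 bits for x, 3 bits for y):
    // 32 bits are loaded, but only 3 bytes are actually consumed
    uint32 value = *(uint32*)unaligned_stream;
    unaligned_stream += 3;
    for (int32 i = 0; i < 4; i++)
    {
        // The four replacement colors come from the aligned stream
        current_tile[(value & 7) + ((value >> 3) & 7) * m_Width] = *aligned_stream++;
        value >>= 6;
    }
}
void Update8()
{
    // Same as Update4, twice
    Update4();
    Update4();
}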
Update16 uses a single 64 bit mask to indicate which of the pixels are impacted, plus 16 bytes for the color of each of these pixels.
void Update16()
{
    for (int32 y = 0; y < 8; y++)
    {
        // One mask byte per row of the tile, bit 0 being the leftmost pixel
        uint8 mask = *aligned_stream++;
        for (int32 x = 0; x < 8; x++)
        {
            if (mask & 1)
            {
                current_tile[x + y * m_Width] = *unaligned_stream++;
            }
            mask >>= 1;
        }
    }
}
And that's all there is regarding post-decoding correction.
Tile Decoding
Ultimately there are 64 possible combinations of decoding and updating, but there are only 31 decoding methods, split in various categories:
- "pixel" vs "bank"
- "movie" vs "non movie"
Let's see how these are used.
Color encoding
If you've not read the article about the Time Commando palette, I suggest you do.
Basically in Time Commando, we have 256 colors, organized in 16 groups of 16 colors called "banks".
This is exploited by the ACF compressor: when it detects that all the colors of a specific tile fit in the same bank, it uses just 4 bits to indicate the color index instead of the full 8 bits.
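To make the idea concrete, here is a small sketch. It assumes that bank b covers the 16 consecutive palette entries from b*16 to b*16+15, which the text implies but does not spell out. `BankColor` and `FitsInOneBank` are hypothetical helper names.

```cpp
#include <cstdint>

// Rebuilds a full 8-bit palette index from a bank number and a 4-bit index.
// Assumption: bank b covers the palette entries [b*16, b*16+15].
constexpr uint8_t BankColor(uint8_t bank, uint8_t index)
{
    return uint8_t(((bank & 15) << 4) | (index & 15));
}

// Compressor-side check: true when the 64 pixels of an 8x8 tile all belong
// to the same bank, so that 4 bits per pixel are enough.
// 'stride' is the width in pixels of the picture the tile belongs to.
bool FitsInOneBank(const uint8_t* tile, int stride)
{
    const uint8_t bank = tile[0] >> 4;
    for (int y = 0; y < 8; y++)
    {
        for (int x = 0; x < 8; x++)
        {
            if ((tile[x + y * stride] >> 4) != bank)
                return false;
        }
    }
    return true;
}
```

For instance, with this layout bank 3 and index 5 give palette entry 0x35.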
The 'non movie' methods.
These only use data from the ACF file and do not require any data from the previous frame; as such they are the ones that can be used for KeyFrame images.
The first group contains the very simple ways of filling an 8 by 8 pixel tile.
It can't get simpler than these two examples: fetch one color and use it to colorize an area of the tile.
- SingleColorFillDecode (1 byte)
Color: 1 byte which is used to fill the 8x8 block
- FourColorFillDecode (4 bytes)
Color: 4x1 byte which are used to fill the four 4x4 blocks
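As an illustration, the two fill methods could look like the sketch below. This is not the original code: the function signatures are hypothetical, the colors are read from a single generic `stream` pointer, and the order of the four quadrants (top-left, top-right, bottom-left, bottom-right) is an assumption.

```cpp
#include <cstdint>

// Fills the whole 8x8 tile with one color read from the stream (1 byte).
// 'stride' is the width in pixels of the destination picture.
void SingleColorFill(uint8_t* tile, int stride, const uint8_t*& stream)
{
    const uint8_t color = *stream++;
    for (int y = 0; y < 8; y++)
    {
        for (int x = 0; x < 8; x++)
            tile[x + y * stride] = color;
    }
}

// Fills each 4x4 quadrant with its own color (4 bytes).
// Assumption: quadrants are stored top-left, top-right, bottom-left, bottom-right.
void FourColorFill(uint8_t* tile, int stride, const uint8_t*& stream)
{
    for (int quadrant = 0; quadrant < 4; quadrant++)
    {
        const uint8_t color = *stream++;
        uint8_t* dst = tile + (quadrant & 1) * 4 + (quadrant >> 1) * 4 * stride;
        for (int y = 0; y < 4; y++)
        {
            for (int x = 0; x < 4; x++)
                dst[x + y * stride] = color;
        }
    }
}
```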
The Raw decode follows the same idea, but instead of just 1 or 4 colors, we get the entire set of 64 pixel colors that we can just copy directly, with the "bank" variants optimized for the cases where we have a limited range of colors.
- RawTileDecode (64 bytes)
Data: The entire set of 8x8 pixels for this tile, to copy directly into the buffer.
- OneBankTileDecode (33 bytes)
Data: 32 bytes forming 8x8x4 bits which define the color in the bank (in the interval [0..15])
Color: 1 byte indicating the value of the color bank to use.
- TwoBanksTileDecode (41 bytes)
Data: 40 bytes forming 8x8x5 bits which define the color (in the interval [0..15]), and the bank number.
Color: 1 byte forming 2x4 bits which give the numbers of the 2 color banks to be used.
The bit decode methods are similar to bitmap encoding on retro computers, like Atari ST, where you have a fixed palette with a certain number of colors, and you read a specific number of bits to find the color index.
- OneBitTileDecode (10 bytes)
Data: 8 bytes forming 8x8x1 bits each designating the color to be used
Color: 2 bytes of color (index in the palette)
- TwoBitTileDecode (20 bytes)
Data: 16 bytes forming 8x8x2 bits each designating the color to be used
Color: 4 bytes of color (index in the palette)
- ThreeBitTileDecode (32 bytes)
Data: 24 bytes forming 8x8x3 bits each designating the color to be used
Color: 8 bytes of color (index in the palette)
- FourBitTileDecode (48 bytes)
Data: 32 bytes forming 8x8x4 bits each designating the color to be used
Color: 16 bytes of color (index in the palette)
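The simplest member of this family can be sketched as follows. The signature is hypothetical, and the bit layout is an assumption borrowed from Update16: one byte per row, with bit 0 being the leftmost pixel. The real files may pack the bits differently.

```cpp
#include <cstdint>

// One bit per pixel selects one of two palette indices.
// 'bits' points to the 8 data bytes, 'colors' to the 2 color bytes.
// Assumption: one byte per row, bit 0 is the leftmost pixel.
void OneBitTileDecode(uint8_t* tile, int stride, const uint8_t* bits, const uint8_t* colors)
{
    for (int y = 0; y < 8; y++)
    {
        uint8_t row = bits[y];
        for (int x = 0; x < 8; x++)
        {
            tile[x + y * stride] = colors[row & 1];
            row >>= 1;
        }
    }
}
```

The two, three and four bit variants follow the same pattern, reading wider fields to index a table of 4, 8 or 16 colors.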
These next three are similar, but work on each of the 4x4 quadrants of the tile.
- OneBitSplitTileDecode (16 bytes)
Data: 4x2 bytes forming 8x8x1 bits each designating the color to be used
Color: 4x2 bytes of color (index in the palette)
- TwoBitSplitTileDecode (32 bytes)
Data: 4x4 bytes forming 8x8x2 bits each designating the color to be used
Color: 4x4 bytes of color (index in the palette)
- ThreeBitSplitTileDecode (56 bytes)
Data: 4x6 bytes forming 8x8x3 bits each designating the color to be used
Color: 4x8 bytes of color (index in the palette)
- PrimeDecode (9+n bytes)
Data: 8 bytes forming 8x8x1 bits each designating whether the main color is used, or one from the stream.
Color: 1 byte of main color.
Color: n bytes (as many as there are bits at 1 in the first field) of color (index in the palette)
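A possible reading of PrimeDecode is sketched below. The signature is hypothetical and the bit layout is assumed to match Update16 (one mask byte per row, bit 0 for the leftmost pixel). A bit at 1 takes the next color from the stream, a bit at 0 uses the main color.

```cpp
#include <cstddef>
#include <cstdint>

// 'mask' points to the 8 mask bytes, 'colors' to the main color followed by
// one extra color per bit set in the mask.
// Returns the number of color bytes consumed (1 + n).
// Assumption: one mask byte per row, bit 0 is the leftmost pixel.
size_t PrimeDecode(uint8_t* tile, int stride, const uint8_t* mask, const uint8_t* colors)
{
    const uint8_t* start = colors;
    const uint8_t mainColor = *colors++;
    for (int y = 0; y < 8; y++)
    {
        uint8_t row = mask[y];
        for (int x = 0; x < 8; x++)
        {
            tile[x + y * stride] = (row & 1) ? *colors++ : mainColor;
            row >>= 1;
        }
    }
    return size_t(colors - start);
}
```

The size is variable, which is why the decoder has to report how many color bytes it consumed.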
The Block Decode methods are designed to help with areas with large gradients with slow changes of colors.
- BlockDecode (8+n bytes)
Exists in Horizontal, Vertical, Diagonal1 and Diagonal2 variants
Data: 8 bytes forming 8x8x1 bits each designating whether the current color is used, or the following one.
Color: n bytes (as many as there are bits at 1 in the previous field) of color (index in the palette)
- BlockBank1DecodeHorizontal (8+1+n/2 bytes)
Exists in Horizontal, Vertical, Diagonal1 and Diagonal2 variants
Data: 8 bytes forming 8x8x1 bits each designating whether the current color is used, or the following one.
Color: 1 byte of bank
Color: n/2 bytes (one 4-bit value for each bit at 1 in the first field) of color (index in the bank)
- CrossDecode (20 bytes)
Data: 4 mask bytes + 4x4 color bytes
The 'movie' methods.
These use data from the ACF file as well as the previously decoded information available in the previous frame, and as such can only be used for Delta Frame images.
The first three movie methods use absolute locations.
- ZeroMotionDecode (0 bytes)
Possibly the simplest encoding, used when nothing changed: basically just keep the 8x8 pixels of the current block unchanged (reuse the previous frame)
- Motion8Decode (2 bytes)
Data: 2 bytes forming a 16-bit unsigned offset relative to the start of the buffer, pointing to the 8x8 pixel area to copy in place of the current tile.
- Motion4Decode (8 bytes)
Data: Similar to Motion8Decode, but with a 16-bit offset for each of the four 4x4 blocks
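The absolute variant can be sketched like this. The names and signatures are hypothetical, the offset is assumed to be little-endian and expressed in pixels from the start of the previous frame, and no bounds checking is done.

```cpp
#include <cstdint>
#include <cstring>

// Copies a size x size block between two pictures sharing the same stride.
static void CopyBlock(uint8_t* dst, const uint8_t* src, int stride, int size)
{
    for (int y = 0; y < size; y++)
        std::memcpy(dst + y * stride, src + y * stride, size_t(size));
}

// Replaces the current tile with an 8x8 area of the previous frame.
// Assumption: the 16-bit offset is little-endian and counted in pixels
// from the first pixel of the previous frame.
void Motion8Decode(uint8_t* tile, const uint8_t* previousFrame, int stride, const uint8_t*& stream)
{
    const uint32_t offset = uint32_t(stream[0]) | (uint32_t(stream[1]) << 8);
    stream += 2;
    CopyBlock(tile, previousFrame + offset, stride, 8);
}
```

A 16-bit offset can address 65536 pixels, which is enough to reach any position in a 320x200 picture (64000 pixels).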
The final movie methods use relative positions.
- ShortMotion8Decode (1 byte)
Data: 4 bits containing the horizontal relative offset (from -8 to +7), 4 bits containing the vertical relative offset.
- ShortMotion4Decode (4 bytes)
Data: Similar to ShortMotion8Decode, but for each of the 4x4 blocks.
- ROMotion8Decode (2 bytes)
Data: A 16-bit signed relative offset pointing to the block to reuse.
- ROMotion4Decode (8 bytes)
Data: Similar to ROMotion8Decode, but for each of the 4x4 blocks.
- RCMotion8Decode (2 bytes)
Data: 1 byte giving the horizontal offset (from -128 to +127), and similarly one byte for the vertical offset.
- RCMotion4Decode (8 bytes)
Data: Similar to RCMotion8Decode, but for each of the 4x4 blocks.
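The shortest relative variant could be decoded as in the sketch below. The names are hypothetical, and two details are assumptions: the horizontal offset is taken from the low nibble and the vertical one from the high nibble, and both are two's complement values relative to the position of the current tile.

```cpp
#include <cstdint>
#include <cstring>

// Converts a 4-bit two's complement value to an int in the range [-8, +7].
constexpr int SignExtend4(uint32_t nibble)
{
    return int((nibble & 15u) ^ 8u) - 8;
}

// Copies an 8x8 area of the previous frame located a few pixels away from
// the current tile position (tileX, tileY), both given in pixels.
// Assumption: low nibble is the horizontal offset, high nibble the vertical one.
void ShortMotion8Decode(uint8_t* frame, const uint8_t* previousFrame, int stride,
                        int tileX, int tileY, const uint8_t*& stream)
{
    const uint8_t packed = *stream++;
    const int dx = SignExtend4(packed & 15u);
    const int dy = SignExtend4(packed >> 4);
    const uint8_t* src = previousFrame + (tileX + dx) + (tileY + dy) * stride;
    uint8_t* dst = frame + tileX + tileY * stride;
    for (int y = 0; y < 8; y++)
        std::memcpy(dst + y * stride, src + y * stride, 8);
}
```

The RC variants would work the same way with one full signed byte per axis instead of a nibble.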
And that's about it really.
The first pass runs the block decode, the second pass does the pixel correction.
ACF Extractor
This article comes with complete source code, rewritten by me.
It's obviously based on the original documentation and code, but it does not use any of the Adeline code, and only uses standard C++17 features.
So, here is the complete source code: ACF Extractor 1.0 (38 kilobytes)
It's only one single CPP file, which you should be able to compile with anything that supports C++17 [8].
There are no command line parameters, basically the idea is that you need to compile and hack around to see what it does by tracing the code.
There's a big #if at the end that selects between a batch mode that scans a bunch of folders and exports PCX images in a target folder, and a second mode that only converts one specific video.
And that's really all there is.
There is no support for the extra rooms (SAL_xxxx chunks): this may happen at some point, but I have to admit it has already taken way more time than I expected...
...also I've heard that some exciting things may be happening soon, so we will see how things go.
Feel free to ask questions or point out bugs. The code is probably not "production quality": I was aiming for understandability more than anything else.
Have fun!
1. For some reason, somebody decided that 'video streaming' was not exciting enough, and they came up with this acronym for the technology.
2. The source code thanks Michel Royer, who worked on Cyberia at Xatrix Entertainment.
3. Hey, it's me!
4. Interchange File Format
5. Which I systematically do these days: it does not cost much and makes things so much easier.
6. Jump to the end of the article if you are in a hurry.
7. Like loading a 16 bit value at an odd address, or a 32 bit value at an address that is not a multiple of four.
8. I only tested on Visual Studio 2019.