Commit Graph

14 Commits

Author SHA1 Message Date
Alexander Suvorov bec4114bea Add compression support for ETC2A textures
This change makes it possible to use Crunch algorithms for ETC textures with Alpha channel.

Explanation:

For simplicity, Crunch algorithms currently do not use ETC2 specific modes (T, H or P). For this reason, the currently used ETC2A compression format is technically equivalent to ETC1 + Alpha. Note that ETC2 encoding is a superset of ETC1, so any texture, which consists of ETC1 color blocks and ETC2 Alpha blocks, can be correctly decoded by an ETC2A (ETC2_RGBA8) decoder.

Compression scheme for ETC2 Alpha blocks is equivalent to the compression scheme for DXT5 Alpha blocks. ETC2 Alpha endpoint clusterization is performed based on the very same output of the Alpha palettizer which is used for DXT5 Alpha. The only part which is actually different is the Alpha endpoint optimization step.

In order to perform ETC2 Alpha encoding, we can first run the already existing algorithm for DXT5 Alpha endpoint optimization, in order to obtain the initial approximate solution. Then the approximate solution is refined based on the ETC2 Alpha modifier table. When performing raw ETC2A encoding, all the 16 ETC2 Alpha modifiers are used during optimization. However, when performing ETC2A quantization, for performance reasons, only 2 Alpha modifiers are currently used (modifier 13, which allows to perform precise approximation on short Alpha intervals, and modifier 11, which has more or less regularly distributed values, and is used for large Alpha intervals).

For compatibility reasons, ETC2 color compression wrappers have also been added to the code, though, as has been mentioned before, at the current moment ETC2 specific modes are not used, so ETC2 color compression is currently equivalent to ETC1 compression.

The ETC decoder functionality has been significantly extended, Crunch is now capable to decode ETC2 and ETC2A textures (input ETC2 textures can have T, H or P blocks).

In order to use ETC2A compression, use the -ETC2A command line option (i.e. "crunch_x64.exe -ETC2A input.png"). By default, compressed ETC2A textures will be decompressed into KTX file format.

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.880 sec
Modified: 1468204 bytes / 13.288 sec
Improvement: 7.21% (compression ratio) / 53.99% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.936 sec
Modified: 1914805 bytes / 18.044 sec
Improvement: 7.28% (compression ratio) / 51.15% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 17.361 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-08-04 16:56:10 +02:00
Alexander Suvorov f284523b15 Add compression support for ETC1 textures
Explanation:

Crunch algorithms are normally used for compression of DXTn textures. However, Crunch algorithms are much more powerful, and with some minor adjustments, those algorithms can be directly used to compress other texture formats. For example, the current commit demonstrates how to use the existing Crunch algorithms to compress ETC1 textures.

Basics:

In general, Crunch is performing the following steps:
- tiling (determines block encodings)
- quantization of the tile endpoints (determines endpoint indices)
- optimization of the endpoints for each tile group (determines endpoint dictionary)
- quantization of the selectors (determines selector indices)
- selector refinement for each selector group (determines selector dictionary)
- compression of the previously determined block encodings, dictionaries and indices

Dictionary element:

When applying Crunch algorithms to a new texture format, it is necessary to first define the dictionary element. In context of Crunch, this means thats the whole image consists of smaller non-overlapping blocks, while the contents of each individual block is determined by an endpoint and a selector from the corresponding dictionaries. For example, in case of DXT format, each endpoint and selector codebook element corresponds to a 4x4 pixel block. In general, the size of the blocks, which form the encoded image, depends on the texture format and quality considerations.

It is proposed to define the dictionaries according to the following limitations:
- The dictionary elements should be compatible with the existing Crunch algorithms, while the image blocks defined by those dictionary elements should be compatible with the texture encoding format.
- It should be possible to cover a wide range of image quality and bitrates by just changing the size of the endpoint and selector dictionaries. If there is no limitation on the dictionary size, the encoding should preferably become lossless or near-lossless (not considering the quality loss implied by the texture format itself).

In case of ETC1, the texture format itself determines the minimal size of the image block, defined by endpoint and selector: it can be either 2x4 or 4x2 rectangle, aligned to the borders of the 4x4 grid. It is not possible to use higher granularity, because each of those rectangles can have only one base color, according to the ETC1 format. For the same reason, any image block, defined by an endpoint and a selector from the dictionary, should be combined from those aligned 2x4 or 4x2 rectangles.

Let's investigate the possibilities for the endpoint dictionary. According to the ETC1 format, each 4x4 ETC1 block is split in half, while each ETC1 subblock has it's own base color and a modifier table index. In fact, the base color and the modifier table index simply define the high and the low colors for the subblock (while there are some limitations on the position of those high and low colors, implied by the ETC1 encoding). If we define the endpoint dictionary element in such a way that it contains information about more than one ETC1 base color, then such a dictionary will become incompatible with the existing tile quantization algorithm, and the reason for this is the following. The Crunch tiling algorithm first performs quantization of all the tile pixel colors, down to just 2 colors. Then it quantizes all those color pairs, coming from different tiles. This approach works quite well for 4x4 DXT blocks, as those 2 colors approximately represent the principle component of the tile pixel colors. In case of ETC1 however, mixing together pixels, which correspond to different base colors, does not make much sense, as each group of those pixels has it's own low and high color values, independent from other groups. When those pixels are mixed together, the information about the original principle components of each subblock gets lost.

For the mentioned reason, each endpoint dictionary element should correspond to a single ETC1 base color. In such case, the tile quantization algorithm will work almost the same way as for DXT format. Each pair of colors, generated by the tile palletizer, will normally have the subblock base color value somewhere in the middle between those 2 colors, so quantizing those color pairs should also automatically quantize the corresponding base colors. Moreover, each color pair implicitly contains information about the modifier table index (which corresponds to the distance between the high and the low colors), and therefore the corresponding table index will also get automatically quantized.

Endpoint and selector dictionary elements, which define a single 2x4 or 4x2 ETC1 subblock, are fully compatible with the existing Crunch algorithms (because each ETC1 subblock is associated with a single base color and a single modifier table index). At the same time, those subblocks are minimal possible blocks, which can be defined by a dictionary element for ETC1 format (as has been shown earlier). Of course, it is also possible to use blocks larger than 2x4 or 4x2 (assuming that all the ETC1 subblocks, which form such a block, will have the same base color and the same modifier table index), however, with a larger block area it would be not possible to achieve near-lossless quality when the dictionary size is not limited.

As the result, it is proposed to define the dictionaries in the following way:
- Each element of the endpoint dictionary defines a single base color and a single modifier table index of a 2x4 or a 4x2 pixel block (which represents an ETC1 subblock).
- Each endpoint is encoded as 3555 (3 bits for the table index and 5 bits for each component of the base color).
- Each element of the selector dictionary defines selectors for a 2x4 or a 4x2 block.
- Each selector is encoded using 16 bits.

ETC1-specific adjustments:

In case of DXT, the size of the encoded block is 4x4, while the tiling is performed in a 8x8 area (4 blocks). In case of ETC1, the tiling can be performed either in a 4x4 area (2 blocks), or in a 8x8 area (8 blocks), while other possibilities are either not symmetrical or too complex. For simplicity it is proposed to use 4x4 area for tiling. There are therefore 3 possible encodings: the 4x4 block is not split (encoded with a single endpoint), the 4x4 block is split horizontally, the 4x4 block is split vertically.

For simplicity, endpoint references are currently determined only within the tiling area, while the encoding of the endpoint references has been adjusted in the following way:
- The first ETC1 subblock will always have the reference value of 0
- The second ETC1 subblock can have the reference value of 0 if it has the same endpoint as the first subblock (note that in such case the flip of the ETC1 block does not need to be defined), the value of 1 if the corresponding ETC1 block is split horizontally, and the value of 2 if the corresponding ETC1 block is split vertically

According to the ETC1 format, the base colors within an ETC1 block can be encoded either as 444 and 444, or differentially as 555 and 333. For simplicity, this aspect is currently not taken into account (all the endpoints are encoded as 3555 in the codebook). If it appears that the base colors in the resulting ETC1 block can not be encoded differentially, the decoder will convert both base colors from 555 to 444.

At first, it might look like the ETC1 block flipping can bring some complications for Crunch, as the subblock structure might not look like a grid. This can be easily resolved by mirroring all the vertical ETC1 blocks across the main diagonal of the block after the tiling step (so that all the ETC1 subblocks will become 4x2 and form a regular grid). The decoder can mirror the ETC1 selector back according to the decoded block flip.

The code adjustments for the ETC1 compression support are pretty straightforward and mostly trivial. Just note that when format-specific adjustments affect performance critical code, it makes sense to duplicate the body of the affected function and perform format-specific optimizations in each copy of the function individually. For performance reasons, the following 4 functions now got both ETC and DTX specific versions:
- determine_tiles_task_etc() is an ETC-optimized version of the determine_tiles_task(), where dxt_fast class has been replaced with the etc1_optimizer class.
- determine_color_endpoint_codebook_task_etc() is an ETC-optimized version of the determine_color_endpoint_codebook_task(), where dxt1_endpoint_optimizer class has been replaced with the etc1_optimizer class.
- pack_color_endpoints_etc() is an ETC-optimized version of the pack_color_endpoints(), where 565565 DXT color endpoint encoding has been replaced with 3555 ETC color endpoint encoding.
- unpack_etc1() is an ETC version of the unpack_dxt1() function.

The color_quality_power_mul and m_adaptive_tile_color_psnr_derating parameters for ETC1 format have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the equivalent DXT1 compression, when compressing the Kodak test set without mipmaps using default quality.

In order to use ETC1 compression, use the -ETC1 command line option (i.e. "crunch_x64.exe -ETC1 input.png"). By default, compressed ETC1 textures will be decompressed into KTX file format.

DXT Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.876 sec
Modified: 1482780 bytes / 13.255 sec
Improvement: 6.28% (compression ratio) / 54.10% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.987 sec
Modified: 1931586 bytes / 18.068 sec
Improvement: 6.47% (compression ratio) / 51.15% (compression time)

ETC Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1887265 bytes
Total time: 14.954 sec
Average bitrate: 1.600 bpp
Average Luma PSNR: 34.049 dB
2017-07-05 18:19:23 +02:00
Alexander Suvorov d34192aa07 Split the header block from the crn_decomp.h into a separate crn_defs.h file. This change makes the used CRND_HEADER_FILE_ONLY macro unneccesary. 2017-04-26 13:16:13 +02:00
Alexander Suvorov 7c02055d05 Reformat the source files. The source files have been reformatted using: clang-format.exe -style="{BasedOnStyle: Google, AllowAllParametersOfDeclarationOnNextLine: false, AllowShortFunctionsOnASingleLine: Inline, AllowShortIfStatementsOnASingleLine: false, AllowShortLoopsOnASingleLine: false, ColumnLimit: 0, DerivePointerAlignment: false, SortIncludes: false}" 2017-04-26 11:41:07 +02:00
Alexander Suvorov 41d7b962b0 Update solution to use Visual C++ 2010 compiler and libraries. When compiled with Visual Studio 2010, the code will produce the same results as the originally distributed Crunch binaries. 2017-04-26 10:59:07 +02:00
Rich Geldreich 0aea5beeb2 Fixing copyright 2016-06-16 20:08:40 -07:00
Rich Geldreich e647680ef9 Updating github URL 2016-06-15 23:21:39 -07:00
Rich Geldreich d2a3948ab9 Updating license/copyright/email contact info 2016-06-15 22:59:25 -07:00
richgel99@gmail.com bd19626e58 fixed codeblocks win32 project compiler options 2012-11-25 08:47:02 +00:00
richgel99@gmail.com f71b49be60 Initial checkin of v1.04 - KTX file format support, basic ETC1 compression/decompression, Linux makefile with proper gcc options, lots of high-level improvements to get crnlib into a state where I can more easily add additional formats. 2012-11-25 08:41:25 +00:00
richgel99@gmail.com a8011e9d7f Adding new files 2012-04-26 07:59:22 +00:00
richgel99@gmail.com f63e26aee6 v1.03 prerelease - Full Linux port of crnlib/crunch, in progress - still more testing to do, and some cmd line options (such as -timestamp) don't work under linux yet, but the core stuff (compression/decompression/transcoding) should work fine and performance under Linux is comparable to Windows. The 3 examples haven't been ported yet. 2012-04-26 07:14:21 +00:00
richgel99@gmail.com 45901c935c - Fixing DDS reader so it doesn't require the source pitch/linear size field to be divisible by 4
- Fixing DDS writer so it writes a non-zero pitch/linearsize field (a few DDS readers in the wild require this field to be set)
- Fixing example2.cpp and example3.cpp so they write non-zero pitch/linearsize fields.
- Misc merges from the ddsexport branch
- Adding -usesourceformat command line option, useful when reprocessing a lot of existing DDS files.
2012-04-16 00:58:38 +00:00
richgel99@gmail.com 9f98ea7e22 2011-12-27 21:18:07 +00:00