Migrating wiki contents from Google Code
## Contents ##

* [Introduction](API_Docs#Introduction.md)
* [Public API Overview](API_Docs#Public_API_Overview.md)
* [Public Enums](API_Docs#Public_Enums.md)
* [Public Structs](API_Docs#Public_Structs.md)
* [Public Functions](API_Docs#Public_Functions.md)
* [Memory Allocation](API_Docs#Memory_Allocation.md)
* [Compression](API_Docs#Compression.md)
* [Transcoding](API_Docs#Transcoding.md)
* [Decompression](API_Docs#Decompression.md)
* [DXTn Block Compression](API_Docs#DXTn_Block_Compression.md)
* [Helper Functions](API_Docs#Helper_Functions.md)

---

## Introduction ##

crnlib is a C++ library designed to be statically linked into the calling application. It can compress to .CRN, regular .DDS, or clustered .DDS files. It can also transcode .CRN to .DDS, and unpack .DDS files to individual 24/32-bit images. For completeness, crnlib's high-quality DXTn block compressor is also accessible.

The library does not use C++ exceptions, but it does use some C++ features such as templates, virtual functions, and inheritance. It also makes heavy use of heap allocation. Due to porting, exception, and inconsistent performance issues (especially in debug builds) crnlib mostly uses custom containers instead of STL.

The VC9 (Visual Studio 2008) .LIB files are built here:

```
lib\VC9\release\win32\crnlib_vc9.lib
lib\VC9\release\win64\crnlib_x64_vc9.lib
lib\VC9\release_dll\win32\crnlib_DLL_vc9.lib
lib\VC9\release_dll\win64\crnlib_DLL_x64_vc9.lib
```

crnlib should also build with VC10 (Visual Studio 2010), and Codeblocks 10.05 using TDM-GCC, but the majority of my testing has been with VC9.

Currently crnlib is Win32 only, but it already compiles with GCC so a Linux/BSD/Mac port shouldn't be too difficult. (The threading related code is the biggest blocker to porting.) crnlib itself has only been tested on PCs, but [crn\_decomp.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crn_decomp.h) (the stand-alone transcoder header file library) should work fine on consoles.

An Xbox 360 specific version of crn\_decomp.h is available that can transcode .CRN textures into X360 tiled textures located in cached or write combined memory at only a ~10% slowdown. Please email me if you're interested (it's bitrotted a bit since the public release).

---

## Public API Overview ##

There are two header files of interest, both under the [inc](http://code.google.com/p/crunch/source/browse/trunk/#trunk%2Finc) directory. crnlib exposes a simple high level, C-style function based API, which is defined in the single public header file [inc/crnlib.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crnlib.h).

The second public header file, [inc/crn\_decomp.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crn_decomp.h), contains all the functionality needed to transcode .CRN files to raw DXTn bits. It does not depend on crnlib in any way, although crnlib internally uses `crn_decomp.h` itself to transcode, examine, and validate .CRN files.

Each crnlib API falls into one of the following categories:

* **Memory management**:
  * `crn_set_memory_callbacks()`
  * `crn_free_block()`

* **Image or texture compression** from memory to a .CRN or .DDS file in memory:
  * `crn_compress()`

* **Texture decompression** from a .CRN or .DDS file in memory to memory:
  * `crn_decompress_crn_to_dds()`
  * `crn_decompress_dds_to_images()`
  * `crn_free_all_images()`

* **Plain DXTn block compression** of 4x4 pixel blocks to DXTn compressed blocks:
  * `crn_create_block_compressor()`
  * `crn_compress_block()`
  * `crn_free_block_compressor()`

* **Misc. helpers**:
  * **crn\_format info**:
    * `crn_get_format_fourcc()`
    * `crn_get_format_bits_per_texel()`
    * `crn_get_bytes_per_dxt_block()`
    * `crn_get_fundamental_dxt_format()`
  * **crn\_format to/from ANSI and UTF16 string**:
    * `crn_get_file_type_exta()`
    * `crn_get_file_type_ext()`
    * `crn_get_format_stringa()`
    * `crn_get_format_string()`
    * `crn_get_dxt_quality_stringa()`
    * `crn_get_dxt_quality_string()`

Several custom types and parameter structs are also defined in [inc/crnlib.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crnlib.h). The most important structs are:
* [crn\_comp\_params](API_Docs#enum_crn_comp_params.md), which contains all the parameters passed to the compression function `crn_compress()`
* `struct crn_mipmap_params`, which contains a bunch of parameters that control crnlib's optional mipmap generator. (This struct is not yet documented here.)

---

## Public Enums ##

### enum crn\_file\_type ###
```
enum crn_file_type
{
  cCRNFileTypeCRN = 0,
  cCRNFileTypeDDS,
};
```

`crn_file_type` contains the supported file types. crnlib only supports DX9-style .DDS files.

`cCRNFileTypeCRN`: .CRN file format

`cCRNFileTypeDDS`: .DDS file format

### enum crn\_format ###
```
enum crn_format
{
  cCRNFmtInvalid = -1,

  cCRNFmtDXT1 = 0,

  cCRNFmtFirstValid = cCRNFmtDXT1,

  // cCRNFmtDXT3 is not currently supported when writing to CRN - only DDS.
  cCRNFmtDXT3,

  cCRNFmtDXT5,

  // Various DXT5 derivatives
  cCRNFmtDXT5_CCxY, // Luma-chroma
  cCRNFmtDXT5_xGxR, // Swizzled 2-component
  cCRNFmtDXT5_xGBR, // Swizzled 3-component
  cCRNFmtDXT5_AGBR, // Swizzled 4-component

  // ATI 3DC and X360 DXN
  cCRNFmtDXN_XY,
  cCRNFmtDXN_YX,

  // DXT5 alpha blocks only
  cCRNFmtDXT5A,

  cCRNFmtTotal,
};
```

The `crn_format` enum contains the supported compressed pixel formats. It lists all the standard DX9 compressed pixel formats (BC1-BC5), with some swizzled DXT5 formats (most of them supported by ATI's Compressonator).

### enum crn\_limits ###
```
enum crn_limits
{
  cCRNMaxLevelResolution = 4096,

  cCRNMinPaletteSize = 8,
  cCRNMaxPaletteSize = 8192,

  cCRNMaxFaces = 6,
  cCRNMaxLevels = 16,

  cCRNMaxHelperThreads = 16,

  cCRNMinQualityLevel = 0,
  cCRNMaxQualityLevel = 255
};
```

The `crn_limits` enum lists various library limits. Notably, the max supported texture resolution is currently 4096x4096 (this can be easily increased in the x64 version).

### enum crn\_comp\_flags ###
```
enum crn_comp_flags
{
  cCRNCompFlagPerceptual = 1,
  cCRNCompFlagHierarchical = 2,
  cCRNCompFlagQuick = 4,
  cCRNCompFlagUseBothBlockTypes = 8,
  cCRNCompFlagUseTransparentIndicesForBlack = 16,
  cCRNCompFlagDisableEndpointCaching = 32,
  cCRNCompFlagManualPaletteSizes = 64,
  cCRNCompFlagDXT1AForTransparency = 128,
  cCRNCompFlagGrayscaleSampling = 256,
  cCRNCompFlagDebugging = 0x80000000,
};
```

The `crn_comp_flags` enum contains a number of compression related flags:

`cCRNCompFlagPerceptual`: Default: Enabled. If enabled, perceptual colorspace distance metrics are enabled. **Important**: Be sure to **disable** this flag when compressing non-sRGB colorspace images, like normal maps!

`cCRNCompFlagHierarchical`: Default: Enabled. If enabled, 4x4, 4x8, 8x4, and 8x8 tiles may be used in each macroblock. If disabled, all macroblocks are forced to use four 4x4 pixel tiles. Compression ratio will be lower when disabled, and transcoding will be a bit slower, but this will reduce macroblock tiling artifacts.

`cCRNCompFlagQuick`: Default: Disabled. If enabled, this flag disables several output file optimizations. Intended for things like quicker previews.

`cCRNCompFlagUseBothBlockTypes`: Default: Enabled. This flag controls which block types are used when compressing to .DDS. (This flag is not relevant when compressing to .CRN, which only uses a subset of the possible DXTn block types.)

> DXT1: OK to use DXT1A (3 color) alpha blocks if doing so results in lower RGB error, or for transparent pixels.

> DXT5: OK to use both DXT5 block types.

`cCRNCompFlagUseTransparentIndicesForBlack`: Default: Disabled. If enabled, it's OK to use DXT1A transparent indices to encode full black colors (assumes pixel shader ignores fetched alpha). (Not relevant when compressing to .CRN files, because it never uses alpha blocks.)

`cCRNCompFlagDisableEndpointCaching`: Default: Disabled. When set, this flag disables endpoint caching, for deterministic output. Only relevant when compressing to .DDS.

`cCRNCompFlagManualPaletteSizes`: Default: Disabled. If enabled, use the cCRNColorEndpointPaletteSize, etc. params to control the CRN palette sizes. Only relevant when compressing to .CRN.

`cCRNCompFlagDXT1AForTransparency`: Default: Disabled. If enabled, DXT1A alpha blocks are used to encode single bit transparency. Only relevant when compressing to .DDS; .CRN does not support DXT1A alpha blocks.

`cCRNCompFlagGrayscaleSampling`: Default: Disabled. If enabled, the DXT1 compressor's color distance metric assumes the pixel shader will be converting the fetched RGB results to luma (Y part of YCbCr).

This increases quality when compressing grayscale images, because the compressor can spread the luma error among all three channels (i.e. it can generate blocks with some chroma present if doing so will ultimately lead to lower luma error). Of course, only enable on grayscale source images.

`cCRNCompFlagDebugging`: Default: Disabled. If enabled, the frontend and backend gather and dump various statistics during the compression process. Only used for development/debugging purposes.

### enum crn\_dxt\_quality ###
```
enum crn_dxt_quality
{
  cCRNDXTQualitySuperFast,
  cCRNDXTQualityFast,
  cCRNDXTQualityNormal,
  cCRNDXTQualityBetter,
  cCRNDXTQualityUber,
};
```

The `crn_dxt_quality` enum lists the various quality modes supported by the endpoint optimizers. This enum is only relevant when compressing to .DDS. `cCRNDXTQualityUber` is slower, but it has the best PSNR.

### enum crn\_dxt\_compressor\_type ###
```
enum crn_dxt_compressor_type
{
  cCRNDXTCompressorCRN,
  cCRNDXTCompressorCRNF,
  cCRNDXTCompressorRYG
};
```

This enum lists the DXTn block compressors supported by the library. This enum is only relevant when compressing to non-clustered .DDS files.

`cCRNDXTCompressorCRN`: crnlib's default endpoint optimizer.

`cCRNDXTCompressorCRNF`: A faster version of the default optimizer.

`cCRNDXTCompressorRYG`: RYG's public domain endpoint optimizer.

---

## Public Structs ##

### enum crn\_comp\_params ###
```
typedef crn_bool (*crn_progress_callback_func)(crn_uint32 phase_index, crn_uint32 total_phases,
  crn_uint32 subphase_index, crn_uint32 total_subphases, void* pUser_data_ptr);

struct crn_comp_params
{
  inline crn_comp_params();

  inline void clear();

  inline bool check() const;

  inline bool get_flag(crn_comp_flags flag) const;
  inline void set_flag(crn_comp_flags flag, bool val);

  crn_uint32 m_size_of_obj;

  crn_file_type m_file_type;

  crn_uint32 m_faces;
  crn_uint32 m_width;
  crn_uint32 m_height;
  crn_uint32 m_levels;

  crn_format m_format;

  crn_uint32 m_flags;

  const crn_uint32* m_pImages[cCRNMaxFaces][cCRNMaxLevels];

  float m_target_bitrate;

  crn_uint32 m_quality_level;

  crn_uint32 m_dxt1a_alpha_threshold;
  crn_dxt_quality m_dxt_quality;
  crn_dxt_compressor_type m_dxt_compressor_type;

  crn_uint32 m_alpha_component;

  float m_crn_adaptive_tile_color_psnr_derating;
  float m_crn_adaptive_tile_alpha_psnr_derating;

  crn_uint32 m_crn_color_endpoint_palette_size;
  crn_uint32 m_crn_color_selector_palette_size;

  crn_uint32 m_crn_alpha_endpoint_palette_size;
  crn_uint32 m_crn_alpha_selector_palette_size;

  crn_uint32 m_num_helper_threads;

  crn_uint32 m_userdata0;
  crn_uint32 m_userdata1;

  crn_progress_callback_func m_pProgress_func;
  void* m_pProgress_func_data;
};
```

The `crn_comp_params` struct contains all parameters passed to the compressor. The caller must fill in this struct before calling `crn_compress()`. Note that some parameters/flags are relevant only when compressing to .CRN, clustered .DDS, or regular .DDS (I've tried to document all dependencies).

This struct contains several simple inline methods defined in this header. The constructor calls `clear()`, and the `clear()` method sets all parameters to their defaults. The `check()` method returns true if all parameters are within reasonable/supported ranges. The `get_flag()` and `set_flag()` helpers directly manipulate the `m_flags` member.
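
To make the flag handling concrete, here is a minimal stand-alone sketch of what manipulating `m_flags` amounts to. It does not include `crnlib.h`: the `comp_flags` enum and the free functions `get_flag()`, `set_flag()`, and `default_flags()` are hypothetical local stand-ins that only copy the numeric flag values and the default listed in this document.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for three crn_comp_flags values documented above
// (hypothetical names, so this compiles without crnlib.h).
enum comp_flags : std::uint32_t
{
  kFlagPerceptual        = 1,
  kFlagHierarchical      = 2,
  kFlagUseBothBlockTypes = 8
};

// Returns true if the given flag bit is set.
inline bool get_flag(std::uint32_t flags, comp_flags flag)
{
  return (flags & flag) != 0;
}

// Returns a copy of flags with the given bit set (val == true) or cleared.
inline std::uint32_t set_flag(std::uint32_t flags, comp_flags flag, bool val)
{
  return val ? (flags | flag) : (flags & ~static_cast<std::uint32_t>(flag));
}

// The documented default: Perceptual | Hierarchical | UseBothBlockTypes.
inline std::uint32_t default_flags()
{
  return kFlagPerceptual | kFlagHierarchical | kFlagUseBothBlockTypes;
}
```

With the real struct, the equivalent of clearing the perceptual flag for a normal map would presumably be `params.set_flag(cCRNCompFlagPerceptual, false)`.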

**crn\_file\_type m\_file\_type**: Default: cCRNFileTypeCRN. Output file type. May be `cCRNFileTypeCRN` or `cCRNFileTypeDDS`.

**crn\_uint32 m\_faces**: Default: 1. Set to 1 to compress 2D textures, or 6 to compress cubemaps.

**crn\_uint32 m\_width** and **crn\_uint32 m\_height**: Default: (0,0). The source texture's topmost (largest) mipmap dimensions in pixels. Must be in the range [1, cCRNMaxLevelResolution], non-power of 2 is OK, non-square OK. Textures that don't have dimensions divisible by 4 will be padded to the next multiple of 4.
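
As a small illustration of the padding rule, the arithmetic for rounding a dimension up to the next multiple of 4 can be sketched as follows. The helper names are hypothetical and this is not crnlib code; it only shows the arithmetic, not how crnlib fills the padded pixels.

```cpp
#include <cassert>
#include <cstdint>

// Rounds a dimension up to the next multiple of 4 (DXTn blocks are 4x4 pixels).
inline std::uint32_t round_up_to_4(std::uint32_t x)
{
  return (x + 3u) & ~3u;
}

// Number of 4-pixel-wide blocks needed to cover x pixels.
inline std::uint32_t blocks_across(std::uint32_t x)
{
  return (x + 3u) / 4u;
}
```

For example, a 10x6 texture is handled as 12x8 pixels, i.e. 3x2 blocks.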

**crn\_uint32 m\_levels**: Default: 1. The source texture's total mipmap chain size, where 1 is not mipmapped. Must be in the range [1, cCRNMaxLevels].
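
If you supply your own mipmaps, you need to know how many levels a full chain has and how large each level is. The sketch below assumes the usual Direct3D convention (each level halves the previous one, rounding down, clamped to 1); the helper names are hypothetical, and I'm not claiming crnlib's optional mipmap generator uses exactly this rule.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of levels in a full mip chain that ends at 1x1.
inline std::uint32_t full_mip_chain_length(std::uint32_t width, std::uint32_t height)
{
  std::uint32_t levels = 1;
  std::uint32_t largest = std::max(width, height);
  while (largest > 1)
  {
    largest >>= 1;
    ++levels;
  }
  return levels;
}

// Size of one dimension at the given mip level (level 0 is the top level).
// Assumes level < 32.
inline std::uint32_t mip_dimension(std::uint32_t top_dimension, std::uint32_t level)
{
  return std::max<std::uint32_t>(1u, top_dimension >> level);
}
```

A 4096x4096 texture has 13 levels under this convention, which fits within `cCRNMaxLevels` (16).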

**crn\_format m\_format**: Default: cCRNFmtDXT1. Sets the output file's compressed pixel format.

**crn\_uint32 m\_flags**: Default: `cCRNCompFlagPerceptual` | `cCRNCompFlagHierarchical` | `cCRNCompFlagUseBothBlockTypes`. Compressor flags logically OR'd together, see the [crn\_comp\_flags enum](API_Docs#enum_crn_comp_flags.md).

**const crn\_uint32`*` m\_pImages`[`cCRNMaxFaces`]``[`cCRNMaxLevels`]`**: Default: All NULL. 2D array of pointers to 32bpp RGBA input images. The red component is always first in memory, independent of platform endianness.
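
Because red must be the first byte in memory regardless of endianness, building pixels with shifts (`r | g << 8 | ...`) is only correct on little-endian machines. A portable way to build source pixels is to copy the bytes in order, as in this sketch; `pack_rgba()` and `make_solid_image()` are hypothetical helpers, not part of crnlib.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Packs one pixel so that R is the first byte in memory, then G, B, A,
// independent of the host's endianness.
inline std::uint32_t pack_rgba(std::uint8_t r, std::uint8_t g, std::uint8_t b, std::uint8_t a)
{
  const std::uint8_t bytes[4] = { r, g, b, a };
  std::uint32_t pixel;
  std::memcpy(&pixel, bytes, sizeof(pixel));
  return pixel;
}

// Builds a width*height image filled with a single color, in raster order.
inline std::vector<std::uint32_t> make_solid_image(std::uint32_t width, std::uint32_t height,
                                                   std::uint8_t r, std::uint8_t g,
                                                   std::uint8_t b, std::uint8_t a)
{
  return std::vector<std::uint32_t>(static_cast<std::size_t>(width) * height,
                                    pack_rgba(r, g, b, a));
}
```

The pointer to such a buffer is the kind of value that would go in `m_pImages[face][level]`.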

**float m\_target\_bitrate**: Default: 0. Target bitrate. If non-zero, the compressor will use an interpolative search to find the highest quality level that results in a file length that is <= the target bitrate. If it fails to find a bitrate high enough, the compressor will disable adaptive block sizes (by disabling the `cCRNCompFlagHierarchical` flag) and try again. This process can be pretty slow.
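
To give a feel for what such a search involves, here is a generic interpolation search over the quality range [0, 255]. It is an illustrative sketch only, not crnlib's actual implementation: `find_quality_for_bitrate()` is a hypothetical name, and `model_bitrate()` is a made-up monotonic stand-in for "compress at this quality level and measure the resulting bitrate", which in reality is the expensive step.

```cpp
#include <cassert>

typedef float (*bitrate_func)(int quality_level);

// Made-up stand-in for a real trial compression: bitrate grows with quality.
inline float model_bitrate(int quality_level)
{
  return quality_level / 64.0f;
}

// Returns the highest quality level whose bitrate is <= target_bitrate,
// or -1 if even the lowest level exceeds the target.
// Assumes bitrate_at() is non-decreasing in the quality level.
inline int find_quality_for_bitrate(float target_bitrate, bitrate_func bitrate_at,
                                    int min_q = 0, int max_q = 255)
{
  int lo = min_q, hi = max_q;
  float f_lo = bitrate_at(lo);
  if (f_lo > target_bitrate) return -1;
  float f_hi = bitrate_at(hi);
  if (f_hi <= target_bitrate) return hi;

  // Invariant: bitrate(lo) <= target < bitrate(hi).
  while (hi - lo > 1)
  {
    // Guess where the target falls by linear interpolation between the ends.
    const float t = (target_bitrate - f_lo) / (f_hi - f_lo);
    int mid = lo + static_cast<int>(t * (hi - lo));
    if (mid <= lo) mid = lo + 1;   // always make progress
    if (mid >= hi) mid = hi - 1;

    const float f_mid = bitrate_at(mid);
    if (f_mid <= target_bitrate) { lo = mid; f_lo = f_mid; }
    else                         { hi = mid; f_hi = f_mid; }
  }
  return lo;
}
```

Each loop iteration corresponds to one full trial compression, which is why targeting a bitrate is so much slower than compressing at a fixed quality level.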

**crn\_uint32 m\_quality\_level**: Default: cCRNMaxQualityLevel (255). Sets the desired quality level (higher=better). Must range between [cCRNMinQualityLevel, cCRNMaxQualityLevel]. Note that .CRN and .DDS quality levels are not compatible with each other from an image quality standpoint.

m\_quality\_level directly controls the endpoint/selector palette sizes used by the .CRN/clustered .DDS frontends.

**crn\_uint32 m\_dxt1a\_alpha\_threshold**, **crn\_dxt\_quality m\_dxt\_quality**, **crn\_dxt\_compressor\_type m\_dxt\_compressor\_type**: These parameters are only relevant when compressing to .DDS files.

**crn\_uint32 m\_alpha\_component**: Default: 3. Specifies which source image component contains the alpha channel.

**crn\_uint32 m\_num\_helper\_threads**: Number of helper threads to create to assist the compressor. 0=no threading. Must be in the range [0,cCRNMaxHelperThreads].

**crn\_uint32 m\_userdata0**, **crn\_uint32 m\_userdata1**: Default: 0. These two 32-bit values are written directly to the header of the output .CRN file. They can be retrieved from a .CRN file by using the `crnd::crnd_get_texture_info()` helper function in `inc/crn_decomp.h`.

**crn\_progress\_callback\_func m\_pProgress\_func**, **void`*` m\_pProgress\_func\_data**: Pointer to a user-provided progress function and user data. This function is called periodically during compression, and can be used to terminate compression before it completes.
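
A progress callback could look roughly like the sketch below. Several things here are assumptions rather than documented facts: the local `crn_uint32`/`crn_bool` typedefs (both assumed to be `unsigned int`) stand in for the real ones in `crnlib.h`, the phase and subphase indices are assumed to be zero-based, and returning false (0) is assumed to be what cancels compression. Check the comments in `crnlib.h` before relying on any of this. `progress_state`, `overall_percent()`, and `my_progress()` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Assumed stand-ins for the typedefs in crnlib.h (so this compiles on its own).
typedef unsigned int crn_uint32;
typedef unsigned int crn_bool;
typedef crn_bool (*crn_progress_callback_func)(crn_uint32 phase_index, crn_uint32 total_phases,
  crn_uint32 subphase_index, crn_uint32 total_subphases, void* pUser_data_ptr);

// Hypothetical user data shared between the application and the callback.
struct progress_state
{
  float m_percent;          // last reported overall progress, 0..100
  bool  m_cancel_requested; // set by the application to stop compression
};

// Combines phase and subphase counters into one overall percentage,
// assuming zero-based indices.
inline float overall_percent(crn_uint32 phase_index, crn_uint32 total_phases,
                             crn_uint32 subphase_index, crn_uint32 total_subphases)
{
  if (total_phases == 0) return 0.0f;
  const float sub = total_subphases ? static_cast<float>(subphase_index) / total_subphases : 0.0f;
  return 100.0f * (phase_index + sub) / total_phases;
}

// Matches crn_progress_callback_func. Assumption: returning 0 (false) cancels.
static crn_bool my_progress(crn_uint32 phase_index, crn_uint32 total_phases,
                            crn_uint32 subphase_index, crn_uint32 total_subphases,
                            void* pUser_data_ptr)
{
  progress_state* pState = static_cast<progress_state*>(pUser_data_ptr);
  pState->m_percent = overall_percent(phase_index, total_phases, subphase_index, total_subphases);
  return pState->m_cancel_requested ? 0u : 1u;
}
```

The callback and a pointer to the state object would be assigned to `m_pProgress_func` and `m_pProgress_func_data` respectively. If helper threads are enabled, it is safest to assume the callback may run on a different thread than the one that called `crn_compress()`.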

Various low-level .CRN specific parameters:

**float m\_crn\_adaptive\_tile\_color\_psnr\_derating** and **float m\_crn\_adaptive\_tile\_alpha\_psnr\_derating**: Default: 2.0f PSNR. Controls how aggressively the frontend uses large (non-4x4) tiles. Higher settings result in fewer tiles, resulting in lower quality/more blockiness, but smaller files. If this value is set too high the output may become too blocky.

**crn\_uint32 m\_crn\_color\_endpoint\_palette\_size**, **crn\_uint32 m\_crn\_color\_selector\_palette\_size**, **crn\_uint32 m\_crn\_alpha\_endpoint\_palette\_size**, **crn\_uint32 m\_crn\_alpha\_selector\_palette\_size**: Default: 0. These parameters allow the caller to directly control the palette sizes used by the frontend. The `cCRNCompFlagManualPaletteSizes` flag must be set.

---

## Public Functions ##

### Memory Allocation ###
```
#define CRNLIB_MIN_ALLOC_ALIGNMENT sizeof(size_t) * 2

typedef void* (*crn_realloc_func)(void* p, size_t size, size_t* pActual_size, bool movable, void* pUser_data);
typedef size_t (*crn_msize_func)(void* p, void* pUser_data);

void crn_set_memory_callbacks(crn_realloc_func pRealloc, crn_msize_func pMSize, void* pUser_data);

void crn_free_block(void *pBlock);
```

By default, crnlib calls the usual C APIs to manage memory (`malloc`, `realloc`, `free`, etc.). Call `crn_set_memory_callbacks()` to globally override this behavior. The user must implement two callbacks, one to handle block allocation/reallocation/freeing, and another that returns the size of allocated blocks.

`crn_set_memory_callbacks()` is not thread safe, so don't call it while another thread is inside the library.

The custom realloc and msize functions must be implemented in a thread safe manner. These functions can be called from multiple threads when threaded compression is enabled.

All block pointers returned by the realloc callback must be aligned to at least `CRNLIB_MIN_ALLOC_ALIGNMENT` bytes.

### realloc callback ###

The custom reallocation function callback `crn_realloc_func` must examine its input parameters to determine the caller's actual intent. If the input pointer `p` is NULL, the caller wants to allocate a block which must be at least as large as `size`. NULL is returned if the allocation fails.

If `p` is not NULL but `size` is 0, the caller wants to free the block pointed to by `p`.

Otherwise, the caller wants to attempt to change the size of the block pointed to by `p`. In this case, if `movable` is true, it is acceptable to physically move the block to satisfy the reallocation request. If `movable` is false, the block **must not** be moved. NULL is returned if reallocation fails for any reason. In this case, the original allocated block must remain allocated.

If `pActual_size` is not NULL, `*pActual_size` should be set to the actual size of the returned block.
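
The rules above can be sketched as a pair of callbacks built on the standard C allocator. This is one possible implementation under stated assumptions, not crnlib's own default allocator: `my_realloc()`, `my_msize()`, `block_header`, and `kMinAllocAlignment` are hypothetical names; the block size is stored in a small header in front of each block; and `malloc` is assumed to return memory aligned to at least `sizeof(size_t) * 2` bytes (true on common desktop platforms). What to do for `p == NULL` with `size == 0`, and what to write to `*pActual_size` on failure, are not specified in this document, so the choices below are guesses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Same value as CRNLIB_MIN_ALLOC_ALIGNMENT in the listing above.
const std::size_t kMinAllocAlignment = sizeof(std::size_t) * 2;

// Hypothetical header placed in front of every block. Its size equals the
// required alignment, so the user pointer stays aligned.
struct block_header
{
  std::size_t m_size;     // usable size requested by the caller
  std::size_t m_reserved; // padding up to kMinAllocAlignment
};
static_assert(sizeof(block_header) == kMinAllocAlignment, "header must preserve alignment");

static block_header* header_of(void* p)
{
  return static_cast<block_header*>(p) - 1;
}

static void* my_realloc(void* p, std::size_t size, std::size_t* pActual_size, bool movable, void*)
{
  if (!p)
  {
    // Allocate a new block of at least `size` bytes.
    block_header* h = static_cast<block_header*>(std::malloc(sizeof(block_header) + size));
    if (!h) return NULL;
    h->m_size = size;
    if (pActual_size) *pActual_size = size;
    return h + 1;
  }

  if (size == 0)
  {
    // Free the block.
    std::free(header_of(p));
    if (pActual_size) *pActual_size = 0;
    return NULL;
  }

  block_header* h = header_of(p);
  if (!movable)
  {
    // Plain realloc() may move the block, so in-place growth is refused.
    // The original block stays allocated, as required.
    if (size > h->m_size) return NULL;
    // A shrink request is satisfied by keeping the (larger) block as-is.
    if (pActual_size) *pActual_size = h->m_size;
    return p;
  }

  // Movable resize: realloc() leaves the original block intact on failure.
  block_header* n = static_cast<block_header*>(std::realloc(h, sizeof(block_header) + size));
  if (!n) return NULL;
  n->m_size = size;
  if (pActual_size) *pActual_size = size;
  return n + 1;
}

static std::size_t my_msize(void* p, void*)
{
  return p ? header_of(p)->m_size : 0;
}
```

These would be installed once at startup, before any other thread uses the library, with something like `crn_set_memory_callbacks(my_realloc, my_msize, NULL)`. The sketch is thread safe only to the extent that the underlying `malloc`/`realloc`/`free` are, which is the case for standard multithreaded C runtimes.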

### crn\_free\_block function ###

Call this function to free the memory blocks allocated and returned by `crn_compress()`, `crn_decompress_crn_to_dds()`, or `crn_decompress_dds_to_images()`.

## Compression ##

### crn\_compress functions (overloaded) ###
```
void *crn_compress(const crn_comp_params &comp_params,
  crn_uint32 &compressed_size, crn_uint32 *pActual_quality_level = NULL, float *pActual_bitrate = NULL);

void *crn_compress(const crn_comp_params &comp_params, const crn_mipmap_params &mip_params,
  crn_uint32 &compressed_size, crn_uint32 *pActual_quality_level = NULL, float *pActual_bitrate = NULL);
```

These functions compress a 32-bit/pixel texture to either: a regular DX9-style .DDS file, a "clustered" (or reduced entropy) .DDS file, or a .CRN file in memory.

This function is overloaded. The first variant cannot automatically generate mipmap levels, and the second one can.

Input parameters:

* **comp\_params** is the [compression parameters struct](API_Docs#enum_crn_comp_params.md).

* **compressed\_size** will be set to the size of the returned memory block containing the output file. The returned block must be freed by calling `crn_free_block()`.

* **`*`pActual\_quality\_level** will be set to the actual quality level used to compress the image. May be NULL.

* **`*`pActual\_bitrate** will be set to the output file's effective bitrate, possibly taking into account LZMA compression. May be NULL.

Return values:
> A pointer to the compressed file data, or NULL on failure. The returned block must be freed by calling `crn_free_block()`. The **compressed\_size** parameter will be set to the size of the returned memory buffer.

Notes:
* A "regular" .DDS file is compressed using normal (plain block by block) DXTn compression at the specified DXT quality level, using multiple threads if threading is enabled.
* A "clustered" DDS file is compressed using clustered DXTn compression to either the target bitrate or the specified integer quality factor.
  * The output file is a standard DX9 format DDS file, except the compressor assumes you will be later losslessly compressing the DDS output file using the LZMA algorithm.
* A texture is defined as an array of 1 or 6 "faces" (6 faces=cubemap), where each "face" consists of between [1,cCRNMaxLevels] mipmap levels.
* Mipmap levels are simple 32-bit 2D images with a pitch of width\*sizeof(uint32), arranged in the usual raster order (top scanline first). Each pixel is arranged in memory as [R,G,B,A], where R is always first independent of platform endianness.
* The image pixels may be grayscale (YYYX), grayscale/alpha (YYYA), 24-bit RGBX, or 32-bit RGBA colors (where "X"=don't care).
* If the input is not sRGB, be sure to clear the `cCRNCompFlagPerceptual` flag in the [crn\_comp\_params](API_Docs#enum_crn_comp_params.md) struct.

For a usage example, see [example1.cpp](http://code.google.com/p/crunch/source/browse/trunk/example1/example1.cpp).

## Transcoding ##

### crn\_decompress\_crn\_to\_dds function ###
```
void *crn_decompress_crn_to_dds(const void *pCRN_file_data, crn_uint32 &file_size);
```

`crn_decompress_crn_to_dds()` transcodes an entire .CRN file to .DDS using the [inc/crn\_decomp.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crn_decomp.h) header file library to do most of the heavy lifting. The output .DDS file's format is guaranteed to be one of the DXTn formats in the `crn_format` enum. This is a very fast operation, because the .CRN format is explicitly designed to be efficiently transcodable to DXTn.

For more control over decompression (particularly over memory management, and to implement palette caching), see the lower-level helper functions in [inc/crn\_decomp.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crn_decomp.h), which do not depend at all on crnlib.

For a usage example, see [example1.cpp](http://code.google.com/p/crunch/source/browse/trunk/example1/example1.cpp).

## Decompression ##

### crn\_decompress\_dds\_to\_images function ###
```
struct crn_texture_desc
{
  crn_uint32 m_faces;
  crn_uint32 m_width;
  crn_uint32 m_height;
  crn_uint32 m_levels;
  crn_uint32 m_fmt_fourcc; // Same as crnlib::pixel_format
};

bool crn_decompress_dds_to_images(const void *pDDS_file_data, crn_uint32 dds_file_size,
  crn_uint32 **ppImages, crn_texture_desc &tex_desc);

void crn_free_all_images(crn_uint32 **ppImages, const crn_texture_desc &desc);
```

`crn_decompress_dds_to_images()` decompresses an entire .DDS file in any supported compressed/uncompressed pixel format to one or more uncompressed 32-bit/pixel images. See the crnlib::pixel\_format enum in [inc/dds\_defs.h](http://code.google.com/p/crunch/source/browse/trunk/inc/dds_defs.h) for a list of the supported .DDS pixel formats.

The caller is responsible for freeing each returned image, either by calling `crn_free_all_images()` or by manually calling `crn_free_block()` on each image pointer.

For a usage example, see [example1.cpp](http://code.google.com/p/crunch/source/browse/trunk/example1/example1.cpp).

## DXTn Block Compression ##

```
typedef void *crn_block_compressor_context_t;

crn_block_compressor_context_t crn_create_block_compressor(const crn_comp_params &params);

void crn_compress_block(crn_block_compressor_context_t pContext, const crn_uint32 *pPixels, void *pDst_block);

void crn_free_block_compressor(crn_block_compressor_context_t pContext);
```

These functions allow the caller to compress 4x4 pixel image blocks to any non-swizzled DXTn format supported by crnlib: DXT1, DXT3, DXT5, DXT5A, DXN\_XY and DXN\_YX (basically BC1-BC5). For a usage example, see [example3.cpp](http://code.google.com/p/crunch/source/browse/trunk/example3/example3.cpp).

Unlike most other DXTn block compressors (such as ATI\_Compress or squish) crnlib's is stateful, so for efficient usage you should call `crn_create_block_compressor()` to create a state object and reuse it as many times as possible. (If you're curious, the state consists of an endpoint cache, and a bunch of heap memory used by the compressor for temporary arrays.) Don't call `crn_create_block_compressor()` once for each block to compress, or performance will be dreadful.

crnlib's DXTn endpoint optimizer actually supports any number of source pixels (i.e. from 1 to thousands, not just 16), but for simplicity this API currently only supports 4x4 texel blocks.
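
Since the API consumes 4x4 blocks, the caller has to cut the source image into blocks first. The following sketch gathers the 16 pixels of one block from a raster-order 32-bit image. `extract_block_4x4()` is a hypothetical helper, not part of crnlib. Two choices are assumptions on my part: pixels outside the image are filled by repeating the nearest edge pixel (one common policy for images whose size is not a multiple of 4), and the 16 output pixels are laid out in raster order, which is presumably what `pPixels` expects (see example3.cpp to confirm).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copies the 4x4 block at block coordinates (block_x, block_y) into out_pixels,
// in raster order. Requires width >= 1 and height >= 1. Coordinates past the
// right/bottom edge are clamped, which repeats the edge pixels.
inline void extract_block_4x4(const std::uint32_t* pImage, std::uint32_t width, std::uint32_t height,
                              std::uint32_t block_x, std::uint32_t block_y,
                              std::uint32_t out_pixels[16])
{
  for (std::uint32_t y = 0; y < 4; ++y)
  {
    const std::uint32_t src_y = std::min(block_y * 4 + y, height - 1);
    for (std::uint32_t x = 0; x < 4; ++x)
    {
      const std::uint32_t src_x = std::min(block_x * 4 + x, width - 1);
      out_pixels[y * 4 + x] = pImage[static_cast<std::size_t>(src_y) * width + src_x];
    }
  }
}
```

In a real loop, each extracted block would be passed to `crn_compress_block()` together with a context created once up front, and the compressed block written to the output in block raster order.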

`crn_compress_block()` is thread safe (it may be called in parallel from multiple threads), as long as each thread uses its own state context.

## Helper Functions ##

crnlib exposes a number of straightforward functions to convert the crn-related enums defined above to ANSI/Unicode strings and back. There are also functions to retrieve various bits of info about the supported pixel formats.

They don't seem worth listing individually here; see [inc/crnlib.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crnlib.h).

# Building #

## Windows ##

`crn.2008.sln` builds crnlib and the command line tool, and `crn_examples.2008.sln` builds the examples. Both are Visual Studio 2008 (VC9) solution files containing projects for Win32 and x64.

crnlib and crunch have also been built with VS2005, VS2010, and gcc 4.5.0 ([TDM GCC+MinGW](http://tdm-gcc.tdragon.net/)). A Codeblocks 10.05 workspace is also included (but building crnlib this way hasn't been tested a whole lot - it mostly exists to make porting to Linux using gcc a little easier).

## Linux ##

A simple makefile to build only the crunch executable is in crnlib/Makefile. I've only built/tested under 32-bit Ubuntu 12.04; however, 64-bit should be easy to get working with minimal tweaks. Alternatively, you can use the Codeblocks v10.05 Linux workspace in "crn\_linux.workspace".

**Important**: When compiling with gcc, be sure to use **-fno-strict-aliasing**, otherwise crnlib will randomly misbehave. This also applies to the transcoder library in [inc/crn\_decomp.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crn_decomp.h).

## [example1](http://code.google.com/p/crunch/source/browse/trunk/example1/example1.cpp) ##

Demonstrates how to use crnlib's high-level C-helper compression/decompression/transcoding functions in [inc/crnlib.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crnlib.h). It's a fairly complete example of crnlib's functionality.

## [example2](http://code.google.com/p/crunch/source/browse/trunk/example2/example2.cpp) ##

Shows how to transcode .CRN files to .DDS using **only** the functionality in [inc/crn\_decomp.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crn_decomp.h). It does not link against crnlib.lib or depend on it in any way. (Note: The complete source code, approx. 4800 lines, to the CRN transcoder is included in [inc/crn\_decomp.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crn_decomp.h).)

example2 is intended to show how simple it is to integrate CRN textures into your application.

## [example3](http://code.google.com/p/crunch/source/browse/trunk/example3/example3.cpp) ##

Shows how to use the regular, low-level DXTn block compressor functions in [inc/crnlib.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crnlib.h). This functionality is included for completeness. (Your engine or toolchain most likely already has its own DXTn compressor. crnlib's compressor is competitive with most available closed and open source CPU-based compressors.)

# Known Issues/Bugs #
|
||||
|
||||
* .DDS files written by crunch v1.00 (and output by crnlib v1.00) don't have the pitch/linearsize fields set in the DDS header. Some DDS readers expect valid values here (and expect the DDSD\_LINEARSIZE flag to be set too). This should be fixed in v1.01.
|
||||
|
||||
* You really should provide crnlib with raw, 24/32-bit source textures. Don't provide it with second generation textures that have already been DXTn or JPEG compressed. Of course, you can do so, and I know of one company doing this to repackage existing assets (without source art) so they download more quickly, but obviously don't expect the highest quality.
|
||||
|
||||
> crnlib's custom DXT1 endpoint optimizer can detect pixel blocks which have been previously compressed to DXT1 using another DXTn compression library. It attempts to derive the endpoints originally used to compress these blocks in order to reduce artifacts, but it's not always successful.
|
||||
|
||||
* crnlib currently assumes you'll be further losslessly compressing its output .DDS files using LZMA. However, some engines use weaker codecs such as LZO, zlib, etc., so crnlib's bitrate measurements will be inaccurate. It should be easy to allow the caller to plug-in custom lossless compressors for bitrate measurement.
|
||||
|
||||
* Compressing to a desired bitrate can be very (to extremely) time consuming, especially when processing large (2k or 4k) images to the .CRN format. There are several high-level optimizations employed when compressing to clustered DXTn .DDS files using multiple trials, but not so for .CRN.
|
||||
|
||||
> The current approach compresses the input image multiple times, using an [interpolation search](http://en.wikipedia.org/wiki/Interpolation_search) to find the quality level index that gets closest to the target bitrate. The lib does have some functionality to save the closest quality level found for later runs, but the command line tool doesn't expose this feature yet.
|
||||
|
||||
* The .CRN compressor doesn't use 3 color (transparent) DXT1 blocks at all, only 4 color blocks. (Supporting both block types would be a major pain at this point.) So it doesn't support DXT1A transparency, and its output quality suffers a little due to this limitation. (Note that the clustered DXTn compressor does not have this limitation.)
|
||||
|
||||
* DXT3 is not supported when writing .CRN or clustered DXTn DDS files. (DXT3 is supported by crnlib when compressing to regular DXTn DDS files.) You'll get DXT5 files if you request DXT3. However, DXT3 is supported by the regular DXTn block compressor.
|
||||
|
||||
* The DXT5\_CCxY format uses a simple YCoCg encoding that seems workable but hasn't been tuned for maximum quality yet.
|
||||
|
||||
* Ignore the SSIM statistics printed when using the -imagestats option - they're currently bogus. I've been tuning the codec using PSNR/RMSE so far.
|
||||
|
||||
* The crn\_decomp.h header file library is freaking huge (~4800 lines). It would be nice to port it to C and shrink it.
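
The multi-trial bitrate search mentioned in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not crnlib's actual code: `trial_bitrate` is a hypothetical callback standing in for one full compression pass, the bitrate is assumed to be non-decreasing in the quality level, and plain bisection is used here in place of the interpolation search the library uses.

```cpp
#include <cmath>
#include <functional>

// Result of the search: the chosen quality level, the bitrate it produced,
// and how many (expensive) compression trials were needed.
struct search_result {
    int quality;
    float bitrate;
    int trials;
};

// Finds the quality level in [0, 255] whose bitrate is closest to `target`.
// ASSUMPTION: `trial_bitrate` is a hypothetical stand-in for one compression
// pass and is non-decreasing in the quality level.
search_result find_quality_for_bitrate(
    const std::function<float(int)>& trial_bitrate, float target) {
    int lo = 0, hi = 255;
    search_result best{0, trial_bitrate(0), 1};
    while (lo <= hi) {
        const int mid = lo + (hi - lo) / 2;
        const float rate = trial_bitrate(mid);
        ++best.trials;
        // Remember the trial that landed closest to the target so far.
        if (std::fabs(rate - target) < std::fabs(best.bitrate - target)) {
            best.quality = mid;
            best.bitrate = rate;
        }
        if (rate < target) lo = mid + 1; else hi = mid - 1;
    }
    return best;
}
```

Each call to `trial_bitrate` corresponds to compressing the whole image once, which is why the number of trials dominates the run time on large images, and why caching the closest quality level between runs is attractive.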
|
||||
+19
@@ -0,0 +1,19 @@
|
||||
# crunch/crnlib's license #
|
||||
|
||||
crnlib uses the (very permissive) open source ZLIB license:
|
||||
|
||||
[http://opensource.org/licenses/Zlib](http://opensource.org/licenses/Zlib)
|
||||
|
||||
License text from [crnlib.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crnlib.h):
|
||||
|
||||
Copyright (c) 2010-2012 Rich Geldreich and Tenacious Software LLC
|
||||
|
||||
This software is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software.
|
||||
|
||||
Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:
|
||||
|
||||
1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.
|
||||
|
||||
2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
|
||||
|
||||
3. This notice may not be removed or altered from any source distribution.
|
||||
+244
@@ -0,0 +1,244 @@
|
||||
**crunch** is an open source ([ZLIB license](http://www.opensource.org/licenses/Zlib)) lossy texture compression library and command line compression tool for developers who distribute and use content in the [DXT1/5/N](http://en.wikipedia.org/wiki/S3_Texture_Compression) or [3DC/BC5](http://en.wikipedia.org/wiki/3Dc) compressed [mipmapped](http://en.wikipedia.org/wiki/Mipmap) GPU texture formats. It consists of a command line tool named "crunch", a compression library named "crnlib", and a single-header, completely stand-alone .CRN->DXTc transcoder C++ class located in [inc/crn\_decomp.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crn_decomp.h). crnlib's results are competitive with transform-based recompression approaches, as shown [here](http://code.google.com/p/crunch/wiki/Stats).
|
||||
|
||||
If you're going to SIGGRAPH this year, Brandon Jones is going to be showing crunch at the WebGL BoF (Birds of a Feather) event: [Brandon Jones: Crunch/DXT/Rage demo](http://www.khronos.org/news/events/siggraph-los-angeles-2012). Wednesday, August 8, 4-5pm, JW Marriott Los Angeles at LA Live, Gold Ballroom – Salon 3
|
||||
|
||||
For background info/history of crunch: [blog post](http://richg42.blogspot.com/2012/07/doug-has-updated-his-blog-hes-now.html) (or the original post [here](http://richg42.blogspot.com/2012/07/the-saga-of-crunch.html)).
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Technical Summary ##
|
||||
|
||||
crnlib can compress mipmapped 2D textures and cubemaps to approximately 0.8-1.25 bits/texel, and normal maps to 1.75-2 bits/texel with reasonable quality (comparable to or better than JPEG followed by real-time or even offline DXTc compression). The actual bitrate is indirectly controllable using an integer quality factor (like JPEG), or directly by specifying a target bitrate. crnlib implements a form of "clustered DXTn" compression, which is ultimately bounded by the quality achievable by DXTn itself. (DXTn's quality is actually [pretty low](http://cbloomrants.blogspot.com/2008/11/11-18-08-dxtc-part-2.html), but it's directly supported by Direct3D and OpenGL, and in hardware by practically every PC/console GPU.)
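
To put these bitrates in perspective, here is a small arithmetic sketch. The helper names are made up for illustration, and the calculation ignores file headers and the padding of tiny mip levels up to whole 4x4 blocks. It relies only on the fact that plain DXT1 stores 16 texels in 8 bytes, i.e. 4 bits/texel.

```cpp
#include <cstdint>

// Total number of texels in a full mipmap chain, down to 1x1.
// Returns 0 for degenerate (zero-sized) inputs.
std::uint64_t mip_chain_texels(std::uint32_t width, std::uint32_t height) {
    if (width == 0 || height == 0) return 0;
    std::uint64_t total = 0;
    for (;;) {
        total += static_cast<std::uint64_t>(width) * height;
        if (width == 1 && height == 1) break;
        if (width > 1) width /= 2;
        if (height > 1) height /= 2;
    }
    return total;
}

// Approximate payload size in bytes at a given average bitrate.
double size_in_bytes(std::uint64_t texels, double bits_per_texel) {
    return static_cast<double>(texels) * bits_per_texel / 8.0;
}
```

For a 1024x1024 texture with a full mip chain (1,398,101 texels), this gives roughly 699 KB for uncompressed DXT1 at 4 bits/texel versus roughly 175 KB at 1.0 bits/texel.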
|
||||
|
||||
The approach used by crnlib differs significantly from [other approaches](http://www.intel.com/jp/software/pix/324337_324337.pdf), such as using JPEG decompression followed by compression using a real-time DXTn compressor. Its compressed texture data format was carefully designed to be quickly transcodable directly to DXTn with no intermediate recompression step.
|
||||
The single threaded transcode to DXTn rate is approximately 100 (DXT5/3DC) and 250 (DXT5A/DXT1) megatexels/sec on a 2.6 GHz Core i7. Fast random access to individual mipmap levels is supported. No pixel-level operations are performed during transcoding. The core transcode loops operate at the 4x4 block or 8x8 macroblock level.
|
||||
|
||||
crnlib can also generate standard .DDS files that, when losslessly post-compressed using LZMA/Deflate/LZO/etc., result in much smaller compressed files. (This is effectively a form of [rate-distortion optimization](http://en.wikipedia.org/wiki/Rate%E2%80%93distortion_optimization) applied to DXT+LZMA.) This feature allows easy integration into any engine or graphics library that already supports .DDS files and applies some form of lossless post-compression to the DXTn bits stored in those files (most engines do). Here's a Windows app that demonstrates this capability: [DDSExport](http://sites.google.com/site/richgel99/ddsexport).
|
||||
|
||||
The .CRN file format supports BC1-BC5, corresponding to the following DXTn texture formats: DXT1 (but not DXT1A), DXT5, DXT5A, and DXN/ATI\_3DC (either XY or YX component order).
|
||||
|
||||
The library also supports several popular swizzled variants, typically used for normal maps (several are supported by [AMD's Compressonator](http://developer.amd.com/tools/compressonator/pages/default.aspx)):
|
||||
DXT5\_xGBR, DXT5\_xGxR, DXT5\_AGBR, and DXT5\_CCxY (experimental luma-chroma YCoCg).
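
For readers unfamiliar with YCoCg, the textbook transform looks like the sketch below. This is the standard floating-point formulation only; how crnlib scales, quantizes, and assigns the three components to the DXT5 channels is not described here, so treat the struct and function names as illustrative assumptions.

```cpp
struct rgb   { float r, g, b; };
struct ycocg { float y, co, cg; };

// Standard RGB -> YCoCg: Y is a luma-like weighted sum, Co and Cg are
// orange and green chroma differences.
ycocg rgb_to_ycocg(const rgb& c) {
    return { 0.25f * c.r + 0.5f * c.g + 0.25f * c.b,
             0.5f * c.r - 0.5f * c.b,
            -0.25f * c.r + 0.5f * c.g - 0.25f * c.b };
}

// Exact inverse of the transform above; it needs only adds and subtracts,
// which is what makes YCoCg cheap to undo in a pixel shader.
rgb ycocg_to_rgb(const ycocg& c) {
    const float tmp = c.y - c.cg;
    return { tmp + c.co, c.y + c.cg, tmp - c.co };
}
```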
|
||||
|
||||
crnlib currently compiles under Linux (using gcc, currently only x86 but x64 support should be easy), and Windows (both x86 and x64) using Visual Studio 2008/2010. It also compiles and has been minimally tested with Codeblocks 10.05 using [TDM-GCC x64/MinGW](http://tdm-gcc.tdragon.net/) under Windows.
|
||||
|
||||
crnlib also contains some other possibly useful bits of code, like a multithreaded version of my [image resampler](http://code.google.com/p/imageresampler/) class, and my fast symbol\_codec class.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Upcoming Release ##
|
||||
|
||||
Planned features for v1.05 as of 11/25/12.
|
||||
|
||||
* Continue experimenting with PVRTC: Implement a PVRTC decompressor, then a basic compressor.
|
||||
* Add support for "raw" CRN files, assuming the user will post compress using gzip (useful for Javascript/WebGL apps). This is my highest priority after releasing v1.04.
|
||||
* Now that miniz is in the project, add support for DXTc+ZLIB rate distortion optimization, instead of just DXTc+LZMA.
|
||||
* Figure out the most elegant way to add support for writing 555, 565, and 4444 .DDS/.KTX textures (useful for mobile).
|
||||
* Compile with LLVM
|
||||
* Compile and test under 64-bit Linux (I only have 32-bit installed right now). Improve makefile.
|
||||
* Add rate distortion optimization for .KTX files
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Release History ##
|
||||
|
||||
* v1.04 (SVN trunk) - Nov. 25, 2012: Currently only checked into SVN trunk:
|
||||
* Added "-fno-strict-aliasing" gcc compiler option, otherwise crnlib randomly crashes in weird spots.
|
||||
* Fixed various DDS reader problems.
|
||||
* Better Linux support. Added makefile with proper command line options, see crnlib/Makefile (thanks alonzakai). Modified gcc compiler options used by Codeblocks Linux projects.
|
||||
* Basic ETC1 support - vanilla 4x4 block packing/unpacking. (More or less complete - see [rg\_etc1](http://code.google.com/p/rg-etc1/).) No support for rate distortion optimized or .CRN ETC1 files, though, just vanilla block by block ETC1.
|
||||
* .KTX file format reading/writing. The .KTX file format is not well supported by any tools yet - I've tested crnlib's KTX writer as best as I can with what's available.
|
||||
* Low-level support for reading/writing/flipping/unflipping Y flipped textures in all possible formats (useful to OpenGL/OpenGL ES devs) (more or less complete).
|
||||
* Integrated miniz and jpeg-compressor into crnlib so crunch can write PNG's and read progressive JPEG's without adding messy external dependencies (completed).
|
||||
* Fixed assertion problems in crn\_threading's "task\_pool" class.
|
||||
|
||||
* v1.03 (SVN tags/v103) - Apr. 26, 2012: Currently only checked in to SVN trunk until I finish the Linux port and fully regression test the codec. If you would like to give the Linux port a spin, you can download prebuilt binaries of v1.03 for 32-bit Linux/Win32/x86 [here](http://www.tenacioussoftware.com/crunch_v103_prerelease_win_linux_execs.7z) (or just build them yourself using Codeblocks v10.05).
|
||||
* v1.02 - Apr. 22, 2012: Full Linux port of crnlib and crunch for Evan Parker to test at Google. Lots of files modified: Got rid of all wchar\_t usage (wasn't worth the effort to port), now using LZHAM's more cross platform multithreading/threadpool code, added platform independent file and directory I/O wrappers. Also, I optimized the task\_pool "join" method a bit (it now uses a semaphore instead of spinning with sleep(1) while waiting for the workers to finish).
|
||||
* v1.01 - Apr. 15, 2012: DDS reader/writer fixes, added the -usesourceformat command line option, merged over a few minor fixes from the ddsexport branch. Thanks to the devs at [The Happy Cloud](http://www.thehappycloud.com/) for reporting the DDS header problem.
|
||||
* v1.00 - Dec. 27, 2011: Initial release
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Applications Using crunch ##
|
||||
|
||||
Jean Sabatier reports that the [Fly! Legacy](http://fly.simvol.org/indexus.php) open source flight simulator is using a database of ~30,000 DXT5 .CRN textures as part of its texture streaming system.
|
||||
|
||||
[Planetside 2](http://www.planetside2.com/) is using crunch's rate distortion optimized (or "clustered") DXTc compressor and the [LZHAM lossless codec](http://code.google.com/p/lzham/) for most of its texture assets, which greatly reduces the title's download time.
|
||||
|
||||
[Evan Parker](http://plus.google.com/104261567553968048744) at Google has compiled the CRN->DXTc transcoder header file library [inc/crn\_decomp.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crn_decomp.h) to Javascript using [Emscripten](https://github.com/kripken/emscripten/wiki) (here's his [post](http://plus.google.com/104261567553968048744/posts/28jEPHtuhq5) with details). This allows him to quickly transcode CRN compressed images/textures directly to DXTc in Javascript (which is ~10x faster than decoding JPG's and packing to DXTc). Here's a [demo](http://www-cs-students.stanford.edu/~eparker/files/crunch/decode_test.html) (needs the latest Chrome beta with DXT texture support to fully function), and here's more [technical info](http://www-cs-students.stanford.edu/~eparker/files/crunch/more_info.html).
|
||||
|
||||
Brandon Jones has tested the Javascript emscripten port of crn\_decomp.h and reported his results [here](http://plus.google.com/101501294230020638079/posts/KJ42NGorLTj).
|
||||
|
||||
I believe the first shipping product to use crunch/crnlib compressed textures is ["Zombie Track Meat"](http://toucharcade.com/2012/03/10/gdc-2012-a-look-at-the-zombie-track-meat-collaboration/), a free-to-play NaCl game in the Chrome App Store [here](http://chrome.google.com/webstore/detail/jmfhnfnjfdoplkgbkmibfkdjolnemfdk). More technical info [here](http://fuzzycube.blogspot.com/2012/04/zombie-track-meat-post-mortem.html). ZTM uses .CRN compressed textures and the CRN->DXTc real-time transcoder library.
|
||||
|
||||
If you use crunch/crnlib, I would greatly appreciate it if you sent me an email with any feedback, or info on how you're using it in practice. (Credits somewhere would also be much appreciated, but are not required.)
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Documents ##
|
||||
|
||||
* [Various RMSE statistics (charts, graphs, etc.), and an example image (kodim14) compressed at various bitrates](http://code.google.com/p/crunch/wiki/Stats)
|
||||
* [Building the examples](http://code.google.com/p/crunch/wiki/Building)
|
||||
* [crnlib API Documentation](http://code.google.com/p/crunch/wiki/API_Docs)
|
||||
* [Supported file formats](http://code.google.com/p/crunch/wiki/SupportedFormats)
|
||||
* [Known problems](http://code.google.com/p/crunch/wiki/KnownProblems)
|
||||
* [Technical details](http://code.google.com/p/crunch/wiki/TechnicalDetails) is a high level description of the CRN data format, the CRN->DXTn transcoding process, and how the current compressor works.
|
||||
* Here's an external website showing the quality achievable with an early version of the lib (called hx/hxc at the time) at various palette (quality) settings: [Kodak test images](http://www.tenacioussoftware.com/hx/kodak/)
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Recommended Software ##
|
||||
|
||||
[AMD's Compressonator](http://developer.amd.com/gpu/compressonator/pages/default.aspx) tool is recommended to view the .DDS files created by the crunch tool and the included example projects.
|
||||
|
||||
Note: Some of the funky swizzled DXTn .DDS output formats (such as DXT5\_xGBR)
|
||||
read/written by the crunch tool or examples deviate from the DX9 DDS
|
||||
standard, so DXSDK tools such as DXTEX.EXE won't load them at all or
|
||||
they won't be properly displayed. AMD's tool can view these files.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Creating Compressed Textures from the Command Line (crunch.exe) ##
|
||||
|
||||
The simplest way to create compressed textures using crnlib is to
|
||||
integrate the bin\crunch.exe (or bin\crunch\_x64.exe) command line tool
|
||||
into your texture build toolchain or export process. It can write DXTn
|
||||
compressed 2D/cubemap textures to regular DXTn compressed .DDS,
|
||||
clustered (or reduced entropy) DXTn compressed .DDS, or .CRN files. It
|
||||
can also transcode or decompress files to several standard image
|
||||
formats, such as TGA or BMP. Run crunch.exe with no options for help.
|
||||
|
||||
The .CRN files created by crunch.exe can be efficiently transcoded to
|
||||
DXTn using the stand-alone CRN transcoding header file library located in `inc/crn_decomp.h`.
|
||||
|
||||
Here are a few example crunch.exe command lines:
|
||||
|
||||
1. Compress blah.tga to blah.dds using normal DXT1 compression:
|
||||
|
||||
`crunch -file blah.tga -fileformat dds -dxt1`
|
||||
|
||||
2. Compress blah.tga to blah.dds using clustered DXT1 at an effective bitrate of 1.5 bits/texel (after the .DDS file is post-compressed using LZMA), display image statistics:
|
||||
|
||||
`crunch -file blah.tga -fileformat dds -dxt1 -bitrate 1.5 -imagestats`
|
||||
|
||||
3. Compress blah.tga to blah.dds using clustered DXT1 at quality level 100 (from [0,255]), with no mipmaps, display LZMA statistics:
|
||||
|
||||
`crunch -file blah.tga -fileformat dds -dxt1 -quality 100 -mipmode none -lzmastats`
|
||||
|
||||
4. Compress blah.tga to blah.crn using clustered DXT1 at a bitrate of 1.2 bits/texel, no mipmaps:

`crunch -file blah.tga -dxt1 -bitrate 1.2 -mipmode none`

5. Decompress blah.dds to a .tga file:

`crunch -file blah.dds -fileformat tga`

6. Transcode blah.crn to a .dds file:

`crunch -file blah.crn`

7. Decompress blah.crn, writing each mipmap level to a separate .tga file:

`crunch -split -file blah.crn -fileformat tga`
|
||||
|
||||
crunch.exe can do a lot more, like rescale/crop images before
|
||||
compression, convert images from one file format to another, compare
|
||||
images, process multiple images, etc.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Using crnlib ##
|
||||
|
||||
The most flexible and powerful way of using crnlib is to integrate the
|
||||
library into your editor/toolchain/etc. and directly supply it your
|
||||
raw/source texture bits. See the C-style API's and comments in
|
||||
[inc/crnlib.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crnlib.h).
|
||||
|
||||
To compress, `#include "crnlib.h"`, fill in the `crn_comp_params` struct, and call one function:
|
||||
|
||||
```
void *crn_compress(const crn_comp_params &comp_params, crn_uint32 &compressed_size,
                   crn_uint32 *pActual_quality_level = NULL, float *pActual_bitrate = NULL);
```
|
||||
|
||||
The returned pointer will be NULL on failure, or a pointer to the .CRN or .DDS file data.
|
||||
|
||||
Or, if you want crnlib to also generate mipmaps, you call this function:
|
||||
|
||||
```
void *crn_compress(const crn_comp_params &comp_params, const crn_mipmap_params &mip_params,
                   crn_uint32 &compressed_size, crn_uint32 *pActual_quality_level = NULL, float *pActual_bitrate = NULL);
```
|
||||
|
||||
You can also transcode .CRN files to .DDS using `crn_decompress_crn_to_dds()`,
and unpack .DDS files to raw 32bpp images using `crn_decompress_dds_to_images()`.
|
||||
|
||||
Internally, crnlib just uses `inc/crn_decomp.h` to transcode textures to
|
||||
DXTn. If you only need to transcode .CRN format files to raw DXTn bits
|
||||
at runtime (and not compress), you don't actually need to compile or
|
||||
link against crnlib at all. Just include inc/crn\_decomp.h, which
|
||||
contains a completely self-contained CRN transcoder in the "crnd"
|
||||
namespace. The `crnd_get_texture_info()`, `crnd_unpack_begin()`,
|
||||
`crnd_unpack_level()`, etc. functions are all you need to efficiently get
|
||||
at the raw DXTn bits, which can be directly supplied to whatever API or
|
||||
GPU you're using. (See example2.)
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Related Links ##
|
||||
|
||||
I'm not aware of any other open source libraries that solve this problem in a usable, "out of the box" manner yet, but these links are interesting:
|
||||
|
||||
* [Experiments in Luma-Optimized and Mipmapped DXT1 Compression](http://sites.google.com/site/richgel99/luma_chroma_texture_compression)
|
||||
* [ddsexport](http://sites.google.com/site/richgel99/ddsexport) - A GUI demo of crnlib's ability to create rate-distortion optimized DDS textures.
|
||||
* Strom and Wennersten, [Lossless Compression of Already Compressed Textures](http://www.jacobstrom.com/publications/StromWennerstenHPG2011.pdf)
|
||||
* van Waveren, [Real-Time DXT Compression](http://www.intel.com/jp/software/pix/324337_324337.pdf)
|
||||
* Excellent public domain real-time DXTn compressors: [rygDXT](http://www.farb-rausch.de/~fg/code/) and [stb\_dxt.h](http://nothings.org/stb/stb_dxt.h)
|
||||
* [Charles Bloom's various blog posts on DXT compression](http://cbloomrants.blogspot.com/2009/06/06-17-09-dxtc-more-followup.html)
|
||||
* [FastDXT](http://www.evl.uic.edu/cavern/fastdxt/)
|
||||
* [Spiro's DXT Compression Algorithm Experiments](http://lspiroengine.com/?p=260)
|
||||
* [LSDxt DXT](http://lspiroengine.com/?p=516) - Spiro's texture compression tool
|
||||
* [Variable Bit Rate GPU Texture Compression](http://www.csee.umbc.edu/~olano/papers/#texcompress)
|
||||
* [libsquish](http://code.google.com/p/libsquish/) - Open source (MIT license) DXT compression library.
|
||||
* [Super Simple Texture Compression](http://github.com/divVerent/s2tc/wiki) - Alternative DXT1-compatible compression method that is purposely limited to a subset of DXT1 (only uses 2 colors per block, and is effectively only 1 bit per selector). The [Quality Comparison](http://github.com/divVerent/s2tc/wiki/QualityComparison) page is interesting.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Special Thanks ##
|
||||
|
||||
Thanks to [Colt McAnlis](https://plus.google.com/105062545746290691206/posts) at Google for porting the CRN->DXT transcoder library (in crn\_decomp.h) to Native Client, and for mentioning crunch in his [GDC presentation](http://www.youtube.com/watch?v=7bJ-D1xXEeg) on his texture compression R&D.
|
||||
|
||||
Some portions of this software make use of public domain code
|
||||
originally written by Igor Pavlov (LZMA), RYG's public domain real-time DXTn compressor, and stb\_image.c from [Sean Barrett](http://nothings.org/).
|
||||
|
||||
Many thanks to Violet Koppel for funding much of crnlib's development in 2009. Also, thanks to Colt again, and John Brooks at [Blue Shift, Inc.](http://www.blueshiftinc.com/) for helping test and giving feedback on crnlib. Thanks also to Charles Bloom for his informative [blog posts](http://cbloomrants.blogspot.com/2008/11/11-18-08-dxtc.html) on his work on DXT compression.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Support Contact ##
|
||||
|
||||
For any questions or problems with this software please **email** [Rich Geldreich](http://www.mobygames.com/developer/sheet/view/developerId,190072/) at <richgel99 _at_ gmail _dot_ com>. Here's my [twitter page](http://twitter.com/#!/richgel999).
|
||||
@@ -0,0 +1,80 @@
|
||||
## CRN vs. JPEG+Real-Time DXT1 ##
|
||||
|
||||
A popular alternative technique involves bolting some sort of real-time DXT compressor onto the back end of a transform coder, like JPEG. JPEG is far from state of the art, but it's very fast and quite popular. The following data shows that CRN is competitive against transform based solutions.
|
||||
|
||||
In this test, I compressed the test corpus with libjpeg at various bitrates using a binary search to find the JPEG quality factor level closest to each test bitrate. The test bitrates were 0.75-2.0 bpp in 0.25 bpp increments.
|
||||
|
||||
These .JPG files were then unpacked to 24-bit RGB (using the decompressor in [stb\_image.c](http://nothings.org/stb_image.c)), then compressed to DXT1 using [RYG's real-time DXT1 compressor](http://www.farb-rausch.de/~fg/code/). The resulting DXT1 bits were then unpacked using crnlib to 24-bit RGB and compared to the original images to generate this RMSE data:
|
||||
|
||||
[Charts at various bitrates](http://www.tenacioussoftware.com/crn_stats/dec29/crn_vs_jpg_stats_dec29.htm)
|
||||
|
||||
[Raw Excel spreadsheet](http://www.tenacioussoftware.com/crn_stats/dec29/crn_vs_jpg_stats_dec29.xlsx)
|
||||
|
||||
This test shows that, in an RMSE sense, CRN is worse than JPG+RYG\_DXT1 at less than 1.0bpp. Between 1-1.25bpp, CRN is roughly comparable (please excuse the line colors - they are different on each chart):
|
||||
|
||||

|
||||
|
||||
Beginning at 1.25bpp CRN is usually better than JPEG followed by real-time DXT1 compression:
|
||||
|
||||

|
||||
|
||||
At 1.5bpp or higher CRN was always equal or better:
|
||||
|
||||

|
||||
|
||||
These results are similar to [Strom and Wennersten's published results of testing JPEG followed by ETC1](http://www.jacobstrom.com/publications/StromWennerstenHPG2011.pdf):
|
||||
|
||||
> "If JPEG is used as the transport format, the textures will need to be compressed on-the-fly to a texture compression format such as ETC1 after download. To find out how much this transcoding will lower the quality, we compressed the JPEG images to ETC1 and measured the average mean square error. The result was an increased error equivalent to a PSNR drop of 2.02 dB than if just ETC1 encoding were used. Thus even if the JPEGs have equal quality to the proposed scheme, after the transcoding there will be a significant quality penalty. The transcoding used slow exhaustive compression—fast transcoding will give an even higher penalty"
|
||||
|
||||
Also, these observations from the introduction are relevant to this comparison:
|
||||
|
||||
> "[...] This is often a good solution, especially if low bit rates are of interest, but the resulting texture quality suffers for two reasons: First, the final texture will include image artifacts both from JPEG and from the texture codec. Second, to make recompression from JPEG to the texture codec quick enough, shortcuts may be necessary, especially on mobile devices with limited computational power. This lowers quality, specially when compared to slow, perhaps exhaustive compression."
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Raw Data: CRN .75-2.0bpp vs. DXT1 (real-time and offline) ##
|
||||
|
||||
Here's a [table](http://www.tenacioussoftware.com/crn_stats/dec29/crn_stats_dec_29.htm) containing the RGB [RMSE](http://en.wikipedia.org/wiki/RMSE) of RYG's real-time DXT1 compressor, crnlib's regular DXT1 compressor using uniform or perceptual metrics (which is very similar to ATI\_Compress or squish), and .CRN at .75-2.0bpp at .25bpp increments. This chart also has two columns showing how many bits/pixel are needed by LZMA to compress DXT1 uniform and perceptual .DDS files.
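
For reference, RMSE and the PSNR derived from it can be computed as in the sketch below. This is a generic formulation, not the code used to produce the table: it averages the squared error over every 8-bit sample in the buffer (for example interleaved RGB), and the exact averaging convention used for the published numbers is an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Root-mean-square error over all 8-bit samples of two equally sized buffers.
// Returns -1.0 if the sizes differ or the buffers are empty.
double rmse(const std::vector<std::uint8_t>& a, const std::vector<std::uint8_t>& b) {
    if (a.size() != b.size() || a.empty()) return -1.0;
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum_sq += d * d;
    }
    return std::sqrt(sum_sq / static_cast<double>(a.size()));
}

// PSNR in dB for 8-bit data (peak value 255), derived from a non-zero RMSE.
double psnr_from_rmse(double rmse_value) {
    return 20.0 * std::log10(255.0 / rmse_value);
}
```

Under this convention an RMSE of 8.29 corresponds to a PSNR of about 29.8 dB.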
|
||||
|
||||
Here's a [chart](http://www.tenacioussoftware.com/crn_stats/dec29/rmse_chart.png) of the above spreadsheet, with the images sorted by DXT1 uniform RMSE, and here's the [raw Excel spreadsheet file](http://www.tenacioussoftware.com/crn_stats/dec29/crn_stats_dec_29.xlsx).
|
||||
|
||||
Note that each .CRN column was generated by specifying the -bitrate option, which really only limits the **maximum** bitrate a file is allowed to use. .CRN is a variable block size format, and the front end limits the maximum endpoint/selector palette sizes to 8192 entries each. So on a few of the simpler images here very high bitrates (like 1.75 or 2.0bpp) may not actually be achievable. (You can see this effect clearly on serrano.)
|
||||
|
||||
These RMSE values should be comparable to the values in Charles Bloom's very useful [blog post](http://cbloomrants.blogspot.com/2008/11/11-20-08-dxtc-part-3.html) comparing various DXT1 compressors. (My RYG RMSE stats are slightly different from Bloom's, and I don't know exactly why, but I'm guessing our versions of RYG's compressor are slightly different.)
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Compressed Images ##
|
||||
|
||||
Ordered by highest to lowest quality. It's pretty clear that .CRN's subjective quality starts dropping fast around ~1.0 bpp.
|
||||
|
||||
**crnlib DXT1** (uniform colorspace metrics), RMSE: 8.29, .DDS+LZMA: 2.95 bpp
|
||||

|
||||
|
||||
**RYG DXT1**, RMSE: 8.97
|
||||

|
||||
|
||||
**CRN DXT1 2.0 bpp**, RMSE: 10.36
|
||||

|
||||
|
||||
**CRN DXT1 1.75 bpp**, RMSE: 11.23
|
||||

|
||||
|
||||
**CRN DXT1 1.5 bpp**, RMSE: 12.44
|
||||

|
||||
|
||||
**CRN DXT1 1.25 bpp**, RMSE: 14.2
|
||||

|
||||
|
||||
**CRN DXT1 1.0 bpp**, RMSE: 16.75
|
||||

|
||||
|
||||
**CRN DXT1 .75bpp**, RMSE: 20.69
|
||||

|
||||
|
||||
**CRN DXT1 .65bpp**, RMSE: 22.93
|
||||

|
||||
@@ -0,0 +1,41 @@
|
||||
# Supported File Formats #
|
||||
|
||||
crnlib supports two compressed texture file formats. The first
|
||||
format (clustered [.DDS](http://en.wikipedia.org/wiki/DirectDraw_Surface)) is simple to integrate into an existing project
|
||||
(no code changes are typically required), but it doesn't offer the
|
||||
highest quality/compression ratio that crnlib is capable of. Integrating
|
||||
the second, higher quality custom format (.CRN) requires a few
|
||||
typically straightforward engine modifications to integrate the
|
||||
.CRN->DXTn transcoder header file library into your tools/engine.
|
||||
|
||||
## .DDS ##
|
||||
crnlib can compress textures to standard DX9-style [.DDS](http://en.wikipedia.org/wiki/DirectDraw_Surface) files using
|
||||
clustered DXTn compression, which is a subset of the approach used to
|
||||
create .CRN files. (For completeness, crnlib also supports vanilla, block-by-block DXTn compression, but that's not very interesting.)
|
||||
Clustered DXTn compressed .DDS files are much more compressible than
|
||||
files created by other libraries/tools. Apart from increased
|
||||
compressibility, the .DDS files generated by this process are completely
|
||||
standard so they should be fairly easy to add to a project with little
|
||||
to no code changes.
|
||||
|
||||
To actually benefit from clustered DXTn .DDS files, your engine needs to
|
||||
further losslessly compress the .DDS data generated by crnlib using a
|
||||
lossless codec such as zlib, lzo, LZMA, LZHAM, etc. Most likely, your
|
||||
engine does this already. (If not, you definitely should because DXTn
|
||||
compressed textures generally contain a large amount of highly redundant
|
||||
data.)
|
||||
|
||||
Clustered .DDS files are intended to be the simplest/fastest way to
|
||||
integrate crnlib's tech into a project.
|
||||
|
||||
## .CRN ##
|
||||
The second, better, option is to compress your textures to .CRN files
|
||||
using crnlib. To read the resulting .CRN data, you must add the .CRN
|
||||
transcoder library (located in the included single file, stand-alone
|
||||
header file library inc/crn\_decomp.h) into your application. .CRN files
|
||||
provide noticeably higher quality at the same effective bitrate compared
|
||||
to clustered DXTn compressed .DDS files. Also, .CRN files don't require
|
||||
further lossless compression because they're already highly compressed.
|
||||
|
||||
.CRN files are a bit more difficult/risky to integrate into a project, but
the resulting compression ratio and quality are superior to those of clustered .DDS files.
|
||||
@@ -0,0 +1,118 @@
|
||||
# Compression Algorithm Details #
|
||||
|
||||
This is pretty high level and could be much better. I'll improve this over time; for now I hope this is enough:
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Data Format ##
|
||||
|
||||
The easiest way to describe how crnlib works is to start at the compressed data stream and the transcoding process and work backwards to the compressor, which also mirrors the design process followed when I designed crnlib.
|
||||
|
||||
.CRN DXT1 files consist of a small header, followed by a [DPCM](http://en.wikipedia.org/wiki/DPCM)+[Huffman](http://en.wikipedia.org/wiki/Huffman_Compression) compressed endpoint palette, and a DPCM+Huffman compressed selector palette. (DXT5 files contain two more palettes for alpha endpoints and selectors. Also, I'm not sure that "palette" is the best word. "Codebook" may be more appropriate, but graphics programmers seem more familiar with the concept of palettes.)
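
As a reminder of what the DPCM stage does, here is a minimal generic sketch. It is not crnlib's coder: the real format's predictor, its handling of individual color components, and the Huffman stage that follows are not shown, and the function names are illustrative. Each value is simply replaced by its difference from the previous one, which tends to produce many small, repetitive symbols when neighboring palette entries are similar (for example after sorting by similarity), and those are cheap to entropy code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// DPCM encode: store each value as its difference from the previous value
// (the first value is stored relative to 0).
std::vector<std::int32_t> dpcm_encode(const std::vector<std::int32_t>& values) {
    std::vector<std::int32_t> deltas;
    deltas.reserve(values.size());
    std::int32_t prev = 0;
    for (std::int32_t v : values) {
        deltas.push_back(v - prev);
        prev = v;
    }
    return deltas;
}

// DPCM decode: a running sum of the deltas recovers the original values.
std::vector<std::int32_t> dpcm_decode(const std::vector<std::int32_t>& deltas) {
    std::vector<std::int32_t> values;
    values.reserve(deltas.size());
    std::int32_t prev = 0;
    for (std::int32_t d : deltas) {
        prev += d;
        values.push_back(prev);
    }
    return values;
}
```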
|
||||
|
||||
Here's a visualization of the DXT1 color selector palette for kodim04.png. It's 2692x16. The 2-bit selectors were scaled to [0,255]. (I think this palette was sorted by similarity, which is one of the palette orderings tested by the compressor's backend.)
|
||||
|
||||

|
||||
|
||||
And here's a visualization of the DXT1 color endpoint palette:
|
||||

|
||||
|
||||
(These are very wide images, so they get downsampled when viewed in the wiki.)
|
||||
|
||||
This particular color endpoint palette contains 2415 entries (horizontal axis), where each entry is a 32-bit integer holding two 565 colors (vertical axis, enlarged by 8x in this image).
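To make the "32-bit integer holding two 565 colors" concrete, here is a minimal sketch of splitting one palette entry and expanding each color to 8 bits per channel. The names `splitEndpointEntry`, `expand565`, `EndpointPair`, and `Rgb8` are hypothetical, and the assumption that the first color sits in the low 16 bits is mine, not something stated above:

```cpp
#include <cstdint>

// One color expanded to 8 bits per channel.
struct Rgb8 {
    std::uint8_t r, g, b;
};

// The two 16-bit 565 colors stored in one endpoint palette entry.
struct EndpointPair {
    std::uint16_t color0, color1;
};

// ASSUMPTION: low 16 bits = first color, high 16 bits = second color.
inline EndpointPair splitEndpointEntry(std::uint32_t entry) {
    return EndpointPair{static_cast<std::uint16_t>(entry & 0xFFFFu),
                        static_cast<std::uint16_t>(entry >> 16)};
}

// Expand a 5:6:5 color to 8:8:8 by replicating the top bits into the low bits.
inline Rgb8 expand565(std::uint16_t c) {
    const std::uint32_t r5 = (c >> 11) & 31u;
    const std::uint32_t g6 = (c >> 5) & 63u;
    const std::uint32_t b5 = c & 31u;
    return Rgb8{static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2)),
                static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4)),
                static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2))};
}
```

Note that a transcoder never needs to do this expansion; it is only useful for visualizing or inspecting a palette.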
|
||||
|
||||
Each mipmap is divided up into 8x8 pixel "macroblocks". Each macroblock corresponds to four 4x4 pixel DXTn blocks arranged in a 2x2 checkerboard pattern. Each macroblock is adaptively subdivided by the compressor into one or more "tiles". Very simple macroblocks (say solid ones that use only a single color) can use a single 8x8 pixel tile, but more complex macroblocks can use any non-overlapping combination of 8x4, 4x8, or 4x4 tiles. (Counting the single 8x8 tile, there are 8 possible ways of arranging the tiles in a single macroblock.)
|
||||
|
||||
In this image, the macroblock tile boundaries are outlined in gray:
|
||||
|
||||

|
||||
|
||||
Notice that the more complex areas of the image contain smaller tiles, so these image areas get assigned more endpoints. Simpler areas use larger tiles, so the DXT1 blocks in these tiles are constrained to share the same endpoints. Also, a single color endpoint pair can be shared by many tiles, independent of their location in the image.
|
||||
|
||||
The endpoint/selector palettes are shared by all mipmap levels present in the .CRN file.
|
||||
|
||||
For each macroblock, a compressed index is sent to select the tile arrangement, followed by one to four DPCM+Huffman compressed endpoint palette indices (one per tile). Four selector indices (again coded using DPCM+Huffman) are always sent immediately after the endpoint(s). The macroblock rows are raster scanned in a serpentine order: left->right, then right->left, etc.
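The serpentine scan can be sketched as follows. This is only an illustration of the traversal order, assuming the first row runs left to right; `serpentineOrder` and `MacroblockPos` are hypothetical names, not part of crnlib:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Position of a macroblock, in macroblock units (not pixels).
struct MacroblockPos {
    std::uint32_t x, y;
};

// Visit every macroblock row by row, reversing direction on each row:
// row 0 runs left->right, row 1 right->left, and so on.
std::vector<MacroblockPos> serpentineOrder(std::uint32_t blocksX,
                                           std::uint32_t blocksY) {
    std::vector<MacroblockPos> order;
    order.reserve(static_cast<std::size_t>(blocksX) * blocksY);
    for (std::uint32_t y = 0; y < blocksY; ++y) {
        const bool leftToRight = (y % 2u) == 0u;
        for (std::uint32_t i = 0; i < blocksX; ++i) {
            const std::uint32_t x = leftToRight ? i : blocksX - 1u - i;
            order.push_back(MacroblockPos{x, y});
        }
    }
    return order;
}
```

One likely benefit of this order is that the last macroblock of a row is spatially adjacent to the first macroblock of the next row, so predictions from the previous macroblock stay local.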
|
||||
|
||||
The C++ code for the transcoder's inner loop is in [crn\_decomp.h](http://code.google.com/p/crunch/source/browse/trunk/inc/crn_decomp.h). DXT1 textures are handled by `crn_unpacker::unpack_dxt1()`.
|
||||
|
||||
Zeng's technique is used to order the palettes so DPCM coding of the various block palette indices works efficiently. See [An efficient color re-indexing scheme for palette-based compression](http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=899448).
|
||||
|
||||
For some textures, it's more efficient to reorder the palettes by similarity (effectively the [traveling salesman problem](http://en.wikipedia.org/wiki/Traveling_salesman_problem)) so they compress more effectively, but this can hurt index compression. The compressor tries several palette orderings and chooses whatever is cheapest overall.
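To see why palette ordering matters for index coding, here is a minimal DPCM sketch: each index is replaced by its difference from the previous one, so a good ordering produces many small deltas that are cheap to Huffman code. The wrap-around (modulo the palette size) and the initial predictor of 0 are assumptions made for this illustration, and `dpcmEncode`/`dpcmDecode` are hypothetical names rather than crnlib functions:

```cpp
#include <cstdint>
#include <vector>

// Replace each index with its delta from the previous index, modulo the
// palette size. Requires paletteSize > 0 and every index < paletteSize.
std::vector<std::uint32_t> dpcmEncode(const std::vector<std::uint32_t>& indices,
                                      std::uint32_t paletteSize) {
    std::vector<std::uint32_t> deltas;
    deltas.reserve(indices.size());
    std::uint32_t prev = 0;  // ASSUMPTION: predictor starts at 0
    for (std::uint32_t idx : indices) {
        // Adding paletteSize first keeps the value non-negative.
        deltas.push_back((idx + paletteSize - prev) % paletteSize);
        prev = idx;
    }
    return deltas;
}

// Invert dpcmEncode by accumulating the deltas modulo the palette size.
std::vector<std::uint32_t> dpcmDecode(const std::vector<std::uint32_t>& deltas,
                                      std::uint32_t paletteSize) {
    std::vector<std::uint32_t> indices;
    indices.reserve(deltas.size());
    std::uint32_t prev = 0;
    for (std::uint32_t d : deltas) {
        prev = (prev + d) % paletteSize;
        indices.push_back(prev);
    }
    return indices;
}
```

In a real coder the deltas would then be entropy coded; this sketch stops at the delta step.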
|
||||
|
||||
The example1 tool can display a bunch of information about .CRN files, such as the compressed size of each palette, Huffman tables, and mip levels. For example:
|
||||
|
||||
```
|
||||
E:\crunch17_3\bin>example1 i kodim04.crn
|
||||
example1 - Version v1.00 Built Dec 27 2011, 17:18:08
|
||||
Loading source file: kodim04.crn
|
||||
crnd_validate_file:
|
||||
File size: 85949
|
||||
ActualDataSize: 85949
|
||||
HeaderSize: 110
|
||||
TotalPaletteSize: 12687
|
||||
TablesSize: 1448
|
||||
Levels: 10
|
||||
LevelCompressedSize: 51830 14460 3968 1050 271 68 26 12 13 6 0 0 0 0 0 0
|
||||
ColorEndpointPaletteSize: 2415
|
||||
ColorSelectorPaletteSize: 2692
|
||||
AlphaEndpointPaletteSize: 0
|
||||
AlphaSelectorPaletteSize: 0
|
||||
crnd_get_texture_info:
|
||||
Dimensions: 512x768
|
||||
Levels: 10
|
||||
Faces: 1
|
||||
BytesPerBlock: 8
|
||||
UserData0: 0
|
||||
UserData1: 0
|
||||
CrnFormat: DXT1
|
||||
```
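As a worked check of the listing above: the header, palette, table, and per-level sizes add up to exactly the reported file size (110 + 12687 + 1448 + 71704 = 85949 bytes), and the whole file, including all 10 mip levels, costs roughly 1.75 bits per pixel of the 512x768 top level. The sketch below reproduces that arithmetic; `CrnSizeReport`, `totalBytes`, and `bitsPerPixel` are hypothetical helpers, not part of the example1 tool:

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Sizes (in bytes) as printed by the listing above.
struct CrnSizeReport {
    std::uint32_t headerSize;
    std::uint32_t totalPaletteSize;
    std::uint32_t tablesSize;
    std::vector<std::uint32_t> levelCompressedSizes;
};

// Sum of all reported sections.
std::uint64_t totalBytes(const CrnSizeReport& r) {
    const std::uint64_t fixedPart =
        std::uint64_t{r.headerSize} + r.totalPaletteSize + r.tablesSize;
    return std::accumulate(r.levelCompressedSizes.begin(),
                           r.levelCompressedSizes.end(), fixedPart);
}

// Bits spent per pixel of the top mip level (the mip chain is included in
// fileBytes, so this slightly overstates the cost of level 0 alone).
double bitsPerPixel(std::uint64_t fileBytes, std::uint32_t width,
                    std::uint32_t height) {
    return 8.0 * static_cast<double>(fileBytes) /
           (static_cast<double>(width) * static_cast<double>(height));
}
```

For comparison, uncompressed DXT1 is 4 bits per pixel before counting mipmaps.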
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Transcoding to DXTn ##
|
||||
|
||||
To transcode a mipmap level to DXTn, the palettes must first be unpacked, either into a temporary array or a cache. Currently, all mipmaps in a .CRN file share the same set of endpoint/selector palettes. To generate the DXTn bits, the transcoder iterates through each macroblock and decodes the palette indices. The actual DXTn bits are effectively just memcpy'd from the palette arrays directly into the destination DXTn texture. The transcoder doesn't care at all what the endpoint/selector palette entries actually consist of during transcoding -- it just copies the bits. Transcoding is quite fast because it works at the macroblock/block level, never at the pixel level.
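The "just copy the bits" step can be sketched like this. It is a simplified illustration, not the code in crn\_decomp.h: it assumes the palette indices have already been decoded into one endpoint/selector index pair per 4x4 block, and that each palette entry is a ready-to-copy 32-bit word. `Dxt1Block`, `BlockIndices`, and `assembleBlocks` are hypothetical names:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// An 8-byte DXT1 block: two 565 endpoints, then sixteen 2-bit selectors.
struct Dxt1Block {
    std::uint32_t endpoints;
    std::uint32_t selectors;
};

// Already-decoded palette indices for one 4x4 block (hypothetical layout).
struct BlockIndices {
    std::uint32_t endpointIndex;
    std::uint32_t selectorIndex;
};

// Build the output blocks by copying palette entries verbatim. No pixel is
// ever decoded; the palette contents are treated as opaque bits.
std::vector<Dxt1Block> assembleBlocks(
    const std::vector<std::uint32_t>& endpointPalette,
    const std::vector<std::uint32_t>& selectorPalette,
    const std::vector<BlockIndices>& indices) {
    std::vector<Dxt1Block> out;
    out.reserve(indices.size());
    for (const BlockIndices& bi : indices) {
        // at() bounds-checks the indices; a real transcoder would validate
        // the stream differently.
        out.push_back(Dxt1Block{endpointPalette.at(bi.endpointIndex),
                                selectorPalette.at(bi.selectorIndex)});
    }
    return out;
}
```

Blocks that belong to the same tile would simply carry the same `endpointIndex`, which is how larger tiles end up sharing endpoints.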
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Compression ##
|
||||
|
||||
The compressor is very complex, partially due to the weird and surprisingly deep properties imposed by the DXTn block format. It consists of two independent parts, called the "frontend" and "backend". The frontend is by far the most complex, and a good chunk of crnlib is devoted to helper classes used by the frontend.
|
||||
|
||||
The frontend, located in [crn\_dxt\_hc.cpp/h](http://code.google.com/p/crunch/source/browse/trunk/crnlib/crn_dxt_hc.cpp), takes the 24/32bpp source texture mipmaps as inputs. It adaptively subdivides the texture macroblocks into tiles, finds the endpoint and selector clusters, and then generates optimized but unordered palettes based on these clusters. The backend, located in [crn\_comp.cpp/h](http://code.google.com/p/crunch/source/browse/trunk/crnlib/crn_comp.cpp), takes the raw palettes, macroblock tile layouts, and indices supplied by the frontend and tries to code them efficiently.
|
||||
|
||||
The color endpoint palettes are created from their source clusters using a very high quality, scalable DXT1 endpoint optimizer located in [crn\_dxt1.cpp/h](http://code.google.com/p/crunch/source/browse/trunk/crnlib/crn_dxt1.cpp). This custom optimizer is capable of processing any number of source pixels, instead of the typical hard-coded 16. The quality of crnlib's DXT1 endpoint optimizer (in a PSNR sense) is comparable to ATI's, NVidia's, or squish's. (I verified this while building the endpoint optimizer by randomly extracting millions of 4x4 pixel blocks from a large corpus of game textures and photos, compressing->decompressing them using each compressor, comparing the results, and ruthlessly investigating and fixing any blocks where crnlib's output was lower quality. I hope to eventually release this tool.)
|
||||
|
||||
Interestingly, crnlib's DXT1 endpoint optimizer is equal to or better than squish or ATI\_Compress (in a PSNR sense), and of comparable speed, without using a single line of SIMD or assembly code.
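For readers unfamiliar with the metric, here is a minimal sketch of the standard PSNR computation for 8-bit samples, PSNR = 10 * log10(255^2 / MSE). Which exact variant was used for the comparisons above (per channel, luma only, and so on) isn't stated, so treat this as the textbook definition rather than the author's test harness; `psnr` is a hypothetical helper:

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Peak signal-to-noise ratio, in dB, between two equally sized buffers of
// 8-bit samples. Returns +infinity for identical buffers and NaN when the
// sizes differ or the buffers are empty.
double psnr(const std::vector<std::uint8_t>& original,
            const std::vector<std::uint8_t>& decoded) {
    if (original.size() != decoded.size() || original.empty()) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    double sumSquaredError = 0.0;
    for (std::size_t i = 0; i < original.size(); ++i) {
        const double diff =
            static_cast<double>(original[i]) - static_cast<double>(decoded[i]);
        sumSquaredError += diff * diff;
    }
    if (sumSquaredError == 0.0) {
        return std::numeric_limits<double>::infinity();
    }
    const double mse = sumSquaredError / static_cast<double>(original.size());
    return 10.0 * std::log10((255.0 * 255.0) / mse);
}
```

Higher is better; a mean squared error of 1 corresponds to about 48.13 dB.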
|
||||
|
||||
The endpoint clusterization step uses top-down [cluster analysis](http://en.wikipedia.org/wiki/Cluster_analysis), and [vector quantization](http://en.wikipedia.org/wiki/Vector_quantization) is used to create the initial selector palette. The frontend performs several feedback passes, in between the clusterization and VQ steps, to optimize quality, and the compressor uses several brute force refinement stages to improve quality even more.
|
||||
|
||||
Most of the compression steps are multithreaded in a relatively straightforward way: subdivide the work into independent threadpool tasks, fork to multiple threads, then join. The clusterizer is also multithreaded: it forks to multiple threads after the initial tree subdivision steps.
|
||||
|
||||
The .CRN format currently utilizes [canonical Huffman coding](http://en.wikipedia.org/wiki/Canonical_Huffman_code) for speed. The symbol codelengths for each Huffman table are sent in a simple compressed manner after the header (like Deflate).
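With canonical Huffman coding, only the code lengths need to be transmitted; the decoder rebuilds the actual codes from them. The sketch below shows the code assignment convention used by Deflate (RFC 1951). Whether .CRN assigns bits in exactly this way isn't stated above, so this is an illustration of the general technique; `canonicalCodes` is a hypothetical helper:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assign canonical Huffman codes from per-symbol code lengths, following the
// Deflate (RFC 1951) convention: shorter codes come first, and symbols with
// equal lengths get consecutive codes in symbol order. A length of 0 means
// the symbol is unused. Lengths are assumed to be valid and at most 31.
std::vector<std::uint32_t> canonicalCodes(
    const std::vector<std::uint8_t>& lengths) {
    std::uint32_t maxLen = 0;
    for (std::uint8_t len : lengths) {
        maxLen = std::max<std::uint32_t>(maxLen, len);
    }

    // Count how many symbols use each code length (length 0 is not counted).
    std::vector<std::uint32_t> countPerLen(maxLen + 1u, 0u);
    for (std::uint8_t len : lengths) {
        if (len != 0) ++countPerLen[len];
    }

    // Compute the first code of each length.
    std::vector<std::uint32_t> nextCode(maxLen + 1u, 0u);
    std::uint32_t code = 0;
    for (std::uint32_t len = 1; len <= maxLen; ++len) {
        code = (code + countPerLen[len - 1u]) << 1;
        nextCode[len] = code;
    }

    // Hand out consecutive codes to symbols of the same length.
    std::vector<std::uint32_t> codes(lengths.size(), 0u);
    for (std::size_t i = 0; i < lengths.size(); ++i) {
        if (lengths[i] != 0) codes[i] = nextCode[lengths[i]]++;
    }
    return codes;
}
```

Because the codes are fully determined by the lengths, a decoder can build its lookup tables directly from the transmitted codelengths.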
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
## The Path Forward ##
|
||||
|
||||
Given a fixed amount of additional developer time to improve .CRN's bitrate/quality, I think the backend would benefit the most from more work. (So far, much more effort has been devoted to the DXT1 endpoint optimizer and the frontend stages.) The current format probably favors transcoding speed too heavily over compression ratio. Also, the Huffman tables contain too many symbols, and alternatives to the DPCM coding should be explored.
|
||||
|
||||
Ideas for crunch v2.0:
|
||||
* Use techniques from LZHAM to improve backend coding (mix bitwise arithmetic with semi-adaptive Huffman).
|
||||
  * Port the transcoder library from C++ to plain C
|
||||
* Add smarter prediction to the macroblock tile layout selector indices
|
||||
  * Native JavaScript transcoders
|
||||
* Palette compression improvements
|
||||
* Split mipchain from mip0, so individual mips can be transcoded more quickly
|
||||
  * Support "raw" CRN files that use no additional compression (assume they will be post-compressed by the user using gzip or LZMA; useful for JavaScript/WebGL)
|
||||
* Support uncompressed palettes, for high speed random access in the transcoder
|
||||
* Investigate 16x16 or 32x32 macroblock sizes. Optimize .CRN for bitrates below 1.0 bpp.
|
||||
* Clustered (rate distortion optimized) DDS: Add support for ZLIB, LZO, and Snappy lossless post-compression
|
||||
@@ -0,0 +1,336 @@
|
||||
<html xmlns:v="urn:schemas-microsoft-com:vml"
|
||||
xmlns:o="urn:schemas-microsoft-com:office:office"
|
||||
xmlns:x="urn:schemas-microsoft-com:office:excel"
|
||||
xmlns="http://www.w3.org/TR/REC-html40">
|
||||
|
||||
<head>
|
||||
<meta name="Excel Workbook Frameset">
|
||||
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
|
||||
<meta name=ProgId content=Excel.Sheet>
|
||||
<meta name=Generator content="Microsoft Excel 12">
|
||||
<link rel=File-List href="crn_stats_dec_29_files/filelist.xml">
|
||||
<![if !supportTabStrip]>
|
||||
<link id="shLink" href="crn_stats_dec_29_files/sheet001.htm">
|
||||
<link id="shLink" href="crn_stats_dec_29_files/sheet002.htm">
|
||||
<link id="shLink" href="crn_stats_dec_29_files/sheet003.htm">
|
||||
|
||||
<link id="shLink">
|
||||
|
||||
<script language="JavaScript">
|
||||
<!--
|
||||
var c_lTabs=3;
|
||||
|
||||
var c_rgszSh=new Array(c_lTabs);
|
||||
c_rgszSh[0] = "Sheet1";
|
||||
c_rgszSh[1] = "Sheet2";
|
||||
c_rgszSh[2] = "Sheet3";
|
||||
|
||||
|
||||
|
||||
var c_rgszClr=new Array(8);
|
||||
c_rgszClr[0]="window";
|
||||
c_rgszClr[1]="buttonface";
|
||||
c_rgszClr[2]="windowframe";
|
||||
c_rgszClr[3]="windowtext";
|
||||
c_rgszClr[4]="threedlightshadow";
|
||||
c_rgszClr[5]="threedhighlight";
|
||||
c_rgszClr[6]="threeddarkshadow";
|
||||
c_rgszClr[7]="threedshadow";
|
||||
|
||||
var g_iShCur;
|
||||
var g_rglTabX=new Array(c_lTabs);
|
||||
|
||||
function fnGetIEVer()
|
||||
{
|
||||
var ua=window.navigator.userAgent
|
||||
var msie=ua.indexOf("MSIE")
|
||||
if (msie>0 && window.navigator.platform=="Win32")
|
||||
return parseInt(ua.substring(msie+5,ua.indexOf(".", msie)));
|
||||
else
|
||||
return 0;
|
||||
}
|
||||
|
||||
function fnBuildFrameset()
|
||||
{
|
||||
var szHTML="<frameset rows=\"*,18\" border=0 width=0 frameborder=no framespacing=0>"+
|
||||
"<frame src=\""+document.all.item("shLink")[0].href+"\" name=\"frSheet\" noresize>"+
|
||||
"<frameset cols=\"54,*\" border=0 width=0 frameborder=no framespacing=0>"+
|
||||
"<frame src=\"\" name=\"frScroll\" marginwidth=0 marginheight=0 scrolling=no>"+
|
||||
"<frame src=\"\" name=\"frTabs\" marginwidth=0 marginheight=0 scrolling=no>"+
|
||||
"</frameset></frameset><plaintext>";
|
||||
|
||||
with (document) {
|
||||
open("text/html","replace");
|
||||
write(szHTML);
|
||||
close();
|
||||
}
|
||||
|
||||
fnBuildTabStrip();
|
||||
}
|
||||
|
||||
function fnBuildTabStrip()
|
||||
{
|
||||
var szHTML=
|
||||
"<html><head><style>.clScroll {font:8pt Courier New;color:"+c_rgszClr[6]+";cursor:default;line-height:10pt;}"+
|
||||
".clScroll2 {font:10pt Arial;color:"+c_rgszClr[6]+";cursor:default;line-height:11pt;}</style></head>"+
|
||||
"<body onclick=\"event.returnValue=false;\" ondragstart=\"event.returnValue=false;\" onselectstart=\"event.returnValue=false;\" bgcolor="+c_rgszClr[4]+" topmargin=0 leftmargin=0><table cellpadding=0 cellspacing=0 width=100%>"+
|
||||
"<tr><td colspan=6 height=1 bgcolor="+c_rgszClr[2]+"></td></tr>"+
|
||||
"<tr><td style=\"font:1pt\"> <td>"+
|
||||
"<td valign=top id=tdScroll class=\"clScroll\" onclick=\"parent.fnFastScrollTabs(0);\" onmouseover=\"parent.fnMouseOverScroll(0);\" onmouseout=\"parent.fnMouseOutScroll(0);\"><a>«</a></td>"+
|
||||
"<td valign=top id=tdScroll class=\"clScroll2\" onclick=\"parent.fnScrollTabs(0);\" ondblclick=\"parent.fnScrollTabs(0);\" onmouseover=\"parent.fnMouseOverScroll(1);\" onmouseout=\"parent.fnMouseOutScroll(1);\"><a><</a></td>"+
|
||||
"<td valign=top id=tdScroll class=\"clScroll2\" onclick=\"parent.fnScrollTabs(1);\" ondblclick=\"parent.fnScrollTabs(1);\" onmouseover=\"parent.fnMouseOverScroll(2);\" onmouseout=\"parent.fnMouseOutScroll(2);\"><a>></a></td>"+
|
||||
"<td valign=top id=tdScroll class=\"clScroll\" onclick=\"parent.fnFastScrollTabs(1);\" onmouseover=\"parent.fnMouseOverScroll(3);\" onmouseout=\"parent.fnMouseOutScroll(3);\"><a>»</a></td>"+
|
||||
"<td style=\"font:1pt\"> <td></tr></table></body></html>";
|
||||
|
||||
with (frames['frScroll'].document) {
|
||||
open("text/html","replace");
|
||||
write(szHTML);
|
||||
close();
|
||||
}
|
||||
|
||||
szHTML =
|
||||
"<html><head>"+
|
||||
"<style>A:link,A:visited,A:active {text-decoration:none;"+"color:"+c_rgszClr[3]+";}"+
|
||||
".clTab {cursor:hand;background:"+c_rgszClr[1]+";font:9pt Arial;padding-left:3px;padding-right:3px;text-align:center;}"+
|
||||
".clBorder {background:"+c_rgszClr[2]+";font:1pt;}"+
|
||||
"</style></head><body onload=\"parent.fnInit();\" onselectstart=\"event.returnValue=false;\" ondragstart=\"event.returnValue=false;\" bgcolor="+c_rgszClr[4]+
|
||||
" topmargin=0 leftmargin=0><table id=tbTabs cellpadding=0 cellspacing=0>";
|
||||
|
||||
var iCellCount=(c_lTabs+1)*2;
|
||||
|
||||
var i;
|
||||
for (i=0;i<iCellCount;i+=2)
|
||||
szHTML+="<col width=1><col>";
|
||||
|
||||
var iRow;
|
||||
for (iRow=0;iRow<6;iRow++) {
|
||||
|
||||
szHTML+="<tr>";
|
||||
|
||||
if (iRow==5)
|
||||
szHTML+="<td colspan="+iCellCount+"></td>";
|
||||
else {
|
||||
if (iRow==0) {
|
||||
for(i=0;i<iCellCount;i++)
|
||||
szHTML+="<td height=1 class=\"clBorder\"></td>";
|
||||
} else if (iRow==1) {
|
||||
for(i=0;i<c_lTabs;i++) {
|
||||
szHTML+="<td height=1 nowrap class=\"clBorder\"> </td>";
|
||||
szHTML+=
|
||||
"<td id=tdTab height=1 nowrap class=\"clTab\" onmouseover=\"parent.fnMouseOverTab("+i+");\" onmouseout=\"parent.fnMouseOutTab("+i+");\">"+
|
||||
"<a href=\""+document.all.item("shLink")[i].href+"\" target=\"frSheet\" id=aTab> "+c_rgszSh[i]+" </a></td>";
|
||||
}
|
||||
szHTML+="<td id=tdTab height=1 nowrap class=\"clBorder\"><a id=aTab> </a></td><td width=100%></td>";
|
||||
} else if (iRow==2) {
|
||||
for (i=0;i<c_lTabs;i++)
|
||||
szHTML+="<td height=1></td><td height=1 class=\"clBorder\"></td>";
|
||||
szHTML+="<td height=1></td><td height=1></td>";
|
||||
} else if (iRow==3) {
|
||||
for (i=0;i<iCellCount;i++)
|
||||
szHTML+="<td height=1></td>";
|
||||
} else if (iRow==4) {
|
||||
for (i=0;i<c_lTabs;i++)
|
||||
szHTML+="<td height=1 width=1></td><td height=1></td>";
|
||||
szHTML+="<td height=1 width=1></td><td></td>";
|
||||
}
|
||||
}
|
||||
szHTML+="</tr>";
|
||||
}
|
||||
|
||||
szHTML+="</table></body></html>";
|
||||
with (frames['frTabs'].document) {
|
||||
open("text/html","replace");
|
||||
charset=document.charset;
|
||||
write(szHTML);
|
||||
close();
|
||||
}
|
||||
}
|
||||
|
||||
function fnInit()
|
||||
{
|
||||
g_rglTabX[0]=0;
|
||||
var i;
|
||||
for (i=1;i<=c_lTabs;i++)
|
||||
with (frames['frTabs'].document.all.tbTabs.rows[1].cells[fnTabToCol(i-1)])
|
||||
g_rglTabX[i]=offsetLeft+offsetWidth-6;
|
||||
}
|
||||
|
||||
function fnTabToCol(iTab)
|
||||
{
|
||||
return 2*iTab+1;
|
||||
}
|
||||
|
||||
function fnNextTab(fDir)
|
||||
{
|
||||
var iNextTab=-1;
|
||||
var i;
|
||||
|
||||
with (frames['frTabs'].document.body) {
|
||||
if (fDir==0) {
|
||||
if (scrollLeft>0) {
|
||||
for (i=0;i<c_lTabs&&g_rglTabX[i]<scrollLeft;i++);
|
||||
if (i<c_lTabs)
|
||||
iNextTab=i-1;
|
||||
}
|
||||
} else {
|
||||
if (g_rglTabX[c_lTabs]+6>offsetWidth+scrollLeft) {
|
||||
for (i=0;i<c_lTabs&&g_rglTabX[i]<=scrollLeft;i++);
|
||||
if (i<c_lTabs)
|
||||
iNextTab=i;
|
||||
}
|
||||
}
|
||||
}
|
||||
return iNextTab;
|
||||
}
|
||||
|
||||
function fnScrollTabs(fDir)
|
||||
{
|
||||
var iNextTab=fnNextTab(fDir);
|
||||
|
||||
if (iNextTab>=0) {
|
||||
frames['frTabs'].scroll(g_rglTabX[iNextTab],0);
|
||||
return true;
|
||||
} else
|
||||
return false;
|
||||
}
|
||||
|
||||
function fnFastScrollTabs(fDir)
|
||||
{
|
||||
if (c_lTabs>16)
|
||||
frames['frTabs'].scroll(g_rglTabX[fDir?c_lTabs-1:0],0);
|
||||
else
|
||||
if (fnScrollTabs(fDir)>0) window.setTimeout("fnFastScrollTabs("+fDir+");",5);
|
||||
}
|
||||
|
||||
function fnSetTabProps(iTab,fActive)
|
||||
{
|
||||
var iCol=fnTabToCol(iTab);
|
||||
var i;
|
||||
|
||||
if (iTab>=0) {
|
||||
with (frames['frTabs'].document.all) {
|
||||
with (tbTabs) {
|
||||
for (i=0;i<=4;i++) {
|
||||
with (rows[i]) {
|
||||
if (i==0)
|
||||
cells[iCol].style.background=c_rgszClr[fActive?0:2];
|
||||
else if (i>0 && i<4) {
|
||||
if (fActive) {
|
||||
cells[iCol-1].style.background=c_rgszClr[2];
|
||||
cells[iCol].style.background=c_rgszClr[0];
|
||||
cells[iCol+1].style.background=c_rgszClr[2];
|
||||
} else {
|
||||
if (i==1) {
|
||||
cells[iCol-1].style.background=c_rgszClr[2];
|
||||
cells[iCol].style.background=c_rgszClr[1];
|
||||
cells[iCol+1].style.background=c_rgszClr[2];
|
||||
} else {
|
||||
cells[iCol-1].style.background=c_rgszClr[4];
|
||||
cells[iCol].style.background=c_rgszClr[(i==2)?2:4];
|
||||
cells[iCol+1].style.background=c_rgszClr[4];
|
||||
}
|
||||
}
|
||||
} else
|
||||
cells[iCol].style.background=c_rgszClr[fActive?2:4];
|
||||
}
|
||||
}
|
||||
}
|
||||
with (aTab[iTab].style) {
|
||||
cursor=(fActive?"default":"hand");
|
||||
color=c_rgszClr[3];
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
function fnMouseOverScroll(iCtl)
|
||||
{
|
||||
frames['frScroll'].document.all.tdScroll[iCtl].style.color=c_rgszClr[7];
|
||||
}
|
||||
|
||||
function fnMouseOutScroll(iCtl)
|
||||
{
|
||||
frames['frScroll'].document.all.tdScroll[iCtl].style.color=c_rgszClr[6];
|
||||
}
|
||||
|
||||
function fnMouseOverTab(iTab)
|
||||
{
|
||||
if (iTab!=g_iShCur) {
|
||||
var iCol=fnTabToCol(iTab);
|
||||
with (frames['frTabs'].document.all) {
|
||||
tdTab[iTab].style.background=c_rgszClr[5];
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
function fnMouseOutTab(iTab)
|
||||
{
|
||||
if (iTab>=0) {
|
||||
var elFrom=frames['frTabs'].event.srcElement;
|
||||
var elTo=frames['frTabs'].event.toElement;
|
||||
|
||||
if ((!elTo) ||
|
||||
(elFrom.tagName==elTo.tagName) ||
|
||||
(elTo.tagName=="A" && elTo.parentElement!=elFrom) ||
|
||||
(elFrom.tagName=="A" && elFrom.parentElement!=elTo)) {
|
||||
|
||||
if (iTab!=g_iShCur) {
|
||||
with (frames['frTabs'].document.all) {
|
||||
tdTab[iTab].style.background=c_rgszClr[1];
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
function fnSetActiveSheet(iSh)
|
||||
{
|
||||
if (iSh!=g_iShCur) {
|
||||
fnSetTabProps(g_iShCur,false);
|
||||
fnSetTabProps(iSh,true);
|
||||
g_iShCur=iSh;
|
||||
}
|
||||
}
|
||||
|
||||
window.g_iIEVer=fnGetIEVer();
|
||||
if (window.g_iIEVer>=4)
|
||||
fnBuildFrameset();
|
||||
//-->
|
||||
</script>
|
||||
<![endif]><!--[if gte mso 9]><xml>
|
||||
<x:ExcelWorkbook>
|
||||
<x:ExcelWorksheets>
|
||||
<x:ExcelWorksheet>
|
||||
<x:Name>Sheet1</x:Name>
|
||||
<x:WorksheetSource HRef="crn_stats_dec_29_files/sheet001.htm"/>
|
||||
</x:ExcelWorksheet>
|
||||
<x:ExcelWorksheet>
|
||||
<x:Name>Sheet2</x:Name>
|
||||
<x:WorksheetSource HRef="crn_stats_dec_29_files/sheet002.htm"/>
|
||||
</x:ExcelWorksheet>
|
||||
<x:ExcelWorksheet>
|
||||
<x:Name>Sheet3</x:Name>
|
||||
<x:WorksheetSource HRef="crn_stats_dec_29_files/sheet003.htm"/>
|
||||
</x:ExcelWorksheet>
|
||||
</x:ExcelWorksheets>
|
||||
<x:Stylesheet HRef="crn_stats_dec_29_files/stylesheet.css"/>
|
||||
<x:WindowHeight>12915</x:WindowHeight>
|
||||
<x:WindowWidth>28620</x:WindowWidth>
|
||||
<x:WindowTopX>120</x:WindowTopX>
|
||||
<x:WindowTopY>15</x:WindowTopY>
|
||||
<x:ProtectStructure>False</x:ProtectStructure>
|
||||
<x:ProtectWindows>False</x:ProtectWindows>
|
||||
</x:ExcelWorkbook>
|
||||
</xml><![endif]-->
|
||||
</head>
|
||||
|
||||
<frameset rows="*,39" border=0 width=0 frameborder=no framespacing=0>
|
||||
<frame src="crn_stats_dec_29_files/sheet001.htm" name="frSheet">
|
||||
<frame src="crn_stats_dec_29_files/tabstrip.htm" name="frTabs" marginwidth=0 marginheight=0>
|
||||
<noframes>
|
||||
<body>
|
||||
<p>This page uses frames, but your browser doesn't support them.</p>
|
||||
</body>
|
||||
</noframes>
|
||||
</frameset>
|
||||
</html>
|
||||
Binary file not shown.
|
After Width: | Height: | Size: 1.1 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 315 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 12 KiB |