54 Commits

Author SHA1 Message Date
Alexander Suvorov 8708900eca Implement ETC1S/ETC2AS image compression
Explanation:

ETC1S encoding is a subset of ETC1 that uses only one color endpoint per 4x4 block: the modifier indices are identical for both subblocks, the base color is encoded differentially as RGB555 with the differential RGB333 part always set to zero, and the flip bit is always set to zero.
Usage: crunch_x64.exe -ETC1S input.png -out output.ktx
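
A minimal sketch of how the four color bytes of an ETC1S block could be assembled under the constraints listed above. It assumes the standard ETC1 differential-mode bit layout; the function and argument names are illustrative and are not taken from Crunch.

```python
def pack_etc1s_header(r5, g5, b5, table_index):
    """Pack the color half (bytes 0..3) of an 8-byte ETC1S block.

    Assumed layout (standard ETC1 differential mode): each of the first
    three bytes holds a 5-bit base value in the high bits and a 3-bit
    delta in the low bits; byte 3 holds two 3-bit modifier table indices,
    the diff bit and the flip bit.
    """
    for value in (r5, g5, b5):
        if not 0 <= value < 32:
            raise ValueError("base color components must fit in 5 bits")
    if not 0 <= table_index < 8:
        raise ValueError("modifier table index must fit in 3 bits")
    delta = 0      # RGB333 differential part is always zero in ETC1S
    diff_bit = 1   # the block always uses differential mode
    flip_bit = 0   # the flip bit is always zero in ETC1S
    # The same table index is stored for both subblocks.
    control = (table_index << 5) | (table_index << 2) | (diff_bit << 1) | flip_bit
    return bytes([(r5 << 3) | delta, (g5 << 3) | delta, (b5 << 3) | delta, control])
```

The remaining four bytes of the block carry the per-pixel selector bits, which are not covered by this sketch.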

ETC2AS encoding is a subset of ETC2A encoding that uses ETC1S encoding for color and ETC2A encoding for alpha.
Usage: crunch_x64.exe -ETC2AS input.png -out output.ktx
2018-10-23 23:16:00 +02:00
Alexander Suvorov 4bb735f796 Make Crunch compression work correctly on CPU supporting 16 or more threads
Explanation:

The Crunch library was designed to work correctly when using up to 16 threads. Since one of those threads is the main thread, the maximum number of helper threads should be set to 15.
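
The limit can be illustrated with a small sketch. It assumes one helper thread per remaining logical core, which is not stated in the commit message; only the cap of 16 threads in total comes from the text above. The names are hypothetical.

```python
import os

MAX_TOTAL_THREADS = 16  # limit stated above: main thread plus helpers


def helper_thread_count(cpu_count=None):
    """Number of helper threads to start, leaving one slot for the main thread."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # Assumption: one thread per logical core, capped at MAX_TOTAL_THREADS.
    return max(0, min(cpu_count, MAX_TOTAL_THREADS) - 1)
```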
2018-10-23 20:23:56 +02:00
Alexander Suvorov 660322d3a6 Add compression support for ETC1S/ETC2AS encodings
Explanation:

ETC1S encoding is a subset of ETC1 that uses only one color endpoint per 4x4 block. The base color is therefore always encoded as RGB555 and there is no need to encode block flips. ETC2AS encoding is a subset of ETC2A encoding that uses ETC1S encoding for color and default ETC2A encoding for alpha.

ETC1S/ETC2AS Crunch compression and decompression is based on ETC and DXT Crunch compression and decompression algorithms:
- ETC1S/ETC2AS tiling is performed within the area of 8x8 pixels using DXT1/DXT5 tiling scheme
- ETC1S color endpoints are generated using standard ETC1 optimization
- ETC1S color codebook encoding is equivalent to ETC1 codebook encoding
- ETC1S level encoding is equivalent to DXT1 level encoding
- ETC2AS alpha codebook encoding is equivalent to ETC2A alpha codebook encoding
- ETC2AS level encoding is equivalent to DXT5 level encoding

Testing results suggest that ETC1S/ETC2AS encodings can be used to achieve lower bitrates than ETC1/ETC2A on the Kodak test set while providing equivalent image quality (estimated using PSNR).

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.854 sec
Modified: 1468204 bytes / 5.473 sec
Improvement: 7.21% (compression ratio) / 81.03% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.925 sec
Modified: 1914805 bytes / 7.297 sec
Improvement: 7.28% (compression ratio) / 80.24% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 12.842 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB

ETC1S Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1S quantization parameters have been selected in such a way, so that ETC1S compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1S encoding]
Total size: 1363676 bytes
Total time: 15.586 sec
Average bitrate: 1.156 bpp
Average Luma PSNR: 34.047 dB
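
The percentages and bitrates reported above can be reproduced with the following sketch. It assumes the standard Kodak set of 24 images at 768x512 pixels, which is not stated in the commit message; the function names are illustrative.

```python
KODAK_IMAGES = 24                   # assumption: the standard Kodak test set
KODAK_PIXELS_PER_IMAGE = 768 * 512  # assumption: every image is 768x512


def improvement_percent(original, modified):
    """Relative reduction of a size or a time, in percent."""
    return round(100.0 * (original - modified) / original, 2)


def bitrate_bpp(total_bytes):
    """Average bits per pixel over the whole test set (no mipmaps)."""
    total_pixels = KODAK_IMAGES * KODAK_PIXELS_PER_IMAGE
    return round(8.0 * total_bytes / total_pixels, 3)
```

For example, improvement_percent(1582222, 1468204) gives 7.21 and bitrate_bpp(1363676) gives 1.156, matching the DXT1 and ETC1S figures above.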
2018-06-07 19:20:30 +02:00
Alexander Suvorov c1d8e8da71 Optimize vector quantization step
This change improves the compression speed for both DXT and ETC encodings.

Explanation:

The vector quantization algorithm takes floating point vectors as input and performs vector preprocessing right before the quantization. At the same time, selector training vectors are generated directly from integer selector values, packed into a single uint64. It would therefore be more efficient to perform preprocessing of the selector training vectors (which includes sorting and deduplication) while still having them in a packed form. Additional performance boost is achieved by using multiple threads for sorting the training vectors.
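
A minimal sketch of the idea, assuming 16 selectors per block stored at 4 bits each in one 64-bit integer (the exact bit width used by Crunch is an assumption here, as are all names). Sorting and deduplication work on plain integers, and the conversion to floating point vectors happens only for the unique values.

```python
def pack_selectors(selectors, bits_per_selector=4):
    """Pack integer selector values into a single integer, lowest index first."""
    packed = 0
    for i, s in enumerate(selectors):
        packed |= s << (i * bits_per_selector)
    return packed


def dedup_packed(packed_vectors):
    """Sort packed vectors and merge duplicates into (value, weight) pairs."""
    result = []
    for value in sorted(packed_vectors):
        if result and result[-1][0] == value:
            result[-1] = (value, result[-1][1] + 1)
        else:
            result.append((value, 1))
    return result


def unpack_to_floats(packed, count=16, bits_per_selector=4):
    """Expand one packed vector into floating point components."""
    mask = (1 << bits_per_selector) - 1
    return [float((packed >> (i * bits_per_selector)) & mask) for i in range(count)]
```

The multithreaded sort mentioned above is not shown; the built-in sort stands in for it.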

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.869 sec
Modified: 1468204 bytes / 5.477 sec
Improvement: 7.21% (compression ratio) / 81.03% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.961 sec
Modified: 1914805 bytes / 7.322 sec
Improvement: 7.28% (compression ratio) / 80.19% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 12.766 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-10-27 16:50:44 +02:00
Alexander Suvorov 21eb70bc10 Optimize tile computation step
This change improves the compression speed for both DXT and ETC encodings.

Explanation:

In the tile computation step, pixels within the tiling area are palettized using a general purpose tree clusterization algorithm. At the same time, clusterization of the tile pixels is always performed with the following restrictions: the maximum number of palettized pixels is 64, the maximum number of clusters is 2. The performance can therefore be improved by solving the palettizing task with a specialized version of the tree clusterizer, which does not maintain the tree structure and uses constant memory.

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.863 sec
Modified: 1468204 bytes / 5.726 sec
Improvement: 7.21% (compression ratio) / 80.16% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.950 sec
Modified: 1914805 bytes / 7.683 sec
Improvement: 7.28% (compression ratio) / 79.21% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 13.071 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-10-25 19:15:36 +02:00
Alexander Suvorov b8b456d48c Optimize endpoint reordering algorithm
This change improves the compression speed for both DXT and ETC encodings.

Explanation:
The main ideas used for optimization of the endpoint reordering algorithm:
- On each iteration, the list of the chosen endpoints is updated only from one side, so all the computations performed for the unchanged side of the list can be cached and reused on the next iteration.
- The list of the chosen endpoints can be built using an array of double size, growing from the middle of the array. This eliminates unnecessary memory reallocations and movements.
- When an element is removed from the list of remaining endpoints, instead of moving all the elements with higher indices, just a single last element of the list can be moved into the position of the removed element (the original indices of the remaining endpoints should be stored within the list elements to maintain proper indexing).
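
The last two ideas can be sketched as follows. This is an illustration under assumed names, not the Crunch implementation: the chosen list is a preallocated array of double size that grows from the middle, and the remaining list uses a constant-time "swap with last" removal while each element carries its original index.

```python
class MiddleOutList:
    """Fixed-capacity list that grows at either end without moving elements."""

    def __init__(self, capacity):
        self._items = [None] * (2 * capacity)
        self._begin = capacity  # index of the first stored element
        self._end = capacity    # one past the last stored element

    def push_front(self, item):
        self._begin -= 1
        self._items[self._begin] = item

    def push_back(self, item):
        self._items[self._end] = item
        self._end += 1

    def to_list(self):
        return self._items[self._begin:self._end]


def swap_remove(remaining, position):
    """Remove remaining[position] in O(1) by moving the last element into its place.

    Elements are (original_index, payload) tuples, so the original indexing
    survives the reordering.
    """
    removed = remaining[position]
    last = remaining.pop()
    if position < len(remaining):
        remaining[position] = last
    return removed
```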

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.848 sec
Modified: 1468204 bytes / 5.875 sec
Improvement: 7.21% (compression ratio) / 79.63% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.952 sec
Modified: 1914805 bytes / 7.834 sec
Improvement: 7.28% (compression ratio) / 78.80% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 13.261 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-10-25 16:17:06 +02:00
Alexander Suvorov 7143913032 Optimize DXT endpoints computation
This change improves the compression speed for DXT encoding.

Explanation:

When performing per-component endpoint optimization, the trial solutions are generated using all possible combinations of the component values. Then the error boundary computation is performed for each block color of the trial solution in order to check the possibility of early out. The important observation here is that some component values are present in several trial solutions and therefore are processed multiple times. The overall performance can therefore be improved by computing and caching the errors for all the possible component values in advance.

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.843 sec
Modified: 1468204 bytes / 6.067 sec
Improvement: 7.21% (compression ratio) / 78.97% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.983 sec
Modified: 1914805 bytes / 8.080 sec
Improvement: 7.28% (compression ratio) / 78.15% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 13.421 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-10-24 19:48:37 +02:00
Alexander Suvorov dbbef6a21f Perform multithreaded node preprocessing for faster vector quantization
This change significantly improves the compression speed for both DXT and ETC encodings.

Explanation:

On each iteration of the vector quantization algorithm, the leaf with the highest variance is selected for splitting. If the leaf gets split, then two new leaves are created (while the leaves that can not be split will be ignored on the future iterations). There does not seem to be any simple way to compute or reliably predict the variances of the future leaves in advance, which means that there is no simple way to efficiently perform split operations in parallel.

Still, there is an interesting observation. Even though the order of the split operations depends on the previous iterations, the split operations performed in different subtrees are completely independent. So what if, instead of solving the main quantization task directly, we first solve an alternative quantization task that has a lot in common with the main task but can be efficiently parallelized? The intermediate computation results of the alternative solution can then be reused when solving the main task. Specifically, the idea is to efficiently compute an alternative split tree, which is more or less balanced, and has approximately the same number of nodes as the main tree. Then the overlapping part of the main and alternative trees can be reused while solving the main quantization task.

In order to achieve this, the initial root is first split normally until the number of splittable leaves reaches the number of available threads. Then each leaf is split in a separate thread, while the maximum number of split iterations for each subtree is defined as the maximum number of split iterations for the whole main tree divided by the number of used threads. This way the total number of nodes in the alternative tree will be approximately the same as the number of nodes in the main tree.

Note that in general, the alternative tree does not match the main tree, so some nodes of the alternative tree will never be reused. In practice however, the portion of such unnecessarily precomputed nodes is not very big. And considering that the nodes of the alternative tree are precomputed in parallel using multiple threads, in most cases the overall performance is significantly improved.

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.840 sec
Modified: 1468204 bytes / 6.303 sec
Improvement: 7.21% (compression ratio) / 78.14% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.955 sec
Modified: 1914805 bytes / 8.342 sec
Improvement: 7.28% (compression ratio) / 77.43% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 13.322 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-10-20 19:18:08 +02:00
Alexander Suvorov 1028520280 Use multiple threads for node split in vector quantization
This change improves the compression speed for both DXT and ETC encodings.

Explanation:

During the node split iteration, identical computations are performed for all the vectors of the split node. The overall performance can be improved by performing independent computations in separate threads. In order to avoid possible performance overhead, on each iteration the number of threads is selected in such a way so that each thread processes at least 512 vectors.
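
A sketch of the thread selection rule, assuming the vectors are divided into contiguous, nearly equal ranges (the partitioning scheme and all names are assumptions; only the minimum of 512 vectors per thread comes from the text above).

```python
MIN_VECTORS_PER_THREAD = 512  # threshold stated above


def split_thread_count(num_vectors, max_threads):
    """Largest thread count that still gives every thread at least 512 vectors."""
    return max(1, min(max_threads, num_vectors // MIN_VECTORS_PER_THREAD))


def chunk_ranges(num_vectors, num_threads):
    """Half-open [start, end) ranges of vectors, one per thread."""
    base, extra = divmod(num_vectors, num_threads)
    ranges, start = [], 0
    for t in range(num_threads):
        size = base + (1 if t < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges
```

Small nodes are therefore processed by a single thread, which avoids paying the synchronization cost where it cannot be amortized.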

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.892 sec
Modified: 1468204 bytes / 7.578 sec
Improvement: 7.21% (compression ratio) / 73.77% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.943 sec
Modified: 1914805 bytes / 9.993 sec
Improvement: 7.28% (compression ratio) / 72.95% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 14.753 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-10-20 14:23:47 +02:00
Alexander Suvorov fbe3f6ca10 Optimize vector quantization algorithm
This change improves the compression speed for both DXT and ETC encodings.

Explanation:

On each iteration of the vector quantization algorithm, the leaf with the highest variance is selected for splitting. At the same time, each split operation adds at most 2 new leaves. Considering this, the search of the leaf with the highest variance can be performed more efficiently if all the leaves are stored in a priority queue (in order to guarantee that texture decompression gives identical result to the original version of Crunch, the node comparison operation also takes the node index into account).
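
A minimal sketch of the leaf selection with a priority queue. The `split` callback and the direction of the tie-break (lower node index first) are assumptions made for illustration; the commit message only states that the node index takes part in the comparison.

```python
import heapq


def run_splits(root_variance, max_leaves, split):
    """Return the node indices in the order in which they get split.

    split(index, variance) is a hypothetical callback returning the list of
    child variances, or an empty list if the leaf cannot be split.
    """
    # heapq is a min-heap, so the variance is negated; the node index breaks
    # ties deterministically.
    heap = [(-root_variance, 0)]
    next_index = 1
    leaves = 1
    order = []
    while heap and leaves < max_leaves:
        neg_variance, index = heapq.heappop(heap)
        children = split(index, -neg_variance)
        if not children:
            continue  # unsplittable leaf: it simply leaves the queue
        order.append(index)
        leaves += len(children) - 1
        for variance in children:
            heapq.heappush(heap, (-variance, next_index))
            next_index += 1
    return order
```

Each iteration costs O(log n) instead of a linear scan over all the leaves.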

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.844 sec
Modified: 1468204 bytes / 7.883 sec
Improvement: 7.21% (compression ratio) / 72.67% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.978 sec
Modified: 1914805 bytes / 10.490 sec
Improvement: 7.28% (compression ratio) / 71.63% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 15.165 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-10-19 18:58:05 +02:00
Alexander Suvorov 11a89d25ed Optimize vector quantization algorithm
This change improves the compression speed for both DXT and ETC encodings.

Explanation:

When a node is split during the quantization step, all of its vectors are divided between the child nodes, and new memory is allocated to store each new set of vectors. At the same time, the set of vectors of the parent node is no longer accessed after the split. Since the sets of vectors of the child nodes do not intersect, the memory allocated for the parent set of vectors can be reused to store the child sets of vectors.

This can be achieved in the following way. All the source vectors are initially stored in an array. Assume that this common array can be reordered so that the vectors of each node form a contiguous block within it. Then it is sufficient to store only two indices for each node (pointing to the first and to the last vector of the node in the common array) in order to describe the complete set of vectors of this node. This assumption holds for the root node, whose initial indices point to the first and to the last elements of the complete vector array.

When a node is split, its vectors (stored in a contiguous block within the common array) are reordered so that the vectors of the left child node come first, followed by the vectors of the right child node (the indices of the first and last vectors of the child nodes are set accordingly). This way each child node also has its vectors stored in a contiguous block within the common array, defined by two indices, and the split iteration can be repeated. Note that the memory used to store the sets of vectors for all the nodes now needs to be allocated only once.
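
The in-place split can be sketched as a partition of the node's index range. The `goes_left` predicate stands in for the actual split criterion and is hypothetical; this simple partition also does not preserve the relative order of the vectors, which the real implementation may need to do in order to keep the output identical.

```python
def split_node(vectors, first, last, goes_left):
    """Reorder vectors[first..last] in place so that left-child vectors come first.

    Returns two inclusive (first, last) index pairs, one per child node.
    No memory is allocated: both children live inside the parent's range.
    """
    mid = first
    for i in range(first, last + 1):
        if goes_left(vectors[i]):
            vectors[mid], vectors[i] = vectors[i], vectors[mid]
            mid += 1
    return (first, mid - 1), (mid, last)
```

An empty child shows up as a pair whose first index is greater than its last index.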

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.872 sec
Modified: 1468204 bytes / 8.276 sec
Improvement: 7.21% (compression ratio) / 71.34% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.971 sec
Modified: 1914805 bytes / 10.944 sec
Improvement: 7.28% (compression ratio) / 70.40% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 15.364 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-10-17 15:12:55 +02:00
Alexander Suvorov a14a313361 Optimize color endpoint solution evaluation
This change improves the compression speed for DXT encoding.

Explanation:

In order to evaluate an endpoint solution, it is necessary to compute the sum of the squared distances from the source pixels to their nearest block colors, defined by the evaluated endpoint solution. Such computation is quite complicated, so before it is performed, we can compute the sum of the squared distances from the source pixels to the axis-aligned bounding box enclosing all the evaluated block colors (if the source pixel appears to be inside the AABB of the evaluated solution, then the distance is considered to be 0). If the sum of the squared distances to the AABB of the current solution is already bigger than the sum of the squared distances computed for the previously found best solution, then the current solution does not need to be evaluated.

The actual trick here is that the sum of the squared distances to the AABB of the current solution can be computed in constant time using the following approach. The sums of the squared distances for each color component can be computed separately. For each color component the AABB determines 2 planes: the "lower" plane, defined by the lower boundary of the AABB, and the "upper" plane, defined by the upper boundary of the AABB. The sum for each color component is combined from two parts: the sum of the squared distances from the lower plane to all the source pixels which are below the lower plane, and the sum of the squared distances from the upper plane to all the source pixels which are above the upper plane. Considering that the endpoints of the evaluated solution are encoded as RGB565, there are 32 possible planes for the red and blue components, and 64 possible planes for the green component. For each plane it is sufficient to precompute the following two values: the sum of the squared distances from the plane to all the source pixels which are "below" this plane, and the sum of the squared distances from the plane to all the source pixels which are "above" this plane. The total sum of the squared distances from the source pixels to any evaluated AABB can then be represented as a sum of 6 precomputed values, while all the used values can be precomputed in linear time with dynamic programming.
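
A sketch of the precomputation for a single color component. It uses prefix sums of the count, the sum and the sum of squares of the pixel values, which is one possible way to obtain all the plane sums in linear time and is not necessarily the dynamic programming scheme used in the code; pixel weights are omitted and all names are illustrative.

```python
def plane_sums(values, planes):
    """For every plane position p return the squared-distance sums below and above it.

    values: 8-bit component values of the source pixels.
    planes: candidate plane positions in the same 8-bit space.
    """
    hist = [0] * 256
    for x in values:
        hist[x] += 1
    # Prefix moments: entry v covers all pixels with a value strictly below v.
    count, s1, s2 = [0] * 257, [0] * 257, [0] * 257
    for v in range(256):
        count[v + 1] = count[v] + hist[v]
        s1[v + 1] = s1[v] + hist[v] * v
        s2[v + 1] = s2[v] + hist[v] * v * v
    below, above = {}, {}
    for p in planes:
        # sum of (p - x)^2 = n*p^2 - 2*p*sum(x) + sum(x^2) over the selected pixels
        n, a, b = count[p], s1[p], s2[p]
        below[p] = n * p * p - 2 * p * a + b
        n = count[256] - count[p + 1]
        a = s1[256] - s1[p + 1]
        b = s2[256] - s2[p + 1]
        above[p] = n * p * p - 2 * p * a + b
    return below, above


def aabb_lower_bound(below, above, lo, hi):
    """Squared distance from all pixels to the interval [lo, hi] of one component."""
    return below[lo] + above[hi]
```

Summing aabb_lower_bound over the red, green and blue components gives the 6-value lower bound described above, with two table lookups per component.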

Note: The AABB check seems to work faster than inserting a solution into the hash map. For this reason the AABB check is performed first.

Additional improvements: A few minor adjustments have been made in order to make sure that the texture decompression gives identical result to the original version of Crunch also for 32-bit builds (original Crunch library uses different floating point models for 32-bit and 64-bit builds).

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.861 sec
Modified: 1468204 bytes / 8.622 sec
Improvement: 7.21% (compression ratio) / 70.13% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.980 sec
Modified: 1914805 bytes / 11.294 sec
Improvement: 7.28% (compression ratio) / 69.46% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 15.529 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-10-13 17:20:31 +02:00
Alexander Suvorov 65f44319c0 Optimize computation of the endpoint cluster indices
This change improves the compression speed for both DXT and ETC encodings.

Explanation:

The vectors processed in the cluster indices computation step are the very same vectors that were used in the vector quantization step. This means that every processed vector already has a specific centroid associated with it. Even though the associated centroid is not necessarily the closest one to the processed vector, the distance to the associated centroid can be used as an upper bound on the distance to the closest centroid. This makes it possible to exit early while computing the distances to the other centroids.

Note: The modified algorithm is supposed to generate decompression result identical to the original version of Crunch. For this reason the centroid associated with a specific training vector is not used as an initial best solution, because it could potentially change the decompression result in cases when the processed training vector is equidistant from multiple centroids (selection of the closest centroid in such cases depends on the processing order).
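
A minimal sketch of the early out, under the assumption that the original search scans the centroids in order and keeps the first one with the smallest distance. The associated centroid only supplies the initial bound and is never taken as the initial best solution, so ties are resolved exactly as in a plain linear scan. Names are illustrative.

```python
def closest_centroid(vector, centroids, associated):
    """Index of the closest centroid; ties keep the lowest index."""
    # Upper bound taken from the centroid assigned during quantization.
    limit = sum((a - b) ** 2 for a, b in zip(vector, centroids[associated]))
    best_index = -1
    for index, centroid in enumerate(centroids):
        dist = 0
        for a, b in zip(vector, centroid):
            dist += (a - b) ** 2
            if dist > limit:
                break  # early out: this centroid cannot be the closest one
        else:
            # The full distance is within the bound. Accept it if nothing has
            # been accepted yet, or if it is strictly better than the best.
            if best_index < 0 or dist < limit:
                best_index, limit = index, dist
    return best_index
```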

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.847 sec
Modified: 1468204 bytes / 8.929 sec
Improvement: 7.21% (compression ratio) / 69.05% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.953 sec
Modified: 1914805 bytes / 11.651 sec
Improvement: 7.28% (compression ratio) / 68.47% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 15.695 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-10-10 17:13:41 +02:00
Alexander Suvorov 51f73fdfed Optimize vector quantization algorithm
This change improves the compression speed for both DXT and ETC encodings.

Explanation:
The main ideas used for optimization of the vector quantization algorithm:
- intermediate structures can store vector indices instead of the vector data, which minimizes the total amount of copied data when splitting a node (this is especially important for selector quantization, where processed vectors have 16 components)
- weighted vectors and weighted dot products can be cached

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.893 sec
Modified: 1468204 bytes / 9.310 sec
Improvement: 7.21% (compression ratio) / 67.78% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.942 sec
Modified: 1914805 bytes / 12.232 sec
Improvement: 7.28% (compression ratio) / 66.89% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 16.121 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
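
The reported bitrate is consistent with dividing the total size in bits by the total pixel count of the test set. A Python sketch of that check, assuming the standard Kodak set of 24 images with 768 x 512 pixels each (the constant and function names are hypothetical):

```python
KODAK_IMAGE_COUNT = 24          # assumption: the standard Kodak set
KODAK_PIXELS_PER_IMAGE = 768 * 512  # portrait images have the same pixel count


def average_bitrate(total_bytes, pixel_count):
    """Bits per pixel over the whole set (no mipmaps)."""
    return total_bytes * 8 / pixel_count


bpp = average_bitrate(1607858, KODAK_IMAGE_COUNT * KODAK_PIXELS_PER_IMAGE)
print(f"{bpp:.3f} bpp")
```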
2017-10-06 19:38:02 +02:00
Alexander Suvorov 205829da99 Optimize DXT color endpoint solution evaluation
This change improves the compression speed for DXT encoding.

Explanation:

In order to evaluate an endpoint solution, it is necessary to compute the sum of the squared distances from the source pixels to their nearest block colors, defined by the evaluated endpoint solution. Considering that we are looking for a solution with a minimal sum, the computation can be stopped as soon as the current sum is greater than or equal to the previously found best sum. An interesting observation here is that the performance improvement achieved by such an early-out approach depends on the order in which the source pixels are processed. It makes sense to process the pixels with the highest introduced errors first, as this significantly increases the chances of exiting the computation earlier.
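
The early-out evaluation can be sketched in Python as follows. This is an illustration under simplifying assumptions (uniform metric, block colors given explicitly), and evaluate_solution is a hypothetical name, not the Crunch function:

```python
def evaluate_solution(pixels, weights, block_colors, best_error):
    """Weighted squared error of a solution, or None if it cannot beat best_error."""
    total = 0
    for pixel, weight in zip(pixels, weights):
        # Squared distance to the nearest block color.
        nearest = min(
            sum((p - c) ** 2 for p, c in zip(pixel, color))
            for color in block_colors
        )
        total += weight * nearest
        if total >= best_error:
            return None  # early out: this solution is not better
    return total
```

The earlier the large weighted errors are accumulated, the sooner the early-out condition is reached, which is why the pixel order matters.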

On the one hand, equal source pixels are grouped together, so the computed distance from each unique source pixel is multiplied by its weight. For this reason it makes sense to first process the pixels with the highest weights, as their errors have the highest multipliers. On the other hand, the pixels which project onto the middle part of the endpoint interval have higher chances of being close to one of the block colors. For this reason it makes sense to first process the pixels whose projections are the most distant from the middle of the endpoint interval, as those pixels will normally introduce the highest errors. In order to combine those two aspects, it is proposed to sort the pixels according to the product of the weight and the distance from the projected pixel to the center of the endpoint interval.

Of course, reordering the pixels on each iteration would be very expensive and is not considered. However, there is a high chance that most endpoint intervals will be aligned in a similar way to the principal axis, and will have their centers close to the mean color. As soon as the principal axis is computed, it can be used as an approximation of all the future endpoint intervals, so the projection and reordering of the source pixels is performed only once.

Two approaches have been considered. In the first approach, the pixels have been sorted by the product of the weight and the absolute distance in decreasing order. In the second approach, the pixels have been sorted by the product of the weight and the signed distance, and then interleaved starting from the opposite sides of the ordered sequence. When tested on the Kodak image set, the interleaving approach shows better results.
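
A Python sketch of the second (interleaving) approach. The function name, the use of the mean color as the interval center, and the choice of which end of the sorted sequence comes first are assumptions made for illustration:

```python
def interleaved_order(pixels, weights, mean, axis):
    """Order pixel indices by weight * signed projection, taken alternately
    from the two ends of the sorted sequence."""
    def key(i):
        signed = sum((p - m) * a for p, m, a in zip(pixels[i], mean, axis))
        return weights[i] * signed

    ordered = sorted(range(len(pixels)), key=key)
    result = []
    lo, hi = 0, len(ordered) - 1
    while lo <= hi:
        result.append(ordered[hi])      # most positive remaining entry
        if lo != hi:
            result.append(ordered[lo])  # most negative remaining entry
        lo += 1
        hi -= 1
    return result


# One-component "pixels" keep the example easy to follow.
order = interleaved_order([(0,), (10,), (4,), (6,)], [1, 1, 1, 1], mean=(5,), axis=(1,))
```

Here the two pixels farthest from the center (indices 1 and 0) are visited first, followed by the two pixels near the center.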

Additional optimization: perceptual and uniform versions of the evaluation function are now implemented separately, which slightly improves the performance.

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.845 sec
Modified: 1468204 bytes / 10.071 sec
Improvement: 7.21% (compression ratio) / 65.09% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.929 sec
Modified: 1914805 bytes / 13.248 sec
Improvement: 7.28% (compression ratio) / 64.13% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 17.126 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-10-04 17:33:32 +02:00
Alexander Suvorov 4bd4355683 Optimize DXT endpoints computation
This change improves the compression speed for DXT encoding.

Explanation:

When performing per-component endpoint optimization, it is not necessary to go through all the source pixels on every iteration in order to calculate the total weighted squared error for a specific trial endpoint. The computation can be optimized in the following way:

sum(w(i) * (x - p(i)) * (x - p(i))) = sum(w(i) * x * x) - sum(w(i) * 2 * x * p(i)) + sum(w(i) * p(i) * p(i)) = sum(w(i)) * x * x - sum(2 * w(i) * p(i)) * x + sum(w(i) * p(i) * p(i))

The values of sum(w(i)), sum(2 * w(i) * p(i)), sum(w(i) * p(i) * p(i)) can be precalculated for each of 4 selectors, and only have to be updated when the solution improves. This way the error computation on each iteration can be performed using 12 multiplications instead of 2 * N (where N is the number of pixels in the processed cluster).
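
The identity can be checked with a short Python sketch for a single selector group. The function names are hypothetical and the sample values are made up for illustration:

```python
def precompute_sums(points, weights):
    """The three sums that stay fixed while trial endpoints are evaluated."""
    sum_w = sum(weights)
    sum_2wp = sum(2 * w * p for p, w in zip(points, weights))
    sum_wpp = sum(w * p * p for p, w in zip(points, weights))
    return sum_w, sum_2wp, sum_wpp


def fast_error(x, sums):
    """Weighted squared error for trial value x using 3 multiplications."""
    sum_w, sum_2wp, sum_wpp = sums
    return sum_w * x * x - sum_2wp * x + sum_wpp


def brute_force_error(x, points, weights):
    """Reference implementation that visits every pixel."""
    return sum(w * (x - p) * (x - p) for p, w in zip(points, weights))


points, weights = [10, 20, 30], [1, 2, 1]
sums = precompute_sums(points, weights)
```

Applying fast_error to each of the 4 selector groups gives the 12 multiplications mentioned above.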

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.811 sec
Modified: 1468204 bytes / 10.520 sec
Improvement: 7.21% (compression ratio) / 63.49% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.936 sec
Modified: 1914805 bytes / 13.902 sec
Improvement: 7.28% (compression ratio) / 62.36% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 17.121 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-09-13 14:10:00 +02:00
Alexander Suvorov 335f0ee056 Optimize DXT endpoints refinement
This change improves the compression speed for DXT encoding.

Explanation:

When creating the array of trial alpha endpoints, there is no need to use a bit array for tracking duplicate entries. Instead, the uniqueness of the endpoint pair can be determined using simple comparison operations. Moreover, it is not necessary to go through all the source pixels on every iteration in order to calculate the total squared error for a specific trial endpoint. Considering that selector values are not modified during the refinement step, each selector has a fixed set of pixels associated with it during optimization. This means that calculation of the total squared error can be optimized in the following way:

sum((x - p(i)) * (x - p(i))) = sum(x * x) - sum(2 * x * p(i)) + sum(p(i) * p(i)) = N * x * x - sum(2 * p(i)) * x + sum(p(i) * p(i))

As the set of pixels associated with a specific selector is fixed, the sum(2 * p(i)) and sum(p(i) * p(i)) values can be precalculated in advance. This means that error computation for each component now requires only (3 * S) multiplications instead of N (where N is the number of pixels in the processed cluster, and S is the number of selectors, equal to 4 for color components and 8 for alpha components).
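
A Python sketch of the per-selector bookkeeping for one component. Expanding (x - p(i)) * (x - p(i)) gives the cross term a negative sign, which is what the code uses. The function names and the sample data are assumptions for illustration:

```python
def selector_sums(values, selectors, num_selectors):
    """Per-selector [count, sum(2 * p), sum(p * p)], computed once."""
    sums = [[0, 0, 0] for _ in range(num_selectors)]
    for p, s in zip(values, selectors):
        sums[s][0] += 1
        sums[s][1] += 2 * p
        sums[s][2] += p * p
    return sums


def total_error(block_values, sums):
    """Total squared error when selector s decodes to block_values[s].

    Costs 3 multiplications per selector, independent of the pixel count.
    """
    return sum(
        n * x * x - sum_2p * x + sum_pp
        for x, (n, sum_2p, sum_pp) in zip(block_values, sums)
    )


values = [0, 10, 100, 110]
selectors = [0, 0, 1, 1]
sums = selector_sums(values, selectors, num_selectors=2)
error = total_error([5, 105], sums)
```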

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.864 sec
Modified: 1468204 bytes / 10.794 sec
Improvement: 7.21% (compression ratio) / 62.60% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.912 sec
Modified: 1914805 bytes / 14.244 sec
Improvement: 7.28% (compression ratio) / 61.41% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 17.125 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-09-12 16:50:03 +02:00
Alexander Suvorov 3053c9dd93 Optimize DXT endpoints computation
This change improves the compression speed for DXT encoding.

Explanation:

The main ideas used for the DXT endpoints computation optimization:
- Instead of using a map in the tree clusterizer, the source vectors can be stored in an array and sorted before the quantization. This might increase the amount of used memory, but is much more efficient in terms of memory reallocation.
- Endpoint caching can be used throughout the color endpoint computation, and not just within the optimize_endpoints function. The only place where endpoint caching cannot be used is the final step of the try_combinatorial_encoding function, where alternate rounding is used.
- When computing endpoint codebooks, endpoint optimizer and endpoint refiner can be reused, which eliminates unnecessary memory reallocations.
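
A Python sketch of the first idea, assuming the map was used to merge equal source vectors and accumulate their weights (this purpose is an assumption, and group_sorted is a hypothetical name):

```python
def group_sorted(vectors):
    """Merge equal vectors by sorting a flat array and scanning it once."""
    unique, weights = [], []
    for v in sorted(vectors):
        if unique and unique[-1] == v:
            weights[-1] += 1   # same vector as the previous one
        else:
            unique.append(v)
            weights.append(1)
    return unique, weights


unique, weights = group_sorted([(1, 2), (0, 0), (1, 2)])
```

The array is allocated once up front, whereas a map typically allocates a node per inserted key.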

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.879 sec
Modified: 1468204 bytes / 11.099 sec
Improvement: 7.21% (compression ratio) / 61.57% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.919 sec
Modified: 1914805 bytes / 14.621 sec
Improvement: 7.28% (compression ratio) / 60.40% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 17.108 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-09-12 13:03:56 +02:00
Alexander Suvorov 3e12aff909 Fix miscellaneous compiler warnings
DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.866 sec
Modified: 1468204 bytes / 11.858 sec
Improvement: 7.21% (compression ratio) / 58.92% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.878 sec
Modified: 1914805 bytes / 15.625 sec
Improvement: 7.28% (compression ratio) / 57.63% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 17.181 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-09-11 13:52:21 +02:00
Alexander Suvorov 6b3172f793 Optimize DXT color endpoints computation
This change significantly improves the compression speed for DXT encoding.

Explanation:

The main ideas used for the DXT color endpoints computation optimization:
- When the DXT endpoint computation function is called from the quantization algorithm, almost all of its input parameters (except the color metrics) are hardcoded in the quantization code. This makes it possible to optimize the endpoint evaluation function (which is the bottleneck of the endpoint computation algorithm) for this specific set of parameters.
- In the original version of the evaluation function, selectors are computed each time a new endpoint is evaluated. In fact, this is not necessary: some selector values are never used, so they can be computed lazily, based on the previously determined optimal endpoint values. This approach significantly reduces the amount of computation.

Other improvements:
- The original version of Crunch has a minor bug: the counter for the cached endpoint values does not get initialized. This results in nondeterministic DXT conversion of large textures, as the counter overflow can occur at a random moment. The issue is now fixed in the current branch.

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.893 sec
Modified: 1468204 bytes / 11.882 sec
Improvement: 7.21% (compression ratio) / 58.88% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.946 sec
Modified: 1914805 bytes / 15.628 sec
Improvement: 7.28% (compression ratio) / 57.70% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 17.352 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-08-11 13:12:44 +02:00
Alexander Suvorov bec4114bea Add compression support for ETC2A textures
This change makes it possible to use Crunch algorithms for ETC textures with Alpha channel.

Explanation:

For simplicity, Crunch algorithms currently do not use ETC2 specific modes (T, H or P). For this reason, the currently used ETC2A compression format is technically equivalent to ETC1 + Alpha. Note that ETC2 encoding is a superset of ETC1, so any texture, which consists of ETC1 color blocks and ETC2 Alpha blocks, can be correctly decoded by an ETC2A (ETC2_RGBA8) decoder.

Compression scheme for ETC2 Alpha blocks is equivalent to the compression scheme for DXT5 Alpha blocks. ETC2 Alpha endpoint clusterization is performed based on the very same output of the Alpha palettizer which is used for DXT5 Alpha. The only part which is actually different is the Alpha endpoint optimization step.

In order to perform ETC2 Alpha encoding, we can first run the already existing algorithm for DXT5 Alpha endpoint optimization, in order to obtain the initial approximate solution. Then the approximate solution is refined based on the ETC2 Alpha modifier table. When performing raw ETC2A encoding, all the 16 ETC2 Alpha modifiers are used during optimization. However, when performing ETC2A quantization, for performance reasons, only 2 Alpha modifiers are currently used (modifier 13, which allows precise approximation on short Alpha intervals, and modifier 11, which has more or less regularly distributed values and is used for large Alpha intervals).

For compatibility reasons, ETC2 color compression wrappers have also been added to the code, though, as has been mentioned before, at the current moment ETC2 specific modes are not used, so ETC2 color compression is currently equivalent to ETC1 compression.

The ETC decoder functionality has been significantly extended: Crunch is now capable of decoding ETC2 and ETC2A textures (input ETC2 textures can have T, H or P blocks).

In order to use ETC2A compression, use the -ETC2A command line option (e.g. "crunch_x64.exe -ETC2A input.png"). By default, compressed ETC2A textures will be decompressed into KTX file format.

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.880 sec
Modified: 1468204 bytes / 13.288 sec
Improvement: 7.21% (compression ratio) / 53.99% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.936 sec
Modified: 1914805 bytes / 18.044 sec
Improvement: 7.28% (compression ratio) / 51.15% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 17.361 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-08-04 16:56:10 +02:00
Alexander Suvorov 54d4084c68 Use XOR-deltas for selector codebook encoding
This change improves compression ratio for both DXT and ETC encodings.

Explanation:

When encoding the deltas between two pixel selectors, it is possible to use XOR-deltas instead of modulo-deltas. At first it might seem counterintuitive that XOR-delta can perform better than modulo-delta, as it does not reflect the continuity properties of the data that well. The actual trick here is that the encoded selectors are first sorted according to the used delta operation and the corresponding metric. The initial distance maps for the XOR-deltas have been obtained experimentally, using bitrate optimization on the test set of images. Additionally, ETC1 decoding has been optimized for speed: all the normal and flipped ETC1 selectors are now computed in advance.
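
A Python sketch of the XOR-delta itself, for illustration. It shows only the per-pixel delta operation and its inverse. The sorting step and the experimentally obtained distance maps are not reproduced, and the function names are hypothetical:

```python
def xor_delta_encode(previous, current):
    """Per-pixel XOR delta between two neighbouring codebook selectors."""
    return [a ^ b for a, b in zip(previous, current)]


def xor_delta_decode(previous, delta):
    """XOR is its own inverse, so decoding repeats the same operation."""
    return [a ^ d for a, d in zip(previous, delta)]


previous = [0, 1, 2, 3]
current = [3, 1, 0, 2]
delta = xor_delta_encode(previous, current)
restored = xor_delta_decode(previous, delta)
```

For 2-bit selectors every XOR delta stays in the range 0..3, so, as with modulo deltas, no symbol is ever impossible.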

Note: This modification alters the output file format and makes it incompatible with the previous revisions.

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.899 sec
Modified: 1468204 bytes / 13.353 sec
Improvement: 7.21% (compression ratio) / 53.79% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.985 sec
Modified: 1914805 bytes / 18.111 sec
Improvement: 7.28% (compression ratio) / 51.03% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1607858 bytes
Total time: 17.356 sec
Average bitrate: 1.363 bpp
Average Luma PSNR: 34.050 dB
2017-07-19 12:33:48 +02:00
Alexander Suvorov e972e0b480 Improve selector weight computation for ETC1 encoding
This change improves compression ratio for ETC1 encoding.

Explanation:

When computing endpoint weights for ETC1 encoding, it is possible to use delta luma instead of the Euclidean distance between the outer endpoint colors, as it gives approximately the same result.

When computing selector weight, it is important to take into account the following factors:
- The bigger the difference between the outer endpoint colors, the bigger the error that can be introduced by the corresponding selector, and therefore the bigger the weight of that selector should be. In the original Crunch algorithm, the selector weight is proportional to the squared distance between the outer endpoint colors. Such optimization improves PSNR, but it might also introduce significant distortion in smooth areas of the image. In order to mitigate this effect, it is proposed to limit the maximum difference between the endpoint colors (currently delta luma is limited to 100).
- Blocks with low difference between the outer endpoint colors introduce relatively small error, so their selectors should have smaller weights. In the original algorithm it is achieved by using squared distance between the outer endpoint colors, though the effect can be amplified further by using powers higher than 2 (currently it is set to 2.7), which improves PSNR.

In the original Crunch algorithm the encoding weights are initialized non-symmetrically (and are set to math::lerp(1.15f, 1.0f, 1.0f / 7.0f) for horizontal split and to math::lerp(1.15f, 1.0f, 2.0f / 7.0f) for vertical split). It is proposed to use the same encoding weight for both splits in case of ETC1 (the used coefficient 0.972 has been computed as math::lerp(1.15f, 1.0f, 1.5f / 7.0f) / 1.15f).
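
The coefficient can be reproduced with a short Python calculation, assuming math::lerp(a, b, t) evaluates a + (b - a) * t (this convention is an assumption, but it is the one that yields 0.972):

```python
def lerp(a, b, t):
    """Linear interpolation from a (t = 0) to b (t = 1)."""
    return a + (b - a) * t


horizontal = lerp(1.15, 1.0, 1.0 / 7.0)        # original weight, horizontal split
vertical = lerp(1.15, 1.0, 2.0 / 7.0)          # original weight, vertical split
symmetric = lerp(1.15, 1.0, 1.5 / 7.0) / 1.15  # shared ETC1 coefficient
print(f"{symmetric:.3f}")
```

The interpolation parameter 1.5 / 7.0 is the midpoint of the two original parameters.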

The ETC1 quantization parameters have been adjusted accordingly to preserve the average Luma PSNR.

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.843 sec
Modified: 1473711 bytes / 13.312 sec
Improvement: 6.86% (compression ratio) / 53.85% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.962 sec
Modified: 1920600 bytes / 18.122 sec
Improvement: 7.00% (compression ratio) / 50.97% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1612083 bytes
Total time: 17.351 sec
Average bitrate: 1.367 bpp
Average Luma PSNR: 34.050 dB
2017-07-17 18:07:42 +02:00
Alexander Suvorov a0044903aa Use diagonal endpoint references for ETC1 encoding
This change slightly improves compression ratio for ETC1 encoding, and also demonstrates how to adjust the endpoint reference configuration for a specific texture format.

Note: This modification alters the output file format for ETC1 encoding and makes it incompatible with the previous revisions.

Explanation:

In addition to the standard endpoint references (to the top and to the left ETC1 blocks), it is also possible to use an endpoint reference to the top-left diagonal neighbour ETC1 block. Specifically, the first ETC1 subblock will now have the reference value of 3 if the endpoint is copied from the second subblock of the top-left neighbour ETC1 block.

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.895 sec
Modified: 1473711 bytes / 13.356 sec
Improvement: 6.86% (compression ratio) / 53.78% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.979 sec
Modified: 1920600 bytes / 18.083 sec
Improvement: 7.00% (compression ratio) / 51.10% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1633207 bytes
Total time: 17.434 sec
Average bitrate: 1.384 bpp
Average Luma PSNR: 34.057 dB
2017-07-12 18:17:29 +02:00
Alexander Suvorov 7402f3d4f3 Use endpoint references for all the ETC1 subblocks
This change significantly improves compression ratio for ETC1 encoding.

Note: This modification alters the output file format for ETC1 encoding and makes it incompatible with the previous revisions.

Explanation:

Previously, for simplicity, endpoint references for ETC1 encoding were only computed within the tiling area. Now endpoint references are computed for all the ETC1 subblocks. This means that endpoints can now be inherited from the surrounding ETC1 blocks, which significantly improves the compression ratio.

Endpoint references for ETC1 subblocks are encoded in the following way:
- The first ETC1 subblock has the reference value of 0 if the endpoint is decoded from the input stream, the value of 1 if the endpoint is copied from the second subblock of the left neighbour ETC1 block, and the value of 2 if the endpoint is copied from the first subblock of the top neighbour ETC1 block.
- The second ETC1 subblock has the reference value of 0 if the endpoint is copied from the first subblock, the value of 1 if the endpoint is decoded from the input stream and the corresponding ETC1 block is split horizontally, and the value of 2 if the endpoint is decoded from the input stream and the corresponding ETC1 block is split vertically.
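
A Python sketch of how a decoder could interpret these reference values. The function name, the dictionary of already decoded blocks and the iterator standing in for the input stream are hypothetical and only illustrate the rules listed above:

```python
def decode_block_endpoints(decoded, bx, by, first_ref, second_ref, stream):
    """Resolve both subblock endpoints of the ETC1 block at (bx, by).

    decoded maps (bx, by) to the (first, second) endpoints of earlier blocks.
    Returns (first, second, split), where split is None when the second
    subblock copies the first one.
    """
    if first_ref == 0:
        first = next(stream)                  # decoded from the input stream
    elif first_ref == 1:
        first = decoded[(bx - 1, by)][1]      # second subblock of the left block
    elif first_ref == 2:
        first = decoded[(bx, by - 1)][0]      # first subblock of the top block
    else:
        raise ValueError("unknown first subblock reference")

    if second_ref == 0:
        second, split = first, None           # copied from the first subblock
    elif second_ref in (1, 2):
        second = next(stream)                 # decoded from the input stream
        split = "horizontal" if second_ref == 1 else "vertical"
    else:
        raise ValueError("unknown second subblock reference")

    decoded[(bx, by)] = (first, second)
    return first, second, split


decoded = {}
stream = iter(["A", "B", "C"])
block_a = decode_block_endpoints(decoded, 0, 0, 0, 1, stream)
block_b = decode_block_endpoints(decoded, 1, 0, 1, 0, stream)
block_c = decode_block_endpoints(decoded, 0, 1, 2, 2, stream)
```

Note that the second subblock reference also carries the split orientation, so no separate flip bit is needed in this scheme.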

DXT Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (revision ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.901 sec
Modified: 1473711 bytes / 13.353 sec
Improvement: 6.86% (compression ratio) / 53.80% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.997 sec
Modified: 1920600 bytes / 18.096 sec
Improvement: 7.00% (compression ratio) / 51.09% (compression time)

ETC Testing:

The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1639063 bytes
Total time: 17.421 sec
Average bitrate: 1.389 bpp
Average Luma PSNR: 34.057 dB
2017-07-11 15:36:48 +02:00
Alexander Suvorov e3c1c6baf6 Use modulo deltas for selector codebook encoding
This change improves compression ratio for both DXT and ETC encodings.

Explanation:

In the original version of Crunch, selector codebook is encoded with Huffman coding applied to the raw deltas between corresponding pixel selectors of the neighbour codebook elements. However, using Huffman coding for raw deltas has a downside. Specifically, for each individual pixel selector, only about a half of all the possible raw deltas are valid. Indeed, once the value of the current selector is determined, the selector delta depends only on the next selector value, so only N out of 2 * N - 1 total raw delta values are possible at any specific point. And yet, the impossible raw delta values are encoded with a non-zero probability, as the probability table is calculated throughout the whole codebook.

The situation can be improved by using modulo deltas instead of raw deltas (modulo 4 for color selectors and modulo 8 for alpha selectors). This eliminates the mentioned implicit restriction on the value of selector delta, and therefore improves the compression ratio. The distance maps are initialized using squared distances between the selector values (the distances are calculated on a wrapped interval, according to modular arithmetic).
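
A Python sketch contrasting raw and modulo deltas for color selectors (N = 4). The function names are hypothetical, and the wrapped distance is one plausible reading of "squared distances on a wrapped interval":

```python
def modulo_delta(previous, current, n):
    """Delta in the range 0..n-1, every value of which is always possible."""
    return (current - previous) % n


def decode_modulo_delta(previous, delta, n):
    return (previous + delta) % n


def wrapped_squared_distance(a, b, n):
    """Squared distance between selector values on a circle of length n."""
    d = abs(a - b) % n
    d = min(d, n - d)
    return d * d


N = 4
# Raw deltas span 2 * N - 1 values overall, but only N are reachable
# once the previous selector is known.
all_raw = {cur - prev for prev in range(N) for cur in range(N)}
raw_after_3 = {cur - 3 for cur in range(N)}
# Modulo deltas use exactly N symbols regardless of the previous selector.
mod_after_3 = {modulo_delta(3, cur, N) for cur in range(N)}
```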

DXT Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch (rev ea9b8d8).

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.870 sec
Modified: 1473711 bytes / 13.286 sec
Improvement: 6.86% (compression ratio) / 53.98% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.991 sec
Modified: 1920600 bytes / 18.035 sec
Improvement: 7.00% (compression ratio) / 51.24% (compression time)

ETC Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected in such a way, so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1681327 bytes
Total time: 17.403 sec
Average bitrate: 1.425 bpp
Average Luma PSNR: 34.057 dB
2017-07-10 13:05:10 +02:00
Alexander Suvorov 205f8a171d Use 4x4 selector dictionary for ETC1 compression
This change significantly improves the ETC1 compression ratio.

Explanation:

As shown in the previous commit, each element of the ETC1 endpoint dictionary should correspond to a single ETC1 base color. In order to achieve near-lossless compression with an unlimited dictionary, it has been proposed to use 4x2 or 2x4 ETC1 subblocks as building elements, each defined by a single endpoint and selector. This scheme is equivalent to the original DXT compression scheme, except for the different size of the block defined by the dictionary elements.

Now let's pay attention to the following interesting observation. Even though in the original DXT compression scheme the dictionaries are defined in such a way that both endpoints and selectors from the dictionaries correspond to the same size of the decoded block (in case of DXT it is 4x4), there is no requirement for this implied by the Crunch algorithms. In fact, the selector dictionary and indices are defined after the endpoint optimization is complete. At this point each image pixel is already associated with a specific endpoint. At the same time, the selector computation step only uses those per-pixel endpoint associations as input, so the size and the shape of the blocks defined by selector dictionary elements do not depend in any way on the size or shape of the blocks defined by endpoint dictionary elements.

In other words, the endpoint space of the texture can be split into one set of blocks, defined by the endpoint dictionary and endpoint indices, and the selector space of the texture can be split into a completely different set of blocks, defined by the selector dictionary and selector indices. Endpoint blocks can differ in size from the selector blocks, and endpoint blocks can overlap with the selector blocks in an arbitrary way, and such a setup will still be fully compatible with the existing Crunch algorithms.

In the current commit, the size of the block, defined by an ETC1 selector dictionary element, has been set to 4x4, which significantly improves the compression ratio (the ETC1 quantization parameters have been adjusted to preserve the average Luma PSNR).

Future research:
The discovered property of the Crunch algorithms opens another dimension for optimization of the compression ratio. Specifically, the quality of the compressed selectors can now be adjusted in two ways: by changing the size of the selector dictionary and by changing the size of the selector block. Note that both DXT and ETC formats have selectors encoded as plain bits in the output format, so there is no technical limitation on the size or shape of the selector block (though, for performance reasons, non-power-of-two selector blocks might require some specific optimizations in the decoder).

DXT Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.859 sec
Modified: 1482780 bytes / 13.326 sec
Improvement: 6.28% (compression ratio) / 53.82% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.996 sec
Modified: 1931586 bytes / 18.121 sec
Improvement: 6.47% (compression ratio) / 51.02% (compression time)

ETC Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1692204 bytes
Total time: 17.528 sec
Average bitrate: 1.434 bpp
Average Luma PSNR: 34.057 dB
2017-07-07 17:36:30 +02:00
Alexander Suvorov f284523b15 Add compression support for ETC1 textures
Explanation:

Crunch algorithms are normally used for compression of DXTn textures. However, Crunch algorithms are much more powerful, and with some minor adjustments, those algorithms can be directly used to compress other texture formats. For example, the current commit demonstrates how to use the existing Crunch algorithms to compress ETC1 textures.

Basics:

In general, Crunch is performing the following steps:
- tiling (determines block encodings)
- quantization of the tile endpoints (determines endpoint indices)
- optimization of the endpoints for each tile group (determines endpoint dictionary)
- quantization of the selectors (determines selector indices)
- selector refinement for each selector group (determines selector dictionary)
- compression of the previously determined block encodings, dictionaries and indices

Dictionary element:

When applying Crunch algorithms to a new texture format, it is necessary to first define the dictionary element. In the context of Crunch, this means that the whole image consists of smaller non-overlapping blocks, while the contents of each individual block are determined by an endpoint and a selector from the corresponding dictionaries. For example, in case of DXT format, each endpoint and selector codebook element corresponds to a 4x4 pixel block. In general, the size of the blocks, which form the encoded image, depends on the texture format and quality considerations.

It is proposed to define the dictionaries according to the following limitations:
- The dictionary elements should be compatible with the existing Crunch algorithms, while the image blocks defined by those dictionary elements should be compatible with the texture encoding format.
- It should be possible to cover a wide range of image quality and bitrates by just changing the size of the endpoint and selector dictionaries. If there is no limitation on the dictionary size, the encoding should preferably become lossless or near-lossless (not considering the quality loss implied by the texture format itself).

In case of ETC1, the texture format itself determines the minimal size of the image block, defined by endpoint and selector: it can be either 2x4 or 4x2 rectangle, aligned to the borders of the 4x4 grid. It is not possible to use higher granularity, because each of those rectangles can have only one base color, according to the ETC1 format. For the same reason, any image block, defined by an endpoint and a selector from the dictionary, should be combined from those aligned 2x4 or 4x2 rectangles.

Let's investigate the possibilities for the endpoint dictionary. According to the ETC1 format, each 4x4 ETC1 block is split in half, while each ETC1 subblock has its own base color and a modifier table index. In fact, the base color and the modifier table index simply define the high and the low colors for the subblock (while there are some limitations on the position of those high and low colors, implied by the ETC1 encoding). If we define the endpoint dictionary element in such a way that it contains information about more than one ETC1 base color, then such a dictionary will become incompatible with the existing tile quantization algorithm, and the reason for this is the following. The Crunch tiling algorithm first performs quantization of all the tile pixel colors, down to just 2 colors. Then it quantizes all those color pairs, coming from different tiles. This approach works quite well for 4x4 DXT blocks, as those 2 colors approximately represent the principal component of the tile pixel colors. In case of ETC1 however, mixing together pixels, which correspond to different base colors, does not make much sense, as each group of those pixels has its own low and high color values, independent from other groups. When those pixels are mixed together, the information about the original principal components of each subblock gets lost.

For the mentioned reason, each endpoint dictionary element should correspond to a single ETC1 base color. In such case, the tile quantization algorithm will work almost the same way as for DXT format. Each pair of colors, generated by the tile palettizer, will normally have the subblock base color value somewhere in the middle between those 2 colors, so quantizing those color pairs should also automatically quantize the corresponding base colors. Moreover, each color pair implicitly contains information about the modifier table index (which corresponds to the distance between the high and the low colors), and therefore the corresponding table index will also get automatically quantized.

Endpoint and selector dictionary elements, which define a single 2x4 or 4x2 ETC1 subblock, are fully compatible with the existing Crunch algorithms (because each ETC1 subblock is associated with a single base color and a single modifier table index). At the same time, those subblocks are the smallest possible blocks which can be defined by a dictionary element for ETC1 format (as has been shown earlier). Of course, it is also possible to use blocks larger than 2x4 or 4x2 (assuming that all the ETC1 subblocks, which form such a block, will have the same base color and the same modifier table index), however, with a larger block area it would not be possible to achieve near-lossless quality when the dictionary size is not limited.

As a result, it is proposed to define the dictionaries in the following way:
- Each element of the endpoint dictionary defines a single base color and a single modifier table index of a 2x4 or a 4x2 pixel block (which represents an ETC1 subblock).
- Each endpoint is encoded as 3555 (3 bits for the table index and 5 bits for each component of the base color).
- Each element of the selector dictionary defines selectors for a 2x4 or a 4x2 block.
- Each selector is encoded using 16 bits.
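The 3555 endpoint layout described above can be sketched in Python as follows. The bit order (table index in the top 3 bits, followed by R, G and B) and the function names are assumptions made for illustration; the commit message only states the field widths.

```python
def pack_3555(table_index, r, g, b):
    """Pack a 3-bit modifier table index and a 555 base color into 18 bits.

    Assumed layout (not confirmed by the source): [table:3][r:5][g:5][b:5].
    """
    assert 0 <= table_index < 8
    assert all(0 <= c < 32 for c in (r, g, b))
    return (table_index << 15) | (r << 10) | (g << 5) | b

def unpack_3555(value):
    """Inverse of pack_3555: returns (table_index, r, g, b)."""
    return (value >> 15) & 7, (value >> 10) & 31, (value >> 5) & 31, value & 31
```

Similarly, a selector dictionary element for a 2x4 or 4x2 subblock covers 8 pixels with 2 bits each, which gives the 16 bits mentioned above.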

ETC1-specific adjustments:

In case of DXT, the size of the encoded block is 4x4, while the tiling is performed in an 8x8 area (4 blocks). In case of ETC1, the tiling can be performed either in a 4x4 area (2 blocks), or in an 8x8 area (8 blocks), while other possibilities are either not symmetrical or too complex. For simplicity it is proposed to use a 4x4 area for tiling. There are therefore 3 possible encodings: the 4x4 block is not split (encoded with a single endpoint), the 4x4 block is split horizontally, the 4x4 block is split vertically.

For simplicity, endpoint references are currently determined only within the tiling area, while the encoding of the endpoint references has been adjusted in the following way:
- The first ETC1 subblock will always have the reference value of 0
- The second ETC1 subblock can have the reference value of 0 if it has the same endpoint as the first subblock (note that in such case the flip of the ETC1 block does not need to be defined), the value of 1 if the corresponding ETC1 block is split horizontally, and the value of 2 if the corresponding ETC1 block is split vertically
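The reference scheme above can be expressed as a small Python sketch. The function names and the tuple-based return values are hypothetical and chosen only for illustration.

```python
def encode_references(first_endpoint, second_endpoint, split_horizontally):
    """Return (first_ref, second_ref) for the two subblocks of one ETC1 block."""
    if second_endpoint == first_endpoint:
        # Same endpoint for both subblocks: the flip does not need to be defined.
        return 0, 0
    return 0, (1 if split_horizontally else 2)

def decode_second_reference(second_ref):
    """Return (shares_endpoint, split) where split is None, 'horizontal' or 'vertical'."""
    if second_ref == 0:
        return True, None
    return False, ("horizontal" if second_ref == 1 else "vertical")
```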

According to the ETC1 format, the base colors within an ETC1 block can be encoded either as 444 and 444, or differentially as 555 and 333. For simplicity, this aspect is currently not taken into account (all the endpoints are encoded as 3555 in the codebook). If it appears that the base colors in the resulting ETC1 block can not be encoded differentially, the decoder will convert both base colors from 555 to 444.
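As an illustration, the check for differential encodability can be sketched in Python. In the ETC1 differential mode the second base color is stored as a signed 3-bit per-channel offset from the first one, so each channel delta must lie in -4..3. The 555 to 444 conversion shown here (dropping the lowest bit) is only an assumption; the commit message does not say how the decoder rounds.

```python
def can_encode_differentially(base1, base2):
    """True if base2 is reachable from base1 with signed 3-bit deltas (555 + 333)."""
    return all(-4 <= c2 - c1 <= 3 for c1, c2 in zip(base1, base2))

def to_444(base):
    """Hypothetical 555 -> 444 conversion by truncation; the real decoder may differ."""
    return tuple(c >> 1 for c in base)

def choose_block_colors(base1, base2):
    """Return ('differential', base1, base2) or ('individual', base1_444, base2_444)."""
    if can_encode_differentially(base1, base2):
        return "differential", tuple(base1), tuple(base2)
    return "individual", to_444(base1), to_444(base2)
```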

At first, it might look like the ETC1 block flipping can bring some complications for Crunch, as the subblock structure might not look like a grid. This can be easily resolved by mirroring all the vertical ETC1 blocks across the main diagonal of the block after the tiling step (so that all the ETC1 subblocks will become 4x2 and form a regular grid). The decoder can mirror the ETC1 selector back according to the decoded block flip.
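Mirroring a 4x4 block across its main diagonal is a plain transposition, as the following Python sketch shows. The block is modelled as a list of 4 rows, which is an assumption made for illustration and not the representation used in Crunch.

```python
def mirror_across_main_diagonal(block):
    """Transpose a 4x4 block given as a list of 4 rows of 4 values."""
    return [[block[y][x] for y in range(4)] for x in range(4)]

# A block whose two subblocks are the left and right halves...
side_by_side = [[0, 0, 1, 1] for _ in range(4)]
# ...becomes a block whose two subblocks are the top and bottom halves.
stacked = mirror_across_main_diagonal(side_by_side)
```

Applying the same function a second time restores the original layout, which is what allows the decoder to mirror the selectors back.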

The code adjustments for the ETC1 compression support are pretty straightforward and mostly trivial. Just note that when format-specific adjustments affect performance critical code, it makes sense to duplicate the body of the affected function and perform format-specific optimizations in each copy of the function individually. For performance reasons, the following 4 functions now have both ETC-specific and DXT-specific versions:
- determine_tiles_task_etc() is an ETC-optimized version of the determine_tiles_task(), where dxt_fast class has been replaced with the etc1_optimizer class.
- determine_color_endpoint_codebook_task_etc() is an ETC-optimized version of the determine_color_endpoint_codebook_task(), where dxt1_endpoint_optimizer class has been replaced with the etc1_optimizer class.
- pack_color_endpoints_etc() is an ETC-optimized version of the pack_color_endpoints(), where 565565 DXT color endpoint encoding has been replaced with 3555 ETC color endpoint encoding.
- unpack_etc1() is an ETC version of the unpack_dxt1() function.

The color_quality_power_mul and m_adaptive_tile_color_psnr_derating parameters for ETC1 format have been selected so that ETC1 compression gives approximately the same average Luma PSNR as the equivalent DXT1 compression, when compressing the Kodak test set without mipmaps using default quality.

In order to use ETC1 compression, use the -ETC1 command line option (e.g. "crunch_x64.exe -ETC1 input.png"). By default, compressed ETC1 textures will be decompressed into KTX file format.

DXT Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps using DXT1 encoding]
Original: 1582222 bytes / 28.876 sec
Modified: 1482780 bytes / 13.255 sec
Improvement: 6.28% (compression ratio) / 54.10% (compression time)

[Compressing Kodak set with mipmaps using DXT1 encoding]
Original: 2065243 bytes / 36.987 sec
Modified: 1931586 bytes / 18.068 sec
Improvement: 6.47% (compression ratio) / 51.15% (compression time)

ETC Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). The ETC1 quantization parameters have been selected so that ETC1 compression gives approximately the same average Luma PSNR as the corresponding DXT1 compression (which is equal to 34.044 dB for the Kodak test set compressed without mipmaps using DXT1 encoding and default quality settings).

[Compressing Kodak set without mipmaps using ETC1 encoding]
Total size: 1887265 bytes
Total time: 14.954 sec
Average bitrate: 1.600 bpp
Average Luma PSNR: 34.049 dB
2017-07-05 18:19:23 +02:00
Alexander Suvorov 39b85b74c2 Optimize selector codebook creation algorithm
This change significantly improves compression speed.

Explanation:
When generating selector codebook, pixel selectors can be processed in groups, while the intermediate error results for those groups can be precalculated.

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.865 sec
Modified: 1482780 bytes / 13.340 sec
Improvement: 6.28% (compression ratio) / 53.78% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.988 sec
Modified: 1931586 bytes / 18.087 sec
Improvement: 6.47% (compression ratio) / 51.10% (compression time)
2017-06-16 14:55:32 +02:00
Alexander Suvorov eee6b26e5d Optimize endpoint and selector sorting algorithms
This change significantly improves compression speed.

Explanation:
The main ideas used for the endpoint and selector sorting optimization:
- unpacked color and alpha endpoints can be cached
- pixel selectors can be processed in groups, while the intermediate error results for those groups can be precalculated
- instead of maintaining the mask of the processed elements, the remaining elements can be reorganized to form a continuous block on each iteration (the last remaining element is moved into the position of the processed element)
- after optimization, endpoint sorting works significantly faster than endpoint reordering, so the overall performance can be improved by moving selector optimization into the endpoint sorting thread
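The third idea in the list above (keeping the remaining elements contiguous instead of maintaining a mask) is commonly known as "swap-remove". A minimal Python sketch, with a hypothetical function name:

```python
def swap_remove(items, index):
    """Remove items[index] in O(1) by moving the last element into its place.

    The order of the remaining elements is not preserved, but they always
    form a contiguous block, so no "processed" mask needs to be scanned.
    """
    last = items.pop()
    if index < len(items):
        removed = items[index]
        items[index] = last
        return removed
    return last  # the removed element was the last one
```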

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.863 sec
Modified: 1482780 bytes / 14.564 sec
Improvement: 6.28% (compression ratio) / 49.54% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.968 sec
Modified: 1931586 bytes / 19.717 sec
Improvement: 6.47% (compression ratio) / 46.66% (compression time)
2017-06-14 14:56:41 +02:00
Alexander Suvorov f1d6a5a735 Improve and optimize the endpoint reordering algorithm
This change significantly improves the compression ratio and compression speed.

Explanation:
After the endpoint codebook has been determined, the endpoints can be reordered in order to improve the compression ratio. On one hand, endpoint indices of the neighbor blocks should be similar, as the encoder compresses the deltas between those neighbor indices. On the other hand, the neighbor endpoints in the codebook should also be similar, as the encoder compresses the deltas between the color components of those neighbor endpoints. The optimization is based on Zeng's technique, using a weighted function which takes into account both similarity of the endpoint indices for the neighbor blocks and similarity of the neighbor endpoints in the codebook.

The similarity of the endpoint indices is optimized using the combined neighborhood frequency of the candidate endpoint and all the currently selected endpoints in the list. The similarity of the neighbor endpoints in the codebook is optimized using the Euclidean distance from the candidate endpoint to the extremity of the selected endpoints list. The original optimization function for the endpoint candidate (i) can be represented as:

F(i) = (total_neighborhood_frequency(i) + 1) * (endpoint_similarity(i) + 1)

The problem with this approach is the following. While the endpoint_similarity(i) has a limited range of values, the total_neighborhood_frequency(i) grows rapidly with the increasing size of the selected endpoints list. With each iteration this introduces additional imbalance for the weighted function. In order to minimize this effect, it is proposed to normalize the total_neighborhood_frequency(i) on each iteration. For computational simplicity, the normalizer is computed as the optimal total_neighborhood_frequency value from the previous iteration, multiplied by a constant. The modified optimization function can be represented as:

F(i) = (total_neighborhood_frequency(i) + total_neighborhood_frequency_normalizer) * (endpoint_similarity(i) + 1)
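The two scoring functions can be compared with a short Python sketch. The candidate representation, the function names and the numbers used are illustrative assumptions; the value of the normalizer constant is not given in the commit message.

```python
def score_original(frequency, similarity):
    """F(i) = (total_neighborhood_frequency(i) + 1) * (endpoint_similarity(i) + 1)"""
    return (frequency + 1) * (similarity + 1)

def score_normalized(frequency, similarity, normalizer):
    """F(i) with the frequency term offset by a per-iteration normalizer."""
    return (frequency + normalizer) * (similarity + 1)

def next_normalizer(best_frequency, constant):
    """Normalizer for the next iteration; 'constant' is a hypothetical tuning value."""
    return best_frequency * constant

def pick_best(candidates, normalizer):
    """candidates maps a candidate id to (frequency, similarity); returns the best id."""
    return max(candidates, key=lambda i: score_normalized(*candidates[i], normalizer))
```

For example, with candidates {"a": (100, 1), "b": (40, 3)}, a normalizer of 1 (which matches the original function) selects "a", while a normalizer of 100 selects "b", because the larger offset reduces the relative weight of the frequency term.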

The main ideas used for endpoint reordering optimization:
- all the computations, which are common for the endpoint reordering threads, have been moved outside of the threads
- the ordering histogram offsets, which point to the neighborhood frequency values for a specific endpoint, are now cached, which reduces the number of multiplications when accessing the histogram
- floating point operations have been replaced with integer operations

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.873 sec
Modified: 1482726 bytes / 15.791 sec
Improvement: 6.29% (compression ratio) / 45.31% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.925 sec
Modified: 1931475 bytes / 20.970 sec
Improvement: 6.48% (compression ratio) / 43.21% (compression time)
2017-06-09 19:14:41 +02:00
Alexander Suvorov 5822475b22 Completely remove all the chunk related code from the encoder and decoder
This change slightly improves compression speed and simplifies further modification of the code.

Explanation:
Additional performance boost is achieved by using linear representation for selectors and storing block selectors in a single uint32/uint64.
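A minimal Python sketch of the linear selector representation follows. It assumes 16 selectors per block, 2 bits each for color (32 bits in total) and 3 bits each for alpha (48 bits, which fits into a 64-bit value). The bit order and function names are assumptions made for illustration.

```python
def pack_selectors(selectors, bits):
    """Pack per-pixel selectors into one integer, pixel 0 in the lowest bits."""
    value = 0
    for i, s in enumerate(selectors):
        assert 0 <= s < (1 << bits)
        value |= s << (i * bits)
    return value

def get_selector(packed, i, bits):
    """Read back the selector of pixel i from the packed value."""
    return (packed >> (i * bits)) & ((1 << bits) - 1)
```

With this representation, comparing the selectors of two blocks is a single integer comparison rather than a loop over 16 pixels.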

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.927 sec
Modified: 1494501 bytes / 17.301 sec
Improvement: 5.54% (compression ratio) / 40.19% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.992 sec
Modified: 1945365 bytes / 22.548 sec
Improvement: 5.80% (compression ratio) / 39.05% (compression time)
2017-06-07 16:55:41 +02:00
Alexander Suvorov e7d458aa22 Switch from chunk encoding to block encoding while performing image quantization
This change improves compression speed and simplifies further modification of the code.

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.947 sec
Modified: 1494501 bytes / 17.642 sec
Improvement: 5.54% (compression ratio) / 39.05% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.965 sec
Modified: 1945365 bytes / 22.989 sec
Improvement: 5.80% (compression ratio) / 37.81% (compression time)
2017-06-02 18:13:49 +02:00
Alexander Suvorov cd9ba9b615 Switch from chunk encoding to block encoding after the tile computation
This change improves compression speed and simplifies further modification of the code.

Explanation:
This change is required for further optimization of the tile computation code. Additional performance boost is achieved by moving the tile palettizing into the tile computation thread.

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.928 sec
Modified: 1494501 bytes / 18.259 sec
Improvement: 5.54% (compression ratio) / 36.88% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.978 sec
Modified: 1945365 bytes / 23.857 sec
Improvement: 5.80% (compression ratio) / 35.48% (compression time)
2017-05-31 15:05:32 +02:00
Alexander Suvorov 7b6f456399 Optimize selector quantization, assignment and refinement
This change significantly improves compression speed.

Explanation:
The main ideas used for selector computations optimization:
- possible pixel values for each endpoint can be cached
- the distances between the possible pixel values and the actual pixels values within a block can be cached for fast error computation during selector assignment
- selector refinement can be efficiently integrated with the selector assignment, as it is based on the same set of cached error values
- using block encoding instead of chunk encoding for both endpoints and selectors eliminates extra levels of indirection
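The caching ideas above can be sketched in Python as follows. For brevity the sketch uses single-channel pixel values and a ready-made palette of "possible pixel values"; the names are hypothetical and the real code works on color and alpha data.

```python
def build_error_table(pixels, palette):
    """Cache squared distances: table[p][s] is the error of pixel p with selector s."""
    return [[(pixel - value) ** 2 for value in palette] for pixel in pixels]

def assign_selectors(table):
    """Pick the selector with the smallest cached error for every pixel."""
    return [min(range(len(row)), key=row.__getitem__) for row in table]

def total_error(table, selectors):
    """Error of an arbitrary selector vector, computed from cached values only."""
    return sum(row[s] for row, s in zip(table, selectors))
```

Because selector refinement evaluates candidate selector vectors against the same blocks, it can reuse the table built for assignment instead of recomputing distances.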

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.953 sec
Modified: 1494501 bytes / 19.667 sec
Improvement: 5.54% (compression ratio) / 32.07% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.998 sec
Modified: 1945365 bytes / 25.642 sec
Improvement: 5.80% (compression ratio) / 30.69% (compression time)
2017-05-19 19:55:10 +02:00
Alexander Suvorov b8349dfac8 Use block encoding to store intermediate selectors after endpoint quantization
This change simplifies further modification of the code.

Explanation:
This change is required for further optimization of the quantization code.

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.935 sec
Modified: 1494501 bytes / 24.528 sec
Improvement: 5.54% (compression ratio) / 15.23% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.982 sec
Modified: 1945365 bytes / 32.308 sec
Improvement: 5.80% (compression ratio) / 12.64% (compression time)
2017-05-18 13:44:04 +02:00
Alexander Suvorov 1ef829ed6f Move alpha endpoint refinement into the alpha endpoint optimization thread
This change improves compression speed when using alpha channel.

Explanation:
As the alpha endpoint refinement does not depend on the alpha selector codebook computation, it can be safely moved into the alpha endpoint optimization thread.

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.912 sec
Modified: 1494501 bytes / 24.128 sec
Improvement: 5.54% (compression ratio) / 16.55% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.985 sec
Modified: 1945365 bytes / 31.741 sec
Improvement: 5.80% (compression ratio) / 14.18% (compression time)
2017-05-12 14:06:53 +02:00
Alexander Suvorov 9c289fc621 Move color endpoint refinement into the color endpoint optimization thread
This change significantly improves compression speed.

Explanation:
If we take a closer look at the color endpoint refinement, we can see that the input for the color refinement comes directly from the color endpoint optimization step, while the selector codebook computation does not affect the color endpoint refinement at all. Therefore color endpoint refinement can be safely moved into the endpoint optimization thread.

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.899 sec
Modified: 1494501 bytes / 24.043 sec
Improvement: 5.54% (compression ratio) / 16.80% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.884 sec
Modified: 1945365 bytes / 31.586 sec
Improvement: 5.80% (compression ratio) / 14.36% (compression time)
2017-05-11 16:04:51 +02:00
Alexander Suvorov c9fd4dca75 Compute compressed endpoints size without pack simulation
This change improves compression speed.

Explanation:
While trying different remappings for the endpoint indices, there is no need to perform full pack simulation when using Huffman coding. Once the delta index histogram is generated, it is sufficient to simply multiply the code sizes by the corresponding frequencies in order to get the total size of the compressed endpoint indices stream. There is also no need to compute the rest of the compressed stream, as its size does not depend on the endpoint remapping and therefore is always constant, so it will not affect the size comparison during endpoint optimization.
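The size estimate can be sketched in Python as follows. This is a generic Huffman code length computation written for illustration and is not the Crunch implementation; the function names are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code_sizes(histogram):
    """Return {symbol: code length in bits} for a {symbol: frequency} histogram."""
    if len(histogram) == 1:
        return {sym: 1 for sym in histogram}
    # Heap entries: (frequency, unique tie-breaker, symbols in this subtree).
    heap = [(freq, i, [sym]) for i, (sym, freq) in enumerate(histogram.items())]
    heapq.heapify(heap)
    sizes = dict.fromkeys(histogram, 0)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, s1 = heapq.heappop(heap)
        f2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:
            sizes[sym] += 1  # every merge adds one bit to the symbols below it
        heapq.heappush(heap, (f1 + f2, tie, s1 + s2))
        tie += 1
    return sizes

def estimated_stream_bits(deltas):
    """Total Huffman-coded size: sum of code size times frequency, no packing."""
    histogram = Counter(deltas)
    sizes = huffman_code_sizes(histogram)
    return sum(freq * sizes[sym] for sym, freq in histogram.items())
```

Two candidate remappings can then be compared by calling estimated_stream_bits() on their delta streams, without producing any actual bit stream.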

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.864 sec
Modified: 1494501 bytes / 25.317 sec
Improvement: 5.54% (compression ratio) / 12.29% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.927 sec
Modified: 1945365 bytes / 33.151 sec
Improvement: 5.80% (compression ratio) / 10.23% (compression time)
2017-05-10 11:32:01 +02:00
Alexander Suvorov d0b6f5759b Switch from chunk encoding to block encoding after quantization
This change simplifies further modification of the code.

Explanation:
Considering that chunks are no longer used in the output format, it makes sense to also remove chunk related code from the intermediate processing. This modification also makes it possible to use endpoint references from the leftmost block to the rightmost block in the previous scanline (wrapped reference to the left).

Testing:
The modified algorithm has been tested on the Kodak test set using 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All the decompressed test images are identical to the images being compressed and decompressed using original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.846 sec
Modified: 1494501 bytes / 25.628 sec
Improvement: 5.54% (compression ratio) / 11.16% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.869 sec
Modified: 1945365 bytes / 33.497 sec
Improvement: 5.80% (compression ratio) / 9.15% (compression time)
2017-05-09 17:34:21 +02:00
Alexander Suvorov 5258727545 Remove duplicate endpoints and selectors from the codebooks
This change significantly improves the compression ratio.

Explanation:
By default, the size of the endpoint and selector codebooks is calculated based on the number of blocks in the image and the quality parameter, while the actual complexity of the image does not affect the initial codebook size. So the target codebook size is selected in such a way that even complex images can be approximated well enough. At the same time, normally, the lower the complexity of the image, the higher the density of the quantized vectors. Considering that vector quantization is performed using floating point computations, and the quantized endpoints have integer components, a high density of quantized vectors will result in a large number of duplicate endpoints. As a result, some identical endpoints are represented by multiple different indices, which significantly affects the compression ratio. Note that this is not the case for selectors, as their corresponding vector components are rounded after quantization, but instead it leads to some duplicate selectors in the codebook not being used. In the modified version of the algorithm all the duplicate codebook entries are merged together, unused entries are removed from the codebooks, and the endpoint and selector indices are updated accordingly.
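The merging step can be sketched in Python as follows. The sketch assumes codebook entries are hashable values (for example tuples); the function name is hypothetical.

```python
def deduplicate_codebook(codebook, indices):
    """Merge duplicate entries, drop unused ones and remap the indices.

    Only entries that are actually referenced by 'indices' are visited,
    so unused entries never reach the new codebook.
    """
    remap = {}          # entry value -> new index
    new_codebook = []
    new_indices = []
    for index in indices:
        entry = codebook[index]
        if entry not in remap:
            remap[entry] = len(new_codebook)
            new_codebook.append(entry)
        new_indices.append(remap[entry])
    return new_codebook, new_indices
```

For instance, with the codebook [(1, 2), (3, 4), (1, 2), (5, 6)] and the indices [0, 2, 1, 2, 0], entries 0 and 2 are merged and the unused entry 3 is dropped.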

Testing:
The modified algorithm has been tested on the Kodak test set using a 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All decompressed test images are identical to the images compressed and decompressed using the original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.835 sec
Modified: 1494630 bytes / 25.637 sec
Improvement: 5.54% (compression ratio) / 11.09% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.875 sec
Modified: 1946533 bytes / 33.546 sec
Improvement: 5.75% (compression ratio) / 9.03% (compression time)
2017-05-05 20:06:00 +02:00
Alexander Suvorov ef540e54de Encode raw selector indices instead of selector indices deltas
This change significantly improves compression ratio and compression speed.

Explanation:
The original version of Crunch encodes the differences between neighbouring indices in order to take advantage of their similarity. The efficiency of this approach depends heavily on the continuity of the encoded data. While neighbouring color and alpha endpoints are usually similar, this is usually not the case for selectors. In some situations encoding deltas for selector indices does make sense, for example when the image contains a lot of regular patterns (except the special case of completely flat areas, where selector deltas bring little advantage). Such situations are relatively rare, however, so it usually turns out to be more efficient to encode raw selector indices. Note that when deltas are not used for selector indices, remapping the selector indices no longer affects the size of the encoded selector index stream (at least when using Huffman coding). This makes the Zeng optimization step unnecessary, and it is sufficient to simply optimize the size of the packed selector codebook.
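
The trade-off between raw indices and deltas can be illustrated with a small estimate of the coding cost. This sketch makes several assumptions: zero-order entropy is used as an idealized stand-in for the Huffman-coded size, deltas are taken modulo the codebook size, and the helper names `entropy_bits` and `to_deltas` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Zero-order entropy of a symbol stream in bits (idealized coding cost).
inline double entropy_bits(const std::vector<uint32_t>& symbols) {
  std::map<uint32_t, size_t> counts;
  for (uint32_t s : symbols) ++counts[s];
  double bits = 0.0;
  for (const auto& kv : counts) {
    const double p = static_cast<double>(kv.second) / symbols.size();
    bits -= kv.second * std::log2(p);
  }
  return bits;
}

// Converts indices to differences from the previous index, modulo the
// codebook size (assumed convention; the first index is predicted from 0).
inline std::vector<uint32_t> to_deltas(const std::vector<uint32_t>& indices,
                                       uint32_t codebook_size) {
  assert(codebook_size > 0);
  std::vector<uint32_t> deltas;
  deltas.reserve(indices.size());
  uint32_t prev = 0;
  for (uint32_t i : indices) {
    assert(i < codebook_size);
    deltas.push_back((i + codebook_size - prev) % codebook_size);
    prev = i;
  }
  return deltas;
}
```

With a regular pattern such as {0, 1, 2, 3} the delta stream is cheaper than the raw stream, while for an irregular sequence such as {1, 4, 1, 1, 4, 1} the raw stream is cheaper, which matches the reasoning above.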

Note:
This modification alters the output file format and makes it incompatible with the previous revisions.

Testing:
The modified algorithm has been tested on the Kodak test set using a 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All decompressed test images are identical to the images compressed and decompressed using the original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.845 sec
Modified: 1521167 bytes / 26.048 sec
Improvement: 3.86% (compression ratio) / 9.70% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.949 sec
Modified: 1977373 bytes / 33.889 sec
Improvement: 4.25% (compression ratio) / 8.28% (compression time)
2017-05-05 11:26:52 +02:00
Alexander Suvorov 974fab40a5 Switch from the chunk encoding concept to the reference encoding concept
This change improves the compression ratio.

Explanation:
In the original version of Crunch all blocks are grouped into chunks of 2x2 blocks. Each chunk can have one of 8 different types. The type of the chunk determines which blocks inside the chunk share the same endpoints (for example, all the blocks inside the chunk share the same endpoints, or the blocks in the right column share the same endpoints, or all the blocks have different endpoints, etc.). Encoding endpoint equality is usually cheaper than encoding duplicate endpoint indices. The 8 chunk types do not cover all the possibilities, but they can be encoded efficiently using 0.75 bits per block (uncompressed).

The modified algorithm no longer uses the concept of chunks in the output file format and is based on an alternative approach. Endpoints for each block can be copied from the nearest block to the left (reference to the left), copied from the nearest block above (reference to the top), or decoded from the stream (reference to itself). Note that this is a superset of the original encoding, so all images previously encoded with the original algorithm can be losslessly transcoded into the new format, but not vice versa. Even though the new endpoint equality encoding is more expensive (about 1.58 bits per block, uncompressed), it provides more flexibility for endpoint matching inside the former "chunks" and, more importantly, it allows endpoints to be inherited from outside the former "chunks" (which is not possible with the original chunk encoding). The blocks are no longer grouped together and are encoded in the same order as they appear in the image.
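
A minimal sketch of how a decoder might resolve the three reference types is given below. It is an illustration only: the names `EndpointRef` and `resolve_endpoint_indices` are hypothetical, the references are assumed to be already decoded into an array, and the "nearest" left and upper blocks are assumed to be the directly adjacent ones. The quoted costs are consistent with 3 bits for 8 chunk types spread over 4 blocks (0.75 bits per block) and with log2(3), about 1.58 bits, for a three-way choice per block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference types; the actual bitstream layout is not shown here.
enum class EndpointRef : uint8_t { Self, Left, Top };

// Resolves one endpoint index per block, visiting blocks in raster order.
// "Self" consumes the next index from the stream; "Left" and "Top" copy the
// already resolved index of the adjacent block.
inline std::vector<uint32_t> resolve_endpoint_indices(
    size_t width, size_t height,
    const std::vector<EndpointRef>& refs,
    const std::vector<uint32_t>& stream) {
  assert(refs.size() == width * height);
  std::vector<uint32_t> result(width * height);
  size_t next = 0;
  for (size_t y = 0; y < height; ++y) {
    for (size_t x = 0; x < width; ++x) {
      const size_t i = y * width + x;
      switch (refs[i]) {
        case EndpointRef::Left:
          assert(x > 0);
          result[i] = result[i - 1];
          break;
        case EndpointRef::Top:
          assert(y > 0);
          result[i] = result[i - width];
          break;
        case EndpointRef::Self:
          assert(next < stream.size());
          result[i] = stream[next++];
          break;
      }
    }
  }
  return result;
}
```

In this sketch a block in the second row can inherit from any block above it, including one that would have belonged to a different chunk in the original scheme.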

Note:
This modification alters the output file format and makes it incompatible with the previous revisions.

Testing:
The modified algorithm has been tested on the Kodak test set using a 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All decompressed test images are identical to the images compressed and decompressed using the original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.903 sec
Modified: 1548791 bytes / 28.818 sec
Improvement: 2.11% (compression ratio) / 0.29% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.978 sec
Modified: 2017245 bytes / 36.846 sec
Improvement: 2.32% (compression ratio) / 0.36% (compression time)
2017-05-04 18:41:24 +02:00
Alexander Suvorov 178742ca6f Remove linear lists of endpoint and selector indices
Explanation:
After switching to ordering histograms, the linear lists of endpoint and selector indices are no longer used in the Zeng function and can therefore be removed.

Testing:
The modified algorithm has been tested on the Kodak test set using a 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All decompressed test images are identical to the images compressed and decompressed using the original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.872 sec
Modified: 1561622 bytes / 28.434 sec
Improvement: 1.30% (compression ratio) / 1.52% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.910 sec
Modified: 2033151 bytes / 36.369 sec
Improvement: 1.55% (compression ratio) / 1.47% (compression time)
2017-05-02 13:03:11 +02:00
Alexander Suvorov 125536a3b5 Use left nearest block for selector index prediction
This change improves compression ratio.

Explanation:
In the original algorithm the relative position of the block used for predicting the selector index of the currently decoded block depends on the position of the current block in the chunk. It can be a horizontal neighbour or a diagonal neighbour. Using the nearest left neighbour for selector index prediction for each block (except the blocks at the image borders) minimizes the average distance to the prediction block and therefore usually improves the selector index prediction. As with the endpoint index processing, the selector ordering histogram is now generated based on the selector index prediction order.

Note:
This modification alters the output file format and makes it incompatible with the previous revisions.

Testing:
The modified algorithm has been tested on the Kodak test set using a 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All decompressed test images are identical to the images compressed and decompressed using the original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.869 sec
Modified: 1561622 bytes / 28.522 sec
Improvement: 1.30% (compression ratio) / 1.20% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 37.038 sec
Modified: 2033151 bytes / 36.407 sec
Improvement: 1.55% (compression ratio) / 1.70% (compression time)
2017-04-28 16:55:54 +02:00
Alexander Suvorov a4ab9fedee Generate ordering histogram for endpoint indexes based on the prediction order
This change improves compression ratio.

Explanation:
The original histogram has been generated based on the linear order of encoded endpoint indexes. In the modified version of the algorithm, endpoint indexes are predicted using the nearest left block on the image, which is not necessarily the preceding block in the encoded sequence. Using the same block ordering both for prediction and Zeng optimization normally improves the compression ratio.
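
One possible way to build such a histogram from the prediction order is sketched below. The function name `build_ordering_histogram`, the symmetric matrix layout, and the decision to skip blocks in the first column are assumptions made for illustration; the actual data structure passed to the Zeng function is not shown in this commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns an n x n matrix (row-major) in which cell [a * n + b] counts how
// often indices a and b form a (left neighbour, current block) pair, in
// either direction. Pairs are taken from the image layout, not from the
// order in which the indices were written to the stream.
inline std::vector<uint32_t> build_ordering_histogram(
    size_t width, size_t height,
    const std::vector<uint32_t>& indices, size_t n) {
  assert(indices.size() == width * height);
  std::vector<uint32_t> hist(n * n, 0);
  for (size_t y = 0; y < height; ++y) {
    // Blocks at x == 0 have no left neighbour and are skipped in this sketch.
    for (size_t x = 1; x < width; ++x) {
      const uint32_t cur = indices[y * width + x];
      const uint32_t left = indices[y * width + x - 1];
      assert(cur < n && left < n);
      if (cur == left) continue;  // equal neighbours cost the same under any ordering
      ++hist[left * n + cur];
      ++hist[cur * n + left];
    }
  }
  return hist;
}
```

A reordering step can then try to place frequently paired indices close to each other, so that the predicted deltas stay small.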

Testing:
The modified algorithm has been tested on the Kodak test set using a 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All decompressed test images are identical to the images compressed and decompressed using the original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.905 sec
Modified: 1566133 bytes / 28.457 sec
Improvement: 1.02% (compression ratio) / 1.55% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 37.021 sec
Modified: 2040086 bytes / 36.300 sec
Improvement: 1.22% (compression ratio) / 1.95% (compression time)
2017-04-28 13:49:47 +02:00
Alexander Suvorov 19f05aadbc Prepare for encoding of endpoint and selector indexes in non-linear order
This change makes the compression scheme more flexible.

Explanation:
In the original scheme, indexes are encoded in linear order, which means that each index uses the previously encoded index for prediction. However, more sophisticated schemes might require arbitrary references into the stream of already encoded indexes. For this reason, the Zeng function has been modified to accept the ordering histogram as an input instead of the linear array of indexes. Note that the Zeng function itself does not rely on the indexes being encoded in linear order.

Testing:
The modified algorithm has been tested on the Kodak test set using a 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All decompressed test images are identical to the images compressed and decompressed using the original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.867 sec
Modified: 1570534 bytes / 28.524 sec
Improvement: 0.74% (compression ratio) / 1.19% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 37.001 sec
Modified: 2051509 bytes / 36.388 sec
Improvement: 0.67% (compression ratio) / 1.66% (compression time)
2017-04-28 11:32:14 +02:00
Alexander Suvorov 8cc5f19ae5 Use left nearest block for endpoint index prediction
This change improves compression ratio.

Explanation:
In the original algorithm the relative position of the block used for predicting the endpoint index of the currently decoded block depends on the chunk encoding type. It can be a horizontal neighbour, a vertical neighbour, a diagonal neighbour, or in some rare cases even a block at relative position (-2, 0) or (-3, 0). Using the nearest left neighbour for endpoint index prediction for each block (except the blocks at the image borders) minimizes the average distance to the prediction block and therefore usually improves the endpoint index prediction.

Note:
This modification alters the output file format and makes it incompatible with the previous revisions.

Testing:
The modified algorithm has been tested on the Kodak test set using a 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All decompressed test images are identical to the images compressed and decompressed using the original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.838 sec
Modified: 1570534 bytes / 28.629 sec
Improvement: 0.74% (compression ratio) / 0.72% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.977 sec
Modified: 2051509 bytes / 36.568 sec
Improvement: 0.67% (compression ratio) / 1.11% (compression time)
2017-04-27 15:49:48 +02:00
Alexander Suvorov 13b1faa48d Reorder chunks in each scanline in the left-to-right manner
This change slightly improves compression ratio and compression time.

Explanation:
The efficiency of the Crunch encoding scheme depends on the similarity between neighbouring chunks. For this reason, in the original version of Crunch the order of chunks is reversed after each scanline, so that there is no jump from one side of the image to the other at the image borders. The problem is that inside each chunk the blocks are ordered in the usual top-to-bottom, left-to-right manner, regardless of the chunk scanning order. While on the forward scan we normally need to perform diagonal jumps (+1, +1) in order to get to the next chunk, on the reverse scan we normally need to perform much larger (-3, +1) jumps, which usually defeats the advantage of having no discontinuity at the image borders.
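
The difference in jump size can be checked with a small model of one chunk row. This is a sketch under assumptions: the helper names are hypothetical, the blocks inside a chunk are assumed to be visited row by row, and only the horizontal component of each step is measured.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

struct Block {
  int x, y;
};

// Block visiting order for one row of 2x2 chunks. Chunks are visited
// left-to-right or right-to-left; blocks inside a chunk always keep the same
// internal order (assumed here to be row by row).
inline std::vector<Block> chunk_row_block_order(int chunks_per_row, bool reverse) {
  std::vector<Block> order;
  for (int i = 0; i < chunks_per_row; ++i) {
    const int cx = reverse ? chunks_per_row - 1 - i : i;
    for (int by = 0; by < 2; ++by)
      for (int bx = 0; bx < 2; ++bx)
        order.push_back({2 * cx + bx, by});
  }
  return order;
}

// Largest horizontal distance between two consecutively visited blocks.
inline int max_abs_dx(const std::vector<Block>& order) {
  int worst = 0;
  for (size_t i = 1; i < order.size(); ++i)
    worst = std::max(worst, std::abs(order[i].x - order[i - 1].x));
  return worst;
}
```

In this model the forward scan never moves more than one block horizontally between consecutive blocks, while the reverse scan moves three blocks each time it crosses a chunk boundary.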

Note:
This modification alters the output format and makes it incompatible with the previous revisions.

Testing:
The modified algorithm has been tested on the Kodak test set using a 64-bit build with default settings (running on Windows 10, i7-4790, 3.6GHz). All decompressed test images are identical to the images compressed and decompressed using the original version of Crunch.

[Compressing Kodak set without mipmaps]
Original: 1582222 bytes / 28.882 sec
Modified: 1579618 bytes / 28.743 sec
Improvement: 0.16% (compression ratio) / 0.48% (compression time)

[Compressing Kodak set with mipmaps]
Original: 2065243 bytes / 36.920 sec
Modified: 2061499 bytes / 36.833 sec
Improvement: 0.18% (compression ratio) / 0.24% (compression time)
2017-04-27 11:08:16 +02:00
Alexander Suvorov 5d09a511d5 Update .gitignore 2017-04-26 15:54:16 +02:00
Alexander Suvorov 1df47a4250 Remove big endian support, write barriers, byte streams and dxt1 decoding optimization from the decompression code
This change makes the code simpler to modify. The removed functionality might be reintroduced in the future if necessary.
2017-04-26 15:09:07 +02:00
Alexander Suvorov d34192aa07 Split the header block from crn_decomp.h into a separate crn_defs.h file. This change makes the CRND_HEADER_FILE_ONLY macro unnecessary. 2017-04-26 13:16:13 +02:00
Alexander Suvorov 7c02055d05 Reformat the source files. The source files have been reformatted using: clang-format.exe -style="{BasedOnStyle: Google, AllowAllParametersOfDeclarationOnNextLine: false, AllowShortFunctionsOnASingleLine: Inline, AllowShortIfStatementsOnASingleLine: false, AllowShortLoopsOnASingleLine: false, ColumnLimit: 0, DerivePointerAlignment: false, SortIncludes: false}" 2017-04-26 11:41:07 +02:00
Alexander Suvorov 41d7b962b0 Update solution to use Visual C++ 2010 compiler and libraries. When compiled with Visual Studio 2010, the code will produce the same results as the originally distributed Crunch binaries. 2017-04-26 10:59:07 +02:00
222 changed files with 68388 additions and 76390 deletions
+16 -1
@@ -1,2 +1,17 @@
*.o
crnlib/crunch
*.2010.vcxproj.user
*.2010.suo
/crnlib/crunch
/crnlib/Win32
/crnlib/x64
/crunch/Win32
/crunch/x64
/example1/Win32
/example1/x64
/example2/Win32
/example2/x64
/example3/Win32
/example3/x64
/lib
/bin/*
!bin/crunch_x64.exe
+2 -2
@@ -141,9 +141,9 @@ be reliably read by other tools.
This release contains the source code and projects for three simple
example projects:
crn_examples.2008.sln is a Visual Studio 2008 (VC9) solution file
crn_examples.2010.sln is a Visual Studio 2010 (VC10) solution file
containing projects for Win32 and x64. crnlib itself also builds with
VS2005, VS2010, and gcc 4.5.0 (TDM GCC+MinGW). A codeblocks 10.05
VS2005, VS2008, and gcc 4.5.0 (TDM GCC+MinGW). A codeblocks 10.05
workspace and project file is also included, but compiling crnlib this
way hasn't been tested much.
BIN
Binary file not shown.
BIN
Binary file not shown.
Binary file not shown.
Binary file not shown.
+4 -7
@@ -1,12 +1,9 @@

Microsoft Visual Studio Solution File, Format Version 10.00
# Visual Studio 2008
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "crunch", "crunch\crunch.2008.vcproj", "{8F645BA1-B996-49EB-859B-970A671DE05D}"
ProjectSection(ProjectDependencies) = postProject
{CF2E70E8-7133-4D96-92C7-68BB406C0664} = {CF2E70E8-7133-4D96-92C7-68BB406C0664}
EndProjectSection
Microsoft Visual Studio Solution File, Format Version 11.00
# Visual Studio 2010
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "crunch", "crunch\crunch.2010.vcxproj", "{8F645BA1-B996-49EB-859B-970A671DE05D}"
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "crnlib", "crnlib\crnlib.2008.vcproj", "{CF2E70E8-7133-4D96-92C7-68BB406C0664}"
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "crnlib", "crnlib\crnlib.2010.vcxproj", "{CF2E70E8-7133-4D96-92C7-68BB406C0664}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
@@ -1,11 +1,11 @@

Microsoft Visual Studio Solution File, Format Version 10.00
# Visual Studio 2008
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "example1", "example1\example1.2008.vcproj", "{8F745B42-F996-49EB-859B-970A671DE05D}"
Microsoft Visual Studio Solution File, Format Version 11.00
# Visual Studio 2010
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "example1", "example1\example1.2010.vcxproj", "{8F745B42-F996-49EB-859B-970A671DE05D}"
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "example2", "example2\example2.2008.vcproj", "{AF745B42-F996-49EB-859B-970A671DEF5E}"
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "example2", "example2\example2.2010.vcxproj", "{AF745B42-F996-49EB-859B-970A671DEF5E}"
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "example3", "example3\example3.2008.vcproj", "{AF745B42-E296-46EB-859B-970A671DEF5E}"
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "example3", "example3\example3.2010.vcxproj", "{AF745B42-E296-46EB-859B-970A671DEF5E}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
+71 -144
@@ -6,12 +6,9 @@
#define RECT_DEBUG
namespace crnlib
{
namespace crnlib {
static void area_fatal_error(const char* pFunc, const char* pMsg, ...)
{
pFunc;
static void area_fatal_error(const char*, const char* pMsg, ...) {
va_list args;
va_start(args, pMsg);
@@ -27,8 +24,7 @@ namespace crnlib
CRNLIB_FAIL(buf);
}
static Area * delete_area(Area_List *Plist, Area *Parea)
{
static Area* delete_area(Area_List* Plist, Area* Parea) {
Area *p, *q;
#ifdef RECT_DEBUG
@@ -48,27 +44,23 @@ namespace crnlib
return (q);
}
static Area * alloc_area(Area_List *Plist)
{
static Area* alloc_area(Area_List* Plist) {
Area* p = Plist->Pfree;
if (p == NULL)
{
if (p == NULL) {
if (Plist->next_free == Plist->total_areas)
area_fatal_error("alloc_area", "Out of areas!");
p = Plist->Phead + Plist->next_free;
Plist->next_free++;
}
else
} else
Plist->Pfree = p->Pnext;
return (p);
}
static Area* insert_area_before(Area_List* Plist, Area* Parea,
int x1, int y1, int x2, int y2)
{
int x1, int y1, int x2, int y2) {
Area *p, *Pnew_area = alloc_area(Plist);
p = Parea->Pprev;
@@ -89,8 +81,7 @@ namespace crnlib
}
static Area* insert_area_after(Area_List* Plist, Area* Parea,
int x1, int y1, int x2, int y2)
{
int x1, int y1, int x2, int y2) {
Area *p, *Pnew_area = alloc_area(Plist);
p = Parea->Pnext;
@@ -110,15 +101,13 @@ namespace crnlib
return (Pnew_area);
}
void Area_List_deinit(Area_List* Pobj_base)
{
void Area_List_deinit(Area_List* Pobj_base) {
Area_List* Plist = (Area_List*)Pobj_base;
if (!Plist)
return;
if (Plist->Phead)
{
if (Plist->Phead) {
crnlib_free(Plist->Phead);
Plist->Phead = NULL;
}
@@ -126,8 +115,7 @@ namespace crnlib
crnlib_free(Plist);
}
Area_List * Area_List_init(int max_areas)
{
Area_List* Area_List_init(int max_areas) {
Area_List* Plist = (Area_List*)crnlib_calloc(1, sizeof(Area_List));
Plist->total_areas = max_areas + 2;
@@ -147,12 +135,10 @@ namespace crnlib
return (Plist);
}
void Area_List_print(Area_List *Plist)
{
void Area_List_print(Area_List* Plist) {
Area* Parea = Plist->Phead->Pnext;
while (Parea != Plist->Ptail)
{
while (Parea != Plist->Ptail) {
printf("%04i %04i : %04i %04i\n", Parea->x1, Parea->y1, Parea->x2, Parea->y2);
Parea = Parea->Pnext;
@@ -160,8 +146,7 @@ namespace crnlib
}
Area_List* Area_List_dup_new(Area_List* Plist,
int x_ofs, int y_ofs)
{
int x_ofs, int y_ofs) {
int i;
Area_List* Pnew_list = (Area_List*)crnlib_calloc(1, sizeof(Area_List));
@@ -176,8 +161,7 @@ namespace crnlib
memcpy(Pnew_list->Phead, Plist->Phead, sizeof(Area) * Plist->total_areas);
for (i = 0; i < Plist->total_areas; i++)
{
for (i = 0; i < Plist->total_areas; i++) {
Pnew_list->Phead[i].Pnext = (Plist->Phead[i].Pnext == NULL) ? NULL : (Plist->Phead[i].Pnext - Plist->Phead) + Pnew_list->Phead;
Pnew_list->Phead[i].Pprev = (Plist->Phead[i].Pprev == NULL) ? NULL : (Plist->Phead[i].Pprev - Plist->Phead) + Pnew_list->Phead;
@@ -190,14 +174,12 @@ namespace crnlib
return (Pnew_list);
}
uint Area_List_get_num(Area_List* Plist)
{
uint Area_List_get_num(Area_List* Plist) {
uint num = 0;
Area* Parea = Plist->Phead->Pnext;
while (Parea != Plist->Ptail)
{
while (Parea != Plist->Ptail) {
num++;
Parea = Parea->Pnext;
@@ -207,8 +189,7 @@ namespace crnlib
}
void Area_List_dup(Area_List* Psrc_list, Area_List* Pdst_list,
int x_ofs, int y_ofs)
{
int x_ofs, int y_ofs) {
int i;
if (Psrc_list->total_areas != Pdst_list->total_areas)
@@ -220,10 +201,8 @@ namespace crnlib
memcpy(Pdst_list->Phead, Psrc_list->Phead, sizeof(Area) * Psrc_list->total_areas);
if ((x_ofs) || (y_ofs))
{
for (i = 0; i < Psrc_list->total_areas; i++)
{
if ((x_ofs) || (y_ofs)) {
for (i = 0; i < Psrc_list->total_areas; i++) {
Pdst_list->Phead[i].Pnext = (Psrc_list->Phead[i].Pnext == NULL) ? NULL : (Psrc_list->Phead[i].Pnext - Psrc_list->Phead) + Pdst_list->Phead;
Pdst_list->Phead[i].Pprev = (Psrc_list->Phead[i].Pprev == NULL) ? NULL : (Psrc_list->Phead[i].Pprev - Psrc_list->Phead) + Pdst_list->Phead;
@@ -232,11 +211,8 @@ namespace crnlib
Pdst_list->Phead[i].x2 += x_ofs;
Pdst_list->Phead[i].y2 += y_ofs;
}
}
else
{
for (i = 0; i < Psrc_list->total_areas; i++)
{
} else {
for (i = 0; i < Psrc_list->total_areas; i++) {
Pdst_list->Phead[i].Pnext = (Psrc_list->Phead[i].Pnext == NULL) ? NULL : (Psrc_list->Phead[i].Pnext - Psrc_list->Phead) + Pdst_list->Phead;
Pdst_list->Phead[i].Pprev = (Psrc_list->Phead[i].Pprev == NULL) ? NULL : (Psrc_list->Phead[i].Pprev - Psrc_list->Phead) + Pdst_list->Phead;
}
@@ -245,18 +221,15 @@ namespace crnlib
void Area_List_copy(
Area_List* Psrc_list, Area_List* Pdst_list,
int x_ofs, int y_ofs)
{
int x_ofs, int y_ofs) {
Area* Parea = Psrc_list->Phead->Pnext;
Area_List_clear(Pdst_list);
if ((x_ofs) || (y_ofs))
{
if ((x_ofs) || (y_ofs)) {
Area* Pprev_area = Pdst_list->Phead;
while (Parea != Psrc_list->Ptail)
{
while (Parea != Psrc_list->Ptail) {
// Area *p, *Pnew_area;
Area* Pnew_area;
@@ -280,9 +253,7 @@ namespace crnlib
}
Pprev_area->Pnext = Pdst_list->Ptail;
}
else
{
} else {
#if 0
while (Parea != Psrc_list->Ptail)
{
@@ -298,8 +269,7 @@ namespace crnlib
Area* Pprev_area = Pdst_list->Phead;
while (Parea != Psrc_list->Ptail)
{
while (Parea != Psrc_list->Ptail) {
// Area *p, *Pnew_area;
Area* Pnew_area;
@@ -326,16 +296,14 @@ namespace crnlib
}
}
void Area_List_clear(Area_List *Plist)
{
void Area_List_clear(Area_List* Plist) {
Plist->Phead->Pnext = Plist->Ptail;
Plist->Ptail->Pprev = Plist->Phead;
Plist->Pfree = NULL;
Plist->next_free = 2;
}
void Area_List_set(Area_List *Plist, int x1, int y1, int x2, int y2)
{
void Area_List_set(Area_List* Plist, int x1, int y1, int x2, int y2) {
Plist->Pfree = NULL;
Plist->Phead[2].x1 = x1;
@@ -353,8 +321,7 @@ namespace crnlib
}
void Area_List_remove(Area_List* Plist,
int x1, int y1, int x2, int y2)
{
int x1, int y1, int x2, int y2) {
int l, h;
Area* Parea = Plist->Phead->Pnext;
@@ -363,23 +330,19 @@ namespace crnlib
area_fatal_error("area_list_remove", "invalid coords: %i %i %i %i", x1, y1, x2, y2);
#endif
while (Parea != Plist->Ptail)
{
while (Parea != Plist->Ptail) {
// Not touching
if ((x2 < Parea->x1) || (x1 > Parea->x2) ||
(y2 < Parea->y1) || (y1 > Parea->y2))
{
(y2 < Parea->y1) || (y1 > Parea->y2)) {
Parea = Parea->Pnext;
continue;
}
// Completely covers
if ((x1 <= Parea->x1) && (x2 >= Parea->x2) &&
(y1 <= Parea->y1) && (y2 >= Parea->y2))
{
(y1 <= Parea->y1) && (y2 >= Parea->y2)) {
if ((x1 == Parea->x1) && (x2 == Parea->x2) &&
(y1 == Parea->y1) && (y2 == Parea->y2))
{
(y1 == Parea->y1) && (y2 == Parea->y2)) {
delete_area(Plist, Parea);
return;
}
@@ -390,16 +353,14 @@ namespace crnlib
}
// top
if (y1 > Parea->y1)
{
if (y1 > Parea->y1) {
insert_area_before(Plist, Parea,
Parea->x1, Parea->y1,
Parea->x2, y1 - 1);
}
// bottom
if (y2 < Parea->y2)
{
if (y2 < Parea->y2) {
insert_area_before(Plist, Parea,
Parea->x1, y2 + 1,
Parea->x2, Parea->y2);
@@ -409,16 +370,14 @@ namespace crnlib
h = math::minimum(y2, Parea->y2);
// left middle
if (x1 > Parea->x1)
{
if (x1 > Parea->x1) {
insert_area_before(Plist, Parea,
Parea->x1, l,
x1 - 1, h);
}
// right middle
if (x2 < Parea->x2)
{
if (x2 < Parea->x2) {
insert_area_before(Plist, Parea,
x2 + 1, l,
Parea->x2, h);
@@ -427,8 +386,7 @@ namespace crnlib
// early out - we know there's nothing else to remove, as areas can
// never overlap
if ((x1 >= Parea->x1) && (x2 <= Parea->x2) &&
(y1 >= Parea->y1) && (y2 <= Parea->y2))
{
(y1 >= Parea->y1) && (y2 <= Parea->y2)) {
delete_area(Plist, Parea);
return;
}
@@ -439,8 +397,7 @@ namespace crnlib
void Area_List_insert(Area_List* Plist,
int x1, int y1, int x2, int y2,
bool combine)
{
bool combine) {
Area* Parea = Plist->Phead->Pnext;
#ifdef RECT_DEBUG
@@ -448,20 +405,17 @@ namespace crnlib
area_fatal_error("Area_List_insert", "invalid coords: %i %i %i %i", x1, y1, x2, y2);
#endif
while (Parea != Plist->Ptail)
{
while (Parea != Plist->Ptail) {
// totally covers
if ((x1 <= Parea->x1) && (x2 >= Parea->x2) &&
(y1 <= Parea->y1) && (y2 >= Parea->y2))
{
(y1 <= Parea->y1) && (y2 >= Parea->y2)) {
Parea = delete_area(Plist, Parea);
continue;
}
// intersects
if ((x2 >= Parea->x1) && (x1 <= Parea->x2) &&
(y2 >= Parea->y1) && (y1 <= Parea->y2))
{
(y2 >= Parea->y1) && (y1 <= Parea->y2)) {
int ax1, ay1, ax2, ay2;
ax1 = Parea->x1;
@@ -484,21 +438,15 @@ namespace crnlib
return;
}
if (combine)
{
if ((x1 == Parea->x1) && (x2 == Parea->x2))
{
if ((y2 == Parea->y1 - 1) || (y1 == Parea->y2 + 1))
{
if (combine) {
if ((x1 == Parea->x1) && (x2 == Parea->x2)) {
if ((y2 == Parea->y1 - 1) || (y1 == Parea->y2 + 1)) {
delete_area(Plist, Parea);
Area_List_insert(Plist, x1, math::minimum(y1, Parea->y1), x2, math::maximum(y2, Parea->y2), CRNLIB_TRUE);
return;
}
}
else if ((y1 == Parea->y1) && (y2 == Parea->y2))
{
if ((x2 == Parea->x1 - 1) || (x1 == Parea->x2 + 1))
{
} else if ((y1 == Parea->y1) && (y2 == Parea->y2)) {
if ((x2 == Parea->x1 - 1) || (x1 == Parea->x2 + 1)) {
delete_area(Plist, Parea);
Area_List_insert(Plist, math::minimum(x1, Parea->x1), y1, math::maximum(x2, Parea->x2), y2, CRNLIB_TRUE);
return;
@@ -513,24 +461,20 @@ namespace crnlib
}
void Area_List_intersect_area(Area_List* Plist,
int x1, int y1, int x2, int y2)
{
int x1, int y1, int x2, int y2) {
Area* Parea = Plist->Phead->Pnext;
while (Parea != Plist->Ptail)
{
while (Parea != Plist->Ptail) {
// doesn't cover
if ((x2 < Parea->x1) || (x1 > Parea->x2) ||
(y2 < Parea->y1) || (y1 > Parea->y2))
{
(y2 < Parea->y1) || (y1 > Parea->y2)) {
Parea = delete_area(Plist, Parea);
continue;
}
// totally covers
if ((x1 <= Parea->x1) && (x2 >= Parea->x2) &&
(y1 <= Parea->y1) && (y2 >= Parea->y2))
{
(y1 <= Parea->y1) && (y2 >= Parea->y2)) {
Parea = Parea->Pnext;
continue;
}
@@ -591,23 +535,21 @@ namespace crnlib
#if 1
void Area_List_intersect_Area_List(Area_List* Pouter_list,
Area_List* Pinner_list,
Area_List *Pdst_list)
{
Area_List* Pdst_list) {
Area* Parea1 = Pouter_list->Phead->Pnext;
while (Parea1 != Pouter_list->Ptail)
{
while (Parea1 != Pouter_list->Ptail) {
Area* Parea2 = Pinner_list->Phead->Pnext;
int x1, y1, x2, y2;
x1 = Parea1->x1; x2 = Parea1->x2;
y1 = Parea1->y1; y2 = Parea1->y2;
x1 = Parea1->x1;
x2 = Parea1->x2;
y1 = Parea1->y1;
y2 = Parea1->y2;
while (Parea2 != Pinner_list->Ptail)
{
while (Parea2 != Pinner_list->Ptail) {
if ((x1 <= Parea2->x2) && (x2 >= Parea2->x1) &&
(y1 <= Parea2->y2) && (y2 >= Parea2->y1))
{
(y1 <= Parea2->y2) && (y2 >= Parea2->y1)) {
int nx1, ny1, nx2, ny2;
nx1 = math::maximum(x1, Parea2->x1);
@@ -615,36 +557,24 @@ namespace crnlib
nx2 = math::minimum(x2, Parea2->x2);
ny2 = math::minimum(y2, Parea2->y2);
if (Pdst_list->Phead->Pnext == Pdst_list->Ptail)
{
if (Pdst_list->Phead->Pnext == Pdst_list->Ptail) {
insert_area_after(Pdst_list, Pdst_list->Phead,
nx1, ny1, nx2, ny2);
}
else
{
} else {
Area_Ptr Ptemp = Pdst_list->Phead->Pnext;
if ((Ptemp->x1 == nx1) && (Ptemp->x2 == nx2))
{
if (Ptemp->y1 == (ny2+1))
{
if ((Ptemp->x1 == nx1) && (Ptemp->x2 == nx2)) {
if (Ptemp->y1 == (ny2 + 1)) {
Ptemp->y1 = ny1;
goto next;
}
else if (Ptemp->y2 == (ny1-1))
{
} else if (Ptemp->y2 == (ny1 - 1)) {
Ptemp->y2 = ny2;
goto next;
}
}
else if ((Ptemp->y1 == ny1) && (Ptemp->y2 == ny2))
{
if (Ptemp->x1 == (nx2+1))
{
} else if ((Ptemp->y1 == ny1) && (Ptemp->y2 == ny2)) {
if (Ptemp->x1 == (nx2 + 1)) {
Ptemp->x1 = nx1;
goto next;
}
else if (Ptemp->x2 == (nx1-1))
{
} else if (Ptemp->x2 == (nx1 - 1)) {
Ptemp->x2 = nx2;
goto next;
}
@@ -665,14 +595,12 @@ namespace crnlib
}
#endif
Area_List_Ptr Area_List_create_optimal(Area_List_Ptr Plist)
{
Area_List_Ptr Area_List_create_optimal(Area_List_Ptr Plist) {
Area_Ptr Parea = Plist->Phead->Pnext, Parea_after;
int num = 2;
Area_List_Ptr Pnew_list;
while (Parea != Plist->Ptail)
{
while (Parea != Plist->Ptail) {
num++;
Parea = Parea->Pnext;
}
@@ -683,8 +611,7 @@ namespace crnlib
Parea_after = Pnew_list->Phead;
while (Parea != Plist->Ptail)
{
while (Parea != Plist->Ptail) {
Parea_after = insert_area_after(Pnew_list, Parea_after,
Parea->x1, Parea->y1,
Parea->x2, Parea->y2);
+3 -6
@@ -2,10 +2,8 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
namespace crnlib
{
struct Area
{
namespace crnlib {
struct Area {
struct Area *Pprev, *Pnext;
int x1, y1, x2, y2;
@@ -17,8 +15,7 @@ namespace crnlib
typedef Area* Area_Ptr;
struct Area_List
{
struct Area_List {
int total_areas;
int next_free;
+6 -12
@@ -8,13 +8,11 @@
static bool g_fail_exceptions;
static bool g_exit_on_failure = true;
void crnlib_enable_fail_exceptions(bool enabled)
{
void crnlib_enable_fail_exceptions(bool enabled) {
g_fail_exceptions = enabled;
}
void crnlib_assert(const char* pExp, const char* pFile, unsigned line)
{
void crnlib_assert(const char* pExp, const char* pFile, unsigned line) {
char buf[512];
sprintf_s(buf, sizeof(buf), "%s(%u): Assertion failed: \"%s\"\n", pFile, line, pExp);
@@ -27,8 +25,7 @@ void crnlib_assert(const char* pExp, const char* pFile, unsigned line)
crnlib_debug_break();
}
void crnlib_fail(const char* pExp, const char* pFile, unsigned line)
{
void crnlib_fail(const char* pExp, const char* pFile, unsigned line) {
char buf[512];
sprintf_s(buf, sizeof(buf), "%s(%u): Failure: \"%s\"\n", pFile, line, pExp);
@@ -49,10 +46,8 @@ void crnlib_fail(const char* pExp, const char* pFile, unsigned line)
exit(EXIT_FAILURE);
}
void trace(const char* pFmt, va_list args)
{
if (crnlib_is_debugger_present())
{
void trace(const char* pFmt, va_list args) {
if (crnlib_is_debugger_present()) {
char buf[512];
vsprintf_s(buf, sizeof(buf), pFmt, args);
@@ -60,8 +55,7 @@ void trace(const char* pFmt, va_list args)
}
};
void trace(const char* pFmt, ...)
{
void trace(const char* pFmt, ...) {
va_list args;
va_start(args, pFmt);
trace(pFmt, args);
+20 -14
@@ -18,7 +18,10 @@ void crnlib_fail(const char* pExp, const char* pFile, unsigned line);
#define CRNLIB_VERIFY(_exp) (void)((!!(_exp)) || (crnlib_assert(#_exp, __FILE__, __LINE__), 0))
#define CRNLIB_FAIL(msg) do { crnlib_fail(#msg, __FILE__, __LINE__); } while(0)
#define CRNLIB_FAIL(msg) \
do { \
crnlib_fail(#msg, __FILE__, __LINE__); \
} while (0)
#define CRNLIB_ASSERT_OPEN_RANGE(x, l, h) CRNLIB_ASSERT((x >= l) && (x < h))
#define CRNLIB_ASSERT_CLOSED_RANGE(x, l, h) CRNLIB_ASSERT((x >= l) && (x <= h))
@@ -27,9 +30,14 @@ void trace(const char* pFmt, va_list args);
void trace(const char* pFmt, ...);
// Borrowed from boost libraries.
template <bool x> struct crnlib_assume_failure;
template <> struct crnlib_assume_failure<true> { enum { blah = 1 }; };
template<int x> struct crnlib_assume_try { };
template <bool x>
struct crnlib_assume_failure;
template <>
struct crnlib_assume_failure<true> {
enum { blah = 1 };
};
template <int x>
struct crnlib_assume_try {};
#define CRNLIB_JOINER_FINAL(a, b) a##b
#define CRNLIB_JOINER(a, b) CRNLIB_JOINER_FINAL(a, b)
@@ -37,24 +45,22 @@ template<int x> struct crnlib_assume_try { };
#define CRNLIB_ASSUME(p) typedef crnlib_assume_try<sizeof(crnlib_assume_failure<(bool)(p)>)> CRNLIB_JOIN(crnlib_assume_typedef, __COUNTER__)
#ifdef NDEBUG
template<typename T> inline T crnlib_assert_range(T i, T m)
{
m;
template <typename T>
inline T crnlib_assert_range(T i, T) {
return i;
}
template<typename T> inline T crnlib_assert_range_incl(T i, T m)
{
m;
template <typename T>
inline T crnlib_assert_range_incl(T i, T) {
return i;
}
#else
template<typename T> inline T crnlib_assert_range(T i, T m)
{
template <typename T>
inline T crnlib_assert_range(T i, T m) {
CRNLIB_ASSERT((i >= 0) && (i < m));
return i;
}
template<typename T> inline T crnlib_assert_range_incl(T i, T m)
{
template <typename T>
inline T crnlib_assert_range_incl(T i, T m) {
CRNLIB_ASSERT((i >= 0) && (i <= m));
return i;
}
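The `CRNLIB_ASSUME` macro above relies on a template that is only defined for `true`, so a false predicate fails to compile. A minimal stand-alone sketch of that idiom follows. All names prefixed `sketch_` or `SKETCH_` are hypothetical and not part of crnlib, and `__LINE__` is used here in place of `__COUNTER__` only to stay within standard C++.

```cpp
#include <cstddef>

// Declared for every bool, but defined only for <true>. Asking for
// sizeof(sketch_assume_failure<false>) is therefore a compile-time error.
template <bool Cond>
struct sketch_assume_failure;

template <>
struct sketch_assume_failure<true> {
  enum { value = 1 };
};

// Carrier type: its template argument forces the sizeof to be evaluated.
template <int N>
struct sketch_assume_try {};

// Two-level join so that __LINE__ is expanded before token pasting.
#define SKETCH_JOIN_FINAL(a, b) a##b
#define SKETCH_JOIN(a, b) SKETCH_JOIN_FINAL(a, b)

// Each use declares a uniquely named typedef (assumption: one use per line).
#define SKETCH_ASSUME(p)                                             \
  typedef sketch_assume_try<sizeof(sketch_assume_failure<(bool)(p)>)> \
      SKETCH_JOIN(sketch_assume_typedef_, __LINE__)

// These compile because both predicates are true.
SKETCH_ASSUME(sizeof(char) == 1);
SKETCH_ASSUME(sizeof(int) >= 2);

// Uncommenting the next line should stop compilation, because the <false>
// specialization is an incomplete type:
// SKETCH_ASSUME(sizeof(char) == 2);
```

On C++11 and later, `static_assert` covers the same need directly; the typedef trick mainly matters for older compilers.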
+24 -48
@@ -11,13 +11,11 @@
#endif
#if defined(__GNUC__) && CRNLIB_PLATFORM_PC
extern __inline__ __attribute__((__always_inline__,__gnu_inline__)) void crnlib_yield_processor()
{
extern __inline__ __attribute__((__always_inline__, __gnu_inline__)) void crnlib_yield_processor() {
__asm__ __volatile__("pause");
}
#else
CRNLIB_FORCE_INLINE void crnlib_yield_processor()
{
CRNLIB_FORCE_INLINE void crnlib_yield_processor() {
#if CRNLIB_USE_MSVC_INTRINSICS
#if CRNLIB_PLATFORM_PC_X64
_mm_pause();
@@ -37,57 +35,49 @@ CRNLIB_FORCE_INLINE void crnlib_yield_processor()
#endif
#endif // CRNLIB_USE_WIN32_ATOMIC_FUNCTIONS
namespace crnlib
{
namespace crnlib {
#if CRNLIB_USE_WIN32_ATOMIC_FUNCTIONS
typedef LONG atomic32_t;
typedef LONGLONG atomic64_t;
// Returns the original value.
inline atomic32_t atomic_compare_exchange32(atomic32_t volatile *pDest, atomic32_t exchange, atomic32_t comparand)
{
inline atomic32_t atomic_compare_exchange32(atomic32_t volatile* pDest, atomic32_t exchange, atomic32_t comparand) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return InterlockedCompareExchange(pDest, exchange, comparand);
}
// Returns the original value.
inline atomic64_t atomic_compare_exchange64(atomic64_t volatile *pDest, atomic64_t exchange, atomic64_t comparand)
{
inline atomic64_t atomic_compare_exchange64(atomic64_t volatile* pDest, atomic64_t exchange, atomic64_t comparand) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 7) == 0);
return _InterlockedCompareExchange64(pDest, exchange, comparand);
}
// Returns the resulting incremented value.
inline atomic32_t atomic_increment32(atomic32_t volatile *pDest)
{
inline atomic32_t atomic_increment32(atomic32_t volatile* pDest) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return InterlockedIncrement(pDest);
}
// Returns the resulting decremented value.
inline atomic32_t atomic_decrement32(atomic32_t volatile *pDest)
{
inline atomic32_t atomic_decrement32(atomic32_t volatile* pDest) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return InterlockedDecrement(pDest);
}
// Returns the original value.
inline atomic32_t atomic_exchange32(atomic32_t volatile *pDest, atomic32_t val)
{
inline atomic32_t atomic_exchange32(atomic32_t volatile* pDest, atomic32_t val) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return InterlockedExchange(pDest, val);
}
// Returns the resulting value.
inline atomic32_t atomic_add32(atomic32_t volatile *pDest, atomic32_t val)
{
inline atomic32_t atomic_add32(atomic32_t volatile* pDest, atomic32_t val) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return InterlockedExchangeAdd(pDest, val) + val;
}
// Returns the original value.
inline atomic32_t atomic_exchange_add32(atomic32_t volatile *pDest, atomic32_t val)
{
inline atomic32_t atomic_exchange_add32(atomic32_t volatile* pDest, atomic32_t val) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return InterlockedExchangeAdd(pDest, val);
}
@@ -96,50 +86,43 @@ namespace crnlib
typedef long long atomic64_t;
// Returns the original value.
inline atomic32_t atomic_compare_exchange32(atomic32_t volatile *pDest, atomic32_t exchange, atomic32_t comparand)
{
inline atomic32_t atomic_compare_exchange32(atomic32_t volatile* pDest, atomic32_t exchange, atomic32_t comparand) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return __sync_val_compare_and_swap(pDest, comparand, exchange);
}
// Returns the original value.
inline atomic64_t atomic_compare_exchange64(atomic64_t volatile *pDest, atomic64_t exchange, atomic64_t comparand)
{
inline atomic64_t atomic_compare_exchange64(atomic64_t volatile* pDest, atomic64_t exchange, atomic64_t comparand) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 7) == 0);
return __sync_val_compare_and_swap(pDest, comparand, exchange);
}
// Returns the resulting incremented value.
inline atomic32_t atomic_increment32(atomic32_t volatile *pDest)
{
inline atomic32_t atomic_increment32(atomic32_t volatile* pDest) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return __sync_add_and_fetch(pDest, 1);
}
// Returns the resulting decremented value.
inline atomic32_t atomic_decrement32(atomic32_t volatile *pDest)
{
inline atomic32_t atomic_decrement32(atomic32_t volatile* pDest) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return __sync_sub_and_fetch(pDest, 1);
}
// Returns the original value.
inline atomic32_t atomic_exchange32(atomic32_t volatile *pDest, atomic32_t val)
{
inline atomic32_t atomic_exchange32(atomic32_t volatile* pDest, atomic32_t val) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return __sync_lock_test_and_set(pDest, val);
}
// Returns the resulting value.
inline atomic32_t atomic_add32(atomic32_t volatile *pDest, atomic32_t val)
{
inline atomic32_t atomic_add32(atomic32_t volatile* pDest, atomic32_t val) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return __sync_add_and_fetch(pDest, val);
}
// Returns the original value.
inline atomic32_t atomic_exchange_add32(atomic32_t volatile *pDest, atomic32_t val)
{
inline atomic32_t atomic_exchange_add32(atomic32_t volatile* pDest, atomic32_t val) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return __sync_fetch_and_add(pDest, val);
}
@@ -150,8 +133,7 @@ namespace crnlib
typedef long atomic32_t;
typedef long long atomic64_t;
inline atomic32_t atomic_compare_exchange32(atomic32_t volatile *pDest, atomic32_t exchange, atomic32_t comparand)
{
inline atomic32_t atomic_compare_exchange32(atomic32_t volatile* pDest, atomic32_t exchange, atomic32_t comparand) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
atomic32_t cur = *pDest;
if (cur == comparand)
@@ -159,8 +141,7 @@ namespace crnlib
return cur;
}
inline atomic64_t atomic_compare_exchange64(atomic64_t volatile *pDest, atomic64_t exchange, atomic64_t comparand)
{
inline atomic64_t atomic_compare_exchange64(atomic64_t volatile* pDest, atomic64_t exchange, atomic64_t comparand) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 7) == 0);
atomic64_t cur = *pDest;
if (cur == comparand)
@@ -168,34 +149,29 @@ namespace crnlib
return cur;
}
inline atomic32_t atomic_increment32(atomic32_t volatile *pDest)
{
inline atomic32_t atomic_increment32(atomic32_t volatile* pDest) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return (*pDest += 1);
}
inline atomic32_t atomic_decrement32(atomic32_t volatile *pDest)
{
inline atomic32_t atomic_decrement32(atomic32_t volatile* pDest) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return (*pDest -= 1);
}
inline atomic32_t atomic_exchange32(atomic32_t volatile *pDest, atomic32_t val)
{
inline atomic32_t atomic_exchange32(atomic32_t volatile* pDest, atomic32_t val) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
atomic32_t cur = *pDest;
*pDest = val;
return cur;
}
inline atomic32_t atomic_add32(atomic32_t volatile *pDest, atomic32_t val)
{
inline atomic32_t atomic_add32(atomic32_t volatile* pDest, atomic32_t val) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
return (*pDest += val);
}
inline atomic32_t atomic_exchange_add32(atomic32_t volatile *pDest, atomic32_t val)
{
inline atomic32_t atomic_exchange_add32(atomic32_t volatile* pDest, atomic32_t val) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(pDest) & 3) == 0);
atomic32_t cur = *pDest;
*pDest += val;
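The wrappers in this file differ mainly in what they return: the original value (compare-exchange, exchange, exchange-add) or the resulting value (increment, decrement, add). The same conventions can be sketched with C++11 `std::atomic`. This is an illustration only: the `sketch_` functions are hypothetical and are not how crnlib implements them.

```cpp
#include <atomic>
#include <cstdint>

// Returns the original value. `exchange` is stored only if the current
// value equals `comparand`.
inline std::int32_t sketch_compare_exchange32(std::atomic<std::int32_t>& dest,
                                              std::int32_t exchange,
                                              std::int32_t comparand) {
  std::int32_t expected = comparand;
  dest.compare_exchange_strong(expected, exchange);
  // On success `expected` still holds the comparand, which was the old value.
  // On failure it has been overwritten with the value actually observed.
  return expected;
}

// Returns the resulting value (old value plus `val`).
inline std::int32_t sketch_add32(std::atomic<std::int32_t>& dest, std::int32_t val) {
  return dest.fetch_add(val) + val;
}

// Returns the original value (before `val` was added).
inline std::int32_t sketch_exchange_add32(std::atomic<std::int32_t>& dest, std::int32_t val) {
  return dest.fetch_add(val);
}
```

Note that the GCC `__sync_val_compare_and_swap` builtin takes the comparand before the new value, which is why the arguments appear swapped in the GCC branch of the diff compared with the Win32 branch.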
+23 -41
@@ -3,43 +3,36 @@
#pragma once
#include "crn_data_stream.h"
namespace crnlib
{
class buffer_stream : public data_stream
{
namespace crnlib {
class buffer_stream : public data_stream {
public:
buffer_stream() :
data_stream(),
buffer_stream()
: data_stream(),
m_pBuf(NULL),
m_size(0),
m_ofs(0)
{
m_ofs(0) {
}
buffer_stream(void* p, uint size) :
data_stream(),
buffer_stream(void* p, uint size)
: data_stream(),
m_pBuf(NULL),
m_size(0),
m_ofs(0)
{
m_ofs(0) {
open(p, size);
}
buffer_stream(const void* p, uint size) :
data_stream(),
buffer_stream(const void* p, uint size)
: data_stream(),
m_pBuf(NULL),
m_size(0),
m_ofs(0)
{
m_ofs(0) {
open(p, size);
}
virtual ~buffer_stream()
{
virtual ~buffer_stream() {
}
bool open(const void* p, uint size)
{
bool open(const void* p, uint size) {
CRNLIB_ASSERT(p);
close();
@@ -55,8 +48,7 @@ namespace crnlib
return true;
}
bool open(void* p, uint size)
{
bool open(void* p, uint size) {
CRNLIB_ASSERT(p);
close();
@@ -72,10 +64,8 @@ namespace crnlib
return true;
}
virtual bool close()
{
if (m_opened)
{
virtual bool close() {
if (m_opened) {
m_opened = false;
m_pBuf = NULL;
m_size = 0;
@@ -91,8 +81,7 @@ namespace crnlib
virtual const void* get_ptr() const { return m_pBuf; }
virtual uint read(void* pBuf, uint len)
{
virtual uint read(void* pBuf, uint len) {
CRNLIB_ASSERT(pBuf && (len <= 0x7FFFFFFF));
if ((!m_opened) || (!is_readable()) || (!len))
@@ -112,8 +101,7 @@ namespace crnlib
return len;
}
virtual uint write(const void* pBuf, uint len)
{
virtual uint write(const void* pBuf, uint len) {
CRNLIB_ASSERT(pBuf && (len <= 0x7FFFFFFF));
if ((!m_opened) || (!is_writable()) || (!len))
@@ -133,24 +121,21 @@ namespace crnlib
return len;
}
virtual bool flush()
{
virtual bool flush() {
if (!m_opened)
return false;
return true;
}
virtual uint64 get_size()
{
virtual uint64 get_size() {
if (!m_opened)
return 0;
return m_size;
}
virtual uint64 get_remaining()
{
virtual uint64 get_remaining() {
if (!m_opened)
return 0;
@@ -159,16 +144,14 @@ namespace crnlib
return m_size - m_ofs;
}
virtual uint64 get_ofs()
{
virtual uint64 get_ofs() {
if (!m_opened)
return 0;
return m_ofs;
}
virtual bool seek(int64 ofs, bool relative)
{
virtual bool seek(int64 ofs, bool relative) {
if ((!m_opened) || (!is_seekable()))
return false;
@@ -193,4 +176,3 @@ namespace crnlib
};
} // namespace crnlib
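The hunks above show the signatures and guard checks of `buffer_stream` but elide most method bodies. As a rough illustration of the general pattern (a fixed memory block, a current offset, and reads clamped to the remaining bytes), here is a hypothetical stand-alone reader. It is an assumption about how such a class typically behaves, not a reconstruction of the elided crnlib code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical read-only view over a caller-owned memory block.
class sketch_memory_reader {
 public:
  sketch_memory_reader(const void* p, std::size_t size)
      : m_pBuf(static_cast<const unsigned char*>(p)), m_size(size), m_ofs(0) {}

  // Bytes left between the current offset and the end of the block.
  std::size_t remaining() const { return m_size - m_ofs; }

  // Copies at most `len` bytes, clamped to what remains, and advances the
  // offset. Returns the number of bytes actually copied.
  std::size_t read(void* pDst, std::size_t len) {
    const std::size_t n = std::min(len, remaining());
    if (n) {
      std::memcpy(pDst, m_pBuf + m_ofs, n);
      m_ofs += n;
    }
    return n;
  }

  // Absolute seek. Offsets past the end of the block are rejected.
  bool seek(std::size_t ofs) {
    if (ofs > m_size)
      return false;
    m_ofs = ofs;
    return true;
  }

 private:
  const unsigned char* m_pBuf;
  std::size_t m_size;
  std::size_t m_ofs;
};
```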
+30 -56
@@ -3,41 +3,33 @@
#pragma once
#include "crn_data_stream.h"
namespace crnlib
{
class cfile_stream : public data_stream
{
namespace crnlib {
class cfile_stream : public data_stream {
public:
cfile_stream() : data_stream(), m_pFile(NULL), m_size(0), m_ofs(0), m_has_ownership(false)
{
cfile_stream()
: data_stream(), m_pFile(NULL), m_size(0), m_ofs(0), m_has_ownership(false) {
}
cfile_stream(FILE* pFile, const char* pFilename, uint attribs, bool has_ownership) :
data_stream(), m_pFile(NULL), m_size(0), m_ofs(0), m_has_ownership(false)
{
cfile_stream(FILE* pFile, const char* pFilename, uint attribs, bool has_ownership)
: data_stream(), m_pFile(NULL), m_size(0), m_ofs(0), m_has_ownership(false) {
open(pFile, pFilename, attribs, has_ownership);
}
cfile_stream(const char* pFilename, uint attribs = cDataStreamReadable | cDataStreamSeekable, bool open_existing = false) :
data_stream(), m_pFile(NULL), m_size(0), m_ofs(0), m_has_ownership(false)
{
cfile_stream(const char* pFilename, uint attribs = cDataStreamReadable | cDataStreamSeekable, bool open_existing = false)
: data_stream(), m_pFile(NULL), m_size(0), m_ofs(0), m_has_ownership(false) {
open(pFilename, attribs, open_existing);
}
virtual ~cfile_stream()
{
virtual ~cfile_stream() {
close();
}
virtual bool close()
{
virtual bool close() {
clear_error();
if (m_opened)
{
if (m_opened) {
bool status = true;
if (m_has_ownership)
{
if (m_has_ownership) {
if (EOF == fclose(m_pFile))
status = false;
}
@@ -54,8 +46,7 @@ namespace crnlib
return false;
}
bool open(FILE* pFile, const char* pFilename, uint attribs, bool has_ownership)
{
bool open(FILE* pFile, const char* pFilename, uint attribs, bool has_ownership) {
CRNLIB_ASSERT(pFile);
CRNLIB_ASSERT(pFilename);
@@ -76,8 +67,7 @@ namespace crnlib
return true;
}
bool open(const char* pFilename, uint attribs = cDataStreamReadable | cDataStreamSeekable, bool open_existing = false)
{
bool open(const char* pFilename, uint attribs = cDataStreamReadable | cDataStreamSeekable, bool open_existing = false) {
CRNLIB_ASSERT(pFilename);
close();
@@ -91,8 +81,7 @@ namespace crnlib
pMode = open_existing ? "ab" : "wb";
else if (is_readable())
pMode = "rb";
else
{
else {
set_error();
return false;
}
@@ -101,8 +90,7 @@ namespace crnlib
crn_fopen(&pFile, pFilename, pMode);
m_has_ownership = true;
if (!pFile)
{
if (!pFile) {
set_error();
return false;
}
@@ -114,8 +102,7 @@ namespace crnlib
FILE* get_file() const { return m_pFile; }
virtual uint read(void* pBuf, uint len)
{
virtual uint read(void* pBuf, uint len) {
CRNLIB_ASSERT(pBuf && (len <= 0x7FFFFFFF));
if (!m_opened || (!is_readable()) || (!len))
@@ -123,8 +110,7 @@ namespace crnlib
len = static_cast<uint>(math::minimum<uint64>(len, get_remaining()));
if (fread(pBuf, 1, len, m_pFile) != len)
{
if (fread(pBuf, 1, len, m_pFile) != len) {
set_error();
return 0;
}
@@ -133,15 +119,13 @@ namespace crnlib
return len;
}
virtual uint write(const void* pBuf, uint len)
{
virtual uint write(const void* pBuf, uint len) {
CRNLIB_ASSERT(pBuf && (len <= 0x7FFFFFFF));
if (!m_opened || (!is_writable()) || (!len))
return 0;
if (fwrite(pBuf, 1, len, m_pFile) != len)
{
if (fwrite(pBuf, 1, len, m_pFile) != len) {
set_error();
return 0;
}
@@ -152,13 +136,11 @@ namespace crnlib
return len;
}
virtual bool flush()
{
virtual bool flush() {
if ((!m_opened) || (!is_writable()))
return false;
if (EOF == fflush(m_pFile))
{
if (EOF == fflush(m_pFile)) {
set_error();
return false;
}
@@ -166,16 +148,14 @@ namespace crnlib
return true;
}
virtual uint64 get_size()
{
virtual uint64 get_size() {
if (!m_opened)
return 0;
return m_size;
}
virtual uint64 get_remaining()
{
virtual uint64 get_remaining() {
if (!m_opened)
return 0;
@@ -183,16 +163,14 @@ namespace crnlib
return m_size - m_ofs;
}
virtual uint64 get_ofs()
{
virtual uint64 get_ofs() {
if (!m_opened)
return 0;
return m_ofs;
}
virtual bool seek(int64 ofs, bool relative)
{
virtual bool seek(int64 ofs, bool relative) {
if ((!m_opened) || (!is_seekable()))
return false;
@@ -202,10 +180,8 @@ namespace crnlib
else if (static_cast<uint64>(new_ofs) > m_size)
return false;
if (static_cast<uint64>(new_ofs) != m_ofs)
{
if (crn_fseek(m_pFile, new_ofs, SEEK_SET) != 0)
{
if (static_cast<uint64>(new_ofs) != m_ofs) {
if (crn_fseek(m_pFile, new_ofs, SEEK_SET) != 0) {
set_error();
return false;
}
@@ -216,16 +192,14 @@ namespace crnlib
return true;
}
static bool read_file_into_array(const char* pFilename, vector<uint8>& buf)
{
static bool read_file_into_array(const char* pFilename, vector<uint8>& buf) {
cfile_stream in_stream(pFilename);
if (!in_stream.is_opened())
return false;
return in_stream.read_array(buf);
}
static bool write_array_to_file(const char* pFilename, const vector<uint8>& buf)
{
static bool write_array_to_file(const char* pFilename, const vector<uint8>& buf) {
cfile_stream out_stream(pFilename, cDataStreamWritable | cDataStreamSeekable);
if (!out_stream.is_opened())
return false;
+4 -9
@@ -1,11 +1,9 @@
// File: crn_checksum.cpp
#include "crn_core.h"
namespace crnlib
{
namespace crnlib {
// From the public domain stb.h header.
uint adler32(const void* pBuf, size_t buflen, uint adler32)
{
uint adler32(const void* pBuf, size_t buflen, uint adler32) {
const uint8* buffer = static_cast<const uint8*>(pBuf);
const unsigned long ADLER_MOD = 65521;
@@ -38,13 +36,11 @@ namespace crnlib
return (s2 << 16) + s1;
}
uint16 crc16(const void* pBuf, size_t len, uint16 crc)
{
uint16 crc16(const void* pBuf, size_t len, uint16 crc) {
crc = ~crc;
const uint8* p = reinterpret_cast<const uint8*>(pBuf);
while (len)
{
while (len) {
const uint16 q = *p++ ^ (crc >> 8);
crc <<= 8U;
uint16 r = (q >> 4) ^ q;
@@ -60,4 +56,3 @@ namespace crnlib
}
} // namespace crnlib
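For reference, the Adler-32 checksum touched in this file can be written in a few lines when speed is not a concern. The sketch below is a plain, unoptimized version that reduces modulo 65521 (the same `ADLER_MOD` constant shown above) after every byte. The name `adler32_sketch` is hypothetical, and the real function, taken from stb.h, is presumably structured differently for performance.

```cpp
#include <cstddef>
#include <cstdint>

// Unoptimized Adler-32. `adler` is the running checksum; 1 is the initial
// value, matching cInitAdler32 in crn_checksum.h.
inline std::uint32_t adler32_sketch(const void* pBuf, std::size_t len,
                                    std::uint32_t adler = 1U) {
  const std::uint32_t kMod = 65521U;  // largest prime below 2^16
  const unsigned char* p = static_cast<const unsigned char*>(pBuf);

  std::uint32_t s1 = adler & 0xFFFFU;  // running sum of bytes
  std::uint32_t s2 = adler >> 16;      // running sum of the s1 values

  for (std::size_t i = 0; i < len; i++) {
    s1 = (s1 + p[i]) % kMod;
    s2 = (s2 + s1) % kMod;
  }
  return (s2 << 16) + s1;
}
```

Because the state is just the pair (s1, s2), a buffer can be checksummed in pieces by passing the previous result back in as `adler`.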
+1 -2
@@ -1,8 +1,7 @@
// File: crn_checksum.h
#pragma once
namespace crnlib
{
namespace crnlib {
const uint cInitAdler32 = 1U;
uint adler32(const void* pBuf, size_t buflen, uint adler32 = cInitAdler32);
+59 -123
@@ -3,22 +3,18 @@
#pragma once
#include "crn_matrix.h"
namespace crnlib
{
namespace crnlib {
template <typename VectorType>
class clusterizer
{
class clusterizer {
public:
clusterizer() :
m_overall_variance(0.0f),
clusterizer()
: m_overall_variance(0.0f),
m_split_index(0),
m_heap_size(0),
m_quick(false)
{
m_quick(false) {
}
void clear()
{
void clear() {
m_training_vecs.clear();
m_codebook.clear();
m_nodes.clear();
@@ -28,20 +24,17 @@ namespace crnlib
m_quick = false;
}
void reserve_training_vecs(uint num_expected)
{
void reserve_training_vecs(uint num_expected) {
m_training_vecs.reserve(num_expected);
}
void add_training_vec(const VectorType& v, uint weight)
{
void add_training_vec(const VectorType& v, uint weight) {
m_training_vecs.push_back(std::make_pair(v, weight));
}
typedef bool (*progress_callback_func_ptr)(uint percentage_completed, void* pData);
bool generate_codebook(uint max_size, progress_callback_func_ptr pProgress_callback = NULL, void* pProgress_data = NULL, bool quick = false)
{
bool generate_codebook(uint max_size, progress_callback_func_ptr pProgress_callback = NULL, void* pProgress_data = NULL, bool quick = false) {
if (m_training_vecs.empty())
return false;
@@ -52,8 +45,7 @@ namespace crnlib
vq_node root;
root.m_vectors.reserve(m_training_vecs.size());
for (uint i = 0; i < m_training_vecs.size(); i++)
{
for (uint i = 0; i < m_training_vecs.size(); i++) {
const VectorType& v = m_training_vecs[i].first;
const uint weight = m_training_vecs[i].second;
@@ -85,8 +77,7 @@ namespace crnlib
m_right_children.reserve(m_training_vecs.size() + 1);
int prev_percentage = -1;
while ((total_leaves < max_size) && (m_heap_size))
{
while ((total_leaves < max_size) && (m_heap_size)) {
int worst_node_index = m_heap[1];
m_heap[1] = m_heap[m_heap_size];
@@ -97,11 +88,9 @@ namespace crnlib
split_node(worst_node_index);
total_leaves++;
if ((pProgress_callback) && ((total_leaves & 63) == 0) && (max_size))
{
if ((pProgress_callback) && ((total_leaves & 63) == 0) && (max_size)) {
int cur_percentage = (total_leaves * 100U + (max_size / 2U)) / max_size;
if (cur_percentage != prev_percentage)
{
if (cur_percentage != prev_percentage) {
if (!(*pProgress_callback)(cur_percentage, pProgress_data))
return false;
@@ -114,11 +103,9 @@ namespace crnlib
m_overall_variance = 0.0f;
for (uint i = 0; i < m_nodes.size(); i++)
{
for (uint i = 0; i < m_nodes.size(); i++) {
vq_node& node = m_nodes[i];
if (node.m_left != -1)
{
if (node.m_left != -1) {
CRNLIB_ASSERT(node.m_right != -1);
continue;
}
@@ -149,33 +136,27 @@ namespace crnlib
inline float get_overall_variance() const { return m_overall_variance; }
inline uint get_codebook_size() const
{
inline uint get_codebook_size() const {
return m_codebook.size();
}
inline const VectorType& get_codebook_entry(uint index) const
{
inline const VectorType& get_codebook_entry(uint index) const {
return m_codebook[index];
}
VectorType& get_codebook_entry(uint index)
{
VectorType& get_codebook_entry(uint index) {
return m_codebook[index];
}
typedef crnlib::vector<VectorType> vector_vec_type;
inline const vector_vec_type& get_codebook() const
{
inline const vector_vec_type& get_codebook() const {
return m_codebook;
}
uint find_best_codebook_entry(const VectorType& v) const
{
uint find_best_codebook_entry(const VectorType& v) const {
uint cur_node_index = 0;
for ( ; ; )
{
for (;;) {
const vq_node& cur_node = m_nodes[cur_node_index];
if (cur_node.m_left == -1)
@@ -194,12 +175,10 @@ namespace crnlib
}
}
const VectorType& find_best_codebook_entry(const VectorType& v, uint max_codebook_size) const
{
const VectorType& find_best_codebook_entry(const VectorType& v, uint max_codebook_size) const {
uint cur_node_index = 0;
for ( ; ; )
{
for (;;) {
const vq_node& cur_node = m_nodes[cur_node_index];
if ((cur_node.m_left == -1) || ((cur_node.m_codebook_index + 1) >= (int)max_codebook_size))
@@ -218,16 +197,13 @@ namespace crnlib
}
}
uint find_best_codebook_entry_fs(const VectorType& v) const
{
uint find_best_codebook_entry_fs(const VectorType& v) const {
float best_dist = math::cNearlyInfinite;
uint best_index = 0;
for (uint i = 0; i < m_codebook.size(); i++)
{
for (uint i = 0; i < m_codebook.size(); i++) {
float dist = m_codebook[i].squared_distance(v);
if (dist < best_dist)
{
if (dist < best_dist) {
best_dist = dist;
best_index = i;
if (best_dist == 0.0f)
@@ -238,8 +214,7 @@ namespace crnlib
return best_index;
}
void retrieve_clusters(uint max_clusters, crnlib::vector< crnlib::vector<uint> >& clusters) const
{
void retrieve_clusters(uint max_clusters, crnlib::vector<crnlib::vector<uint> >& clusters) const {
clusters.resize(0);
clusters.reserve(max_clusters);
@@ -248,12 +223,10 @@ namespace crnlib
uint cur_node_index = 0;
for ( ; ; )
{
for (;;) {
const vq_node& cur_node = m_nodes[cur_node_index];
if ( (cur_node.is_leaf()) || ((cur_node.m_codebook_index + 2) > (int)max_clusters) )
{
if ((cur_node.is_leaf()) || ((cur_node.m_codebook_index + 2) > (int)max_clusters)) {
clusters.resize(clusters.size() + 1);
clusters.back() = cur_node.m_vectors;
@@ -272,9 +245,9 @@ namespace crnlib
private:
training_vec_array m_training_vecs;
struct vq_node
{
vq_node() : m_centroid(cClear), m_total_weight(0), m_left(-1), m_right(-1), m_codebook_index(-1), m_unsplittable(false) { }
struct vq_node {
vq_node()
: m_centroid(cClear), m_total_weight(0), m_left(-1), m_right(-1), m_codebook_index(-1), m_unsplittable(false) {}
VectorType m_centroid;
uint64 m_total_weight;
@@ -308,16 +281,14 @@ namespace crnlib
bool m_quick;
void insert_heap(uint node_index)
{
void insert_heap(uint node_index) {
const float variance = m_nodes[node_index].m_variance;
uint pos = ++m_heap_size;
if (m_heap_size >= m_heap.size())
m_heap.resize(m_heap_size + 1);
for ( ; ; )
{
for (;;) {
uint parent = pos >> 1;
if (!parent)
break;
@@ -334,17 +305,14 @@ namespace crnlib
m_heap[pos] = node_index;
}
void down_heap(uint pos)
{
void down_heap(uint pos) {
uint child;
uint orig = m_heap[pos];
const float orig_variance = m_nodes[orig].m_variance;
while ((child = (pos << 1)) <= m_heap_size)
{
if (child < m_heap_size)
{
while ((child = (pos << 1)) <= m_heap_size) {
if (child < m_heap_size) {
if (m_nodes[m_heap[child]].m_variance < m_nodes[m_heap[child + 1]].m_variance)
child++;
}
@@ -360,18 +328,15 @@ namespace crnlib
m_heap[pos] = orig;
}
void compute_split_estimate(VectorType& left_child_res, VectorType& right_child_res, const vq_node& parent_node)
{
void compute_split_estimate(VectorType& left_child_res, VectorType& right_child_res, const vq_node& parent_node) {
VectorType furthest(0);
double furthest_dist = -1.0f;
for (uint i = 0; i < parent_node.m_vectors.size(); i++)
{
for (uint i = 0; i < parent_node.m_vectors.size(); i++) {
const VectorType& v = m_training_vecs[parent_node.m_vectors[i]].first;
double dist = v.squared_distance(parent_node.m_centroid);
if (dist > furthest_dist)
{
if (dist > furthest_dist) {
furthest_dist = dist;
furthest = v;
}
@@ -380,13 +345,11 @@ namespace crnlib
VectorType opposite(0);
double opposite_dist = -1.0f;
for (uint i = 0; i < parent_node.m_vectors.size(); i++)
{
for (uint i = 0; i < parent_node.m_vectors.size(); i++) {
const VectorType& v = m_training_vecs[parent_node.m_vectors[i]].first;
double dist = v.squared_distance(furthest);
if (dist > opposite_dist)
{
if (dist > opposite_dist) {
opposite_dist = dist;
opposite = v;
}
@@ -396,10 +359,8 @@ namespace crnlib
right_child_res = (opposite + parent_node.m_centroid) * .5f;
}
void compute_split_pca(VectorType& left_child_res, VectorType& right_child_res, const vq_node& parent_node)
{
if (parent_node.m_vectors.size() == 2)
{
void compute_split_pca(VectorType& left_child_res, VectorType& right_child_res, const vq_node& parent_node) {
if (parent_node.m_vectors.size() == 2) {
left_child_res = m_training_vecs[parent_node.m_vectors[0]].first;
right_child_res = m_training_vecs[parent_node.m_vectors[1]].first;
return;
@@ -410,8 +371,7 @@ namespace crnlib
matrix<N, N, float> covar;
covar.clear();
for (uint i = 0; i < parent_node.m_vectors.size(); i++)
{
for (uint i = 0; i < parent_node.m_vectors.size(); i++) {
const VectorType v(m_training_vecs[parent_node.m_vectors[i]].first - parent_node.m_centroid);
const VectorType w(v * (float)m_training_vecs[parent_node.m_vectors[i]].second);
@@ -433,22 +393,19 @@ namespace crnlib
VectorType axis; //(1.0f);
if (N == 1)
axis.set(1.0f);
else
{
else {
for (uint i = 0; i < N; i++)
axis[i] = math::lerp(.75f, 1.25f, i * (1.0f / math::maximum<int>(N - 1, 1)));
}
VectorType prev_axis(axis);
for (uint iter = 0; iter < 10; iter++)
{
for (uint iter = 0; iter < 10; iter++) {
VectorType x;
double max_sum = 0;
for (uint i = 0; i < N; i++)
{
for (uint i = 0; i < N; i++) {
double sum = 0;
for (uint j = 0; j < N; j++)
@@ -479,32 +436,25 @@ namespace crnlib
double left_weight = 0.0f;
double right_weight = 0.0f;
for (uint i = 0; i < parent_node.m_vectors.size(); i++)
{
for (uint i = 0; i < parent_node.m_vectors.size(); i++) {
const float weight = (float)m_training_vecs[parent_node.m_vectors[i]].second;
const VectorType& v = m_training_vecs[parent_node.m_vectors[i]].first;
double t = (v - parent_node.m_centroid) * axis;
if (t < 0.0f)
{
if (t < 0.0f) {
left_child += v * weight;
left_weight += weight;
}
else
{
} else {
right_child += v * weight;
right_weight += weight;
}
}
if ((left_weight > 0.0f) && (right_weight > 0.0f))
{
if ((left_weight > 0.0f) && (right_weight > 0.0f)) {
left_child_res = left_child * (float)(1.0f / left_weight);
right_child_res = right_child * (float)(1.0f / right_weight);
}
else
{
} else {
compute_split_estimate(left_child_res, right_child_res, parent_node);
}
}
@@ -632,8 +582,7 @@ namespace crnlib
crnlib::vector<uint> m_left_children;
crnlib::vector<uint> m_right_children;
void split_node(uint index)
{
void split_node(uint index) {
vq_node& parent_node = m_nodes[index];
if (parent_node.m_vectors.size() == 1)
@@ -654,8 +603,7 @@ namespace crnlib
float right_variance = 0.0f;
const uint cMaxLoops = m_quick ? 2 : 8;
for (uint total_loops = 0; total_loops < cMaxLoops; total_loops++)
{
for (uint total_loops = 0; total_loops < cMaxLoops; total_loops++) {
m_left_children.resize(0);
m_right_children.resize(0);
@@ -668,25 +616,21 @@ namespace crnlib
left_weight = 0;
right_weight = 0;
for (uint i = 0; i < parent_node.m_vectors.size(); i++)
{
for (uint i = 0; i < parent_node.m_vectors.size(); i++) {
const VectorType& v = m_training_vecs[parent_node.m_vectors[i]].first;
const uint weight = m_training_vecs[parent_node.m_vectors[i]].second;
double left_dist2 = left_child.squared_distance(v);
double right_dist2 = right_child.squared_distance(v);
if (left_dist2 < right_dist2)
{
if (left_dist2 < right_dist2) {
m_left_children.push_back(parent_node.m_vectors[i]);
new_left_child += (v * (float)weight);
left_weight += weight;
left_ttsum += v.dot(v) * weight;
}
else
{
} else {
m_right_children.push_back(parent_node.m_vectors[i]);
new_right_child += (v * (float)weight);
@@ -696,8 +640,7 @@ namespace crnlib
}
}
if ((!left_weight) || (!right_weight))
{
if ((!left_weight) || (!right_weight)) {
parent_node.m_unsplittable = true;
return;
}
@@ -709,10 +652,7 @@ namespace crnlib
new_right_child *= (1.0f / right_weight);
left_child = new_left_child;
left_weight = left_weight;
right_child = new_right_child;
right_weight = right_weight;
float total_variance = left_variance + right_variance;
if (total_variance < .00001f)
@@ -755,10 +695,6 @@ namespace crnlib
if ((right_child_node.m_vectors.size() > 1) && (right_child_node.m_variance > 0.0f))
insert_heap(right_child_index);
}
};
} // namespace crnlib
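`generate_codebook` repeatedly takes the node at `m_heap[1]`, splits it, and reinserts the children, which is the usual 1-based binary max-heap pattern keyed on variance. The sketch below shows that pattern in isolation. `sketch_variance_heap` and its members are hypothetical names, and the details are an assumption modeled on the visible parts of `insert_heap` and `down_heap`, not a copy of them.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 1-based max-heap of node indices ordered by keys[node].
struct sketch_variance_heap {
  std::vector<float> keys;     // keys[node] is the variance of that node
  std::vector<unsigned> heap;  // heap[0] is unused; the root is heap[1]
  unsigned size;

  sketch_variance_heap() : heap(1, 0U), size(0) {}

  // Appends `node` at the bottom and sifts it up past smaller parents.
  void insert(unsigned node) {
    unsigned pos = ++size;
    if (size >= heap.size())
      heap.resize(size + 1);
    while (pos > 1) {
      const unsigned parent = pos >> 1;
      if (keys[heap[parent]] >= keys[node])
        break;
      heap[pos] = heap[parent];
      pos = parent;
    }
    heap[pos] = node;
  }

  // Removes and returns the node with the largest key.
  // The caller must ensure size > 0.
  unsigned pop_max() {
    const unsigned top = heap[1];
    const unsigned last = heap[size--];
    if (size == 0)
      return top;
    // Sift `last` down from the root, always following the larger child.
    unsigned pos = 1, child;
    while ((child = pos << 1) <= size) {
      if (child < size && keys[heap[child]] < keys[heap[child + 1]])
        child++;
      if (keys[last] >= keys[heap[child]])
        break;
      heap[pos] = heap[child];
      pos = child;
    }
    heap[pos] = last;
    return top;
  }
};
```

Popping the highest-variance node first means each split is spent where the quantization error is currently worst, which is the greedy strategy the `while ((total_leaves < max_size) && (m_heap_size))` loop appears to follow.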
+163 -257
@@ -3,12 +3,10 @@
#pragma once
#include "crn_core.h"
namespace crnlib
{
template<typename component_type> struct color_quad_component_traits
{
enum
{
namespace crnlib {
template <typename component_type>
struct color_quad_component_traits {
enum {
cSigned = false,
cFloat = false,
cMin = cUINT8_MIN,
@@ -16,10 +14,9 @@ namespace crnlib
};
};
template<> struct color_quad_component_traits<int8>
{
enum
{
template <>
struct color_quad_component_traits<int8> {
enum {
cSigned = true,
cFloat = false,
cMin = cINT8_MIN,
@@ -27,10 +24,9 @@ namespace crnlib
};
};
template<> struct color_quad_component_traits<int16>
{
enum
{
template <>
struct color_quad_component_traits<int16> {
enum {
cSigned = true,
cFloat = false,
cMin = cINT16_MIN,
@@ -38,10 +34,9 @@ namespace crnlib
};
};
template<> struct color_quad_component_traits<uint16>
{
enum
{
template <>
struct color_quad_component_traits<uint16> {
enum {
cSigned = false,
cFloat = false,
cMin = cUINT16_MIN,
@@ -49,10 +44,9 @@ namespace crnlib
};
};
template<> struct color_quad_component_traits<int32>
{
enum
{
template <>
struct color_quad_component_traits<int32> {
enum {
cSigned = true,
cFloat = false,
cMin = cINT32_MIN,
@@ -60,10 +54,9 @@ namespace crnlib
};
};
template<> struct color_quad_component_traits<uint32>
{
enum
{
template <>
struct color_quad_component_traits<uint32> {
enum {
cSigned = false,
cFloat = false,
cMin = cUINT32_MIN,
@@ -71,10 +64,9 @@ namespace crnlib
};
};
template<> struct color_quad_component_traits<float>
{
enum
{
template <>
struct color_quad_component_traits<float> {
enum {
cSigned = false,
cFloat = true,
cMin = cINT32_MIN,
@@ -82,10 +74,9 @@ namespace crnlib
};
};
template<> struct color_quad_component_traits<double>
{
enum
{
template <>
struct color_quad_component_traits<double> {
enum {
cSigned = false,
cFloat = true,
cMin = cINT32_MIN,
@@ -94,14 +85,11 @@ namespace crnlib
};
template <typename component_type, typename parameter_type>
class color_quad : public helpers::rel_ops<color_quad<component_type, parameter_type> >
{
class color_quad : public helpers::rel_ops<color_quad<component_type, parameter_type> > {
template <typename T>
static inline parameter_type clamp(T v)
{
static inline parameter_type clamp(T v) {
parameter_type result = static_cast<parameter_type>(v);
if (!component_traits::cFloat)
{
if (!component_traits::cFloat) {
if (v < component_traits::cMin)
result = static_cast<parameter_type>(component_traits::cMin);
else if (v > component_traits::cMax)
@@ -112,17 +100,12 @@ namespace crnlib
#ifdef _MSC_VER
template <>
static inline parameter_type clamp(int v)
{
if (!component_traits::cFloat)
{
if ((!component_traits::cSigned) && (component_traits::cMin == 0) && (component_traits::cMax == 0xFF))
{
static inline parameter_type clamp(int v) {
if (!component_traits::cFloat) {
if ((!component_traits::cSigned) && (component_traits::cMin == 0) && (component_traits::cMax == 0xFF)) {
if (v & 0xFFFFFF00U)
v = (~(static_cast<int>(v) >> 31)) & 0xFF;
}
else
{
} else {
if (v < component_traits::cMin)
v = component_traits::cMin;
else if (v > component_traits::cMax)
@@ -140,8 +123,7 @@ namespace crnlib
enum { cNumComps = 4 };
union
{
union {
struct
{
component_type r;
@@ -155,56 +137,46 @@ namespace crnlib
uint32 m_u32;
};
inline color_quad()
{
inline color_quad() {
}
inline color_quad(eClear) :
r(0), g(0), b(0), a(0)
{
inline color_quad(eClear)
: r(0), g(0), b(0), a(0) {
}
inline color_quad(const color_quad& other) :
r(other.r), g(other.g), b(other.b), a(other.a)
{
inline color_quad(const color_quad& other)
: r(other.r), g(other.g), b(other.b), a(other.a) {
}
explicit inline color_quad(parameter_type y, parameter_type alpha = component_traits::cMax)
{
explicit inline color_quad(parameter_type y, parameter_type alpha = component_traits::cMax) {
set(y, alpha);
}
inline color_quad(parameter_type red, parameter_type green, parameter_type blue, parameter_type alpha = component_traits::cMax)
{
inline color_quad(parameter_type red, parameter_type green, parameter_type blue, parameter_type alpha = component_traits::cMax) {
set(red, green, blue, alpha);
}
explicit inline color_quad(eNoClamp, parameter_type y, parameter_type alpha = component_traits::cMax)
{
explicit inline color_quad(eNoClamp, parameter_type y, parameter_type alpha = component_traits::cMax) {
set_noclamp_y_alpha(y, alpha);
}
inline color_quad(eNoClamp, parameter_type red, parameter_type green, parameter_type blue, parameter_type alpha = component_traits::cMax)
{
inline color_quad(eNoClamp, parameter_type red, parameter_type green, parameter_type blue, parameter_type alpha = component_traits::cMax) {
set_noclamp_rgba(red, green, blue, alpha);
}
template <typename other_component_type, typename other_parameter_type>
inline color_quad(const color_quad<other_component_type, other_parameter_type>& other) :
r(static_cast<component_type>(clamp(other.r))), g(static_cast<component_type>(clamp(other.g))), b(static_cast<component_type>(clamp(other.b))), a(static_cast<component_type>(clamp(other.a)))
{
inline color_quad(const color_quad<other_component_type, other_parameter_type>& other)
: r(static_cast<component_type>(clamp(other.r))), g(static_cast<component_type>(clamp(other.g))), b(static_cast<component_type>(clamp(other.b))), a(static_cast<component_type>(clamp(other.a))) {
}
inline void clear()
{
inline void clear() {
r = 0;
g = 0;
b = 0;
a = 0;
}
inline color_quad& operator= (const color_quad& other)
{
inline color_quad& operator=(const color_quad& other) {
r = other.r;
g = other.g;
b = other.b;
@@ -212,8 +184,7 @@ namespace crnlib
return *this;
}
inline color_quad& set_rgb(const color_quad& other)
{
inline color_quad& set_rgb(const color_quad& other) {
r = other.r;
g = other.g;
b = other.b;
@@ -221,8 +192,7 @@ namespace crnlib
}
template <typename other_component_type, typename other_parameter_type>
inline color_quad& operator=(const color_quad<other_component_type, other_parameter_type>& other)
{
inline color_quad& operator=(const color_quad<other_component_type, other_parameter_type>& other) {
r = static_cast<component_type>(clamp(other.r));
g = static_cast<component_type>(clamp(other.g));
b = static_cast<component_type>(clamp(other.b));
@@ -230,14 +200,12 @@ namespace crnlib
return *this;
}
inline color_quad& operator= (parameter_type y)
{
inline color_quad& operator=(parameter_type y) {
set(y, component_traits::cMax);
return *this;
}
inline color_quad& set(parameter_type y, parameter_type alpha = component_traits::cMax)
{
inline color_quad& set(parameter_type y, parameter_type alpha = component_traits::cMax) {
y = clamp(y);
alpha = clamp(alpha);
r = static_cast<component_type>(y);
@@ -247,8 +215,7 @@ namespace crnlib
return *this;
}
inline color_quad& set_noclamp_y_alpha(parameter_type y, parameter_type alpha = component_traits::cMax)
{
inline color_quad& set_noclamp_y_alpha(parameter_type y, parameter_type alpha = component_traits::cMax) {
CRNLIB_ASSERT((y >= component_traits::cMin) && (y <= component_traits::cMax));
CRNLIB_ASSERT((alpha >= component_traits::cMin) && (alpha <= component_traits::cMax));
@@ -259,8 +226,7 @@ namespace crnlib
return *this;
}
inline color_quad& set(parameter_type red, parameter_type green, parameter_type blue, parameter_type alpha = component_traits::cMax)
{
inline color_quad& set(parameter_type red, parameter_type green, parameter_type blue, parameter_type alpha = component_traits::cMax) {
r = static_cast<component_type>(clamp(red));
g = static_cast<component_type>(clamp(green));
b = static_cast<component_type>(clamp(blue));
@@ -268,8 +234,7 @@ namespace crnlib
return *this;
}
inline color_quad& set_noclamp_rgba(parameter_type red, parameter_type green, parameter_type blue, parameter_type alpha)
{
inline color_quad& set_noclamp_rgba(parameter_type red, parameter_type green, parameter_type blue, parameter_type alpha) {
CRNLIB_ASSERT((red >= component_traits::cMin) && (red <= component_traits::cMax));
CRNLIB_ASSERT((green >= component_traits::cMin) && (green <= component_traits::cMax));
CRNLIB_ASSERT((blue >= component_traits::cMin) && (blue <= component_traits::cMax));
@@ -282,8 +247,7 @@ namespace crnlib
return *this;
}
inline color_quad& set_noclamp_rgb(parameter_type red, parameter_type green, parameter_type blue)
{
inline color_quad& set_noclamp_rgb(parameter_type red, parameter_type green, parameter_type blue) {
CRNLIB_ASSERT((red >= component_traits::cMin) && (red <= component_traits::cMax));
CRNLIB_ASSERT((green >= component_traits::cMin) && (green <= component_traits::cMax));
CRNLIB_ASSERT((blue >= component_traits::cMin) && (blue <= component_traits::cMax));
@@ -298,11 +262,16 @@ namespace crnlib
static inline parameter_type get_max_comp() { return component_traits::cMax; }
static inline bool get_comps_are_signed() { return component_traits::cSigned; }
inline component_type operator[] (uint i) const { CRNLIB_ASSERT(i < cNumComps); return c[i]; }
inline component_type& operator[] (uint i) { CRNLIB_ASSERT(i < cNumComps); return c[i]; }
inline component_type operator[](uint i) const {
CRNLIB_ASSERT(i < cNumComps);
return c[i];
}
inline component_type& operator[](uint i) {
CRNLIB_ASSERT(i < cNumComps);
return c[i];
}
inline color_quad& set_component(uint i, parameter_type f)
{
inline color_quad& set_component(uint i, parameter_type f) {
CRNLIB_ASSERT(i < cNumComps);
c[i] = static_cast<component_type>(clamp(f));
@@ -310,8 +279,7 @@ namespace crnlib
return *this;
}
inline color_quad& set_grayscale(parameter_t l)
{
inline color_quad& set_grayscale(parameter_t l) {
component_t x = static_cast<component_t>(clamp(l));
c[0] = x;
c[1] = x;
@@ -319,68 +287,57 @@ namespace crnlib
return *this;
}
inline color_quad& clamp(const color_quad& l, const color_quad& h)
{
inline color_quad& clamp(const color_quad& l, const color_quad& h) {
for (uint i = 0; i < cNumComps; i++)
c[i] = static_cast<component_type>(math::clamp<parameter_type>(c[i], l[i], h[i]));
return *this;
}
inline color_quad& clamp(parameter_type l, parameter_type h)
{
inline color_quad& clamp(parameter_type l, parameter_type h) {
for (uint i = 0; i < cNumComps; i++)
c[i] = static_cast<component_type>(math::clamp<parameter_type>(c[i], l, h));
return *this;
}
// Returns CCIR 601 luma (consistent with color_utils::RGB_To_Y).
inline parameter_type get_luma() const
{
inline parameter_type get_luma() const {
return static_cast<parameter_type>((19595U * r + 38470U * g + 7471U * b + 32768U) >> 16U);
}
// Returns REC 709 luma.
inline parameter_type get_luma_rec709() const
{
inline parameter_type get_luma_rec709() const {
return static_cast<parameter_type>((13938U * r + 46869U * g + 4729U * b + 32768U) >> 16U);
}
// Beware of endianness!
inline uint32 get_uint32() const
{
inline uint32 get_uint32() const {
CRNLIB_ASSERT(sizeof(*this) == sizeof(uint32));
return *reinterpret_cast<const uint32*>(this);
}
// Beware of endianness!
inline uint64 get_uint64() const
{
inline uint64 get_uint64() const {
CRNLIB_ASSERT(sizeof(*this) == sizeof(uint64));
return *reinterpret_cast<const uint64*>(this);
}
inline uint squared_distance(const color_quad& c, bool alpha = true) const
{
inline uint squared_distance(const color_quad& c, bool alpha = true) const {
return math::square(r - c.r) + math::square(g - c.g) + math::square(b - c.b) + (alpha ? math::square(a - c.a) : 0);
}
inline bool rgb_equals(const color_quad& rhs) const
{
inline bool rgb_equals(const color_quad& rhs) const {
return (r == rhs.r) && (g == rhs.g) && (b == rhs.b);
}
inline bool operator== (const color_quad& rhs) const
{
inline bool operator==(const color_quad& rhs) const {
if (sizeof(color_quad) == sizeof(uint32))
return m_u32 == rhs.m_u32;
else
return (r == rhs.r) && (g == rhs.g) && (b == rhs.b) && (a == rhs.a);
}
inline bool operator< (const color_quad& rhs) const
{
for (uint i = 0; i < cNumComps; i++)
{
inline bool operator<(const color_quad& rhs) const {
for (uint i = 0; i < cNumComps; i++) {
if (c[i] < rhs.c[i])
return true;
else if (!(c[i] == rhs.c[i]))
@@ -389,82 +346,70 @@ namespace crnlib
return false;
}
color_quad& operator+= (const color_quad& other)
{
color_quad& operator+=(const color_quad& other) {
for (uint i = 0; i < 4; i++)
c[i] = static_cast<component_type>(clamp(c[i] + other.c[i]));
return *this;
}
color_quad& operator-= (const color_quad& other)
{
color_quad& operator-=(const color_quad& other) {
for (uint i = 0; i < 4; i++)
c[i] = static_cast<component_type>(clamp(c[i] - other.c[i]));
return *this;
}
color_quad& operator*= (parameter_type v)
{
color_quad& operator*=(parameter_type v) {
for (uint i = 0; i < 4; i++)
c[i] = static_cast<component_type>(clamp(c[i] * v));
return *this;
}
color_quad& operator/= (parameter_type v)
{
color_quad& operator/=(parameter_type v) {
for (uint i = 0; i < 4; i++)
c[i] = static_cast<component_type>(c[i] / v);
return *this;
}
color_quad get_swizzled(uint x, uint y, uint z, uint w) const
{
color_quad get_swizzled(uint x, uint y, uint z, uint w) const {
CRNLIB_ASSERT((x | y | z | w) < 4);
return color_quad(c[x], c[y], c[z], c[w]);
}
friend color_quad operator+ (const color_quad& lhs, const color_quad& rhs)
{
friend color_quad operator+(const color_quad& lhs, const color_quad& rhs) {
color_quad result(lhs);
result += rhs;
return result;
}
friend color_quad operator- (const color_quad& lhs, const color_quad& rhs)
{
friend color_quad operator-(const color_quad& lhs, const color_quad& rhs) {
color_quad result(lhs);
result -= rhs;
return result;
}
friend color_quad operator* (const color_quad& lhs, parameter_type v)
{
friend color_quad operator*(const color_quad& lhs, parameter_type v) {
color_quad result(lhs);
result *= v;
return result;
}
friend color_quad operator/ (const color_quad& lhs, parameter_type v)
{
friend color_quad operator/(const color_quad& lhs, parameter_type v) {
color_quad result(lhs);
result /= v;
return result;
}
friend color_quad operator* (parameter_type v, const color_quad& rhs)
{
friend color_quad operator*(parameter_type v, const color_quad& rhs) {
color_quad result(rhs);
result *= v;
return result;
}
inline bool is_grayscale() const
{
inline bool is_grayscale() const {
return (c[0] == c[1]) && (c[1] == c[2]);
}
uint get_min_component_index(bool alpha = true) const
{
uint get_min_component_index(bool alpha = true) const {
uint index = 0;
uint limit = alpha ? cNumComps : (cNumComps - 1);
for (uint i = 1; i < limit; i++)
@@ -473,8 +418,7 @@ namespace crnlib
return index;
}
uint get_max_component_index(bool alpha = true) const
{
uint get_max_component_index(bool alpha = true) const {
uint index = 0;
uint limit = alpha ? cNumComps : (cNumComps - 1);
for (uint i = 1; i < limit; i++)
@@ -483,59 +427,51 @@ namespace crnlib
return index;
}
operator size_t() const
{
operator size_t() const {
return (size_t)fast_hash(this, sizeof(*this));
}
void get_float4(float* pDst)
{
void get_float4(float* pDst) {
for (uint i = 0; i < 4; i++)
pDst[i] = ((*this)[i] - component_traits::cMin) / float(component_traits::cMax - component_traits::cMin);
}
void get_float3(float* pDst)
{
void get_float3(float* pDst) {
for (uint i = 0; i < 3; i++)
pDst[i] = ((*this)[i] - component_traits::cMin) / float(component_traits::cMax - component_traits::cMin);
}
static color_quad component_min(const color_quad& a, const color_quad& b)
{
static color_quad component_min(const color_quad& a, const color_quad& b) {
color_quad result;
for (uint i = 0; i < 4; i++)
result[i] = static_cast<component_type>(math::minimum(a[i], b[i]));
return result;
}
static color_quad component_max(const color_quad& a, const color_quad& b)
{
static color_quad component_max(const color_quad& a, const color_quad& b) {
color_quad result;
for (uint i = 0; i < 4; i++)
result[i] = static_cast<component_type>(math::maximum(a[i], b[i]));
return result;
}
static color_quad make_black()
{
static color_quad make_black() {
return color_quad(0, 0, 0, component_traits::cMax);
}
static color_quad make_white()
{
static color_quad make_white() {
return color_quad(component_traits::cMax, component_traits::cMax, component_traits::cMax, component_traits::cMax);
}
}; // class color_quad
template <typename c, typename q>
struct scalar_type< color_quad<c, q> >
{
struct scalar_type<color_quad<c, q> > {
enum { cFlag = true };
static inline void construct(color_quad<c, q>* p) {}
static inline void construct(color_quad<c, q>* p, const color_quad<c, q>& init) { memcpy(p, &init, sizeof(color_quad<c, q>)); }
static inline void construct_array(color_quad<c, q>* p, uint n) { p, n; }
static inline void destruct(color_quad<c, q>* p) { p; }
static inline void destruct_array(color_quad<c, q>* p, uint n) { p, n; }
static inline void construct_array(color_quad<c, q>*, uint) {}
static inline void destruct(color_quad<c, q>*) {}
static inline void destruct_array(color_quad<c, q>*, uint) {}
};
typedef color_quad<uint8, int> color_quad_u8;
@@ -547,10 +483,8 @@ namespace crnlib
typedef color_quad<float, float> color_quad_f;
typedef color_quad<double, double> color_quad_d;
namespace color
{
inline uint elucidian_distance(uint r0, uint g0, uint b0, uint r1, uint g1, uint b1)
{
namespace color {
inline uint elucidian_distance(uint r0, uint g0, uint b0, uint r1, uint g1, uint b1) {
int dr = (int)r0 - (int)r1;
int dg = (int)g0 - (int)g1;
int db = (int)b0 - (int)b1;
@@ -558,8 +492,7 @@ namespace crnlib
return static_cast<uint>(dr * dr + dg * dg + db * db);
}
inline uint elucidian_distance(uint r0, uint g0, uint b0, uint a0, uint r1, uint g1, uint b1, uint a1)
{
inline uint elucidian_distance(uint r0, uint g0, uint b0, uint a0, uint r1, uint g1, uint b1, uint a1) {
int dr = (int)r0 - (int)r1;
int dg = (int)g0 - (int)g1;
int db = (int)b0 - (int)b1;
@@ -568,16 +501,14 @@ namespace crnlib
return static_cast<uint>(dr * dr + dg * dg + db * db + da * da);
}
inline uint elucidian_distance(const color_quad_u8& c0, const color_quad_u8& c1, bool alpha)
{
inline uint elucidian_distance(const color_quad_u8& c0, const color_quad_u8& c1, bool alpha) {
if (alpha)
return elucidian_distance(c0.r, c0.g, c0.b, c0.a, c1.r, c1.g, c1.b, c1.a);
else
return elucidian_distance(c0.r, c0.g, c0.b, c1.r, c1.g, c1.b);
}
inline uint weighted_elucidian_distance(uint r0, uint g0, uint b0, uint r1, uint g1, uint b1, uint wr, uint wg, uint wb)
{
inline uint weighted_elucidian_distance(uint r0, uint g0, uint b0, uint r1, uint g1, uint b1, uint wr, uint wg, uint wb) {
int dr = (int)r0 - (int)r1;
int dg = (int)g0 - (int)g1;
int db = (int)b0 - (int)b1;
@@ -588,8 +519,7 @@ namespace crnlib
inline uint weighted_elucidian_distance(
uint r0, uint g0, uint b0, uint a0,
uint r1, uint g1, uint b1, uint a1,
uint wr, uint wg, uint wb, uint wa)
{
uint wr, uint wg, uint wb, uint wa) {
int dr = (int)r0 - (int)r1;
int dg = (int)g0 - (int)g1;
int db = (int)b0 - (int)b1;
@@ -598,8 +528,7 @@ namespace crnlib
return static_cast<uint>((wr * dr * dr) + (wg * dg * dg) + (wb * db * db) + (wa * da * da));
}
inline uint weighted_elucidian_distance(const color_quad_u8& c0, const color_quad_u8& c1, uint wr, uint wg, uint wb, uint wa)
{
inline uint weighted_elucidian_distance(const color_quad_u8& c0, const color_quad_u8& c1, uint wr, uint wg, uint wb, uint wa) {
return weighted_elucidian_distance(c0.r, c0.g, c0.b, c0.a, c1.r, c1.g, c1.b, c1.a, wr, wg, wb, wa);
}
@@ -611,21 +540,17 @@ namespace crnlib
const uint cGWeight = 25; //73;
const uint cBWeight = 1; //3;
inline uint color_distance(bool perceptual, const color_quad_u8& e1, const color_quad_u8& e2, bool alpha)
{
if (perceptual)
{
inline uint color_distance(bool perceptual, const color_quad_u8& e1, const color_quad_u8& e2, bool alpha) {
if (perceptual) {
if (alpha)
return weighted_elucidian_distance(e1, e2, cRWeight, cGWeight, cBWeight, cRWeight + cGWeight + cBWeight);
else
return weighted_elucidian_distance(e1, e2, cRWeight, cGWeight, cBWeight, 0);
}
else
} else
return elucidian_distance(e1, e2, alpha);
}
inline uint peak_color_error(const color_quad_u8& e1, const color_quad_u8& e2)
{
inline uint peak_color_error(const color_quad_u8& e1, const color_quad_u8& e2) {
return math::maximum<uint>(labs(e1[0] - e2[0]), labs(e1[1] - e2[1]), labs(e1[2] - e2[2]));
//return math::square<int>(e1[0] - e2[0]) + math::square<int>(e1[1] - e2[1]) + math::square<int>(e1[2] - e2[2]);
}
@@ -633,38 +558,42 @@ namespace crnlib
// y - [0,255]
// co - [-127,127]
// cg - [-126,127]
inline void RGB_to_YCoCg(int r, int g, int b, int& y, int& co, int& cg)
{
inline void RGB_to_YCoCg(int r, int g, int b, int& y, int& co, int& cg) {
y = (r >> 2) + (g >> 1) + (b >> 2);
co = (r >> 1) - (b >> 1);
cg = -(r >> 2) + (g >> 1) - (b >> 2);
}
inline void YCoCg_to_RGB(int y, int co, int cg, int& r, int& g, int& b)
{
inline void YCoCg_to_RGB(int y, int co, int cg, int& r, int& g, int& b) {
int tmp = y - cg;
g = y + cg;
r = tmp + co;
b = tmp - co;
}
static inline uint8 clamp_component(int i) { if (static_cast<uint>(i) > 255U) { if (i < 0) i = 0; else if (i > 255) i = 255; } return static_cast<uint8>(i); }
static inline uint8 clamp_component(int i) {
if (static_cast<uint>(i) > 255U) {
if (i < 0)
i = 0;
else if (i > 255)
i = 255;
}
return static_cast<uint8>(i);
}
// RGB->YCbCr constants, scaled by 2^16
const int YR = 19595, YG = 38470, YB = 7471, CB_R = -11059, CB_G = -21709, CB_B = 32768, CR_R = 32768, CR_G = -27439, CR_B = -5329;
// YCbCr->RGB constants, scaled by 2^16
const int R_CR = 91881, B_CB = 116130, G_CR = -46802, G_CB = -22554;
inline int RGB_to_Y(const color_quad_u8& rgb)
{
inline int RGB_to_Y(const color_quad_u8& rgb) {
const int r = rgb[0], g = rgb[1], b = rgb[2];
return (r * YR + g * YG + b * YB + 32768) >> 16;
}
// RGB to YCbCr (same as JFIF JPEG).
// Odd default biases account for 565 endpoint packing.
inline void RGB_to_YCC(color_quad_u8& ycc, const color_quad_u8& rgb, int cb_bias = 123, int cr_bias = 125)
{
inline void RGB_to_YCC(color_quad_u8& ycc, const color_quad_u8& rgb, int cb_bias = 123, int cr_bias = 125) {
const int r = rgb[0], g = rgb[1], b = rgb[2];
ycc.a = static_cast<uint8>((r * YR + g * YG + b * YB + 32768) >> 16);
ycc.r = clamp_component(cb_bias + ((r * CB_R + g * CB_G + b * CB_B + 32768) >> 16));
@@ -674,8 +603,7 @@ namespace crnlib
// YCbCr to RGB.
// Odd biases account for 565 endpoint packing.
inline void YCC_to_RGB(color_quad_u8& rgb, const color_quad_u8& ycc, int cb_bias = 123, int cr_bias = 125)
{
inline void YCC_to_RGB(color_quad_u8& rgb, const color_quad_u8& ycc, int cb_bias = 123, int cr_bias = 125) {
const int y = ycc.a;
const int cb = ycc.r - cb_bias;
const int cr = ycc.g - cr_bias;
@@ -691,8 +619,7 @@ namespace crnlib
// Float YCbCr->RGB constants
const float F_R_CR = S * R_CR, F_B_CB = S * B_CB, F_G_CR = S * G_CR, F_G_CB = S * G_CB;
inline void RGB_to_YCC_float(color_quad_f& ycc, const color_quad_u8& rgb)
{
inline void RGB_to_YCC_float(color_quad_f& ycc, const color_quad_u8& rgb) {
const int r = rgb[0], g = rgb[1], b = rgb[2];
ycc.a = r * F_YR + g * F_YG + b * F_YB;
ycc.r = r * F_CB_R + g * F_CB_G + b * F_CB_B;
@@ -700,8 +627,7 @@ namespace crnlib
ycc.b = 0;
}
inline void YCC_float_to_RGB(color_quad_u8& rgb, const color_quad_f& ycc)
{
inline void YCC_float_to_RGB(color_quad_u8& rgb, const color_quad_f& ycc) {
float y = ycc.a, cb = ycc.r, cr = ycc.g;
rgb.r = color::clamp_component(static_cast<int>(.5f + y + F_R_CR * cr));
rgb.g = color::clamp_component(static_cast<int>(.5f + y + F_G_CR * cr + F_G_CB * cb));
@@ -714,26 +640,21 @@ namespace crnlib
// This class purposely trades off speed for extreme flexibility. It can handle any component swizzle, any pixel type from 1-4 components and 1-32 bits/component,
// any pixel size between 1-16 bytes/pixel, any pixel stride, any color_quad data type (signed/unsigned/float 8/16/32 bits/component), and scaled/non-scaled components.
// On the downside, it's freaking slow.
class pixel_packer
{
class pixel_packer {
public:
pixel_packer()
{
pixel_packer() {
clear();
}
pixel_packer(uint num_comps, uint bits_per_comp, int pixel_stride = -1, bool reversed = false)
{
pixel_packer(uint num_comps, uint bits_per_comp, int pixel_stride = -1, bool reversed = false) {
init(num_comps, bits_per_comp, pixel_stride, reversed);
}
pixel_packer(const char* pComp_map, int pixel_stride = -1, int force_comp_size = -1)
{
pixel_packer(const char* pComp_map, int pixel_stride = -1, int force_comp_size = -1) {
init(pComp_map, pixel_stride, force_comp_size);
}
void clear()
{
void clear() {
utils::zero_this(this);
}
@@ -743,21 +664,27 @@ namespace crnlib
void set_pixel_stride(uint n) { m_pixel_stride = n; }
uint get_num_comps() const { return m_num_comps; }
uint get_comp_size(uint index) const { CRNLIB_ASSERT(index < 4); return m_comp_size[index]; }
uint get_comp_ofs(uint index) const { CRNLIB_ASSERT(index < 4); return m_comp_ofs[index]; }
uint get_comp_max(uint index) const { CRNLIB_ASSERT(index < 4); return m_comp_max[index]; }
uint get_comp_size(uint index) const {
CRNLIB_ASSERT(index < 4);
return m_comp_size[index];
}
uint get_comp_ofs(uint index) const {
CRNLIB_ASSERT(index < 4);
return m_comp_ofs[index];
}
uint get_comp_max(uint index) const {
CRNLIB_ASSERT(index < 4);
return m_comp_max[index];
}
bool get_rgb_is_luma() const { return m_rgb_is_luma; }
template <typename color_quad_type>
const void* unpack(const void* p, color_quad_type& color, bool rescale = true) const
{
const void* unpack(const void* p, color_quad_type& color, bool rescale = true) const {
const uint8* pSrc = static_cast<const uint8*>(p);
for (uint i = 0; i < 4; i++)
{
for (uint i = 0; i < 4; i++) {
const uint comp_size = m_comp_size[i];
if (!comp_size)
{
if (!comp_size) {
if (color_quad_type::component_traits::cFloat)
color[i] = static_cast<typename color_quad_type::parameter_t>((i == 3) ? 1 : 0);
else
@@ -767,8 +694,7 @@ namespace crnlib
uint n = 0, dst_bit_ofs = 0;
uint src_bit_ofs = m_comp_ofs[i];
while (dst_bit_ofs < comp_size)
{
while (dst_bit_ofs < comp_size) {
const uint byte_bit_ofs = src_bit_ofs & 7;
n |= ((pSrc[src_bit_ofs >> 3] >> byte_bit_ofs) << dst_bit_ofs);
@@ -792,8 +718,7 @@ namespace crnlib
color.set_component(i, static_cast<typename color_quad_type::parameter_t>(n));
}
if (m_rgb_is_luma)
{
if (m_rgb_is_luma) {
color[0] = color[1];
color[2] = color[1];
}
@@ -802,12 +727,10 @@ namespace crnlib
}
template <typename color_quad_type>
void* pack(const color_quad_type& color, void* p, bool rescale = true) const
{
void* pack(const color_quad_type& color, void* p, bool rescale = true) const {
uint8* pDst = static_cast<uint8*>(p);
for (uint i = 0; i < 4; i++)
{
for (uint i = 0; i < 4; i++) {
const uint comp_size = m_comp_size[i];
if (!comp_size)
continue;
@@ -815,8 +738,7 @@ namespace crnlib
uint32 mx = m_comp_max[i];
uint32 n;
if (color_quad_type::component_traits::cFloat)
{
if (color_quad_type::component_traits::cFloat) {
typename color_quad_type::parameter_t t = color[i];
if (t < 0.0f)
n = 0;
@@ -824,9 +746,7 @@ namespace crnlib
n = mx;
else
n = math::minimum<uint32>(static_cast<uint32>(floor(t + .5f)), mx);
}
else if (rescale)
{
} else if (rescale) {
if (color_quad_type::component_traits::cSigned)
n = math::maximum<int>(static_cast<int>(color[i]), 0);
else
@@ -834,9 +754,7 @@ namespace crnlib
const uint32 h = static_cast<uint32>(color_quad_type::component_traits::cMax);
n = static_cast<uint32>((static_cast<uint64>(n) * mx + (h >> 1)) / h);
}
else
{
} else {
if (color_quad_type::component_traits::cSigned)
n = math::minimum<uint32>(static_cast<uint32>(math::maximum<int>(static_cast<int>(color[i]), 0)), mx);
else
@@ -845,8 +763,7 @@ namespace crnlib
uint src_bit_ofs = 0;
uint dst_bit_ofs = m_comp_ofs[i];
while (src_bit_ofs < comp_size)
{
while (src_bit_ofs < comp_size) {
const uint cur_byte_bit_ofs = (dst_bit_ofs & 7);
const uint cur_byte_bits = 8 - cur_byte_bit_ofs;
@@ -867,18 +784,15 @@ namespace crnlib
return pDst + m_pixel_stride;
}
bool init(uint num_comps, uint bits_per_comp, int pixel_stride = -1, bool reversed = false)
{
bool init(uint num_comps, uint bits_per_comp, int pixel_stride = -1, bool reversed = false) {
clear();
if ((num_comps < 1) || (num_comps > 4) || (bits_per_comp < 1) || (bits_per_comp > 32))
{
if ((num_comps < 1) || (num_comps > 4) || (bits_per_comp < 1) || (bits_per_comp > 32)) {
CRNLIB_ASSERT(0);
return false;
}
for (uint i = 0; i < num_comps; i++)
{
for (uint i = 0; i < num_comps; i++) {
m_comp_size[i] = bits_per_comp;
m_comp_ofs[i] = i * bits_per_comp;
if (reversed)
@@ -900,14 +814,12 @@ namespace crnlib
// Y8A8
// A8R8G8B8
// First component is at LSB in memory. Assumes unsigned integer components, 1-32bits each.
bool init(const char* pComp_map, int pixel_stride = -1, int force_comp_size = -1)
{
bool init(const char* pComp_map, int pixel_stride = -1, int force_comp_size = -1) {
clear();
uint cur_bit_ofs = 0;
while (*pComp_map)
{
while (*pComp_map) {
char c = *pComp_map++;
int comp_index = -1;
@@ -927,14 +839,12 @@ namespace crnlib
uint comp_size = 0;
uint n = *pComp_map;
if ((n >= '0') && (n <= '9'))
{
if ((n >= '0') && (n <= '9')) {
comp_size = n - '0';
pComp_map++;
n = *pComp_map;
if ((n >= '0') && (n <= '9'))
{
if ((n >= '0') && (n <= '9')) {
comp_size = (comp_size * 10) + (n - '0');
pComp_map++;
}
@@ -946,8 +856,7 @@ namespace crnlib
if ((!comp_size) || (comp_size > 32))
return false;
if (comp_index == 4)
{
if (comp_index == 4) {
if (m_comp_size[0] || m_comp_size[1] || m_comp_size[2])
return false;
@@ -957,9 +866,7 @@ namespace crnlib
m_comp_size[1] = comp_size;
m_rgb_is_luma = true;
m_num_comps++;
}
else if (comp_index >= 0)
{
} else if (comp_index >= 0) {
if (m_comp_size[comp_index])
return false;
@@ -991,4 +898,3 @@ namespace crnlib
};
} // namespace crnlib
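The fixed-point color math in the file above (`get_luma`, `RGB_to_Y`, `clamp_component`) can be illustrated with a small stand-alone sketch. The weights are the ones shown in the diff. The function and constant names below are hypothetical and are not part of crnlib.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-alone helpers; the names are illustrative, not crnlib's.

// CCIR 601 luma weights scaled by 2^16. They sum to exactly 65536,
// so a gray input (r == g == b) maps back to the same value.
const uint32_t kYR = 19595, kYG = 38470, kYB = 7471;

inline uint32_t luma_601(uint32_t r, uint32_t g, uint32_t b) {
  assert(r <= 255 && g <= 255 && b <= 255);
  // Adding 32768 (0.5 in 16.16 fixed point) rounds to nearest before the shift.
  return (kYR * r + kYG * g + kYB * b + 32768U) >> 16;
}

// Clamp to [0, 255]. Casting to unsigned lets a single comparison catch
// both negative values and values above 255, so in-range inputs take one test.
inline uint8_t clamp_to_u8(int i) {
  if (static_cast<unsigned>(i) > 255U)
    i = (i < 0) ? 0 : 255;
  return static_cast<uint8_t>(i);
}
```

Because the three weights sum to 65536, `luma_601(128, 128, 128)` yields 128, and the rounding bias keeps pure white at 255 rather than truncating to 254.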
+27 -37
@@ -6,51 +6,50 @@
#include "crn_winhdr.h"
#endif
namespace crnlib
{
void colorized_console::init()
{
namespace crnlib {
void colorized_console::init() {
console::init();
console::add_console_output_func(console_output_func, NULL);
}
void colorized_console::deinit()
{
void colorized_console::deinit() {
console::remove_console_output_func(console_output_func);
console::deinit();
}
void colorized_console::tick()
{
void colorized_console::tick() {
}
#ifdef CRNLIB_USE_WIN32_API
bool colorized_console::console_output_func(eConsoleMessageType type, const char* pMsg, void* pData)
{
pData;
bool colorized_console::console_output_func(eConsoleMessageType type, const char* pMsg, void*) {
if (console::get_output_disabled())
return true;
HANDLE cons = GetStdHandle(STD_OUTPUT_HANDLE);
DWORD attr = FOREGROUND_RED | FOREGROUND_GREEN | FOREGROUND_BLUE;
switch (type)
{
case cDebugConsoleMessage: attr = FOREGROUND_BLUE | FOREGROUND_INTENSITY; break;
case cMessageConsoleMessage: attr = FOREGROUND_GREEN | FOREGROUND_BLUE | FOREGROUND_INTENSITY; break;
case cWarningConsoleMessage: attr = FOREGROUND_GREEN | FOREGROUND_RED | FOREGROUND_INTENSITY; break;
case cErrorConsoleMessage: attr = FOREGROUND_RED | FOREGROUND_INTENSITY; break;
default: break;
switch (type) {
case cDebugConsoleMessage:
attr = FOREGROUND_BLUE | FOREGROUND_INTENSITY;
break;
case cMessageConsoleMessage:
attr = FOREGROUND_GREEN | FOREGROUND_BLUE | FOREGROUND_INTENSITY;
break;
case cWarningConsoleMessage:
attr = FOREGROUND_GREEN | FOREGROUND_RED | FOREGROUND_INTENSITY;
break;
case cErrorConsoleMessage:
attr = FOREGROUND_RED | FOREGROUND_INTENSITY;
break;
default:
break;
}
if (INVALID_HANDLE_VALUE != cons)
SetConsoleTextAttribute(cons, (WORD)attr);
if ((console::get_prefixes()) && (console::get_at_beginning_of_line()))
{
switch (type)
{
if ((console::get_prefixes()) && (console::get_at_beginning_of_line())) {
switch (type) {
case cDebugConsoleMessage:
printf("Debug: %s", pMsg);
break;
@@ -64,9 +63,7 @@ namespace crnlib
printf("%s", pMsg);
break;
}
}
else
{
} else {
printf("%s", pMsg);
}
@@ -79,16 +76,12 @@ namespace crnlib
return true;
}
#else
bool colorized_console::console_output_func(eConsoleMessageType type, const char* pMsg, void* pData)
{
pData;
bool colorized_console::console_output_func(eConsoleMessageType type, const char* pMsg, void*) {
if (console::get_output_disabled())
return true;
if ((console::get_prefixes()) && (console::get_at_beginning_of_line()))
{
switch (type)
{
if ((console::get_prefixes()) && (console::get_at_beginning_of_line())) {
switch (type) {
case cDebugConsoleMessage:
printf("Debug: %s", pMsg);
break;
@@ -102,9 +95,7 @@ namespace crnlib
printf("%s", pMsg);
break;
}
}
else
{
} else {
printf("%s", pMsg);
}
@@ -116,4 +107,3 @@ namespace crnlib
#endif
} // namespace crnlib
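The change to `console_output_func` above swaps a named but unused `void* pData` parameter (silenced with a bare `pData;` statement) for an unnamed parameter. A minimal sketch of that pattern follows. The enum, the function name, and the "Warning: " and "Error: " prefixes are assumptions, since the hunk only shows the "Debug: " case.

```cpp
#include <string>

// Hypothetical message types, loosely mirroring the names seen in the diff.
enum MessageType { kDebug, kMessage, kWarning, kError };

// The third parameter is left unnamed: the callback signature requires it,
// but this implementation does not use it. An unnamed parameter avoids an
// unused-parameter warning without a dummy statement such as `pData;`.
inline std::string format_console_line(MessageType type, const char* msg, void*) {
  switch (type) {
    case kDebug:
      return std::string("Debug: ") + msg;
    case kWarning:
      return std::string("Warning: ") + msg;  // assumed prefix
    case kError:
      return std::string("Error: ") + msg;  // assumed prefix
    default:
      return std::string(msg);
  }
}
```

Callers still pass a third argument, for example `format_console_line(kDebug, "loading", nullptr)`, so the function stays compatible with a callback type that carries a user-data pointer.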
+2 -4
@@ -3,10 +3,8 @@
#pragma once
#include "crn_console.h"
namespace crnlib
{
class colorized_console
{
namespace crnlib {
class colorized_console {
public:
static void init();
static void deinit();
+54 -116
@@ -12,17 +12,14 @@
#if CRNLIB_USE_WIN32_API
#include "crn_winhdr.h"
#endif
namespace crnlib
{
void get_command_line_as_single_string(dynamic_string& cmd_line, int argc, char *argv[])
{
namespace crnlib {
void get_command_line_as_single_string(dynamic_string& cmd_line, int argc, char* argv[]) {
argc, argv;
#if CRNLIB_USE_WIN32_API
cmd_line.set(GetCommandLineA());
#else
cmd_line.clear();
for (int i = 0; i < argc; i++)
{
for (int i = 0; i < argc; i++) {
dynamic_string tmp(argv[i]);
if ((tmp.front() != '"') && (tmp.front() != '-') && (tmp.front() != '@'))
tmp = "\"" + tmp + "\"";
@@ -33,57 +30,44 @@ namespace crnlib
#endif
}
command_line_params::command_line_params()
{
command_line_params::command_line_params() {
}
void command_line_params::clear()
{
void command_line_params::clear() {
m_params.clear();
m_param_map.clear();
}
bool command_line_params::split_params(const char* p, dynamic_string_array& params)
{
bool command_line_params::split_params(const char* p, dynamic_string_array& params) {
bool within_param = false;
bool within_quote = false;
uint ofs = 0;
dynamic_string str;
while (p[ofs])
{
while (p[ofs]) {
const char c = p[ofs];
if (within_param)
{
if (within_quote)
{
if (within_param) {
if (within_quote) {
if (c == '"')
within_quote = false;
str.append_char(c);
}
else if ((c == ' ') || (c == '\t'))
{
if (!str.is_empty())
{
} else if ((c == ' ') || (c == '\t')) {
if (!str.is_empty()) {
params.push_back(str);
str.clear();
}
within_param = false;
}
else
{
} else {
if (c == '"')
within_quote = true;
str.append_char(c);
}
}
else if ((c != ' ') && (c != '\t'))
{
} else if ((c != ' ') && (c != '\t')) {
within_param = true;
if (c == '"')
@@ -95,8 +79,7 @@ namespace crnlib
ofs++;
}
if (within_quote)
{
if (within_quote) {
console::error("Unmatched quote in command line \"%s\"", p);
return false;
}
@@ -107,19 +90,16 @@ namespace crnlib
return true;
}
bool command_line_params::load_string_file(const char* pFilename, dynamic_string_array& strings)
{
bool command_line_params::load_string_file(const char* pFilename, dynamic_string_array& strings) {
cfile_stream in_stream;
if (!in_stream.open(pFilename, cDataStreamReadable | cDataStreamSeekable))
{
if (!in_stream.open(pFilename, cDataStreamReadable | cDataStreamSeekable)) {
console::error("Unable to open file \"%s\" for reading!", pFilename);
return false;
}
dynamic_string ansi_str;
for ( ; ; )
{
for (;;) {
if (!in_stream.read_line(ansi_str))
break;
@@ -133,15 +113,13 @@ namespace crnlib
return true;
}
bool command_line_params::parse(const dynamic_string_array& params, uint n, const param_desc* pParam_desc)
{
bool command_line_params::parse(const dynamic_string_array& params, uint n, const param_desc* pParam_desc) {
CRNLIB_ASSERT(n && pParam_desc);
m_params = params;
uint arg_index = 0;
while (arg_index < params.size())
{
while (arg_index < params.size()) {
const uint cur_arg_index = arg_index;
const dynamic_string& src_param = params[arg_index++];
@@ -153,8 +131,7 @@ namespace crnlib
if (src_param[0] == '-')
#endif
{
if (src_param.get_len() < 2)
{
if (src_param.get_len() < 2) {
console::error("Invalid command line parameter: \"%s\"", src_param.get_ptr());
return false;
}
@@ -178,8 +155,7 @@ namespace crnlib
if (key_str == pParam_desc[param_index].m_pName)
break;
if (param_index == n)
{
if (param_index == n) {
console::error("Unrecognized command line parameter: \"%s\"", src_param.get_ptr());
return false;
}
@@ -189,12 +165,10 @@ namespace crnlib
const uint cMaxValues = 16;
dynamic_string val_str[cMaxValues];
uint num_val_strs = 0;
if (desc.m_num_values)
{
if (desc.m_num_values) {
CRNLIB_ASSERT(desc.m_num_values <= cMaxValues);
if ((arg_index + desc.m_num_values) > params.size())
{
if ((arg_index + desc.m_num_values) > params.size()) {
console::error("Expected %u value(s) after command line parameter: \"%s\"", desc.m_num_values, src_param.get_ptr());
return false;
}
@@ -205,22 +179,17 @@ namespace crnlib
dynamic_string_array strings;
if ((desc.m_support_listing_file) && (val_str[0].get_len() >= 2) && (val_str[0][0] == '@'))
{
if ((desc.m_support_listing_file) && (val_str[0].get_len() >= 2) && (val_str[0][0] == '@')) {
dynamic_string filename(val_str[0]);
filename.right(1);
filename.unquote();
if (!load_string_file(filename.get_ptr(), strings))
{
if (!load_string_file(filename.get_ptr(), strings)) {
console::error("Failed loading listing file \"%s\"!", filename.get_ptr());
return false;
}
}
else
{
for (uint v = 0; v < num_val_strs; v++)
{
} else {
for (uint v = 0; v < num_val_strs; v++) {
val_str[v].unquote();
strings.push_back(val_str[v]);
}
@@ -231,9 +200,7 @@ namespace crnlib
pv.m_index = cur_arg_index;
pv.m_modifier = (int8)modifier;
m_param_map.insert(std::make_pair(key_str, pv));
}
else
{
} else {
param_value pv;
pv.m_values.push_back(src_param);
pv.m_values.back().unquote();
@@ -245,8 +212,7 @@ namespace crnlib
return true;
}
bool command_line_params::parse(const char* pCmd_line, uint n, const param_desc* pParam_desc, bool skip_first_param)
{
bool command_line_params::parse(const char* pCmd_line, uint n, const param_desc* pParam_desc, bool skip_first_param) {
CRNLIB_ASSERT(n && pParam_desc);
dynamic_string_array p;
@@ -262,8 +228,7 @@ namespace crnlib
return parse(p, n, pParam_desc);
}
bool command_line_params::is_param(uint index) const
{
bool command_line_params::is_param(uint index) const {
CRNLIB_ASSERT(index < m_params.size());
if (index >= m_params.size())
return false;
@@ -279,32 +244,27 @@ namespace crnlib
#endif
}
uint command_line_params::find(uint num_keys, const char** ppKeys, crnlib::vector<param_map_const_iterator>* pIterators, crnlib::vector<uint>* pUnmatched_indices) const
{
uint command_line_params::find(uint num_keys, const char** ppKeys, crnlib::vector<param_map_const_iterator>* pIterators, crnlib::vector<uint>* pUnmatched_indices) const {
CRNLIB_ASSERT(ppKeys);
if (pUnmatched_indices)
{
if (pUnmatched_indices) {
pUnmatched_indices->resize(m_params.size());
for (uint i = 0; i < m_params.size(); i++)
(*pUnmatched_indices)[i] = i;
}
uint n = 0;
for (uint i = 0; i < num_keys; i++)
{
for (uint i = 0; i < num_keys; i++) {
const char* pKey = ppKeys[i];
param_map_const_iterator begin, end;
find(pKey, begin, end);
while (begin != end)
{
while (begin != end) {
if (pIterators)
pIterators->push_back(begin);
if (pUnmatched_indices)
{
if (pUnmatched_indices) {
int k = pUnmatched_indices->find(begin->second.m_index);
if (k >= 0)
pUnmatched_indices->erase_unordered(k);
@@ -318,22 +278,19 @@ namespace crnlib
return n;
}
void command_line_params::find(const char* pKey, param_map_const_iterator& begin, param_map_const_iterator& end) const
{
void command_line_params::find(const char* pKey, param_map_const_iterator& begin, param_map_const_iterator& end) const {
dynamic_string key(pKey);
begin = m_param_map.lower_bound(key);
end = m_param_map.upper_bound(key);
}
uint command_line_params::get_count(const char* pKey) const
{
uint command_line_params::get_count(const char* pKey) const {
param_map_const_iterator begin, end;
find(pKey, begin, end);
uint n = 0;
while (begin != end)
{
while (begin != end) {
n++;
begin++;
}
@@ -341,8 +298,7 @@ namespace crnlib
return n;
}
command_line_params::param_map_const_iterator command_line_params::get_param(const char* pKey, uint index) const
{
command_line_params::param_map_const_iterator command_line_params::get_param(const char* pKey, uint index) const {
param_map_const_iterator begin, end;
find(pKey, begin, end);
@@ -351,8 +307,7 @@ namespace crnlib
uint n = 0;
while ((begin != end) && (n != index))
{
while ((begin != end) && (n != index)) {
n++;
begin++;
}
@@ -363,13 +318,11 @@ namespace crnlib
return begin;
}
bool command_line_params::has_value(const char* pKey, uint index) const
{
bool command_line_params::has_value(const char* pKey, uint index) const {
return get_num_values(pKey, index) != 0;
}
uint command_line_params::get_num_values(const char* pKey, uint index) const
{
uint command_line_params::get_num_values(const char* pKey, uint index) const {
param_map_const_iterator it = get_param(pKey, index);
if (it == end())
@@ -378,8 +331,7 @@ namespace crnlib
return it->second.m_values.size();
}
bool command_line_params::get_value_as_bool(const char* pKey, uint index, bool def) const
{
bool command_line_params::get_value_as_bool(const char* pKey, uint index, bool def) const {
param_map_const_iterator it = get_param(pKey, index);
if (it == end())
return def;
@@ -390,27 +342,22 @@ namespace crnlib
return true;
}
int command_line_params::get_value_as_int(const char* pKey, uint index, int def, int l, int h, uint value_index) const
{
int command_line_params::get_value_as_int(const char* pKey, uint index, int def, int l, int h, uint value_index) const {
param_map_const_iterator it = get_param(pKey, index);
if ((it == end()) || (value_index >= it->second.m_values.size()))
return def;
int val;
const char* p = it->second.m_values[value_index].get_ptr();
if (!string_to_int(p, val))
{
if (!string_to_int(p, val)) {
crnlib::console::warning("Invalid value specified for parameter \"%s\", using default value of %i", pKey, def);
return def;
}
if (val < l)
{
if (val < l) {
crnlib::console::warning("Value %i for parameter \"%s\" is out of range, clamping to %i", val, pKey, l);
val = l;
}
else if (val > h)
{
} else if (val > h) {
crnlib::console::warning("Value %i for parameter \"%s\" is out of range, clamping to %i", val, pKey, h);
val = h;
}
@@ -418,27 +365,22 @@ namespace crnlib
return val;
}
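The parse-then-clamp behaviour of `get_value_as_int` shown above can be illustrated in isolation. This is a minimal sketch, not the crnlib implementation: the name `get_int_clamped_sketch` is hypothetical, `std::strtol` stands in for crnlib's `string_to_int`, and warnings go to `stderr` instead of `console::warning`.

```cpp
#include <cerrno>
#include <climits>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the parse-then-clamp logic: returns `def` when the
// text is not a valid integer, otherwise the parsed value clamped to [l, h].
int get_int_clamped_sketch(const char* p, int def, int l, int h) {
  errno = 0;
  char* end = nullptr;
  const long parsed = std::strtol(p, &end, 10);  // assumption: replaces string_to_int()

  // Valid only if at least one digit was consumed, nothing trails it,
  // and the result fits in an int.
  const bool ok = (end != p) && (*end == '\0') && (errno != ERANGE) &&
                  (parsed >= INT_MIN) && (parsed <= INT_MAX);
  if (!ok) {
    std::fprintf(stderr, "Invalid value \"%s\", using default value of %i\n", p, def);
    return def;
  }

  int val = static_cast<int>(parsed);
  if (val < l) {
    std::fprintf(stderr, "Value %i is out of range, clamping to %i\n", val, l);
    val = l;
  } else if (val > h) {
    std::fprintf(stderr, "Value %i is out of range, clamping to %i\n", val, h);
    val = h;
  }
  return val;
}
```

With a range of 0 to 10, the input "42" would be clamped to 10 and the input "abc" would fall back to the supplied default.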
float command_line_params::get_value_as_float(const char* pKey, uint index, float def, float l, float h, uint value_index) const
{
float command_line_params::get_value_as_float(const char* pKey, uint index, float def, float l, float h, uint value_index) const {
param_map_const_iterator it = get_param(pKey, index);
if ((it == end()) || (value_index >= it->second.m_values.size()))
return def;
float val;
const char* p = it->second.m_values[value_index].get_ptr();
if (!string_to_float(p, val))
{
if (!string_to_float(p, val)) {
crnlib::console::warning("Invalid value specified for float parameter \"%s\", using default value of %f", pKey, def);
return def;
}
if (val < l)
{
if (val < l) {
crnlib::console::warning("Value %f for parameter \"%s\" is out of range, clamping to %f", val, pKey, l);
val = l;
}
else if (val > h)
{
} else if (val > h) {
crnlib::console::warning("Value %f for parameter \"%s\" is out of range, clamping to %f", val, pKey, h);
val = h;
}
@@ -446,11 +388,9 @@ namespace crnlib
return val;
}
bool command_line_params::get_value_as_string(const char* pKey, uint index, dynamic_string& value, uint value_index) const
{
bool command_line_params::get_value_as_string(const char* pKey, uint index, dynamic_string& value, uint value_index) const {
param_map_const_iterator it = get_param(pKey, index);
if ((it == end()) || (value_index >= it->second.m_values.size()))
{
if ((it == end()) || (value_index >= it->second.m_values.size())) {
value.empty();
return false;
}
@@ -459,8 +399,7 @@ namespace crnlib
return true;
}
const dynamic_string& command_line_params::get_value_as_string_or_empty(const char* pKey, uint index, uint value_index) const
{
const dynamic_string& command_line_params::get_value_as_string_or_empty(const char* pKey, uint index, uint value_index) const {
param_map_const_iterator it = get_param(pKey, index);
if ((it == end()) || (value_index >= it->second.m_values.size()))
return g_empty_dynamic_string;
@@ -469,4 +408,3 @@ namespace crnlib
}
} // namespace crnlib
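The quote-aware splitting done by `split_params` in this file can be sketched as a standalone function. This is a sketch under assumptions: it uses `std::string` and `std::vector` in place of `dynamic_string` and `dynamic_string_array`, the name `split_params_sketch` is hypothetical, and the lines marked "assumed" fill in parts of the function that the hunks above do not show.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits a command line on spaces and tabs. A quoted run stays inside one
// token and keeps its quote characters (the caller unquotes later).
// Returns false if a quote is left open.
bool split_params_sketch(const char* p, std::vector<std::string>& params) {
  bool within_param = false;
  bool within_quote = false;
  std::string str;

  for (std::size_t ofs = 0; p[ofs]; ofs++) {
    const char c = p[ofs];
    const bool is_space = (c == ' ') || (c == '\t');

    if (within_param) {
      if (within_quote) {
        if (c == '"')
          within_quote = false;
        str.push_back(c);
      } else if (is_space) {
        if (!str.empty()) {
          params.push_back(str);
          str.clear();
        }
        within_param = false;
      } else {
        if (c == '"')
          within_quote = true;
        str.push_back(c);
      }
    } else if (!is_space) {
      within_param = true;
      if (c == '"')
        within_quote = true;  // assumed: not visible in the hunk
      str.push_back(c);       // assumed: not visible in the hunk
    }
  }

  if (within_quote)
    return false;  // unmatched quote

  if (!str.empty())
    params.push_back(str);  // assumed: flush the last token
  return true;
}
```

For the input `-file "a b.png" -out x.ktx` this would produce four tokens, with the quoted file name kept together as one token including its quotes.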
+6 -9
@@ -4,18 +4,16 @@
#include "crn_value.h"
#include <map>
namespace crnlib
{
namespace crnlib {
// Returns the command line passed to the app as a string.
// On systems where this isn't trivial, this function combines together the separate arguments, quoting and adding spaces as needed.
void get_command_line_as_single_string(dynamic_string& cmd_line, int argc, char* argv[]);
class command_line_params
{
class command_line_params {
public:
struct param_value
{
inline param_value() : m_index(0), m_modifier(0) { }
struct param_value {
inline param_value()
: m_index(0), m_modifier(0) {}
dynamic_string_array m_values;
uint m_index;
@@ -32,8 +30,7 @@ namespace crnlib
static bool split_params(const char* p, dynamic_string_array& params);
struct param_desc
{
struct param_desc {
const char* m_pName;
uint m_num_values;
bool m_support_listing_file;
+777 -1631
File diff suppressed because it is too large
+51 -107
@@ -2,9 +2,7 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
#define CRND_HEADER_FILE_ONLY
#include "../inc/crn_decomp.h"
#undef CRND_HEADER_FILE_ONLY
#include "../inc/crn_defs.h"
#include "../inc/crnlib.h"
#include "crn_symbol_codec.h"
@@ -13,10 +11,8 @@
#include "crn_image_utils.h"
#include "crn_texture_comp.h"
namespace crnlib
{
class crn_comp : public itexture_comp
{
namespace crnlib {
class crn_comp : public itexture_comp {
CRNLIB_NO_COPY_OR_ASSIGNMENT_OP(crn_comp);
public:
@@ -25,7 +21,7 @@ namespace crnlib
virtual const char* get_ext() const { return "CRN"; }
virtual bool compress_init(const crn_comp_params& params);
virtual bool compress_init(const crn_comp_params&) { return true; };
virtual bool compress_pass(const crn_comp_params& params, float* pEffective_bitrate);
virtual void compress_deinit();
@@ -41,27 +37,7 @@ namespace crnlib
image_u8 m_images[cCRNMaxFaces][cCRNMaxLevels];
struct level_tag
{
uint m_width, m_height;
uint m_chunk_width, m_chunk_height;
uint m_group_index;
uint m_num_chunks;
uint m_first_chunk;
uint m_group_first_chunk;
} m_levels[cCRNMaxLevels];
struct mip_group
{
mip_group() : m_first_chunk(0), m_num_chunks(0) { }
uint m_first_chunk;
uint m_num_chunks;
};
crnlib::vector<mip_group> m_mip_groups;
enum comp
{
enum comp {
cColor,
cAlpha0,
cAlpha1,
@@ -69,113 +45,81 @@ namespace crnlib
};
bool m_has_comp[cNumComps];
bool m_has_etc_color_blocks;
bool m_has_subblocks;
struct chunk_detail
{
chunk_detail() { utils::zero_object(*this); }
uint m_first_endpoint_index;
uint m_first_selector_index;
struct level_details {
uint first_block;
uint num_blocks;
uint block_width;
};
typedef crnlib::vector<chunk_detail> chunk_detail_vec;
chunk_detail_vec m_chunk_details;
crnlib::vector<level_details> m_levels;
crnlib::vector<uint> m_endpoint_indices[cNumComps];
crnlib::vector<uint> m_selector_indices[cNumComps];
uint m_total_chunks;
dxt_hc::pixel_chunk_vec m_chunks;
uint m_total_blocks;
crnlib::vector<uint32> m_color_endpoints;
crnlib::vector<uint32> m_alpha_endpoints;
crnlib::vector<uint32> m_color_selectors;
crnlib::vector<uint64> m_alpha_selectors;
crnlib::vector<dxt_hc::endpoint_indices_details> m_endpoint_indices;
crnlib::vector<dxt_hc::selector_indices_details> m_selector_indices;
crnd::crn_header m_crn_header;
crnlib::vector<uint8> m_comp_data;
dxt_hc m_hvq;
symbol_histogram m_chunk_encoding_hist;
static_huffman_data_model m_chunk_encoding_dm;
symbol_histogram m_reference_hist;
static_huffman_data_model m_reference_dm;
crnlib::vector<uint16> m_endpoint_remaping[2];
symbol_histogram m_endpoint_index_hist[2];
static_huffman_data_model m_endpoint_index_dm[2]; // color, alpha
static_huffman_data_model m_endpoint_index_dm[2];
crnlib::vector<uint16> m_selector_remaping[2];
symbol_histogram m_selector_index_hist[2];
static_huffman_data_model m_selector_index_dm[2]; // color, alpha
static_huffman_data_model m_selector_index_dm[2];
crnlib::vector<uint8> m_packed_chunks[cCRNMaxLevels];
crnlib::vector<uint8> m_packed_blocks[cCRNMaxLevels];
crnlib::vector<uint8> m_packed_data_models;
crnlib::vector<uint8> m_packed_color_endpoints;
crnlib::vector<uint8> m_packed_color_selectors;
crnlib::vector<uint8> m_packed_alpha_endpoints;
crnlib::vector<uint8> m_packed_alpha_selectors;
void clear();
void append_chunks(const image_u8& img, uint num_chunks_x, uint num_chunks_y, dxt_hc::pixel_chunk_vec& chunks, float weight);
static float color_endpoint_similarity_func(uint index_a, uint index_b, void* pContext);
static float alpha_endpoint_similarity_func(uint index_a, uint index_b, void* pContext);
void sort_color_endpoint_codebook(crnlib::vector<uint>& remapping, const crnlib::vector<uint>& endpoints);
void sort_alpha_endpoint_codebook(crnlib::vector<uint>& remapping, const crnlib::vector<uint>& endpoints);
bool pack_color_endpoints(crnlib::vector<uint8>& data, const crnlib::vector<uint>& remapping, const crnlib::vector<uint>& endpoint_indices, uint trial_index);
bool pack_alpha_endpoints(crnlib::vector<uint8>& data, const crnlib::vector<uint>& remapping, const crnlib::vector<uint>& endpoint_indices, uint trial_index);
static float color_selector_similarity_func(uint index_a, uint index_b, void* pContext);
static float alpha_selector_similarity_func(uint index_a, uint index_b, void* pContext);
void sort_selector_codebook(crnlib::vector<uint>& remapping, const crnlib::vector<dxt_hc::selectors>& selectors, const uint8* pTo_linear);
bool pack_selectors(
crnlib::vector<uint8>& packed_data,
const crnlib::vector<uint>& selector_indices,
const crnlib::vector<dxt_hc::selectors>& selectors,
const crnlib::vector<uint>& remapping,
uint max_selector_value,
const uint8* pTo_linear,
uint trial_index);
bool alias_images();
void create_chunks();
bool quantize_chunks();
void create_chunk_indices();
bool pack_chunks(
uint first_chunk, uint num_chunks,
bool pack_color_endpoints(crnlib::vector<uint8>& packed_data, const crnlib::vector<uint16>& remapping);
bool pack_color_endpoints_etc(crnlib::vector<uint8>& packed_data, const crnlib::vector<uint16>& remapping);
bool pack_color_selectors(crnlib::vector<uint8>& packed_data, const crnlib::vector<uint16>& remapping);
bool pack_alpha_endpoints(crnlib::vector<uint8>& packed_data, const crnlib::vector<uint16>& remapping);
bool pack_alpha_selectors(crnlib::vector<uint8>& packed_data, const crnlib::vector<uint16>& remapping);
bool pack_blocks(
uint group,
bool clear_histograms,
symbol_codec* pCodec,
const crnlib::vector<uint>* pColor_endpoint_remap,
const crnlib::vector<uint>* pColor_selector_remap,
const crnlib::vector<uint>* pAlpha_endpoint_remap,
const crnlib::vector<uint>* pAlpha_selector_remap);
const crnlib::vector<uint16>* pColor_endpoint_remap,
const crnlib::vector<uint16>* pColor_selector_remap,
const crnlib::vector<uint16>* pAlpha_endpoint_remap,
const crnlib::vector<uint16>* pAlpha_selector_remap
);
bool pack_chunks_simulation(
uint first_chunk, uint num_chunks,
uint& total_bits,
const crnlib::vector<uint>* pColor_endpoint_remap,
const crnlib::vector<uint>* pColor_selector_remap,
const crnlib::vector<uint>* pAlpha_endpoint_remap,
const crnlib::vector<uint>* pAlpha_selector_remap);
bool alias_images();
void clear();
bool quantize_images();
void optimize_color_endpoint_codebook_task(uint64 data, void* pData_ptr);
bool optimize_color_endpoint_codebook(crnlib::vector<uint>& remapping);
void optimize_color_endpoints_task(uint64 data, void* pData_ptr);
void optimize_color_selectors();
void optimize_color();
void optimize_color_selector_codebook_task(uint64 data, void* pData_ptr);
bool optimize_color_selector_codebook(crnlib::vector<uint>& remapping);
void optimize_alpha_endpoint_codebook_task(uint64 data, void* pData_ptr);
bool optimize_alpha_endpoint_codebook(crnlib::vector<uint>& remapping);
void optimize_alpha_selector_codebook_task(uint64 data, void* pData_ptr);
bool optimize_alpha_selector_codebook(crnlib::vector<uint>& remapping);
bool create_comp_data();
void optimize_alpha_endpoints_task(uint64 data, void* pData_ptr);
void optimize_alpha_selectors();
void optimize_alpha();
bool pack_data_models();
bool update_progress(uint phase_index, uint subphase_index, uint subphase_total);
bool compress_internal();
static void append_vec(crnlib::vector<uint8>& a, const void* p, uint size);
static void append_vec(crnlib::vector<uint8>& a, const crnlib::vector<uint8>& b);
bool create_comp_data();
bool update_progress(uint phase_index, uint subphase_index, uint subphase_total);
bool compress_internal();
};
} // namespace crnlib
+40 -62
@@ -5,8 +5,7 @@
#include "crn_data_stream.h"
#include "crn_threading.h"
namespace crnlib
{
namespace crnlib {
eConsoleMessageType console::m_default_category = cInfoConsoleMessage;
crnlib::vector<console::console_func> console::m_output_funcs;
bool console::m_crlf = true;
@@ -19,39 +18,32 @@ namespace crnlib
const uint cConsoleBufSize = 4096;
void console::init()
{
if (!m_pMutex)
{
void console::init() {
if (!m_pMutex) {
m_pMutex = crnlib_new<mutex>();
}
}
void console::deinit()
{
if (m_pMutex)
{
void console::deinit() {
if (m_pMutex) {
crnlib_delete(m_pMutex);
m_pMutex = NULL;
}
}
void console::disable_crlf()
{
void console::disable_crlf() {
init();
m_crlf = false;
}
void console::enable_crlf()
{
void console::enable_crlf() {
init();
m_crlf = true;
}
void console::vprintf(eConsoleMessageType type, const char* p, va_list args)
{
void console::vprintf(eConsoleMessageType type, const char* p, va_list args) {
init();
scoped_mutex lock(*m_pMutex);
@@ -63,27 +55,30 @@ namespace crnlib
bool handled = false;
if (m_output_funcs.size())
{
if (m_output_funcs.size()) {
for (uint i = 0; i < m_output_funcs.size(); i++)
if (m_output_funcs[i].m_func(type, buf, m_output_funcs[i].m_pData))
handled = true;
}
const char* pPrefix = NULL;
if ((m_prefixes) && (m_at_beginning_of_line))
{
switch (type)
{
case cDebugConsoleMessage: pPrefix = "Debug: "; break;
case cWarningConsoleMessage: pPrefix = "Warning: "; break;
case cErrorConsoleMessage: pPrefix = "Error: "; break;
default: break;
if ((m_prefixes) && (m_at_beginning_of_line)) {
switch (type) {
case cDebugConsoleMessage:
pPrefix = "Debug: ";
break;
case cWarningConsoleMessage:
pPrefix = "Warning: ";
break;
case cErrorConsoleMessage:
pPrefix = "Error: ";
break;
default:
break;
}
}
if ((!m_output_disabled) && (!handled))
{
if ((!m_output_disabled) && (!handled)) {
if (pPrefix)
::printf("%s", pPrefix);
::printf(m_crlf ? "%s\n" : "%s", buf);
@@ -92,8 +87,7 @@ namespace crnlib
uint n = strlen(buf);
m_at_beginning_of_line = (m_crlf) || ((n) && (buf[n - 1] == '\n'));
if ((type != cProgressConsoleMessage) && (m_pLog_stream))
{
if ((type != cProgressConsoleMessage) && (m_pLog_stream)) {
// Yes this is bad.
dynamic_string tmp_buf(buf);
@@ -104,38 +98,33 @@ namespace crnlib
}
}
void console::printf(eConsoleMessageType type, const char* p, ...)
{
void console::printf(eConsoleMessageType type, const char* p, ...) {
va_list args;
va_start(args, p);
vprintf(type, p, args);
va_end(args);
}
void console::printf(const char* p, ...)
{
void console::printf(const char* p, ...) {
va_list args;
va_start(args, p);
vprintf(m_default_category, p, args);
va_end(args);
}
void console::set_default_category(eConsoleMessageType category)
{
void console::set_default_category(eConsoleMessageType category) {
init();
m_default_category = category;
}
eConsoleMessageType console::get_default_category()
{
eConsoleMessageType console::get_default_category() {
init();
return m_default_category;
}
void console::add_console_output_func(console_output_func pFunc, void* pData)
{
void console::add_console_output_func(console_output_func pFunc, void* pData) {
init();
scoped_mutex lock(*m_pMutex);
@@ -143,76 +132,65 @@ namespace crnlib
m_output_funcs.push_back(console_func(pFunc, pData));
}
void console::remove_console_output_func(console_output_func pFunc)
{
void console::remove_console_output_func(console_output_func pFunc) {
init();
scoped_mutex lock(*m_pMutex);
for (int i = m_output_funcs.size() - 1; i >= 0; i--)
{
if (m_output_funcs[i].m_func == pFunc)
{
for (int i = m_output_funcs.size() - 1; i >= 0; i--) {
if (m_output_funcs[i].m_func == pFunc) {
m_output_funcs.erase(m_output_funcs.begin() + i);
}
}
if (!m_output_funcs.size())
{
if (!m_output_funcs.size()) {
m_output_funcs.clear();
}
}
void console::progress(const char* p, ...)
{
void console::progress(const char* p, ...) {
va_list args;
va_start(args, p);
vprintf(cProgressConsoleMessage, p, args);
va_end(args);
}
void console::info(const char* p, ...)
{
void console::info(const char* p, ...) {
va_list args;
va_start(args, p);
vprintf(cInfoConsoleMessage, p, args);
va_end(args);
}
void console::message(const char* p, ...)
{
void console::message(const char* p, ...) {
va_list args;
va_start(args, p);
vprintf(cMessageConsoleMessage, p, args);
va_end(args);
}
void console::cons(const char* p, ...)
{
void console::cons(const char* p, ...) {
va_list args;
va_start(args, p);
vprintf(cConsoleConsoleMessage, p, args);
va_end(args);
}
void console::debug(const char* p, ...)
{
void console::debug(const char* p, ...) {
va_list args;
va_start(args, p);
vprintf(cDebugConsoleMessage, p, args);
va_end(args);
}
void console::warning(const char* p, ...)
{
void console::warning(const char* p, ...) {
va_list args;
va_start(args, p);
vprintf(cWarningConsoleMessage, p, args);
va_end(args);
}
void console::error(const char* p, ...)
{
void console::error(const char* p, ...) {
va_list args;
va_start(args, p);
vprintf(cErrorConsoleMessage, p, args);
+9 -16
@@ -7,14 +7,12 @@
#include <tchar.h>
#include <conio.h>
#endif
namespace crnlib
{
namespace crnlib {
class dynamic_string;
class data_stream;
class mutex;
enum eConsoleMessageType
{
enum eConsoleMessageType {
cDebugConsoleMessage, // debugging messages
cProgressConsoleMessage, // progress messages
cInfoConsoleMessage, // ordinary messages
@@ -28,8 +26,7 @@ namespace crnlib
typedef bool (*console_output_func)(eConsoleMessageType type, const char* pMsg, void* pData);
class console
{
class console {
public:
static void init();
static void deinit();
@@ -77,9 +74,9 @@ namespace crnlib
private:
static eConsoleMessageType m_default_category;
struct console_func
{
console_func(console_output_func func = NULL, void* pData = NULL) : m_func(func), m_pData(pData) { }
struct console_func {
console_func(console_output_func func = NULL, void* pData = NULL)
: m_func(func), m_pData(pData) {}
console_output_func m_func;
void* m_pData;
@@ -98,15 +95,13 @@ namespace crnlib
};
#if defined(WIN32)
inline int crn_getch()
{
inline int crn_getch() {
return _getch();
}
#elif defined(__GNUC__)
#include <termios.h>
#include <unistd.h>
inline int crn_getch()
{
inline int crn_getch() {
struct termios oldt, newt;
int ch;
tcgetattr(STDIN_FILENO, &oldt);
@@ -118,11 +113,9 @@ namespace crnlib
return ch;
}
#else
inline int crn_getch()
{
inline int crn_getch() {
printf("crn_getch: Unimplemented");
return 0;
}
#endif
} // namespace crnlib
+1 -2
@@ -6,8 +6,7 @@
#include "crn_winhdr.h"
#endif
namespace crnlib
{
namespace crnlib {
const char* g_copyright_str = "Copyright (c) 2010-2016 Richard Geldreich, Jr. and Binomial LLC";
const char* g_sig_str = "C8cfRlaorj0wLtnMSxrBJxTC85rho2L9hUZKHcBL";
+25 -39
@@ -3,30 +3,29 @@
#include "crn_core.h"
#include "crn_data_stream.h"
namespace crnlib
{
data_stream::data_stream() :
m_attribs(0),
m_opened(false), m_error(false), m_got_cr(false)
{
namespace crnlib {
data_stream::data_stream()
: m_attribs(0),
m_opened(false),
m_error(false),
m_got_cr(false) {
}
data_stream::data_stream(const char* pName, uint attribs) :
m_name(pName),
data_stream::data_stream(const char* pName, uint attribs)
: m_name(pName),
m_attribs(static_cast<uint16>(attribs)),
m_opened(false), m_error(false), m_got_cr(false)
{
m_opened(false),
m_error(false),
m_got_cr(false) {
}
uint64 data_stream::skip(uint64 len)
{
uint64 data_stream::skip(uint64 len) {
uint64 total_bytes_read = 0;
const uint cBufSize = 1024;
uint8 buf[cBufSize];
while (len)
{
while (len) {
const uint64 bytes_to_read = math::minimum<uint64>(sizeof(buf), len);
const uint64 bytes_read = read(buf, static_cast<uint>(bytes_to_read));
total_bytes_read += bytes_read;
@@ -40,33 +39,26 @@ namespace crnlib
return total_bytes_read;
}
bool data_stream::read_line(dynamic_string& str)
{
bool data_stream::read_line(dynamic_string& str) {
str.empty();
for ( ; ; )
{
for (;;) {
const int c = read_byte();
const bool prev_got_cr = m_got_cr;
m_got_cr = false;
if (c < 0)
{
if (c < 0) {
if (!str.is_empty())
break;
return false;
}
else if ((26 == c) || (!c))
} else if ((26 == c) || (!c))
continue;
else if (13 == c)
{
else if (13 == c) {
m_got_cr = true;
break;
}
else if (10 == c)
{
} else if (10 == c) {
if (prev_got_cr)
continue;
@@ -79,8 +71,7 @@ namespace crnlib
return true;
}
bool data_stream::printf(const char* p, ...)
{
bool data_stream::printf(const char* p, ...) {
va_list args;
va_start(args, p);
@@ -91,26 +82,22 @@ namespace crnlib
return write(buf.get_ptr(), buf.get_len() * sizeof(char)) == buf.get_len() * sizeof(char);
}
bool data_stream::write_line(const dynamic_string& str)
{
bool data_stream::write_line(const dynamic_string& str) {
if (!str.is_empty())
return write(str.get_ptr(), str.get_len()) == str.get_len();
return true;
}
bool data_stream::read_array(vector<uint8>& buf)
{
if (buf.size() < get_remaining())
{
bool data_stream::read_array(vector<uint8>& buf) {
if (buf.size() < get_remaining()) {
if (get_remaining() > 1024U * 1024U * 1024U)
return false;
buf.resize((uint)get_remaining());
}
if (!get_remaining())
{
if (!get_remaining()) {
buf.resize(0);
return true;
}
@@ -118,8 +105,7 @@ namespace crnlib
return read(&buf[0], buf.size()) == buf.size();
}
bool data_stream::write_array(const vector<uint8>& buf)
{
bool data_stream::write_array(const vector<uint8>& buf) {
if (!buf.empty())
return write(&buf[0], buf.size()) == buf.size();
return true;
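The line-ending handling in `data_stream::read_line` (CR, LF, and CRLF all end a line, while bytes 26 and 0 are skipped) can be sketched over an in-memory buffer. This is an illustration only: `line_reader_sketch` is a hypothetical class, it reads from a `std::string` instead of a stream, and the lines marked "assumed" cover parts of the function that the hunk above omits.

```cpp
#include <cstddef>
#include <string>

// Hypothetical in-memory reader mirroring the visible read_line() logic.
class line_reader_sketch {
 public:
  explicit line_reader_sketch(const std::string& data)
      : m_data(data), m_pos(0), m_got_cr(false) {}

  // Returns false only when the input is exhausted and no characters were read.
  bool read_line(std::string& str) {
    str.clear();
    for (;;) {
      const int c = read_byte();
      const bool prev_got_cr = m_got_cr;
      m_got_cr = false;

      if (c < 0) {
        if (!str.empty())
          break;  // final line without a terminator
        return false;
      } else if ((c == 26) || (c == 0)) {
        continue;  // skip Ctrl-Z and NUL bytes
      } else if (c == 13) {
        m_got_cr = true;  // remember the CR so a following LF is swallowed
        break;
      } else if (c == 10) {
        if (prev_got_cr)
          continue;  // second half of a CRLF pair
        break;       // assumed: a bare LF ends the line
      }
      str.push_back(static_cast<char>(c));  // assumed: append ordinary bytes
    }
    return true;
  }

 private:
  int read_byte() {
    if (m_pos >= m_data.size())
      return -1;
    return static_cast<unsigned char>(m_data[m_pos++]);
  }

  std::string m_data;
  std::size_t m_pos;
  bool m_got_cr;
};
```

The `m_got_cr` flag is what lets a CRLF pair count as a single terminator even though the reader returns as soon as it sees the CR.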
+19 -10
@@ -2,10 +2,8 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
namespace crnlib
{
enum data_stream_attribs
{
namespace crnlib {
enum data_stream_attribs {
cDataStreamReadable = 1,
cDataStreamWritable = 2,
cDataStreamSeekable = 4
@@ -14,8 +12,7 @@ namespace crnlib
const int64 DATA_STREAM_SIZE_UNKNOWN = cINT64_MAX;
const int64 DATA_STREAM_SIZE_INFINITE = cUINT64_MAX;
class data_stream
{
class data_stream {
data_stream(const data_stream&);
data_stream& operator=(const data_stream&);
@@ -27,7 +24,12 @@ namespace crnlib
virtual data_stream* get_parent() { return NULL; }
virtual bool close() { m_opened = false; m_error = false; m_got_cr = false; return true; }
virtual bool close() {
m_opened = false;
m_error = false;
m_got_cr = false;
return true;
}
typedef uint16 attribs_t;
inline attribs_t get_attribs() const { return m_attribs; }
@@ -60,13 +62,21 @@ namespace crnlib
virtual const void* get_ptr() const { return NULL; }
inline int read_byte() { uint8 c; if (read(&c, 1) != 1) return -1; return c; }
inline int read_byte() {
uint8 c;
if (read(&c, 1) != 1)
return -1;
return c;
}
inline bool write_byte(uint8 c) { return write(&c, 1) == 1; }
bool read_line(dynamic_string& str);
bool printf(const char* p, ...);
bool write_line(const dynamic_string& str);
bool write_bom() { uint16 bom = 0xFEFF; return write(&bom, sizeof(bom)) == sizeof(bom); }
bool write_bom() {
uint16 bom = 0xFEFF;
return write(&bom, sizeof(bom)) == sizeof(bom);
}
bool read_array(vector<uint8>& buf);
bool write_array(const vector<uint8>& buf);
@@ -86,4 +96,3 @@ namespace crnlib
};
} // namespace crnlib
+176 -149
@@ -3,18 +3,24 @@
#pragma once
#include "crn_data_stream.h"
namespace crnlib
{
namespace crnlib {
// Defaults to little endian mode.
class data_stream_serializer
{
class data_stream_serializer {
public:
data_stream_serializer() : m_pStream(NULL), m_little_endian(true) { }
data_stream_serializer(data_stream* pStream) : m_pStream(pStream), m_little_endian(true) { }
data_stream_serializer(data_stream& stream) : m_pStream(&stream), m_little_endian(true) { }
data_stream_serializer(const data_stream_serializer& other) : m_pStream(other.m_pStream), m_little_endian(other.m_little_endian) { }
data_stream_serializer()
: m_pStream(NULL), m_little_endian(true) {}
data_stream_serializer(data_stream* pStream)
: m_pStream(pStream), m_little_endian(true) {}
data_stream_serializer(data_stream& stream)
: m_pStream(&stream), m_little_endian(true) {}
data_stream_serializer(const data_stream_serializer& other)
: m_pStream(other.m_pStream), m_little_endian(other.m_little_endian) {}
data_stream_serializer& operator= (const data_stream_serializer& rhs) { m_pStream = rhs.m_pStream; m_little_endian = rhs.m_little_endian; return *this; }
data_stream_serializer& operator=(const data_stream_serializer& rhs) {
m_pStream = rhs.m_pStream;
m_little_endian = rhs.m_little_endian;
return *this;
}
data_stream* get_stream() const { return m_pStream; }
void set_stream(data_stream* pStream) { m_pStream = pStream; }
@@ -26,19 +32,16 @@ namespace crnlib
bool get_little_endian() const { return m_little_endian; }
void set_little_endian(bool little_endian) { m_little_endian = little_endian; }
bool write(const void* pBuf, uint len)
{
bool write(const void* pBuf, uint len) {
return m_pStream->write(pBuf, len) == len;
}
bool read(void* pBuf, uint len)
{
bool read(void* pBuf, uint len) {
return m_pStream->read(pBuf, len) == len;
}
// size = size of each element, count = number of elements, returns actual count of elements written
uint write(const void* pBuf, uint size, uint count)
{
uint write(const void* pBuf, uint size, uint count) {
uint actual_size = size * count;
if (!actual_size)
return 0;
@@ -49,8 +52,7 @@ namespace crnlib
}
// size = size of each element, count = number of elements, returns actual count of elements read
uint read(void* pBuf, uint size, uint count)
{
uint read(void* pBuf, uint size, uint count) {
uint actual_size = size * count;
if (!actual_size)
return 0;
@@ -60,28 +62,23 @@ namespace crnlib
return n / size;
}
bool write_chars(const char* pBuf, uint len)
{
bool write_chars(const char* pBuf, uint len) {
return write(pBuf, len);
}
bool read_chars(char* pBuf, uint len)
{
bool read_chars(char* pBuf, uint len) {
return read(pBuf, len);
}
bool skip(uint len)
{
bool skip(uint len) {
return m_pStream->skip(len) == len;
}
template <typename T>
bool write_object(const T& obj)
{
bool write_object(const T& obj) {
if (m_little_endian == c_crnlib_little_endian_platform)
return write(&obj, sizeof(obj));
else
{
else {
uint8 buf[sizeof(T)];
uint buf_size = sizeof(T);
void* pBuf = buf;
@@ -92,12 +89,10 @@ namespace crnlib
}
template <typename T>
bool read_object(T& obj)
{
bool read_object(T& obj) {
if (m_little_endian == c_crnlib_little_endian_platform)
return read(&obj, sizeof(obj));
else
{
else {
uint8 buf[sizeof(T)];
if (!read(buf, sizeof(T)))
return false;
@@ -111,14 +106,12 @@ namespace crnlib
}
template <typename T>
bool write_value(T value)
{
bool write_value(T value) {
return write_object(value);
}
template <typename T>
T read_value(const T& on_error_value = T())
{
T read_value(const T& on_error_value = T()) {
T result;
if (!read_object(result))
result = on_error_value;
@@ -126,23 +119,19 @@ namespace crnlib
}
template <typename T>
bool write_enum(T e)
{
bool write_enum(T e) {
int val = static_cast<int>(e);
return write_object(val);
}
template <typename T>
T read_enum()
{
T read_enum() {
return static_cast<T>(read_value<int>());
}
// Writes uint using a simple variable length code (VLC).
bool write_uint_vlc(uint val)
{
do
{
bool write_uint_vlc(uint val) {
do {
uint8 c = static_cast<uint8>(val) & 0x7F;
if (val <= 0x7F)
c |= 0x80;
@@ -157,13 +146,11 @@ namespace crnlib
}
// Reads uint using a simple variable length code (VLC).
bool read_uint_vlc(uint& val)
{
bool read_uint_vlc(uint& val) {
val = 0;
uint shift = 0;
for ( ; ; )
{
for (;;) {
if (shift >= 32)
return false;
@@ -181,8 +168,7 @@ namespace crnlib
return true;
}
bool write_c_str(const char* p)
{
bool write_c_str(const char* p) {
uint len = static_cast<uint>(strlen(p));
if (!write_uint_vlc(len))
return false;
@@ -190,8 +176,7 @@ namespace crnlib
return write_chars(p, len);
}
bool read_c_str(char* pBuf, uint buf_size)
{
bool read_c_str(char* pBuf, uint buf_size) {
uint len;
if (!read_uint_vlc(len))
return false;
@@ -203,16 +188,14 @@ namespace crnlib
return read_chars(pBuf, len);
}
bool write_string(const dynamic_string& str)
{
bool write_string(const dynamic_string& str) {
if (!write_uint_vlc(str.get_len()))
return false;
return write_chars(str.get_ptr(), str.get_len());
}
bool read_string(dynamic_string& str)
{
bool read_string(dynamic_string& str) {
uint len;
if (!read_uint_vlc(len))
return false;
@@ -220,13 +203,11 @@ namespace crnlib
if (!str.set_len(len))
return false;
if (len)
{
if (len) {
if (!read_chars(str.get_ptr_raw(), len))
return false;
if (memchr(str.get_ptr(), 0, len) != NULL)
{
if (memchr(str.get_ptr(), 0, len) != NULL) {
str.truncate(0);
return false;
}
@@ -236,13 +217,11 @@ namespace crnlib
}
template <typename T>
bool write_vector(const T& vec)
{
bool write_vector(const T& vec) {
if (!write_uint_vlc(vec.size()))
return false;
for (uint i = 0; i < vec.size(); i++)
{
for (uint i = 0; i < vec.size(); i++) {
*this << vec[i];
if (get_error())
return false;
@@ -252,8 +231,7 @@ namespace crnlib
};
template <typename T>
bool read_vector(T& vec, uint num_expected = UINT_MAX)
{
bool read_vector(T& vec, uint num_expected = UINT_MAX) {
uint size;
if (!read_uint_vlc(size))
return false;
@@ -265,8 +243,7 @@ namespace crnlib
return false;
vec.resize(size);
for (uint i = 0; i < vec.size(); i++)
{
for (uint i = 0; i < vec.size(); i++) {
*this >> vec[i];
if (get_error())
@@ -276,52 +253,42 @@ namespace crnlib
return true;
}
bool read_entire_file(crnlib::vector<uint8>& buf)
{
bool read_entire_file(crnlib::vector<uint8>& buf) {
return m_pStream->read_array(buf);
}
bool write_entire_file(const crnlib::vector<uint8>& buf)
{
bool write_entire_file(const crnlib::vector<uint8>& buf) {
return m_pStream->write_array(buf);
}
// Got this idea from the Molly Rocket forums.
// fmt may contain the characters "1", "2", or "4".
bool writef(char *fmt, ...)
{
bool writef(char* fmt, ...) {
va_list v;
va_start(v, fmt);
while (*fmt)
{
switch (*fmt++)
{
case '1':
{
while (*fmt) {
switch (*fmt++) {
case '1': {
const uint8 x = static_cast<uint8>(va_arg(v, uint));
if (!write_value(x))
return false;
}
case '2':
{
case '2': {
const uint16 x = static_cast<uint16>(va_arg(v, uint));
if (!write_value(x))
return false;
}
case '4':
{
case '4': {
const uint32 x = static_cast<uint32>(va_arg(v, uint));
if (!write_value(x))
return false;
}
case ' ':
case ',':
{
case ',': {
break;
}
default:
{
default: {
CRNLIB_ASSERT(0);
return false;
}
@@ -334,43 +301,35 @@ namespace crnlib
// Got this idea from the Molly Rocket forums.
// fmt may contain the characters "1", "2", or "4".
bool readf(char *fmt, ...)
{
bool readf(char* fmt, ...) {
va_list v;
va_start(v, fmt);
while (*fmt)
{
switch (*fmt++)
{
case '1':
{
while (*fmt) {
switch (*fmt++) {
case '1': {
uint8* x = va_arg(v, uint8*);
CRNLIB_ASSERT(x);
if (!read_object(*x))
return false;
}
case '2':
{
case '2': {
uint16* x = va_arg(v, uint16*);
CRNLIB_ASSERT(x);
if (!read_object(*x))
return false;
}
case '4':
{
case '4': {
uint32* x = va_arg(v, uint32*);
CRNLIB_ASSERT(x);
if (!read_object(*x))
return false;
}
case ' ':
case ',':
{
case ',': {
break;
}
default:
{
default: {
CRNLIB_ASSERT(0);
return false;
}
@@ -388,81 +347,149 @@ namespace crnlib
};
// Write operators
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, bool val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, int8 val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, uint8 val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, int16 val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, uint16 val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, int32 val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, uint32 val) { serializer.write_uint_vlc(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, int64 val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, uint64 val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, long val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, unsigned long val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, float val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, double val) { serializer.write_value(val); return serializer; }
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, const char* p) { serializer.write_c_str(p); return serializer; }
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, bool val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, int8 val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, uint8 val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, int16 val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, uint16 val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, int32 val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, uint32 val) {
serializer.write_uint_vlc(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, int64 val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, uint64 val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, long val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, unsigned long val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, float val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, double val) {
serializer.write_value(val);
return serializer;
}
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, const char* p) {
serializer.write_c_str(p);
return serializer;
}
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, const dynamic_string& str)
{
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, const dynamic_string& str) {
serializer.write_string(str);
return serializer;
}
template <typename T>
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, const crnlib::vector<T>& vec)
{
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, const crnlib::vector<T>& vec) {
serializer.write_vector(vec);
return serializer;
}
template <typename T>
inline data_stream_serializer& operator<< (data_stream_serializer& serializer, const T* p)
{
inline data_stream_serializer& operator<<(data_stream_serializer& serializer, const T* p) {
serializer.write_object(*p);
return serializer;
}
// Read operators
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, bool& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, int8& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, uint8& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, int16& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, uint16& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, int32& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, uint32& val) { serializer.read_uint_vlc(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, int64& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, uint64& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, long& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, unsigned long& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, float& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, double& val) { serializer.read_object(val); return serializer; }
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, bool& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, int8& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, uint8& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, int16& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, uint16& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, int32& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, uint32& val) {
serializer.read_uint_vlc(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, int64& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, uint64& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, long& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, unsigned long& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, float& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, double& val) {
serializer.read_object(val);
return serializer;
}
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, dynamic_string& str)
{
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, dynamic_string& str) {
serializer.read_string(str);
return serializer;
}
template <typename T>
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, crnlib::vector<T>& vec)
{
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, crnlib::vector<T>& vec) {
serializer.read_vector(vec);
return serializer;
}
template <typename T>
inline data_stream_serializer& operator>> (data_stream_serializer& serializer, T* p)
{
inline data_stream_serializer& operator>>(data_stream_serializer& serializer, T* p) {
serializer.read_object(*p);
return serializer;
}
} // namespace crnlib
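Note on the variable length code used above: in this diff, `operator<<` and `operator>>` for `uint32` go through `write_uint_vlc` / `read_uint_vlc`, while the other integer types use `write_value` / `read_object`. The hunks only show part of both VLC functions, so the following is a minimal standalone sketch of the scheme as far as it can be read from the visible lines: 7 payload bits per byte, least significant group first, with `0x80` set on the final byte. The names `vlc_encode` and `vlc_decode`, the use of `std::vector`, and the `val >>= 7` / `shift += 7` steps are assumptions for illustration, not code from crnlib.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends val to out using 7 payload bits per byte, low group first.
// The high bit (0x80) flags the last byte, mirroring the visible
// `if (val <= 0x7F) c |= 0x80;` line in write_uint_vlc.
inline void vlc_encode(uint32_t val, std::vector<uint8_t>& out) {
  do {
    uint8_t c = static_cast<uint8_t>(val & 0x7F);
    if (val <= 0x7F)
      c |= 0x80;  // terminator flag on the final byte
    out.push_back(c);
    val >>= 7;  // assumed: the lines hidden by the hunk shift by 7 bits
  } while (val);
}

// Reads one value starting at in[pos] and advances pos past it.
// Returns false on truncated input or when more than five groups
// are seen (the visible `shift >= 32` check in read_uint_vlc).
inline bool vlc_decode(const std::vector<uint8_t>& in, std::size_t& pos, uint32_t& val) {
  val = 0;
  uint32_t shift = 0;
  for (;;) {
    if (shift >= 32)
      return false;
    if (pos >= in.size())
      return false;
    const uint8_t c = in[pos++];
    val |= static_cast<uint32_t>(c & 0x7F) << shift;
    shift += 7;  // assumed: matches the 7-bit groups written above
    if (c & 0x80)
      break;  // flagged byte ends the value
  }
  return true;
}
```

Under this scheme values up to 127 take one byte and a full 32-bit value takes five, which is why string lengths and vector sizes are cheap to store with it.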
+35 -67
@@ -5,44 +5,36 @@
#include "crn_dynamic_stream.h"
#include "crn_lzma_codec.h"
namespace crnlib
{
dds_comp::dds_comp() :
m_pParams(NULL),
namespace crnlib {
dds_comp::dds_comp()
: m_pParams(NULL),
m_pixel_fmt(PIXEL_FMT_INVALID),
m_pQDXT_state(NULL)
{
m_pQDXT_state(NULL) {
}
dds_comp::~dds_comp()
{
dds_comp::~dds_comp() {
crnlib_delete(m_pQDXT_state);
}
void dds_comp::clear()
{
void dds_comp::clear() {
m_src_tex.clear();
m_packed_tex.clear();
m_comp_data.clear();
m_pParams = NULL;
m_pixel_fmt = PIXEL_FMT_INVALID;
m_task_pool.deinit();
if (m_pQDXT_state)
{
if (m_pQDXT_state) {
crnlib_delete(m_pQDXT_state);
m_pQDXT_state = NULL;
}
}
bool dds_comp::create_dds_tex(mipmapped_texture &dds_tex)
{
bool dds_comp::create_dds_tex(mipmapped_texture& dds_tex) {
image_u8 images[cCRNMaxFaces][cCRNMaxLevels];
bool has_alpha = false;
for (uint face_index = 0; face_index < m_pParams->m_faces; face_index++)
{
for (uint level_index = 0; level_index < m_pParams->m_levels; level_index++)
{
for (uint face_index = 0; face_index < m_pParams->m_faces; face_index++) {
for (uint level_index = 0; level_index < m_pParams->m_levels; level_index++) {
const uint width = math::maximum(1U, m_pParams->m_width >> level_index);
const uint height = math::maximum(1U, m_pParams->m_height >> level_index);
@@ -60,12 +52,9 @@ namespace crnlib
images[face_index][level_index].set_component_valid(3, has_alpha);
image_utils::conversion_type conv_type = image_utils::get_image_conversion_type_from_crn_format((crn_format)m_pParams->m_format);
if (conv_type != image_utils::cConversion_Invalid)
{
for (uint face_index = 0; face_index < m_pParams->m_faces; face_index++)
{
for (uint level_index = 0; level_index < m_pParams->m_levels; level_index++)
{
if (conv_type != image_utils::cConversion_Invalid) {
for (uint face_index = 0; face_index < m_pParams->m_faces; face_index++) {
for (uint level_index = 0; level_index < m_pParams->m_levels; level_index++) {
image_u8 cooked_image(images[face_index][level_index]);
image_utils::convert_image(cooked_image, conv_type);
@@ -77,10 +66,8 @@ namespace crnlib
face_vec faces(m_pParams->m_faces);
for (uint face_index = 0; face_index < m_pParams->m_faces; face_index++)
{
for (uint level_index = 0; level_index < m_pParams->m_levels; level_index++)
{
for (uint face_index = 0; face_index < m_pParams->m_faces; face_index++) {
for (uint level_index = 0; level_index < m_pParams->m_levels; level_index++) {
mip_level* pMip = crnlib_new<mip_level>();
image_u8* pImage = crnlib_new<image_u8>();
@@ -99,34 +86,27 @@ namespace crnlib
return true;
}
static bool progress_callback_func(uint percentage_complete, void* pUser_data_ptr)
{
static bool progress_callback_func(uint percentage_complete, void* pUser_data_ptr) {
const crn_comp_params& params = *(const crn_comp_params*)pUser_data_ptr;
return params.m_pProgress_func(0, 1, percentage_complete, 100, params.m_pProgress_func_data) != 0;
}
static bool progress_callback_func_phase_0(uint percentage_complete, void* pUser_data_ptr)
{
static bool progress_callback_func_phase_0(uint percentage_complete, void* pUser_data_ptr) {
const crn_comp_params& params = *(const crn_comp_params*)pUser_data_ptr;
return params.m_pProgress_func(0, 2, percentage_complete, 100, params.m_pProgress_func_data) != 0;
}
static bool progress_callback_func_phase_1(uint percentage_complete, void* pUser_data_ptr)
{
static bool progress_callback_func_phase_1(uint percentage_complete, void* pUser_data_ptr) {
const crn_comp_params& params = *(const crn_comp_params*)pUser_data_ptr;
return params.m_pProgress_func(1, 2, percentage_complete, 100, params.m_pProgress_func_data) != 0;
}
bool dds_comp::convert_to_dxt(const crn_comp_params& params)
{
if ((params.m_quality_level == cCRNMaxQualityLevel) || (params.m_format == cCRNFmtDXT3))
{
bool dds_comp::convert_to_dxt(const crn_comp_params& params) {
if ((params.m_quality_level == cCRNMaxQualityLevel) || (params.m_format == cCRNFmtDXT3)) {
m_packed_tex = m_src_tex;
if (!m_packed_tex.convert(m_pixel_fmt, false, m_pack_params))
return false;
}
else
{
} else {
const bool hierarchical = (params.m_flags & cCRNCompFlagHierarchical) != 0;
m_q1_params.m_quality_level = params.m_quality_level;
@@ -135,12 +115,10 @@ namespace crnlib
m_q5_params.m_quality_level = params.m_quality_level;
m_q5_params.m_hierarchical = hierarchical;
if (!m_pQDXT_state)
{
if (!m_pQDXT_state) {
m_pQDXT_state = crnlib_new<mipmapped_texture::qdxt_state>(m_task_pool);
if (params.m_pProgress_func)
{
if (params.m_pProgress_func) {
m_q1_params.m_pProgress_func = progress_callback_func_phase_0;
m_q1_params.m_pProgress_data = (void*)&params;
m_q5_params.m_pProgress_func = progress_callback_func_phase_0;
@@ -150,16 +128,12 @@ namespace crnlib
if (!m_src_tex.qdxt_pack_init(*m_pQDXT_state, m_packed_tex, m_q1_params, m_q5_params, m_pixel_fmt, false))
return false;
if (params.m_pProgress_func)
{
if (params.m_pProgress_func) {
m_q1_params.m_pProgress_func = progress_callback_func_phase_1;
m_q5_params.m_pProgress_func = progress_callback_func_phase_1;
}
}
else
{
if (params.m_pProgress_func)
{
} else {
if (params.m_pProgress_func) {
m_q1_params.m_pProgress_func = progress_callback_func;
m_q1_params.m_pProgress_data = (void*)&params;
m_q5_params.m_pProgress_func = progress_callback_func;
@@ -174,8 +148,7 @@ namespace crnlib
return true;
}
bool dds_comp::compress_init(const crn_comp_params& params)
{
bool dds_comp::compress_init(const crn_comp_params& params) {
clear();
m_pParams = &params;
@@ -190,8 +163,7 @@ namespace crnlib
return false;
m_pack_params.init(*m_pParams);
if (params.m_pProgress_func)
{
if (params.m_pProgress_func) {
m_pack_params.m_pProgress_callback = progress_callback_func;
m_pack_params.m_pProgress_callback_user_data_ptr = (void*)&params;
}
@@ -213,9 +185,9 @@ namespace crnlib
return true;
}
bool dds_comp::compress_pass(const crn_comp_params& params, float *pEffective_bitrate)
{
if (pEffective_bitrate) *pEffective_bitrate = 0.0f;
bool dds_comp::compress_pass(const crn_comp_params& params, float* pEffective_bitrate) {
if (pEffective_bitrate)
*pEffective_bitrate = 0.0f;
if (!m_pParams)
return false;
@@ -233,16 +205,13 @@ namespace crnlib
m_comp_data.swap(out_stream.get_buf());
if (pEffective_bitrate)
{
if (pEffective_bitrate) {
lzma_codec lossless_codec;
crnlib::vector<uint8> cmp_tex_bytes;
if (lossless_codec.pack(m_comp_data.get_ptr(), m_comp_data.size(), cmp_tex_bytes))
{
if (lossless_codec.pack(m_comp_data.get_ptr(), m_comp_data.size(), cmp_tex_bytes)) {
uint comp_size = cmp_tex_bytes.size();
if (comp_size)
{
if (comp_size) {
*pEffective_bitrate = (comp_size * 8.0f) / m_src_tex.get_total_pixels_in_all_faces_and_mips();
}
}
@@ -251,8 +220,7 @@ namespace crnlib
return true;
}
void dds_comp::compress_deinit()
{
void dds_comp::compress_deinit() {
clear();
}
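Note on the arithmetic in this file: `create_dds_tex` sizes each mip level as `math::maximum(1U, m_width >> level_index)`, and `compress_pass` reports the effective bitrate as LZMA-packed bits divided by the total pixel count. A small standalone sketch of both calculations follows. The helper names `mip_dimension`, `total_pixels` and `effective_bitrate` are hypothetical, and `total_pixels` is only a stand-in for what `get_total_pixels_in_all_faces_and_mips()` presumably returns; its implementation is not part of this diff.

```cpp
#include <algorithm>
#include <cstdint>

// Size of one mip level along one axis: halve per level, never below 1.
// level is expected to stay well under 32 (mip chains are short).
inline uint32_t mip_dimension(uint32_t base, uint32_t level) {
  return std::max<uint32_t>(1u, base >> level);
}

// Pixel count summed over every mip level and face (hypothetical helper).
inline uint64_t total_pixels(uint32_t width, uint32_t height, uint32_t levels, uint32_t faces) {
  uint64_t per_face = 0;
  for (uint32_t level = 0; level < levels; level++)
    per_face += static_cast<uint64_t>(mip_dimension(width, level)) * mip_dimension(height, level);
  return per_face * faces;
}

// Bits per pixel after lossless packing, following the expression
// (comp_size * 8.0f) / total_pixels used in compress_pass.
inline float effective_bitrate(uint64_t compressed_bytes, uint64_t pixels) {
  if (!compressed_bytes || !pixels)
    return 0.0f;  // same default the diff assigns before measuring
  return (compressed_bytes * 8.0f) / pixels;
}
```

For example, a 4x4 single-face texture with three mip levels has 16 + 4 + 1 = 21 pixels, and 64 packed bytes over 128 pixels works out to 4 bits per pixel.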
+2 -4
@@ -5,10 +5,8 @@
#include "crn_mipmapped_texture.h"
#include "crn_texture_comp.h"
namespace crnlib
{
class dds_comp : public itexture_comp
{
namespace crnlib {
class dds_comp : public itexture_comp {
CRNLIB_NO_COPY_OR_ASSIGNMENT_OP(dds_comp);
public:
+83 -92
@@ -7,8 +7,7 @@
#include "crn_dxt_fast.h"
#include "crn_intersect.h"
namespace crnlib
{
namespace crnlib {
const uint8 g_dxt5_from_linear[cDXT5SelectorValues] = {0U, 2U, 3U, 4U, 5U, 6U, 7U, 1U};
const uint8 g_dxt5_to_linear[cDXT5SelectorValues] = {0U, 7U, 1U, 2U, 3U, 4U, 5U, 6U};
@@ -20,82 +19,102 @@ namespace crnlib
const uint8 g_six_alpha_invert_table[cDXT5SelectorValues] = {1, 0, 5, 4, 3, 2, 6, 7};
const uint8 g_eight_alpha_invert_table[cDXT5SelectorValues] = {1, 0, 7, 6, 5, 4, 3, 2};
const char* get_dxt_format_string(dxt_format fmt)
{
switch (fmt)
{
case cDXT1: return "DXT1";
case cDXT1A: return "DXT1A";
case cDXT3: return "DXT3";
case cDXT5: return "DXT5";
case cDXT5A: return "DXT5A";
case cDXN_XY: return "DXN_XY";
case cDXN_YX: return "DXN_YX";
case cETC1: return "ETC1";
default: break;
const char* get_dxt_format_string(dxt_format fmt) {
switch (fmt) {
case cDXT1:
return "DXT1";
case cDXT1A:
return "DXT1A";
case cDXT3:
return "DXT3";
case cDXT5:
return "DXT5";
case cDXT5A:
return "DXT5A";
case cDXN_XY:
return "DXN_XY";
case cDXN_YX:
return "DXN_YX";
case cETC1:
return "ETC1";
case cETC2:
return "ETC2";
case cETC2A:
return "ETC2A";
case cETC1S:
return "ETC1S";
case cETC2AS:
return "ETC2AS";
default:
break;
}
CRNLIB_ASSERT(false);
return "?";
}
const char* get_dxt_compressor_name(crn_dxt_compressor_type c)
{
switch (c)
{
case cCRNDXTCompressorCRN: return "CRN";
case cCRNDXTCompressorCRNF: return "CRNF";
case cCRNDXTCompressorRYG: return "RYG";
const char* get_dxt_compressor_name(crn_dxt_compressor_type c) {
switch (c) {
case cCRNDXTCompressorCRN:
return "CRN";
case cCRNDXTCompressorCRNF:
return "CRNF";
case cCRNDXTCompressorRYG:
return "RYG";
#if CRNLIB_SUPPORT_ATI_COMPRESS
case cCRNDXTCompressorATI: return "ATI";
case cCRNDXTCompressorATI:
return "ATI";
#endif
default: break;
default:
break;
}
CRNLIB_ASSERT(false);
return "?";
}
uint get_dxt_format_bits_per_pixel(dxt_format fmt)
{
switch (fmt)
{
uint get_dxt_format_bits_per_pixel(dxt_format fmt) {
switch (fmt) {
case cDXT1:
case cDXT1A:
case cDXT5A:
case cETC1:
case cETC2:
case cETC1S:
return 4;
case cDXT3:
case cDXT5:
case cDXN_XY:
case cDXN_YX:
case cETC2A:
case cETC2AS:
return 8;
default: break;
default:
break;
}
CRNLIB_ASSERT(false);
return 0;
}
bool get_dxt_format_has_alpha(dxt_format fmt)
{
switch (fmt)
{
bool get_dxt_format_has_alpha(dxt_format fmt) {
switch (fmt) {
case cDXT1A:
case cDXT3:
case cDXT5:
case cDXT5A:
case cETC2A:
case cETC2AS:
return true;
default: break;
default:
break;
}
return false;
}
uint16 dxt1_block::pack_color(const color_quad_u8& color, bool scaled, uint bias)
{
uint16 dxt1_block::pack_color(const color_quad_u8& color, bool scaled, uint bias) {
uint r = color.r;
uint g = color.g;
uint b = color.b;
if (scaled)
{
if (scaled) {
r = (r * 31U + bias) / 255U;
g = (g * 63U + bias) / 255U;
b = (b * 31U + bias) / 255U;
@@ -108,19 +127,16 @@ namespace crnlib
return static_cast<uint16>(b | (g << 5U) | (r << 11U));
}
uint16 dxt1_block::pack_color(uint r, uint g, uint b, bool scaled, uint bias)
{
uint16 dxt1_block::pack_color(uint r, uint g, uint b, bool scaled, uint bias) {
return pack_color(color_quad_u8(r, g, b, 0), scaled, bias);
}
color_quad_u8 dxt1_block::unpack_color(uint16 packed_color, bool scaled, uint alpha)
{
color_quad_u8 dxt1_block::unpack_color(uint16 packed_color, bool scaled, uint alpha) {
uint b = packed_color & 31U;
uint g = (packed_color >> 5U) & 63U;
uint r = (packed_color >> 11U) & 31U;
if (scaled)
{
if (scaled) {
b = (b << 3U) | (b >> 2U);
g = (g << 2U) | (g >> 4U);
r = (r << 3U) | (r >> 2U);
@@ -129,16 +145,14 @@ namespace crnlib
return color_quad_u8(cNoClamp, r, g, b, math::minimum(alpha, 255U));
}
void dxt1_block::unpack_color(uint& r, uint& g, uint& b, uint16 packed_color, bool scaled)
{
void dxt1_block::unpack_color(uint& r, uint& g, uint& b, uint16 packed_color, bool scaled) {
color_quad_u8 c(unpack_color(packed_color, scaled, 0));
r = c.r;
g = c.g;
b = c.b;
}
void dxt1_block::get_block_colors_NV5x(color_quad_u8* pDst, uint16 packed_col0, uint16 packed_col1, bool color4)
{
void dxt1_block::get_block_colors_NV5x(color_quad_u8* pDst, uint16 packed_col0, uint16 packed_col1, bool color4) {
color_quad_u8 col0(unpack_color(packed_col0, false));
color_quad_u8 col1(unpack_color(packed_col1, false));
@@ -165,8 +179,7 @@ namespace crnlib
pDst[3].g = static_cast<uint8>((256 * pDst[1].g - gdiff / 4 + 128 - gdiff * 80) / 256);
pDst[3].b = static_cast<uint8>(((2 * col1.b + col0.b) * 22) / 8);
pDst[3].a = 0xFF;
}
else {
} else {
pDst[2].r = static_cast<uint8>(((col0.r + col1.r) * 33) / 8);
pDst[2].g = static_cast<uint8>((256 * pDst[0].g + gdiff / 4 + 128 + gdiff * 128) / 256);
pDst[2].b = static_cast<uint8>(((col0.b + col1.b) * 33) / 8);
@@ -179,8 +192,7 @@ namespace crnlib
}
}
uint dxt1_block::get_block_colors3(color_quad_u8* pDst, uint16 color0, uint16 color1)
{
uint dxt1_block::get_block_colors3(color_quad_u8* pDst, uint16 color0, uint16 color1) {
color_quad_u8 c0(unpack_color(color0, true));
color_quad_u8 c1(unpack_color(color1, true));
@@ -192,8 +204,7 @@ namespace crnlib
return 3;
}
uint dxt1_block::get_block_colors4(color_quad_u8* pDst, uint16 color0, uint16 color1)
{
uint dxt1_block::get_block_colors4(color_quad_u8* pDst, uint16 color0, uint16 color1) {
color_quad_u8 c0(unpack_color(color0, true));
color_quad_u8 c1(unpack_color(color1, true));
@@ -207,8 +218,7 @@ namespace crnlib
return 4;
}
uint dxt1_block::get_block_colors3_round(color_quad_u8* pDst, uint16 color0, uint16 color1)
{
uint dxt1_block::get_block_colors3_round(color_quad_u8* pDst, uint16 color0, uint16 color1) {
color_quad_u8 c0(unpack_color(color0, true));
color_quad_u8 c1(unpack_color(color1, true));
@@ -220,8 +230,7 @@ namespace crnlib
return 3;
}
uint dxt1_block::get_block_colors4_round(color_quad_u8* pDst, uint16 color0, uint16 color1)
{
uint dxt1_block::get_block_colors4_round(color_quad_u8* pDst, uint16 color0, uint16 color1) {
color_quad_u8 c0(unpack_color(color0, true));
color_quad_u8 c1(unpack_color(color1, true));
@@ -236,45 +245,37 @@ namespace crnlib
return 4;
}
uint dxt1_block::get_block_colors(color_quad_u8* pDst, uint16 color0, uint16 color1)
{
uint dxt1_block::get_block_colors(color_quad_u8* pDst, uint16 color0, uint16 color1) {
if (color0 > color1)
return get_block_colors4(pDst, color0, color1);
else
return get_block_colors3(pDst, color0, color1);
}
uint dxt1_block::get_block_colors_round(color_quad_u8* pDst, uint16 color0, uint16 color1)
{
uint dxt1_block::get_block_colors_round(color_quad_u8* pDst, uint16 color0, uint16 color1) {
if (color0 > color1)
return get_block_colors4_round(pDst, color0, color1);
else
return get_block_colors3_round(pDst, color0, color1);
}
color_quad_u8 dxt1_block::unpack_endpoint(uint32 endpoints, uint index, bool scaled, uint alpha)
{
color_quad_u8 dxt1_block::unpack_endpoint(uint32 endpoints, uint index, bool scaled, uint alpha) {
CRNLIB_ASSERT(index < 2);
return unpack_color(static_cast<uint16>((endpoints >> (index * 16U)) & 0xFFFFU), scaled, alpha);
}
uint dxt1_block::pack_endpoints(uint lo, uint hi)
{
uint dxt1_block::pack_endpoints(uint lo, uint hi) {
CRNLIB_ASSERT((lo <= 0xFFFFU) && (hi <= 0xFFFFU));
return lo | (hi << 16U);
}
void dxt3_block::set_alpha(uint x, uint y, uint value, bool scaled)
{
void dxt3_block::set_alpha(uint x, uint y, uint value, bool scaled) {
CRNLIB_ASSERT((x < cDXTBlockSize) && (y < cDXTBlockSize));
if (scaled)
{
if (scaled) {
CRNLIB_ASSERT(value <= 0xFF);
value = (value * 15U + 128U) / 255U;
}
else
{
} else {
CRNLIB_ASSERT(value <= 0xF);
}
@@ -287,8 +288,7 @@ namespace crnlib
m_alpha[ofs] = static_cast<uint8>(c);
}
uint dxt3_block::get_alpha(uint x, uint y, bool scaled) const
{
uint dxt3_block::get_alpha(uint x, uint y, bool scaled) const {
CRNLIB_ASSERT((x < cDXTBlockSize) && (y < cDXTBlockSize));
uint value = m_alpha[(y << 1U) + (x >> 1U)];
@@ -302,8 +302,7 @@ namespace crnlib
return value;
}
uint dxt5_block::get_block_values6(color_quad_u8* pDst, uint l, uint h)
{
uint dxt5_block::get_block_values6(color_quad_u8* pDst, uint l, uint h) {
pDst[0].a = static_cast<uint8>(l);
pDst[1].a = static_cast<uint8>(h);
pDst[2].a = static_cast<uint8>((l * 4 + h) / 5);
@@ -315,8 +314,7 @@ namespace crnlib
return 6;
}
uint dxt5_block::get_block_values8(color_quad_u8* pDst, uint l, uint h)
{
uint dxt5_block::get_block_values8(color_quad_u8* pDst, uint l, uint h) {
pDst[0].a = static_cast<uint8>(l);
pDst[1].a = static_cast<uint8>(h);
pDst[2].a = static_cast<uint8>((l * 6 + h) / 7);
@@ -328,16 +326,14 @@ namespace crnlib
return 8;
}
uint dxt5_block::get_block_values(color_quad_u8* pDst, uint l, uint h)
{
uint dxt5_block::get_block_values(color_quad_u8* pDst, uint l, uint h) {
if (l > h)
return get_block_values8(pDst, l, h);
else
return get_block_values6(pDst, l, h);
}
uint dxt5_block::get_block_values6(uint* pDst, uint l, uint h)
{
uint dxt5_block::get_block_values6(uint* pDst, uint l, uint h) {
pDst[0] = l;
pDst[1] = h;
pDst[2] = (l * 4 + h) / 5;
@@ -349,8 +345,7 @@ namespace crnlib
return 6;
}
uint dxt5_block::get_block_values8(uint* pDst, uint l, uint h)
{
uint dxt5_block::get_block_values8(uint* pDst, uint l, uint h) {
pDst[0] = l;
pDst[1] = h;
pDst[2] = (l * 6 + h) / 7;
@@ -362,20 +357,17 @@ namespace crnlib
return 8;
}
uint dxt5_block::unpack_endpoint(uint packed, uint index)
{
uint dxt5_block::unpack_endpoint(uint packed, uint index) {
CRNLIB_ASSERT(index < 2);
return (packed >> (8 * index)) & 0xFF;
}
uint dxt5_block::pack_endpoints(uint lo, uint hi)
{
uint dxt5_block::pack_endpoints(uint lo, uint hi) {
CRNLIB_ASSERT((lo <= 0xFF) && (hi <= 0xFF));
return lo | (hi << 8U);
}
uint dxt5_block::get_block_values(uint* pDst, uint l, uint h)
{
uint dxt5_block::get_block_values(uint* pDst, uint l, uint h) {
if (l > h)
return get_block_values8(pDst, l, h);
else
@@ -383,4 +375,3 @@ namespace crnlib
}
} // namespace crnlib
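Note on the 5:6:5 color packing in this file: `dxt1_block::pack_color` scales 8-bit channels with `(c * 31 + bias) / 255` (63 for green) and packs them as `b | (g << 5) | (r << 11)`, while `unpack_color` expands them again by bit replication. The sketch below restates only the scaled path in standalone form. The `rgb8` struct, the function names, the clamp, and the default bias of 127 are assumptions for illustration; the real default and the lines between the visible hunks are not shown in this diff.

```cpp
#include <algorithm>
#include <cstdint>

struct rgb8 {
  uint8_t r, g, b;
};

// Quantizes 8-bit channels to 5:6:5 and packs them into 16 bits,
// red in the top bits, blue in the low bits.
inline uint16_t pack_565(rgb8 c, uint32_t bias = 127) {  // bias default is assumed
  const uint32_t r = std::min<uint32_t>(31u, (c.r * 31u + bias) / 255u);
  const uint32_t g = std::min<uint32_t>(63u, (c.g * 63u + bias) / 255u);
  const uint32_t b = std::min<uint32_t>(31u, (c.b * 31u + bias) / 255u);
  return static_cast<uint16_t>(b | (g << 5u) | (r << 11u));
}

// Expands a packed 5:6:5 color back to 8 bits per channel by
// replicating the top bits into the low bits, as unpack_color does.
inline rgb8 unpack_565(uint16_t packed) {
  uint32_t b = packed & 31u;
  uint32_t g = (packed >> 5u) & 63u;
  uint32_t r = (packed >> 11u) & 31u;
  b = (b << 3u) | (b >> 2u);
  g = (g << 2u) | (g >> 4u);
  r = (r << 3u) | (r >> 2u);
  return rgb8{static_cast<uint8_t>(r), static_cast<uint8_t>(g), static_cast<uint8_t>(b)};
}
```

Bit replication maps the quantized extremes back to exactly 0 and 255, so pure black, white and primary colors survive a pack/unpack round trip; intermediate values generally do not.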
+49 -85
@@ -11,10 +11,8 @@
#define CRNLIB_DXT_ALT_ROUNDING 1
namespace crnlib
{
enum dxt_constants
{
namespace crnlib {
enum dxt_constants {
cDXT1BytesPerBlock = 8U,
cDXT5NBytesPerBlock = 16U,
@@ -30,8 +28,7 @@ namespace crnlib
cDXTBlockSize = 1U << cDXTBlockShift
};
enum dxt_format
{
enum dxt_format {
cDXTInvalid = -1,
// cDXT1/1A must appear first!
@@ -45,7 +42,11 @@ namespace crnlib
cDXN_XY, // inverted relative to standard ATI2, 360's DXN
cDXN_YX, // standard ATI2,
cETC1 // Ericsson texture compression (color only, 4x4 blocks, 4bpp, 64-bits/block)
cETC1,
cETC2,
cETC2A,
cETC1S,
cETC2AS,
};
const float cDXT1MaxLinearValue = 3.0f;
@@ -79,38 +80,32 @@ namespace crnlib
const char* get_dxt_compressor_name(crn_dxt_compressor_type c);
struct dxt1_block
{
struct dxt1_block {
uint8 m_low_color[2];
uint8 m_high_color[2];
enum { cNumSelectorBytes = 4 };
uint8 m_selectors[cNumSelectorBytes];
inline void clear()
{
inline void clear() {
utils::zero_this(this);
}
// These methods assume the in-memory rep is in LE byte order.
inline uint get_low_color() const
{
inline uint get_low_color() const {
return m_low_color[0] | (m_low_color[1] << 8U);
}
inline uint get_high_color() const
{
inline uint get_high_color() const {
return m_high_color[0] | (m_high_color[1] << 8U);
}
inline void set_low_color(uint16 c)
{
inline void set_low_color(uint16 c) {
m_low_color[0] = static_cast<uint8>(c & 0xFF);
m_low_color[1] = static_cast<uint8>((c >> 8) & 0xFF);
}
inline void set_high_color(uint16 c)
{
inline void set_high_color(uint16 c) {
m_high_color[0] = static_cast<uint8>(c & 0xFF);
m_high_color[1] = static_cast<uint8>((c >> 8) & 0xFF);
}
@@ -119,26 +114,21 @@ namespace crnlib
inline bool is_alpha_block() const { return get_low_color() <= get_high_color(); }
inline bool is_non_alpha_block() const { return !is_alpha_block(); }
inline uint get_selector(uint x, uint y) const
{
inline uint get_selector(uint x, uint y) const {
CRNLIB_ASSERT((x < 4U) && (y < 4U));
return (m_selectors[y] >> (x * cDXT1SelectorBits)) & cDXT1SelectorMask;
}
inline void set_selector(uint x, uint y, uint val)
{
inline void set_selector(uint x, uint y, uint val) {
CRNLIB_ASSERT((x < 4U) && (y < 4U) && (val < 4U));
m_selectors[y] &= (~(cDXT1SelectorMask << (x * cDXT1SelectorBits)));
m_selectors[y] |= (val << (x * cDXT1SelectorBits));
}
inline void flip_x(uint w = 4, uint h = 4)
{
for (uint x = 0; x < (w / 2); x++)
{
for (uint y = 0; y < h; y++)
{
inline void flip_x(uint w = 4, uint h = 4) {
for (uint x = 0; x < (w / 2); x++) {
for (uint y = 0; y < h; y++) {
const uint c = get_selector(x, y);
set_selector(x, y, get_selector((w - 1) - x, y));
set_selector((w - 1) - x, y, c);
@@ -146,12 +136,9 @@ namespace crnlib
}
}
inline void flip_y(uint w = 4, uint h = 4)
{
for (uint y = 0; y < (h / 2); y++)
{
for (uint x = 0; x < w; x++)
{
inline void flip_y(uint w = 4, uint h = 4) {
for (uint y = 0; y < (h / 2); y++) {
for (uint x = 0; x < w; x++) {
const uint c = get_selector(x, y);
set_selector(x, y, get_selector(x, (h - 1) - y));
set_selector(x, (h - 1) - y, c);
@@ -184,20 +171,16 @@ namespace crnlib
CRNLIB_DEFINE_BITWISE_COPYABLE(dxt1_block);
struct dxt3_block
{
struct dxt3_block {
enum { cNumAlphaBytes = 8 };
uint8 m_alpha[cNumAlphaBytes];
void set_alpha(uint x, uint y, uint value, bool scaled);
uint get_alpha(uint x, uint y, bool scaled) const;
inline void flip_x(uint w = 4, uint h = 4)
{
for (uint x = 0; x < (w / 2); x++)
{
for (uint y = 0; y < h; y++)
{
inline void flip_x(uint w = 4, uint h = 4) {
for (uint x = 0; x < (w / 2); x++) {
for (uint y = 0; y < h; y++) {
const uint c = get_alpha(x, y, false);
set_alpha(x, y, get_alpha((w - 1) - x, y, false), false);
set_alpha((w - 1) - x, y, c, false);
@@ -205,12 +188,9 @@ namespace crnlib
}
}
inline void flip_y(uint w = 4, uint h = 4)
{
for (uint y = 0; y < (h / 2); y++)
{
for (uint x = 0; x < w; x++)
{
inline void flip_y(uint w = 4, uint h = 4) {
for (uint y = 0; y < (h / 2); y++) {
for (uint x = 0; x < w; x++) {
const uint c = get_alpha(x, y, false);
set_alpha(x, y, get_alpha(x, (h - 1) - y, false), false);
set_alpha(x, (h - 1) - y, c, false);
@@ -221,36 +201,30 @@ namespace crnlib
CRNLIB_DEFINE_BITWISE_COPYABLE(dxt3_block);
struct dxt5_block
{
struct dxt5_block {
uint8 m_endpoints[2];
enum { cNumSelectorBytes = 6 };
uint8 m_selectors[cNumSelectorBytes];
inline void clear()
{
inline void clear() {
utils::zero_this(this);
}
inline uint get_low_alpha() const
{
inline uint get_low_alpha() const {
return m_endpoints[0];
}
inline uint get_high_alpha() const
{
inline uint get_high_alpha() const {
return m_endpoints[1];
}
inline void set_low_alpha(uint i)
{
inline void set_low_alpha(uint i) {
CRNLIB_ASSERT(i <= cUINT8_MAX);
m_endpoints[0] = static_cast<uint8>(i);
}
inline void set_high_alpha(uint i)
{
inline void set_high_alpha(uint i) {
CRNLIB_ASSERT(i <= cUINT8_MAX);
m_endpoints[1] = static_cast<uint8>(i);
}
@@ -258,10 +232,12 @@ namespace crnlib
inline bool is_alpha6_block() const { return get_low_alpha() <= get_high_alpha(); }
uint get_endpoints_as_word() const { return m_endpoints[0] | (m_endpoints[1] << 8); }
uint get_selectors_as_word(uint index) { CRNLIB_ASSERT(index < 3); return m_selectors[index * 2] | (m_selectors[index * 2 + 1] << 8); }
uint get_selectors_as_word(uint index) {
CRNLIB_ASSERT(index < 3);
return m_selectors[index * 2] | (m_selectors[index * 2 + 1] << 8);
}
inline uint get_selector(uint x, uint y) const
{
inline uint get_selector(uint x, uint y) const {
CRNLIB_ASSERT((x < 4U) && (y < 4U));
uint selector_index = (y * 4) + x;
@@ -277,8 +253,7 @@ namespace crnlib
return (v >> bit_ofs) & 7;
}
inline void set_selector(uint x, uint y, uint val)
{
inline void set_selector(uint x, uint y, uint val) {
CRNLIB_ASSERT((x < 4U) && (y < 4U) && (val < 8U));
uint selector_index = (y * 4) + x;
@@ -299,12 +274,9 @@ namespace crnlib
m_selectors[byte_index + 1] = static_cast<uint8>(v >> 8);
}
inline void flip_x(uint w = 4, uint h = 4)
{
for (uint x = 0; x < (w / 2); x++)
{
for (uint y = 0; y < h; y++)
{
inline void flip_x(uint w = 4, uint h = 4) {
for (uint x = 0; x < (w / 2); x++) {
for (uint y = 0; y < h; y++) {
const uint c = get_selector(x, y);
set_selector(x, y, get_selector((w - 1) - x, y));
set_selector((w - 1) - x, y, c);
@@ -312,12 +284,9 @@ namespace crnlib
}
}
inline void flip_y(uint w = 4, uint h = 4)
{
for (uint y = 0; y < (h / 2); y++)
{
for (uint x = 0; x < w; x++)
{
inline void flip_y(uint w = 4, uint h = 4) {
for (uint y = 0; y < (h / 2); y++) {
for (uint x = 0; x < w; x++) {
const uint c = get_selector(x, y);
set_selector(x, y, get_selector(x, (h - 1) - y));
set_selector(x, (h - 1) - y, c);
@@ -343,12 +312,10 @@ namespace crnlib
CRNLIB_DEFINE_BITWISE_COPYABLE(dxt5_block);
struct dxt_pixel_block
{
struct dxt_pixel_block {
color_quad_u8 m_pixels[cDXTBlockSize][cDXTBlockSize]; // [y][x]
inline void clear()
{
inline void clear() {
utils::zero_object(*this);
}
};
@@ -356,6 +323,3 @@ namespace crnlib
CRNLIB_DEFINE_BITWISE_COPYABLE(dxt_pixel_block);
} // namespace crnlib
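Note on the selector accessors in dxt1_block above: a minimal sketch of the bit layout that get_selector, set_selector and flip_x rely on. It assumes 2-bit selectors (that is, cDXT1SelectorBits == 2 and cDXT1SelectorMask == 3, constants that fall outside the hunks shown here). SelectorRows is a hypothetical stand-in for illustration, not a crnlib type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for dxt1_block's selector storage: four bytes, one per
// row, each holding four 2-bit selectors with x = 0 in the lowest bits.
struct SelectorRows {
  uint8_t rows[4] = {0, 0, 0, 0};

  unsigned get(unsigned x, unsigned y) const {
    assert(x < 4 && y < 4);
    return (rows[y] >> (x * 2)) & 3u;
  }

  void set(unsigned x, unsigned y, unsigned val) {
    assert(x < 4 && y < 4 && val < 4);
    rows[y] = static_cast<uint8_t>(rows[y] & ~(3u << (x * 2)));  // clear the 2-bit slot
    rows[y] = static_cast<uint8_t>(rows[y] | (val << (x * 2)));  // write the new value
  }

  // Mirror the block horizontally by swapping columns x and 3 - x.
  void flip_x() {
    for (unsigned x = 0; x < 2; x++) {
      for (unsigned y = 0; y < 4; y++) {
        const unsigned c = get(x, y);
        set(x, y, get(3 - x, y));
        set(3 - x, y, c);
      }
    }
  }
};
```

Because each row occupies exactly one byte, flip_y in the same scheme reduces to swapping whole bytes, while flip_x has to move individual 2-bit fields.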
+625 -978
File diff suppressed because it is too large
+60 -136
@@ -3,22 +3,20 @@
#pragma once
#include "crn_dxt.h"
namespace crnlib
{
struct dxt1_solution_coordinates
{
inline dxt1_solution_coordinates() : m_low_color(0), m_high_color(0){ }
namespace crnlib {
struct dxt1_solution_coordinates {
inline dxt1_solution_coordinates()
: m_low_color(0), m_high_color(0) {}
inline dxt1_solution_coordinates(uint16 l, uint16 h) : m_low_color(l), m_high_color(h) { }
inline dxt1_solution_coordinates(uint16 l, uint16 h)
: m_low_color(l), m_high_color(h) {}
inline dxt1_solution_coordinates(const color_quad_u8& l, const color_quad_u8& h, bool scaled = true) :
m_low_color(dxt1_block::pack_color(l, scaled)),
m_high_color(dxt1_block::pack_color(h, scaled))
{
inline dxt1_solution_coordinates(const color_quad_u8& l, const color_quad_u8& h, bool scaled = true)
: m_low_color(dxt1_block::pack_color(l, scaled)),
m_high_color(dxt1_block::pack_color(h, scaled)) {
}
inline dxt1_solution_coordinates(vec3F nl, vec3F nh)
{
inline dxt1_solution_coordinates(vec3F nl, vec3F nh) {
#if CRNLIB_DXT_ALT_ROUNDING
// Umm, wtf?
nl.clamp(0.0f, .999f);
@@ -38,14 +36,12 @@ namespace crnlib
uint16 m_low_color;
uint16 m_high_color;
inline void clear()
{
inline void clear() {
m_low_color = 0;
m_high_color = 0;
}
inline dxt1_solution_coordinates& canonicalize()
{
inline dxt1_solution_coordinates& canonicalize() {
if (m_low_color < m_high_color)
utils::swap(m_low_color, m_high_color);
return *this;
@@ -53,8 +49,7 @@ namespace crnlib
inline operator size_t() const { return fast_hash(this, sizeof(*this)); }
inline bool operator== (const dxt1_solution_coordinates& other) const
{
inline bool operator==(const dxt1_solution_coordinates& other) const {
uint16 l0 = math::minimum(m_low_color, m_high_color);
uint16 h0 = math::maximum(m_low_color, m_high_color);
@@ -64,13 +59,11 @@ namespace crnlib
return (l0 == l1) && (h0 == h1);
}
inline bool operator!= (const dxt1_solution_coordinates& other) const
{
inline bool operator!=(const dxt1_solution_coordinates& other) const {
return !(*this == other);
}
inline bool operator< (const dxt1_solution_coordinates& other) const
{
inline bool operator<(const dxt1_solution_coordinates& other) const {
uint16 l0 = math::minimum(m_low_color, m_high_color);
uint16 h0 = math::maximum(m_low_color, m_high_color);
@@ -79,8 +72,7 @@ namespace crnlib
if (l0 < l1)
return true;
else if (l0 == l1)
{
else if (l0 == l1) {
if (h0 < h1)
return true;
}
@@ -93,36 +85,32 @@ namespace crnlib
CRNLIB_DEFINE_BITWISE_COPYABLE(dxt1_solution_coordinates);
struct unique_color
{
struct unique_color {
inline unique_color() {}
inline unique_color(const color_quad_u8& color, uint weight) : m_color(color), m_weight(weight) { }
inline unique_color(const color_quad_u8& color, uint weight)
: m_color(color), m_weight(weight) {}
color_quad_u8 m_color;
uint m_weight;
inline bool operator< (const unique_color& c) const
{
inline bool operator<(const unique_color& c) const {
return *reinterpret_cast<const uint32*>(&m_color) < *reinterpret_cast<const uint32*>(&c.m_color);
}
inline bool operator== (const unique_color& c) const
{
inline bool operator==(const unique_color& c) const {
return *reinterpret_cast<const uint32*>(&m_color) == *reinterpret_cast<const uint32*>(&c.m_color);
}
};
CRNLIB_DEFINE_BITWISE_COPYABLE(unique_color);
class dxt1_endpoint_optimizer
{
class dxt1_endpoint_optimizer {
public:
dxt1_endpoint_optimizer();
struct params
{
params() :
m_block_index(0),
struct params {
params()
: m_block_index(0),
m_pPixels(NULL),
m_num_pixels(0),
m_dxt1a_alpha_threshold(128U),
@@ -133,11 +121,7 @@ namespace crnlib
m_grayscale_sampling(false),
m_endpoint_caching(true),
m_use_transparent_indices_for_black(false),
m_force_alpha_blocks(false)
{
m_color_weights[0] = 1;
m_color_weights[1] = 1;
m_color_weights[2] = 1;
m_force_alpha_blocks(false) {
}
uint m_block_index;
@@ -155,12 +139,11 @@ namespace crnlib
bool m_endpoint_caching;
bool m_use_transparent_indices_for_black;
bool m_force_alpha_blocks;
int m_color_weights[3];
};
struct results
{
inline results() : m_pSelectors(NULL) { }
struct results {
inline results()
: m_pSelectors(NULL) {}
uint64 m_error;
@@ -169,54 +152,20 @@ namespace crnlib
uint8* m_pSelectors;
bool m_alpha_block;
bool m_reordered;
bool m_alternate_rounding;
bool m_enforce_selector;
uint8 m_enforced_selector;
};
struct solution
{
solution() { }
solution(const solution& other)
{
m_results = other.m_results;
m_selectors = other.m_selectors;
m_results.m_pSelectors = m_selectors.begin();
}
solution& operator= (const solution& rhs)
{
if (this == &rhs)
return *this;
m_results = rhs.m_results;
m_selectors = rhs.m_selectors;
m_results.m_pSelectors = m_selectors.begin();
return *this;
}
results m_results;
crnlib::vector<uint8> m_selectors;
inline bool operator< (const solution& other) const
{
return m_results.m_error < other.m_results.m_error;
}
static inline bool coords_equal(const solution& lhs, const solution& rhs)
{
return (lhs.m_results.m_low_color == rhs.m_results.m_low_color) && (lhs.m_results.m_high_color == rhs.m_results.m_high_color);
}
};
typedef crnlib::vector<solution> solution_vec;
bool compute(const params& p, results& r, solution_vec* pSolutions = NULL);
bool compute(const params& p, results& r);
private:
const params* m_pParams;
results* m_pResults;
solution_vec* m_pSolutions;
bool m_perceptual;
bool m_has_color_weighting;
bool m_evaluate_hc;
typedef crnlib::vector<unique_color> unique_color_vec;
@@ -225,8 +174,13 @@ namespace crnlib
unique_color_hash_map m_unique_color_hash_map;
unique_color_vec m_unique_colors; // excludes transparent colors!
unique_color_vec m_evaluated_colors;
unique_color_vec m_temp_unique_colors;
struct {
uint64 low, high;
} m_rDist[32], m_gDist[64], m_bDist[32];
uint m_total_unique_color_weight;
bool m_has_transparent_pixels;
@@ -239,8 +193,6 @@ namespace crnlib
vec3F m_principle_axis;
bool m_all_pixels_grayscale;
crnlib::vector<uint16> m_unique_packed_colors;
crnlib::vector<uint8> m_trial_selectors;
@@ -254,31 +206,27 @@ namespace crnlib
crnlib::vector<vec3I> m_lo_cells;
crnlib::vector<vec3I> m_hi_cells;
uint m_total_evals;
struct potential_solution
{
potential_solution() : m_coords(), m_error(cUINT64_MAX), m_alpha_block(false), m_valid(false)
{
struct potential_solution {
potential_solution()
: m_coords(), m_error(cUINT64_MAX), m_alpha_block(false) {
}
dxt1_solution_coordinates m_coords;
crnlib::vector<uint8> m_selectors;
uint64 m_error;
bool m_alpha_block;
bool m_valid;
bool m_alternate_rounding;
bool m_enforce_selector;
uint8 m_enforced_selector;
void clear()
{
void clear() {
m_coords.clear();
m_selectors.resize(0);
m_error = cUINT64_MAX;
m_alpha_block = false;
m_valid = false;
}
bool are_selectors_all_equal() const
{
bool are_selectors_all_equal() const {
if (m_selectors.empty())
return false;
const uint s = m_selectors[0];
@@ -297,56 +245,32 @@ namespace crnlib
bool refine_solution(int refinement_level = 0);
bool evaluate_solution(
const dxt1_solution_coordinates& coords,
bool early_out,
potential_solution* pBest_solution,
bool alternate_rounding = false);
bool evaluate_solution(const dxt1_solution_coordinates& coords, bool alternate_rounding = false);
bool evaluate_solution_uber(const dxt1_solution_coordinates& coords, bool alternate_rounding);
bool evaluate_solution_fast(const dxt1_solution_coordinates& coords, bool alternate_rounding);
bool evaluate_solution_hc_perceptual(const dxt1_solution_coordinates& coords, bool alternate_rounding);
bool evaluate_solution_hc_uniform(const dxt1_solution_coordinates& coords, bool alternate_rounding);
void compute_selectors();
void compute_selectors_hc();
bool evaluate_solution_uber(
potential_solution& solution,
const dxt1_solution_coordinates& coords,
bool early_out,
potential_solution* pBest_solution,
bool alternate_rounding = false);
bool evaluate_solution_fast(
potential_solution& solution,
const dxt1_solution_coordinates& coords,
bool early_out,
potential_solution* pBest_solution,
bool alternate_rounding = false);
void clear();
void find_unique_colors();
bool handle_all_transparent_block();
bool handle_solid_block();
bool handle_multicolor_block();
bool handle_grayscale_block();
void handle_multicolor_block();
void compute_pca(vec3F& axis, const vec3F_array& norm_colors, const vec3F& def);
void compute_vectors(const vec3F& perceptual_weights);
void return_solution(results& results, const potential_solution& solution);
void return_solution();
void try_combinatorial_encoding();
void compute_endpoint_component_errors(uint comp_index, uint64 (&error)[4][256], uint64 (&best_remaining_error)[4]);
void optimize_endpoint_comps();
bool optimize_endpoints(vec3F& low_color, vec3F& high_color);
void optimize_endpoints(vec3F& low_color, vec3F& high_color);
bool try_alpha_as_black_optimization();
bool try_average_block_as_solid();
bool try_median4(const vec3F& low_color, const vec3F& high_color);
bool compute_internal(const params& p, results& r, solution_vec* pSolutions);
void compute_internal(const params& p, results& r);
unique_color lerp_color(const color_quad_u8& a, const color_quad_u8& b, float f, int rounding = 1);
inline uint color_distance(bool perceptual, const color_quad_u8& e1, const color_quad_u8& e2, bool alpha);
static inline vec3F unpack_to_vec3F_raw(uint16 packed_color);
static inline vec3F unpack_to_vec3F(uint16 packed_color);
};
inline void swap(dxt1_endpoint_optimizer::solution& a, dxt1_endpoint_optimizer::solution& b)
{
std::swap(a.m_results, b.m_results);
a.m_selectors.swap(b.m_selectors);
}
} // namespace crnlib
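Note on dxt1_solution_coordinates above: its operator== and operator< compare the sorted (min, max) form of the two endpoints, so a pair and its swapped twin count as the same solution. A minimal sketch of that idea follows. EndpointPair, same_pair and pair_less are hypothetical names used only here, and the ordering is assumed to be plain lexicographic, since part of the original operator< lies outside the hunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical reduced form of dxt1_solution_coordinates: two packed RGB565 endpoints.
struct EndpointPair {
  uint16_t low;
  uint16_t high;
};

// Equal when both pairs hold the same two values, in either order.
inline bool same_pair(const EndpointPair& a, const EndpointPair& b) {
  return std::min(a.low, a.high) == std::min(b.low, b.high) &&
         std::max(a.low, a.high) == std::max(b.low, b.high);
}

// Lexicographic order on the sorted (min, max) form, so that sorting followed
// by duplicate removal also drops swapped duplicates.
inline bool pair_less(const EndpointPair& a, const EndpointPair& b) {
  const uint16_t l0 = std::min(a.low, a.high), h0 = std::max(a.low, a.high);
  const uint16_t l1 = std::min(b.low, b.high), h1 = std::max(b.low, b.high);
  if (l0 != l1)
    return l0 < l1;
  return h0 < h1;
}
```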
+28 -48
@@ -6,18 +6,15 @@
#include "crn_dxt_fast.h"
#include "crn_intersect.h"
namespace crnlib
{
dxt5_endpoint_optimizer::dxt5_endpoint_optimizer() :
m_pParams(NULL),
m_pResults(NULL)
{
namespace crnlib {
dxt5_endpoint_optimizer::dxt5_endpoint_optimizer()
: m_pParams(NULL),
m_pResults(NULL) {
m_unique_values.reserve(16);
m_unique_value_weights.reserve(16);
}
bool dxt5_endpoint_optimizer::compute(const params& p, results& r)
{
bool dxt5_endpoint_optimizer::compute(const params& p, results& r) {
m_pParams = &p;
m_pResults = &r;
@@ -30,14 +27,12 @@ namespace crnlib
for (uint i = 0; i < 256; i++)
m_unique_value_map[i] = -1;
for (uint i = 0; i < p.m_num_pixels; i++)
{
for (uint i = 0; i < p.m_num_pixels; i++) {
uint alpha = p.m_pPixels[i][p.m_comp_index];
int index = m_unique_value_map[alpha];
if (index == -1)
{
if (index == -1) {
index = m_unique_values.size();
m_unique_value_map[alpha] = index;
@@ -49,9 +44,9 @@ namespace crnlib
m_unique_value_weights[index]++;
}
if (m_unique_values.size() == 1)
{
if (m_unique_values.size() == 1) {
r.m_block_type = 0;
r.m_reordered = false;
r.m_error = 0;
r.m_first_endpoint = m_unique_values[0];
r.m_second_endpoint = m_unique_values[0];
@@ -64,27 +59,23 @@ namespace crnlib
r.m_error = cUINT64_MAX;
for (uint i = 0; i < m_unique_values.size() - 1; i++)
{
for (uint i = 0; i < m_unique_values.size() - 1; i++) {
const uint low_endpoint = m_unique_values[i];
for (uint j = i + 1; j < m_unique_values.size(); j++)
{
for (uint j = i + 1; j < m_unique_values.size(); j++) {
const uint high_endpoint = m_unique_values[j];
evaluate_solution(low_endpoint, high_endpoint);
}
}
if ((m_pParams->m_quality >= cCRNDXTQualityBetter) && (m_pResults->m_error))
{
if ((m_pParams->m_quality >= cCRNDXTQualityBetter) && (m_pResults->m_error)) {
m_flags.resize(256 * 256);
m_flags.clear_all_bits();
const int cProbeAmount = (m_pParams->m_quality == cCRNDXTQualityUber) ? 16 : 8;
for (int l_delta = -cProbeAmount; l_delta <= cProbeAmount; l_delta++)
{
for (int l_delta = -cProbeAmount; l_delta <= cProbeAmount; l_delta++) {
const int l = m_pResults->m_first_endpoint + l_delta;
if (l < 0)
continue;
@@ -93,8 +84,7 @@ namespace crnlib
const uint bit_index = l * 256;
for (int h_delta = -cProbeAmount; h_delta <= cProbeAmount; h_delta++)
{
for (int h_delta = -cProbeAmount; h_delta <= cProbeAmount; h_delta++) {
const int h = m_pResults->m_second_endpoint + h_delta;
if (h < 0)
continue;
@@ -112,34 +102,30 @@ namespace crnlib
}
}
if (m_pResults->m_first_endpoint == m_pResults->m_second_endpoint)
{
m_pResults->m_reordered = false;
if (m_pResults->m_first_endpoint == m_pResults->m_second_endpoint) {
for (uint i = 0; i < m_best_selectors.size(); i++)
m_best_selectors[i] = 0;
}
else if (m_pResults->m_block_type)
{
} else if (m_pResults->m_block_type) {
//if (l > h)
// eight alpha
// else
// six alpha
if (m_pResults->m_first_endpoint > m_pResults->m_second_endpoint)
{
if (m_pResults->m_first_endpoint > m_pResults->m_second_endpoint) {
utils::swap(m_pResults->m_first_endpoint, m_pResults->m_second_endpoint);
m_pResults->m_reordered = true;
for (uint i = 0; i < m_best_selectors.size(); i++)
m_best_selectors[i] = g_six_alpha_invert_table[m_best_selectors[i]];
}
}
else if (!(m_pResults->m_first_endpoint > m_pResults->m_second_endpoint))
{
} else if (!(m_pResults->m_first_endpoint > m_pResults->m_second_endpoint)) {
utils::swap(m_pResults->m_first_endpoint, m_pResults->m_second_endpoint);
m_pResults->m_reordered = true;
for (uint i = 0; i < m_best_selectors.size(); i++)
m_best_selectors[i] = g_eight_alpha_invert_table[m_best_selectors[i]];
}
for (uint i = 0; i < m_pParams->m_num_pixels; i++)
{
for (uint i = 0; i < m_pParams->m_num_pixels; i++) {
uint alpha = m_pParams->m_pPixels[i][m_pParams->m_comp_index];
int index = m_unique_value_map[alpha];
@@ -150,10 +136,8 @@ namespace crnlib
return true;
}
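Note on dxt5_endpoint_optimizer::compute above: the first loop collapses the block's alpha samples into a list of unique values with per-value weights, using a 256-entry lookup table initialised to -1. A minimal sketch of that step is below. UniqueValues and collect_unique are hypothetical names, and the two push_back calls are an assumption about the lines hidden between the hunks.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result type: unique sample values and how often each occurred.
struct UniqueValues {
  std::vector<uint8_t> values;
  std::vector<unsigned> weights;
};

inline UniqueValues collect_unique(const uint8_t* samples, unsigned count) {
  assert(samples != nullptr || count == 0);

  int slot_of[256];  // maps a sample value to its index in `values`, or -1 if unseen
  for (int& s : slot_of)
    s = -1;

  UniqueValues out;
  out.values.reserve(16);  // a 4x4 block has at most 16 distinct values
  out.weights.reserve(16);

  for (unsigned i = 0; i < count; i++) {
    const uint8_t v = samples[i];
    int slot = slot_of[v];
    if (slot == -1) {  // first time this value appears
      slot = static_cast<int>(out.values.size());
      slot_of[v] = slot;
      out.values.push_back(v);
      out.weights.push_back(0);
    }
    out.weights[slot]++;
  }
  return out;
}
```

Working on weighted unique values means the exhaustive endpoint search that follows only has to score each distinct alpha once per candidate pair.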
void dxt5_endpoint_optimizer::evaluate_solution(uint low_endpoint, uint high_endpoint)
{
for (uint block_type = 0; block_type < (m_pParams->m_use_both_block_types ? 2U : 1U); block_type++)
{
void dxt5_endpoint_optimizer::evaluate_solution(uint low_endpoint, uint high_endpoint) {
for (uint block_type = 0; block_type < (m_pParams->m_use_both_block_types ? 2U : 1U); block_type++) {
uint selector_values[8];
if (!block_type)
@@ -163,21 +147,18 @@ namespace crnlib
uint64 trial_error = 0;
for (uint i = 0; i < m_unique_values.size(); i++)
{
for (uint i = 0; i < m_unique_values.size(); i++) {
const uint val = m_unique_values[i];
const uint weight = m_unique_value_weights[i];
uint best_selector_error = UINT_MAX;
uint best_selector = 0;
for (uint j = 0; j < 8; j++)
{
for (uint j = 0; j < 8; j++) {
int selector_error = val - selector_values[j];
selector_error = selector_error * selector_error * (int)weight;
if (static_cast<uint>(selector_error) < best_selector_error)
{
if (static_cast<uint>(selector_error) < best_selector_error) {
best_selector_error = selector_error;
best_selector = j;
if (!best_selector_error)
@@ -192,8 +173,7 @@ namespace crnlib
break;
}
if (trial_error < m_pResults->m_error)
{
if (trial_error < m_pResults->m_error) {
m_pResults->m_error = trial_error;
m_pResults->m_first_endpoint = static_cast<uint8>(low_endpoint);
m_pResults->m_second_endpoint = static_cast<uint8>(high_endpoint);
+8 -12
@@ -3,23 +3,19 @@
#pragma once
#include "crn_dxt.h"
namespace crnlib
{
class dxt5_endpoint_optimizer
{
namespace crnlib {
class dxt5_endpoint_optimizer {
public:
dxt5_endpoint_optimizer();
struct params
{
params() :
m_block_index(0),
struct params {
params()
: m_block_index(0),
m_pPixels(NULL),
m_num_pixels(0),
m_comp_index(3),
m_quality(cCRNDXTQualityUber),
m_use_both_block_types(true)
{
m_use_both_block_types(true) {
}
uint m_block_index;
@@ -33,8 +29,7 @@ namespace crnlib
bool m_use_both_block_types;
};
struct results
{
struct results {
uint8* m_pSelectors;
uint64 m_error;
@@ -43,6 +38,7 @@ namespace crnlib
uint8 m_second_endpoint;
uint8 m_block_type; // 1 if 6-alpha, otherwise 8-alpha
bool m_reordered;
};
bool compute(const params& p, results& r);
+101 -254
@@ -4,16 +4,13 @@
#include "crn_dxt_endpoint_refiner.h"
#include "crn_dxt1.h"
namespace crnlib
{
dxt_endpoint_refiner::dxt_endpoint_refiner() :
m_pParams(NULL),
m_pResults(NULL)
{
namespace crnlib {
dxt_endpoint_refiner::dxt_endpoint_refiner()
: m_pParams(NULL),
m_pResults(NULL) {
}
bool dxt_endpoint_refiner::refine(const params& p, results& r)
{
bool dxt_endpoint_refiner::refine(const params& p, results& r) {
if (!p.m_num_pixels)
return false;
@@ -34,8 +31,7 @@ namespace crnlib
vec<3, double> first_color(0.0f);
// This linear solver is from Squish.
for( uint i = 0; i < p.m_num_pixels; ++i )
{
for (uint i = 0; i < p.m_num_pixels; ++i) {
uint8 c = p.m_pSelectors[i];
double k;
@@ -66,26 +62,18 @@ namespace crnlib
// zero where non-determinate
vec<3, double> a, b;
if( beta2_sum == 0.0f )
{
if (beta2_sum == 0.0f) {
a = alphax_sum / alpha2_sum;
b.clear();
}
else if( alpha2_sum == 0.0f )
{
} else if (alpha2_sum == 0.0f) {
a.clear();
b = betax_sum / beta2_sum;
}
else
{
} else {
double factor = alpha2_sum * beta2_sum - alphabeta_sum * alphabeta_sum;
if (factor != 0.0f)
{
if (factor != 0.0f) {
a = (alphax_sum * beta2_sum - betax_sum * alphabeta_sum) / factor;
b = (betax_sum * alpha2_sum - alphax_sum * alphabeta_sum) / factor;
}
else
{
} else {
a = first_color;
b = first_color;
}
@@ -103,260 +91,119 @@ namespace crnlib
else
optimize_dxt5(l, h);
//if (r.m_low_color < r.m_high_color)
// utils::swap(r.m_low_color, r.m_high_color);
return r.m_error < p.m_error_to_beat;
}
void dxt_endpoint_refiner::optimize_dxt5(vec3F low_color, vec3F high_color)
{
float nl = low_color[0];
float nh = high_color[0];
void dxt_endpoint_refiner::optimize_dxt5(vec3F low_color, vec3F high_color) {
uint8 L0 = math::clamp<int>(low_color[0] * 256.0f, 0, 255);
uint8 H0 = math::clamp<int>(high_color[0] * 256.0f, 0, 255);
#if CRNLIB_DXT_ALT_ROUNDING
nl = math::clamp(nl, 0.0f, .999f);
nh = math::clamp(nh, 0.0f, .999f);
uint il = (int)floor(nl * 256.0f);
uint ih = (int)floor(nh * 256.0f);
#else
uint il = (int)floor(.5f + math::clamp(nl, 0.0f, 1.0f) * 255.0f);
uint ih = (int)floor(.5f + math::clamp(nh, 0.0f, 1.0f) * 255.0f);
#endif
uint64 hist[8] = {}, D2[8] = {}, DD[8] = {};
for (uint c = m_pParams->m_alpha_comp_index, i = 0; i < m_pParams->m_num_pixels; i++) {
uint8 a = m_pParams->m_pPixels[i][c];
uint8 s = m_pParams->m_pSelectors[i];
hist[s]++;
D2[s] += a * 2;
DD[s] += a * a;
}
crnlib::vector<uint> trial_solutions;
trial_solutions.reserve(256);
trial_solutions.push_back(il | (ih << 8));
sparse_bit_array flags;
flags.resize(256 * 256);
flags.set_bit((il * 256) + ih);
const int cProbeAmount = 11;
for (int l_delta = -cProbeAmount; l_delta <= cProbeAmount; l_delta++)
{
const int l = il + l_delta;
if (l < 0)
continue;
else if (l > 255)
break;
const uint bit_index = l * 256;
for (int h_delta = -cProbeAmount; h_delta <= cProbeAmount; h_delta++)
{
const int h = ih + h_delta;
if (h < 0)
continue;
else if (h > 255)
break;
if ((flags.get_bit(bit_index + h)) || (flags.get_bit(h * 256 + l)))
continue;
flags.set_bit(bit_index + h);
trial_solutions.push_back(l | (h << 8));
uint16 solutions[529];
uint solutions_count = 0;
solutions[solutions_count++] = L0 == H0 ? H0 ? H0 - 1 << 8 | L0 : 1 : L0 > H0 ? H0 << 8 | L0 : L0 << 8 | H0;
uint8 minL = L0 <= 11 ? 0 : L0 - 11, maxL = L0 >= 244 ? 255 : L0 + 11;
uint8 minH = H0 <= 11 ? 0 : H0 - 11, maxH = H0 >= 244 ? 255 : H0 + 11;
for (uint16 L = minL; L <= maxL; L++) {
for (uint16 H = minH; H <= maxH; H++) {
if ((maxH < L || L <= H || H < minL) && (L != L0 || H != H0) && (L != H0 || H != L0))
solutions[solutions_count++] = L == H ? H ? H - 1 << 8 | L : 1 : L > H ? H << 8 | L : L << 8 | H;
}
}
for (uint trial = 0; trial < trial_solutions.size(); trial++)
{
uint l = trial_solutions[trial] & 0xFF;
uint h = trial_solutions[trial] >> 8;
if (l == h)
{
if (h)
h--;
else
l++;
}
else if (l < h)
{
utils::swap(l, h);
}
CRNLIB_ASSERT(l > h);
uint values[cDXT5SelectorValues];
dxt5_block::get_block_values8(values, l, h);
uint total_error = 0;
for (uint j = 0; j < m_pParams->m_num_pixels; j++)
{
int p = m_pParams->m_pPixels[j][m_pParams->m_alpha_comp_index];
int c = values[m_pParams->m_pSelectors[j]];
int error = p - c;
error *= error;
total_error += error;
if (total_error > m_pResults->m_error)
break;
}
if (total_error < m_pResults->m_error)
{
m_pResults->m_error = total_error;
m_pResults->m_low_color = static_cast<uint16>(l);
m_pResults->m_high_color = static_cast<uint16>(h);
if (m_pResults->m_error == 0)
for (uint i = 0; i < solutions_count; i++) {
uint8 L = solutions[i] & 0xFF;
uint8 H = solutions[i] >> 8;
uint values[8];
dxt5_block::get_block_values8(values, L, H);
uint64 error = 0;
for (uint64 s = 0; s < 8; s++)
error += hist[s] * values[s] * values[s] - D2[s] * values[s] + DD[s];
if (error < m_pResults->m_error) {
m_pResults->m_low_color = L;
m_pResults->m_high_color = H;
m_pResults->m_error = error;
if (!m_pResults->m_error)
return;
}
}
}
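Note on the rewritten optimize_dxt5 above: the new code replaces the per-pixel error loop with per-selector sums. For the n pixels that share one selector with decoded value v, sum((a - v)^2) = n*v^2 - 2*v*sum(a) + sum(a^2), so hist[s], D2[s] (twice the sum of a) and DD[s] (the sum of a^2) are enough to score any candidate endpoint pair. The sketch below checks that identity against the direct per-pixel sum. SelectorStats, accumulate, error_from_stats and error_per_pixel are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Per-selector sums, mirroring the hist / D2 / DD arrays in the diff.
struct SelectorStats {
  uint64_t hist[8] = {};  // number of pixels using selector s
  uint64_t d2[8] = {};    // 2 * sum of pixel values using selector s
  uint64_t dd[8] = {};    // sum of squared pixel values using selector s
};

inline SelectorStats accumulate(const uint8_t* alpha, const uint8_t* selectors, unsigned n) {
  SelectorStats st;
  for (unsigned i = 0; i < n; i++) {
    const uint8_t s = selectors[i];
    assert(s < 8);
    const uint64_t a = alpha[i];
    st.hist[s]++;
    st.d2[s] += a * 2;
    st.dd[s] += a * a;
  }
  return st;
}

// Squared error of a candidate palette, computed from the sums only.
// Intermediate unsigned subtraction may wrap, but the per-selector total
// is non-negative, so the final uint64_t result is exact.
inline uint64_t error_from_stats(const SelectorStats& st, const unsigned (&values)[8]) {
  uint64_t error = 0;
  for (unsigned s = 0; s < 8; s++) {
    const uint64_t v = values[s];
    error += st.hist[s] * v * v - st.d2[s] * v + st.dd[s];
  }
  return error;
}

// Reference: the direct per-pixel sum of squared differences.
inline uint64_t error_per_pixel(const uint8_t* alpha, const uint8_t* selectors, unsigned n,
                                const unsigned (&values)[8]) {
  uint64_t error = 0;
  for (unsigned i = 0; i < n; i++) {
    const int64_t d = static_cast<int64_t>(alpha[i]) - static_cast<int64_t>(values[selectors[i]]);
    error += static_cast<uint64_t>(d * d);
  }
  return error;
}
```

The cost per candidate drops from one pass over the pixels to eight multiply-adds, at the price of losing the early-out that the old loop had.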
void dxt_endpoint_refiner::optimize_dxt1(vec3F low_color, vec3F high_color)
{
uint selector_hist[4];
utils::zero_object(selector_hist);
for (uint i = 0; i < m_pParams->m_num_pixels; i++)
selector_hist[m_pParams->m_pSelectors[i]]++;
void dxt_endpoint_refiner::optimize_dxt1(vec3F low_color, vec3F high_color) {
uint16 L0 = math::clamp<int>(low_color[0] * 32.0f, 0, 31) << 11 | math::clamp<int>(low_color[1] * 64.0f, 0, 63) << 5 | math::clamp<int>(low_color[2] * 32.0f, 0, 31);
uint16 H0 = math::clamp<int>(high_color[0] * 32.0f, 0, 31) << 11 | math::clamp<int>(high_color[1] * 64.0f, 0, 63) << 5 | math::clamp<int>(high_color[2] * 32.0f, 0, 31);
dxt1_solution_coordinates c(low_color, high_color);
for (uint pass = 0; pass < 8; pass++)
{
const uint64 initial_error = m_pResults->m_error;
dxt1_solution_coordinates_vec coords_to_try;
coords_to_try.resize(0);
color_quad_u8 lc(dxt1_block::unpack_color(c.m_low_color, false));
color_quad_u8 hc(dxt1_block::unpack_color(c.m_high_color, false));
for (int i = 0; i < 27; i++)
{
if (13 == i) continue;
const int ir = (i % 3) - 1;
const int ig = ((i / 3) % 3) - 1;
const int ib = ((i / 9) % 3) - 1;
int r = lc.r + ir;
int g = lc.g + ig;
int b = lc.b + ib;
if ((r < 0) || (r > 31)|| (g < 0) || (g > 63) || (b < 0) || (b > 31)) continue;
coords_to_try.push_back(
dxt1_solution_coordinates(dxt1_block::pack_color(r, g, b, false), c.m_high_color)
);
uint64 hist[4] = {}, D2[4][3] = {}, DD[4][3] = {};
for (uint i = 0; i < m_pParams->m_num_pixels; i++) {
const color_quad_u8& pixel = m_pParams->m_pPixels[i];
uint8 s = m_pParams->m_pSelectors[i];
hist[s]++;
for (uint c = 0; c < 3; c++) {
D2[s][c] += pixel[c] * 2;
DD[s][c] += pixel[c] * pixel[c];
}
for (int i = 0; i < 27; i++)
{
if (13 == i) continue;
const int ir = (i % 3) - 1;
const int ig = ((i / 3) % 3) - 1;
const int ib = ((i / 9) % 3) - 1;
int r = hc.r + ir;
int g = hc.g + ig;
int b = hc.b + ib;
if ((r < 0) || (r > 31)|| (g < 0) || (g > 63) || (b < 0) || (b > 31)) continue;
coords_to_try.push_back(dxt1_solution_coordinates(c.m_low_color, dxt1_block::pack_color(r, g, b, false)));
}
crnlib::vector<uint> solutions(54);
bool preserveL = hist[0] + hist[2] > hist[1] + hist[3];
bool improved = true;
std::sort(coords_to_try.begin(), coords_to_try.end());
dxt1_solution_coordinates_vec::const_iterator p_last = std::unique(coords_to_try.begin(), coords_to_try.end());
uint num_coords_to_try = (uint)(p_last - coords_to_try.begin());
for (uint i = 0; i < num_coords_to_try; i++)
{
for (uint iterations = 8; improved && iterations; iterations--) {
improved = false;
uint solutions_count = 0;
for (uint16 b0 = L0 & 31, g0 = L0 >> 5 & 63, r0 = L0 >> 11 & 31, b = b0 ? b0 - 1 : b0; b <= b0 + 1 && b <= 31; b++) {
for (uint16 g = g0 ? g0 - 1 : g0; g <= g0 + 1 && g <= 63; g++) {
for (uint16 r = r0 ? r0 - 1 : r0; r <= r0 + 1 && r <= 31; r++) {
uint16 L = r << 11 | g << 5 | b;
if (L != L0)
solutions[solutions_count++] = L > H0 ? L | H0 << 16 : H0 | L << 16;
}
}
}
for (uint16 b0 = H0 & 31, g0 = H0 >> 5 & 63, r0 = H0 >> 11 & 31, b = b0 ? b0 - 1 : b0; b <= b0 + 1 && b <= 31; b++) {
for (uint16 g = g0 ? g0 - 1 : g0; g <= g0 + 1 && g <= 63; g++) {
for (uint16 r = r0 ? r0 - 1 : r0; r <= r0 + 1 && r <= 31; r++) {
uint16 H = r << 11 | g << 5 | b;
if (H != H0)
solutions[solutions_count++] = H > L0 ? H | L0 << 16 : L0 | H << 16;
}
}
}
std::sort(solutions.begin(), solutions.begin() + solutions_count);
for (uint i = 0; i < solutions_count; i++) {
if (i && solutions[i] == solutions[i - 1])
continue;
uint16 L = solutions[i] & 0xFFFF;
uint16 H = solutions[i] >> 16;
if (L == H) {
L += !preserveL ? ~L & 0x1F ? 0x1 : ~L & 0xF800 ? 0x800 : ~L & 0x7E0 ? 0x20 : 0 : !L ? 0x1 : 0;
H -= preserveL ? H & 0x1F ? 0x1 : H & 0xF800 ? 0x800 : H & 0x7E0 ? 0x20 : 0 : H == 0xFFFF ? 0x1 : 0;
}
color_quad_u8 block_colors[4];
uint16 l = coords_to_try[i].m_low_color;
uint16 h = coords_to_try[i].m_high_color;
if (l < h)
utils::swap(l, h);
else if (l == h)
{
color_quad_u8 lc(dxt1_block::unpack_color(l, false));
color_quad_u8 hc(dxt1_block::unpack_color(h, false));
bool retry = false;
if ((selector_hist[0] + selector_hist[2]) > (selector_hist[1] + selector_hist[3]))
{
// l affects the output more than h, so muck with h
if (hc[2] != 0)
hc[2]--;
else if (hc[0] != 0)
hc[0]--;
else if (hc[1] != 0)
hc[1]--;
else
retry = true;
dxt1_block::get_block_colors4(block_colors, L, H);
uint64 error = 0;
for (uint64 s = 0, d[3]; s < 4; s++) {
for (uint c = 0; c < 3; c++)
d[c] = hist[s] * block_colors[s][c] * block_colors[s][c] - D2[s][c] * block_colors[s][c] + DD[s][c];
error += m_pParams->m_perceptual ? d[0] * 8 + d[1] * 25 + d[2] : d[0] + d[1] + d[2];
}
else
{
// h affects the output more than l, so muck with l
if (lc[2] != 31)
lc[2]++;
else if (lc[0] != 31)
lc[0]++;
else if (lc[1] != 63)
lc[1]++;
else
retry = true;
}
if (retry)
{
if (l == 0)
l++;
else
h--;
}
else
{
l = dxt1_block::pack_color(lc, false);
h = dxt1_block::pack_color(hc, false);
}
CRNLIB_ASSERT(l > h);
}
dxt1_block::get_block_colors4(block_colors, l, h);
uint total_error = 0;
for (uint j = 0; j < m_pParams->m_num_pixels; j++)
{
const color_quad_u8& c = block_colors[m_pParams->m_pSelectors[j]];
total_error += color::color_distance(m_pParams->m_perceptual, c, m_pParams->m_pPixels[j], false);
if (total_error > m_pResults->m_error)
break;
}
if (total_error < m_pResults->m_error)
{
m_pResults->m_error = total_error;
m_pResults->m_low_color = l;
m_pResults->m_high_color = h;
CRNLIB_ASSERT(l > h);
if (m_pResults->m_error == 0)
if (error < m_pResults->m_error) {
m_pResults->m_low_color = L0 = L;
m_pResults->m_high_color = H0 = H;
m_pResults->m_error = error;
if (!m_pResults->m_error)
return;
improved = true;
}
}
if (m_pResults->m_error == initial_error)
break;
c.m_low_color = m_pResults->m_low_color;
c.m_high_color = m_pResults->m_high_color;
}
}
} // namespace crnlib
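Note on the rewritten optimize_dxt1 above: each refinement pass perturbs one endpoint by at most one step per RGB565 component while holding the other endpoint fixed, which gives up to 26 candidates per endpoint. A minimal sketch of that neighbourhood enumeration, with clamping at the component limits, is below. neighbors565 is a hypothetical helper; the diff instead writes the candidates straight into a packed solutions array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns every RGB565 color whose components each differ from c by at most 1,
// excluding c itself. Components are clamped to their legal ranges
// (5 bits for red and blue, 6 bits for green).
inline std::vector<uint16_t> neighbors565(uint16_t c) {
  const int r0 = (c >> 11) & 31, g0 = (c >> 5) & 63, b0 = c & 31;
  std::vector<uint16_t> out;
  out.reserve(26);
  for (int b = std::max(b0 - 1, 0); b <= std::min(b0 + 1, 31); b++) {
    for (int g = std::max(g0 - 1, 0); g <= std::min(g0 + 1, 63); g++) {
      for (int r = std::max(r0 - 1, 0); r <= std::min(r0 + 1, 31); r++) {
        const uint16_t n = static_cast<uint16_t>(r << 11 | g << 5 | b);
        if (n != c)
          out.push_back(n);
      }
    }
  }
  assert(out.size() <= 26);  // 3 * 3 * 3 candidates minus the center
  return out;
}
```

Colors at a corner of the RGB565 cube have only 7 neighbours, which is why the loops in the diff clamp their bounds rather than assuming 26 candidates per endpoint.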
+7 -12
@@ -3,18 +3,15 @@
#pragma once
#include "crn_dxt.h"
namespace crnlib
{
namespace crnlib {
// TODO: Experimental/Not fully implemented
class dxt_endpoint_refiner
{
class dxt_endpoint_refiner {
public:
dxt_endpoint_refiner();
struct params
{
params() :
m_block_index(0),
struct params {
params()
: m_block_index(0),
m_pPixels(NULL),
m_num_pixels(0),
m_pSelectors(NULL),
@@ -22,8 +19,7 @@ namespace crnlib
m_error_to_beat(cUINT64_MAX),
m_dxt1_selectors(true),
m_perceptual(true),
m_highest_quality(true)
{
m_highest_quality(true) {
}
uint m_block_index;
@@ -42,8 +38,7 @@ namespace crnlib
bool m_highest_quality;
};
struct results
{
struct results {
uint16 m_low_color;
uint16 m_high_color;
uint64 m_error;
+76 -156
@@ -5,18 +5,14 @@
#include "crn_dxt_fast.h"
#include "crn_ryg_dxt.hpp"
namespace crnlib
{
namespace dxt_fast
{
static inline int mul_8bit(int a, int b)
{
namespace crnlib {
namespace dxt_fast {
static inline int mul_8bit(int a, int b) {
int t = a * b + 128;
return (t + (t >> 8)) >> 8;
}
static inline color_quad_u8& unpack_color(color_quad_u8& c, uint v)
{
static inline color_quad_u8& unpack_color(color_quad_u8& c, uint v) {
uint rv = (v & 0xf800) >> 11;
uint gv = (v & 0x07e0) >> 5;
uint bv = (v & 0x001f) >> 0;
@@ -29,13 +25,11 @@ namespace crnlib
return c;
}
static inline uint pack_color(const color_quad_u8& c)
{
static inline uint pack_color(const color_quad_u8& c) {
return (mul_8bit(c.r, 31) << 11) + (mul_8bit(c.g, 63) << 5) + mul_8bit(c.b, 31);
}
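`mul_8bit` approximates `a * b / 255` with rounding, using only shifts and adds, and `pack_color` relies on it to scale 8-bit channels down to 5 and 6 bits. The sketch below repeats those two routines on plain integers so the arithmetic can be checked in isolation. The `rgb8` struct and the `_sketch` names are hypothetical stand-ins for `color_quad_u8` and the originals.

```cpp
#include <cassert>

struct rgb8 { int r, g, b; };  // hypothetical stand-in for color_quad_u8

// Same arithmetic as mul_8bit in the diff: rounded a * b / 255 for 0 <= a, b <= 255.
static inline int mul_8bit_sketch(int a, int b) {
  int t = a * b + 128;
  return (t + (t >> 8)) >> 8;
}

// Same layout as pack_color in the diff: 5 bits red, 6 bits green, 5 bits blue.
static inline unsigned pack_565_sketch(const rgb8& c) {
  return (mul_8bit_sketch(c.r, 31) << 11) + (mul_8bit_sketch(c.g, 63) << 5) + mul_8bit_sketch(c.b, 31);
}
```

For example, `mul_8bit_sketch(128, 31)` evaluates to 16, the rounded value of 128 * 31 / 255.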
static inline void lerp_color(color_quad_u8& result, const color_quad_u8& p1, const color_quad_u8& p2, uint f)
{
static inline void lerp_color(color_quad_u8& result, const color_quad_u8& p1, const color_quad_u8& p2, uint f) {
CRNLIB_ASSERT(f <= 255);
result.r = static_cast<uint8>(p1.r + mul_8bit(p2.r - p1.r, f));
@@ -43,8 +37,7 @@ namespace crnlib
result.b = static_cast<uint8>(p1.b + mul_8bit(p2.b - p1.b, f));
}
static inline void eval_colors(color_quad_u8* pColors, uint c0, uint c1)
{
static inline void eval_colors(color_quad_u8* pColors, uint c0, uint c1) {
unpack_color(pColors[0], c0);
unpack_color(pColors[1], c1);
@@ -63,8 +56,7 @@ namespace crnlib
}
// false if all selectors equal
static bool match_block_colors(uint n, const color_quad_u8* pBlock, const color_quad_u8* pColors, uint8* pSelectors)
{
static bool match_block_colors(uint n, const color_quad_u8* pBlock, const color_quad_u8* pColors, uint8* pSelectors) {
int dirr = pColors[0].r - pColors[1].r;
int dirg = pColors[0].g - pColors[1].g;
int dirb = pColors[0].b - pColors[1].b;
@@ -86,8 +78,7 @@ namespace crnlib
c3Point >>= 1;
bool status = false;
for (uint i = 0; i < n; i++)
{
for (uint i = 0; i < n; i++) {
int dot = pBlock[i].r * dirr + pBlock[i].g * dirg + pBlock[i].b * dirb;
uint8 s;
@@ -105,12 +96,10 @@ namespace crnlib
return status;
}
static bool optimize_block_colors(uint n, const color_quad_u8* block, uint& max16, uint& min16, uint ave_color[3], float axis[3])
{
static bool optimize_block_colors(uint n, const color_quad_u8* block, uint& max16, uint& min16, uint ave_color[3], float axis[3]) {
int min[3], max[3];
for(uint ch = 0; ch < 3; ch++)
{
for (uint ch = 0; ch < 3; ch++) {
const uint8* bp = ((const uint8*)block) + ch;
int minv, maxv;
@@ -118,8 +107,7 @@ namespace crnlib
minv = maxv = bp[0];
const uint l = n << 2;
for (uint i = 4; i < l; i += 4)
{
for (uint i = 4; i < l; i += 4) {
muv += bp[i];
minv = math::minimum<int>(minv, bp[i]);
maxv = math::maximum<int>(maxv, bp[i]);
@@ -138,8 +126,7 @@ namespace crnlib
for (int i = 0; i < 6; i++)
cov[i] = 0;
for(uint i=0;i<n;i++)
{
for (uint i = 0; i < n; i++) {
double r = (int)block[i].r - (int)ave_color[0];
double g = (int)block[i].g - (int)ave_color[1];
double b = (int)block[i].b - (int)ave_color[2];
@@ -161,8 +148,7 @@ namespace crnlib
vfb = max[2] - min[2];
static const uint nIterPower = 4;
for(uint iter = 0; iter < nIterPower; iter++)
{
for (uint iter = 0; iter < nIterPower; iter++) {
double r = vfr * covf[0] + vfg * covf[1] + vfb * covf[2];
double g = vfr * covf[1] + vfg * covf[3] + vfb * covf[4];
double b = vfr * covf[2] + vfg * covf[4] + vfb * covf[5];
@@ -184,9 +170,7 @@ namespace crnlib
axis[0] = (float)v_r;
axis[1] = (float)v_g;
axis[2] = (float)v_b;
}
else
{
} else {
magn = 512.0f / magn;
vfr *= magn;
vfg *= magn;
@@ -205,18 +189,15 @@ namespace crnlib
color_quad_u8 minp(block[0]);
color_quad_u8 maxp(block[0]);
for(uint i = 1; i < n; i++)
{
for (uint i = 1; i < n; i++) {
int dot = block[i].r * v_r + block[i].g * v_g + block[i].b * v_b;
if (dot < mind)
{
if (dot < mind) {
mind = dot;
minp = block[i];
}
if (dot > maxd)
{
if (dot > maxd) {
maxd = dot;
maxp = block[i];
}
@@ -231,8 +212,7 @@ namespace crnlib
// The refinement function. (Clever code, part 2)
// Tries to optimize colors to suit block contents better.
// (By solving a least squares system via normal equations+Cramer's rule)
static bool refine_block(uint n, const color_quad_u8 *block, uint &max16, uint &min16, const uint8* pSelectors)
{
static bool refine_block(uint n, const color_quad_u8* block, uint& max16, uint& min16, const uint8* pSelectors) {
static const int w1Tab[4] = {3, 0, 2, 1};
static const int prods_0[4] = {0x00, 0x00, 0x02, 0x02};
@@ -247,8 +227,7 @@ namespace crnlib
At1_r = At1_g = At1_b = 0;
At2_r = At2_g = At2_b = 0;
for(uint i = 0; i < n; i++)
{
for (uint i = 0; i < n; i++) {
double r = block[i].r;
double g = block[i].g;
double b = block[i].b;
@@ -298,12 +277,10 @@ namespace crnlib
}
// false if all selectors equal
static bool determine_selectors(uint n, const color_quad_u8* block, uint min16, uint max16, uint8* pSelectors)
{
static bool determine_selectors(uint n, const color_quad_u8* block, uint min16, uint max16, uint8* pSelectors) {
color_quad_u8 color[4];
if (max16 != min16)
{
if (max16 != min16) {
eval_colors(color, min16, max16);
return match_block_colors(n, block, color, pSelectors);
@@ -313,8 +290,7 @@ namespace crnlib
return false;
}
static uint64 determine_error(uint n, const color_quad_u8* block, uint min16, uint max16, uint64 early_out_error)
{
static uint64 determine_error(uint n, const color_quad_u8* block, uint min16, uint max16, uint64 early_out_error) {
color_quad_u8 color[4];
eval_colors(color, min16, max16);
@@ -338,13 +314,11 @@ namespace crnlib
uint64 total_error = 0;
for (uint i = 0; i < n; i++)
{
for (uint i = 0; i < n; i++) {
const color_quad_u8& a = block[i];
uint s = 0;
if (min16 != max16)
{
if (min16 != max16) {
int dot = a.r * dirr + a.g * dirg + a.b * dirb;
if (dot < halfPoint)
@@ -371,21 +345,18 @@ namespace crnlib
return total_error;
}
static bool refine_endpoints(uint n, const color_quad_u8* pBlock, uint& low16, uint& high16, uint8* pSelectors)
{
static bool refine_endpoints(uint n, const color_quad_u8* pBlock, uint& low16, uint& high16, uint8* pSelectors) {
bool optimized = false;
const int limits[3] = {31, 63, 31};
for (uint trial = 0; trial < 2; trial++)
{
for (uint trial = 0; trial < 2; trial++) {
color_quad_u8 color[4];
eval_colors(color, low16, high16);
uint64 total_error[3] = {0, 0, 0};
for (uint i = 0; i < n; i++)
{
for (uint i = 0; i < n; i++) {
const color_quad_u8& a = pBlock[i];
const uint s = pSelectors[i];
@@ -411,20 +382,17 @@ namespace crnlib
bool trial_optimized = false;
for (uint axis = 0; axis < 3; axis++)
{
for (uint axis = 0; axis < 3; axis++) {
if (!total_error[axis])
continue;
const sU8* const pExpand = (axis == 1) ? ryg_dxt::Expand6 : ryg_dxt::Expand5;
for (uint e = 0; e < 2; e++)
{
for (uint e = 0; e < 2; e++) {
uint v[4];
v[e ^ 1] = expanded_endpoints[e ^ 1][axis];
for (int t = -1; t <= 1; t += 2)
{
for (int t = -1; t <= 1; t += 2) {
int a = endpoints[e][axis] + t;
if ((a < 0) || (a > limits[axis]))
continue;
@@ -440,8 +408,7 @@ namespace crnlib
uint64 axis_error = 0;
for (uint i = 0; i < n; i++)
{
for (uint i = 0; i < n; i++) {
const color_quad_u8& p = pBlock[i];
int e = v[pSelectors[i]] - p[axis];
@@ -452,8 +419,7 @@ namespace crnlib
break;
}
if (axis_error < total_error[axis])
{
if (axis_error < total_error[axis]) {
//total_error[axis] = axis_error;
endpoints[e][axis] = (uint8)a;
@@ -470,8 +436,7 @@ namespace crnlib
utils::zero_object(total_error);
for (uint i = 0; i < n; i++)
{
for (uint i = 0; i < n; i++) {
const color_quad_u8& a = pBlock[i];
const uint s = pSelectors[i];
@@ -505,8 +470,7 @@ namespace crnlib
return optimized;
}
static void refine_endpoints2(uint n, const color_quad_u8* pBlock, uint& low16, uint& high16, uint8* pSelectors, float axis[3])
{
static void refine_endpoints2(uint n, const color_quad_u8* pBlock, uint& low16, uint& high16, uint8* pSelectors, float axis[3]) {
uint64 orig_error = determine_error(n, pBlock, low16, high16, cUINT64_MAX);
if (!orig_error)
return;
@@ -536,8 +500,7 @@ namespace crnlib
uint64 cur_error = orig_error;
for (uint iter = 0; iter < num_iters; iter++)
{
for (uint iter = 0; iter < num_iters; iter++) {
color_quad_u8 endpoints[2];
endpoints[0] = dxt1_block::unpack_color((uint16)low16, false);
@@ -547,8 +510,7 @@ namespace crnlib
vec3F high_color(endpoints[1][0], endpoints[1][1], endpoints[1][2]);
vec3F probe_low_color(low_color + initial_ofs);
for (uint i = 0; i < num_trials; i++)
{
for (uint i = 0; i < num_trials; i++) {
int r = math::clamp((int)floor(probe_low_color[0]), 0, 31);
int g = math::clamp((int)floor(probe_low_color[1]), 0, 63);
int b = math::clamp((int)floor(probe_low_color[2]), 0, 31);
@@ -558,8 +520,7 @@ namespace crnlib
}
vec3F probe_high_color(high_color + initial_ofs);
for (uint i = 0; i < num_trials; i++)
{
for (uint i = 0; i < num_trials; i++) {
int r = math::clamp((int)floor(probe_high_color[0]), 0, 31);
int g = math::clamp((int)floor(probe_high_color[1]), 0, 63);
int b = math::clamp((int)floor(probe_high_color[2]), 0, 31);
@@ -580,10 +541,8 @@ namespace crnlib
c = fast_hash(&c, sizeof(c));
hash[(c >> 6) & 3] = 1ULL << (c & 63);
for (uint i = 0; i < num_trials; i++)
{
for (uint j = 0; j < num_trials; j++)
{
for (uint i = 0; i < num_trials; i++) {
for (uint j = 0; j < num_trials; j++) {
uint l = probe_low[i];
uint h = probe_high[j];
if (l < h)
@@ -599,8 +558,7 @@ namespace crnlib
hash[ofs] |= mask;
uint64 new_error = determine_error(n, pBlock, l, h, cur_error);
if (new_error < cur_error)
{
if (new_error < cur_error) {
best_l = l;
best_h = h;
cur_error = new_error;
@@ -610,8 +568,7 @@ namespace crnlib
bool improved = false;
if ((best_l != low16) || (best_h != high16))
{
if ((best_l != low16) || (best_h != high16)) {
low16 = best_l;
high16 = best_h;
@@ -619,8 +576,7 @@ namespace crnlib
improved = true;
}
if (refine_endpoints(n, pBlock, low16, high16, pSelectors))
{
if (refine_endpoints(n, pBlock, low16, high16, pSelectors)) {
improved = true;
uint64 cur_error = determine_error(n, pBlock, low16, high16, cUINT64_MAX);
@@ -637,8 +593,7 @@ namespace crnlib
//if (end_error > orig_error) DebugBreak();
}
static void compress_solid_block(uint n, uint ave_color[3], uint& low16, uint& high16, uint8* pSelectors)
{
static void compress_solid_block(uint n, uint ave_color[3], uint& low16, uint& high16, uint8* pSelectors) {
uint r = ave_color[0];
uint g = ave_color[1];
uint b = ave_color[2];
@@ -649,23 +604,18 @@ namespace crnlib
high16 = (ryg_dxt::OMatch5[r][1] << 11) | (ryg_dxt::OMatch6[g][1] << 5) | ryg_dxt::OMatch5[b][1];
}
void compress_color_block(uint n, const color_quad_u8* block, uint& low16, uint& high16, uint8* pSelectors, bool refine)
{
void compress_color_block(uint n, const color_quad_u8* block, uint& low16, uint& high16, uint8* pSelectors, bool refine) {
CRNLIB_ASSERT((n & 15) == 0);
uint ave_color[3];
float axis[3];
if (!optimize_block_colors(n, block, low16, high16, ave_color, axis))
{
if (!optimize_block_colors(n, block, low16, high16, ave_color, axis)) {
compress_solid_block(n, ave_color, low16, high16, pSelectors);
}
else
{
} else {
if (!determine_selectors(n, block, low16, high16, pSelectors))
compress_solid_block(n, ave_color, low16, high16, pSelectors);
else
{
else {
if (refine_block(n, block, low16, high16, pSelectors))
determine_selectors(n, block, low16, high16, pSelectors);
@@ -674,16 +624,14 @@ namespace crnlib
}
}
if (low16 < high16)
{
if (low16 < high16) {
utils::swap(low16, high16);
for (uint i = 0; i < n; i++)
pSelectors[i] ^= 1;
}
}
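The last step of `compress_color_block` swaps the endpoints when `low16 < high16` and flips the low bit of every selector. In the four-color DXT1 palette, selectors 0 and 1 name the two endpoints and selectors 2 and 3 the two interpolants, so exchanging the endpoints exchanges 0 with 1 and 2 with 3, which is what `^= 1` does. A single-channel sketch of that invariance, assuming the usual 2/3 and 1/3 interpolation with integer division (the helper name is hypothetical):

```cpp
#include <array>
#include <cassert>

// Four-entry palette for one channel: two endpoints followed by two interpolants.
static std::array<int, 4> palette4_sketch(int c0, int c1) {
  std::array<int, 4> p = {{c0, c1, (2 * c0 + c1) / 3, (c0 + 2 * c1) / 3}};
  return p;
}

// Value a selector decodes to before the swap.
static int decode_before_swap(int c0, int c1, int selector) {
  return palette4_sketch(c0, c1)[selector];
}

// Value the flipped selector decodes to after the endpoints are exchanged.
static int decode_after_swap(int c0, int c1, int selector) {
  return palette4_sketch(c1, c0)[selector ^ 1];
}
```

Both functions return the same value for every selector, so the swap changes the stored endpoint order without changing the decoded pixels.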
void compress_color_block(dxt1_block* pDXT1_block, const color_quad_u8* pBlock, bool refine)
{
void compress_color_block(dxt1_block* pDXT1_block, const color_quad_u8* pBlock, bool refine) {
uint8 color_selectors[16];
uint low16, high16;
dxt_fast::compress_color_block(16, pBlock, low16, high16, color_selectors, refine);
@@ -692,8 +640,7 @@ namespace crnlib
pDXT1_block->set_high_color(static_cast<uint16>(high16));
uint mask = 0;
for (int i = 15; i >= 0; i--)
{
for (int i = 15; i >= 0; i--) {
mask <<= 2;
mask |= color_selectors[i];
}
@@ -704,13 +651,11 @@ namespace crnlib
pDXT1_block->m_selectors[3] = (uint8)((mask >> 24) & 0xFF);
}
void compress_alpha_block(uint n, const color_quad_u8* block, uint& low8, uint& high8, uint8* pSelectors, uint comp_index)
{
void compress_alpha_block(uint n, const color_quad_u8* block, uint& low8, uint& high8, uint8* pSelectors, uint comp_index) {
int min, max;
min = max = block[0][comp_index];
for (uint i = 1; i < n; i++)
{
for (uint i = 1; i < n; i++) {
min = math::minimum<int>(min, block[i][comp_index]);
max = math::maximum<int>(max, block[i][comp_index]);
}
@@ -723,14 +668,18 @@ namespace crnlib
int dist4 = dist * 4;
int dist2 = dist * 2;
for (uint i = 0; i < n; i++)
{
for (uint i = 0; i < n; i++) {
int a = block[i][comp_index] * 7 - bias;
int ind, t;
t = (dist4 - a) >> 31; ind = t & 4; a -= dist4 & t;
t = (dist2 - a) >> 31; ind += t & 2; a -= dist2 & t;
t = (dist - a) >> 31; ind += t & 1;
t = (dist4 - a) >> 31;
ind = t & 4;
a -= dist4 & t;
t = (dist2 - a) >> 31;
ind += t & 2;
a -= dist2 & t;
t = (dist - a) >> 31;
ind += t & 1;
ind = -ind & 7;
ind ^= (2 > ind);
@@ -739,8 +688,7 @@ namespace crnlib
}
}
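The selector computation in `compress_alpha_block` avoids branches: `(x - a) >> 31` yields an all-ones mask when `a > x` and zero otherwise, and the masks accumulate a 3-bit linear index that the last two lines of the loop remap. A hedged sketch comparing that form with a plain `if` version is given here. It assumes a 32-bit `int` with arithmetic right shift, as the original code does, and all function names are hypothetical.

```cpp
#include <cassert>

// Branch-free form, following the diff line by line.
static int linear_index_branchless(int a, int dist) {
  const int dist4 = dist * 4;
  const int dist2 = dist * 2;
  int t = (dist4 - a) >> 31;  // -1 if a > dist4, else 0
  int ind = t & 4;
  a -= dist4 & t;
  t = (dist2 - a) >> 31;      // -1 if a > dist2, else 0
  ind += t & 2;
  a -= dist2 & t;
  t = (dist - a) >> 31;       // -1 if a > dist, else 0
  ind += t & 1;
  return ind;
}

// The same decisions written with explicit branches.
static int linear_index_branchy(int a, int dist) {
  int ind = 0;
  if (a > dist * 4) { ind += 4; a -= dist * 4; }
  if (a > dist * 2) { ind += 2; a -= dist * 2; }
  if (a > dist) ind += 1;
  return ind;
}

// Remap from the diff: linear 0 -> 1, linear 7 -> 0, linear 1..6 -> 7..2.
static int remap_selector(int ind) {
  ind = -ind & 7;
  ind ^= (2 > ind);
  return ind;
}
```

The remap is consistent with the 8-value DXT5 alpha layout, in which selector 0 is the larger endpoint, selector 1 the smaller, and selectors 2 to 7 step from the larger towards the smaller.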
void compress_alpha_block(dxt5_block* pDXT5_block, const color_quad_u8* pBlock, uint comp_index)
{
void compress_alpha_block(dxt5_block* pDXT5_block, const color_quad_u8* pBlock, uint comp_index) {
uint8 selectors[16];
uint low8, high8;
@@ -753,12 +701,10 @@ namespace crnlib
uint bits = 0;
uint8* pDst = pDXT5_block->m_selectors;
for (uint i = 0; i < 16; i++)
{
for (uint i = 0; i < 16; i++) {
mask |= (selectors[i] << bits);
if ((bits += 3) >= 8)
{
if ((bits += 3) >= 8) {
*pDst++ = static_cast<uint8>(mask);
mask >>= 8;
bits -= 8;
@@ -766,15 +712,13 @@ namespace crnlib
}
}
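The loop in the `dxt5_block` overload of `compress_alpha_block` packs sixteen 3-bit selectors into six bytes, least significant bits first, flushing a byte whenever at least eight bits are pending. A minimal standalone sketch of the same loop, returning the bytes instead of writing through `pDXT5_block->m_selectors` (the function name and the `std::array` signature are hypothetical):

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Packs 16 selectors (each assumed to be in 0..7) into 48 bits, low bits first.
static std::array<uint8_t, 6> pack_3bit_selectors_sketch(const std::array<uint8_t, 16>& selectors) {
  std::array<uint8_t, 6> out = {};
  uint32_t mask = 0;  // pending bits, at most 10 at any time
  uint32_t bits = 0;  // number of pending bits
  unsigned dst = 0;
  for (unsigned i = 0; i < 16; i++) {
    mask |= static_cast<uint32_t>(selectors[i]) << bits;
    bits += 3;
    if (bits >= 8) {
      out[dst++] = static_cast<uint8_t>(mask);  // flush the low byte
      mask >>= 8;
      bits -= 8;
    }
  }
  return out;
}
```

Sixteen selectors times three bits is exactly 48 bits, so the loop flushes six times and ends with no pending bits.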
void find_representative_colors(uint n, const color_quad_u8* pBlock, color_quad_u8& lo, color_quad_u8& hi)
{
void find_representative_colors(uint n, const color_quad_u8* pBlock, color_quad_u8& lo, color_quad_u8& hi) {
uint64 ave64[3];
ave64[0] = 0;
ave64[1] = 0;
ave64[2] = 0;
for (uint i = 0; i < n; i++)
{
for (uint i = 0; i < n; i++) {
ave64[0] += pBlock[i].r;
ave64[1] += pBlock[i].g;
ave64[2] += pBlock[i].b;
@@ -787,14 +731,12 @@ namespace crnlib
int furthest_dist = -1;
uint furthest_index = 0;
for (uint i = 0; i < n; i++)
{
for (uint i = 0; i < n; i++) {
int r = pBlock[i].r - ave[0];
int g = pBlock[i].g - ave[1];
int b = pBlock[i].b - ave[2];
int dist = r * r + g * g + b * b;
if (dist > furthest_dist)
{
if (dist > furthest_dist) {
furthest_dist = dist;
furthest_index = i;
}
@@ -804,14 +746,12 @@ namespace crnlib
int opp_dist = -1;
uint opp_index = 0;
for (uint i = 0; i < n; i++)
{
for (uint i = 0; i < n; i++) {
int r = pBlock[i].r - lo_color.r;
int g = pBlock[i].g - lo_color.g;
int b = pBlock[i].b - lo_color.b;
int dist = r * r + g * g + b * b;
if (dist > opp_dist)
{
if (dist > opp_dist) {
opp_dist = dist;
opp_index = i;
}
@@ -819,15 +759,13 @@ namespace crnlib
color_quad_u8 hi_color(pBlock[opp_index]);
for (uint i = 0; i < 3; i++)
{
for (uint i = 0; i < 3; i++) {
lo_color[i] = static_cast<uint8>((lo_color[i] + ave[i]) >> 1);
hi_color[i] = static_cast<uint8>((hi_color[i] + ave[i]) >> 1);
}
const uint cMaxIters = 4;
for (uint iter_index = 0; iter_index < cMaxIters; iter_index++)
{
for (uint iter_index = 0; iter_index < cMaxIters; iter_index++) {
if ((lo_color[0] == hi_color[0]) && (lo_color[1] == hi_color[1]) && (lo_color[2] == hi_color[2]))
break;
@@ -849,8 +787,7 @@ namespace crnlib
vec_g *= 2;
vec_b *= 2;
for (uint i = 0; i < n; i++)
{
for (uint i = 0; i < n; i++) {
const color_quad_u8& c = pBlock[i];
const int dot = c[0] * vec_r + c[1] * vec_g + c[2] * vec_b;
@@ -875,16 +812,14 @@ namespace crnlib
(new_color8[1][0] == hi_color[0]) && (new_color8[1][1] == hi_color[1]) && (new_color8[1][2] == hi_color[2]))
break;
for (uint i = 0; i < 3; i++)
{
for (uint i = 0; i < 3; i++) {
lo_color[i] = new_color8[0][i];
hi_color[i] = new_color8[1][i];
}
}
uint energy[2] = {0, 0};
for (uint i = 0; i < 3; i++)
{
for (uint i = 0; i < 3; i++) {
energy[0] += lo_color[i] * lo_color[i];
energy[1] += hi_color[i] * hi_color[i];
}
@@ -899,18 +834,3 @@ namespace crnlib
} // namespace dxt_fast
} // namespace crnlib
+2 -4
@@ -4,10 +4,8 @@
#include "crn_color.h"
#include "crn_dxt.h"
namespace crnlib
{
namespace dxt_fast
{
namespace crnlib {
namespace dxt_fast {
void compress_color_block(uint n, const color_quad_u8* block, uint& low16, uint& high16, uint8* pSelectors, bool refine = false);
void compress_color_block(dxt1_block* pDXT1_block, const color_quad_u8* pBlock, bool refine = false);
+1093 -2321
File diff suppressed because it is too large
+130 -356
@@ -14,340 +14,171 @@
#define CRN_NO_FUNCTION_DEFINITIONS
#include "../inc/crnlib.h"
namespace crnlib
{
namespace crnlib {
const uint cTotalCompressionPhases = 25;
class dxt_hc
{
class dxt_hc {
public:
dxt_hc();
~dxt_hc();
struct pixel_chunk
{
pixel_chunk() { clear(); }
dxt_pixel_block m_blocks[cChunkBlockHeight][cChunkBlockWidth];
const color_quad_u8& operator() (uint cx, uint cy) const
{
CRNLIB_ASSERT((cx < cChunkPixelWidth) && (cy < cChunkPixelHeight));
return m_blocks[cy >> cBlockPixelHeightShift][cx >> cBlockPixelWidthShift].m_pixels
[cy & (cBlockPixelHeight - 1)][cx & (cBlockPixelWidth - 1)];
}
color_quad_u8& operator() (uint cx, uint cy)
{
CRNLIB_ASSERT((cx < cChunkPixelWidth) && (cy < cChunkPixelHeight));
return m_blocks[cy >> cBlockPixelHeightShift][cx >> cBlockPixelWidthShift].m_pixels
[cy & (cBlockPixelHeight - 1)][cx & (cBlockPixelWidth - 1)];
}
inline void clear()
{
utils::zero_object(*this);
m_weight = 1.0f;
}
float m_weight;
struct endpoint_indices_details {
union {
struct {
uint16 color;
uint16 alpha0;
uint16 alpha1;
};
uint16 component[3];
};
uint8 reference;
endpoint_indices_details() { utils::zero_object(*this); }
};
typedef crnlib::vector<pixel_chunk> pixel_chunk_vec;
struct selector_indices_details {
union {
struct {
uint16 color;
uint16 alpha0;
uint16 alpha1;
};
uint16 component[3];
};
selector_indices_details() { utils::zero_object(*this); }
};
struct params
{
params() :
struct tile_details {
crnlib::vector<color_quad_u8> pixels;
float weight;
vec<6, float> color_endpoint;
vec<2, float> alpha_endpoints[2];
uint16 cluster_indices[3];
};
crnlib::vector<tile_details> m_tiles;
uint m_num_tiles;
float m_color_derating[cCRNMaxLevels][8];
float m_alpha_derating[8];
float m_uint8_to_float[256];
color_quad_u8 (*m_blocks)[16];
uint m_num_blocks;
crnlib::vector<float> m_block_weights;
crnlib::vector<uint8> m_block_encodings;
crnlib::vector<uint64> m_block_selectors[3];
crnlib::vector<uint32> m_color_selectors;
crnlib::vector<uint64> m_alpha_selectors;
crnlib::vector<bool> m_color_selectors_used;
crnlib::vector<bool> m_alpha_selectors_used;
crnlib::vector<uint> m_tile_indices;
crnlib::vector<endpoint_indices_details> m_endpoint_indices;
crnlib::vector<selector_indices_details> m_selector_indices;
struct params {
params()
: m_num_blocks(0),
m_num_levels(0),
m_num_faces(0),
m_format(cDXT1),
m_perceptual(true),
m_hierarchical(true),
m_color_endpoint_codebook_size(3072),
m_color_selector_codebook_size(3072),
m_alpha_endpoint_codebook_size(3072),
m_alpha_selector_codebook_size(3072),
m_adaptive_tile_color_psnr_derating(2.0f), // was 3.4f
m_adaptive_tile_color_psnr_derating(2.0f),
m_adaptive_tile_alpha_psnr_derating(2.0f),
m_adaptive_tile_color_alpha_weighting_ratio(3.0f),
m_num_levels(0),
m_format(cDXT1),
m_hierarchical(true),
m_perceptual(true),
m_debugging(false),
m_pProgress_func(NULL),
m_pProgress_func_data(NULL)
{
m_pProgress_func(0),
m_pProgress_func_data(0) {
m_alpha_component_indices[0] = 3;
m_alpha_component_indices[1] = 0;
for (uint i = 0; i < cCRNMaxLevels; i++)
{
m_levels[i].m_first_chunk = 0;
m_levels[i].m_num_chunks = 0;
for (uint i = 0; i < cCRNMaxLevels; i++) {
m_levels[i].m_first_block = 0;
m_levels[i].m_num_blocks = 0;
m_levels[i].m_block_width = 0;
}
}
// Valid range for codebook sizes: [32,8192] (non-power of two values are okay)
uint m_num_blocks;
uint m_num_levels;
uint m_num_faces;
struct {
uint m_first_block;
uint m_num_blocks;
uint m_block_width;
float m_weight;
} m_levels[cCRNMaxLevels];
dxt_format m_format;
bool m_perceptual;
bool m_hierarchical;
uint m_color_endpoint_codebook_size;
uint m_color_selector_codebook_size;
uint m_alpha_endpoint_codebook_size;
uint m_alpha_selector_codebook_size;
// Higher values cause 8x4, 4x8, and 4x4 blocks to be utilized less often (lower quality/smaller files).
// Lower values cause the encoder to use large tiles less often (better quality/larger files).
// Valid range: [0.0,100.0].
// A value of 0 will cause the encoder to only use tiles larger than 4x4 if doing so would incur no quality loss.
float m_adaptive_tile_color_psnr_derating;
float m_adaptive_tile_alpha_psnr_derating;
float m_adaptive_tile_color_alpha_weighting_ratio;
uint m_alpha_component_indices[2];
struct miplevel_desc
{
uint m_first_chunk;
uint m_num_chunks;
};
// The mip level data is optional!
miplevel_desc m_levels[cCRNMaxLevels];
uint m_num_levels;
dxt_format m_format;
// If m_hierarchical is false, only 4x4 blocks will be used by the encoder (leading to higher quality/larger files).
bool m_hierarchical;
// If m_perceptual is true, perceptual color metrics will be used by the encoder.
bool m_perceptual;
task_pool* m_pTask_pool;
bool m_debugging;
crn_progress_callback_func m_pProgress_func;
void* m_pProgress_func_data;
};
void clear();
// Main compression function
bool compress(const params& p, uint num_chunks, const pixel_chunk* pChunks, task_pool& task_pool);
// Output accessors
inline uint get_num_chunks() const { return m_num_chunks; }
struct chunk_encoding
{
chunk_encoding() { utils::zero_object(*this); };
// Index into g_chunk_encodings.
uint8 m_encoding_index;
// Number of tiles, endpoint indices.
uint8 m_num_tiles;
// Color, alpha0, alpha1
enum { cColorIndex = 0, cAlpha0Index = 1, cAlpha1Index = 2 };
uint16 m_endpoint_indices[3][cChunkMaxTiles];
uint16 m_selector_indices[3][cChunkBlockHeight][cChunkBlockWidth]; // [block_y][block_x]
};
typedef crnlib::vector<chunk_encoding> chunk_encoding_vec;
inline const chunk_encoding& get_chunk_encoding(uint chunk_index) const { return m_chunk_encoding[chunk_index]; }
inline const chunk_encoding_vec& get_chunk_encoding_vec() const { return m_chunk_encoding; }
struct selectors
{
selectors() { utils::zero_object(*this); }
uint8 m_selectors[cBlockPixelHeight][cBlockPixelWidth];
uint8 get_by_index(uint i) const { CRNLIB_ASSERT(i < (cBlockPixelWidth * cBlockPixelHeight)); const uint8* p = (const uint8*)m_selectors; return *(p + i); }
void set_by_index(uint i, uint v) { CRNLIB_ASSERT(i < (cBlockPixelWidth * cBlockPixelHeight)); uint8* p = (uint8*)m_selectors; *(p + i) = static_cast<uint8>(v); }
};
typedef crnlib::vector<selectors> selectors_vec;
// Color endpoints
inline uint get_color_endpoint_codebook_size() const { return m_color_endpoints.size(); }
inline uint get_color_endpoint(uint codebook_index) const { return m_color_endpoints[codebook_index]; }
const crnlib::vector<uint>& get_color_endpoint_vec() const { return m_color_endpoints; }
// Color selectors
uint get_color_selector_codebook_size() const { return m_color_selectors.size(); }
const selectors& get_color_selectors(uint codebook_index) const { return m_color_selectors[codebook_index]; }
const crnlib::vector<selectors>& get_color_selectors_vec() const { return m_color_selectors; }
// Alpha endpoints
inline uint get_alpha_endpoint_codebook_size() const { return m_alpha_endpoints.size(); }
inline uint get_alpha_endpoint(uint codebook_index) const { return m_alpha_endpoints[codebook_index]; }
const crnlib::vector<uint>& get_alpha_endpoint_vec() const { return m_alpha_endpoints; }
// Alpha selectors
uint get_alpha_selector_codebook_size() const { return m_alpha_selectors.size(); }
const selectors& get_alpha_selectors(uint codebook_index) const { return m_alpha_selectors[codebook_index]; }
const crnlib::vector<selectors>& get_alpha_selectors_vec() const { return m_alpha_selectors; }
// Debug images
const pixel_chunk_vec& get_compressed_chunk_pixels() const { return m_dbg_chunk_pixels; }
const pixel_chunk_vec& get_compressed_chunk_pixels_tile_vis() const { return m_dbg_chunk_pixels_tile_vis; }
const pixel_chunk_vec& get_compressed_chunk_pixels_color_quantized() const { return m_dbg_chunk_pixels_color_quantized; }
const pixel_chunk_vec& get_compressed_chunk_pixels_alpha_quantized() const { return m_dbg_chunk_pixels_alpha_quantized; }
const pixel_chunk_vec& get_compressed_chunk_pixels_final() const { return m_dbg_chunk_pixels_final; }
const pixel_chunk_vec& get_compressed_chunk_pixels_orig_color_selectors() const { return m_dbg_chunk_pixels_orig_color_selectors; }
const pixel_chunk_vec& get_compressed_chunk_pixels_quantized_color_selectors() const { return m_dbg_chunk_pixels_quantized_color_selectors; }
const pixel_chunk_vec& get_compressed_chunk_pixels_final_color_selectors() const { return m_dbg_chunk_pixels_final_color_selectors; }
const pixel_chunk_vec& get_compressed_chunk_pixels_orig_alpha_selectors() const { return m_dbg_chunk_pixels_orig_alpha_selectors; }
const pixel_chunk_vec& get_compressed_chunk_pixels_quantized_alpha_selectors() const { return m_dbg_chunk_pixels_quantized_alpha_selectors; }
const pixel_chunk_vec& get_compressed_chunk_pixels_final_alpha_selectors() const { return m_dbg_chunk_pixels_final_alpha_selectors; }
static void create_debug_image_from_chunks(uint num_chunks_x, uint num_chunks_y, const pixel_chunk_vec& chunks, const chunk_encoding_vec *pChunk_encodings, image_u8& img, bool serpentine_scan, int comp_index = -1);
bool compress(
color_quad_u8 (*blocks)[16],
crnlib::vector<endpoint_indices_details>& endpoint_indices,
crnlib::vector<selector_indices_details>& selector_indices,
crnlib::vector<uint32>& color_endpoints,
crnlib::vector<uint32>& alpha_endpoints,
crnlib::vector<uint32>& color_selectors,
crnlib::vector<uint64>& alpha_selectors,
const params& p
);
private:
params m_params;
uint m_num_chunks;
const pixel_chunk* m_pChunks;
chunk_encoding_vec m_chunk_encoding;
uint m_num_alpha_blocks; // 0, 1, or 2
uint m_num_alpha_blocks;
bool m_has_color_blocks;
bool m_has_alpha0_blocks;
bool m_has_alpha1_blocks;
bool m_has_etc_color_blocks;
bool m_has_subblocks;
struct compressed_tile
{
uint m_endpoint_cluster_index;
uint m_first_endpoint;
uint m_second_endpoint;
uint8 m_selectors[cChunkPixelWidth * cChunkPixelHeight];
void set_selector(uint x, uint y, uint s)
{
CRNLIB_ASSERT((x < m_pixel_width) && (y < m_pixel_height));
m_selectors[x + y * m_pixel_width] = static_cast<uint8>(s);
}
uint get_selector(uint x, uint y) const
{
CRNLIB_ASSERT((x < m_pixel_width) && (y < m_pixel_height));
return m_selectors[x + y * m_pixel_width];
}
uint8 m_pixel_width;
uint8 m_pixel_height;
uint8 m_layout_index;
bool m_alpha_encoding;
enum {
cColor = 0,
cAlpha0 = 1,
cAlpha1 = 2,
cNumComps = 3
};
struct compressed_chunk
{
compressed_chunk() { utils::zero_object(*this); }
uint8 m_encoding_index;
uint8 m_num_tiles;
compressed_tile m_tiles[cChunkMaxTiles];
compressed_tile m_quantized_tiles[cChunkMaxTiles];
uint16 m_endpoint_cluster_index[cChunkMaxTiles];
uint16 m_selector_cluster_index[cChunkBlockHeight][cChunkBlockWidth];
struct color_cluster {
color_cluster() : first_endpoint(0), second_endpoint(0) {}
crnlib::vector<uint> blocks[3];
crnlib::vector<color_quad_u8> pixels;
uint first_endpoint;
uint second_endpoint;
color_quad_u8 color_values[4];
};
crnlib::vector<color_cluster> m_color_clusters;
typedef crnlib::vector<compressed_chunk> compressed_chunk_vec;
enum
{
cColorChunks = 0,
cAlpha0Chunks = 1,
cAlpha1Chunks = 2,
cNumCompressedChunkVecs = 3
struct alpha_cluster {
alpha_cluster() : first_endpoint(0), second_endpoint(0) {}
crnlib::vector<uint> blocks[3];
crnlib::vector<color_quad_u8> pixels;
uint first_endpoint;
uint second_endpoint;
uint alpha_values[8];
bool refined_alpha;
uint refined_alpha_values[8];
};
compressed_chunk_vec m_compressed_chunks[cNumCompressedChunkVecs];
volatile atomic32_t m_encoding_hist[cNumChunkEncodings];
atomic32_t m_total_tiles;
void compress_dxt1_block(
dxt1_endpoint_optimizer::results& results,
uint chunk_index, const image_u8& chunk, uint x_ofs, uint y_ofs, uint width, uint height,
uint8* pSelectors);
void compress_dxt5_block(
dxt5_endpoint_optimizer::results& results,
uint chunk_index, const image_u8& chunk, uint x_ofs, uint y_ofs, uint width, uint height, uint component_index,
uint8* pAlpha_selectors);
void determine_compressed_chunks_task(uint64 data, void* pData_ptr);
bool determine_compressed_chunks();
struct tile_cluster
{
tile_cluster() : m_first_endpoint(0), m_second_endpoint(0), m_error(0), m_alpha_encoding(false) { }
// first = chunk, second = tile
// if an alpha tile, second's upper 16 bits contains the alpha index (0 or 1)
crnlib::vector< std::pair<uint, uint> > m_tiles;
uint m_first_endpoint;
uint m_second_endpoint;
uint64 m_error;
bool m_alpha_encoding;
};
typedef crnlib::vector<tile_cluster> tile_cluster_vec;
tile_cluster_vec m_color_clusters;
tile_cluster_vec m_alpha_clusters;
selectors_vec m_color_selectors;
selectors_vec m_alpha_selectors;
// For each selector, this array indicates every chunk/tile/tile block that use this color selector.
struct block_id
{
block_id() { utils::zero_object(*this); }
block_id(uint chunk_index, uint alpha_index, uint tile_index, uint block_x, uint block_y) :
m_chunk_index(chunk_index), m_alpha_index((uint8)alpha_index), m_tile_index((uint8)tile_index), m_block_x((uint8)block_x), m_block_y((uint8)block_y) { }
uint m_chunk_index;
uint8 m_alpha_index;
uint8 m_tile_index;
uint8 m_block_x;
uint8 m_block_y;
};
typedef crnlib::vector< crnlib::vector< block_id > > chunk_blocks_using_selectors_vec;
chunk_blocks_using_selectors_vec m_chunk_blocks_using_color_selectors;
chunk_blocks_using_selectors_vec m_chunk_blocks_using_alpha_selectors; // second's upper 16 bits contain alpha index!
crnlib::vector<uint> m_color_endpoints; // not valid until end, only for user access
crnlib::vector<uint> m_alpha_endpoints; // not valid until end, only for user access
// Debugging
pixel_chunk_vec m_dbg_chunk_pixels;
pixel_chunk_vec m_dbg_chunk_pixels_tile_vis;
pixel_chunk_vec m_dbg_chunk_pixels_color_quantized;
pixel_chunk_vec m_dbg_chunk_pixels_alpha_quantized;
pixel_chunk_vec m_dbg_chunk_pixels_orig_color_selectors;
pixel_chunk_vec m_dbg_chunk_pixels_quantized_color_selectors;
pixel_chunk_vec m_dbg_chunk_pixels_final_color_selectors;
pixel_chunk_vec m_dbg_chunk_pixels_orig_alpha_selectors;
pixel_chunk_vec m_dbg_chunk_pixels_quantized_alpha_selectors;
pixel_chunk_vec m_dbg_chunk_pixels_final_alpha_selectors;
pixel_chunk_vec m_dbg_chunk_pixels_final;
crnlib::vector<alpha_cluster> m_alpha_clusters;
crn_thread_id_t m_main_thread_id;
bool m_canceled;
@@ -356,84 +187,27 @@ namespace crnlib
int m_prev_phase_index;
int m_prev_percentage_complete;
typedef vec<6, float> vec6F;
typedef vec<16, float> vec16F;
typedef tree_clusterizer<vec2F> vec2F_tree_vq;
typedef tree_clusterizer<vec6F> vec6F_tree_vq;
typedef tree_clusterizer<vec16F> vec16F_tree_vq;
struct assign_color_endpoint_clusters_state
{
CRNLIB_NO_COPY_OR_ASSIGNMENT_OP(assign_color_endpoint_clusters_state);
assign_color_endpoint_clusters_state(vec6F_tree_vq& vq, crnlib::vector< crnlib::vector<vec6F> >& training_vecs) :
m_vq(vq), m_training_vecs(training_vecs) { }
vec6F_tree_vq& m_vq;
crnlib::vector< crnlib::vector<vec6F> >& m_training_vecs;
};
struct create_selector_codebook_state
{
CRNLIB_NO_COPY_OR_ASSIGNMENT_OP(create_selector_codebook_state);
create_selector_codebook_state(dxt_hc& hc, bool alpha_blocks, uint comp_index_start, uint comp_index_end, vec16F_tree_vq& selector_vq, chunk_blocks_using_selectors_vec& chunk_blocks_using_selectors, selectors_vec& selectors_cb) :
m_hc(hc),
m_alpha_blocks(alpha_blocks),
m_comp_index_start(comp_index_start),
m_comp_index_end(comp_index_end),
m_selector_vq(selector_vq),
m_chunk_blocks_using_selectors(chunk_blocks_using_selectors),
m_selectors_cb(selectors_cb)
{
}
dxt_hc& m_hc;
bool m_alpha_blocks;
uint m_comp_index_start;
uint m_comp_index_end;
vec16F_tree_vq& m_selector_vq;
chunk_blocks_using_selectors_vec& m_chunk_blocks_using_selectors;
selectors_vec& m_selectors_cb;
mutable spinlock m_chunk_blocks_using_selectors_lock;
};
void assign_color_endpoint_clusters_task(uint64 data, void* pData_ptr);
bool determine_color_endpoint_clusters();
struct determine_alpha_endpoint_clusters_state
{
vec2F_tree_vq m_vq;
crnlib::vector< crnlib::vector<vec2F> > m_training_vecs[2];
};
void determine_alpha_endpoint_clusters_task(uint64 data, void* pData_ptr);
bool determine_alpha_endpoint_clusters();
vec<6, float> palettize_color(color_quad_u8* pixels, uint pixels_count);
vec<2, float> palettize_alpha(color_quad_u8* pixels, uint pixels_count, uint comp_index);
void determine_tiles_task(uint64 data, void* pData_ptr);
void determine_tiles_task_etc(uint64 data, void* pData_ptr);
void determine_color_endpoint_codebook_task(uint64 data, void* pData_ptr);
bool determine_color_endpoint_codebook();
void determine_color_endpoint_codebook_task_etc(uint64 data, void* pData_ptr);
void determine_color_endpoint_clusters_task(uint64 data, void* pData_ptr);
void determine_color_endpoints();
void determine_alpha_endpoint_codebook_task(uint64 data, void* pData_ptr);
bool determine_alpha_endpoint_codebook();
void determine_alpha_endpoint_clusters_task(uint64 data, void* pData_ptr);
void determine_alpha_endpoints();
void create_quantized_debug_images();
void create_color_selector_codebook_task(uint64 data, void* pData_ptr);
void create_color_selector_codebook();
void create_selector_codebook_task(uint64 data, void* pData_ptr);
bool create_selector_codebook(bool alpha_blocks);
void create_alpha_selector_codebook_task(uint64 data, void* pData_ptr);
void create_alpha_selector_codebook();
bool refine_quantized_color_endpoints();
bool refine_quantized_color_selectors();
bool refine_quantized_alpha_endpoints();
bool refine_quantized_alpha_selectors();
void create_final_debug_image();
bool create_chunk_encodings();
bool update_progress(uint phase_index, uint subphase_index, uint subphase_total);
bool compress_internal(const params& p, uint num_chunks, const pixel_chunk* pChunks);
};
CRNLIB_DEFINE_BITWISE_COPYABLE(dxt_hc::pixel_chunk);
CRNLIB_DEFINE_BITWISE_COPYABLE(dxt_hc::chunk_encoding);
CRNLIB_DEFINE_BITWISE_COPYABLE(dxt_hc::selectors);
} // namespace crnlib
+3 -9
@@ -3,8 +3,7 @@
#include "crn_core.h"
#include "crn_dxt_hc_common.h"
namespace crnlib
{
namespace crnlib {
chunk_encoding_desc g_chunk_encodings[cNumChunkEncodings] =
{
{1, {{0, 0, 8, 8, 0}}},
@@ -18,8 +17,7 @@ namespace crnlib
{3, {{0, 0, 4, 8, 3}, {4, 0, 4, 4, 6}, {4, 4, 4, 4, 8}}},
{3, {{4, 0, 4, 8, 4}, {0, 0, 4, 4, 5}, {0, 4, 4, 4, 7}}},
{ 4, { { 0, 0, 4, 4, 5 }, { 4, 0, 4, 4, 6 }, { 0, 4, 4, 4, 7 }, { 4, 4, 4, 4, 8 } } }
};
{4, {{0, 0, 4, 4, 5}, {4, 0, 4, 4, 6}, {0, 4, 4, 4, 7}, {4, 4, 4, 4, 8}}}};
chunk_tile_desc g_chunk_tile_layouts[cNumChunkTileLayouts] =
{
@@ -38,10 +36,6 @@ namespace crnlib
{0, 0, 4, 4, 5},
{4, 0, 4, 4, 6},
{0, 4, 4, 4, 7},
{ 4, 4, 4, 4, 8 }
};
{4, 4, 4, 4, 8}};
} // namespace crnlib
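Each entry in the table above lists a tile count followed by up to four tiles, and every tile appears to be written as {x offset, y offset, width, height, layout index} in pixels within an 8x8 chunk. The sketch below checks that the tiles of one entry cover the chunk exactly once. The type names `tile_desc` and `encoding_desc`, the `width`/`height` field names, and the helper `covers_chunk` are hypothetical stand-ins for illustration and are not part of crnlib.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of chunk_tile_desc (width/height field names are assumed).
struct tile_desc {
  uint32_t x_ofs, y_ofs, width, height, layout_index;
};

// Hypothetical mirror of chunk_encoding_desc.
struct encoding_desc {
  uint32_t num_tiles;
  tile_desc tiles[4];
};

// Returns true if the tiles cover every pixel of an 8x8 chunk exactly once.
inline bool covers_chunk(const encoding_desc& e) {
  assert(e.num_tiles <= 4);
  uint64_t mask = 0;  // one bit per pixel of the 8x8 chunk
  for (uint32_t t = 0; t < e.num_tiles; t++) {
    const tile_desc& d = e.tiles[t];
    if (d.x_ofs + d.width > 8 || d.y_ofs + d.height > 8)
      return false;  // tile leaves the chunk
    for (uint32_t y = d.y_ofs; y < d.y_ofs + d.height; y++) {
      for (uint32_t x = d.x_ofs; x < d.x_ofs + d.width; x++) {
        const uint64_t bit = 1ULL << (y * 8 + x);
        if (mask & bit)
          return false;  // two tiles overlap
        mask |= bit;
      }
    }
  }
  return mask == ~0ULL;
}

// Three of the entries shown in the diff above.
const encoding_desc k_one_tile = {1, {{0, 0, 8, 8, 0}}};
const encoding_desc k_three_tiles = {3, {{0, 0, 4, 8, 3}, {4, 0, 4, 4, 6}, {4, 4, 4, 4, 8}}};
const encoding_desc k_four_tiles = {4, {{0, 0, 4, 4, 5}, {4, 0, 4, 4, 6}, {0, 4, 4, 4, 7}, {4, 4, 4, 4, 8}}};
```

A check like this could run once at startup or in a unit test. It only validates geometry and says nothing about how the layout index is interpreted.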
+3 -6
@@ -2,10 +2,8 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
namespace crnlib
{
struct chunk_tile_desc
{
namespace crnlib {
struct chunk_tile_desc {
// These values are in pixels, and always a multiple of cBlockPixelWidth/cBlockPixelHeight.
uint m_x_ofs;
uint m_y_ofs;
@@ -14,8 +12,7 @@ namespace crnlib
uint m_layout_index;
};
struct chunk_encoding_desc
{
struct chunk_encoding_desc {
uint m_num_tiles;
chunk_tile_desc m_tiles[4];
};
+456 -504
File diff suppressed because it is too large
+40 -32
@@ -11,12 +11,10 @@
#define CRNLIB_SUPPORT_ATI_COMPRESS 0
namespace crnlib
{
namespace crnlib {
class task_pool;
class dxt_image
{
class dxt_image {
public:
dxt_image();
dxt_image(const dxt_image& other);
@@ -38,13 +36,12 @@ namespace crnlib
dxt_format get_format() const { return m_format; }
bool has_color() const { return (m_format == cDXT1) || (m_format == cDXT1A) || (m_format == cDXT3) || (m_format == cDXT5) || (m_format == cETC1); }
bool has_color() const { return (m_format == cDXT1) || (m_format == cDXT1A) || (m_format == cDXT3) || (m_format == cDXT5) || (m_format == cETC1) || (m_format == cETC2) || (m_format == cETC2A) || (m_format == cETC1S) || (m_format == cETC2AS); }
// Will be pretty slow if the image is DXT1, as this method scans for alpha blocks/selectors.
bool has_alpha() const;
enum element_type
{
enum element_type {
cUnused = 0,
cColorDXT1, // DXT1 color block
@@ -53,25 +50,46 @@ namespace crnlib
cAlphaDXT5, // DXT5 alpha block (only)
cColorETC1, // ETC1 color block
cColorETC2, // ETC2 color block
cAlphaETC2, // ETC2 alpha block (only)
};
element_type get_element_type(uint element_index) const { CRNLIB_ASSERT(element_index < m_num_elements_per_block); return m_element_type[element_index]; }
element_type get_element_type(uint element_index) const {
CRNLIB_ASSERT(element_index < m_num_elements_per_block);
return m_element_type[element_index];
}
//Returns -1 for RGB, or [0,3]
int8 get_element_component_index(uint element_index) const { CRNLIB_ASSERT(element_index < m_num_elements_per_block); return m_element_component_index[element_index]; }
int8 get_element_component_index(uint element_index) const {
CRNLIB_ASSERT(element_index < m_num_elements_per_block);
return m_element_component_index[element_index];
}
struct element
{
struct element {
uint8 m_bytes[8];
uint get_le_word(uint index) const { CRNLIB_ASSERT(index < 4); return m_bytes[index*2] | (m_bytes[index * 2 + 1] << 8); }
uint get_be_word(uint index) const { CRNLIB_ASSERT(index < 4); return m_bytes[index*2 + 1] | (m_bytes[index * 2] << 8); }
uint get_le_word(uint index) const {
CRNLIB_ASSERT(index < 4);
return m_bytes[index * 2] | (m_bytes[index * 2 + 1] << 8);
}
uint get_be_word(uint index) const {
CRNLIB_ASSERT(index < 4);
return m_bytes[index * 2 + 1] | (m_bytes[index * 2] << 8);
}
void set_le_word(uint index, uint val) { CRNLIB_ASSERT((index < 4) && (val <= cUINT16_MAX)); m_bytes[index*2] = static_cast<uint8>(val & 0xFF); m_bytes[index * 2 + 1] = static_cast<uint8>((val >> 8) & 0xFF); }
void set_be_word(uint index, uint val) { CRNLIB_ASSERT((index < 4) && (val <= cUINT16_MAX)); m_bytes[index*2+1] = static_cast<uint8>(val & 0xFF); m_bytes[index * 2] = static_cast<uint8>((val >> 8) & 0xFF); }
void set_le_word(uint index, uint val) {
CRNLIB_ASSERT((index < 4) && (val <= cUINT16_MAX));
m_bytes[index * 2] = static_cast<uint8>(val & 0xFF);
m_bytes[index * 2 + 1] = static_cast<uint8>((val >> 8) & 0xFF);
}
void set_be_word(uint index, uint val) {
CRNLIB_ASSERT((index < 4) && (val <= cUINT16_MAX));
m_bytes[index * 2 + 1] = static_cast<uint8>(val & 0xFF);
m_bytes[index * 2] = static_cast<uint8>((val >> 8) & 0xFF);
}
void clear()
{
void clear() {
memset(this, 0, sizeof(*this));
}
};
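The word accessors in `element` treat the 8-byte block as four 16-bit words and differ only in which byte of each pair holds the low 8 bits. A minimal standalone sketch of the same byte arithmetic follows. The struct `element8` and the free functions are illustrative names, with `assert` standing in for `CRNLIB_ASSERT`.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative stand-in for dxt_image::element: 8 raw bytes, four 16-bit words.
struct element8 {
  uint8_t bytes[8];
};

// Little endian: the low byte of word `index` is stored first.
inline uint32_t get_le_word(const element8& e, uint32_t index) {
  assert(index < 4);
  return e.bytes[index * 2] | (e.bytes[index * 2 + 1] << 8);
}

// Big endian: the high byte of word `index` is stored first.
inline uint32_t get_be_word(const element8& e, uint32_t index) {
  assert(index < 4);
  return e.bytes[index * 2 + 1] | (e.bytes[index * 2] << 8);
}

inline void set_le_word(element8& e, uint32_t index, uint32_t val) {
  assert((index < 4) && (val <= 0xFFFF));
  e.bytes[index * 2] = static_cast<uint8_t>(val & 0xFF);
  e.bytes[index * 2 + 1] = static_cast<uint8_t>((val >> 8) & 0xFF);
}

inline void set_be_word(element8& e, uint32_t index, uint32_t val) {
  assert((index < 4) && (val <= 0xFFFF));
  e.bytes[index * 2 + 1] = static_cast<uint8_t>(val & 0xFF);
  e.bytes[index * 2] = static_cast<uint8_t>((val >> 8) & 0xFF);
}
```

Reading a word with the opposite accessor from the one that wrote it yields the byte-swapped value, which is a quick way to spot an endianness mix-up between DXT-style and ETC-style blocks.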
@@ -81,15 +99,12 @@ namespace crnlib
bool init(dxt_format fmt, uint width, uint height, bool clear_elements);
bool init(dxt_format fmt, uint width, uint height, uint num_elements, element* pElements, bool create_copy);
struct pack_params
{
pack_params()
{
struct pack_params {
pack_params() {
clear();
}
void clear()
{
void clear() {
m_quality = cCRNDXTQualityUber;
m_perceptual = true;
m_dithering = false;
@@ -105,13 +120,9 @@ namespace crnlib
m_progress_range = 100;
m_use_transparent_indices_for_black = false;
m_pTask_pool = NULL;
m_color_weights[0] = 1;
m_color_weights[1] = 1;
m_color_weights[2] = 1;
}
void init(const crn_comp_params &params)
{
void init(const crn_comp_params& params) {
m_perceptual = (params.m_flags & cCRNCompFlagPerceptual) != 0;
m_num_helper_threads = params.m_num_helper_threads;
m_use_both_block_types = (params.m_flags & cCRNCompFlagUseBothBlockTypes) != 0;
@@ -146,8 +157,6 @@ namespace crnlib
uint m_progress_range;
task_pool* m_pTask_pool;
int m_color_weights[3];
};
bool init(dxt_format fmt, const image_u8& img, const pack_params& p = dxt_image::pack_params());
@@ -178,8 +187,7 @@ namespace crnlib
// get_block_pixels() only sets those components stored in the image!
bool get_block_pixels(uint block_x, uint block_y, color_quad_u8* pPixels) const;
struct set_block_pixels_context
{
struct set_block_pixels_context {
dxt1_endpoint_optimizer m_dxt1_optimizer;
dxt5_endpoint_optimizer m_dxt5_optimizer;
pack_etc1_block_context m_etc1_optimizer;
+29 -53
@@ -3,38 +3,31 @@
#pragma once
#include "crn_data_stream.h"
namespace crnlib
{
class dynamic_stream : public data_stream
{
namespace crnlib {
class dynamic_stream : public data_stream {
public:
dynamic_stream(uint initial_size, const char* pName = "dynamic_stream", uint attribs = cDataStreamSeekable | cDataStreamWritable | cDataStreamReadable) :
data_stream(pName, attribs),
m_ofs(0)
{
dynamic_stream(uint initial_size, const char* pName = "dynamic_stream", uint attribs = cDataStreamSeekable | cDataStreamWritable | cDataStreamReadable)
: data_stream(pName, attribs),
m_ofs(0) {
open(initial_size, pName, attribs);
}
dynamic_stream(const void* pBuf, uint size, const char* pName = "dynamic_stream", uint attribs = cDataStreamSeekable | cDataStreamWritable | cDataStreamReadable) :
data_stream(pName, attribs),
m_ofs(0)
{
dynamic_stream(const void* pBuf, uint size, const char* pName = "dynamic_stream", uint attribs = cDataStreamSeekable | cDataStreamWritable | cDataStreamReadable)
: data_stream(pName, attribs),
m_ofs(0) {
open(pBuf, size, pName, attribs);
}
dynamic_stream() :
data_stream(),
m_ofs(0)
{
dynamic_stream()
: data_stream(),
m_ofs(0) {
open();
}
virtual ~dynamic_stream()
{
virtual ~dynamic_stream() {
}
bool open(uint initial_size = 0, const char* pName = "dynamic_stream", uint attribs = cDataStreamSeekable | cDataStreamWritable | cDataStreamReadable)
{
bool open(uint initial_size = 0, const char* pName = "dynamic_stream", uint attribs = cDataStreamSeekable | cDataStreamWritable | cDataStreamReadable) {
close();
m_opened = true;
@@ -46,10 +39,8 @@ namespace crnlib
return true;
}
bool reopen(const char* pName, uint attribs)
{
if (!m_opened)
{
bool reopen(const char* pName, uint attribs) {
if (!m_opened) {
return open(0, pName, attribs);
}
@@ -58,14 +49,11 @@ namespace crnlib
return true;
}
bool open(const void* pBuf, uint size, const char* pName = "dynamic_stream", uint attribs = cDataStreamSeekable | cDataStreamWritable | cDataStreamReadable)
{
if (!m_opened)
{
bool open(const void* pBuf, uint size, const char* pName = "dynamic_stream", uint attribs = cDataStreamSeekable | cDataStreamWritable | cDataStreamReadable) {
if (!m_opened) {
m_opened = true;
m_buf.resize(size);
if (size)
{
if (size) {
CRNLIB_ASSERT(pBuf);
memcpy(&m_buf[0], pBuf, size);
}
@@ -78,10 +66,8 @@ namespace crnlib
return false;
}
virtual bool close()
{
if (m_opened)
{
virtual bool close() {
if (m_opened) {
m_opened = false;
m_buf.clear();
m_ofs = 0;
@@ -94,18 +80,15 @@ namespace crnlib
const crnlib::vector<uint8>& get_buf() const { return m_buf; }
crnlib::vector<uint8>& get_buf() { return m_buf; }
void reserve(uint size)
{
if (m_opened)
{
void reserve(uint size) {
if (m_opened) {
m_buf.reserve(size);
}
}
virtual const void* get_ptr() const { return m_buf.empty() ? NULL : &m_buf[0]; }
virtual uint read(void* pBuf, uint len)
{
virtual uint read(void* pBuf, uint len) {
CRNLIB_ASSERT(pBuf && (len <= 0x7FFFFFFF));
if ((!m_opened) || (!is_readable()) || (!len))
@@ -125,8 +108,7 @@ namespace crnlib
return len;
}
virtual uint write(const void* pBuf, uint len)
{
virtual uint write(const void* pBuf, uint len) {
CRNLIB_ASSERT(pBuf && (len <= 0x7FFFFFFF));
if ((!m_opened) || (!is_writable()) || (!len))
@@ -144,24 +126,21 @@ namespace crnlib
return len;
}
virtual bool flush()
{
virtual bool flush() {
if (!m_opened)
return false;
return true;
}
virtual uint64 get_size()
{
virtual uint64 get_size() {
if (!m_opened)
return 0;
return m_buf.size();
}
virtual uint64 get_remaining()
{
virtual uint64 get_remaining() {
if (!m_opened)
return 0;
@@ -170,16 +149,14 @@ namespace crnlib
return m_buf.size() - m_ofs;
}
virtual uint64 get_ofs()
{
virtual uint64 get_ofs() {
if (!m_opened)
return 0;
return m_ofs;
}
virtual bool seek(int64 ofs, bool relative)
{
virtual bool seek(int64 ofs, bool relative) {
if ((!m_opened) || (!is_seekable()))
return false;
@@ -203,4 +180,3 @@ namespace crnlib
};
} // namespace crnlib
+87 -172
@@ -3,15 +3,11 @@
#include "crn_core.h"
#include "crn_strutils.h"
namespace crnlib
{
namespace crnlib {
dynamic_string g_empty_dynamic_string;
dynamic_string::dynamic_string(eVarArg dummy, const char* p, ...) :
m_buf_size(0), m_len(0), m_pStr(NULL)
{
dummy;
dynamic_string::dynamic_string(eVarArg, const char* p, ...)
: m_buf_size(0), m_len(0), m_pStr(NULL) {
CRNLIB_ASSERT(p);
va_list args;
@@ -20,32 +16,27 @@ namespace crnlib
va_end(args);
}
dynamic_string::dynamic_string(const char* p) :
m_buf_size(0), m_len(0), m_pStr(NULL)
{
dynamic_string::dynamic_string(const char* p)
: m_buf_size(0), m_len(0), m_pStr(NULL) {
CRNLIB_ASSERT(p);
set(p);
}
dynamic_string::dynamic_string(const char* p, uint len) :
m_buf_size(0), m_len(0), m_pStr(NULL)
{
dynamic_string::dynamic_string(const char* p, uint len)
: m_buf_size(0), m_len(0), m_pStr(NULL) {
CRNLIB_ASSERT(p);
set_from_buf(p, len);
}
dynamic_string::dynamic_string(const dynamic_string& other) :
m_buf_size(0), m_len(0), m_pStr(NULL)
{
dynamic_string::dynamic_string(const dynamic_string& other)
: m_buf_size(0), m_len(0), m_pStr(NULL) {
set(other);
}
void dynamic_string::clear()
{
void dynamic_string::clear() {
check();
if (m_pStr)
{
if (m_pStr) {
crnlib_delete_array(m_pStr);
m_pStr = NULL;
@@ -54,20 +45,16 @@ namespace crnlib
}
}
void dynamic_string::empty()
{
void dynamic_string::empty() {
truncate(0);
}
void dynamic_string::optimize()
{
void dynamic_string::optimize() {
if (!m_len)
clear();
else
{
else {
uint min_buf_size = math::next_pow2((uint)m_len + 1);
if (m_buf_size > min_buf_size)
{
if (m_buf_size > min_buf_size) {
char* p = crnlib_new_array<char>(min_buf_size);
memcpy(p, m_pStr, m_len + 1);
@@ -81,8 +68,7 @@ namespace crnlib
}
}
int dynamic_string::compare(const char* p, bool case_sensitive) const
{
int dynamic_string::compare(const char* p, bool case_sensitive) const {
CRNLIB_ASSERT(p);
const int result = (case_sensitive ? strcmp : crn_stricmp)(get_ptr_priv(), p);
@@ -95,13 +81,11 @@ namespace crnlib
return 0;
}
int dynamic_string::compare(const dynamic_string& rhs, bool case_sensitive) const
{
int dynamic_string::compare(const dynamic_string& rhs, bool case_sensitive) const {
return compare(rhs.get_ptr_priv(), case_sensitive);
}
dynamic_string& dynamic_string::set(const char* p, uint max_len)
{
dynamic_string& dynamic_string::set(const char* p, uint max_len) {
CRNLIB_ASSERT(p);
const uint len = math::minimum<uint>(max_len, static_cast<uint>(strlen(p)));
@@ -109,15 +93,12 @@ namespace crnlib
if ((!len) || (len >= cUINT16_MAX))
clear();
else if ((m_pStr) && (p >= m_pStr) && (p < (m_pStr + m_buf_size)))
{
else if ((m_pStr) && (p >= m_pStr) && (p < (m_pStr + m_buf_size))) {
if (m_pStr != p)
memmove(m_pStr, p, len);
m_pStr[len] = '\0';
m_len = static_cast<uint16>(len);
}
else if (ensure_buf(len, false))
{
} else if (ensure_buf(len, false)) {
m_len = static_cast<uint16>(len);
memcpy(m_pStr, p, m_len + 1);
}
@@ -127,24 +108,18 @@ namespace crnlib
return *this;
}
dynamic_string& dynamic_string::set(const dynamic_string& other, uint max_len)
{
if (this == &other)
{
if (max_len < m_len)
{
dynamic_string& dynamic_string::set(const dynamic_string& other, uint max_len) {
if (this == &other) {
if (max_len < m_len) {
m_pStr[max_len] = '\0';
m_len = static_cast<uint16>(max_len);
}
}
else
{
} else {
const uint len = math::minimum<uint>(max_len, other.m_len);
if (!len)
clear();
else if (ensure_buf(len, false))
{
else if (ensure_buf(len, false)) {
m_len = static_cast<uint16>(len);
memcpy(m_pStr, other.get_ptr_priv(), m_len);
m_pStr[len] = '\0';
@@ -156,18 +131,15 @@ namespace crnlib
return *this;
}
bool dynamic_string::set_len(uint new_len, char fill_char)
{
if ((new_len >= cUINT16_MAX) || (!fill_char))
{
bool dynamic_string::set_len(uint new_len, char fill_char) {
if ((new_len >= cUINT16_MAX) || (!fill_char)) {
CRNLIB_ASSERT(0);
return false;
}
uint cur_len = m_len;
if (ensure_buf(new_len, true))
{
if (ensure_buf(new_len, true)) {
if (new_len > cur_len)
memset(m_pStr + cur_len, fill_char, new_len - cur_len);
@@ -181,8 +153,7 @@ namespace crnlib
return true;
}
dynamic_string& dynamic_string::set_from_raw_buf_and_assume_ownership(char *pBuf, uint buf_size_in_chars, uint len_in_chars)
{
dynamic_string& dynamic_string::set_from_raw_buf_and_assume_ownership(char* pBuf, uint buf_size_in_chars, uint len_in_chars) {
CRNLIB_ASSERT(buf_size_in_chars <= cUINT16_MAX);
CRNLIB_ASSERT(math::is_power_of_2(buf_size_in_chars) || (buf_size_in_chars == cUINT16_MAX));
CRNLIB_ASSERT((len_in_chars + 1) <= buf_size_in_chars);
@@ -198,27 +169,23 @@ namespace crnlib
return *this;
}
dynamic_string& dynamic_string::set_from_buf(const void* pBuf, uint buf_size)
{
dynamic_string& dynamic_string::set_from_buf(const void* pBuf, uint buf_size) {
CRNLIB_ASSERT(pBuf);
if (buf_size >= cUINT16_MAX)
{
if (buf_size >= cUINT16_MAX) {
clear();
return *this;
}
#ifdef CRNLIB_BUILD_DEBUG
if ((buf_size) && (memchr(pBuf, 0, buf_size) != NULL))
{
if ((buf_size) && (memchr(pBuf, 0, buf_size) != NULL)) {
CRNLIB_ASSERT(0);
clear();
return *this;
}
#endif
if (ensure_buf(buf_size, false))
{
if (ensure_buf(buf_size, false)) {
if (buf_size)
memcpy(m_pStr, pBuf, buf_size);
@@ -232,28 +199,23 @@ namespace crnlib
return *this;
}
dynamic_string& dynamic_string::set_char(uint index, char c)
{
dynamic_string& dynamic_string::set_char(uint index, char c) {
CRNLIB_ASSERT(index <= m_len);
if (!c)
truncate(index);
else if (index < m_len)
{
else if (index < m_len) {
m_pStr[index] = c;
check();
}
else if (index == m_len)
} else if (index == m_len)
append_char(c);
return *this;
}
dynamic_string& dynamic_string::append_char(char c)
{
if (ensure_buf(m_len + 1))
{
dynamic_string& dynamic_string::append_char(char c) {
if (ensure_buf(m_len + 1)) {
m_pStr[m_len] = c;
m_pStr[m_len + 1] = '\0';
m_len++;
@@ -263,10 +225,8 @@ namespace crnlib
return *this;
}
dynamic_string& dynamic_string::truncate(uint new_len)
{
if (new_len < m_len)
{
dynamic_string& dynamic_string::truncate(uint new_len) {
if (new_len < m_len) {
m_pStr[new_len] = '\0';
m_len = static_cast<uint16>(new_len);
check();
@@ -274,10 +234,8 @@ namespace crnlib
return *this;
}
dynamic_string& dynamic_string::tolower()
{
if (m_len)
{
dynamic_string& dynamic_string::tolower() {
if (m_len) {
#ifdef _MSC_VER
_strlwr_s(get_ptr_priv(), m_buf_size);
#else
@@ -287,10 +245,8 @@ namespace crnlib
return *this;
}
dynamic_string& dynamic_string::toupper()
{
if (m_len)
{
dynamic_string& dynamic_string::toupper() {
if (m_len) {
#ifdef _MSC_VER
_strupr_s(get_ptr_priv(), m_buf_size);
#else
@@ -300,14 +256,12 @@ namespace crnlib
return *this;
}
dynamic_string& dynamic_string::append(const char* p)
{
dynamic_string& dynamic_string::append(const char* p) {
CRNLIB_ASSERT(p);
uint len = static_cast<uint>(strlen(p));
uint new_total_len = m_len + len;
if ((new_total_len) && ensure_buf(new_total_len))
{
if ((new_total_len) && ensure_buf(new_total_len)) {
memcpy(m_pStr + m_len, p, len + 1);
m_len = static_cast<uint16>(m_len + len);
check();
@@ -316,12 +270,10 @@ namespace crnlib
return *this;
}
dynamic_string& dynamic_string::append(const dynamic_string& other)
{
dynamic_string& dynamic_string::append(const dynamic_string& other) {
uint len = other.m_len;
uint new_total_len = m_len + len;
if ((new_total_len) && ensure_buf(new_total_len))
{
if ((new_total_len) && ensure_buf(new_total_len)) {
memcpy(m_pStr + m_len, other.get_ptr_priv(), len + 1);
m_len = static_cast<uint16>(m_len + len);
check();
@@ -330,23 +282,19 @@ namespace crnlib
return *this;
}
dynamic_string operator+ (const char* p, const dynamic_string& a)
{
dynamic_string operator+(const char* p, const dynamic_string& a) {
return dynamic_string(p).append(a);
}
dynamic_string operator+ (const dynamic_string& a, const char* p)
{
dynamic_string operator+(const dynamic_string& a, const char* p) {
return dynamic_string(a).append(p);
}
dynamic_string operator+ (const dynamic_string& a, const dynamic_string& b)
{
dynamic_string operator+(const dynamic_string& a, const dynamic_string& b) {
return dynamic_string(a).append(b);
}
dynamic_string& dynamic_string::format_args(const char* p, va_list args)
{
dynamic_string& dynamic_string::format_args(const char* p, va_list args) {
CRNLIB_ASSERT(p);
const uint cBufSize = 4096;
@@ -359,8 +307,7 @@ namespace crnlib
#endif
if (l <= 0)
clear();
else if (ensure_buf(l, false))
{
else if (ensure_buf(l, false)) {
memcpy(m_pStr, buf, l + 1);
m_len = static_cast<uint16>(l);
@@ -371,8 +318,7 @@ namespace crnlib
return *this;
}
dynamic_string& dynamic_string::format(const char* p, ...)
{
dynamic_string& dynamic_string::format(const char* p, ...) {
CRNLIB_ASSERT(p);
va_list args;
@@ -382,10 +328,8 @@ namespace crnlib
return *this;
}
dynamic_string& dynamic_string::crop(uint start, uint len)
{
if (start >= m_len)
{
dynamic_string& dynamic_string::crop(uint start, uint len) {
if (start >= m_len) {
clear();
return *this;
}
@@ -404,40 +348,32 @@ namespace crnlib
return *this;
}
dynamic_string& dynamic_string::substring(uint start, uint end)
{
dynamic_string& dynamic_string::substring(uint start, uint end) {
CRNLIB_ASSERT(start <= end);
if (start > end)
return *this;
return crop(start, end - start);
}
dynamic_string& dynamic_string::left(uint len)
{
dynamic_string& dynamic_string::left(uint len) {
return substring(0, len);
}
dynamic_string& dynamic_string::mid(uint start, uint len)
{
dynamic_string& dynamic_string::mid(uint start, uint len) {
return crop(start, len);
}
dynamic_string& dynamic_string::right(uint start)
{
dynamic_string& dynamic_string::right(uint start) {
return substring(start, get_len());
}
dynamic_string& dynamic_string::tail(uint num)
{
dynamic_string& dynamic_string::tail(uint num) {
return substring(math::maximum<int>(static_cast<int>(get_len()) - static_cast<int>(num), 0), get_len());
}
dynamic_string& dynamic_string::unquote()
{
if (m_len >= 2)
{
if ( ((*this)[0] == '\"') && ((*this)[m_len - 1] == '\"') )
{
dynamic_string& dynamic_string::unquote() {
if (m_len >= 2) {
if (((*this)[0] == '\"') && ((*this)[m_len - 1] == '\"')) {
return mid(1, m_len - 2);
}
}
@@ -445,8 +381,7 @@ namespace crnlib
return *this;
}
int dynamic_string::find_left(const char* p, bool case_sensitive) const
{
int dynamic_string::find_left(const char* p, bool case_sensitive) const {
CRNLIB_ASSERT(p);
const int p_len = (int)strlen(p);
@@ -458,13 +393,11 @@ namespace crnlib
return -1;
}
bool dynamic_string::contains(const char* p, bool case_sensitive) const
{
bool dynamic_string::contains(const char* p, bool case_sensitive) const {
return find_left(p, case_sensitive) >= 0;
}
uint dynamic_string::count_char(char c) const
{
uint dynamic_string::count_char(char c) const {
uint count = 0;
for (uint i = 0; i < m_len; i++)
if (m_pStr[i] == c)
@@ -472,24 +405,21 @@ namespace crnlib
return count;
}
int dynamic_string::find_left(char c) const
{
int dynamic_string::find_left(char c) const {
for (uint i = 0; i < m_len; i++)
if (m_pStr[i] == c)
return i;
return -1;
}
int dynamic_string::find_right(char c) const
{
int dynamic_string::find_right(char c) const {
for (int i = (int)m_len - 1; i >= 0; i--)
if (m_pStr[i] == c)
return i;
return -1;
}
int dynamic_string::find_right(const char* p, bool case_sensitive) const
{
int dynamic_string::find_right(const char* p, bool case_sensitive) const {
CRNLIB_ASSERT(p);
const int p_len = (int)strlen(p);
@@ -500,8 +430,7 @@ namespace crnlib
return -1;
}
dynamic_string& dynamic_string::trim()
{
dynamic_string& dynamic_string::trim() {
int s, e;
for (s = 0; s < (int)m_len; s++)
if (!isspace(m_pStr[s]))
@@ -514,8 +443,7 @@ namespace crnlib
return crop(s, e - s + 1);
}
dynamic_string& dynamic_string::trim_crlf()
{
dynamic_string& dynamic_string::trim_crlf() {
int s = 0, e;
for (e = m_len - 1; e > s; e--)
@@ -525,8 +453,7 @@ namespace crnlib
return crop(s, e - s + 1);
}
dynamic_string& dynamic_string::remap(int from_char, int to_char)
{
dynamic_string& dynamic_string::remap(int from_char, int to_char) {
for (uint i = 0; i < m_len; i++)
if (m_pStr[i] == from_char)
m_pStr[i] = (char)to_char;
@@ -534,14 +461,10 @@ namespace crnlib
}
#ifdef CRNLIB_BUILD_DEBUG
void dynamic_string::check() const
{
if (!m_pStr)
{
void dynamic_string::check() const {
if (!m_pStr) {
CRNLIB_ASSERT(!m_buf_size && !m_len);
}
else
{
} else {
CRNLIB_ASSERT(m_buf_size);
CRNLIB_ASSERT((m_buf_size == cUINT16_MAX) || math::is_power_of_2((uint32)m_buf_size));
CRNLIB_ASSERT(m_len < m_buf_size);
@@ -553,14 +476,12 @@ namespace crnlib
}
#endif
bool dynamic_string::ensure_buf(uint len, bool preserve_contents)
{
bool dynamic_string::ensure_buf(uint len, bool preserve_contents) {
uint buf_size_needed = len + 1;
CRNLIB_ASSERT(buf_size_needed <= cUINT16_MAX);
if (buf_size_needed <= cUINT16_MAX)
{
if (buf_size_needed <= cUINT16_MAX) {
if (buf_size_needed > m_buf_size)
expand_buf(buf_size_needed, preserve_contents);
}
@@ -568,12 +489,10 @@ namespace crnlib
return m_buf_size >= buf_size_needed;
}
bool dynamic_string::expand_buf(uint new_buf_size, bool preserve_contents)
{
bool dynamic_string::expand_buf(uint new_buf_size, bool preserve_contents) {
new_buf_size = math::minimum<uint>(cUINT16_MAX, math::next_pow2(math::maximum<uint>(m_buf_size, new_buf_size)));
if (new_buf_size != m_buf_size)
{
if (new_buf_size != m_buf_size) {
char* p = crnlib_new_array<char>(new_buf_size);
if (preserve_contents)
@@ -591,15 +510,13 @@ namespace crnlib
return m_buf_size >= new_buf_size;
}
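The sizing rule used by `ensure_buf` and `expand_buf` can be summarized as: reserve one extra char for the terminator, round up to a power of two, and cap the result at `cUINT16_MAX`. The sketch below reproduces only that arithmetic. The function `grown_buf_size` is a hypothetical helper, and the local `next_pow2` is an assumption about how `math::next_pow2` behaves (a power of two maps to itself).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed behaviour of math::next_pow2: round up to a power of two,
// leaving exact powers of two unchanged. Requires v >= 1.
inline uint32_t next_pow2(uint32_t v) {
  v--;
  v |= v >> 1;
  v |= v >> 2;
  v |= v >> 4;
  v |= v >> 8;
  v |= v >> 16;
  return v + 1;
}

// Hypothetical helper: buffer size after making room for `len` chars
// plus the terminating '\0', starting from a buffer of `cur_size` chars.
inline uint32_t grown_buf_size(uint32_t cur_size, uint32_t len) {
  const uint32_t cMax = 0xFFFF;  // value of cUINT16_MAX
  const uint32_t needed = len + 1;
  assert(needed <= cMax);
  if (needed <= cur_size)
    return cur_size;  // already large enough, no reallocation
  const uint32_t grown = next_pow2(std::max<uint32_t>(cur_size, needed));
  return std::min<uint32_t>(cMax, grown);
}
```

Under these assumptions a string of 5 chars gets an 8-char buffer, and any request that would round up to 65536 is clamped to 65535, which matches the `m_buf_size == cUINT16_MAX` exception in the debug `check()`.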
void dynamic_string::swap(dynamic_string& other)
{
void dynamic_string::swap(dynamic_string& other) {
utils::swap(other.m_buf_size, m_buf_size);
utils::swap(other.m_len, m_len);
utils::swap(other.m_pStr, m_pStr);
}
int dynamic_string::serialize(void* pBuf, uint buf_size, bool little_endian) const
{
int dynamic_string::serialize(void* pBuf, uint buf_size, bool little_endian) const {
uint buf_left = buf_size;
//if (m_len > cUINT16_MAX)
@@ -619,11 +536,11 @@ namespace crnlib
return buf_size - buf_left;
}
int dynamic_string::deserialize(const void* pBuf, uint buf_size, bool little_endian)
{
int dynamic_string::deserialize(const void* pBuf, uint buf_size, bool little_endian) {
uint buf_left = buf_size;
if (buf_left < sizeof(uint16)) return -1;
if (buf_left < sizeof(uint16))
return -1;
uint16 l;
if (!utils::read_obj(l, pBuf, buf_left, little_endian))
@@ -639,8 +556,7 @@ namespace crnlib
return buf_size - buf_left;
}
void dynamic_string::translate_lf_to_crlf()
{
void dynamic_string::translate_lf_to_crlf() {
if (find_left(0x0A) < 0)
return;
@@ -650,8 +566,7 @@ namespace crnlib
// normal sequence is 0x0D 0x0A (CR LF, \r\n)
int prev_char = -1;
for (uint i = 0; i < get_len(); i++)
{
for (uint i = 0; i < get_len(); i++) {
const int cur_char = (*this)[i];
if ((cur_char == 0x0A) && (prev_char != 0x0D))
+27 -15
@@ -2,19 +2,21 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
namespace crnlib
{
namespace crnlib {
enum { cMaxDynamicStringLen = cUINT16_MAX - 1 };
class dynamic_string
{
class dynamic_string {
public:
inline dynamic_string() : m_buf_size(0), m_len(0), m_pStr(NULL) { }
inline dynamic_string()
: m_buf_size(0), m_len(0), m_pStr(NULL) {}
dynamic_string(eVarArg dummy, const char* p, ...);
dynamic_string(const char* p);
dynamic_string(const char* p, uint len);
dynamic_string(const dynamic_string& other);
inline ~dynamic_string() { if (m_pStr) crnlib_delete_array(m_pStr); }
inline ~dynamic_string() {
if (m_pStr)
crnlib_delete_array(m_pStr);
}
// Truncates the string to 0 chars and frees the buffer.
void clear();
@@ -22,7 +24,13 @@ namespace crnlib
// Truncates the string to 0 chars, but does not free the buffer.
void empty();
inline const char *assume_ownership() { const char *p = m_pStr; m_pStr = NULL; m_len = 0; m_buf_size = 0; return p; }
inline const char* assume_ownership() {
const char* p = m_pStr;
m_pStr = NULL;
m_len = 0;
m_buf_size = 0;
return p;
}
inline uint get_len() const { return m_len; }
inline bool is_empty() const { return !m_len; }
@@ -36,7 +44,10 @@ namespace crnlib
inline char front() const { return m_len ? m_pStr[0] : '\0'; }
inline char back() const { return m_len ? m_pStr[m_len - 1] : '\0'; }
inline char operator[] (uint i) const { CRNLIB_ASSERT(i <= m_len); return get_ptr()[i]; }
inline char operator[](uint i) const {
CRNLIB_ASSERT(i <= m_len);
return get_ptr()[i];
}
inline operator size_t() const { return fast_hash(get_ptr(), m_len) ^ fast_hash(&m_len, sizeof(m_len)); }
@@ -76,7 +87,10 @@ namespace crnlib
dynamic_string& set_char(uint index, char c);
dynamic_string& append_char(char c);
dynamic_string& append_char(int c) { CRNLIB_ASSERT((c >= 0) && (c <= 255)); return append_char(static_cast<char>(c)); }
dynamic_string& append_char(int c) {
CRNLIB_ASSERT((c >= 0) && (c <= 255));
return append_char(static_cast<char>(c));
}
dynamic_string& truncate(uint new_len);
dynamic_string& tolower();
dynamic_string& toupper();
@@ -130,6 +144,7 @@ namespace crnlib
static inline char* create_raw_buffer(uint& buf_size_in_chars);
static inline void free_raw_buffer(char* p) { crnlib_delete_array(p); }
dynamic_string& set_from_raw_buf_and_assume_ownership(char* pBuf, uint buf_size_in_chars, uint len_in_chars);
private:
uint16 m_buf_size;
uint16 m_len;
@@ -155,15 +170,12 @@ namespace crnlib
CRNLIB_DEFINE_BITWISE_MOVABLE(dynamic_string);
inline void swap (dynamic_string& a, dynamic_string& b)
{
inline void swap(dynamic_string& a, dynamic_string& b) {
a.swap(b);
}
inline char *dynamic_string::create_raw_buffer(uint& buf_size_in_chars)
{
if (buf_size_in_chars > cUINT16_MAX)
{
inline char* dynamic_string::create_raw_buffer(uint& buf_size_in_chars) {
if (buf_size_in_chars > cUINT16_MAX) {
CRNLIB_ASSERT(0);
return NULL;
}
+593 -448
File diff suppressed because it is too large
+72 -138
@@ -4,10 +4,8 @@
#include "../inc/crnlib.h"
#include "crn_dxt.h"
namespace crnlib
{
enum etc_constants
{
namespace crnlib {
enum etc_constants {
cETC1BytesPerBlock = 8U,
cETC1SelectorBits = 2U,
@@ -69,19 +67,16 @@ namespace crnlib
extern const uint8 g_etc1_to_selector_index[cETC1SelectorValues];
extern const uint8 g_selector_index_to_etc1[cETC1SelectorValues];
struct etc1_coord2
{
struct etc1_coord2 {
uint8 m_x, m_y;
};
extern const etc1_coord2 g_etc1_pixel_coords[2][2][8]; // [flipped][subblock][subblock_pixel]
struct etc1_block
{
struct etc1_block {
// big endian uint64:
// bit ofs: 56 48 40 32 24 16 8 0
// byte ofs: b0, b1, b2, b3, b4, b5, b6, b7
union
{
union {
uint64 m_uint64;
uint8 m_bytes[8];
};
@@ -92,20 +87,17 @@ namespace crnlib
enum { cNumSelectorBytes = 4 };
uint8 m_selectors[cNumSelectorBytes];
inline void clear()
{
inline void clear() {
utils::zero_this(this);
}
inline uint get_general_bits(uint ofs, uint num) const
{
inline uint get_general_bits(uint ofs, uint num) const {
CRNLIB_ASSERT((ofs + num) <= 64U);
CRNLIB_ASSERT(num && (num < 32U));
return (utils::read_be64(&m_uint64) >> ofs) & ((1UL << num) - 1UL);
}
inline void set_general_bits(uint ofs, uint num, uint bits)
{
inline void set_general_bits(uint ofs, uint num, uint bits) {
CRNLIB_ASSERT((ofs + num) <= 64U);
CRNLIB_ASSERT(num && (num < 32U));
@@ -116,8 +108,7 @@ namespace crnlib
utils::write_be64(&m_uint64, x);
}
inline uint get_byte_bits(uint ofs, uint num) const
{
inline uint get_byte_bits(uint ofs, uint num) const {
CRNLIB_ASSERT((ofs + num) <= 64U);
CRNLIB_ASSERT(num && (num <= 8U));
CRNLIB_ASSERT((ofs >> 3) == ((ofs + num - 1) >> 3));
@@ -126,8 +117,7 @@ namespace crnlib
return (m_bytes[byte_ofs] >> byte_bit_ofs) & ((1 << num) - 1);
}
inline void set_byte_bits(uint ofs, uint num, uint bits)
{
inline void set_byte_bits(uint ofs, uint num, uint bits) {
CRNLIB_ASSERT((ofs + num) <= 64U);
CRNLIB_ASSERT(num && (num < 32U));
CRNLIB_ASSERT((ofs >> 3) == ((ofs + num - 1) >> 3));
@@ -141,40 +131,34 @@ namespace crnlib
// false = left/right subblocks
// true = upper/lower subblocks
inline bool get_flip_bit() const
{
inline bool get_flip_bit() const {
return (m_bytes[3] & 1) != 0;
}
inline void set_flip_bit(bool flip)
{
inline void set_flip_bit(bool flip) {
m_bytes[3] &= ~1;
m_bytes[3] |= static_cast<uint8>(flip);
}
inline bool get_diff_bit() const
{
inline bool get_diff_bit() const {
return (m_bytes[3] & 2) != 0;
}
inline void set_diff_bit(bool diff)
{
inline void set_diff_bit(bool diff) {
m_bytes[3] &= ~2;
m_bytes[3] |= (static_cast<uint>(diff) << 1);
}
// Returns intensity modifier table (0-7) used by subblock subblock_id.
// subblock_id=0 left/top (CW 1), 1=right/bottom (CW 2)
inline uint get_inten_table(uint subblock_id) const
{
inline uint get_inten_table(uint subblock_id) const {
CRNLIB_ASSERT(subblock_id < 2);
const uint ofs = subblock_id ? 2 : 5;
return (m_bytes[3] >> ofs) & 7;
}
// Sets intensity modifier table (0-7) used by subblock subblock_id (0 or 1)
inline void set_inten_table(uint subblock_id, uint t)
{
inline void set_inten_table(uint subblock_id, uint t) {
CRNLIB_ASSERT(subblock_id < 2);
CRNLIB_ASSERT(t < 8);
const uint ofs = subblock_id ? 2 : 5;
@@ -183,8 +167,7 @@ namespace crnlib
}
// Returned selector value ranges from 0-3 and is a direct index into g_etc1_inten_tables.
inline uint get_selector(uint x, uint y) const
{
inline uint get_selector(uint x, uint y) const {
CRNLIB_ASSERT((x | y) < 4);
const uint bit_index = x * 4 + y;
@@ -198,8 +181,7 @@ namespace crnlib
}
// Selector "val" ranges from 0-3 and is a direct index into g_etc1_inten_tables.
inline void set_selector(uint x, uint y, uint val)
{
inline void set_selector(uint x, uint y, uint val) {
CRNLIB_ASSERT((x | y | val) < 4);
const uint bit_index = x * 4 + y;
@@ -220,33 +202,25 @@ namespace crnlib
p[-2] |= (msb << byte_bit_ofs);
}
inline void set_base4_color(uint idx, uint16 c)
{
if (idx)
{
inline void set_base4_color(uint idx, uint16 c) {
if (idx) {
set_byte_bits(cETC1AbsColor4R2BitOffset, 4, (c >> 8) & 15);
set_byte_bits(cETC1AbsColor4G2BitOffset, 4, (c >> 4) & 15);
set_byte_bits(cETC1AbsColor4B2BitOffset, 4, c & 15);
}
else
{
} else {
set_byte_bits(cETC1AbsColor4R1BitOffset, 4, (c >> 8) & 15);
set_byte_bits(cETC1AbsColor4G1BitOffset, 4, (c >> 4) & 15);
set_byte_bits(cETC1AbsColor4B1BitOffset, 4, c & 15);
}
}
inline uint16 get_base4_color(uint idx) const
{
inline uint16 get_base4_color(uint idx) const {
uint r, g, b;
if (idx)
{
if (idx) {
r = get_byte_bits(cETC1AbsColor4R2BitOffset, 4);
g = get_byte_bits(cETC1AbsColor4G2BitOffset, 4);
b = get_byte_bits(cETC1AbsColor4B2BitOffset, 4);
}
else
{
} else {
r = get_byte_bits(cETC1AbsColor4R1BitOffset, 4);
g = get_byte_bits(cETC1AbsColor4G1BitOffset, 4);
b = get_byte_bits(cETC1AbsColor4B1BitOffset, 4);
@@ -254,30 +228,26 @@ namespace crnlib
return static_cast<uint16>(b | (g << 4U) | (r << 8U));
}
inline void set_base5_color(uint16 c)
{
inline void set_base5_color(uint16 c) {
set_byte_bits(cETC1BaseColor5RBitOffset, 5, (c >> 10) & 31);
set_byte_bits(cETC1BaseColor5GBitOffset, 5, (c >> 5) & 31);
set_byte_bits(cETC1BaseColor5BBitOffset, 5, c & 31);
}
inline uint16 get_base5_color() const
{
inline uint16 get_base5_color() const {
const uint r = get_byte_bits(cETC1BaseColor5RBitOffset, 5);
const uint g = get_byte_bits(cETC1BaseColor5GBitOffset, 5);
const uint b = get_byte_bits(cETC1BaseColor5BBitOffset, 5);
return static_cast<uint16>(b | (g << 5U) | (r << 10U));
}
void set_delta3_color(uint16 c)
{
void set_delta3_color(uint16 c) {
set_byte_bits(cETC1DeltaColor3RBitOffset, 3, (c >> 6) & 7);
set_byte_bits(cETC1DeltaColor3GBitOffset, 3, (c >> 3) & 7);
set_byte_bits(cETC1DeltaColor3BBitOffset, 3, c & 7);
}
inline uint16 get_delta3_color() const
{
inline uint16 get_delta3_color() const {
const uint r = get_byte_bits(cETC1DeltaColor3RBitOffset, 3);
const uint g = get_byte_bits(cETC1DeltaColor3GBitOffset, 3);
const uint b = get_byte_bits(cETC1DeltaColor3BBitOffset, 3);
@@ -315,16 +285,12 @@ namespace crnlib
static bool get_diff_subblock_colors(color_quad_u8* pDst, uint16 packed_color5, uint16 packed_delta3, uint table_idx);
static void get_abs_subblock_colors(color_quad_u8* pDst, uint16 packed_color4, uint table_idx);
static inline void unscaled_to_scaled_color(color_quad_u8& dst, const color_quad_u8& src, bool color4)
{
if (color4)
{
static inline void unscaled_to_scaled_color(color_quad_u8& dst, const color_quad_u8& src, bool color4) {
if (color4) {
dst.r = src.r | (src.r << 4);
dst.g = src.g | (src.g << 4);
dst.b = src.b | (src.b << 4);
}
else
{
} else {
dst.r = (src.r >> 2) | (src.r << 3);
dst.g = (src.g >> 2) | (src.g << 3);
dst.b = (src.b >> 2) | (src.b << 3);
@@ -338,8 +304,7 @@ namespace crnlib
// Returns false if the block is invalid (it will still be unpacked with clamping).
bool unpack_etc1(const etc1_block& block, color_quad_u8* pDst, bool preserve_alpha = false);
enum crn_etc_quality
{
enum crn_etc_quality {
cCRNETCQualityFast,
cCRNETCQualityMedium,
cCRNETCQualitySlow,
@@ -349,79 +314,65 @@ namespace crnlib
cCRNETCQualityForceDWORD = 0xFFFFFFFF
};
struct crn_etc1_pack_params
{
struct crn_etc1_pack_params {
crn_etc_quality m_quality;
bool m_perceptual;
bool m_dithering;
inline crn_etc1_pack_params()
{
inline crn_etc1_pack_params() {
clear();
}
void clear()
{
void clear() {
m_quality = cCRNETCQualitySlow;
m_perceptual = true;
m_dithering = false;
}
};
struct etc1_solution_coordinates
{
inline etc1_solution_coordinates() :
m_unscaled_color(0, 0, 0, 0),
struct etc1_solution_coordinates {
inline etc1_solution_coordinates()
: m_unscaled_color(0, 0, 0, 0),
m_inten_table(0),
m_color4(false)
{
m_color4(false) {
}
inline etc1_solution_coordinates(uint r, uint g, uint b, uint inten_table, bool color4) :
m_unscaled_color(r, g, b, 255),
inline etc1_solution_coordinates(uint r, uint g, uint b, uint inten_table, bool color4)
: m_unscaled_color(r, g, b, 255),
m_inten_table(inten_table),
m_color4(color4)
{
m_color4(color4) {
}
inline etc1_solution_coordinates(const color_quad_u8& c, uint inten_table, bool color4) :
m_unscaled_color(c),
inline etc1_solution_coordinates(const color_quad_u8& c, uint inten_table, bool color4)
: m_unscaled_color(c),
m_inten_table(inten_table),
m_color4(color4)
{
m_color4(color4) {
}
inline etc1_solution_coordinates(const etc1_solution_coordinates& other)
{
inline etc1_solution_coordinates(const etc1_solution_coordinates& other) {
*this = other;
}
inline etc1_solution_coordinates& operator= (const etc1_solution_coordinates& rhs)
{
inline etc1_solution_coordinates& operator=(const etc1_solution_coordinates& rhs) {
m_unscaled_color = rhs.m_unscaled_color;
m_inten_table = rhs.m_inten_table;
m_color4 = rhs.m_color4;
return *this;
}
inline void clear()
{
inline void clear() {
m_unscaled_color.clear();
m_inten_table = 0;
m_color4 = false;
}
inline color_quad_u8 get_scaled_color() const
{
inline color_quad_u8 get_scaled_color() const {
int br, bg, bb;
if (m_color4)
{
if (m_color4) {
br = m_unscaled_color.r | (m_unscaled_color.r << 4);
bg = m_unscaled_color.g | (m_unscaled_color.g << 4);
bb = m_unscaled_color.b | (m_unscaled_color.b << 4);
}
else
{
} else {
br = (m_unscaled_color.r >> 2) | (m_unscaled_color.r << 3);
bg = (m_unscaled_color.g >> 2) | (m_unscaled_color.g << 3);
bb = (m_unscaled_color.b >> 2) | (m_unscaled_color.b << 3);
@@ -429,17 +380,13 @@ namespace crnlib
return color_quad_u8(br, bg, bb);
}
inline void get_block_colors(color_quad_u8* pBlock_colors)
{
inline void get_block_colors(color_quad_u8* pBlock_colors) {
int br, bg, bb;
if (m_color4)
{
if (m_color4) {
br = m_unscaled_color.r | (m_unscaled_color.r << 4);
bg = m_unscaled_color.g | (m_unscaled_color.g << 4);
bb = m_unscaled_color.b | (m_unscaled_color.b << 4);
}
else
{
} else {
br = (m_unscaled_color.r >> 2) | (m_unscaled_color.r << 3);
bg = (m_unscaled_color.g >> 2) | (m_unscaled_color.g << 3);
bb = (m_unscaled_color.b >> 2) | (m_unscaled_color.b << 3);
@@ -456,45 +403,37 @@ namespace crnlib
bool m_color4;
};
class etc1_optimizer
{
class etc1_optimizer {
CRNLIB_NO_COPY_OR_ASSIGNMENT_OP(etc1_optimizer);
public:
etc1_optimizer()
{
etc1_optimizer() {
clear();
}
void clear()
{
void clear() {
m_pParams = NULL;
m_pResult = NULL;
m_pSorted_luma = NULL;
m_pSorted_luma_indices = NULL;
}
struct params : crn_etc1_pack_params
{
params()
{
struct params : crn_etc1_pack_params {
params() {
clear();
}
params(const crn_etc1_pack_params& base_params) :
crn_etc1_pack_params(base_params)
{
params(const crn_etc1_pack_params& base_params)
: crn_etc1_pack_params(base_params) {
clear_optimizer_params();
}
void clear()
{
void clear() {
crn_etc1_pack_params::clear();
clear_optimizer_params();
}
void clear_optimizer_params()
{
void clear_optimizer_params() {
m_num_src_pixels = 0;
m_pSrc_pixels = 0;
@@ -518,8 +457,7 @@ namespace crnlib
bool m_constrain_against_base_color5;
};
struct results
{
struct results {
uint64 m_error;
color_quad_u8 m_block_color_unscaled;
uint m_block_inten_table;
@@ -527,8 +465,7 @@ namespace crnlib
uint8* m_pSelectors;
bool m_block_color4;
inline results& operator= (const results& rhs)
{
inline results& operator=(const results& rhs) {
m_block_color_unscaled = rhs.m_block_color_unscaled;
m_block_color4 = rhs.m_block_color4;
m_block_inten_table = rhs.m_block_inten_table;
@@ -543,10 +480,9 @@ namespace crnlib
bool compute();
private:
struct potential_solution
{
potential_solution() : m_coords(), m_error(cUINT64_MAX), m_valid(false)
{
struct potential_solution {
potential_solution()
: m_coords(), m_error(cUINT64_MAX), m_valid(false) {
}
etc1_solution_coordinates m_coords;
@@ -554,16 +490,14 @@ namespace crnlib
uint64 m_error;
bool m_valid;
void clear()
{
void clear() {
m_coords.clear();
m_selectors.resize(0);
m_error = cUINT64_MAX;
m_valid = false;
}
bool are_selectors_all_equal() const
{
bool are_selectors_all_equal() const {
if (m_selectors.empty())
return false;
const uint s = m_selectors[0];
@@ -597,13 +531,13 @@ namespace crnlib
bool evaluate_solution_fast(const etc1_solution_coordinates& coords, potential_solution& trial_solution, potential_solution* pBest_solution);
};
struct pack_etc1_block_context
{
struct pack_etc1_block_context {
etc1_optimizer m_optimizer;
};
void pack_etc1_block_init();
uint64 pack_etc1_block(etc1_block& block, const color_quad_u8* pSrc_pixels, crn_etc1_pack_params& pack_params, pack_etc1_block_context& context);
uint64 pack_etc1s_block(etc1_block& block, const color_quad_u8* pSrc_pixels, crn_etc1_pack_params& pack_params);
} // namespace crnlib
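Note on the color scaling seen in unscaled_to_scaled_color and get_scaled_color above: both expand a 4-bit or 5-bit channel to 8 bits by replicating the high bits into the low bits, so the maximum input maps to exactly 255. A minimal standalone sketch of that arithmetic follows; the helper names expand4 and expand5 are hypothetical and do not appear in crnlib.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not part of crnlib) mirroring the bit replication
// used by unscaled_to_scaled_color() in the diff above.

// Expand a 4-bit channel (0..15) to 8 bits by repeating the nibble:
// abcd -> abcdabcd.
static inline uint8_t expand4(uint8_t v) {
  return static_cast<uint8_t>(v | (v << 4));
}

// Expand a 5-bit channel (0..31) to 8 bits by appending the top three bits:
// abcde -> abcdeabc.
static inline uint8_t expand5(uint8_t v) {
  return static_cast<uint8_t>((v >> 2) | (v << 3));
}
```

Replication is used instead of a plain left shift so that the full 0..255 range is covered: expand5(31) gives 255, whereas 31 << 3 would give only 248.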
+69 -121
@@ -18,11 +18,9 @@
#include <libgen.h>
#endif
namespace crnlib
{
namespace crnlib {
#if CRNLIB_USE_WIN32_API
bool file_utils::is_read_only(const char* pFilename)
{
bool file_utils::is_read_only(const char* pFilename) {
uint32 dst_file_attribs = GetFileAttributesA(pFilename);
if (dst_file_attribs == INVALID_FILE_ATTRIBUTES)
return false;
@@ -31,13 +29,11 @@ namespace crnlib
return false;
}
bool file_utils::disable_read_only(const char* pFilename)
{
bool file_utils::disable_read_only(const char* pFilename) {
uint32 dst_file_attribs = GetFileAttributesA(pFilename);
if (dst_file_attribs == INVALID_FILE_ATTRIBUTES)
return false;
if (dst_file_attribs & FILE_ATTRIBUTE_READONLY)
{
if (dst_file_attribs & FILE_ATTRIBUTE_READONLY) {
dst_file_attribs &= ~FILE_ATTRIBUTE_READONLY;
if (SetFileAttributesA(pFilename, dst_file_attribs))
return true;
@@ -45,16 +41,14 @@ namespace crnlib
return false;
}
bool file_utils::is_older_than(const char* pSrcFilename, const char* pDstFilename)
{
bool file_utils::is_older_than(const char* pSrcFilename, const char* pDstFilename) {
WIN32_FILE_ATTRIBUTE_DATA src_file_attribs;
const BOOL src_file_exists = GetFileAttributesExA(pSrcFilename, GetFileExInfoStandard, &src_file_attribs);
WIN32_FILE_ATTRIBUTE_DATA dst_file_attribs;
const BOOL dest_file_exists = GetFileAttributesExA(pDstFilename, GetFileExInfoStandard, &dst_file_attribs);
if ((dest_file_exists) && (src_file_exists))
{
if ((dest_file_exists) && (src_file_exists)) {
LONG timeComp = CompareFileTime(&src_file_attribs.ftLastWriteTime, &dst_file_attribs.ftLastWriteTime);
if (timeComp < 0)
return true;
@@ -62,8 +56,7 @@ namespace crnlib
return false;
}
bool file_utils::does_file_exist(const char* pFilename)
{
bool file_utils::does_file_exist(const char* pFilename) {
const DWORD fullAttributes = GetFileAttributesA(pFilename);
if (fullAttributes == INVALID_FILE_ATTRIBUTES)
@@ -75,8 +68,7 @@ namespace crnlib
return true;
}
bool file_utils::does_dir_exist(const char* pDir)
{
bool file_utils::does_dir_exist(const char* pDir) {
//-- Get the file attributes.
DWORD fullAttributes = GetFileAttributesA(pDir);
@@ -89,8 +81,7 @@ namespace crnlib
return false;
}
bool file_utils::get_file_size(const char* pFilename, uint64& file_size)
{
bool file_utils::get_file_size(const char* pFilename, uint64& file_size) {
file_size = 0;
WIN32_FILE_ATTRIBUTE_DATA attr;
@@ -106,29 +97,25 @@ namespace crnlib
return true;
}
#elif defined(__GNUC__)
bool file_utils::is_read_only(const char* pFilename)
{
bool file_utils::is_read_only(const char* pFilename) {
pFilename;
// TODO
return false;
}
bool file_utils::disable_read_only(const char* pFilename)
{
bool file_utils::disable_read_only(const char* pFilename) {
pFilename;
// TODO
return false;
}
bool file_utils::is_older_than(const char *pSrcFilename, const char* pDstFilename)
{
bool file_utils::is_older_than(const char* pSrcFilename, const char* pDstFilename) {
pSrcFilename, pDstFilename;
// TODO
return false;
}
bool file_utils::does_file_exist(const char* pFilename)
{
bool file_utils::does_file_exist(const char* pFilename) {
struct stat stat_buf;
int result = stat(pFilename, &stat_buf);
if (result)
@@ -138,8 +125,7 @@ namespace crnlib
return false;
}
bool file_utils::does_dir_exist(const char* pDir)
{
bool file_utils::does_dir_exist(const char* pDir) {
struct stat stat_buf;
int result = stat(pDir, &stat_buf);
if (result)
@@ -149,8 +135,7 @@ namespace crnlib
return false;
}
bool file_utils::get_file_size(const char* pFilename, uint64& file_size)
{
bool file_utils::get_file_size(const char* pFilename, uint64& file_size) {
file_size = 0;
struct stat stat_buf;
int result = stat(pFilename, &stat_buf);
@@ -162,25 +147,21 @@ namespace crnlib
return true;
}
#else
bool file_utils::is_read_only(const char* pFilename)
{
bool file_utils::is_read_only(const char* pFilename) {
return false;
}
bool file_utils::disable_read_only(const char* pFilename)
{
bool file_utils::disable_read_only(const char* pFilename) {
pFilename;
// TODO
return false;
}
bool file_utils::is_older_than(const char *pSrcFilename, const char* pDstFilename)
{
bool file_utils::is_older_than(const char* pSrcFilename, const char* pDstFilename) {
return false;
}
bool file_utils::does_file_exist(const char* pFilename)
{
bool file_utils::does_file_exist(const char* pFilename) {
FILE* pFile;
crn_fopen(&pFile, pFilename, "rb");
if (!pFile)
@@ -189,13 +170,11 @@ namespace crnlib
return true;
}
bool file_utils::does_dir_exist(const char* pDir)
{
bool file_utils::does_dir_exist(const char* pDir) {
return false;
}
bool file_utils::get_file_size(const char* pFilename, uint64& file_size)
{
bool file_utils::get_file_size(const char* pFilename, uint64& file_size) {
FILE* pFile;
crn_fopen(&pFile, pFilename, "rb");
if (!pFile)
@@ -207,11 +186,9 @@ namespace crnlib
}
#endif
bool file_utils::get_file_size(const char* pFilename, uint32& file_size)
{
bool file_utils::get_file_size(const char* pFilename, uint32& file_size) {
uint64 file_size64;
if (!get_file_size(pFilename, file_size64))
{
if (!get_file_size(pFilename, file_size64)) {
file_size = 0;
return false;
}
@@ -223,8 +200,7 @@ namespace crnlib
return true;
}
bool file_utils::is_path_separator(char c)
{
bool file_utils::is_path_separator(char c) {
#ifdef WIN32
return (c == '/') || (c == '\\');
#else
@@ -232,8 +208,7 @@ namespace crnlib
#endif
}
bool file_utils::is_path_or_drive_separator(char c)
{
bool file_utils::is_path_or_drive_separator(char c) {
#ifdef WIN32
return (c == '/') || (c == '\\') || (c == ':');
#else
@@ -241,8 +216,7 @@ namespace crnlib
#endif
}
bool file_utils::is_drive_separator(char c)
{
bool file_utils::is_drive_separator(char c) {
#ifdef WIN32
return (c == ':');
#else
@@ -251,8 +225,7 @@ namespace crnlib
#endif
}
bool file_utils::split_path(const char* p, dynamic_string* pDrive, dynamic_string* pDir, dynamic_string* pFilename, dynamic_string* pExt)
{
bool file_utils::split_path(const char* p, dynamic_string* pDrive, dynamic_string* pDir, dynamic_string* pFilename, dynamic_string* pExt) {
CRNLIB_ASSERT(p);
#ifdef WIN32
@@ -279,24 +252,28 @@ namespace crnlib
pExt ? ext_buf : NULL);
#endif
if (pDrive) *pDrive = drive_buf;
if (pDir) *pDir = dir_buf;
if (pFilename) *pFilename = fname_buf;
if (pExt) *pExt = ext_buf;
if (pDrive)
*pDrive = drive_buf;
if (pDir)
*pDir = dir_buf;
if (pFilename)
*pFilename = fname_buf;
if (pExt)
*pExt = ext_buf;
#else
char dirtmp[1024];
char nametmp[1024];
strcpy_safe(dirtmp, sizeof(dirtmp), p);
strcpy_safe(nametmp, sizeof(nametmp), p);
if (pDrive) pDrive->clear();
if (pDrive)
pDrive->clear();
const char* pDirName = dirname(dirtmp);
if (!pDirName)
return false;
if (pDir)
{
if (pDir) {
pDir->set(pDirName);
if ((!pDir->is_empty()) && (pDir->back() != '/'))
pDir->append_char('/');
@@ -306,14 +283,12 @@ namespace crnlib
if (!pBaseName)
return false;
if (pFilename)
{
if (pFilename) {
pFilename->set(pBaseName);
remove_extension(*pFilename);
}
if (pExt)
{
if (pExt) {
pExt->set(pBaseName);
get_extension(*pExt);
*pExt = "." + *pExt;
@@ -323,8 +298,7 @@ namespace crnlib
return true;
}
bool file_utils::split_path(const char* p, dynamic_string& path, dynamic_string& filename)
{
bool file_utils::split_path(const char* p, dynamic_string& path, dynamic_string& filename) {
dynamic_string temp_drive, temp_path, temp_ext;
if (!split_path(p, &temp_drive, &temp_path, &filename, &temp_ext))
return false;
@@ -335,8 +309,7 @@ namespace crnlib
return true;
}
bool file_utils::get_pathname(const char* p, dynamic_string& path)
{
bool file_utils::get_pathname(const char* p, dynamic_string& path) {
dynamic_string temp_drive, temp_path;
if (!split_path(p, &temp_drive, &temp_path, NULL, NULL))
return false;
@@ -345,8 +318,7 @@ namespace crnlib
return true;
}
bool file_utils::get_filename(const char* p, dynamic_string& filename)
{
bool file_utils::get_filename(const char* p, dynamic_string& filename) {
dynamic_string temp_ext;
if (!split_path(p, NULL, NULL, &filename, &temp_ext))
return false;
@@ -355,11 +327,9 @@ namespace crnlib
return true;
}
void file_utils::combine_path(dynamic_string& dst, const char* pA, const char* pB)
{
void file_utils::combine_path(dynamic_string& dst, const char* pA, const char* pB) {
dynamic_string temp(pA);
if ((!temp.is_empty()) && (!is_path_separator(pB[0])))
{
if ((!temp.is_empty()) && (!is_path_separator(pB[0]))) {
char c = temp[temp.get_len() - 1];
if (!is_path_separator(c))
temp.append_char(CRNLIB_PATH_SEPERATOR_CHAR);
@@ -368,14 +338,12 @@ namespace crnlib
dst.swap(temp);
}
void file_utils::combine_path(dynamic_string& dst, const char* pA, const char* pB, const char* pC)
{
void file_utils::combine_path(dynamic_string& dst, const char* pA, const char* pB, const char* pC) {
combine_path(dst, pA, pB);
combine_path(dst, dst.get_ptr(), pC);
}
bool file_utils::full_path(dynamic_string& path)
{
bool file_utils::full_path(dynamic_string& path) {
#ifdef WIN32
char buf[1024];
char* p = _fullpath(buf, path.get_ptr(), sizeof(buf));
@@ -386,15 +354,12 @@ namespace crnlib
char* p;
dynamic_string pn, fn;
split_path(path.get_ptr(), pn, fn);
if ((fn == ".") || (fn == ".."))
{
if ((fn == ".") || (fn == "..")) {
p = realpath(path.get_ptr(), buf);
if (!p)
return false;
path.set(buf);
}
else
{
} else {
if (pn.is_empty())
pn = "./";
p = realpath(pn.get_ptr(), buf);
@@ -407,8 +372,7 @@ namespace crnlib
return true;
}
bool file_utils::get_extension(dynamic_string& filename)
{
bool file_utils::get_extension(dynamic_string& filename) {
int sep = -1;
#ifdef WIN32
sep = filename.find_right('\\');
@@ -417,8 +381,7 @@ namespace crnlib
sep = filename.find_right('/');
int dot = filename.find_right('.');
if (dot < sep)
{
if (dot < sep) {
filename.clear();
return false;
}
@@ -428,8 +391,7 @@ namespace crnlib
return true;
}
bool file_utils::remove_extension(dynamic_string& filename)
{
bool file_utils::remove_extension(dynamic_string& filename) {
int sep = -1;
#ifdef WIN32
sep = filename.find_right('\\');
@@ -446,24 +408,23 @@ namespace crnlib
return true;
}
bool file_utils::create_path(const dynamic_string& fullpath)
{
bool got_unc = false; got_unc;
bool file_utils::create_path(const dynamic_string& fullpath) {
#ifdef WIN32
bool got_unc = false;
#endif
dynamic_string cur_path;
const int l = fullpath.get_len();
int n = 0;
while (n < l)
{
while (n < l) {
const char c = fullpath.get_ptr()[n];
const bool sep = is_path_separator(c);
const bool back_sep = is_path_separator(cur_path.back());
const bool is_last_char = (n == (l - 1));
if ( ((sep) && (!back_sep)) || (is_last_char) )
{
if (((sep) && (!back_sep)) || (is_last_char)) {
if ((is_last_char) && (!sep))
cur_path.append_char(c);
@@ -476,20 +437,17 @@ namespace crnlib
// \cool\blah
if ((cur_path.get_len() == 2) && (cur_path[1] == ':'))
valid = false;
else if ((cur_path.get_len() >= 2) && (cur_path[0] == '\\') && (cur_path[1] == '\\'))
{
else if ((cur_path.get_len() >= 2) && (cur_path[0] == '\\') && (cur_path[1] == '\\')) {
if (!got_unc)
valid = false;
got_unc = true;
}
else if (cur_path == "\\")
} else if (cur_path == "\\")
valid = false;
#endif
if (cur_path == "/")
valid = false;
if ((valid) && (cur_path.get_len()))
{
if ((valid) && (cur_path.get_len())) {
#ifdef WIN32
_mkdir(cur_path.get_ptr());
#else
@@ -506,19 +464,16 @@ namespace crnlib
return true;
}
void file_utils::trim_trailing_seperator(dynamic_string& path)
{
void file_utils::trim_trailing_seperator(dynamic_string& path) {
if ((path.get_len()) && (is_path_separator(path.back())))
path.truncate(path.get_len() - 1);
}
// See http://www.codeproject.com/KB/string/wildcmp.aspx
int file_utils::wildcmp(const char* pWild, const char* pString)
{
int file_utils::wildcmp(const char* pWild, const char* pString) {
const char *cp = NULL, *mp = NULL;
while ((*pString) && (*pWild != '*'))
{
while ((*pString) && (*pWild != '*')) {
if ((*pWild != *pString) && (*pWild != '?'))
return 0;
pWild++;
@@ -527,22 +482,16 @@ namespace crnlib
// Either *pString=='\0' or *pWild='*' here.
while (*pString)
{
if (*pWild == '*')
{
while (*pString) {
if (*pWild == '*') {
if (!*++pWild)
return 1;
mp = pWild;
cp = pString + 1;
}
else if ((*pWild == *pString) || (*pWild == '?'))
{
} else if ((*pWild == *pString) || (*pWild == '?')) {
pWild++;
pString++;
}
else
{
} else {
pWild = mp;
pString = cp++;
}
@@ -554,8 +503,7 @@ namespace crnlib
return !*pWild;
}
bool file_utils::write_buf_to_file(const char* pPath, const void* pData, size_t data_size)
{
bool file_utils::write_buf_to_file(const char* pPath, const void* pData, size_t data_size) {
FILE* pFile = NULL;
#ifdef _MSC_VER
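Note on file_utils::wildcmp above: the hunks only show part of the function, so here is a self-contained sketch of the same backtracking glob match ('*' matches any run of characters, '?' matches exactly one). The name wildcmp_sketch is hypothetical, and the final loop that skips trailing '*' characters is assumed from the referenced CodeProject routine because it falls outside the lines shown in the diff.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical standalone version of the matcher; returns 1 on match, 0 otherwise.
static int wildcmp_sketch(const char* pWild, const char* pString) {
  const char* cp = NULL;  // where to resume in the string after a failed attempt
  const char* mp = NULL;  // pattern position just past the most recent '*'

  // Match literally (or via '?') up to the first '*'.
  while ((*pString) && (*pWild != '*')) {
    if ((*pWild != *pString) && (*pWild != '?'))
      return 0;
    pWild++;
    pString++;
  }

  while (*pString) {
    if (*pWild == '*') {
      if (!*++pWild)
        return 1;  // a trailing '*' matches the rest of the string
      mp = pWild;
      cp = pString + 1;
    } else if ((*pWild == *pString) || (*pWild == '?')) {
      pWild++;
      pString++;
    } else {
      // Mismatch: rewind the pattern to just after the last '*' and let
      // that '*' absorb one more character of the string.
      pWild = mp;
      pString = cp++;
    }
  }

  // Assumed tail: any remaining '*' can match the empty string.
  while (*pWild == '*')
    pWild++;
  return !*pWild;
}
```

The matcher needs no allocation and only ever remembers the most recent '*', which is sufficient for this simple glob syntax.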
+2 -4
@@ -2,10 +2,8 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
namespace crnlib
{
struct file_utils
{
namespace crnlib {
struct file_utils {
// Returns true if pSrcFilename is older than pDstFilename
static bool is_read_only(const char* pFilename);
static bool disable_read_only(const char* pFilename);
+32 -66
@@ -13,19 +13,16 @@
#include <dirent.h>
#endif
namespace crnlib
{
namespace crnlib {
#ifdef CRNLIB_USE_WIN32_API
bool find_files::find(const char* pBasepath, const char* pFilespec, uint flags)
{
bool find_files::find(const char* pBasepath, const char* pFilespec, uint flags) {
m_last_error = S_OK;
m_files.resize(0);
return find_internal(pBasepath, "", pFilespec, flags, 0);
}
bool find_files::find(const char* pSpec, uint flags)
{
bool find_files::find(const char* pSpec, uint flags) {
dynamic_string find_name(pSpec);
if (!file_utils::full_path(find_name))
@@ -38,34 +35,27 @@ namespace crnlib
return find(find_pathname.get_ptr(), find_filename.get_ptr(), flags);
}
bool find_files::find_internal(const char* pBasepath, const char* pRelpath, const char* pFilespec, uint flags, int level)
{
bool find_files::find_internal(const char* pBasepath, const char* pRelpath, const char* pFilespec, uint flags, int level) {
WIN32_FIND_DATAA find_data;
dynamic_string filename;
dynamic_string_array child_paths;
if (flags & cFlagRecursive)
{
if (flags & cFlagRecursive) {
if (strlen(pRelpath))
file_utils::combine_path(filename, pBasepath, pRelpath, "*");
else
file_utils::combine_path(filename, pBasepath, "*");
HANDLE handle = FindFirstFileA(filename.get_ptr(), &find_data);
if (handle == INVALID_HANDLE_VALUE)
{
if (handle == INVALID_HANDLE_VALUE) {
HRESULT hres = GetLastError();
if ((level == 0) && (hres != NO_ERROR) && (hres != ERROR_FILE_NOT_FOUND))
{
if ((level == 0) && (hres != NO_ERROR) && (hres != ERROR_FILE_NOT_FOUND)) {
m_last_error = hres;
return false;
}
}
else
{
do
{
} else {
do {
const bool is_dir = (find_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0;
bool skip = !is_dir;
@@ -75,14 +65,12 @@ namespace crnlib
if (find_data.dwFileAttributes & (FILE_ATTRIBUTE_SYSTEM | FILE_ATTRIBUTE_TEMPORARY))
skip = true;
if (find_data.dwFileAttributes & FILE_ATTRIBUTE_HIDDEN)
{
if (find_data.dwFileAttributes & FILE_ATTRIBUTE_HIDDEN) {
if ((flags & cFlagAllowHidden) == 0)
skip = true;
}
if (!skip)
{
if (!skip) {
dynamic_string child_path(find_data.cFileName);
if ((!child_path.count_char('?')) && (!child_path.count_char('*')))
child_paths.push_back(child_path);
@@ -95,8 +83,7 @@ namespace crnlib
FindClose(handle);
handle = INVALID_HANDLE_VALUE;
if (hres != ERROR_NO_MORE_FILES)
{
if (hres != ERROR_NO_MORE_FILES) {
m_last_error = hres;
return false;
}
@@ -109,19 +96,14 @@ namespace crnlib
file_utils::combine_path(filename, pBasepath, pFilespec);
HANDLE handle = FindFirstFileA(filename.get_ptr(), &find_data);
if (handle == INVALID_HANDLE_VALUE)
{
if (handle == INVALID_HANDLE_VALUE) {
HRESULT hres = GetLastError();
if ((level == 0) && (hres != NO_ERROR) && (hres != ERROR_FILE_NOT_FOUND))
{
if ((level == 0) && (hres != NO_ERROR) && (hres != ERROR_FILE_NOT_FOUND)) {
m_last_error = hres;
return false;
}
}
else
{
do
{
} else {
do {
const bool is_dir = (find_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0;
bool skip = false;
@@ -131,16 +113,13 @@ namespace crnlib
if (find_data.dwFileAttributes & (FILE_ATTRIBUTE_SYSTEM | FILE_ATTRIBUTE_TEMPORARY))
skip = true;
if (find_data.dwFileAttributes & FILE_ATTRIBUTE_HIDDEN)
{
if (find_data.dwFileAttributes & FILE_ATTRIBUTE_HIDDEN) {
if ((flags & cFlagAllowHidden) == 0)
skip = true;
}
if (!skip)
{
if (((is_dir) && (flags & cFlagAllowDirs)) || ((!is_dir) && (flags & cFlagAllowFiles)))
{
if (!skip) {
if (((is_dir) && (flags & cFlagAllowDirs)) || ((!is_dir) && (flags & cFlagAllowFiles))) {
m_files.resize(m_files.size() + 1);
file_desc& file = m_files.back();
file.m_is_dir = is_dir;
@@ -160,15 +139,13 @@ namespace crnlib
FindClose(handle);
if (hres != ERROR_NO_MORE_FILES)
{
if (hres != ERROR_NO_MORE_FILES) {
m_last_error = hres;
return false;
}
}
for (uint i = 0; i < child_paths.size(); i++)
{
for (uint i = 0; i < child_paths.size(); i++) {
dynamic_string child_path;
if (strlen(pRelpath))
file_utils::combine_path(child_path, pRelpath, child_paths[i].get_ptr());
@@ -182,14 +159,12 @@ namespace crnlib
return true;
}
#elif defined(__GNUC__)
bool find_files::find(const char* pBasepath, const char* pFilespec, uint flags)
{
bool find_files::find(const char* pBasepath, const char* pFilespec, uint flags) {
m_files.resize(0);
return find_internal(pBasepath, "", pFilespec, flags, 0);
}
bool find_files::find(const char* pSpec, uint flags)
{
bool find_files::find(const char* pSpec, uint flags) {
dynamic_string find_name(pSpec);
if (!file_utils::full_path(find_name))
@@ -202,16 +177,14 @@ namespace crnlib
return find(find_pathname.get_ptr(), find_filename.get_ptr(), flags);
}
bool find_files::find_internal(const char* pBasepath, const char* pRelpath, const char* pFilespec, uint flags, int level)
{
bool find_files::find_internal(const char* pBasepath, const char* pRelpath, const char* pFilespec, uint flags, int level) {
dynamic_string pathname;
if (strlen(pRelpath))
file_utils::combine_path(pathname, pBasepath, pRelpath);
else
pathname = pBasepath;
if (!pathname.is_empty())
{
if (!pathname.is_empty()) {
char c = pathname.back();
if (c != '/')
pathname += "/";
@@ -224,8 +197,7 @@ namespace crnlib
dynamic_string_array paths;
for ( ; ; )
{
for (;;) {
struct dirent* ep = readdir(dp);
if (!ep)
break;
@@ -237,18 +209,14 @@ namespace crnlib
dynamic_string filename(ep->d_name);
if (is_directory)
{
if (flags & cFlagRecursive)
{
if (is_directory) {
if (flags & cFlagRecursive) {
paths.push_back(filename);
}
}
if (((is_file) && (flags & cFlagAllowFiles)) || ((is_directory) && (flags & cFlagAllowDirs)))
{
if (0 == fnmatch(pFilespec, filename.get_ptr(), 0))
{
if (((is_file) && (flags & cFlagAllowFiles)) || ((is_directory) && (flags & cFlagAllowDirs))) {
if (0 == fnmatch(pFilespec, filename.get_ptr(), 0)) {
m_files.resize(m_files.size() + 1);
file_desc& file = m_files.back();
file.m_is_dir = is_directory;
@@ -263,10 +231,8 @@ namespace crnlib
closedir(dp);
dp = NULL;
if (flags & cFlagRecursive)
{
for (uint i = 0; i < paths.size(); i++)
{
if (flags & cFlagRecursive) {
for (uint i = 0; i < paths.size(); i++) {
dynamic_string childpath;
if (strlen(pRelpath))
file_utils::combine_path(childpath, pRelpath, paths[i].get_ptr());
+7 -11
@@ -2,14 +2,12 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
namespace crnlib
{
class find_files
{
namespace crnlib {
class find_files {
public:
struct file_desc
{
inline file_desc() : m_is_dir(false) { }
struct file_desc {
inline file_desc()
: m_is_dir(false) {}
dynamic_string m_fullname;
dynamic_string m_base;
@@ -25,13 +23,11 @@ namespace crnlib
typedef crnlib::vector<file_desc> file_desc_vec;
inline find_files()
{
inline find_files() {
m_last_error = 0; // S_OK;
}
enum flags
{
enum flags {
cFlagRecursive = 1,
cFlagAllowDirs = 2,
cFlagAllowFiles = 4,
+13 -27
@@ -6,12 +6,9 @@
#include "freeImagePlus.h"
namespace crnlib
{
namespace freeimage_image_utils
{
inline bool load_from_file(image_u8& dest, const wchar_t* pFilename, int fi_flag)
{
namespace crnlib {
namespace freeimage_image_utils {
inline bool load_from_file(image_u8& dest, const wchar_t* pFilename, int fi_flag) {
fipImage src_image;
if (!src_image.loadU(pFilename, fi_flag))
@@ -36,13 +33,11 @@ namespace crnlib
bool grayscale = true;
bool has_alpha = false;
for (uint y = 0; y < height; y++)
{
for (uint y = 0; y < height; y++) {
const BYTE* pSrc = src_image.getScanLine((WORD)(height - 1 - y));
color_quad_u8* pD = pDst;
for (uint x = width; x; x--)
{
for (uint x = width; x; x--) {
color_quad_u8 c;
c.r = pSrc[FI_RGBA_RED];
c.g = pSrc[FI_RGBA_GREEN];
@@ -72,25 +67,21 @@ namespace crnlib
const int cSaveLuma = -1;
inline bool save_to_grayscale_file(const wchar_t* pFilename, const image_u8& src, int component, int fi_flag)
{
inline bool save_to_grayscale_file(const wchar_t* pFilename, const image_u8& src, int component, int fi_flag) {
fipImage dst_image(FIT_BITMAP, (WORD)src.get_width(), (WORD)src.get_height(), 8);
RGBQUAD* p = dst_image.getPalette();
for (uint i = 0; i < dst_image.getPaletteSize(); i++)
{
for (uint i = 0; i < dst_image.getPaletteSize(); i++) {
p[i].rgbRed = (BYTE)i;
p[i].rgbGreen = (BYTE)i;
p[i].rgbBlue = (BYTE)i;
p[i].rgbReserved = 255;
}
for (uint y = 0; y < src.get_height(); y++)
{
for (uint y = 0; y < src.get_height(); y++) {
const color_quad_u8* pSrc = src.get_scanline(y);
for (uint x = 0; x < src.get_width(); x++)
{
for (uint x = 0; x < src.get_width(); x++) {
BYTE v;
if (component == cSaveLuma)
v = (BYTE)(*pSrc).get_luma();
@@ -108,13 +99,11 @@ namespace crnlib
return true;
}
inline bool save_to_file(const wchar_t* pFilename, const image_u8& src, int fi_flag, bool ignore_alpha = false)
{
inline bool save_to_file(const wchar_t* pFilename, const image_u8& src, int fi_flag, bool ignore_alpha = false) {
const bool save_alpha = src.is_component_valid(3);
uint bpp = (save_alpha && !ignore_alpha) ? 32 : 24;
if (bpp == 32)
{
if (bpp == 32) {
dynamic_wstring ext(pFilename);
get_extension(ext);
@@ -127,10 +116,8 @@ namespace crnlib
fipImage dst_image(FIT_BITMAP, (WORD)src.get_width(), (WORD)src.get_height(), (WORD)bpp);
for (uint y = 0; y < src.get_height(); y++)
{
for (uint x = 0; x < src.get_width(); x++)
{
for (uint y = 0; y < src.get_height(); y++) {
for (uint x = 0; x < src.get_width(); x++) {
color_quad_u8 c(src(x, y));
RGBQUAD quad;
@@ -155,4 +142,3 @@ namespace crnlib
} // namespace freeimage_image_utils
} // namespace crnlib
+12 -12
@@ -5,26 +5,23 @@
#include "crn_core.h"
#undef get16bits
#if (defined(__GNUC__) && defined(__i386__)) || defined(__WATCOMC__) \
|| defined(_MSC_VER) || defined (__BORLANDC__) || defined (__TURBOC__)
#if (defined(__GNUC__) && defined(__i386__)) || defined(__WATCOMC__) || defined(_MSC_VER) || defined(__BORLANDC__) || defined(__TURBOC__)
#define get16bits(d) (*((const uint16*)(d)))
#endif
#if !defined(get16bits)
#define get16bits(d) ((((uint32)(((const uint8 *)(d))[1])) << 8)\
+(uint32)(((const uint8 *)(d))[0]) )
#define get16bits(d) ((((uint32)(((const uint8*)(d))[1])) << 8) + (uint32)(((const uint8*)(d))[0]))
#endif
namespace crnlib
{
uint32 fast_hash (const void* p, int len)
{
namespace crnlib {
uint32 fast_hash(const void* p, int len) {
const char* data = static_cast<const char*>(p);
uint32 hash = len, tmp;
int rem;
if (len <= 0 || data == NULL) return 0;
if (len <= 0 || data == NULL)
return 0;
rem = len & 3;
len >>= 2;
@@ -40,16 +37,19 @@ namespace crnlib
/* Handle end cases */
switch (rem) {
case 3: hash += get16bits (data);
case 3:
hash += get16bits(data);
hash ^= hash << 16;
hash ^= data[sizeof(uint16)] << 18;
hash += hash >> 11;
break;
case 2: hash += get16bits (data);
case 2:
hash += get16bits(data);
hash ^= hash << 11;
hash += hash >> 17;
break;
case 1: hash += *data;
case 1:
hash += *data;
hash ^= hash << 10;
hash += hash >> 1;
}
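Note on the portable `get16bits` fallback in this file: it assembles the value byte by byte rather than casting to `const uint16*`, so the result does not depend on pointer alignment or host byte order. A minimal sketch of the same read as a function follows; `read16_le` is a hypothetical name used only for illustration and is not part of crnlib.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not in crnlib): reads two bytes as a little-endian
// 16-bit value, mirroring the arithmetic of the portable get16bits macro.
inline uint32_t read16_le(const void* d) {
  const uint8_t* p = static_cast<const uint8_t*>(d);
  // High byte comes from offset 1, low byte from offset 0.
  return (static_cast<uint32_t>(p[1]) << 8) + static_cast<uint32_t>(p[0]);
}
```

Writing it as an inline function instead of a macro keeps the argument evaluated once and type-checked; the macro form in the diff is left as the author wrote it.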
+3 -6
@@ -2,13 +2,11 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
namespace crnlib
{
namespace crnlib {
uint32 fast_hash(const void* p, int len);
// 4-byte integer hash, full avalanche
inline uint32 bitmix32c(uint32 a)
{
inline uint32 bitmix32c(uint32 a) {
a = (a + 0x7ed55d16) + (a << 12);
a = (a ^ 0xc761c23c) ^ (a >> 19);
a = (a + 0x165667b1) + (a << 5);
@@ -19,8 +17,7 @@ namespace crnlib
}
// 4-byte integer hash, full avalanche, no constants
inline uint32 bitmix32(uint32 a)
{
inline uint32 bitmix32(uint32 a) {
a -= (a << 6);
a ^= (a >> 17);
a -= (a << 9);
+1 -2
@@ -4,8 +4,7 @@
#include "crn_hash_map.h"
#include "crn_rand.h"
namespace crnlib
{
namespace crnlib {
#if 0
class counted_obj
{
+130 -241
@@ -12,41 +12,34 @@
#include "crn_sparse_bit_array.h"
#include "crn_hash.h"
namespace crnlib
{
namespace crnlib {
template <typename T>
struct hasher
{
struct hasher {
inline size_t operator()(const T& key) const { return static_cast<size_t>(key); }
};
template <typename T>
struct bit_hasher
{
struct bit_hasher {
inline size_t operator()(const T& key) const { return static_cast<size_t>(fast_hash(&key, sizeof(key))); }
};
template <typename T>
struct equal_to
{
struct equal_to {
inline bool operator()(const T& a, const T& b) const { return a == b; }
};
// Important: The Hasher and Equals objects must be bitwise movable!
template <typename Key, typename Value = empty_type, typename Hasher = hasher<Key>, typename Equals = equal_to<Key> >
class hash_map
{
class hash_map {
friend class iterator;
friend class const_iterator;
enum state
{
enum state {
cStateInvalid = 0,
cStateValid = 1
};
enum
{
enum {
cMinHashSize = 4U
};
@@ -58,23 +51,20 @@ namespace crnlib
typedef Hasher hasher_type;
typedef Equals equals_type;
hash_map() :
m_hash_shift(32), m_num_valid(0), m_grow_threshold(0)
{
hash_map()
: m_hash_shift(32), m_num_valid(0), m_grow_threshold(0) {
}
hash_map(const hash_map& other) :
m_values(other.m_values),
hash_map(const hash_map& other)
: m_values(other.m_values),
m_hash_shift(other.m_hash_shift),
m_hasher(other.m_hasher),
m_equals(other.m_equals),
m_num_valid(other.m_num_valid),
m_grow_threshold(other.m_grow_threshold)
{
m_grow_threshold(other.m_grow_threshold) {
}
hash_map& operator= (const hash_map& other)
{
hash_map& operator=(const hash_map& other) {
if (this == &other)
return *this;
@@ -90,8 +80,7 @@ namespace crnlib
return *this;
}
inline ~hash_map()
{
inline ~hash_map() {
clear();
}
@@ -105,20 +94,15 @@ namespace crnlib
void set_hasher(const Hasher& hasher) { m_hasher = hasher; }
inline void clear()
{
if (!m_values.empty())
{
if (CRNLIB_HAS_DESTRUCTOR(Key) || CRNLIB_HAS_DESTRUCTOR(Value))
{
inline void clear() {
if (!m_values.empty()) {
if (CRNLIB_HAS_DESTRUCTOR(Key) || CRNLIB_HAS_DESTRUCTOR(Value)) {
node* p = &get_node(0);
node* p_end = p + m_values.size();
uint num_remaining = m_num_valid;
while (p != p_end)
{
if (p->state)
{
while (p != p_end) {
if (p->state) {
destruct_value_type(p);
num_remaining--;
if (!num_remaining)
@@ -137,21 +121,17 @@ namespace crnlib
}
}
inline void reset()
{
inline void reset() {
if (!m_num_valid)
return;
if (CRNLIB_HAS_DESTRUCTOR(Key) || CRNLIB_HAS_DESTRUCTOR(Value))
{
if (CRNLIB_HAS_DESTRUCTOR(Key) || CRNLIB_HAS_DESTRUCTOR(Value)) {
node* p = &get_node(0);
node* p_end = p + m_values.size();
uint num_remaining = m_num_valid;
while (p != p_end)
{
if (p->state)
{
while (p != p_end) {
if (p->state) {
destruct_value_type(p);
p->state = cStateInvalid;
@@ -162,21 +142,15 @@ namespace crnlib
p++;
}
}
else if (sizeof(node) <= 32)
{
} else if (sizeof(node) <= 32) {
memset(&m_values[0], 0, m_values.size_in_bytes());
}
else
{
} else {
node* p = &get_node(0);
node* p_end = p + m_values.size();
uint num_remaining = m_num_valid;
while (p != p_end)
{
if (p->state)
{
while (p != p_end) {
if (p->state) {
p->state = cStateInvalid;
num_remaining--;
@@ -191,23 +165,19 @@ namespace crnlib
m_num_valid = 0;
}
inline uint size()
{
inline uint size() {
return m_num_valid;
}
inline uint get_table_size()
{
inline uint get_table_size() {
return m_values.size();
}
inline bool empty()
{
inline bool empty() {
return !m_num_valid;
}
inline void reserve(uint new_capacity)
{
inline void reserve(uint new_capacity) {
uint new_hash_size = math::maximum(1U, new_capacity);
new_hash_size = new_hash_size * 2U;
@@ -223,34 +193,33 @@ namespace crnlib
class const_iterator;
class iterator
{
class iterator {
friend class hash_map<Key, Value, Hasher, Equals>;
friend class hash_map<Key, Value, Hasher, Equals>::const_iterator;
public:
inline iterator() : m_pTable(NULL), m_index(0) { }
inline iterator(hash_map_type& table, uint index) : m_pTable(&table), m_index(index) { }
inline iterator(const iterator& other) : m_pTable(other.m_pTable), m_index(other.m_index) { }
inline iterator()
: m_pTable(NULL), m_index(0) {}
inline iterator(hash_map_type& table, uint index)
: m_pTable(&table), m_index(index) {}
inline iterator(const iterator& other)
: m_pTable(other.m_pTable), m_index(other.m_index) {}
inline iterator& operator= (const iterator& other)
{
inline iterator& operator=(const iterator& other) {
m_pTable = other.m_pTable;
m_index = other.m_index;
return *this;
}
// post-increment
inline iterator operator++(int)
{
inline iterator operator++(int) {
iterator result(*this);
++*this;
return result;
}
// pre-increment
inline iterator& operator++()
{
inline iterator& operator++() {
probe();
return *this;
}
@@ -267,57 +236,54 @@ namespace crnlib
hash_map_type* m_pTable;
uint m_index;
inline value_type* get_cur() const
{
inline value_type* get_cur() const {
CRNLIB_ASSERT(m_pTable && (m_index < m_pTable->m_values.size()));
CRNLIB_ASSERT(m_pTable->get_node_state(m_index) == cStateValid);
return &m_pTable->get_node(m_index);
}
inline void probe()
{
inline void probe() {
CRNLIB_ASSERT(m_pTable);
m_index = m_pTable->find_next(m_index);
}
};
class const_iterator
{
class const_iterator {
friend class hash_map<Key, Value, Hasher, Equals>;
friend class hash_map<Key, Value, Hasher, Equals>::iterator;
public:
inline const_iterator() : m_pTable(NULL), m_index(0) { }
inline const_iterator(const hash_map_type& table, uint index) : m_pTable(&table), m_index(index) { }
inline const_iterator(const iterator& other) : m_pTable(other.m_pTable), m_index(other.m_index) { }
inline const_iterator(const const_iterator& other) : m_pTable(other.m_pTable), m_index(other.m_index) { }
inline const_iterator()
: m_pTable(NULL), m_index(0) {}
inline const_iterator(const hash_map_type& table, uint index)
: m_pTable(&table), m_index(index) {}
inline const_iterator(const iterator& other)
: m_pTable(other.m_pTable), m_index(other.m_index) {}
inline const_iterator(const const_iterator& other)
: m_pTable(other.m_pTable), m_index(other.m_index) {}
inline const_iterator& operator= (const const_iterator& other)
{
inline const_iterator& operator=(const const_iterator& other) {
m_pTable = other.m_pTable;
m_index = other.m_index;
return *this;
}
inline const_iterator& operator= (const iterator& other)
{
inline const_iterator& operator=(const iterator& other) {
m_pTable = other.m_pTable;
m_index = other.m_index;
return *this;
}
// post-increment
inline const_iterator operator++(int)
{
inline const_iterator operator++(int) {
const_iterator result(*this);
++*this;
return result;
}
// pre-increment
inline const_iterator& operator++()
{
inline const_iterator& operator++() {
probe();
return *this;
}
@@ -334,44 +300,38 @@ namespace crnlib
const hash_map_type* m_pTable;
uint m_index;
inline const value_type* get_cur() const
{
inline const value_type* get_cur() const {
CRNLIB_ASSERT(m_pTable && (m_index < m_pTable->m_values.size()));
CRNLIB_ASSERT(m_pTable->get_node_state(m_index) == cStateValid);
return &m_pTable->get_node(m_index);
}
inline void probe()
{
inline void probe() {
CRNLIB_ASSERT(m_pTable);
m_index = m_pTable->find_next(m_index);
}
};
inline const_iterator begin() const
{
inline const_iterator begin() const {
if (!m_num_valid)
return end();
return const_iterator(*this, find_next(-1));
}
inline const_iterator end() const
{
inline const_iterator end() const {
return const_iterator(*this, m_values.size());
}
inline iterator begin()
{
inline iterator begin() {
if (!m_num_valid)
return end();
return iterator(*this, find_next(-1));
}
inline iterator end()
{
inline iterator end() {
return iterator(*this, m_values.size());
}
@@ -379,16 +339,13 @@ namespace crnlib
// insert_result.second will be true if a new key/value was inserted, or false if the key already existed (in which case first will point to the already existing value).
typedef std::pair<iterator, bool> insert_result;
inline insert_result insert(const Key& k, const Value& v = Value())
{
inline insert_result insert(const Key& k, const Value& v = Value()) {
insert_result result;
if (!insert_no_grow(result, k, v))
{
if (!insert_no_grow(result, k, v)) {
grow();
// This must succeed.
if (!insert_no_grow(result, k, v))
{
if (!insert_no_grow(result, k, v)) {
CRNLIB_FAIL("insert() failed");
}
}
@@ -396,23 +353,19 @@ namespace crnlib
return result;
}
inline insert_result insert(const value_type& v)
{
inline insert_result insert(const value_type& v) {
return insert(v.first, v.second);
}
inline const_iterator find(const Key& k) const
{
inline const_iterator find(const Key& k) const {
return const_iterator(*this, find_index(k));
}
inline iterator find(const Key& k)
{
inline iterator find(const Key& k) {
return iterator(*this, find_index(k));
}
inline bool erase(const Key& k)
{
inline bool erase(const Key& k) {
int i = find_index(k);
if (i >= static_cast<int>(m_values.size()))
@@ -424,21 +377,16 @@ namespace crnlib
m_num_valid--;
for ( ; ; )
{
for (;;) {
int r, j = i;
node* pSrc = pDst;
do
{
if (!i)
{
do {
if (!i) {
i = m_values.size() - 1;
pSrc = &get_node(i);
}
else
{
} else {
i--;
pSrc--;
}
@@ -456,8 +404,7 @@ namespace crnlib
}
}
inline void swap(hash_map_type& other)
{
inline void swap(hash_map_type& other) {
m_values.swap(other.m_values);
utils::swap(m_hash_shift, other.m_hash_shift);
utils::swap(m_num_valid, other.m_num_valid);
@@ -467,13 +414,11 @@ namespace crnlib
}
private:
struct node : public value_type
{
struct node : public value_type {
uint8 state;
};
static inline void construct_value_type(value_type* pDst, const Key& k, const Value& v)
{
static inline void construct_value_type(value_type* pDst, const Key& k, const Value& v) {
if (CRNLIB_IS_BITWISE_COPYABLE(Key))
memcpy(&pDst->first, &k, sizeof(Key));
else
@@ -485,14 +430,10 @@ namespace crnlib
scalar_type<Value>::construct(&pDst->second, v);
}
static inline void construct_value_type(value_type* pDst, const value_type* pSrc)
{
if ((CRNLIB_IS_BITWISE_COPYABLE(Key)) && (CRNLIB_IS_BITWISE_COPYABLE(Value)))
{
static inline void construct_value_type(value_type* pDst, const value_type* pSrc) {
if ((CRNLIB_IS_BITWISE_COPYABLE(Key)) && (CRNLIB_IS_BITWISE_COPYABLE(Value))) {
memcpy(pDst, pSrc, sizeof(value_type));
}
else
{
} else {
if (CRNLIB_IS_BITWISE_COPYABLE(Key))
memcpy(&pDst->first, &pSrc->first, sizeof(Key));
else
@@ -505,36 +446,29 @@ namespace crnlib
}
}
static inline void destruct_value_type(value_type* p)
{
static inline void destruct_value_type(value_type* p) {
scalar_type<Key>::destruct(&p->first);
scalar_type<Value>::destruct(&p->second);
}
// Moves *pSrc to *pDst efficiently.
// pDst should NOT be constructed on entry.
static inline void move_node(node* pDst, node* pSrc)
{
static inline void move_node(node* pDst, node* pSrc) {
CRNLIB_ASSERT(!pDst->state);
if (CRNLIB_IS_BITWISE_COPYABLE_OR_MOVABLE(Key) && CRNLIB_IS_BITWISE_COPYABLE_OR_MOVABLE(Value))
{
if (CRNLIB_IS_BITWISE_COPYABLE_OR_MOVABLE(Key) && CRNLIB_IS_BITWISE_COPYABLE_OR_MOVABLE(Value)) {
memcpy(pDst, pSrc, sizeof(node));
}
else
{
} else {
if (CRNLIB_IS_BITWISE_COPYABLE_OR_MOVABLE(Key))
memcpy(&pDst->first, &pSrc->first, sizeof(Key));
else
{
else {
scalar_type<Key>::construct(&pDst->first, pSrc->first);
scalar_type<Key>::destruct(&pSrc->first);
}
if (CRNLIB_IS_BITWISE_COPYABLE_OR_MOVABLE(Value))
memcpy(&pDst->second, &pSrc->second, sizeof(Value));
else
{
else {
scalar_type<Value>::construct(&pDst->second, pSrc->second);
scalar_type<Value>::destruct(&pSrc->second);
}
@@ -545,58 +479,45 @@ namespace crnlib
pSrc->state = cStateInvalid;
}
struct raw_node
{
inline raw_node()
{
struct raw_node {
inline raw_node() {
node* p = reinterpret_cast<node*>(this);
p->state = cStateInvalid;
}
inline ~raw_node()
{
inline ~raw_node() {
node* p = reinterpret_cast<node*>(this);
if (p->state)
hash_map_type::destruct_value_type(p);
}
inline raw_node(const raw_node& other)
{
inline raw_node(const raw_node& other) {
node* pDst = reinterpret_cast<node*>(this);
const node* pSrc = reinterpret_cast<const node*>(&other);
if (pSrc->state)
{
if (pSrc->state) {
hash_map_type::construct_value_type(pDst, pSrc);
pDst->state = cStateValid;
}
else
} else
pDst->state = cStateInvalid;
}
inline raw_node& operator= (const raw_node& rhs)
{
inline raw_node& operator=(const raw_node& rhs) {
if (this == &rhs)
return *this;
node* pDst = reinterpret_cast<node*>(this);
const node* pSrc = reinterpret_cast<const node*>(&rhs);
if (pSrc->state)
{
if (pDst->state)
{
if (pSrc->state) {
if (pDst->state) {
pDst->first = pSrc->first;
pDst->second = pSrc->second;
}
else
{
} else {
hash_map_type::construct_value_type(pDst, pSrc);
pDst->state = cStateValid;
}
}
else if (pDst->state)
{
} else if (pDst->state) {
hash_map_type::destruct_value_type(pDst);
pDst->state = cStateInvalid;
}
@@ -619,8 +540,7 @@ namespace crnlib
uint m_grow_threshold;
inline int hash_key(const Key& k) const
{
inline int hash_key(const Key& k) const {
CRNLIB_ASSERT((1U << (32U - m_hash_shift)) == m_values.size());
uint hash = static_cast<uint>(m_hasher(k));
@@ -632,33 +552,27 @@ namespace crnlib
return hash;
}
inline const node& get_node(uint index) const
{
inline const node& get_node(uint index) const {
return *reinterpret_cast<const node*>(&m_values[index]);
}
inline node& get_node(uint index)
{
inline node& get_node(uint index) {
return *reinterpret_cast<node*>(&m_values[index]);
}
inline state get_node_state(uint index) const
{
inline state get_node_state(uint index) const {
return static_cast<state>(get_node(index).state);
}
inline void set_node_state(uint index, bool valid)
{
inline void set_node_state(uint index, bool valid) {
get_node(index).state = valid;
}
inline void grow()
{
inline void grow() {
rehash(math::maximum<uint>(cMinHashSize, m_values.size() * 2U));
}
inline void rehash(uint new_hash_size)
{
inline void rehash(uint new_hash_size) {
CRNLIB_ASSERT(new_hash_size >= m_num_valid);
CRNLIB_ASSERT(math::is_power_of_2(new_hash_size));
@@ -674,10 +588,8 @@ namespace crnlib
node* pNode = reinterpret_cast<node*>(m_values.begin());
node* pNode_end = pNode + m_values.size();
while (pNode != pNode_end)
{
if (pNode->state)
{
while (pNode != pNode_end) {
if (pNode->state) {
new_map.move_into(pNode);
if (new_map.m_num_valid == m_num_valid)
@@ -695,8 +607,7 @@ namespace crnlib
swap(new_map);
}
inline uint find_next(int index) const
{
inline uint find_next(int index) const {
index++;
if (index >= static_cast<int>(m_values.size()))
@@ -704,8 +615,7 @@ namespace crnlib
const node* pNode = &get_node(index);
for ( ; ; )
{
for (;;) {
if (pNode->state)
break;
@@ -718,29 +628,22 @@ namespace crnlib
return index;
}
inline uint find_index(const Key& k) const
{
if (m_num_valid)
{
inline uint find_index(const Key& k) const {
if (m_num_valid) {
int index = hash_key(k);
const node* pNode = &get_node(index);
if (pNode->state)
{
if (pNode->state) {
if (m_equals(pNode->first, k))
return index;
const int orig_index = index;
for ( ; ; )
{
if (!index)
{
for (;;) {
if (!index) {
index = m_values.size() - 1;
pNode = &get_node(index);
}
else
{
} else {
index--;
pNode--;
}
@@ -760,18 +663,15 @@ namespace crnlib
return m_values.size();
}
inline bool insert_no_grow(insert_result& result, const Key& k, const Value& v = Value())
{
inline bool insert_no_grow(insert_result& result, const Key& k, const Value& v = Value()) {
if (!m_values.size())
return false;
int index = hash_key(k);
node* pNode = &get_node(index);
if (pNode->state)
{
if (m_equals(pNode->first, k))
{
if (pNode->state) {
if (m_equals(pNode->first, k)) {
result.first = iterator(*this, index);
result.second = false;
return true;
@@ -779,15 +679,11 @@ namespace crnlib
const int orig_index = index;
for ( ; ; )
{
if (!index)
{
for (;;) {
if (!index) {
index = m_values.size() - 1;
pNode = &get_node(index);
}
else
{
} else {
index--;
pNode--;
}
@@ -798,8 +694,7 @@ namespace crnlib
if (!pNode->state)
break;
if (m_equals(pNode->first, k))
{
if (m_equals(pNode->first, k)) {
result.first = iterator(*this, index);
result.second = false;
return true;
@@ -823,30 +718,23 @@ namespace crnlib
return true;
}
inline void move_into(node* pNode)
{
inline void move_into(node* pNode) {
int index = hash_key(pNode->first);
node* pDst_node = &get_node(index);
if (pDst_node->state)
{
if (pDst_node->state) {
const int orig_index = index;
for ( ; ; )
{
if (!index)
{
for (;;) {
if (!index) {
index = m_values.size() - 1;
pDst_node = &get_node(index);
}
else
{
} else {
index--;
pDst_node--;
}
if (index == orig_index)
{
if (index == orig_index) {
CRNLIB_ASSERT(false);
return;
}
@@ -863,11 +751,12 @@ namespace crnlib
};
template <typename Key, typename Value, typename Hasher, typename Equals>
struct bitwise_movable< hash_map<Key, Value, Hasher, Equals> > { enum { cFlag = true }; };
struct bitwise_movable<hash_map<Key, Value, Hasher, Equals> > {
enum { cFlag = true };
};
template <typename Key, typename Value, typename Hasher, typename Equals>
inline void swap(hash_map<Key, Value, Hasher, Equals>& a, hash_map<Key, Value, Hasher, Equals>& b)
{
inline void swap(hash_map<Key, Value, Hasher, Equals>& a, hash_map<Key, Value, Hasher, Equals>& b) {
a.swap(b);
}
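Note on the probing scheme visible in `find_index`, `insert_no_grow`, and `move_into`: each appears to start at the hashed slot and walk backward through the table, wrapping from index 0 to the last slot, until it finds an empty slot, a matching key, or returns to the starting index. The sketch below reproduces only that probe order in a miniature table. `tiny_set` and its members are hypothetical names, and the mask-based `home()` is an assumption standing in for the real `hash_key()`, whose body is not shown in this diff.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical miniature table (not crnlib's hash_map): power-of-two slot
// count, a per-slot valid flag, backward linear probing with wraparound.
struct tiny_set {
  struct slot {
    uint32_t key;
    bool valid;
  };
  std::vector<slot> slots;

  explicit tiny_set(uint32_t size_pow2) : slots(size_pow2, slot{0, false}) {}

  uint32_t table_size() const { return static_cast<uint32_t>(slots.size()); }

  // Assumption: a simple mask stands in for the real hash function.
  uint32_t home(uint32_t key) const { return key & (table_size() - 1); }

  // Returns the slot index holding key, or table_size() when absent.
  uint32_t find_index(uint32_t key) const {
    uint32_t index = home(key);
    const uint32_t orig_index = index;
    for (;;) {
      if (!slots[index].valid)
        return table_size();
      if (slots[index].key == key)
        return index;
      // Step backward, wrapping from slot 0 to the last slot.
      index = index ? index - 1 : table_size() - 1;
      if (index == orig_index)
        return table_size();
    }
  }

  // Inserts key without growing; returns false when every slot is taken.
  bool insert_no_grow(uint32_t key) {
    uint32_t index = home(key);
    const uint32_t orig_index = index;
    for (;;) {
      if (!slots[index].valid) {
        slots[index] = slot{key, true};
        return true;
      }
      if (slots[index].key == key)
        return true;
      index = index ? index - 1 : table_size() - 1;
      if (index == orig_index)
        return false;
    }
  }
};
```

With four slots, keys 1, 5, and 9 all hash to slot 1 under the mask, so they land in slots 1, 0, and 3 in that order, which shows the wraparound step.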
+19 -21
@@ -2,15 +2,18 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
#define CRNLIB_NO_COPY_OR_ASSIGNMENT_OP(c) c(const c&); c& operator= (const c&);
#define CRNLIB_NO_HEAP_ALLOC() private: static void* operator new(size_t); static void* operator new[](size_t);
#define CRNLIB_NO_COPY_OR_ASSIGNMENT_OP(c) \
c(const c&); \
c& operator=(const c&);
#define CRNLIB_NO_HEAP_ALLOC() \
private: \
static void* operator new(size_t); \
static void* operator new[](size_t);
namespace crnlib
{
namespace helpers
{
template<typename T> struct rel_ops
{
namespace crnlib {
namespace helpers {
template <typename T>
struct rel_ops {
friend bool operator!=(const T& x, const T& y) { return (!(x == y)); }
friend bool operator>(const T& x, const T& y) { return (y < x); }
friend bool operator<=(const T& x, const T& y) { return (!(y < x)); }
@@ -18,42 +21,37 @@ namespace crnlib
};
template <typename T>
inline T* construct(T* p)
{
inline T* construct(T* p) {
return new (static_cast<void*>(p)) T;
}
template <typename T, typename U>
inline T* construct(T* p, const U& init)
{
inline T* construct(T* p, const U& init) {
return new (static_cast<void*>(p)) T(init);
}
template <typename T>
inline void construct_array(T* p, uint n)
{
inline void construct_array(T* p, uint n) {
T* q = p + n;
for (; p != q; ++p)
new (static_cast<void*>(p)) T;
}
template <typename T, typename U>
inline void construct_array(T* p, uint n, const U& init)
{
inline void construct_array(T* p, uint n, const U& init) {
T* q = p + n;
for (; p != q; ++p)
new (static_cast<void*>(p)) T(init);
}
template <typename T>
inline void destruct(T* p)
{
p;
inline void destruct(T* p) {
(void)p;
p->~T();
}
template <typename T> inline void destruct_array(T* p, uint n)
{
template <typename T>
inline void destruct_array(T* p, uint n) {
T* q = p + n;
for (; p != q; ++p)
p->~T();
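Note on the `construct` and `destruct` helpers in this file: they wrap placement new and an explicit destructor call, which lets a container manage object lifetime inside storage it allocated separately. The sketch below restates two of the helpers in a hypothetical `sketch` namespace and exercises them on a stack buffer; `demo_length` is an illustrative name, not a crnlib function.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

namespace sketch {

// Same shape as the helpers in the diff: build a T in place from init.
template <typename T, typename U>
inline T* construct(T* p, const U& init) {
  return new (static_cast<void*>(p)) T(init);
}

// Runs the destructor without releasing the underlying storage.
template <typename T>
inline void destruct(T* p) {
  p->~T();
}

// Hypothetical usage: construct a std::string in raw, suitably aligned
// storage, read its length, then destroy it explicitly.
inline std::size_t demo_length() {
  alignas(std::string) unsigned char storage[sizeof(std::string)];
  std::string* s = construct(reinterpret_cast<std::string*>(storage), "crunch");
  const std::size_t n = s->size();
  destruct(s);
  return n;
}

}  // namespace sketch
```

The storage must be correctly aligned for `T` and every `construct` needs a matching `destruct`; the buffer itself is never passed to `delete`.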
+63 -84
@@ -3,22 +3,18 @@
#include "crn_core.h"
#include "crn_huffman_codes.h"
namespace crnlib
{
struct sym_freq
{
namespace crnlib {
struct sym_freq {
uint m_freq;
uint16 m_left;
uint16 m_right;
inline bool operator< (const sym_freq& other) const
{
inline bool operator<(const sym_freq& other) const {
return m_freq > other.m_freq;
}
};
static inline sym_freq* radix_sort_syms(uint num_syms, sym_freq* syms0, sym_freq* syms1)
{
static inline sym_freq* radix_sort_syms(uint num_syms, sym_freq* syms0, sym_freq* syms1) {
const uint cMaxPasses = 2;
uint hist[256 * cMaxPasses];
@@ -27,8 +23,7 @@ namespace crnlib
sym_freq* p = syms0;
sym_freq* q = syms0 + (num_syms >> 1) * 2;
for ( ; p != q; p += 2)
{
for (; p != q; p += 2) {
const uint freq0 = p[0].m_freq;
const uint freq1 = p[1].m_freq;
@@ -39,8 +34,7 @@ namespace crnlib
hist[256 + ((freq1 >> 8) & 0xFF)]++;
}
if (num_syms & 1)
{
if (num_syms & 1) {
const uint freq = p->m_freq;
hist[freq & 0xFF]++;
@@ -50,15 +44,13 @@ namespace crnlib
sym_freq* pCur_syms = syms0;
sym_freq* pNew_syms = syms1;
for (uint pass = 0; pass < cMaxPasses; pass++)
{
for (uint pass = 0; pass < cMaxPasses; pass++) {
const uint* pHist = &hist[pass << 8];
uint offsets[256];
uint cur_ofs = 0;
for (uint i = 0; i < 256; i += 2)
{
for (uint i = 0; i < 256; i += 2) {
offsets[i] = cur_ofs;
cur_ofs += pHist[i];
@@ -71,13 +63,11 @@ namespace crnlib
sym_freq* p = pCur_syms;
sym_freq* q = pCur_syms + (num_syms >> 1) * 2;
for ( ; p != q; p += 2)
{
for (; p != q; p += 2) {
uint c0 = p[0].m_freq;
uint c1 = p[1].m_freq;
if (pass)
{
if (pass) {
c0 >>= 8;
c1 >>= 8;
}
@@ -85,17 +75,14 @@ namespace crnlib
c0 &= 0xFF;
c1 &= 0xFF;
if (c0 == c1)
{
if (c0 == c1) {
uint dst_offset0 = offsets[c0];
offsets[c0] = dst_offset0 + 2;
pNew_syms[dst_offset0] = p[0];
pNew_syms[dst_offset0 + 1] = p[1];
}
else
{
} else {
uint dst_offset0 = offsets[c0]++;
uint dst_offset1 = offsets[c1]++;
@@ -104,8 +91,7 @@ namespace crnlib
}
}
if (num_syms & 1)
{
if (num_syms & 1) {
uint c = ((p->m_freq) >> pass_shift) & 0xFF;
uint dst_offset = offsets[c];
@@ -121,8 +107,7 @@ namespace crnlib
#ifdef CRNLIB_ASSERTS_ENABLED
uint prev_freq = 0;
for (uint i = 0; i < num_syms; i++)
{
for (uint i = 0; i < num_syms; i++) {
CRNLIB_ASSERT(!(pCur_syms[i].m_freq < prev_freq));
prev_freq = pCur_syms[i].m_freq;
}
@@ -131,8 +116,7 @@ namespace crnlib
return pCur_syms;
}
struct huffman_work_tables
{
struct huffman_work_tables {
enum { cMaxInternalNodes = cHuffmanMaxSupportedSyms };
sym_freq syms0[cHuffmanMaxSupportedSyms + 1 + cMaxInternalNodes];
@@ -141,13 +125,11 @@ namespace crnlib
uint16 queue[cMaxInternalNodes];
};
void* create_generate_huffman_codes_tables()
{
void* create_generate_huffman_codes_tables() {
return crnlib_new<huffman_work_tables>();
}
void free_generate_huffman_codes_tables(void* p)
{
void free_generate_huffman_codes_tables(void* p) {
crnlib_delete(static_cast<huffman_work_tables*>(p));
}
@@ -166,21 +148,30 @@ namespace crnlib
int dpth; /* current depth of leaves */
/* check for pathological cases */
if (n==0) { return; }
if (n==1) { A[0] = 0; return; }
if (n == 0) {
return;
}
if (n == 1) {
A[0] = 0;
return;
}
/* first pass, left to right, setting parent pointers */
A[0] += A[1]; root = 0; leaf = 2;
A[0] += A[1];
root = 0;
leaf = 2;
for (next = 1; next < n - 1; next++) {
/* select first item for a pairing */
if (leaf >= n || A[root] < A[leaf]) {
A[next] = A[root]; A[root++] = next;
A[next] = A[root];
A[root++] = next;
} else
A[next] = A[leaf++];
/* add on the second item */
if (leaf >= n || (root < next && A[root] < A[leaf])) {
A[next] += A[root]; A[root++] = next;
A[next] += A[root];
A[root++] = next;
} else
A[next] += A[leaf++];
}
@@ -191,38 +182,43 @@ namespace crnlib
A[next] = A[A[next]] + 1;
/* third pass, right to left, setting leaf depths */
avbl = 1; used = dpth = 0; root = n-2; next = n-1;
avbl = 1;
used = dpth = 0;
root = n - 2;
next = n - 1;
while (avbl > 0) {
while (root >= 0 && A[root] == dpth) {
used++; root--;
used++;
root--;
}
while (avbl > used) {
A[next--] = dpth; avbl--;
A[next--] = dpth;
avbl--;
}
avbl = 2*used; dpth++; used = 0;
avbl = 2 * used;
dpth++;
used = 0;
}
}
#endif
bool generate_huffman_codes(void* pContext, uint num_syms, const uint16* pFreq, uint8* pCodesizes, uint& max_code_size, uint& total_freq_ret)
{
bool generate_huffman_codes(void* pContext, uint num_syms, const uint16* pFreq, uint8* pCodesizes, uint& max_code_size, uint& total_freq_ret) {
if ((!num_syms) || (num_syms > cHuffmanMaxSupportedSyms))
return false;
huffman_work_tables& state = *static_cast<huffman_work_tables*>(pContext);;
huffman_work_tables& state = *static_cast<huffman_work_tables*>(pContext);
;
uint max_freq = 0;
uint total_freq = 0;
uint num_used_syms = 0;
for (uint i = 0; i < num_syms; i++)
{
for (uint i = 0; i < num_syms; i++) {
uint freq = pFreq[i];
if (!freq)
pCodesizes[i] = 0;
else
{
else {
total_freq += freq;
max_freq = math::maximum(max_freq, freq);
@@ -236,8 +232,7 @@ namespace crnlib
total_freq_ret = total_freq;
if (num_used_syms == 1)
{
if (num_used_syms == 1) {
pCodesizes[state.syms0[0].m_left] = 1;
return true;
}
@@ -252,8 +247,7 @@ namespace crnlib
calculate_minimum_redundancy(x, num_used_syms);
uint max_len = 0;
for (uint i = 0; i < num_used_syms; i++)
{
for (uint i = 0; i < num_used_syms; i++) {
uint len = x[i];
max_len = math::maximum(len, max_len);
pCodesizes[state.syms0[i].m_left] = static_cast<uint8>(len);
@@ -275,32 +269,27 @@ namespace crnlib
uint next_lowest_sym = 0;
uint num_nodes_remaining = num_used_syms;
do
{
do {
uint left_freq = syms[next_lowest_sym].m_freq;
uint left_child = next_lowest_sym;
if ((queue_end > queue_front) && (syms[state.queue[queue_front]].m_freq < left_freq))
{
if ((queue_end > queue_front) && (syms[state.queue[queue_front]].m_freq < left_freq)) {
left_child = state.queue[queue_front];
left_freq = syms[left_child].m_freq;
queue_front++;
}
else
} else
next_lowest_sym++;
uint right_freq = syms[next_lowest_sym].m_freq;
uint right_child = next_lowest_sym;
if ((queue_end > queue_front) && (syms[state.queue[queue_front]].m_freq < right_freq))
{
if ((queue_end > queue_front) && (syms[state.queue[queue_front]].m_freq < right_freq)) {
right_child = state.queue[queue_front];
right_freq = syms[right_child].m_freq;
queue_front++;
}
else
} else
next_lowest_sym++;
const uint internal_node_index = next_internal_node;
@@ -330,8 +319,7 @@ namespace crnlib
uint max_level = 0;
for ( ; ; )
{
for (;;) {
uint level = cur_node_index >> 16;
uint node_index = cur_node_index & 0xFFFF;
@@ -340,36 +328,28 @@ namespace crnlib
uint next_level = (cur_node_index + 0x10000) & 0xFFFF0000;
if (left_child < num_used_syms)
{
if (left_child < num_used_syms) {
max_level = math::maximum(max_level, level);
pCodesizes[syms[left_child].m_left] = static_cast<uint8>(level + 1);
if (right_child < num_used_syms)
{
if (right_child < num_used_syms) {
pCodesizes[syms[right_child].m_left] = static_cast<uint8>(level + 1);
if (pStack == pStack_top) break;
if (pStack == pStack_top)
break;
cur_node_index = *--pStack;
}
else
{
} else {
cur_node_index = next_level | right_child;
}
}
else
{
if (right_child < num_used_syms)
{
} else {
if (right_child < num_used_syms) {
max_level = math::maximum(max_level, level);
pCodesizes[syms[right_child].m_left] = static_cast<uint8>(level + 1);
cur_node_index = next_level | left_child;
}
else
{
} else {
*pStack++ = next_level | left_child;
cur_node_index = next_level | right_child;
@@ -384,4 +364,3 @@ namespace crnlib
}
} // namespace crnlib
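Note on the code-length computation in this file: `generate_huffman_codes` sorts the used symbols by frequency and then merges nodes using a queue of internal nodes alongside the sorted leaves. The sketch below is a standalone version of that general two-queue technique, not the author's implementation. `huffman_code_lengths` is a hypothetical name, it assumes every input frequency is nonzero, and it keeps explicit parent links rather than the packed stack walk used in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch (not crnlib's generate_huffman_codes): Huffman code
// lengths via the two-queue merge. Assumes all frequencies are nonzero.
std::vector<unsigned> huffman_code_lengths(const std::vector<uint32_t>& freqs) {
  const std::size_t n = freqs.size();
  std::vector<unsigned> lengths(n, 0);
  if (n == 0)
    return lengths;
  if (n == 1) {
    lengths[0] = 1;  // A lone symbol still gets a 1-bit code, as in the diff.
    return lengths;
  }

  // order[i] is the original index of the i-th smallest frequency.
  std::vector<std::size_t> order(n);
  for (std::size_t i = 0; i < n; i++)
    order[i] = i;
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) { return freqs[a] < freqs[b]; });

  // Nodes 0..n-1 are leaves in sorted order; n..2n-2 are internal nodes.
  std::vector<uint64_t> weight(2 * n - 1, 0);
  std::vector<std::size_t> parent(2 * n - 1, 0);
  for (std::size_t i = 0; i < n; i++)
    weight[i] = freqs[order[i]];

  std::size_t leaf = 0, internal = n, next = n;
  // Takes the lighter of the next unmerged leaf and the next unmerged
  // internal node; internal nodes are created in nondecreasing weight order.
  auto pop_min = [&]() -> std::size_t {
    if (leaf < n && (internal >= next || weight[leaf] <= weight[internal]))
      return leaf++;
    return internal++;
  };

  for (; next < 2 * n - 1; ++next) {
    const std::size_t a = pop_min();
    const std::size_t b = pop_min();
    weight[next] = weight[a] + weight[b];
    parent[a] = parent[b] = next;
  }

  // The root is node 2n-2 at depth 0; a parent always has a higher index
  // than its children, so a descending pass sees parents first.
  std::vector<unsigned> depth(2 * n - 1, 0);
  for (std::size_t i = 2 * n - 2; i-- > 0;)
    depth[i] = depth[parent[i]] + 1;

  for (std::size_t i = 0; i < n; i++)
    lengths[order[i]] = depth[i];
  return lengths;
}
```

Because both queues are already in ascending order, each merge step needs only two comparisons, so the merge is linear once the symbols are sorted.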
+1 -2
@@ -2,8 +2,7 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
namespace crnlib
{
namespace crnlib {
const uint cHuffmanMaxSupportedSyms = 8192;
void* create_generate_huffman_codes_tables();
+108 -196
@@ -6,30 +6,26 @@
#include "crn_pixel_format.h"
#include "crn_rect.h"
namespace crnlib
{
namespace crnlib {
template <typename color_type>
class image
{
class image {
public:
typedef color_type color_t;
typedef crnlib::vector<color_type> pixel_buf_t;
image() :
m_width(0),
image()
: m_width(0),
m_height(0),
m_pitch(0),
m_total(0),
m_comp_flags(pixel_format_helpers::cDefaultCompFlags),
m_pPixels(NULL)
{
m_pPixels(NULL) {
}
// pitch is in PIXELS, not bytes.
image(uint width, uint height, uint pitch = UINT_MAX, const color_type& background = color_type::make_black(), uint flags = pixel_format_helpers::cDefaultCompFlags) :
m_comp_flags(flags)
{
image(uint width, uint height, uint pitch = UINT_MAX, const color_type& background = color_type::make_black(), uint flags = pixel_format_helpers::cDefaultCompFlags)
: m_comp_flags(flags) {
CRNLIB_ASSERT((width > 0) && (height > 0));
if (pitch == UINT_MAX)
pitch = width;
@@ -47,37 +43,29 @@ namespace crnlib
}
// pitch is in PIXELS, not bytes.
image(color_type* pPixels, uint width, uint height, uint pitch = UINT_MAX, uint flags = pixel_format_helpers::cDefaultCompFlags)
{
image(color_type* pPixels, uint width, uint height, uint pitch = UINT_MAX, uint flags = pixel_format_helpers::cDefaultCompFlags) {
alias(pPixels, width, height, pitch, flags);
}
image& operator= (const image& other)
{
image& operator=(const image& other) {
if (this == &other)
return *this;
if (other.m_pixel_buf.empty())
{
if (other.m_pixel_buf.empty()) {
// This doesn't look very safe - let's make a new instance.
//m_pixel_buf.clear();
//m_pPixels = other.m_pPixels;
const uint total_pixels = other.m_pitch * other.m_height;
if ((total_pixels) && (other.m_pPixels))
{
if ((total_pixels) && (other.m_pPixels)) {
m_pixel_buf.resize(total_pixels);
m_pixel_buf.insert(0, other.m_pPixels, m_pixel_buf.size());
m_pPixels = &m_pixel_buf.front();
}
else
{
} else {
m_pixel_buf.clear();
m_pPixels = NULL;
}
}
else
{
} else {
m_pixel_buf = other.m_pixel_buf;
m_pPixels = &m_pixel_buf.front();
}
@@ -91,15 +79,13 @@ namespace crnlib
return *this;
}
image(const image& other) :
m_width(0), m_height(0), m_pitch(0), m_total(0), m_comp_flags(pixel_format_helpers::cDefaultCompFlags), m_pPixels(NULL)
{
image(const image& other)
: m_width(0), m_height(0), m_pitch(0), m_total(0), m_comp_flags(pixel_format_helpers::cDefaultCompFlags), m_pPixels(NULL) {
*this = other;
}
// pitch is in PIXELS, not bytes.
void alias(color_type* pPixels, uint width, uint height, uint pitch = UINT_MAX, uint flags = pixel_format_helpers::cDefaultCompFlags)
{
void alias(color_type* pPixels, uint width, uint height, uint pitch = UINT_MAX, uint flags = pixel_format_helpers::cDefaultCompFlags) {
m_pixel_buf.clear();
m_pPixels = pPixels;
@@ -112,19 +98,16 @@ namespace crnlib
}
// pitch is in PIXELS, not bytes.
bool grant_ownership(color_type* pPixels, uint width, uint height, uint pitch = UINT_MAX, uint flags = pixel_format_helpers::cDefaultCompFlags)
{
bool grant_ownership(color_type* pPixels, uint width, uint height, uint pitch = UINT_MAX, uint flags = pixel_format_helpers::cDefaultCompFlags) {
if (pitch == UINT_MAX)
pitch = width;
if ((!pPixels) || (!width) || (!height) || (pitch < width))
{
if ((!pPixels) || (!width) || (!height) || (pitch < width)) {
CRNLIB_ASSERT(0);
return false;
}
if (pPixels == get_ptr())
{
if (pPixels == get_ptr()) {
CRNLIB_ASSERT(0);
return false;
}
@@ -145,8 +128,7 @@ namespace crnlib
return true;
}
void clear()
{
void clear() {
m_pPixels = NULL;
m_pixel_buf.clear();
m_width = 0;
@@ -162,8 +144,14 @@ namespace crnlib
inline void set_comp_flags(pixel_format_helpers::component_flags new_flags) { m_comp_flags = new_flags; }
inline void reset_comp_flags() { m_comp_flags = pixel_format_helpers::cDefaultCompFlags; }
inline bool is_component_valid(uint index) const { CRNLIB_ASSERT(index < 4U); return utils::is_flag_set(m_comp_flags, index); }
inline void set_component_valid(uint index, bool state) { CRNLIB_ASSERT(index < 4U); utils::set_flag(m_comp_flags, index, state); }
inline bool is_component_valid(uint index) const {
CRNLIB_ASSERT(index < 4U);
return utils::is_flag_set(m_comp_flags, index);
}
inline void set_component_valid(uint index, bool state) {
CRNLIB_ASSERT(index < 4U);
utils::set_flag(m_comp_flags, index, state);
}
inline bool has_rgb() const { return is_component_valid(0) || is_component_valid(1) || is_component_valid(2); }
inline bool has_alpha() const { return is_component_valid(3); }
@@ -171,19 +159,15 @@ namespace crnlib
inline bool is_grayscale() const { return utils::is_bit_set(m_comp_flags, pixel_format_helpers::cCompFlagGrayscale); }
inline void set_grayscale(bool state) { utils::set_bit(m_comp_flags, pixel_format_helpers::cCompFlagGrayscale, state); }
void set_all(const color_type& c)
{
void set_all(const color_type& c) {
for (uint i = 0; i < m_total; i++)
m_pPixels[i] = c;
}
void flip_x()
{
void flip_x() {
const uint half_width = m_width / 2;
for (uint y = 0; y < m_height; y++)
{
for (uint x = 0; x < half_width; x++)
{
for (uint y = 0; y < m_height; y++) {
for (uint x = 0; x < half_width; x++) {
color_type c((*this)(x, y));
(*this)(x, y) = (*this)(m_width - 1 - x, y);
(*this)(m_width - 1 - x, y) = c;
@@ -191,13 +175,10 @@ namespace crnlib
}
}
void flip_y()
{
void flip_y() {
const uint half_height = m_height / 2;
for (uint y = 0; y < half_height; y++)
{
for (uint x = 0; x < m_width; x++)
{
for (uint y = 0; y < half_height; y++) {
for (uint x = 0; x < m_width; x++) {
color_type c((*this)(x, y));
(*this)(x, y) = (*this)(x, m_height - 1 - y);
(*this)(x, m_height - 1 - y) = c;
@@ -205,11 +186,9 @@ namespace crnlib
}
}
void convert_to_grayscale()
{
void convert_to_grayscale() {
for (uint y = 0; y < m_height; y++)
for (uint x = 0; x < m_width; x++)
{
for (uint x = 0; x < m_width; x++) {
color_type c((*this)(x, y));
typename color_type::component_t l = static_cast<typename color_type::component_t>(c.get_luma());
c.r = l;
@@ -221,22 +200,18 @@ namespace crnlib
set_grayscale(true);
}
void swizzle(uint r, uint g, uint b, uint a)
{
void swizzle(uint r, uint g, uint b, uint a) {
for (uint y = 0; y < m_height; y++)
for (uint x = 0; x < m_width; x++)
{
for (uint x = 0; x < m_width; x++) {
const color_type& c = (*this)(x, y);
(*this)(x, y) = color_type(c[r], c[g], c[b], c[a]);
}
}
void set_alpha_to_luma()
{
void set_alpha_to_luma() {
for (uint y = 0; y < m_height; y++)
for (uint x = 0; x < m_width; x++)
{
for (uint x = 0; x < m_width; x++) {
color_type c((*this)(x, y));
typename color_type::component_t l = static_cast<typename color_type::component_t>(c.get_luma());
c.a = l;
@@ -246,32 +221,24 @@ namespace crnlib
set_component_valid(3, true);
}
bool extract_block(color_type* pDst, uint x, uint y, uint w, uint h, bool flip_xy = false) const
{
if ((x >= m_width) || (y >= m_height))
{
bool extract_block(color_type* pDst, uint x, uint y, uint w, uint h, bool flip_xy = false) const {
if ((x >= m_width) || (y >= m_height)) {
CRNLIB_ASSERT(0);
return false;
}
if (flip_xy)
{
if (flip_xy) {
for (uint y_ofs = 0; y_ofs < h; y_ofs++)
for (uint x_ofs = 0; x_ofs < w; x_ofs++)
pDst[x_ofs * h + y_ofs] = get_clamped(x_ofs + x, y_ofs + y); // 5/4/12 - this was incorrectly x_ofs * 4
}
else if (((x + w) > m_width) || ((y + h) > m_height))
{
} else if (((x + w) > m_width) || ((y + h) > m_height)) {
for (uint y_ofs = 0; y_ofs < h; y_ofs++)
for (uint x_ofs = 0; x_ofs < w; x_ofs++)
*pDst++ = get_clamped(x_ofs + x, y_ofs + y);
}
else
{
} else {
const color_type* pSrc = get_scanline(y) + x;
for (uint i = h; i; i--)
{
for (uint i = h; i; i--) {
memcpy(pDst, pSrc, w * sizeof(color_type));
pDst += w;
@@ -283,18 +250,15 @@ namespace crnlib
}
// No clipping!
void unclipped_fill_box(uint x, uint y, uint w, uint h, const color_type& c)
{
if (((x + w) > m_width) || ((y + h) > m_height))
{
void unclipped_fill_box(uint x, uint y, uint w, uint h, const color_type& c) {
if (((x + w) > m_width) || ((y + h) > m_height)) {
CRNLIB_ASSERT(0);
return;
}
color_type* p = get_scanline(y) + x;
for (uint i = h; i; i--)
{
for (uint i = h; i; i--) {
color_type* q = p;
for (uint j = w; j; j--)
*q++ = c;
@@ -302,8 +266,7 @@ namespace crnlib
}
}
void draw_rect(int x, int y, uint width, uint height, const color_type& c)
{
void draw_rect(int x, int y, uint width, uint height, const color_type& c) {
draw_line(x, y, x + width - 1, y, c);
draw_line(x, y, x, y + height - 1, c);
draw_line(x + width - 1, y, x + width - 1, y + height - 1, c);
@@ -311,22 +274,18 @@ namespace crnlib
}
// No clipping!
bool unclipped_blit(uint src_x, uint src_y, uint src_w, uint src_h, uint dst_x, uint dst_y, const image& src)
{
if ((!is_valid()) || (!src.is_valid()))
{
bool unclipped_blit(uint src_x, uint src_y, uint src_w, uint src_h, uint dst_x, uint dst_y, const image& src) {
if ((!is_valid()) || (!src.is_valid())) {
CRNLIB_ASSERT(0);
return false;
}
if ( ((src_x + src_w) > src.get_width()) || ((src_y + src_h) > src.get_height()) )
{
if (((src_x + src_w) > src.get_width()) || ((src_y + src_h) > src.get_height())) {
CRNLIB_ASSERT(0);
return false;
}
if ( ((dst_x + src_w) > get_width()) || ((dst_y + src_h) > get_height()) )
{
if (((dst_x + src_w) > get_width()) || ((dst_y + src_h) > get_height())) {
CRNLIB_ASSERT(0);
return false;
}
@@ -335,8 +294,7 @@ namespace crnlib
color_type* pD = &(*this)(dst_x, dst_y);
const uint bytes_to_copy = src_w * sizeof(color_type);
for (uint i = src_h; i; i--)
{
for (uint i = src_h; i; i--) {
memcpy(pD, pS, bytes_to_copy);
pS += src.get_pitch();
@@ -347,10 +305,8 @@ namespace crnlib
}
// With clipping.
bool blit(int dst_x, int dst_y, const image& src)
{
if ((!is_valid()) || (!src.is_valid()))
{
bool blit(int dst_x, int dst_y, const image& src) {
if ((!is_valid()) || (!src.is_valid())) {
CRNLIB_ASSERT(0);
return false;
}
@@ -358,16 +314,14 @@ namespace crnlib
int src_x = 0;
int src_y = 0;
if (dst_x < 0)
{
if (dst_x < 0) {
src_x = -dst_x;
if (src_x >= static_cast<int>(src.get_width()))
return false;
dst_x = 0;
}
if (dst_y < 0)
{
if (dst_y < 0) {
src_y = -dst_y;
if (src_y >= static_cast<int>(src.get_height()))
return false;
@@ -381,17 +335,15 @@ namespace crnlib
uint height = math::minimum(m_height - dst_y, src.get_height() - src_y);
bool success = unclipped_blit(src_x, src_y, width, height, dst_x, dst_y, src);
success;
(void)success;
CRNLIB_ASSERT(success);
return true;
}
// With clipping.
bool blit(int src_x, int src_y, int src_w, int src_h, int dst_x, int dst_y, const image& src)
{
if ((!is_valid()) || (!src.is_valid()))
{
bool blit(int src_x, int src_y, int src_w, int src_h, int dst_x, int dst_y, const image& src) {
if ((!is_valid()) || (!src.is_valid())) {
CRNLIB_ASSERT(0);
return false;
}
@@ -408,23 +360,21 @@ namespace crnlib
src_rect.get_left(), src_rect.get_top(),
math::minimum(src_rect.get_width(), dst_rect.get_width()), math::minimum(src_rect.get_height(), dst_rect.get_height()),
dst_rect.get_left(), dst_rect.get_top(), src);
success;
(void)success;
CRNLIB_ASSERT(success);
return true;
}
// In-place resize of image dimensions (cropping).
bool resize(uint new_width, uint new_height, uint new_pitch = UINT_MAX, const color_type background = color_type::make_black())
{
bool resize(uint new_width, uint new_height, uint new_pitch = UINT_MAX, const color_type background = color_type::make_black()) {
if (new_pitch == UINT_MAX)
new_pitch = new_width;
if ((new_width == m_width) && (new_height == m_height) && (new_pitch == m_pitch))
return true;
if ((!new_width) || (!new_height) || (!new_pitch))
{
if ((!new_width) || (!new_height) || (!new_pitch)) {
clear();
return false;
}
@@ -432,16 +382,13 @@ namespace crnlib
pixel_buf_t existing_pixels;
existing_pixels.swap(m_pixel_buf);
if (!m_pixel_buf.try_resize(new_height * new_pitch))
{
if (!m_pixel_buf.try_resize(new_height * new_pitch)) {
clear();
return false;
}
for (uint y = 0; y < new_height; y++)
{
for (uint x = 0; x < new_width; x++)
{
for (uint y = 0; y < new_height; y++) {
for (uint x = 0; x < new_width; x++) {
if ((x < m_width) && (y < m_height))
m_pixel_buf[x + y * new_pitch] = existing_pixels[x + y * m_pitch];
else
@@ -479,26 +426,22 @@ namespace crnlib
inline const color_type* get_pixels() const { return m_pPixels; }
inline color_type* get_pixels() { return m_pPixels; }
inline const color_type& operator() (uint x, uint y) const
{
inline const color_type& operator()(uint x, uint y) const {
CRNLIB_ASSERT((x < m_width) && (y < m_height));
return m_pPixels[x + y * m_pitch];
}
inline color_type& operator() (uint x, uint y)
{
inline color_type& operator()(uint x, uint y) {
CRNLIB_ASSERT((x < m_width) && (y < m_height));
return m_pPixels[x + y * m_pitch];
}
inline const color_type& get_unclamped(uint x, uint y) const
{
inline const color_type& get_unclamped(uint x, uint y) const {
CRNLIB_ASSERT((x < m_width) && (y < m_height));
return m_pPixels[x + y * m_pitch];
}
inline const color_type& get_clamped(int x, int y) const
{
inline const color_type& get_clamped(int x, int y) const {
x = math::clamp<int>(x, 0, m_width - 1);
y = math::clamp<int>(y, 0, m_height - 1);
return m_pPixels[x + y * m_pitch];
@@ -506,8 +449,7 @@ namespace crnlib
// Sample image with bilinear filtering.
// (x,y) - Continuous coordinates, where pixel centers are at (.5,.5), valid image coords are [0,width] and [0,height].
void get_filtered(float x, float y, color_type& result) const
{
void get_filtered(float x, float y, color_type& result) const {
x -= .5f;
y -= .5f;
@@ -521,8 +463,7 @@ namespace crnlib
color_type c(get_clamped(ix, iy + 1));
color_type d(get_clamped(ix + 1, iy + 1));
for (uint i = 0; i < 4; i++)
{
for (uint i = 0; i < 4; i++) {
double top = math::lerp<double>(a[i], b[i], wx);
double bot = math::lerp<double>(c[i], d[i], wx);
double m = math::lerp<double>(top, bot, wy);
@@ -534,8 +475,7 @@ namespace crnlib
}
}
void get_filtered(float x, float y, vec4F& result) const
{
void get_filtered(float x, float y, vec4F& result) const {
x -= .5f;
y -= .5f;
@@ -549,8 +489,7 @@ namespace crnlib
color_type c(get_clamped(ix, iy + 1));
color_type d(get_clamped(ix + 1, iy + 1));
for (uint i = 0; i < 4; i++)
{
for (uint i = 0; i < 4; i++) {
float top = math::lerp<float>(a[i], b[i], wx);
float bot = math::lerp<float>(c[i], d[i], wx);
float m = math::lerp<float>(top, bot, wy);
@@ -559,44 +498,37 @@ namespace crnlib
}
}
inline void set_pixel_unclipped(uint x, uint y, const color_type& c)
{
inline void set_pixel_unclipped(uint x, uint y, const color_type& c) {
CRNLIB_ASSERT((x < m_width) && (y < m_height));
m_pPixels[x + y * m_pitch] = c;
}
inline void set_pixel_clipped(int x, int y, const color_type& c)
{
inline void set_pixel_clipped(int x, int y, const color_type& c) {
if ((static_cast<uint>(x) >= m_width) || (static_cast<uint>(y) >= m_height))
return;
m_pPixels[x + y * m_pitch] = c;
}
inline const color_type* get_scanline(uint y) const
{
inline const color_type* get_scanline(uint y) const {
CRNLIB_ASSERT(y < m_height);
return &m_pPixels[y * m_pitch];
}
inline color_type* get_scanline(uint y)
{
inline color_type* get_scanline(uint y) {
CRNLIB_ASSERT(y < m_height);
return &m_pPixels[y * m_pitch];
}
inline const color_type* get_ptr() const
{
inline const color_type* get_ptr() const {
return m_pPixels;
}
inline color_type* get_ptr()
{
inline color_type* get_ptr() {
return m_pPixels;
}
inline void swap(image& other)
{
inline void swap(image& other) {
utils::swap(m_width, other.m_width);
utils::swap(m_height, other.m_height);
utils::swap(m_pitch, other.m_pitch);
@@ -606,50 +538,35 @@ namespace crnlib
m_pixel_buf.swap(other.m_pixel_buf);
}
void draw_line(int xs, int ys, int xe, int ye, const color_type& color)
{
if (xs > xe)
{
void draw_line(int xs, int ys, int xe, int ye, const color_type& color) {
if (xs > xe) {
utils::swap(xs, xe);
utils::swap(ys, ye);
}
int dx = xe - xs, dy = ye - ys;
if (!dx)
{
if (!dx) {
if (ys > ye)
utils::swap(ys, ye);
for (int i = ys; i <= ye; i++)
set_pixel_clipped(xs, i, color);
}
else if (!dy)
{
} else if (!dy) {
for (int i = xs; i < xe; i++)
set_pixel_clipped(i, ys, color);
}
else if (dy > 0)
{
if (dy <= dx)
{
} else if (dy > 0) {
if (dy <= dx) {
int e = 2 * dy - dx, e_no_inc = 2 * dy, e_inc = 2 * (dy - dx);
rasterize_line(xs, ys, xe, ye, 0, 1, e, e_inc, e_no_inc, color);
}
else
{
} else {
int e = 2 * dx - dy, e_no_inc = 2 * dx, e_inc = 2 * (dx - dy);
rasterize_line(xs, ys, xe, ye, 1, 1, e, e_inc, e_no_inc, color);
}
}
else
{
} else {
dy = -dy;
if (dy <= dx)
{
if (dy <= dx) {
int e = 2 * dy - dx, e_no_inc = 2 * dy, e_inc = 2 * (dy - dx);
rasterize_line(xs, ys, xe, ye, 0, -1, e, e_inc, e_no_inc, color);
}
else
{
} else {
int e = 2 * dx - dy, e_no_inc = (2 * dx), e_inc = 2 * (dx - dy);
rasterize_line(xe, ye, xs, ys, 1, -1, e, e_inc, e_no_inc, color);
}
@@ -670,35 +587,31 @@ namespace crnlib
pixel_buf_t m_pixel_buf;
void rasterize_line(int xs, int ys, int xe, int ye, int pred, int inc_dec, int e, int e_inc, int e_no_inc, const color_type& color)
{
void rasterize_line(int xs, int ys, int xe, int ye, int pred, int inc_dec, int e, int e_inc, int e_no_inc, const color_type& color) {
int start, end, var;
if (pred)
{
start = ys; end = ye; var = xs;
for (int i = start; i <= end; i++)
{
if (pred) {
start = ys;
end = ye;
var = xs;
for (int i = start; i <= end; i++) {
set_pixel_clipped(var, i, color);
if (e < 0)
e += e_no_inc;
else
{
else {
var += inc_dec;
e += e_inc;
}
}
}
else
{
start = xs; end = xe; var = ys;
for (int i = start; i <= end; i++)
{
} else {
start = xs;
end = xe;
var = ys;
for (int i = start; i <= end; i++) {
set_pixel_clipped(i, var, color);
if (e < 0)
e += e_no_inc;
else
{
else {
var += inc_dec;
e += e_inc;
}
@@ -715,8 +628,7 @@ namespace crnlib
typedef image<color_quad_f> image_f;
template <typename color_type>
inline void swap(image<color_type>& a, image<color_type>& b)
{
inline void swap(image<color_type>& a, image<color_type>& b) {
a.swap(b);
}
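The `get_filtered` comments above describe bilinear sampling with pixel centers at (.5,.5) and clamped fetches. The following is a minimal, self-contained sketch of that idea, not the crnlib implementation: `gray_image` and `sample_bilinear` are hypothetical names introduced here for illustration, and the sketch uses a single float channel instead of `color_type`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical single-channel image used only for this sketch.
struct gray_image {
  int width;
  int height;
  std::vector<float> pixels;  // row-major, width * height entries

  // Fetch a pixel, clamping the coordinates to the image bounds.
  float get_clamped(int x, int y) const {
    x = std::min(std::max(x, 0), width - 1);
    y = std::min(std::max(y, 0), height - 1);
    return pixels[x + y * width];
  }
};

// Bilinear sample at continuous (x, y); pixel centers sit at (i + 0.5, j + 0.5).
inline float sample_bilinear(const gray_image& img, float x, float y) {
  // Shift so that integer coordinates land on pixel centers.
  x -= 0.5f;
  y -= 0.5f;
  const int ix = static_cast<int>(std::floor(x));
  const int iy = static_cast<int>(std::floor(y));
  const float wx = x - static_cast<float>(ix);
  const float wy = y - static_cast<float>(iy);

  // Four neighbours; out-of-range fetches are clamped to the edge.
  const float a = img.get_clamped(ix, iy);
  const float b = img.get_clamped(ix + 1, iy);
  const float c = img.get_clamped(ix, iy + 1);
  const float d = img.get_clamped(ix + 1, iy + 1);

  // Interpolate horizontally on both rows, then vertically.
  const float top = a + (b - a) * wx;
  const float bot = c + (d - c) * wx;
  return top + (bot - top) * wy;
}
```

Sampling exactly at a pixel center returns that pixel unchanged, and sampling at the corner shared by four pixels returns their average.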
+206 -356
File diff suppressed because it is too large
+19 -29
@@ -4,14 +4,11 @@
#include "crn_image.h"
#include "crn_data_stream_serializer.h"
namespace crnlib
{
namespace crnlib {
enum pixel_format;
namespace image_utils
{
enum read_flags_t
{
namespace image_utils {
enum read_flags_t {
cReadFlagForceSTB = 1,
cReadFlagsAllFlags = 1
@@ -26,8 +23,7 @@ namespace crnlib
// *pActualComps is set to 1, 3, or 4. req_comps must range from 1-4.
uint8* read_from_memory(const uint8* pImage, int nSize, int* pWidth, int* pHeight, int* pActualComps, int req_comps, const char* pFilename);
enum
{
enum {
cWriteFlagIgnoreAlpha = 0x00000001,
cWriteFlagGrayscale = 0x00000002,
@@ -42,7 +38,10 @@ namespace crnlib
const int cLumaComponentIndex = -1;
inline uint create_jpeg_write_flags(uint base_flags, uint quality_level) { CRNLIB_ASSERT(quality_level <= 100); return base_flags | ((quality_level << cWriteFlagJPEGQualityLevelShift) & cWriteFlagJPEGQualityLevelMask); }
inline uint create_jpeg_write_flags(uint base_flags, uint quality_level) {
CRNLIB_ASSERT(quality_level <= 100);
return base_flags | ((quality_level << cWriteFlagJPEGQualityLevelShift) & cWriteFlagJPEGQualityLevelMask);
}
bool write_to_file(const char* pFilename, const image_u8& img, uint write_flags = 0, int grayscale_comp_index = cLumaComponentIndex);
@@ -50,10 +49,9 @@ namespace crnlib
bool is_normal_map(const image_u8& img, const char* pFilename = NULL);
void renorm_normal_map(image_u8& img);
struct resample_params
{
resample_params() :
m_dst_width(0),
struct resample_params {
resample_params()
: m_dst_width(0),
m_dst_height(0),
m_pFilter("lanczos4"),
m_filter_scale(1.0f),
@@ -62,8 +60,7 @@ namespace crnlib
m_first_comp(0),
m_num_comps(4),
m_source_gamma(2.2f), // 1.75f
m_multithreaded(true)
{
m_multithreaded(true) {
}
uint m_dst_width;
@@ -84,8 +81,7 @@ namespace crnlib
bool compute_delta(image_u8& dest, image_u8& a, image_u8& b, uint scale = 2);
class error_metrics
{
class error_metrics {
public:
error_metrics() { utils::zero_this(this); }
@@ -101,18 +97,15 @@ namespace crnlib
double mRootMeanSquared;
double mPeakSNR;
inline bool operator== (const error_metrics& other) const
{
inline bool operator==(const error_metrics& other) const {
return mPeakSNR == other.mPeakSNR;
}
inline bool operator< (const error_metrics& other) const
{
inline bool operator<(const error_metrics& other) const {
return mPeakSNR < other.mPeakSNR;
}
inline bool operator> (const error_metrics& other) const
{
inline bool operator>(const error_metrics& other) const {
return mPeakSNR > other.mPeakSNR;
}
};
@@ -123,8 +116,7 @@ namespace crnlib
double compute_ssim(const image_u8& a, const image_u8& b, int channel_index);
void print_ssim(const image_u8& src_img, const image_u8& dst_img);
enum conversion_type
{
enum conversion_type {
cConversion_Invalid = -1,
cConversion_To_CCxY,
@@ -154,8 +146,7 @@ namespace crnlib
void convert_image(image_u8& img, conversion_type conv_type);
template <typename image_type>
inline uint8* pack_image(const image_type& img, const pixel_packer& packer, uint& n)
{
inline uint8* pack_image(const image_type& img, const pixel_packer& packer, uint& n) {
n = 0;
if (!packer.is_valid())
@@ -170,8 +161,7 @@ namespace crnlib
uint8* pImage = static_cast<uint8*>(crnlib_malloc(n));
uint8* pDst = pImage;
for (uint y = 0; y < height; y++)
{
for (uint y = 0; y < height; y++) {
const typename image_type::color_t* pSrc = img.get_scanline(y);
for (uint x = 0; x < width; x++)
pDst = (uint8*)packer.pack(*pSrc++, pDst);
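The `error_metrics` class above orders results by `mPeakSNR`. As a rough illustration of what that number represents, here is a sketch of the textbook PSNR definition for 8-bit data, assuming a peak value of 255. `compute_psnr_u8` is a hypothetical helper written for this example; it is not the crnlib routine, which may clamp or scale its result differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: PSNR in dB between two equally sized 8-bit buffers.
inline double compute_psnr_u8(const std::vector<std::uint8_t>& a,
                              const std::vector<std::uint8_t>& b) {
  assert(a.size() == b.size() && !a.empty());

  // Accumulate the squared error over all samples.
  double sum_sq = 0.0;
  for (std::size_t i = 0; i < a.size(); i++) {
    const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
    sum_sq += d * d;
  }

  const double mse = sum_sq / static_cast<double>(a.size());
  if (mse == 0.0)
    return std::numeric_limits<double>::infinity();  // identical inputs

  // PSNR = 10 * log10(peak^2 / MSE), with peak = 255 for 8-bit samples.
  return 10.0 * std::log10((255.0 * 255.0) / mse);
}
```

A higher value means the two buffers are closer, which is why comparing two `error_metrics` objects by PSNR ranks the better reconstruction as the greater one.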
+16 -35
@@ -3,12 +3,9 @@
#pragma once
#include "crn_ray.h"
namespace crnlib
{
namespace intersection
{
enum result
{
namespace crnlib {
namespace intersection {
enum result {
cBackfacing = -1,
cFailure = 0,
cSuccess,
@@ -19,10 +16,8 @@ namespace crnlib
// Returns cInside, cSuccess, or cFailure.
// Algorithm: Graphics Gems 1
template <typename vector_type, typename scalar_type, typename ray_type, typename aabb_type>
result ray_aabb(vector_type& coord, scalar_type& t, const ray_type& ray, const aabb_type& box)
{
enum
{
result ray_aabb(vector_type& coord, scalar_type& t, const ray_type& ray, const aabb_type& box) {
enum {
cNumDim = vector_type::num_elements,
cRight = 0,
cLeft = 1,
@@ -33,36 +28,28 @@ namespace crnlib
int quadrant[cNumDim];
scalar_type candidate_plane[cNumDim];
for (int i = 0; i < cNumDim; i++)
{
if (ray.get_origin()[i] < box[0][i])
{
for (int i = 0; i < cNumDim; i++) {
if (ray.get_origin()[i] < box[0][i]) {
quadrant[i] = cLeft;
candidate_plane[i] = box[0][i];
inside = false;
}
else if (ray.get_origin()[i] > box[1][i])
{
} else if (ray.get_origin()[i] > box[1][i]) {
quadrant[i] = cRight;
candidate_plane[i] = box[1][i];
inside = false;
}
else
{
} else {
quadrant[i] = cMiddle;
}
}
if (inside)
{
if (inside) {
coord = ray.get_origin();
t = 0.0f;
return cInside;
}
scalar_type max_t[cNumDim];
for (int i = 0; i < cNumDim; i++)
{
for (int i = 0; i < cNumDim; i++) {
if ((quadrant[i] != cMiddle) && (ray.get_direction()[i] != 0.0f))
max_t[i] = (candidate_plane[i] - ray.get_origin()[i]) / ray.get_direction()[i];
else
@@ -77,17 +64,13 @@ namespace crnlib
if (max_t[which_plane] < 0.0f)
return cFailure;
for (int i = 0; i < cNumDim; i++)
{
if (i != which_plane)
{
for (int i = 0; i < cNumDim; i++) {
if (i != which_plane) {
coord[i] = ray.get_origin()[i] + max_t[which_plane] * ray.get_direction()[i];
if ((coord[i] < box[0][i]) || (coord[i] > box[1][i]))
return cFailure;
}
else
{
} else {
coord[i] = candidate_plane[i];
}
@@ -99,10 +82,8 @@ namespace crnlib
}
template <typename vector_type, typename scalar_type, typename ray_type, typename aabb_type>
result ray_aabb(bool& started_within, vector_type& coord, scalar_type& t, const ray_type& ray, const aabb_type& box)
{
if (!box.contains(ray.get_origin()))
{
result ray_aabb(bool& started_within, vector_type& coord, scalar_type& t, const ray_type& ray, const aabb_type& box) {
if (!box.contains(ray.get_origin())) {
started_within = false;
return ray_aabb(coord, t, ray, box);
}
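The first `ray_aabb` overload above follows the candidate-plane approach: classify the origin per axis, pick the farthest candidate plane, then check that the hit point lies within the box on the other axes. The sketch below restates those steps with fixed 3D types so it compiles on its own. `vec3`, `hit_result`, and `ray_box` are names assumed for this example only, and `t` here is simply the ray parameter along `dir`; parts of the original function are not shown in the diff and may differ.

```cpp
#include <array>
#include <cassert>
#include <cmath>

typedef std::array<float, 3> vec3;

enum hit_result { kMiss = 0, kHit = 1, kInside = 2 };

// Candidate-plane ray/box test. On kHit, coord is the entry point and t the ray parameter.
inline hit_result ray_box(const vec3& origin, const vec3& dir,
                          const vec3& box_min, const vec3& box_max,
                          vec3& coord, float& t) {
  enum { kRight = 0, kLeft = 1, kMiddle = 2 };
  int quadrant[3];
  float candidate_plane[3] = {0.0f, 0.0f, 0.0f};
  bool inside = true;

  // Classify the origin against the box on each axis.
  for (int i = 0; i < 3; i++) {
    if (origin[i] < box_min[i]) {
      quadrant[i] = kLeft;
      candidate_plane[i] = box_min[i];
      inside = false;
    } else if (origin[i] > box_max[i]) {
      quadrant[i] = kRight;
      candidate_plane[i] = box_max[i];
      inside = false;
    } else {
      quadrant[i] = kMiddle;
    }
  }

  if (inside) {
    coord = origin;
    t = 0.0f;
    return kInside;
  }

  // Distance along the ray to each candidate plane (-1 marks "no candidate").
  float max_t[3];
  for (int i = 0; i < 3; i++) {
    if ((quadrant[i] != kMiddle) && (dir[i] != 0.0f))
      max_t[i] = (candidate_plane[i] - origin[i]) / dir[i];
    else
      max_t[i] = -1.0f;
  }

  // The farthest candidate plane is the only one the ray can enter through.
  int which_plane = 0;
  for (int i = 1; i < 3; i++)
    if (max_t[which_plane] < max_t[i])
      which_plane = i;
  if (max_t[which_plane] < 0.0f)
    return kMiss;

  // The hit point must lie within the box on the remaining axes.
  for (int i = 0; i < 3; i++) {
    if (i != which_plane) {
      coord[i] = origin[i] + max_t[which_plane] * dir[i];
      if ((coord[i] < box_min[i]) || (coord[i] > box_max[i]))
        return kMiss;
    } else {
      coord[i] = candidate_plane[i];
    }
  }

  t = max_t[which_plane];
  return kHit;
}
```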
+427 -656
File diff suppressed because it is too large
+64 -34
@@ -15,8 +15,7 @@
#define JPGD_NORETURN
#endif
namespace jpgd
{
namespace jpgd {
typedef unsigned char uint8;
typedef signed short int16;
typedef unsigned short uint16;
@@ -32,17 +31,43 @@ namespace jpgd
unsigned char* decompress_jpeg_image_from_file(const char* pSrc_filename, int* width, int* height, int* actual_comps, int req_comps);
// Success/failure error codes.
enum jpgd_status
{
JPGD_SUCCESS = 0, JPGD_FAILED = -1, JPGD_DONE = 1,
JPGD_BAD_DHT_COUNTS = -256, JPGD_BAD_DHT_INDEX, JPGD_BAD_DHT_MARKER, JPGD_BAD_DQT_MARKER, JPGD_BAD_DQT_TABLE,
JPGD_BAD_PRECISION, JPGD_BAD_HEIGHT, JPGD_BAD_WIDTH, JPGD_TOO_MANY_COMPONENTS,
JPGD_BAD_SOF_LENGTH, JPGD_BAD_VARIABLE_MARKER, JPGD_BAD_DRI_LENGTH, JPGD_BAD_SOS_LENGTH,
JPGD_BAD_SOS_COMP_ID, JPGD_W_EXTRA_BYTES_BEFORE_MARKER, JPGD_NO_ARITHMITIC_SUPPORT, JPGD_UNEXPECTED_MARKER,
JPGD_NOT_JPEG, JPGD_UNSUPPORTED_MARKER, JPGD_BAD_DQT_LENGTH, JPGD_TOO_MANY_BLOCKS,
JPGD_UNDEFINED_QUANT_TABLE, JPGD_UNDEFINED_HUFF_TABLE, JPGD_NOT_SINGLE_SCAN, JPGD_UNSUPPORTED_COLORSPACE,
JPGD_UNSUPPORTED_SAMP_FACTORS, JPGD_DECODE_ERROR, JPGD_BAD_RESTART_MARKER, JPGD_ASSERTION_ERROR,
JPGD_BAD_SOS_SPECTRAL, JPGD_BAD_SOS_SUCCESSIVE, JPGD_STREAM_READ, JPGD_NOTENOUGHMEM
enum jpgd_status {
JPGD_SUCCESS = 0,
JPGD_FAILED = -1,
JPGD_DONE = 1,
JPGD_BAD_DHT_COUNTS = -256,
JPGD_BAD_DHT_INDEX,
JPGD_BAD_DHT_MARKER,
JPGD_BAD_DQT_MARKER,
JPGD_BAD_DQT_TABLE,
JPGD_BAD_PRECISION,
JPGD_BAD_HEIGHT,
JPGD_BAD_WIDTH,
JPGD_TOO_MANY_COMPONENTS,
JPGD_BAD_SOF_LENGTH,
JPGD_BAD_VARIABLE_MARKER,
JPGD_BAD_DRI_LENGTH,
JPGD_BAD_SOS_LENGTH,
JPGD_BAD_SOS_COMP_ID,
JPGD_W_EXTRA_BYTES_BEFORE_MARKER,
JPGD_NO_ARITHMITIC_SUPPORT,
JPGD_UNEXPECTED_MARKER,
JPGD_NOT_JPEG,
JPGD_UNSUPPORTED_MARKER,
JPGD_BAD_DQT_LENGTH,
JPGD_TOO_MANY_BLOCKS,
JPGD_UNDEFINED_QUANT_TABLE,
JPGD_UNDEFINED_HUFF_TABLE,
JPGD_NOT_SINGLE_SCAN,
JPGD_UNSUPPORTED_COLORSPACE,
JPGD_UNSUPPORTED_SAMP_FACTORS,
JPGD_DECODE_ERROR,
JPGD_BAD_RESTART_MARKER,
JPGD_ASSERTION_ERROR,
JPGD_BAD_SOS_SPECTRAL,
JPGD_BAD_SOS_SUCCESSIVE,
JPGD_STREAM_READ,
JPGD_NOTENOUGHMEM
};
// Input stream interface.
@@ -50,8 +75,7 @@ namespace jpgd
// The decoder is rather greedy: it will keep on calling this method until its internal input buffer is full, or until the EOF flag is set.
// If the input stream contains data after the JPEG stream's EOI (end of image) marker, it will probably be pulled into the internal buffer.
// Call the get_total_bytes_read() method to determine the actual size of the JPEG stream after successful decoding.
class jpeg_decoder_stream
{
class jpeg_decoder_stream {
public:
jpeg_decoder_stream() {}
virtual ~jpeg_decoder_stream() {}
@@ -67,8 +91,7 @@ namespace jpgd
};
// stdio FILE stream class.
class jpeg_decoder_file_stream : public jpeg_decoder_stream
{
class jpeg_decoder_file_stream : public jpeg_decoder_stream {
jpeg_decoder_file_stream(const jpeg_decoder_file_stream&);
jpeg_decoder_file_stream& operator=(const jpeg_decoder_file_stream&);
@@ -86,19 +109,24 @@ namespace jpgd
};
// Memory stream class.
class jpeg_decoder_mem_stream : public jpeg_decoder_stream
{
class jpeg_decoder_mem_stream : public jpeg_decoder_stream {
const uint8* m_pSrc_data;
uint m_ofs, m_size;
public:
jpeg_decoder_mem_stream() : m_pSrc_data(NULL), m_ofs(0), m_size(0) { }
jpeg_decoder_mem_stream(const uint8 *pSrc_data, uint size) : m_pSrc_data(pSrc_data), m_ofs(0), m_size(size) { }
jpeg_decoder_mem_stream()
: m_pSrc_data(NULL), m_ofs(0), m_size(0) {}
jpeg_decoder_mem_stream(const uint8* pSrc_data, uint size)
: m_pSrc_data(pSrc_data), m_ofs(0), m_size(size) {}
virtual ~jpeg_decoder_mem_stream() {}
bool open(const uint8* pSrc_data, uint size);
void close() { m_pSrc_data = NULL; m_ofs = 0; m_size = 0; }
void close() {
m_pSrc_data = NULL;
m_ofs = 0;
m_size = 0;
}
virtual int read(uint8* pBuf, int max_bytes_to_read, bool* pEOF_flag);
};
@@ -106,17 +134,22 @@ namespace jpgd
// Loads JPEG file from a jpeg_decoder_stream.
unsigned char* decompress_jpeg_image_from_stream(jpeg_decoder_stream* pStream, int* width, int* height, int* actual_comps, int req_comps);
enum
{
JPGD_IN_BUF_SIZE = 8192, JPGD_MAX_BLOCKS_PER_MCU = 10, JPGD_MAX_HUFF_TABLES = 8, JPGD_MAX_QUANT_TABLES = 4,
JPGD_MAX_COMPONENTS = 4, JPGD_MAX_COMPS_IN_SCAN = 4, JPGD_MAX_BLOCKS_PER_ROW = 8192, JPGD_MAX_HEIGHT = 16384, JPGD_MAX_WIDTH = 16384
enum {
JPGD_IN_BUF_SIZE = 8192,
JPGD_MAX_BLOCKS_PER_MCU = 10,
JPGD_MAX_HUFF_TABLES = 8,
JPGD_MAX_QUANT_TABLES = 4,
JPGD_MAX_COMPONENTS = 4,
JPGD_MAX_COMPS_IN_SCAN = 4,
JPGD_MAX_BLOCKS_PER_ROW = 8192,
JPGD_MAX_HEIGHT = 16384,
JPGD_MAX_WIDTH = 16384
};
typedef int16 jpgd_quant_t;
typedef int16 jpgd_block_t;
class jpeg_decoder
{
class jpeg_decoder {
public:
// Call get_error_code() after constructing to determine if the stream is valid or not. You may call the get_width(), get_height(), etc.
// methods after the constructor is called. You may then either destruct the object, or begin decoding the image by calling begin_decoding(), then decode() on each scanline.
@@ -155,8 +188,7 @@ namespace jpgd
typedef void (*pDecode_block_func)(jpeg_decoder*, int, int, int);
struct huff_tables
{
struct huff_tables {
bool ac_table;
uint look_up[256];
uint look_up2[256];
@@ -164,16 +196,14 @@ namespace jpgd
uint tree[512];
};
struct coeff_buf
{
struct coeff_buf {
uint8* pData;
int block_num_x, block_num_y;
int block_len_x, block_len_y;
int block_size;
};
struct mem_block
{
struct mem_block {
mem_block* m_pNext;
size_t m_used_count;
size_t m_size;
+451 -375
File diff suppressed because it is too large
+17 -15
@@ -4,8 +4,7 @@
#ifndef JPEG_ENCODER_H
#define JPEG_ENCODER_H
namespace jpge
{
namespace jpge {
typedef unsigned char uint8;
typedef signed short int16;
typedef signed int int32;
@@ -14,17 +13,21 @@ namespace jpge
typedef unsigned int uint;
// JPEG chroma subsampling factors. Y_ONLY (grayscale images) and H2V2 (color images) are the most common.
enum subsampling_t { Y_ONLY = 0, H1V1 = 1, H2V1 = 2, H2V2 = 3 };
enum subsampling_t { Y_ONLY = 0,
H1V1 = 1,
H2V1 = 2,
H2V2 = 3 };
// JPEG compression parameters structure.
struct params
{
inline params() : m_quality(85), m_subsampling(H2V2), m_no_chroma_discrim_flag(false), m_two_pass_flag(false) { }
struct params {
inline params()
: m_quality(85), m_subsampling(H2V2), m_no_chroma_discrim_flag(false), m_two_pass_flag(false) {}
inline bool check() const
{
if ((m_quality < 1) || (m_quality > 100)) return false;
if ((uint)m_subsampling > (uint)H2V2) return false;
inline bool check() const {
if ((m_quality < 1) || (m_quality > 100))
return false;
if ((uint)m_subsampling > (uint)H2V2)
return false;
return true;
}
@@ -56,17 +59,16 @@ namespace jpge
// Output stream abstract class - used by the jpeg_encoder class to write to the output stream.
// put_buf() is generally called with len==JPGE_OUT_BUF_SIZE bytes, but for headers it'll be called with smaller amounts.
class output_stream
{
class output_stream {
public:
virtual ~output_stream(){};
virtual bool put_buf(const void* Pbuf, int len) = 0;
template<class T> inline bool put_obj(const T& obj) { return put_buf(&obj, sizeof(T)); }
template <class T>
inline bool put_obj(const T& obj) { return put_buf(&obj, sizeof(T)); }
};
// Lower level jpeg_encoder class - useful if more control is needed than the above helper functions.
class jpeg_encoder
{
class jpeg_encoder {
public:
jpeg_encoder();
~jpeg_encoder();
+86 -172
@@ -6,14 +6,11 @@
// Set CRNLIB_KTX_PVRTEX_WORKAROUNDS to 1 to enable various workarounds for oddball KTX files written by PVRTexTool.
#define CRNLIB_KTX_PVRTEX_WORKAROUNDS 1
namespace crnlib
{
namespace crnlib {
const uint8 s_ktx_file_id[12] = {0xAB, 0x4B, 0x54, 0x58, 0x20, 0x31, 0x31, 0xBB, 0x0D, 0x0A, 0x1A, 0x0A};
bool is_packed_pixel_ogl_type(uint32 ogl_type)
{
switch (ogl_type)
{
bool is_packed_pixel_ogl_type(uint32 ogl_type) {
switch (ogl_type) {
case KTX_UNSIGNED_BYTE_3_3_2:
case KTX_UNSIGNED_BYTE_2_3_3_REV:
case KTX_UNSIGNED_SHORT_5_6_5:
@@ -34,10 +31,8 @@ namespace crnlib
return false;
}
uint get_ogl_type_size(uint32 ogl_type)
{
switch (ogl_type)
{
uint get_ogl_type_size(uint32 ogl_type) {
switch (ogl_type) {
case KTX_UNSIGNED_BYTE:
case KTX_BYTE:
return 1;
@@ -71,16 +66,16 @@ namespace crnlib
return 0;
}
uint32 get_ogl_base_internal_fmt(uint32 ogl_fmt)
{
switch (ogl_fmt)
{
uint32 get_ogl_base_internal_fmt(uint32 ogl_fmt) {
switch (ogl_fmt) {
case KTX_ETC1_RGB8_OES:
case KTX_COMPRESSED_RGB8_ETC2:
case KTX_RGB_S3TC:
case KTX_RGB4_S3TC:
case KTX_COMPRESSED_RGB_S3TC_DXT1_EXT:
case KTX_COMPRESSED_SRGB_S3TC_DXT1_EXT:
return KTX_RGB;
case KTX_COMPRESSED_RGBA8_ETC2_EAC:
case KTX_COMPRESSED_RGBA_S3TC_DXT1_EXT:
case KTX_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT:
case KTX_RGBA_S3TC:
@@ -141,31 +136,30 @@ namespace crnlib
return 0;
}
bool get_ogl_fmt_desc(uint32 ogl_fmt, uint32 ogl_type, uint& block_dim, uint& bytes_per_block)
{
bool get_ogl_fmt_desc(uint32 ogl_fmt, uint32 ogl_type, uint& block_dim, uint& bytes_per_block) {
uint ogl_type_size = get_ogl_type_size(ogl_type);
block_dim = 1;
bytes_per_block = 0;
switch (ogl_fmt)
{
switch (ogl_fmt) {
case KTX_COMPRESSED_RED_RGTC1_EXT:
case KTX_COMPRESSED_SIGNED_RED_RGTC1_EXT:
case KTX_COMPRESSED_LUMINANCE_LATC1_EXT:
case KTX_COMPRESSED_SIGNED_LUMINANCE_LATC1_EXT:
case KTX_ETC1_RGB8_OES:
case KTX_COMPRESSED_RGB8_ETC2:
case KTX_RGB_S3TC:
case KTX_RGB4_S3TC:
case KTX_COMPRESSED_RGB_S3TC_DXT1_EXT:
case KTX_COMPRESSED_RGBA_S3TC_DXT1_EXT:
case KTX_COMPRESSED_SRGB_S3TC_DXT1_EXT:
case KTX_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT:
{
case KTX_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT: {
block_dim = 4;
bytes_per_block = 8;
break;
}
case KTX_COMPRESSED_RGBA8_ETC2_EAC:
case KTX_COMPRESSED_LUMINANCE_ALPHA_LATC2_EXT:
case KTX_COMPRESSED_SIGNED_LUMINANCE_ALPHA_LATC2_EXT:
case KTX_COMPRESSED_RED_GREEN_RGTC2_EXT:
@@ -177,8 +171,7 @@ namespace crnlib
case KTX_COMPRESSED_RGBA_S3TC_DXT5_EXT:
case KTX_COMPRESSED_SRGB_ALPHA_S3TC_DXT5_EXT:
case KTX_RGBA_DXT5_S3TC:
case KTX_RGBA4_DXT5_S3TC:
{
case KTX_RGBA4_DXT5_S3TC: {
block_dim = 4;
bytes_per_block = 16;
break;
@@ -191,30 +184,26 @@ namespace crnlib
case KTX_RED_INTEGER:
case KTX_GREEN_INTEGER:
case KTX_BLUE_INTEGER:
case KTX_LUMINANCE:
{
case KTX_LUMINANCE: {
bytes_per_block = ogl_type_size;
break;
}
case KTX_R8:
case KTX_R8UI:
case KTX_ALPHA8:
case KTX_LUMINANCE8:
{
case KTX_LUMINANCE8: {
bytes_per_block = 1;
break;
}
case 2:
case KTX_RG:
case KTX_RG_INTEGER:
case KTX_LUMINANCE_ALPHA:
{
case KTX_LUMINANCE_ALPHA: {
bytes_per_block = 2 * ogl_type_size;
break;
}
case KTX_RG8:
case KTX_LUMINANCE8_ALPHA8:
{
case KTX_LUMINANCE8_ALPHA8: {
bytes_per_block = 2;
break;
}
@@ -223,14 +212,12 @@ namespace crnlib
case KTX_RGB:
case KTX_BGR:
case KTX_RGB_INTEGER:
case KTX_BGR_INTEGER:
{
case KTX_BGR_INTEGER: {
bytes_per_block = is_packed_pixel_ogl_type(ogl_type) ? ogl_type_size : (3 * ogl_type_size);
break;
}
case KTX_RGB8:
case KTX_SRGB8:
{
case KTX_SRGB8: {
bytes_per_block = 3;
break;
}
@@ -239,14 +226,12 @@ namespace crnlib
case KTX_BGRA:
case KTX_RGBA_INTEGER:
case KTX_BGRA_INTEGER:
case KTX_SRGB_ALPHA:
{
case KTX_SRGB_ALPHA: {
bytes_per_block = is_packed_pixel_ogl_type(ogl_type) ? ogl_type_size : (4 * ogl_type_size);
break;
}
case KTX_SRGB8_ALPHA8:
case KTX_RGBA8:
{
case KTX_RGBA8: {
bytes_per_block = 4;
break;
}
@@ -256,19 +241,15 @@ namespace crnlib
return true;
}
bool ktx_texture::compute_pixel_info()
{
if ((!m_header.m_glType) || (!m_header.m_glFormat))
{
bool ktx_texture::compute_pixel_info() {
if ((!m_header.m_glType) || (!m_header.m_glFormat)) {
if ((m_header.m_glType) || (m_header.m_glFormat))
return false;
// Must be a compressed format.
if (!get_ogl_fmt_desc(m_header.m_glInternalFormat, m_header.m_glType, m_block_dim, m_bytes_per_block))
{
if (!get_ogl_fmt_desc(m_header.m_glInternalFormat, m_header.m_glType, m_block_dim, m_bytes_per_block)) {
#if CRNLIB_KTX_PVRTEX_WORKAROUNDS
if ((!m_header.m_glInternalFormat) && (!m_header.m_glType) && (!m_header.m_glTypeSize) && (!m_header.m_glBaseInternalFormat))
{
if ((!m_header.m_glInternalFormat) && (!m_header.m_glType) && (!m_header.m_glTypeSize) && (!m_header.m_glBaseInternalFormat)) {
// PVRTexTool writes bogus headers when outputting ETC1.
console::warning("ktx_texture::compute_pixel_info: Header doesn't specify any format, assuming ETC1 and hoping for the best");
m_header.m_glBaseInternalFormat = KTX_RGB;
@@ -284,9 +265,7 @@ namespace crnlib
if (m_block_dim == 1)
return false;
}
else
{
} else {
// Must be an uncompressed format.
if (!get_ogl_fmt_desc(m_header.m_glFormat, m_header.m_glType, m_block_dim, m_bytes_per_block))
return false;
@@ -297,8 +276,7 @@ namespace crnlib
return true;
}
bool ktx_texture::read_from_stream(data_stream_serializer& serializer)
{
bool ktx_texture::read_from_stream(data_stream_serializer& serializer) {
clear();
// Read header
@@ -313,8 +291,7 @@ namespace crnlib
return false;
m_opposite_endianness = (m_header.m_endianness == KTX_OPPOSITE_ENDIAN);
if (m_opposite_endianness)
{
if (m_opposite_endianness) {
m_header.endian_swap();
if ((m_header.m_glTypeSize != sizeof(uint8)) && (m_header.m_glTypeSize != sizeof(uint16)) && (m_header.m_glTypeSize != sizeof(uint32)))
@@ -331,8 +308,7 @@ namespace crnlib
// Read the key value entries
uint num_key_value_bytes_remaining = m_header.m_bytesOfKeyValueData;
while (num_key_value_bytes_remaining)
{
while (num_key_value_bytes_remaining) {
if (num_key_value_bytes_remaining < sizeof(uint32))
return false;
@@ -349,8 +325,7 @@ namespace crnlib
return false;
uint8_vec key_value_data;
if (key_value_byte_size)
{
if (key_value_byte_size) {
key_value_data.resize(key_value_byte_size);
if (serializer.read(&key_value_data[0], 1, key_value_byte_size) != key_value_byte_size)
return false;
@@ -359,8 +334,7 @@ namespace crnlib
m_key_values.push_back(key_value_data);
uint padding = 3 - ((key_value_byte_size + 3) % 4);
if (padding)
{
if (padding) {
if (serializer.read(pad_bytes, 1, padding) != padding)
return false;
}
@@ -388,8 +362,6 @@ namespace crnlib
if ((!mip0_row_blocks) || (!mip0_col_blocks))
return false;
const uint mip0_depth = CRNLIB_MAX(1, m_header.m_pixelDepth); mip0_depth;
bool has_valid_image_size_fields = true;
bool disable_mip_and_cubemap_padding = false;
@@ -397,8 +369,7 @@ namespace crnlib
{
// PVRTexTool has a bogus KTX writer that doesn't write any imageSize fields. Nice.
size_t expected_bytes_remaining = 0;
for (uint mip_level = 0; mip_level < get_num_mips(); mip_level++)
{
for (uint mip_level = 0; mip_level < get_num_mips(); mip_level++) {
uint mip_width, mip_height, mip_depth;
get_mip_dim(mip_level, mip_width, mip_height, mip_depth);
@@ -409,26 +380,19 @@ namespace crnlib
expected_bytes_remaining += sizeof(uint32);
if ((!m_header.m_numberOfArrayElements) && (get_num_faces() == 6))
{
for (uint face = 0; face < get_num_faces(); face++)
{
if ((!m_header.m_numberOfArrayElements) && (get_num_faces() == 6)) {
for (uint face = 0; face < get_num_faces(); face++) {
uint slice_size = mip_row_blocks * mip_col_blocks * m_bytes_per_block;
expected_bytes_remaining += slice_size;
uint num_cube_pad_bytes = 3 - ((slice_size + 3) % 4);
expected_bytes_remaining += num_cube_pad_bytes;
}
}
else
{
} else {
uint total_mip_size = 0;
for (uint array_element = 0; array_element < get_array_size(); array_element++)
{
for (uint face = 0; face < get_num_faces(); face++)
{
for (uint zslice = 0; zslice < mip_depth; zslice++)
{
for (uint array_element = 0; array_element < get_array_size(); array_element++) {
for (uint face = 0; face < get_num_faces(); face++) {
for (uint zslice = 0; zslice < mip_depth; zslice++) {
uint slice_size = mip_row_blocks * mip_col_blocks * m_bytes_per_block;
total_mip_size += slice_size;
}
@@ -441,8 +405,7 @@ namespace crnlib
}
}
if (serializer.get_stream()->get_remaining() < expected_bytes_remaining)
{
if (serializer.get_stream()->get_remaining() < expected_bytes_remaining) {
has_valid_image_size_fields = false;
disable_mip_and_cubemap_padding = true;
console::warning("ktx_texture::read_from_stream: KTX file size is smaller than expected - trying to read anyway without imageSize fields");
@@ -450,8 +413,7 @@ namespace crnlib
}
#endif
for (uint mip_level = 0; mip_level < get_num_mips(); mip_level++)
{
for (uint mip_level = 0; mip_level < get_num_mips(); mip_level++) {
uint mip_width, mip_height, mip_depth;
get_mip_dim(mip_level, mip_width, mip_height, mip_depth);
@@ -463,8 +425,7 @@ namespace crnlib
uint32 image_size = 0;
if (!has_valid_image_size_fields)
image_size = mip_depth * mip_row_blocks * mip_col_blocks * m_bytes_per_block * get_array_size() * get_num_faces();
else
{
else {
if (serializer.read(&image_size, 1, sizeof(image_size)) != sizeof(image_size))
return false;
@@ -477,11 +438,9 @@ namespace crnlib
uint total_mip_size = 0;
if ((!m_header.m_numberOfArrayElements) && (get_num_faces() == 6))
{
if ((!m_header.m_numberOfArrayElements) && (get_num_faces() == 6)) {
// plain non-array cubemap
for (uint face = 0; face < get_num_faces(); face++)
{
for (uint face = 0; face < get_num_faces(); face++) {
CRNLIB_ASSERT(m_image_data.size() == get_image_index(mip_level, 0, face, 0));
m_image_data.push_back(uint8_vec());
@@ -500,18 +459,13 @@ namespace crnlib
total_mip_size += image_size + num_cube_pad_bytes;
}
}
else
{
} else {
// 1D, 2D, 3D (normal or array texture), or array cubemap
uint num_image_bytes_remaining = image_size;
for (uint array_element = 0; array_element < get_array_size(); array_element++)
{
for (uint face = 0; face < get_num_faces(); face++)
{
for (uint zslice = 0; zslice < mip_depth; zslice++)
{
for (uint array_element = 0; array_element < get_array_size(); array_element++) {
for (uint face = 0; face < get_num_faces(); face++) {
for (uint zslice = 0; zslice < mip_depth; zslice++) {
CRNLIB_ASSERT(m_image_data.size() == get_image_index(mip_level, array_element, face, zslice));
uint slice_size = mip_row_blocks * mip_col_blocks * m_bytes_per_block;
@@ -546,10 +500,8 @@ namespace crnlib
return true;
}
bool ktx_texture::write_to_stream(data_stream_serializer& serializer, bool no_keyvalue_data)
{
if (!consistency_check())
{
bool ktx_texture::write_to_stream(data_stream_serializer& serializer, bool no_keyvalue_data) {
if (!consistency_check()) {
CRNLIB_ASSERT(0);
return false;
}
@@ -557,19 +509,15 @@ namespace crnlib
memcpy(m_header.m_identifier, s_ktx_file_id, sizeof(m_header.m_identifier));
m_header.m_endianness = m_opposite_endianness ? KTX_OPPOSITE_ENDIAN : KTX_ENDIAN;
if (m_block_dim == 1)
{
if (m_block_dim == 1) {
m_header.m_glTypeSize = get_ogl_type_size(m_header.m_glType);
m_header.m_glBaseInternalFormat = m_header.m_glFormat;
}
else
{
} else {
m_header.m_glBaseInternalFormat = get_ogl_base_internal_fmt(m_header.m_glInternalFormat);
}
m_header.m_bytesOfKeyValueData = 0;
if (!no_keyvalue_data)
{
if (!no_keyvalue_data) {
for (uint i = 0; i < m_key_values.size(); i++)
m_header.m_bytesOfKeyValueData += sizeof(uint32) + ((m_key_values[i].size() + 3) & ~3);
}
@@ -588,10 +536,8 @@ namespace crnlib
uint total_key_value_bytes = 0;
const uint8 padding[3] = {0, 0, 0};
if (!no_keyvalue_data)
{
for (uint i = 0; i < m_key_values.size(); i++)
{
if (!no_keyvalue_data) {
for (uint i = 0; i < m_key_values.size(); i++) {
uint32 key_value_size = m_key_values[i].size();
if (m_opposite_endianness)
@@ -606,8 +552,7 @@ namespace crnlib
if (!success)
return false;
if (key_value_size)
{
if (key_value_size) {
if (serializer.write(&m_key_values[i][0], key_value_size, 1) != 1)
return false;
total_key_value_bytes += key_value_size;
@@ -623,8 +568,7 @@ namespace crnlib
CRNLIB_ASSERT(total_key_value_bytes == m_header.m_bytesOfKeyValueData);
for (uint mip_level = 0; mip_level < get_num_mips(); mip_level++)
{
for (uint mip_level = 0; mip_level < get_num_mips(); mip_level++) {
uint mip_width, mip_height, mip_depth;
get_mip_dim(mip_level, mip_width, mip_height, mip_depth);
@@ -653,23 +597,19 @@ namespace crnlib
uint total_mip_size = 0;
if ((!m_header.m_numberOfArrayElements) && (get_num_faces() == 6))
{
if ((!m_header.m_numberOfArrayElements) && (get_num_faces() == 6)) {
// plain non-array cubemap
for (uint face = 0; face < get_num_faces(); face++)
{
for (uint face = 0; face < get_num_faces(); face++) {
const uint8_vec& image_data = get_image_data(get_image_index(mip_level, 0, face, 0));
if ((!image_data.size()) || (image_data.size() != image_size))
return false;
if (m_opposite_endianness)
{
if (m_opposite_endianness) {
uint8_vec tmp_image_data(image_data);
utils::endian_swap_mem(&tmp_image_data[0], tmp_image_data.size(), m_header.m_glTypeSize);
if (serializer.write(&tmp_image_data[0], tmp_image_data.size(), 1) != 1)
return false;
}
else if (serializer.write(&image_data[0], image_data.size(), 1) != 1)
} else if (serializer.write(&image_data[0], image_data.size(), 1) != 1)
return false;
uint num_cube_pad_bytes = 3 - ((image_data.size() + 3) % 4);
@@ -678,28 +618,21 @@ namespace crnlib
total_mip_size += image_size + num_cube_pad_bytes;
}
}
else
{
} else {
// 1D, 2D, 3D (normal or array texture), or array cubemap
for (uint array_element = 0; array_element < get_array_size(); array_element++)
{
for (uint face = 0; face < get_num_faces(); face++)
{
for (uint zslice = 0; zslice < mip_depth; zslice++)
{
for (uint array_element = 0; array_element < get_array_size(); array_element++) {
for (uint face = 0; face < get_num_faces(); face++) {
for (uint zslice = 0; zslice < mip_depth; zslice++) {
const uint8_vec& image_data = get_image_data(get_image_index(mip_level, array_element, face, zslice));
if (!image_data.size())
return false;
if (m_opposite_endianness)
{
if (m_opposite_endianness) {
uint8_vec tmp_image_data(image_data);
utils::endian_swap_mem(&tmp_image_data[0], tmp_image_data.size(), m_header.m_glTypeSize);
if (serializer.write(&tmp_image_data[0], tmp_image_data.size(), 1) != 1)
return false;
}
else if (serializer.write(&image_data[0], image_data.size(), 1) != 1)
} else if (serializer.write(&image_data[0], image_data.size(), 1) != 1)
return false;
total_mip_size += image_data.size();
@@ -718,8 +651,7 @@ namespace crnlib
return true;
}
bool ktx_texture::init_2D(uint width, uint height, uint num_mips, uint32 ogl_internal_fmt, uint32 ogl_fmt, uint32 ogl_type)
{
bool ktx_texture::init_2D(uint width, uint height, uint num_mips, uint32 ogl_internal_fmt, uint32 ogl_fmt, uint32 ogl_type) {
clear();
m_header.m_pixelWidth = width;
@@ -736,8 +668,7 @@ namespace crnlib
return true;
}
bool ktx_texture::init_2D_array(uint width, uint height, uint num_mips, uint array_size, uint32 ogl_internal_fmt, uint32 ogl_fmt, uint32 ogl_type)
{
bool ktx_texture::init_2D_array(uint width, uint height, uint num_mips, uint array_size, uint32 ogl_internal_fmt, uint32 ogl_fmt, uint32 ogl_type) {
clear();
m_header.m_pixelWidth = width;
@@ -755,8 +686,7 @@ namespace crnlib
return true;
}
bool ktx_texture::init_3D(uint width, uint height, uint depth, uint num_mips, uint32 ogl_internal_fmt, uint32 ogl_fmt, uint32 ogl_type)
{
bool ktx_texture::init_3D(uint width, uint height, uint depth, uint num_mips, uint32 ogl_internal_fmt, uint32 ogl_fmt, uint32 ogl_type) {
clear();
m_header.m_pixelWidth = width;
@@ -774,8 +704,7 @@ namespace crnlib
return true;
}
bool ktx_texture::init_cubemap(uint dim, uint num_mips, uint32 ogl_internal_fmt, uint32 ogl_fmt, uint32 ogl_type)
{
bool ktx_texture::init_cubemap(uint dim, uint num_mips, uint32 ogl_internal_fmt, uint32 ogl_fmt, uint32 ogl_type) {
clear();
m_header.m_pixelWidth = dim;
@@ -792,8 +721,7 @@ namespace crnlib
return true;
}
bool ktx_texture::check_header() const
{
bool ktx_texture::check_header() const {
if (((get_num_faces() != 1) && (get_num_faces() != 6)) || (!m_header.m_pixelWidth))
return false;
@@ -803,8 +731,7 @@ namespace crnlib
if ((get_num_faces() == 6) && ((m_header.m_pixelDepth) || (!m_header.m_pixelHeight)))
return false;
if (m_header.m_numberOfMipmapLevels)
{
if (m_header.m_numberOfMipmapLevels) {
const uint max_mipmap_dimension = 1U << (m_header.m_numberOfMipmapLevels - 1U);
if (max_mipmap_dimension > (CRNLIB_MAX(CRNLIB_MAX(m_header.m_pixelWidth, m_header.m_pixelHeight), m_header.m_pixelDepth)))
return false;
@@ -813,14 +740,12 @@ namespace crnlib
return true;
}
bool ktx_texture::consistency_check() const
{
bool ktx_texture::consistency_check() const {
if (!check_header())
return false;
uint block_dim = 0, bytes_per_block = 0;
if ((!m_header.m_glType) || (!m_header.m_glFormat))
{
if ((!m_header.m_glType) || (!m_header.m_glFormat)) {
if ((m_header.m_glType) || (m_header.m_glFormat))
return false;
if (!get_ogl_fmt_desc(m_header.m_glInternalFormat, m_header.m_glType, block_dim, bytes_per_block))
@@ -829,9 +754,7 @@ namespace crnlib
return false;
//if ((get_width() % block_dim) || (get_height() % block_dim))
// return false;
}
else
{
} else {
if (!get_ogl_fmt_desc(m_header.m_glFormat, m_header.m_glType, block_dim, bytes_per_block))
return false;
if (block_dim > 1)
@@ -843,8 +766,7 @@ namespace crnlib
if (m_image_data.size() != get_total_images())
return false;
for (uint mip_level = 0; mip_level < get_num_mips(); mip_level++)
{
for (uint mip_level = 0; mip_level < get_num_mips(); mip_level++) {
uint mip_width, mip_height, mip_depth;
get_mip_dim(mip_level, mip_width, mip_height, mip_depth);
@@ -853,12 +775,9 @@ namespace crnlib
if ((!mip_row_blocks) || (!mip_col_blocks))
return false;
for (uint array_element = 0; array_element < get_array_size(); array_element++)
{
for (uint face = 0; face < get_num_faces(); face++)
{
for (uint zslice = 0; zslice < mip_depth; zslice++)
{
for (uint array_element = 0; array_element < get_array_size(); array_element++) {
for (uint face = 0; face < get_num_faces(); face++) {
for (uint zslice = 0; zslice < mip_depth; zslice++) {
const uint8_vec& image_data = get_image_data(get_image_index(mip_level, array_element, face, zslice));
uint expected_image_size = mip_row_blocks * mip_col_blocks * m_bytes_per_block;
@@ -872,11 +791,9 @@ namespace crnlib
return true;
}
const uint8_vec* ktx_texture::find_key(const char* pKey) const
{
const uint8_vec* ktx_texture::find_key(const char* pKey) const {
const size_t n = strlen(pKey) + 1;
for (uint i = 0; i < m_key_values.size(); i++)
{
for (uint i = 0; i < m_key_values.size(); i++) {
const uint8_vec& v = m_key_values[i];
if ((v.size() >= n) && (!memcmp(&v[0], pKey, n)))
return &v;
@@ -885,11 +802,9 @@ namespace crnlib
return NULL;
}
bool ktx_texture::get_key_value_as_string(const char* pKey, dynamic_string& str) const
{
bool ktx_texture::get_key_value_as_string(const char* pKey, dynamic_string& str) const {
const uint8_vec* p = find_key(pKey);
if (!p)
{
if (!p) {
str.clear();
return false;
}
@@ -907,8 +822,7 @@ namespace crnlib
return true;
}
uint ktx_texture::add_key_value(const char* pKey, const void* pVal, uint val_size)
{
uint ktx_texture::add_key_value(const char* pKey, const void* pVal, uint val_size) {
const uint idx = m_key_values.size();
m_key_values.resize(idx + 1);
uint8_vec& v = m_key_values.back();
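Note on the padding arithmetic used throughout this file: the reader and writer both align key/value entries and cubemap faces to 4-byte boundaries, using `3 - ((size + 3) % 4)` for the pad count and `(size + 3) & ~3` for the padded size. The sketch below only restates that arithmetic in isolation; the helper names `ktx_pad_bytes` and `ktx_padded_size` are hypothetical and are not part of crnlib.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not in crnlib): number of zero bytes needed to bring
// `size` up to the next multiple of 4, written the same way as in the diff.
inline uint32_t ktx_pad_bytes(uint32_t size) {
  return 3u - ((size + 3u) % 4u);
}

// Hypothetical helper (not in crnlib): `size` rounded up to a multiple of 4,
// the form used when summing m_bytesOfKeyValueData.
inline uint32_t ktx_padded_size(uint32_t size) {
  return (size + 3u) & ~3u;
}
```

The two forms should agree for any size that does not overflow: `size + ktx_pad_bytes(size)` equals `ktx_padded_size(size)`, which is why the writer can precompute the header field with the mask form and still emit pad bytes with the modulo form.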
+100 -53
@@ -10,12 +10,10 @@
#define KTX_ENDIAN 0x04030201
#define KTX_OPPOSITE_ENDIAN 0x01020304
namespace crnlib
{
namespace crnlib {
extern const uint8 s_ktx_file_id[12];
struct ktx_header
{
struct ktx_header {
uint8 m_identifier[12];
uint32 m_endianness;
uint32 m_glType;
@@ -31,13 +29,11 @@ namespace crnlib
uint32 m_numberOfMipmapLevels;
uint32 m_bytesOfKeyValueData;
void clear()
{
void clear() {
memset(this, 0, sizeof(*this));
}
void endian_swap()
{
void endian_swap() {
utils::endian_swap_mem32(&m_endianness, (sizeof(*this) - sizeof(m_identifier)) / sizeof(uint32));
}
};
@@ -46,37 +42,98 @@ namespace crnlib
typedef crnlib::vector<uint8_vec> ktx_image_data_vec;
// Compressed pixel data formats: ETC1, DXT1, DXT3, DXT5
enum
{
KTX_ETC1_RGB8_OES = 0x8D64, KTX_RGB_S3TC = 0x83A0, KTX_RGB4_S3TC = 0x83A1, KTX_COMPRESSED_RGB_S3TC_DXT1_EXT = 0x83F0,
KTX_COMPRESSED_RGBA_S3TC_DXT1_EXT = 0x83F1, KTX_COMPRESSED_SRGB_S3TC_DXT1_EXT = 0x8C4C, KTX_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT = 0x8C4D,
KTX_RGBA_S3TC = 0x83A2, KTX_RGBA4_S3TC = 0x83A3, KTX_COMPRESSED_RGBA_S3TC_DXT3_EXT = 0x83F2, KTX_COMPRESSED_SRGB_ALPHA_S3TC_DXT3_EXT = 0x8C4E,
KTX_COMPRESSED_RGBA_S3TC_DXT5_EXT = 0x83F3, KTX_COMPRESSED_SRGB_ALPHA_S3TC_DXT5_EXT = 0x8C4F, KTX_RGBA_DXT5_S3TC = 0x83A4, KTX_RGBA4_DXT5_S3TC = 0x83A5,
KTX_COMPRESSED_RED_RGTC1_EXT = 0x8DBB, KTX_COMPRESSED_SIGNED_RED_RGTC1_EXT = 0x8DBC, KTX_COMPRESSED_RED_GREEN_RGTC2_EXT = 0x8DBD, KTX_COMPRESSED_SIGNED_RED_GREEN_RGTC2_EXT = 0x8DBE,
KTX_COMPRESSED_LUMINANCE_LATC1_EXT = 0x8C70, KTX_COMPRESSED_SIGNED_LUMINANCE_LATC1_EXT = 0x8C71, KTX_COMPRESSED_LUMINANCE_ALPHA_LATC2_EXT = 0x8C72, KTX_COMPRESSED_SIGNED_LUMINANCE_ALPHA_LATC2_EXT = 0x8C73
enum {
KTX_ETC1_RGB8_OES = 0x8D64,
KTX_COMPRESSED_RGB8_ETC2 = 0x9274,
KTX_COMPRESSED_RGBA8_ETC2_EAC = 0x9278,
KTX_RGB_S3TC = 0x83A0,
KTX_RGB4_S3TC = 0x83A1,
KTX_COMPRESSED_RGB_S3TC_DXT1_EXT = 0x83F0,
KTX_COMPRESSED_RGBA_S3TC_DXT1_EXT = 0x83F1,
KTX_COMPRESSED_SRGB_S3TC_DXT1_EXT = 0x8C4C,
KTX_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT = 0x8C4D,
KTX_RGBA_S3TC = 0x83A2,
KTX_RGBA4_S3TC = 0x83A3,
KTX_COMPRESSED_RGBA_S3TC_DXT3_EXT = 0x83F2,
KTX_COMPRESSED_SRGB_ALPHA_S3TC_DXT3_EXT = 0x8C4E,
KTX_COMPRESSED_RGBA_S3TC_DXT5_EXT = 0x83F3,
KTX_COMPRESSED_SRGB_ALPHA_S3TC_DXT5_EXT = 0x8C4F,
KTX_RGBA_DXT5_S3TC = 0x83A4,
KTX_RGBA4_DXT5_S3TC = 0x83A5,
KTX_COMPRESSED_RED_RGTC1_EXT = 0x8DBB,
KTX_COMPRESSED_SIGNED_RED_RGTC1_EXT = 0x8DBC,
KTX_COMPRESSED_RED_GREEN_RGTC2_EXT = 0x8DBD,
KTX_COMPRESSED_SIGNED_RED_GREEN_RGTC2_EXT = 0x8DBE,
KTX_COMPRESSED_LUMINANCE_LATC1_EXT = 0x8C70,
KTX_COMPRESSED_SIGNED_LUMINANCE_LATC1_EXT = 0x8C71,
KTX_COMPRESSED_LUMINANCE_ALPHA_LATC2_EXT = 0x8C72,
KTX_COMPRESSED_SIGNED_LUMINANCE_ALPHA_LATC2_EXT = 0x8C73
};
// Pixel formats (various internal, base, and base internal formats)
enum
{
KTX_R8 = 0x8229, KTX_R8UI = 0x8232, KTX_RGB8 = 0x8051, KTX_SRGB8 = 0x8C41, KTX_SRGB = 0x8C40, KTX_SRGB_ALPHA = 0x8C42,
KTX_SRGB8_ALPHA8 = 0x8C43, KTX_RGBA8 = 0x8058, KTX_STENCIL_INDEX = 0x1901, KTX_DEPTH_COMPONENT = 0x1902, KTX_DEPTH_STENCIL = 0x84F9, KTX_RED = 0x1903,
KTX_GREEN = 0x1904, KTX_BLUE = 0x1905, KTX_ALPHA = 0x1906, KTX_RG = 0x8227, KTX_RGB = 0x1907, KTX_RGBA = 0x1908, KTX_BGR = 0x80E0, KTX_BGRA = 0x80E1,
KTX_RED_INTEGER = 0x8D94, KTX_GREEN_INTEGER = 0x8D95, KTX_BLUE_INTEGER = 0x8D96, KTX_ALPHA_INTEGER = 0x8D97, KTX_RGB_INTEGER = 0x8D98, KTX_RGBA_INTEGER = 0x8D99,
KTX_BGR_INTEGER = 0x8D9A, KTX_BGRA_INTEGER = 0x8D9B, KTX_LUMINANCE = 0x1909, KTX_LUMINANCE_ALPHA = 0x190A, KTX_RG_INTEGER = 0x8228, KTX_RG8 = 0x822B,
KTX_ALPHA8 = 0x803C, KTX_LUMINANCE8 = 0x8040, KTX_LUMINANCE8_ALPHA8 = 0x8045
enum {
KTX_R8 = 0x8229,
KTX_R8UI = 0x8232,
KTX_RGB8 = 0x8051,
KTX_SRGB8 = 0x8C41,
KTX_SRGB = 0x8C40,
KTX_SRGB_ALPHA = 0x8C42,
KTX_SRGB8_ALPHA8 = 0x8C43,
KTX_RGBA8 = 0x8058,
KTX_STENCIL_INDEX = 0x1901,
KTX_DEPTH_COMPONENT = 0x1902,
KTX_DEPTH_STENCIL = 0x84F9,
KTX_RED = 0x1903,
KTX_GREEN = 0x1904,
KTX_BLUE = 0x1905,
KTX_ALPHA = 0x1906,
KTX_RG = 0x8227,
KTX_RGB = 0x1907,
KTX_RGBA = 0x1908,
KTX_BGR = 0x80E0,
KTX_BGRA = 0x80E1,
KTX_RED_INTEGER = 0x8D94,
KTX_GREEN_INTEGER = 0x8D95,
KTX_BLUE_INTEGER = 0x8D96,
KTX_ALPHA_INTEGER = 0x8D97,
KTX_RGB_INTEGER = 0x8D98,
KTX_RGBA_INTEGER = 0x8D99,
KTX_BGR_INTEGER = 0x8D9A,
KTX_BGRA_INTEGER = 0x8D9B,
KTX_LUMINANCE = 0x1909,
KTX_LUMINANCE_ALPHA = 0x190A,
KTX_RG_INTEGER = 0x8228,
KTX_RG8 = 0x822B,
KTX_ALPHA8 = 0x803C,
KTX_LUMINANCE8 = 0x8040,
KTX_LUMINANCE8_ALPHA8 = 0x8045
};
// Pixel data types
enum
{
KTX_UNSIGNED_BYTE = 0x1401, KTX_BYTE = 0x1400, KTX_UNSIGNED_SHORT = 0x1403, KTX_SHORT = 0x1402,
KTX_UNSIGNED_INT = 0x1405, KTX_INT = 0x1404, KTX_HALF_FLOAT = 0x140B, KTX_FLOAT = 0x1406,
KTX_UNSIGNED_BYTE_3_3_2 = 0x8032, KTX_UNSIGNED_BYTE_2_3_3_REV = 0x8362, KTX_UNSIGNED_SHORT_5_6_5 = 0x8363,
KTX_UNSIGNED_SHORT_5_6_5_REV = 0x8364, KTX_UNSIGNED_SHORT_4_4_4_4 = 0x8033, KTX_UNSIGNED_SHORT_4_4_4_4_REV = 0x8365,
KTX_UNSIGNED_SHORT_5_5_5_1 = 0x8034, KTX_UNSIGNED_SHORT_1_5_5_5_REV = 0x8366, KTX_UNSIGNED_INT_8_8_8_8 = 0x8035,
KTX_UNSIGNED_INT_8_8_8_8_REV = 0x8367, KTX_UNSIGNED_INT_10_10_10_2 = 0x8036, KTX_UNSIGNED_INT_2_10_10_10_REV = 0x8368,
KTX_UNSIGNED_INT_24_8 = 0x84FA, KTX_UNSIGNED_INT_10F_11F_11F_REV = 0x8C3B, KTX_UNSIGNED_INT_5_9_9_9_REV = 0x8C3E,
enum {
KTX_UNSIGNED_BYTE = 0x1401,
KTX_BYTE = 0x1400,
KTX_UNSIGNED_SHORT = 0x1403,
KTX_SHORT = 0x1402,
KTX_UNSIGNED_INT = 0x1405,
KTX_INT = 0x1404,
KTX_HALF_FLOAT = 0x140B,
KTX_FLOAT = 0x1406,
KTX_UNSIGNED_BYTE_3_3_2 = 0x8032,
KTX_UNSIGNED_BYTE_2_3_3_REV = 0x8362,
KTX_UNSIGNED_SHORT_5_6_5 = 0x8363,
KTX_UNSIGNED_SHORT_5_6_5_REV = 0x8364,
KTX_UNSIGNED_SHORT_4_4_4_4 = 0x8033,
KTX_UNSIGNED_SHORT_4_4_4_4_REV = 0x8365,
KTX_UNSIGNED_SHORT_5_5_5_1 = 0x8034,
KTX_UNSIGNED_SHORT_1_5_5_5_REV = 0x8366,
KTX_UNSIGNED_INT_8_8_8_8 = 0x8035,
KTX_UNSIGNED_INT_8_8_8_8_REV = 0x8367,
KTX_UNSIGNED_INT_10_10_10_2 = 0x8036,
KTX_UNSIGNED_INT_2_10_10_10_REV = 0x8368,
KTX_UNSIGNED_INT_24_8 = 0x84FA,
KTX_UNSIGNED_INT_10F_11F_11F_REV = 0x8C3B,
KTX_UNSIGNED_INT_5_9_9_9_REV = 0x8C3E,
KTX_FLOAT_32_UNSIGNED_INT_24_8_REV = 0x8DAD
};
@@ -86,21 +143,17 @@ namespace crnlib
uint get_ogl_type_size(uint32 ogl_type);
uint32 get_ogl_base_internal_fmt(uint32 ogl_fmt);
class ktx_texture
{
class ktx_texture {
public:
ktx_texture()
{
ktx_texture() {
clear();
}
ktx_texture(const ktx_texture& other)
{
ktx_texture(const ktx_texture& other) {
*this = other;
}
ktx_texture& operator= (const ktx_texture& rhs)
{
ktx_texture& operator=(const ktx_texture& rhs) {
if (this == &rhs)
return *this;
@@ -116,8 +169,7 @@ namespace crnlib
return *this;
}
void clear()
{
void clear() {
m_header.clear();
m_key_values.clear();
m_image_data.clear();
@@ -191,34 +243,29 @@ namespace crnlib
const ktx_image_data_vec& get_image_data_vec() const { return m_image_data; }
ktx_image_data_vec& get_image_data_vec() { return m_image_data; }
void add_image(uint face_index, uint mip_index, const void* pImage, uint image_size)
{
void add_image(uint face_index, uint mip_index, const void* pImage, uint image_size) {
const uint image_index = get_image_index(mip_index, 0, face_index, 0);
if (image_index >= m_image_data.size())
m_image_data.resize(image_index + 1);
if (image_size)
{
if (image_size) {
uint8_vec& v = m_image_data[image_index];
v.resize(image_size);
memcpy(&v[0], pImage, image_size);
}
}
uint get_image_index(uint mip_index, uint array_index, uint face_index, uint zslice_index) const
{
uint get_image_index(uint mip_index, uint array_index, uint face_index, uint zslice_index) const {
CRNLIB_ASSERT((mip_index < get_num_mips()) && (array_index < get_array_size()) && (face_index < get_num_faces()) && (zslice_index < get_depth()));
return zslice_index + (face_index * get_depth()) + (array_index * (get_depth() * get_num_faces())) + (mip_index * (get_depth() * get_num_faces() * get_array_size()));
}
void get_mip_dim(uint mip_index, uint& mip_width, uint& mip_height) const
{
void get_mip_dim(uint mip_index, uint& mip_width, uint& mip_height) const {
CRNLIB_ASSERT(mip_index < get_num_mips());
mip_width = CRNLIB_MAX(get_width() >> mip_index, 1);
mip_height = CRNLIB_MAX(get_height() >> mip_index, 1);
}
void get_mip_dim(uint mip_index, uint& mip_width, uint& mip_height, uint& mip_depth) const
{
void get_mip_dim(uint mip_index, uint& mip_width, uint& mip_height, uint& mip_depth) const {
CRNLIB_ASSERT(mip_index < get_num_mips());
mip_width = CRNLIB_MAX(get_width() >> mip_index, 1);
mip_height = CRNLIB_MAX(get_height() >> mip_index, 1);
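Note on the image ordering implied by `get_image_index()` and `get_mip_dim()`: images are stored with the z slice varying fastest, then the face, then the array element, then the mip level, and each mip dimension is the base dimension shifted right and clamped to 1. The following is a minimal standalone sketch of that layout, not the class itself; `tex_layout`, `image_index` and `mip_dim` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the values ktx_texture derives from its header.
struct tex_layout {
  uint32_t depth;       // z slices per image (1 for 2D textures)
  uint32_t faces;       // 1, or 6 for cubemaps
  uint32_t array_size;  // 1 for non-array textures
};

// Same ordering as get_image_index(): z slice varies fastest, then face,
// then array element, then mip level.
inline uint32_t image_index(const tex_layout& t, uint32_t mip,
                            uint32_t array_element, uint32_t face,
                            uint32_t zslice) {
  return zslice + face * t.depth + array_element * (t.depth * t.faces) +
         mip * (t.depth * t.faces * t.array_size);
}

// Same rule as get_mip_dim(): halve per level, never below 1.
// Callers are assumed to keep `mip` below 32 so the shift stays defined.
inline uint32_t mip_dim(uint32_t base_dim, uint32_t mip) {
  return std::max<uint32_t>(base_dim >> mip, 1u);
}
```

For a plain cubemap (`depth = 1`, `faces = 6`, `array_size = 1`), face 2 of mip level 1 lands at index 8, directly after the six faces of mip level 0.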
+12 -22
@@ -7,21 +7,17 @@
#include "lzma_LzmaLib.h"
#include "crn_threading.h"
namespace crnlib
{
lzma_codec::lzma_codec() :
m_pCompress(LzmaCompress),
m_pUncompress(LzmaUncompress)
{
namespace crnlib {
lzma_codec::lzma_codec()
: m_pCompress(LzmaCompress),
m_pUncompress(LzmaUncompress) {
CRNLIB_ASSUME(cLZMAPropsSize == LZMA_PROPS_SIZE);
}
lzma_codec::~lzma_codec()
{
lzma_codec::~lzma_codec() {
}
bool lzma_codec::pack(const void* p, uint n, crnlib::vector<uint8>& buf)
{
bool lzma_codec::pack(const void* p, uint n, crnlib::vector<uint8>& buf) {
if (n > 1024U * 1024U * 1024U)
return false;
@@ -36,14 +32,12 @@ namespace crnlib
pHDR->m_uncomp_size = n;
pHDR->m_adler32 = adler32(p, n);
if (n)
{
if (n) {
size_t destLen = 0;
size_t outPropsSize = 0;
int status = SZ_ERROR_INPUT_EOF;
for (uint trial = 0; trial < 3; trial++)
{
for (uint trial = 0; trial < 3; trial++) {
destLen = max_comp_size;
outPropsSize = cLZMAPropsSize;
@@ -71,8 +65,7 @@ namespace crnlib
pComp_data = &buf[sizeof(header)];
}
if (status != SZ_OK)
{
if (status != SZ_OK) {
buf.clear();
return false;
}
@@ -88,8 +81,7 @@ namespace crnlib
return true;
}
bool lzma_codec::unpack(const void* p, uint n, crnlib::vector<uint8>& buf)
{
bool lzma_codec::unpack(const void* p, uint n, crnlib::vector<uint8>& buf) {
buf.resize(0);
if (n < sizeof(header))
@@ -124,14 +116,12 @@ namespace crnlib
int status = (*m_pUncompress)(&buf[0], &destLen, pComp_data, &srcLen,
hdr.m_lzma_props, cLZMAPropsSize);
if ((status != SZ_OK) || (destLen != hdr.m_uncomp_size))
{
if ((status != SZ_OK) || (destLen != hdr.m_uncomp_size)) {
buf.clear();
return false;
}
if (adler32(&buf[0], buf.size()) != hdr.m_adler32)
{
if (adler32(&buf[0], buf.size()) != hdr.m_adler32) {
buf.clear();
return false;
}
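Note on the checksum step at the end of `unpack()`: the decompressed buffer is discarded when its Adler-32 does not match the value stored in the header. The sketch below shows that pattern with a textbook Adler-32 (as described in RFC 1950). It is an assumption, not something this diff confirms, that crnlib's `adler32()` produces the same values; `adler32_ref` and `keep_if_checksum_matches` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Textbook Adler-32; assumed (not confirmed by the diff) to match crnlib's.
inline uint32_t adler32_ref(const uint8_t* data, size_t len) {
  const uint32_t kMod = 65521u;  // largest prime below 2^16
  uint32_t a = 1u, b = 0u;
  for (size_t i = 0; i < len; ++i) {
    a = (a + data[i]) % kMod;
    b = (b + a) % kMod;
  }
  return (b << 16) | a;
}

// Mirrors the last check in unpack(): clear the output and report failure
// when the checksum of the decompressed bytes disagrees with the header.
inline bool keep_if_checksum_matches(std::vector<uint8_t>& buf,
                                     uint32_t expected) {
  if (adler32_ref(buf.data(), buf.size()) != expected) {
    buf.clear();
    return false;
  }
  return true;
}
```

Clearing the buffer on mismatch, as the original does, means a caller that ignores the return value still cannot consume corrupt data by accident.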
+5 -8
@@ -3,10 +3,8 @@
#pragma once
#include "crn_packed_uint.h"
namespace crnlib
{
class lzma_codec
{
namespace crnlib {
class lzma_codec {
public:
lzma_codec();
~lzma_codec();
@@ -40,9 +38,9 @@ namespace crnlib
#pragma pack(push)
#pragma pack(1)
struct header
{
enum { cSig = 'L' | ('0' << 8), cChecksumSkipBytes = 3 };
struct header {
enum { cSig = 'L' | ('0' << 8),
cChecksumSkipBytes = 3 };
packed_uint<2> m_sig;
uint8 m_checksum;
@@ -54,7 +52,6 @@ namespace crnlib
packed_uint<4> m_adler32;
};
#pragma pack(pop)
};
} // namespace crnlib
+9 -18
@@ -2,10 +2,8 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#include "crn_core.h"
namespace crnlib
{
namespace math
{
namespace crnlib {
namespace math {
uint g_bitmasks[32] =
{
1U << 0U, 1U << 1U, 1U << 2U, 1U << 3U,
@@ -15,11 +13,9 @@ namespace crnlib
1U << 16U, 1U << 17U, 1U << 18U, 1U << 19U,
1U << 20U, 1U << 21U, 1U << 22U, 1U << 23U,
1U << 24U, 1U << 25U, 1U << 26U, 1U << 27U,
1U << 28U, 1U << 29U, 1U << 30U, 1U << 31U
};
1U << 28U, 1U << 29U, 1U << 30U, 1U << 31U};
double compute_entropy(const uint8* p, uint n)
{
double compute_entropy(const uint8* p, uint n) {
uint hist[256];
utils::zero_object(hist);
@@ -29,8 +25,7 @@ namespace crnlib
double entropy = 0.0f;
const double invln2 = 1.0f / log(2.0f);
for (uint i = 0; i < 256; i++)
{
for (uint i = 0; i < 256; i++) {
if (!hist[i])
continue;
@@ -41,30 +36,26 @@ namespace crnlib
return entropy;
}
void compute_lower_pow2_dim(int& width, int& height)
{
void compute_lower_pow2_dim(int& width, int& height) {
const int tex_width = width;
const int tex_height = height;
width = 1;
for ( ; ; )
{
for (;;) {
if ((width * 2) > tex_width)
break;
width *= 2;
}
height = 1;
for ( ; ; )
{
for (;;) {
if ((height * 2) > tex_height)
break;
height *= 2;
}
}
void compute_upper_pow2_dim(int& width, int& height)
{
void compute_upper_pow2_dim(int& width, int& height) {
if (!math::is_power_of_2((uint32)width))
width = math::next_pow2((uint32)width);
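Note on the power-of-two rounding in this file: `compute_lower_pow2_dim` doubles a counter from 1 until the next doubling would exceed the input, and `compute_upper_pow2_dim` leaves exact powers of two alone and rounds everything else up. A rough standalone sketch of both roundings follows; `lower_pow2` and `upper_pow2` are hypothetical names, the upper variant uses a plain loop rather than the `next_pow2` bit trick, and it assumes an input of at least 1.

```cpp
#include <cassert>
#include <cstdint>

// Largest power of two <= n. Returns 1 for n < 2, like the loop in the diff.
// The test is written as p <= n / 2 instead of (p * 2) <= n so that the
// multiplication cannot overflow for large n.
inline int lower_pow2(int n) {
  int p = 1;
  while (p <= n / 2)
    p *= 2;
  return p;
}

// Smallest power of two >= n, assuming 1 <= n <= 2^31.
// Loop-based stand-in for the is_power_of_2 / next_pow2 pair.
inline uint32_t upper_pow2(uint32_t n) {
  uint32_t p = 1u;
  while (p < n && p < 0x80000000u)
    p *= 2u;
  return p;
}
```

For a 100-pixel dimension these give 64 and 128 respectively, while an input of 64 is returned unchanged by both.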
+119 -76
@@ -8,10 +8,8 @@
unsigned __int64 __emulu(unsigned int a, unsigned int b);
#endif
namespace crnlib
{
namespace math
{
namespace crnlib {
namespace math {
const float cNearlyInfinite = 1.0e+37f;
const float cDegToRad = 0.01745329252f;
@@ -19,70 +17,118 @@ namespace crnlib
extern uint g_bitmasks[32];
template<typename T> inline bool within_closed_range(T a, T b, T c) { return (a >= b) && (a <= c); }
template <typename T>
inline bool within_closed_range(T a, T b, T c) {
return (a >= b) && (a <= c);
}
template<typename T> inline bool within_open_range(T a, T b, T c) { return (a >= b) && (a < c); }
template <typename T>
inline bool within_open_range(T a, T b, T c) {
return (a >= b) && (a < c);
}
// Yes I know these should probably be pass by ref, not val:
// http://www.stepanovpapers.com/notes.pdf
// Just don't use them on non-simple (non built-in) types!
template<typename T> inline T minimum(T a, T b) { return (a < b) ? a : b; }
template <typename T>
inline T minimum(T a, T b) {
return (a < b) ? a : b;
}
template<typename T> inline T minimum(T a, T b, T c) { return minimum(minimum(a, b), c); }
template <typename T>
inline T minimum(T a, T b, T c) {
return minimum(minimum(a, b), c);
}
template<typename T> inline T maximum(T a, T b) { return (a > b) ? a : b; }
template <typename T>
inline T maximum(T a, T b) {
return (a > b) ? a : b;
}
template<typename T> inline T maximum(T a, T b, T c) { return maximum(maximum(a, b), c); }
template <typename T>
inline T maximum(T a, T b, T c) {
return maximum(maximum(a, b), c);
}
template<typename T, typename U> inline T lerp(T a, T b, U c) { return a + (b - a) * c; }
template <typename T, typename U>
inline T lerp(T a, T b, U c) {
return a + (b - a) * c;
}
template<typename T> inline T clamp(T value, T low, T high) { return (value < low) ? low : ((value > high) ? high : value); }
template <typename T>
inline T clamp(T value, T low, T high) {
return (value < low) ? low : ((value > high) ? high : value);
}
template<typename T> inline T saturate(T value) { return (value < 0.0f) ? 0.0f : ((value > 1.0f) ? 1.0f : value); }
template <typename T>
inline T saturate(T value) {
return (value < 0.0f) ? 0.0f : ((value > 1.0f) ? 1.0f : value);
}
inline int float_to_int(float f) { return static_cast<int>(f); }
inline int float_to_int(float f) {
return static_cast<int>(f);
}
inline uint float_to_uint(float f) { return static_cast<uint>(f); }
inline uint float_to_uint(float f) {
return static_cast<uint>(f);
}
inline int float_to_int(double f) { return static_cast<int>(f); }
inline int float_to_int(double f) {
return static_cast<int>(f);
}
inline uint float_to_uint(double f) { return static_cast<uint>(f); }
inline uint float_to_uint(double f) {
return static_cast<uint>(f);
}
inline int float_to_int_round(float f) { return static_cast<int>((f < 0.0f) ? -floor(-f + .5f) : floor(f + .5f)); }
inline int float_to_int_round(float f) {
return static_cast<int>((f < 0.0f) ? -floor(-f + .5f) : floor(f + .5f));
}
inline uint float_to_uint_round(float f) { return static_cast<uint>((f < 0.0f) ? 0.0f : floor(f + .5f)); }
inline uint float_to_uint_round(float f) {
return static_cast<uint>((f < 0.0f) ? 0.0f : floor(f + .5f));
}
template<typename T> inline int sign(T value) { return (value < 0) ? -1 : ((value > 0) ? 1 : 0); }
template <typename T>
inline int sign(T value) {
return (value < 0) ? -1 : ((value > 0) ? 1 : 0);
}
template<typename T> inline T square(T value) { return value * value; }
template <typename T>
inline T square(T value) {
return value * value;
}
inline bool is_power_of_2(uint32 x) { return x && ((x & (x - 1U)) == 0U); }
inline bool is_power_of_2(uint64 x) { return x && ((x & (x - 1U)) == 0U); }
inline bool is_power_of_2(uint32 x) {
return x && ((x & (x - 1U)) == 0U);
}
inline bool is_power_of_2(uint64 x) {
return x && ((x & (x - 1U)) == 0U);
}
template<typename T> inline T align_up_value(T x, uint alignment)
{
template <typename T>
inline T align_up_value(T x, uint alignment) {
CRNLIB_ASSERT(is_power_of_2(alignment));
uint q = static_cast<uint>(x);
q = (q + alignment - 1) & (~(alignment - 1));
return static_cast<T>(q);
}
template<typename T> inline T align_down_value(T x, uint alignment)
{
template <typename T>
inline T align_down_value(T x, uint alignment) {
CRNLIB_ASSERT(is_power_of_2(alignment));
uint q = static_cast<uint>(x);
q = q & (~(alignment - 1));
return static_cast<T>(q);
}
template<typename T> inline T get_align_up_value_delta(T x, uint alignment)
{
template <typename T>
inline T get_align_up_value_delta(T x, uint alignment) {
return align_up_value(x, alignment) - x;
}
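The align_up_value / align_down_value helpers above depend on the alignment being a power of two, so that `alignment - 1` is a mask of the low bits. A minimal standalone sketch of the same arithmetic follows, assuming `std::size_t` in place of crnlib's `uint`; the names `is_pow2`, `align_up` and `align_down` are hypothetical and not part of crnlib.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical standalone helpers; they mirror the arithmetic in the diff,
// not crnlib's actual signatures.
inline bool is_pow2(std::size_t x) {
  return x && ((x & (x - 1)) == 0);
}

// Rounds x up to the next multiple of alignment (which must be a power of two).
inline std::size_t align_up(std::size_t x, std::size_t alignment) {
  assert(is_pow2(alignment));
  return (x + alignment - 1) & ~(alignment - 1);
}

// Rounds x down to the previous multiple of alignment (a power of two).
inline std::size_t align_down(std::size_t x, std::size_t alignment) {
  assert(is_pow2(alignment));
  return x & ~(alignment - 1);
}
```

With these, `align_up(13, 8)` gives 16 and `align_down(13, 8)` gives 8, so the delta that get_align_up_value_delta would report for that input is 3. The crnlib versions cast through `uint`, so they appear to be intended for values that fit in 32 bits.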
// From "Hackers Delight"
inline uint32 next_pow2(uint32 val)
{
inline uint32 next_pow2(uint32 val) {
val--;
val |= val >> 16;
val |= val >> 8;
@@ -92,8 +138,7 @@ namespace crnlib
return val + 1;
}
inline uint64 next_pow2(uint64 val)
{
inline uint64 next_pow2(uint64 val) {
val--;
val |= val >> 32;
val |= val >> 16;
@@ -104,19 +149,16 @@ namespace crnlib
return val + 1;
}
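Both next_pow2 overloads above use the same bit-smearing idea: decrement, OR the value with progressively smaller right shifts of itself until every bit below the highest set bit is 1, then add one. The hunks elide the middle of each function, so the following is only a sketch for the 32-bit case; the shifts by 4, 2 and 1 are an assumption, and the name `next_pow2_u32` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical standalone version; the shifts by 4, 2 and 1 are assumed,
// since the hunk above does not show them.
inline std::uint32_t next_pow2_u32(std::uint32_t val) {
  assert(val <= 0x80000000u);  // larger inputs have no 32-bit answer
  val--;                       // keeps exact powers of two unchanged
  val |= val >> 16;            // smear the highest set bit downwards
  val |= val >> 8;
  val |= val >> 4;
  val |= val >> 2;
  val |= val >> 1;
  return val + 1;              // all ones below the top bit, plus one
}
```

In this sketch an input of 0 wraps around during the decrement and comes back as 0, which callers may want to special-case.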
inline uint floor_log2i(uint v)
{
inline uint floor_log2i(uint v) {
uint l = 0;
while (v > 1U)
{
while (v > 1U) {
v >>= 1;
l++;
}
return l;
}
inline uint ceil_log2i(uint v)
{
inline uint ceil_log2i(uint v) {
uint l = floor_log2i(v);
if ((l != cIntBits) && (v > (1U << l)))
l++;
@@ -124,11 +166,9 @@ namespace crnlib
}
// Returns the total number of bits needed to encode v.
inline uint total_bits(uint v)
{
inline uint total_bits(uint v) {
uint l = 0;
while (v > 0U)
{
while (v > 0U) {
v >>= 1;
l++;
}
@@ -136,24 +176,20 @@ namespace crnlib
}
// Actually counts the number of set bits, but hey
inline uint bitmask_size(uint mask)
{
inline uint bitmask_size(uint mask) {
uint size = 0;
while (mask)
{
while (mask) {
mask &= (mask - 1U);
size++;
}
return size;
}
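As its comment admits, bitmask_size is a population count. The loop works because `mask & (mask - 1)` clears the lowest set bit, so the number of iterations equals the number of set bits. A small standalone sketch, with the hypothetical name `count_set_bits` and `std::uint32_t` assumed in place of crnlib's `uint`:

```cpp
#include <cassert>
#include <cstdint>

// Counts set bits by clearing the lowest set bit on each iteration.
// Hypothetical name; same loop shape as bitmask_size in the diff.
inline unsigned count_set_bits(std::uint32_t mask) {
  unsigned count = 0;
  while (mask) {
    mask &= (mask - 1u);  // drops exactly one set bit
    ++count;
  }
  return count;
}

// A few spot checks that double as usage examples.
inline void count_set_bits_examples() {
  assert(count_set_bits(0u) == 0u);
  assert(count_set_bits(0xFFu) == 8u);
  assert(count_set_bits(0x80000001u) == 2u);
}
```

The loop runs once per set bit rather than once per bit position, which is why it tends to be cheap for sparse masks.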
inline uint bitmask_ofs(uint mask)
{
inline uint bitmask_ofs(uint mask) {
if (!mask)
return 0;
uint ofs = 0;
while ((mask & 1U) == 0)
{
while ((mask & 1U) == 0) {
mask >>= 1U;
ofs++;
}
@@ -162,8 +198,7 @@ namespace crnlib
// See Bit Twiddling Hacks (public domain)
// http://www-graphics.stanford.edu/~seander/bithacks.html
inline uint count_trailing_zero_bits(uint v)
{
inline uint count_trailing_zero_bits(uint v) {
uint c = 32; // c will be the number of zero bits on the right
static const unsigned int B[] = {0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF, 0x0000FFFF};
@@ -171,31 +206,48 @@ namespace crnlib
for (int i = 4; i >= 0; --i) // unroll for more speed
{
if (v & B[i])
{
if (v & B[i]) {
v <<= S[i];
c -= S[i];
}
}
if (v)
{
if (v) {
c--;
}
return c;
}
inline uint count_leading_zero_bits(uint v)
{
inline uint count_leading_zero_bits(uint v) {
uint temp;
uint result = 32U;
temp = (v >> 16U); if (temp) { result -= 16U; v = temp; }
temp = (v >> 8U); if (temp) { result -= 8U; v = temp; }
temp = (v >> 4U); if (temp) { result -= 4U; v = temp; }
temp = (v >> 2U); if (temp) { result -= 2U; v = temp; }
temp = (v >> 1U); if (temp) { result -= 1U; v = temp; }
temp = (v >> 16U);
if (temp) {
result -= 16U;
v = temp;
}
temp = (v >> 8U);
if (temp) {
result -= 8U;
v = temp;
}
temp = (v >> 4U);
if (temp) {
result -= 4U;
v = temp;
}
temp = (v >> 2U);
if (temp) {
result -= 2U;
v = temp;
}
temp = (v >> 1U);
if (temp) {
result -= 1U;
v = temp;
}
if (v & 1U)
result--;
@@ -203,8 +255,7 @@ namespace crnlib
return result;
}
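The reformatted count_leading_zero_bits above is a binary search for the highest set bit: each step tests the upper half of the remaining range and, if it is non-empty, discards the lower half. The five unrolled steps can be expressed as a loop. This is a sketch with a hypothetical name (`clz32`), not the code from the commit:

```cpp
#include <cassert>
#include <cstdint>

// Loop form of the unrolled steps (shifts of 16, 8, 4, 2, 1).
// Hypothetical name; returns 32 for an input of 0.
inline unsigned clz32(std::uint32_t v) {
  unsigned result = 32u;
  for (unsigned shift = 16u; shift != 0u; shift >>= 1) {
    const std::uint32_t temp = v >> shift;
    if (temp) {         // highest set bit lies in the upper part
      result -= shift;
      v = temp;
    }
  }
  if (v & 1u)           // account for the final remaining bit
    result--;
  return result;
}

// Spot checks that double as usage examples.
inline void clz32_examples() {
  assert(clz32(0u) == 32u);
  assert(clz32(1u) == 31u);
  assert(clz32(0x80000000u) == 0u);
}
```

The unrolled form in the diff avoids the loop counter, which may matter in hot paths; the loop is only meant to make the pattern visible.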
inline uint64 emulu(uint32 a, uint32 b)
{
inline uint64 emulu(uint32 a, uint32 b) {
#if defined(_M_IX86) && defined(_MSC_VER)
return __emulu(a, b);
#else
@@ -217,21 +268,13 @@ namespace crnlib
void compute_lower_pow2_dim(int& width, int& height);
void compute_upper_pow2_dim(int& width, int& height);
inline bool equal_tol(float a, float b, float t)
{
inline bool equal_tol(float a, float b, float t) {
return fabs(a - b) < ((maximum(fabs(a), fabs(b)) + 1.0f) * t);
}
inline bool equal_tol(double a, double b, double t)
{
inline bool equal_tol(double a, double b, double t) {
return fabs(a - b) < ((maximum(fabs(a), fabs(b)) + 1.0f) * t);
}
}
} // namespace crnlib
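The equal_tol overloads above scale the tolerance by `max(|a|, |b|) + 1`, so the test behaves like a relative comparison for large magnitudes and like an absolute one near zero. A standalone sketch of the double version, using the hypothetical name `approx_equal` and the standard library instead of crnlib's `maximum`:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical standalone version of the comparison shown in the diff.
inline bool approx_equal(double a, double b, double t) {
  assert(t >= 0.0);
  // The "+ 1.0" keeps the threshold from collapsing to zero near the origin.
  const double scale = std::max(std::fabs(a), std::fabs(b)) + 1.0;
  return std::fabs(a - b) < scale * t;
}
```

Because the comparison is a strict `<`, a tolerance of exactly 0 reports even identical values as unequal; callers that want exact matches to pass need a positive tolerance.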
+80 -151
@@ -4,17 +4,15 @@
#include "crn_vec.h"
namespace crnlib
{
template<class X, class Y, class Z> Z& matrix_mul_helper(Z& result, const X& lhs, const Y& rhs)
{
namespace crnlib {
template <class X, class Y, class Z>
Z& matrix_mul_helper(Z& result, const X& lhs, const Y& rhs) {
CRNLIB_ASSUME(Z::num_rows == X::num_rows);
CRNLIB_ASSUME(Z::num_cols == Y::num_cols);
CRNLIB_ASSUME(X::num_cols == Y::num_rows);
CRNLIB_ASSERT((&result != &lhs) && (&result != &rhs));
for (int r = 0; r < X::num_rows; r++)
for (int c = 0; c < Y::num_cols; c++)
{
for (int c = 0; c < Y::num_cols; c++) {
typename Z::scalar_type s = lhs(r, 0) * rhs(0, c);
for (uint i = 1; i < X::num_cols; i++)
s += lhs(r, i) * rhs(i, c);
@@ -23,14 +21,13 @@ namespace crnlib
return result;
}
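matrix_mul_helper above checks the three dimension constraints at compile time and then computes each output element as a dot product of a row of `lhs` with a column of `rhs`. The same structure can be sketched without crnlib's `matrix` and `vec` classes, assuming nested `std::array` storage; the names `Mat` and `mat_mul` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Row-major R x C matrix of floats (hypothetical stand-in for crnlib's matrix).
template <std::size_t R, std::size_t C>
using Mat = std::array<std::array<float, C>, R>;

// Multiplies an R x K matrix by a K x C matrix. The shared dimension K is
// enforced by the types, much like the CRNLIB_ASSUME checks in the diff.
template <std::size_t R, std::size_t K, std::size_t C>
Mat<R, C> mat_mul(const Mat<R, K>& lhs, const Mat<K, C>& rhs) {
  static_assert(K > 0, "shared dimension must be non-zero");
  Mat<R, C> result{};
  for (std::size_t r = 0; r < R; ++r) {
    for (std::size_t c = 0; c < C; ++c) {
      float s = lhs[r][0] * rhs[0][c];
      for (std::size_t i = 1; i < K; ++i)
        s += lhs[r][i] * rhs[i][c];
      result[r][c] = s;
    }
  }
  return result;
}

// Usage example: multiplying by the identity leaves a matrix unchanged.
inline void mat_mul_example() {
  const Mat<2, 2> a = {{{1.0f, 2.0f}, {3.0f, 4.0f}}};
  const Mat<2, 2> identity = {{{1.0f, 0.0f}, {0.0f, 1.0f}}};
  assert(mat_mul(a, identity) == a);
}
```

Returning by value sidesteps the aliasing concern that the original guards with `CRNLIB_ASSERT((&result != &lhs) && (&result != &rhs))`, at the cost of a copy that the out-parameter form avoids.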
template<class X, class Y, class Z> Z& matrix_mul_helper_transpose_lhs(Z& result, const X& lhs, const Y& rhs)
{
template <class X, class Y, class Z>
Z& matrix_mul_helper_transpose_lhs(Z& result, const X& lhs, const Y& rhs) {
CRNLIB_ASSUME(Z::num_rows == X::num_cols);
CRNLIB_ASSUME(Z::num_cols == Y::num_cols);
CRNLIB_ASSUME(X::num_rows == Y::num_rows);
for (int r = 0; r < X::num_cols; r++)
for (int c = 0; c < Y::num_cols; c++)
{
for (int c = 0; c < Y::num_cols; c++) {
typename Z::scalar_type s = lhs(0, r) * rhs(0, c);
for (uint i = 1; i < X::num_rows; i++)
s += lhs(i, r) * rhs(i, c);
@@ -39,14 +36,13 @@ namespace crnlib
return result;
}
template<class X, class Y, class Z> Z& matrix_mul_helper_transpose_rhs(Z& result, const X& lhs, const Y& rhs)
{
template <class X, class Y, class Z>
Z& matrix_mul_helper_transpose_rhs(Z& result, const X& lhs, const Y& rhs) {
CRNLIB_ASSUME(Z::num_rows == X::num_rows);
CRNLIB_ASSUME(Z::num_cols == Y::num_rows);
CRNLIB_ASSUME(X::num_cols == Y::num_cols);
for (int r = 0; r < X::num_rows; r++)
for (int c = 0; c < Y::num_rows; c++)
{
for (int c = 0; c < Y::num_rows; c++) {
typename Z::scalar_type s = lhs(r, 0) * rhs(c, 0);
for (uint i = 1; i < X::num_cols; i++)
s += lhs(r, i) * rhs(c, i);
@@ -56,11 +52,11 @@ namespace crnlib
}
template <uint R, uint C, typename T>
class matrix
{
class matrix {
public:
typedef T scalar_type;
enum { num_rows = R, num_cols = C };
enum { num_rows = R,
num_cols = C };
typedef vec<R, T> col_vec;
typedef vec<(R > 1) ? (R - 1) : 0, T> subcol_vec;
@@ -74,14 +70,12 @@ namespace crnlib
inline matrix(const T* p) { set(p); }
inline matrix(const matrix& other)
{
inline matrix(const matrix& other) {
for (uint i = 0; i < R; i++)
m_rows[i] = other.m_rows[i];
}
inline matrix& operator= (const matrix& rhs)
{
inline matrix& operator=(const matrix& rhs) {
if (this != &rhs)
for (uint i = 0; i < R; i++)
m_rows[i] = rhs.m_rows[i];
@@ -89,41 +83,34 @@ namespace crnlib
}
inline matrix(T val00, T val01,
T val10, T val11)
{
T val10, T val11) {
set(val00, val01, val10, val11);
}
inline matrix(T val00, T val01, T val02,
T val10, T val11, T val12,
T val20, T val21, T val22)
{
T val20, T val21, T val22) {
set(val00, val01, val02, val10, val11, val12, val20, val21, val22);
}
inline matrix(T val00, T val01, T val02, T val03,
T val10, T val11, T val12, T val13,
T val20, T val21, T val22, T val23,
T val30, T val31, T val32, T val33)
{
T val30, T val31, T val32, T val33) {
set(val00, val01, val02, val03, val10, val11, val12, val13, val20, val21, val22, val23, val30, val31, val32, val33);
}
inline void set(const float* p)
{
for (uint i = 0; i < R; i++)
{
inline void set(const float* p) {
for (uint i = 0; i < R; i++) {
m_rows[i].set(p);
p += C;
}
}
inline void set(T val00, T val01,
T val10, T val11)
{
T val10, T val11) {
m_rows[0].set(val00, val01);
if (R >= 2)
{
if (R >= 2) {
m_rows[1].set(val10, val11);
for (uint i = 2; i < R; i++)
@@ -133,14 +120,11 @@ namespace crnlib
inline void set(T val00, T val01, T val02,
T val10, T val11, T val12,
T val20, T val21, T val22)
{
T val20, T val21, T val22) {
m_rows[0].set(val00, val01, val02);
if (R >= 2)
{
if (R >= 2) {
m_rows[1].set(val10, val11, val12);
if (R >= 3)
{
if (R >= 3) {
m_rows[2].set(val20, val21, val22);
for (uint i = 3; i < R; i++)
@@ -152,18 +136,14 @@ namespace crnlib
inline void set(T val00, T val01, T val02, T val03,
T val10, T val11, T val12, T val13,
T val20, T val21, T val22, T val23,
T val30, T val31, T val32, T val33)
{
T val30, T val31, T val32, T val33) {
m_rows[0].set(val00, val01, val02, val03);
if (R >= 2)
{
if (R >= 2) {
m_rows[1].set(val10, val11, val12, val13);
if (R >= 3)
{
if (R >= 3) {
m_rows[2].set(val20, val21, val22, val23);
if (R >= 4)
{
if (R >= 4) {
m_rows[3].set(val30, val31, val32, val33);
for (uint i = 4; i < R; i++)
@@ -173,26 +153,22 @@ namespace crnlib
}
}
inline T operator() (uint r, uint c) const
{
inline T operator()(uint r, uint c) const {
CRNLIB_ASSERT((r < R) && (c < C));
return m_rows[r][c];
}
inline T& operator() (uint r, uint c)
{
inline T& operator()(uint r, uint c) {
CRNLIB_ASSERT((r < R) && (c < C));
return m_rows[r][c];
}
inline const row_vec& operator[] (uint r) const
{
inline const row_vec& operator[](uint r) const {
CRNLIB_ASSERT(r < R);
return m_rows[r];
}
inline row_vec& operator[] (uint r)
{
inline row_vec& operator[](uint r) {
CRNLIB_ASSERT(r < R);
return m_rows[r];
}
@@ -200,8 +176,7 @@ namespace crnlib
inline const row_vec& get_row(uint r) const { return (*this)[r]; }
inline row_vec& get_row(uint r) { return (*this)[r]; }
inline col_vec get_col(uint c) const
{
inline col_vec get_col(uint c) const {
CRNLIB_ASSERT(c < C);
col_vec result;
for (uint i = 0; i < R; i++)
@@ -209,15 +184,13 @@ namespace crnlib
return result;
}
inline void set_col(uint c, const col_vec& col)
{
inline void set_col(uint c, const col_vec& col) {
CRNLIB_ASSERT(c < C);
for (uint i = 0; i < R; i++)
m_rows[i][c] = col[i];
}
inline void set_col(uint c, const subcol_vec& col)
{
inline void set_col(uint c, const subcol_vec& col) {
CRNLIB_ASSERT(c < C);
for (uint i = 0; i < (R - 1); i++)
m_rows[i][c] = col[i];
@@ -225,19 +198,16 @@ namespace crnlib
m_rows[R - 1][c] = 0.0f;
}
inline const row_vec& get_translate() const
{
inline const row_vec& get_translate() const {
return m_rows[R - 1];
}
inline matrix& set_translate(const row_vec& r)
{
inline matrix& set_translate(const row_vec& r) {
m_rows[R - 1] = r;
return *this;
}
inline matrix& set_translate(const subrow_vec& r)
{
inline matrix& set_translate(const subrow_vec& r) {
m_rows[R - 1] = row_vec(r).as_point();
return *this;
}
@@ -245,128 +215,109 @@ namespace crnlib
inline const T* get_ptr() const { return reinterpret_cast<const T*>(&m_rows[0]); }
inline T* get_ptr() { return reinterpret_cast<T*>(&m_rows[0]); }
inline matrix& operator+= (const matrix& other)
{
inline matrix& operator+=(const matrix& other) {
for (uint i = 0; i < R; i++)
m_rows[i] += other.m_rows[i];
return *this;
}
inline matrix& operator-= (const matrix& other)
{
inline matrix& operator-=(const matrix& other) {
for (uint i = 0; i < R; i++)
m_rows[i] -= other.m_rows[i];
return *this;
}
inline matrix& operator*= (T val)
{
inline matrix& operator*=(T val) {
for (uint i = 0; i < R; i++)
m_rows[i] *= val;
return *this;
}
inline matrix& operator/= (T val)
{
inline matrix& operator/=(T val) {
for (uint i = 0; i < R; i++)
m_rows[i] /= val;
return *this;
}
inline matrix& operator*= (const matrix& other)
{
inline matrix& operator*=(const matrix& other) {
matrix result;
matrix_mul_helper(result, *this, other);
*this = result;
return *this;
}
friend inline matrix operator+ (const matrix& lhs, const matrix& rhs)
{
friend inline matrix operator+(const matrix& lhs, const matrix& rhs) {
matrix result;
for (uint i = 0; i < R; i++)
result[i] = lhs.m_rows[i] + rhs.m_rows[i];
return result;
}
friend inline matrix operator- (const matrix& lhs, const matrix& rhs)
{
friend inline matrix operator-(const matrix& lhs, const matrix& rhs) {
matrix result;
for (uint i = 0; i < R; i++)
result[i] = lhs.m_rows[i] - rhs.m_rows[i];
return result;
}
friend inline matrix operator* (const matrix& lhs, T val)
{
friend inline matrix operator*(const matrix& lhs, T val) {
matrix result;
for (uint i = 0; i < R; i++)
result[i] = lhs.m_rows[i] * val;
return result;
}
friend inline matrix operator/ (const matrix& lhs, T val)
{
friend inline matrix operator/(const matrix& lhs, T val) {
matrix result;
for (uint i = 0; i < R; i++)
result[i] = lhs.m_rows[i] / val;
return result;
}
friend inline matrix operator* (T val, const matrix& rhs)
{
friend inline matrix operator*(T val, const matrix& rhs) {
matrix result;
for (uint i = 0; i < R; i++)
result[i] = val * rhs.m_rows[i];
return result;
}
friend inline matrix operator* (const matrix& lhs, const matrix& rhs)
{
friend inline matrix operator*(const matrix& lhs, const matrix& rhs) {
matrix result;
return matrix_mul_helper(result, lhs, rhs);
}
friend inline row_vec operator* (const col_vec& a, const matrix& b)
{
friend inline row_vec operator*(const col_vec& a, const matrix& b) {
return transform(a, b);
}
inline matrix operator+ () const
{
inline matrix operator+() const {
return *this;
}
inline matrix operator- () const
{
inline matrix operator-() const {
matrix result;
for (uint i = 0; i < R; i++)
result[i] = -m_rows[i];
return result;
}
inline void clear(void)
{
inline void clear(void) {
for (uint i = 0; i < R; i++)
m_rows[i].clear();
}
inline void set_zero_matrix()
{
inline void set_zero_matrix() {
clear();
}
inline void set_identity_matrix()
{
for (uint i = 0; i < R; i++)
{
inline void set_identity_matrix() {
for (uint i = 0; i < R; i++) {
m_rows[i].clear();
m_rows[i][i] = 1.0f;
}
}
inline matrix& set_scale_matrix(float s)
{
inline matrix& set_scale_matrix(float s) {
clear();
for (int i = 0; i < (R - 1); i++)
m_rows[i][i] = s;
@@ -374,37 +325,32 @@ namespace crnlib
return *this;
}
inline matrix& set_scale_matrix(const row_vec& s)
{
inline matrix& set_scale_matrix(const row_vec& s) {
clear();
for (uint i = 0; i < R; i++)
m_rows[i][i] = s[i];
return *this;
}
inline matrix& set_translate_matrix(const row_vec& s)
{
inline matrix& set_translate_matrix(const row_vec& s) {
set_identity_matrix();
set_translate(s);
return *this;
}
inline matrix& set_translate_matrix(float x, float y)
{
inline matrix& set_translate_matrix(float x, float y) {
set_identity_matrix();
set_translate(row_vec(x, y).as_point());
return *this;
}
inline matrix& set_translate_matrix(float x, float y, float z)
{
inline matrix& set_translate_matrix(float x, float y, float z) {
set_identity_matrix();
set_translate(row_vec(x, y, z).as_point());
return *this;
}
inline matrix get_transposed(void) const
{
inline matrix get_transposed(void) const {
matrix result;
for (uint i = 0; i < R; i++)
for (uint j = 0; j < C; j++)
@@ -412,8 +358,7 @@ namespace crnlib
return result;
}
inline matrix& transpose_in_place(void)
{
inline matrix& transpose_in_place(void) {
matrix result;
for (uint i = 0; i < R; i++)
for (uint j = 0; j < C; j++)
@@ -423,8 +368,7 @@ namespace crnlib
}
// This method transforms a column vec by a matrix (D3D-style).
static inline row_vec transform(const col_vec& a, const matrix& b)
{
static inline row_vec transform(const col_vec& a, const matrix& b) {
row_vec result(b[0] * a[0]);
for (uint r = 1; r < R; r++)
result += b[r] * a[r];
@@ -432,8 +376,7 @@ namespace crnlib
}
// This method transforms a column vec by a matrix. Last component of vec is assumed to be 1.
static inline row_vec transform_point(const col_vec& a, const matrix& b)
{
static inline row_vec transform_point(const col_vec& a, const matrix& b) {
row_vec result(0);
for (int r = 0; r < (R - 1); r++)
result += b[r] * a[r];
@@ -442,19 +385,16 @@ namespace crnlib
}
// This method transforms a column vec by a matrix. Last component of vec is assumed to be 0.
static inline row_vec transform_vector(const col_vec& a, const matrix& b)
{
static inline row_vec transform_vector(const col_vec& a, const matrix& b) {
row_vec result(0);
for (int r = 0; r < (R - 1); r++)
result += b[r] * a[r];
return result;
}
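The transform helpers above accumulate `b[r] * a[r]`, that is, each input component scales one matrix row, and get_translate shows that the translation lives in the last row. Under that reading, the difference between transforming a point (last component 1) and a vector (last component 0) is whether the translation row contributes. A 2D sketch with hypothetical types (`Vec3`, `Mat3`) and function names, not crnlib's classes:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec3 = std::array<float, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major; last row holds the translation

// Sums the matrix rows weighted by the components of a.
inline Vec3 transform_row_vec(const Vec3& a, const Mat3& m) {
  Vec3 result = {{0.0f, 0.0f, 0.0f}};
  for (std::size_t r = 0; r < 3; ++r)
    for (std::size_t c = 0; c < 3; ++c)
      result[c] += m[r][c] * a[r];
  return result;
}

// Points carry 1 in the last slot, so the translation row is added.
inline Vec3 make_point(float x, float y) { return Vec3{{x, y, 1.0f}}; }

// Vectors carry 0 in the last slot, so the translation row is ignored.
inline Vec3 make_vector(float x, float y) { return Vec3{{x, y, 0.0f}}; }

// Usage example with a translation by (5, 7).
inline void transform_example() {
  const Mat3 t = {{{{1.0f, 0.0f, 0.0f}}, {{0.0f, 1.0f, 0.0f}}, {{5.0f, 7.0f, 1.0f}}}};
  assert(transform_row_vec(make_point(1.0f, 2.0f), t)[0] == 6.0f);
  assert(transform_row_vec(make_vector(1.0f, 2.0f), t)[0] == 1.0f);
}
```

transform_point and transform_vector in the diff reach the same effect by looping over only the first `R - 1` rows instead of storing an explicit 0 or 1.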
static inline subcol_vec transform_point(const subcol_vec& a, const matrix& b)
{
static inline subcol_vec transform_point(const subcol_vec& a, const matrix& b) {
subcol_vec result(0);
for (int r = 0; r < R; r++)
{
for (int r = 0; r < R; r++) {
const T s = (r < subcol_vec::num_elements) ? a[r] : 1.0f;
for (int c = 0; c < (C - 1); c++)
result[c] += b[r][c] * s;
@@ -462,11 +402,9 @@ namespace crnlib
return result;
}
static inline subcol_vec transform_vector(const subcol_vec& a, const matrix& b)
{
static inline subcol_vec transform_vector(const subcol_vec& a, const matrix& b) {
subcol_vec result(0);
for (int r = 0; r < (R - 1); r++)
{
for (int r = 0; r < (R - 1); r++) {
const T s = a[r];
for (int c = 0; c < (C - 1); c++)
result[c] += b[r][c] * s;
@@ -475,8 +413,7 @@ namespace crnlib
}
// This method transforms a column vec by the transpose of a matrix.
static inline col_vec transform_transposed(const matrix& b, const col_vec& a)
{
static inline col_vec transform_transposed(const matrix& b, const col_vec& a) {
CRNLIB_ASSUME(R == C);
col_vec result;
for (uint r = 0; r < R; r++)
@@ -485,12 +422,10 @@ namespace crnlib
}
// This method transforms a column vec by the transpose of a matrix. Last component of vec is assumed to be 0.
static inline col_vec transform_vector_transposed(const matrix& b, const col_vec& a)
{
static inline col_vec transform_vector_transposed(const matrix& b, const col_vec& a) {
CRNLIB_ASSUME(R == C);
col_vec result;
for (uint r = 0; r < R; r++)
{
for (uint r = 0; r < R; r++) {
T s = 0;
for (uint c = 0; c < (C - 1); c++)
s += b[r][c] * a[c];
@@ -501,31 +436,26 @@ namespace crnlib
}
// This method transforms a matrix by a row vector (OGL style).
static inline col_vec transform(const matrix& b, const row_vec& a)
{
static inline col_vec transform(const matrix& b, const row_vec& a) {
col_vec result;
for (int r = 0; r < R; r++)
result[r] = b[r] * a;
return result;
}
static inline matrix& multiply(matrix& result, const matrix& lhs, const matrix& rhs)
{
static inline matrix& multiply(matrix& result, const matrix& lhs, const matrix& rhs) {
return matrix_mul_helper(result, lhs, rhs);
}
static inline matrix make_scale_matrix(float s)
{
static inline matrix make_scale_matrix(float s) {
return matrix().set_scale_matrix(s);
}
static inline matrix make_scale_matrix(const row_vec& s)
{
static inline matrix make_scale_matrix(const row_vec& s) {
return matrix().set_scale_matrix(s);
}
static inline matrix make_scale_matrix(float x, float y)
{
static inline matrix make_scale_matrix(float x, float y) {
CRNLIB_ASSUME(R >= 3 && C >= 3);
matrix result;
result.clear();
@@ -535,8 +465,7 @@ namespace crnlib
return result;
}
static inline matrix make_scale_matrix(float x, float y, float z)
{
static inline matrix make_scale_matrix(float x, float y, float z) {
CRNLIB_ASSUME(R >= 4 && C >= 4);
matrix result;
result.clear();
+38 -160
@@ -14,8 +14,7 @@
#define _msize malloc_usable_size
#endif
namespace crnlib
{
namespace crnlib {
#if CRNLIB_MEM_STATS
#if CRNLIB_64BIT_POINTERS
typedef LONGLONG mem_stat_t;
@@ -29,11 +28,9 @@ namespace crnlib
static volatile mem_stat_t g_total_allocated;
static volatile mem_stat_t g_max_allocated;
static mem_stat_t update_total_allocated(int block_delta, mem_stat_t byte_delta)
{
static mem_stat_t update_total_allocated(int block_delta, mem_stat_t byte_delta) {
mem_stat_t cur_total_blocks;
for ( ; ; )
{
for (;;) {
cur_total_blocks = (mem_stat_t)g_total_blocks;
mem_stat_t new_total_blocks = static_cast<mem_stat_t>(cur_total_blocks + block_delta);
CRNLIB_ASSERT(new_total_blocks >= 0);
@@ -42,16 +39,14 @@ namespace crnlib
}
mem_stat_t cur_total_allocated, new_total_allocated;
for ( ; ; )
{
for (;;) {
cur_total_allocated = g_total_allocated;
new_total_allocated = static_cast<mem_stat_t>(cur_total_allocated + byte_delta);
CRNLIB_ASSERT(new_total_allocated >= 0);
if (CRNLIB_MEM_COMPARE_EXCHANGE(&g_total_allocated, new_total_allocated, cur_total_allocated) == cur_total_allocated)
break;
}
for ( ; ; )
{
for (;;) {
mem_stat_t cur_max_allocated = g_max_allocated;
mem_stat_t new_max_allocated = CRNLIB_MAX(new_total_allocated, cur_max_allocated);
if (CRNLIB_MEM_COMPARE_EXCHANGE(&g_max_allocated, new_max_allocated, cur_max_allocated) == cur_max_allocated)
@@ -61,35 +56,26 @@ namespace crnlib
}
#endif // CRNLIB_MEM_STATS
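update_total_allocated above uses retry loops around CRNLIB_MEM_COMPARE_EXCHANGE: read the current value, compute the new one, and retry if another thread changed it in between. The last loop, which tracks the maximum ever allocated, can be sketched with C++11 `std::atomic`. This swaps in the standard library for crnlib's macro, and the name `update_max` is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Raises max_seen to candidate if candidate is larger and returns the
// resulting maximum. Sketch only; crnlib uses its own compare-exchange macro.
inline std::int64_t update_max(std::atomic<std::int64_t>& max_seen,
                               std::int64_t candidate) {
  std::int64_t cur = max_seen.load();
  // On failure compare_exchange_weak reloads cur, so each retry re-tests
  // against the value another thread just wrote.
  while (cur < candidate && !max_seen.compare_exchange_weak(cur, candidate)) {
  }
  return (cur < candidate) ? candidate : cur;
}

// Single-threaded usage example.
inline void update_max_example() {
  std::atomic<std::int64_t> peak(10);
  assert(update_max(peak, 5) == 10);   // smaller candidate leaves the peak alone
  assert(update_max(peak, 20) == 20);  // larger candidate replaces it
}
```

Unlike the loop in the diff, this sketch skips the exchange entirely when the candidate is not larger, which avoids a write in the common case.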
static void* crnlib_default_realloc(void* p, size_t size, size_t* pActual_size, bool movable, void* pUser_data)
{
pUser_data;
static void* crnlib_default_realloc(void* p, size_t size, size_t* pActual_size, bool movable, void*) {
void* p_new;
if (!p)
{
if (!p) {
p_new = ::malloc(size);
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(p_new) & (CRNLIB_MIN_ALLOC_ALIGNMENT - 1)) == 0);
if (!p_new)
{
if (!p_new) {
printf("WARNING: ::malloc() of size %u failed!\n", (uint)size);
}
if (pActual_size)
*pActual_size = p_new ? ::_msize(p_new) : 0;
}
else if (!size)
{
} else if (!size) {
::free(p);
p_new = NULL;
if (pActual_size)
*pActual_size = 0;
}
else
{
} else {
void* p_final_block = p;
#ifdef WIN32
p_new = ::_expand(p, size);
@@ -97,22 +83,16 @@ namespace crnlib
p_new = NULL;
#endif
if (p_new)
{
if (p_new) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(p_new) & (CRNLIB_MIN_ALLOC_ALIGNMENT - 1)) == 0);
p_final_block = p_new;
}
else if (movable)
{
} else if (movable) {
p_new = ::realloc(p, size);
if (p_new)
{
if (p_new) {
CRNLIB_ASSERT((reinterpret_cast<ptr_bits_t>(p_new) & (CRNLIB_MIN_ALLOC_ALIGNMENT - 1)) == 0);
p_final_block = p_new;
}
else
{
} else {
printf("WARNING: ::realloc() of size %u failed!\n", (uint)size);
}
}
@@ -124,110 +104,27 @@ namespace crnlib
return p_new;
}
static size_t crnlib_default_msize(void* p, void* pUser_data)
{
pUser_data;
static size_t crnlib_default_msize(void* p, void*) {
return p ? _msize(p) : 0;
}
#if 0
static __declspec(thread) void *g_pBuf;
static __declspec(thread) size_t g_buf_size;
static __declspec(thread) size_t g_buf_ofs;
static size_t crnlib_nofree_msize(void* p, void* pUser_data)
{
pUser_data;
return p ? ((const size_t*)p)[-1] : 0;
}
static void* crnlib_nofree_realloc(void* p, size_t size, size_t* pActual_size, bool movable, void* pUser_data)
{
pUser_data;
void* p_new;
if (!p)
{
size = math::align_up_value(size, CRNLIB_MIN_ALLOC_ALIGNMENT);
size_t actual_size = sizeof(size_t)*2 + size;
size_t num_remaining = g_buf_size - g_buf_ofs;
if (num_remaining < actual_size)
{
g_buf_size = CRNLIB_MAX(actual_size, 32*1024*1024);
g_buf_ofs = 0;
g_pBuf = malloc(g_buf_size);
if (!g_pBuf)
return NULL;
}
p_new = (uint8*)g_pBuf + g_buf_ofs;
((size_t*)p_new)[1] = size;
p_new = (size_t*)p_new + 2;
g_buf_ofs += actual_size;
if (pActual_size)
*pActual_size = size;
CRNLIB_ASSERT(crnlib_nofree_msize(p_new, NULL) == size);
}
else if (!size)
{
if (pActual_size)
*pActual_size = 0;
p_new = NULL;
}
else
{
size_t cur_size = crnlib_nofree_msize(p, NULL);
p_new = p;
if (!movable)
return NULL;
if (size > cur_size)
{
p_new = crnlib_nofree_realloc(NULL, size, NULL, true, NULL);
if (!p_new)
return NULL;
memcpy(p_new, p, cur_size);
cur_size = size;
}
if (pActual_size)
*pActual_size = cur_size;
}
return p_new;
}
static crn_realloc_func g_pRealloc = crnlib_nofree_realloc;
static crn_msize_func g_pMSize = crnlib_nofree_msize;
#else
static crn_realloc_func g_pRealloc = crnlib_default_realloc;
static crn_msize_func g_pMSize = crnlib_default_msize;
#endif
static void* g_pUser_data;
void crnlib_mem_error(const char* p_msg)
{
void crnlib_mem_error(const char* p_msg) {
crnlib_assert(p_msg, __FILE__, __LINE__);
}
void* crnlib_malloc(size_t size)
{
void* crnlib_malloc(size_t size) {
return crnlib_malloc(size, NULL);
}
void* crnlib_malloc(size_t size, size_t* pActual_size)
{
void* crnlib_malloc(size_t size, size_t* pActual_size) {
size = (size + sizeof(uint32) - 1U) & ~(sizeof(uint32) - 1U);
if (!size)
size = sizeof(uint32);
if (size > CRNLIB_MAX_POSSIBLE_BLOCK_SIZE)
{
if (size > CRNLIB_MAX_POSSIBLE_BLOCK_SIZE) {
crnlib_mem_error("crnlib_malloc: size too big");
return NULL;
}
@@ -238,8 +135,7 @@ namespace crnlib
if (pActual_size)
*pActual_size = actual_size;
if ((!p_new) || (actual_size < size))
{
if ((!p_new) || (actual_size < size)) {
crnlib_mem_error("crnlib_malloc: out of memory");
return NULL;
}
@@ -254,16 +150,13 @@ namespace crnlib
return p_new;
}
void* crnlib_realloc(void* p, size_t size, size_t* pActual_size, bool movable)
{
if ((ptr_bits_t)p & (CRNLIB_MIN_ALLOC_ALIGNMENT - 1))
{
void* crnlib_realloc(void* p, size_t size, size_t* pActual_size, bool movable) {
if ((ptr_bits_t)p & (CRNLIB_MIN_ALLOC_ALIGNMENT - 1)) {
crnlib_mem_error("crnlib_realloc: bad ptr");
return NULL;
}
if (size > CRNLIB_MAX_POSSIBLE_BLOCK_SIZE)
{
if (size > CRNLIB_MAX_POSSIBLE_BLOCK_SIZE) {
crnlib_mem_error("crnlib_malloc: size too big");
return NULL;
}
@@ -287,13 +180,10 @@ namespace crnlib
CRNLIB_ASSERT(!p_new || ((*g_pMSize)(p_new, g_pUser_data) == actual_size));
int num_new_blocks = 0;
if (p)
{
if (p) {
if (!p_new)
num_new_blocks = -1;
}
else if (p_new)
{
} else if (p_new) {
num_new_blocks = 1;
}
update_total_allocated(num_new_blocks, static_cast<mem_stat_t>(actual_size) - static_cast<mem_stat_t>(cur_size));
@@ -302,21 +192,19 @@ namespace crnlib
return p_new;
}
void* crnlib_calloc(size_t count, size_t size, size_t* pActual_size)
{
void* crnlib_calloc(size_t count, size_t size, size_t* pActual_size) {
size_t total = count * size;
void* p = crnlib_malloc(total, pActual_size);
if (p) memset(p, 0, total);
if (p)
memset(p, 0, total);
return p;
}
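crnlib_calloc above computes `count * size` without checking whether the product wraps around. Whether that matters depends on the callers, which this diff does not show. As a hedged illustration only, an overflow-checked variant could look like the sketch below; it uses plain `std::malloc` rather than crnlib_malloc, and the names `mul_overflows` and `checked_calloc` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <limits>

// True when count * size would not fit in std::size_t.
inline bool mul_overflows(std::size_t count, std::size_t size) {
  return size != 0 && count > std::numeric_limits<std::size_t>::max() / size;
}

// Zero-initialized allocation that refuses requests whose size wraps around.
// Sketch only: it does not go through crnlib's allocator callbacks.
inline void* checked_calloc(std::size_t count, std::size_t size) {
  if (mul_overflows(count, size))
    return nullptr;
  const std::size_t total = count * size;
  void* p = std::malloc(total ? total : 1);
  if (p)
    std::memset(p, 0, total);
  return p;
}

// Usage example: an oversized request is rejected, a normal one is zeroed.
inline void checked_calloc_example() {
  const std::size_t huge = std::numeric_limits<std::size_t>::max();
  assert(checked_calloc(huge, 2) == nullptr);
  unsigned char* p = static_cast<unsigned char*>(checked_calloc(4, 8));
  assert(p && p[0] == 0 && p[31] == 0);
  std::free(p);
}
```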
void crnlib_free(void* p)
{
void crnlib_free(void* p) {
if (!p)
return;
if (reinterpret_cast<ptr_bits_t>(p) & (CRNLIB_MIN_ALLOC_ALIGNMENT - 1))
{
if (reinterpret_cast<ptr_bits_t>(p) & (CRNLIB_MIN_ALLOC_ALIGNMENT - 1)) {
crnlib_mem_error("crnlib_free: bad ptr");
return;
}
@@ -330,13 +218,11 @@ namespace crnlib
(*g_pRealloc)(p, 0, NULL, true, g_pUser_data);
}
size_t crnlib_msize(void* p)
{
size_t crnlib_msize(void* p) {
if (!p)
return 0;
if (reinterpret_cast<ptr_bits_t>(p) & (CRNLIB_MIN_ALLOC_ALIGNMENT - 1))
{
if (reinterpret_cast<ptr_bits_t>(p) & (CRNLIB_MIN_ALLOC_ALIGNMENT - 1)) {
crnlib_mem_error("crnlib_msize: bad ptr");
return 0;
}
@@ -344,16 +230,12 @@ namespace crnlib
return (*g_pMSize)(p, g_pUser_data);
}
void crnlib_print_mem_stats()
{
void crnlib_print_mem_stats() {
#if CRNLIB_MEM_STATS
if (console::is_initialized())
{
if (console::is_initialized()) {
console::debug("crnlib_print_mem_stats:");
console::debug("Current blocks: %u, allocated: " CRNLIB_INT64_FORMAT_SPECIFIER ", max ever allocated: " CRNLIB_INT64_FORMAT_SPECIFIER, g_total_blocks, (int64)g_total_allocated, (int64)g_max_allocated);
}
else
{
} else {
printf("crnlib_print_mem_stats:\n");
printf("Current blocks: %u, allocated: " CRNLIB_INT64_FORMAT_SPECIFIER ", max ever allocated: " CRNLIB_INT64_FORMAT_SPECIFIER "\n", g_total_blocks, (int64)g_total_allocated, (int64)g_max_allocated);
}
@@ -362,16 +244,12 @@ namespace crnlib
} // namespace crnlib
void crn_set_memory_callbacks(crn_realloc_func pRealloc, crn_msize_func pMSize, void* pUser_data)
{
if ((!pRealloc) || (!pMSize))
{
void crn_set_memory_callbacks(crn_realloc_func pRealloc, crn_msize_func pMSize, void* pUser_data) {
if ((!pRealloc) || (!pMSize)) {
crnlib::g_pRealloc = crnlib::crnlib_default_realloc;
crnlib::g_pMSize = crnlib::crnlib_default_msize;
crnlib::g_pUser_data = NULL;
}
else
{
} else {
crnlib::g_pRealloc = pRealloc;
crnlib::g_pMSize = pMSize;
crnlib::g_pUser_data = pUser_data;
+31 -59
@@ -6,8 +6,7 @@
#define CRNLIB_MIN_ALLOC_ALIGNMENT sizeof(size_t) * 2
#endif
namespace crnlib
{
namespace crnlib {
#if CRNLIB_64BIT_POINTERS
const uint64 CRNLIB_MAX_POSSIBLE_BLOCK_SIZE = 0x400000000ULL;
#else
@@ -26,8 +25,7 @@ namespace crnlib
// omfg - there must be a better way
template <typename T>
inline T* crnlib_new()
{
inline T* crnlib_new() {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
if (CRNLIB_IS_SCALAR_TYPE(T))
return p;
@@ -35,104 +33,90 @@ namespace crnlib
}
template <typename T, typename A>
inline T* crnlib_new(const A& init0)
{
inline T* crnlib_new(const A& init0) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0);
}
template <typename T, typename A>
inline T* crnlib_new(A& init0)
{
inline T* crnlib_new(A& init0) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0);
}
template <typename T, typename A, typename B>
inline T* crnlib_new(const A& init0, const B& init1)
{
inline T* crnlib_new(const A& init0, const B& init1) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0, init1);
}
template <typename T, typename A, typename B, typename C>
inline T* crnlib_new(const A& init0, const B& init1, const C& init2)
{
inline T* crnlib_new(const A& init0, const B& init1, const C& init2) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0, init1, init2);
}
template <typename T, typename A, typename B, typename C, typename D>
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3)
{
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0, init1, init2, init3);
}
template <typename T, typename A, typename B, typename C, typename D, typename E>
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4)
{
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0, init1, init2, init3, init4);
}
template <typename T, typename A, typename B, typename C, typename D, typename E, typename F>
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5)
{
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0, init1, init2, init3, init4, init5);
}
template <typename T, typename A, typename B, typename C, typename D, typename E, typename F, typename G>
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6)
{
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0, init1, init2, init3, init4, init5, init6);
}
template <typename T, typename A, typename B, typename C, typename D, typename E, typename F, typename G, typename H>
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6, const H& init7)
{
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6, const H& init7) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0, init1, init2, init3, init4, init5, init6, init7);
}
template <typename T, typename A, typename B, typename C, typename D, typename E, typename F, typename G, typename H, typename I>
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6, const H& init7, const I& init8)
{
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6, const H& init7, const I& init8) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0, init1, init2, init3, init4, init5, init6, init7, init8);
}
template <typename T, typename A, typename B, typename C, typename D, typename E, typename F, typename G, typename H, typename I, typename J>
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6, const H& init7, const I& init8, const J& init9)
{
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6, const H& init7, const I& init8, const J& init9) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0, init1, init2, init3, init4, init5, init6, init7, init8, init9);
}
template <typename T, typename A, typename B, typename C, typename D, typename E, typename F, typename G, typename H, typename I, typename J, typename K>
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6, const H& init7, const I& init8, const J& init9, const K& init10)
{
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6, const H& init7, const I& init8, const J& init9, const K& init10) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0, init1, init2, init3, init4, init5, init6, init7, init8, init9, init10);
}
template <typename T, typename A, typename B, typename C, typename D, typename E, typename F, typename G, typename H, typename I, typename J, typename K, typename L>
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6, const H& init7, const I& init8, const J& init9, const K& init10, const L& init11)
{
inline T* crnlib_new(const A& init0, const B& init1, const C& init2, const D& init3, const E& init4, const F& init5, const G& init6, const H& init7, const I& init8, const J& init9, const K& init10, const L& init11) {
T* p = static_cast<T*>(crnlib_malloc(sizeof(T)));
return new (static_cast<void*>(p)) T(init0, init1, init2, init3, init4, init5, init6, init7, init8, init9, init10, init11);
}
template <typename T>
inline T* crnlib_new_array(uint32 num)
{
if (!num) num = 1;
inline T* crnlib_new_array(uint32 num) {
if (!num)
num = 1;
uint64 total = CRNLIB_MIN_ALLOC_ALIGNMENT + sizeof(T) * num;
if (total > CRNLIB_MAX_POSSIBLE_BLOCK_SIZE)
{
if (total > CRNLIB_MAX_POSSIBLE_BLOCK_SIZE) {
crnlib_mem_error("crnlib_new_array: Array too large!");
return NULL;
}
@@ -143,20 +127,16 @@ namespace crnlib
reinterpret_cast<uint32*>(p)[-1] = num;
reinterpret_cast<uint32*>(p)[-2] = ~num;
if (!CRNLIB_IS_SCALAR_TYPE(T))
{
if (!CRNLIB_IS_SCALAR_TYPE(T)) {
helpers::construct_array(p, num);
}
return p;
}
template <typename T>
inline void crnlib_delete(T* p)
{
if (p)
{
if (!CRNLIB_IS_SCALAR_TYPE(T))
{
inline void crnlib_delete(T* p) {
if (p) {
if (!CRNLIB_IS_SCALAR_TYPE(T)) {
helpers::destruct(p);
}
crnlib_free(p);
@@ -164,17 +144,13 @@ namespace crnlib
}
template <typename T>
inline void crnlib_delete_array(T* p)
{
if (p)
{
inline void crnlib_delete_array(T* p) {
if (p) {
const uint32 num = reinterpret_cast<uint32*>(p)[-1];
const uint32 num_check = reinterpret_cast<uint32*>(p)[-2];
CRNLIB_ASSERT(num && (num == ~num_check));
if (num == ~num_check)
{
if (!CRNLIB_IS_SCALAR_TYPE(T))
{
if (num == ~num_check) {
if (!CRNLIB_IS_SCALAR_TYPE(T)) {
helpers::destruct_array(p, num);
}
@@ -185,25 +161,21 @@ namespace crnlib
} // namespace crnlib
#define CRNLIB_DEFINE_NEW_DELETE \
void* operator new (size_t size) \
{ \
void* operator new(size_t size) { \
void* p = crnlib::crnlib_malloc(size); \
if (!p) \
crnlib_fail("new: Out of memory!", __FILE__, __LINE__); \
return p; \
} \
void* operator new[] (size_t size) \
{ \
void* operator new[](size_t size) { \
void* p = crnlib::crnlib_malloc(size); \
if (!p) \
crnlib_fail("new[]: Out of memory!", __FILE__, __LINE__); \
return p; \
} \
void operator delete (void* p_block) \
{ \
void operator delete(void* p_block) { \
crnlib::crnlib_free(p_block); \
} \
void operator delete[] (void* p_block) \
{ \
void operator delete[](void* p_block) { \
crnlib::crnlib_free(p_block); \
}
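The array helpers in this file keep the element count and its bitwise complement in the two `uint32` slots just before the returned pointer, and `crnlib_delete_array` only proceeds when the pair is consistent. Below is a minimal, self-contained sketch of that idea. It rests on several assumptions: `kHeaderBytes` and `kMaxBlockBytes` are hypothetical stand-ins for `CRNLIB_MIN_ALLOC_ALIGNMENT` and `CRNLIB_MAX_POSSIBLE_BLOCK_SIZE` (their real values are not shown in this diff), plain `malloc`/`free` replace `crnlib_malloc`/`crnlib_free`, and every `sketch_*` name is invented for illustration. The real allocation lines fall between hunks, so the exact layout here is a guess.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical values; the real crnlib constants are not visible in this diff.
constexpr std::size_t kHeaderBytes = 16;
constexpr std::uint64_t kMaxBlockBytes = 0x7FFF0000ULL;

// Allocates num default-constructed elements and records the count in a header.
template <typename T>
T* sketch_new_array(std::uint32_t num) {
  if (!num)
    num = 1;  // same "at least one element" rule as crnlib_new_array
  const std::uint64_t total = kHeaderBytes + sizeof(T) * static_cast<std::uint64_t>(num);
  if (total > kMaxBlockBytes)
    return nullptr;
  unsigned char* raw = static_cast<unsigned char*>(std::malloc(static_cast<std::size_t>(total)));
  if (!raw)
    return nullptr;
  T* p = reinterpret_cast<T*>(raw + kHeaderBytes);
  reinterpret_cast<std::uint32_t*>(p)[-1] = num;   // element count
  reinterpret_cast<std::uint32_t*>(p)[-2] = ~num;  // complement, used as a sanity check
  for (std::uint32_t i = 0; i < num; i++)
    new (static_cast<void*>(p + i)) T();
  return p;
}

// Returns the stored count, or 0 if the header does not look intact.
template <typename T>
std::uint32_t sketch_array_count(const T* p) {
  const std::uint32_t num = reinterpret_cast<const std::uint32_t*>(p)[-1];
  const std::uint32_t num_check = reinterpret_cast<const std::uint32_t*>(p)[-2];
  return (num && num == ~num_check) ? num : 0;
}

// Destroys the elements and releases the block only when the header checks out.
template <typename T>
void sketch_delete_array(T* p) {
  if (!p)
    return;
  const std::uint32_t num = sketch_array_count(p);
  if (!num)
    return;  // header looks corrupted: do not touch the block
  for (std::uint32_t i = 0; i < num; i++)
    p[i].~T();
  std::free(reinterpret_cast<unsigned char*>(p) - kHeaderBytes);
}
```

Storing the complement next to the count is a cheap way to notice a pointer that did not come from the matching array allocator, or a header that was overwritten, before running destructors over a bogus element count.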
+1521 -1147
File diff suppressed because it is too large
+77 -36
@@ -210,7 +210,11 @@ mz_ulong mz_adler32(mz_ulong adler, const unsigned char *ptr, size_t buf_len);
mz_ulong mz_crc32(mz_ulong crc, const unsigned char* ptr, size_t buf_len);
// Compression strategies.
enum { MZ_DEFAULT_STRATEGY = 0, MZ_FILTERED = 1, MZ_HUFFMAN_ONLY = 2, MZ_RLE = 3, MZ_FIXED = 4 };
enum { MZ_DEFAULT_STRATEGY = 0,
MZ_FILTERED = 1,
MZ_HUFFMAN_ONLY = 2,
MZ_RLE = 3,
MZ_FIXED = 4 };
// Method
#define MZ_DEFLATED 8
@@ -231,13 +235,32 @@ typedef void *(*mz_realloc_func)(void *opaque, void *address, size_t items, size
#define MZ_VER_SUBREVISION 0
// Flush values. For typical usage you only need MZ_NO_FLUSH and MZ_FINISH. The other values are for advanced use (refer to the zlib docs).
enum { MZ_NO_FLUSH = 0, MZ_PARTIAL_FLUSH = 1, MZ_SYNC_FLUSH = 2, MZ_FULL_FLUSH = 3, MZ_FINISH = 4, MZ_BLOCK = 5 };
enum { MZ_NO_FLUSH = 0,
MZ_PARTIAL_FLUSH = 1,
MZ_SYNC_FLUSH = 2,
MZ_FULL_FLUSH = 3,
MZ_FINISH = 4,
MZ_BLOCK = 5 };
// Return status codes. MZ_PARAM_ERROR is non-standard.
enum { MZ_OK = 0, MZ_STREAM_END = 1, MZ_NEED_DICT = 2, MZ_ERRNO = -1, MZ_STREAM_ERROR = -2, MZ_DATA_ERROR = -3, MZ_MEM_ERROR = -4, MZ_BUF_ERROR = -5, MZ_VERSION_ERROR = -6, MZ_PARAM_ERROR = -10000 };
enum { MZ_OK = 0,
MZ_STREAM_END = 1,
MZ_NEED_DICT = 2,
MZ_ERRNO = -1,
MZ_STREAM_ERROR = -2,
MZ_DATA_ERROR = -3,
MZ_MEM_ERROR = -4,
MZ_BUF_ERROR = -5,
MZ_VERSION_ERROR = -6,
MZ_PARAM_ERROR = -10000 };
// Compression levels: 0-9 are the standard zlib-style levels, 10 is best possible compression (not zlib compatible, and may be very slow), MZ_DEFAULT_COMPRESSION=MZ_DEFAULT_LEVEL.
enum { MZ_NO_COMPRESSION = 0, MZ_BEST_SPEED = 1, MZ_BEST_COMPRESSION = 9, MZ_UBER_COMPRESSION = 10, MZ_DEFAULT_LEVEL = 6, MZ_DEFAULT_COMPRESSION = -1 };
enum { MZ_NO_COMPRESSION = 0,
MZ_BEST_SPEED = 1,
MZ_BEST_COMPRESSION = 9,
MZ_UBER_COMPRESSION = 10,
MZ_DEFAULT_LEVEL = 6,
MZ_DEFAULT_COMPRESSION = -1 };
// Window bits
#define MZ_DEFAULT_WINDOW_BITS 15
@@ -245,8 +268,7 @@ enum { MZ_NO_COMPRESSION = 0, MZ_BEST_SPEED = 1, MZ_BEST_COMPRESSION = 9, MZ_UBE
struct mz_internal_state;
// Compression/decompression stream struct.
typedef struct mz_stream_s
{
typedef struct mz_stream_s {
const unsigned char* next_in; // pointer to next byte to read
unsigned int avail_in; // number of bytes available at next_in
mz_ulong total_in; // total number of bytes consumed so far
@@ -459,8 +481,7 @@ typedef int mz_bool;
#ifndef MINIZ_NO_ARCHIVE_APIS
enum
{
enum {
MZ_ZIP_MAX_IO_BUF_SIZE = 64 * 1024,
MZ_ZIP_MAX_ARCHIVE_FILENAME_SIZE = 260,
MZ_ZIP_MAX_ARCHIVE_FILE_COMMENT_SIZE = 256
@@ -494,8 +515,7 @@ typedef size_t (*mz_file_write_func)(void *pOpaque, mz_uint64 file_ofs, const vo
struct mz_zip_internal_state_tag;
typedef struct mz_zip_internal_state_tag mz_zip_internal_state;
typedef enum
{
typedef enum {
MZ_ZIP_MODE_INVALID = 0,
MZ_ZIP_MODE_READING = 1,
MZ_ZIP_MODE_WRITING = 2,
@@ -524,8 +544,7 @@ typedef struct
} mz_zip_archive;
typedef enum
{
typedef enum {
MZ_ZIP_FLAG_CASE_SENSITIVE = 0x0100,
MZ_ZIP_FLAG_IGNORE_PATH = 0x0200,
MZ_ZIP_FLAG_COMPRESSED_DATA = 0x0400,
@@ -656,8 +675,7 @@ void *mz_zip_extract_archive_file_to_heap(const char *pZip_filename, const char
// TINFL_FLAG_HAS_MORE_INPUT: If set, there are more input bytes available beyond the end of the supplied input buffer. If clear, the input buffer contains all remaining input.
// TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF: If set, the output buffer is large enough to hold the entire decompressed stream. If clear, the output buffer is at least the size of the dictionary (typically 32KB).
// TINFL_FLAG_COMPUTE_ADLER32: Force adler-32 checksum computation of the decompressed bytes.
enum
{
enum {
TINFL_FLAG_PARSE_ZLIB_HEADER = 1,
TINFL_FLAG_HAS_MORE_INPUT = 2,
TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF = 4,
@@ -684,14 +702,14 @@ size_t tinfl_decompress_mem_to_mem(void *pOut_buf, size_t out_buf_len, const voi
typedef int (*tinfl_put_buf_func_ptr)(const void* pBuf, int len, void* pUser);
int tinfl_decompress_mem_to_callback(const void* pIn_buf, size_t* pIn_buf_size, tinfl_put_buf_func_ptr pPut_buf_func, void* pPut_buf_user, int flags);
struct tinfl_decompressor_tag; typedef struct tinfl_decompressor_tag tinfl_decompressor;
struct tinfl_decompressor_tag;
typedef struct tinfl_decompressor_tag tinfl_decompressor;
// Max size of LZ dictionary.
#define TINFL_LZ_DICT_SIZE 32768
// Return status.
typedef enum
{
typedef enum {
TINFL_STATUS_BAD_PARAM = -3,
TINFL_STATUS_ADLER32_MISMATCH = -2,
TINFL_STATUS_FAILED = -1,
@@ -701,7 +719,11 @@ typedef enum
} tinfl_status;
// Initializes the decompressor to its initial state.
#define tinfl_init(r) do { (r)->m_state = 0; } MZ_MACRO_END
#define tinfl_init(r) \
do { \
(r)->m_state = 0; \
} \
MZ_MACRO_END
#define tinfl_get_adler32(r) (r)->m_check_adler32
// Main low-level decompressor coroutine function. This is the only function actually needed for decompression. All the other functions are just high-level helpers for improved usability.
@@ -709,10 +731,13 @@ typedef enum
tinfl_status tinfl_decompress(tinfl_decompressor* r, const mz_uint8* pIn_buf_next, size_t* pIn_buf_size, mz_uint8* pOut_buf_start, mz_uint8* pOut_buf_next, size_t* pOut_buf_size, const mz_uint32 decomp_flags);
// Internal/private bits follow.
enum
{
TINFL_MAX_HUFF_TABLES = 3, TINFL_MAX_HUFF_SYMBOLS_0 = 288, TINFL_MAX_HUFF_SYMBOLS_1 = 32, TINFL_MAX_HUFF_SYMBOLS_2 = 19,
TINFL_FAST_LOOKUP_BITS = 10, TINFL_FAST_LOOKUP_SIZE = 1 << TINFL_FAST_LOOKUP_BITS
enum {
TINFL_MAX_HUFF_TABLES = 3,
TINFL_MAX_HUFF_SYMBOLS_0 = 288,
TINFL_MAX_HUFF_SYMBOLS_1 = 32,
TINFL_MAX_HUFF_SYMBOLS_2 = 19,
TINFL_FAST_LOOKUP_BITS = 10,
TINFL_FAST_LOOKUP_SIZE = 1 << TINFL_FAST_LOOKUP_BITS
};
typedef struct
@@ -733,8 +758,7 @@ typedef struct
#define TINFL_BITBUF_SIZE (32)
#endif
struct tinfl_decompressor_tag
{
struct tinfl_decompressor_tag {
mz_uint32 m_state, m_num_bits, m_zhdr0, m_zhdr1, m_z_adler32, m_final, m_type, m_check_adler32, m_dist, m_counter, m_num_extra, m_table_sizes[TINFL_MAX_HUFF_TABLES];
tinfl_bit_buf_t m_bit_buf;
size_t m_dist_from_out_buf_start;
@@ -749,9 +773,10 @@ struct tinfl_decompressor_tag
// tdefl_init() compression flags logically OR'd together (low 12 bits contain the max. number of probes per dictionary search):
// TDEFL_DEFAULT_MAX_PROBES: The compressor defaults to 128 dictionary probes per dictionary search. 0=Huffman only, 1=Huffman+LZ (fastest/crap compression), 4095=Huffman+LZ (slowest/best compression).
enum
{
TDEFL_HUFFMAN_ONLY = 0, TDEFL_DEFAULT_MAX_PROBES = 128, TDEFL_MAX_PROBES_MASK = 0xFFF
enum {
TDEFL_HUFFMAN_ONLY = 0,
TDEFL_DEFAULT_MAX_PROBES = 128,
TDEFL_MAX_PROBES_MASK = 0xFFF
};
// TDEFL_WRITE_ZLIB_HEADER: If set, the compressor outputs a zlib header before the deflate data, and the Adler-32 of the source data at the end. Otherwise, you'll get raw deflate data.
@@ -762,8 +787,7 @@ enum
// TDEFL_FILTER_MATCHES: Discards matches <= 5 chars if enabled.
// TDEFL_FORCE_ALL_STATIC_BLOCKS: Disable usage of optimized Huffman tables.
// TDEFL_FORCE_ALL_RAW_BLOCKS: Only use raw (uncompressed) deflate blocks.
enum
{
enum {
TDEFL_WRITE_ZLIB_HEADER = 0x01000,
TDEFL_COMPUTE_ADLER32 = 0x02000,
TDEFL_GREEDY_PARSING_FLAG = 0x04000,
@@ -805,18 +829,36 @@ typedef mz_bool (*tdefl_put_buf_func_ptr)(const void* pBuf, int len, void *pUser
// tdefl_compress_mem_to_output() compresses a block to an output stream. The above helpers use this function internally.
mz_bool tdefl_compress_mem_to_output(const void* pBuf, size_t buf_len, tdefl_put_buf_func_ptr pPut_buf_func, void* pPut_buf_user, int flags);
enum { TDEFL_MAX_HUFF_TABLES = 3, TDEFL_MAX_HUFF_SYMBOLS_0 = 288, TDEFL_MAX_HUFF_SYMBOLS_1 = 32, TDEFL_MAX_HUFF_SYMBOLS_2 = 19, TDEFL_LZ_DICT_SIZE = 32768, TDEFL_LZ_DICT_SIZE_MASK = TDEFL_LZ_DICT_SIZE - 1, TDEFL_MIN_MATCH_LEN = 3, TDEFL_MAX_MATCH_LEN = 258 };
enum { TDEFL_MAX_HUFF_TABLES = 3,
TDEFL_MAX_HUFF_SYMBOLS_0 = 288,
TDEFL_MAX_HUFF_SYMBOLS_1 = 32,
TDEFL_MAX_HUFF_SYMBOLS_2 = 19,
TDEFL_LZ_DICT_SIZE = 32768,
TDEFL_LZ_DICT_SIZE_MASK = TDEFL_LZ_DICT_SIZE - 1,
TDEFL_MIN_MATCH_LEN = 3,
TDEFL_MAX_MATCH_LEN = 258 };
// TDEFL_OUT_BUF_SIZE MUST be large enough to hold a single entire compressed output block (using static/fixed Huffman codes).
#if TDEFL_LESS_MEMORY
enum { TDEFL_LZ_CODE_BUF_SIZE = 24 * 1024, TDEFL_OUT_BUF_SIZE = (TDEFL_LZ_CODE_BUF_SIZE * 13 ) / 10, TDEFL_MAX_HUFF_SYMBOLS = 288, TDEFL_LZ_HASH_BITS = 12, TDEFL_LEVEL1_HASH_SIZE_MASK = 4095, TDEFL_LZ_HASH_SHIFT = (TDEFL_LZ_HASH_BITS + 2) / 3, TDEFL_LZ_HASH_SIZE = 1 << TDEFL_LZ_HASH_BITS };
enum { TDEFL_LZ_CODE_BUF_SIZE = 24 * 1024,
TDEFL_OUT_BUF_SIZE = (TDEFL_LZ_CODE_BUF_SIZE * 13) / 10,
TDEFL_MAX_HUFF_SYMBOLS = 288,
TDEFL_LZ_HASH_BITS = 12,
TDEFL_LEVEL1_HASH_SIZE_MASK = 4095,
TDEFL_LZ_HASH_SHIFT = (TDEFL_LZ_HASH_BITS + 2) / 3,
TDEFL_LZ_HASH_SIZE = 1 << TDEFL_LZ_HASH_BITS };
#else
enum { TDEFL_LZ_CODE_BUF_SIZE = 64 * 1024, TDEFL_OUT_BUF_SIZE = (TDEFL_LZ_CODE_BUF_SIZE * 13 ) / 10, TDEFL_MAX_HUFF_SYMBOLS = 288, TDEFL_LZ_HASH_BITS = 15, TDEFL_LEVEL1_HASH_SIZE_MASK = 4095, TDEFL_LZ_HASH_SHIFT = (TDEFL_LZ_HASH_BITS + 2) / 3, TDEFL_LZ_HASH_SIZE = 1 << TDEFL_LZ_HASH_BITS };
enum { TDEFL_LZ_CODE_BUF_SIZE = 64 * 1024,
TDEFL_OUT_BUF_SIZE = (TDEFL_LZ_CODE_BUF_SIZE * 13) / 10,
TDEFL_MAX_HUFF_SYMBOLS = 288,
TDEFL_LZ_HASH_BITS = 15,
TDEFL_LEVEL1_HASH_SIZE_MASK = 4095,
TDEFL_LZ_HASH_SHIFT = (TDEFL_LZ_HASH_BITS + 2) / 3,
TDEFL_LZ_HASH_SIZE = 1 << TDEFL_LZ_HASH_BITS };
#endif
// The low-level tdefl functions below may be used directly if the above helper functions aren't flexible enough. The low-level functions don't make any heap allocations, unlike the above helper functions.
typedef enum
{
typedef enum {
TDEFL_STATUS_BAD_PARAM = -2,
TDEFL_STATUS_PUT_BUF_FAILED = -1,
TDEFL_STATUS_OKAY = 0,
@@ -824,8 +866,7 @@ typedef enum
} tdefl_status;
// Must map to MZ_NO_FLUSH, MZ_SYNC_FLUSH, etc. enums
typedef enum
{
typedef enum {
TDEFL_NO_FLUSH = 0,
TDEFL_SYNC_FLUSH = 2,
TDEFL_FULL_FLUSH = 3,
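Several of the miniz constants reformatted in this file are derived from one another, and the comment on the `tdefl_init()` flags says the low 12 bits carry the probe count. The sketch below works through that arithmetic at compile time. The `SKETCH_` names are illustrative copies of the values shown in the hunks (the default, non-`TDEFL_LESS_MEMORY` branch), not miniz identifiers, and `sketch_max_probes` is a hypothetical helper.

```cpp
// Copies of the values from the default (non-TDEFL_LESS_MEMORY) branch.
enum {
  SKETCH_LZ_CODE_BUF_SIZE = 64 * 1024,
  SKETCH_OUT_BUF_SIZE = (SKETCH_LZ_CODE_BUF_SIZE * 13) / 10,
  SKETCH_LZ_HASH_BITS = 15,
  SKETCH_LZ_HASH_SHIFT = (SKETCH_LZ_HASH_BITS + 2) / 3,
  SKETCH_LZ_HASH_SIZE = 1 << SKETCH_LZ_HASH_BITS
};

// 65536 * 13 / 10 truncates to 85196 bytes, i.e. roughly 1.3x the code buffer.
static_assert(SKETCH_OUT_BUF_SIZE == 85196, "output buffer size");
// (15 + 2) / 3 == 5 with integer division.
static_assert(SKETCH_LZ_HASH_SHIFT == 5, "hash shift");
static_assert(SKETCH_LZ_HASH_SIZE == 32768, "hash table entries");

// Copies of the flag values shown in the tdefl enums.
enum {
  SKETCH_DEFAULT_MAX_PROBES = 128,
  SKETCH_MAX_PROBES_MASK = 0xFFF,
  SKETCH_WRITE_ZLIB_HEADER = 0x01000
};

// Extracts the probe count from a combined flags word (low 12 bits).
inline int sketch_max_probes(int flags) {
  return flags & SKETCH_MAX_PROBES_MASK;
}
```

For example, `sketch_max_probes(SKETCH_WRITE_ZLIB_HEADER | SKETCH_DEFAULT_MAX_PROBES)` yields 128, because the header flag sits above the 12-bit probe field and is masked away.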
File diff suppressed because it is too large
+31 -33
@@ -12,26 +12,22 @@
#include "crn_texture_file_types.h"
#include "crn_image_utils.h"
namespace crnlib
{
namespace crnlib {
extern const vec2I g_vertical_cross_image_offsets[6];
enum orientation_flags_t
{
enum orientation_flags_t {
cOrientationFlagXFlipped = 1,
cOrientationFlagYFlipped = 2,
cDefaultOrientationFlags = 0
};
enum unpack_flags_t
{
enum unpack_flags_t {
cUnpackFlagUncook = 1,
cUnpackFlagUnflip = 2
};
class mip_level
{
class mip_level {
friend class mipmapped_texture;
public:
@@ -116,8 +112,7 @@ namespace crnlib
// And an array of one, six, or N faces make up a texture.
typedef crnlib::vector<mip_ptr_vec> face_vec;
class mipmapped_texture
{
class mipmapped_texture {
public:
// Construction/destruction
mipmapped_texture();
@@ -155,12 +150,22 @@ namespace crnlib
uint get_total_pixels_in_all_faces_and_mips() const;
inline uint get_num_faces() const { return m_faces.size(); }
inline uint get_num_levels() const { if (m_faces.empty()) return 0; else return m_faces[0].size(); }
inline uint get_num_levels() const {
if (m_faces.empty())
return 0;
else
return m_faces[0].size();
}
inline pixel_format_helpers::component_flags get_comp_flags() const { return m_comp_flags; }
inline pixel_format get_format() const { return m_format; }
inline bool is_unpacked() const { if (get_num_faces()) { return get_level(0, 0)->get_image() != NULL; } return false; }
inline bool is_unpacked() const {
if (get_num_faces()) {
return get_level(0, 0)->get_image() != NULL;
}
return false;
}
inline const mip_ptr_vec& get_face(uint face) const { return m_faces[face]; }
inline mip_ptr_vec& get_face(uint face) { return m_faces[face]; }
@@ -212,17 +217,15 @@ namespace crnlib
void discard_mips();
struct resample_params
{
resample_params() :
m_pFilter("kaiser"),
struct resample_params {
resample_params()
: m_pFilter("kaiser"),
m_wrapping(false),
m_srgb(false),
m_renormalize(false),
m_filter_scale(.9f),
m_gamma(1.75f), // or 2.2f
m_multithreaded(true)
{
m_multithreaded(true) {
}
const char* m_pFilter;
@@ -236,13 +239,11 @@ namespace crnlib
bool resize(uint new_width, uint new_height, const resample_params& params);
struct generate_mipmap_params : public resample_params
{
generate_mipmap_params() :
resample_params(),
struct generate_mipmap_params : public resample_params {
generate_mipmap_params()
: resample_params(),
m_min_mip_size(1),
m_max_mips(0)
{
m_max_mips(0) {
}
uint m_min_mip_size;
@@ -256,10 +257,9 @@ namespace crnlib
bool vertical_cross_to_cubemap();
// Low-level clustered DXT (QDXT) compression
struct qdxt_state
{
qdxt_state(task_pool& tp) : m_fmt(PIXEL_FMT_INVALID), m_qdxt1(tp), m_qdxt5a(tp), m_qdxt5b(tp)
{
struct qdxt_state {
qdxt_state(task_pool& tp)
: m_fmt(PIXEL_FMT_INVALID), m_qdxt1(tp), m_qdxt5a(tp), m_qdxt5b(tp) {
}
pixel_format m_fmt;
@@ -272,8 +272,7 @@ namespace crnlib
qdxt5_params m_qdxt5_params[2];
bool m_has_blocks[3];
void clear()
{
void clear() {
m_fmt = PIXEL_FMT_INVALID;
m_qdxt1.clear();
m_qdxt5a.clear();
@@ -322,7 +321,7 @@ namespace crnlib
inline void set_last_error(const char* p) const { m_last_error = p; }
void free_all_mips();
bool read_regular_image(data_stream_serializer &serializer, texture_file_types::format file_format);
bool read_regular_image(data_stream_serializer& serializer);
bool write_regular_image(const char* pFilename, uint32 image_write_flags);
bool read_dds_internal(data_stream_serializer& serializer);
void print_crn_comp_params(const crn_comp_params& p);
@@ -331,8 +330,7 @@ namespace crnlib
bool flip_y_helper();
};
inline void swap(mipmapped_texture& a, mipmapped_texture& b)
{
inline void swap(mipmapped_texture& a, mipmapped_texture& b) {
a.swap(b);
}
+23 -35
@@ -2,45 +2,35 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
namespace crnlib
{
namespace crnlib {
template <unsigned int N>
struct packed_uint
{
struct packed_uint {
inline packed_uint() {}
inline packed_uint(unsigned int val) { *this = val; }
inline packed_uint(const packed_uint& other) { *this = other; }
inline packed_uint& operator= (const packed_uint& rhs)
{
inline packed_uint& operator=(const packed_uint& rhs) {
if (this != &rhs)
memcpy(m_buf, rhs.m_buf, sizeof(m_buf));
return *this;
}
inline packed_uint& operator= (unsigned int val)
{
inline packed_uint& operator=(unsigned int val) {
#ifdef CRNLIB_BUILD_DEBUG
if (N == 1)
{
if (N == 1) {
CRNLIB_ASSERT(val <= 0xFFU);
}
else if (N == 2)
{
} else if (N == 2) {
CRNLIB_ASSERT(val <= 0xFFFFU);
}
else if (N == 3)
{
} else if (N == 3) {
CRNLIB_ASSERT(val <= 0xFFFFFFU);
}
#endif
val <<= (8U * (4U - N));
for (unsigned int i = 0; i < N; i++)
{
for (unsigned int i = 0; i < N; i++) {
m_buf[i] = static_cast<unsigned char>(val >> 24U);
val <<= 8U;
}
@@ -48,44 +38,42 @@ namespace crnlib
return *this;
}
inline operator unsigned int() const
{
switch (N)
{
case 1: return m_buf[0];
case 2: return (m_buf[0] << 8U) | m_buf[1];
case 3: return (m_buf[0] << 16U) | (m_buf[1] << 8U) | (m_buf[2]);
default: return (m_buf[0] << 24U) | (m_buf[1] << 16U) | (m_buf[2] << 8U) | (m_buf[3]);
inline operator unsigned int() const {
switch (N) {
case 1:
return m_buf[0];
case 2:
return (m_buf[0] << 8U) | m_buf[1];
case 3:
return (m_buf[0] << 16U) | (m_buf[1] << 8U) | (m_buf[2]);
default:
return (m_buf[0] << 24U) | (m_buf[1] << 16U) | (m_buf[2] << 8U) | (m_buf[3]);
}
}
unsigned char m_buf[N];
};
template <typename T>
class packed_value
{
class packed_value {
public:
packed_value() {}
packed_value(T val) { *this = val; }
inline operator T() const
{
inline operator T() const {
T result = 0;
for (int i = sizeof(T) - 1; i >= 0; i--)
result = static_cast<T>((result << 8) | m_bytes[i]);
return result;
}
packed_value& operator= (T val)
{
for (int i = 0; i < sizeof(T); i++)
{
packed_value& operator=(T val) {
for (int i = 0; i < sizeof(T); i++) {
m_bytes[i] = static_cast<uint8>(val);
val >>= 8;
}
return *this;
}
private:
uint8 m_bytes[sizeof(T)];
};
} // namespace crnlib
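The two types in this file use opposite byte orders: `packed_uint<N>` stores the most significant byte first, while `packed_value<T>` stores the least significant byte first. The following sketch reproduces both loops as free functions so the byte layouts can be compared directly. The `sketch_*` function names are hypothetical and are not part of crnlib. Plain arrays stand in for the `m_buf` and `m_bytes` members.

```cpp
#include <cstddef>
#include <cstdint>

// Big-endian packing, following the shift-and-extract loop of packed_uint<N>.
template <std::size_t N>
void sketch_pack_be(unsigned char (&buf)[N], std::uint32_t val) {
  static_assert(N >= 1 && N <= 4, "N must be between 1 and 4");
  val <<= (8U * (4U - static_cast<unsigned int>(N)));  // move the used bytes to the top
  for (std::size_t i = 0; i < N; i++) {
    buf[i] = static_cast<unsigned char>(val >> 24U);
    val <<= 8U;
  }
}

// Big-endian unpacking: the first byte is the most significant.
template <std::size_t N>
std::uint32_t sketch_unpack_be(const unsigned char (&buf)[N]) {
  static_assert(N >= 1 && N <= 4, "N must be between 1 and 4");
  std::uint32_t result = 0;
  for (std::size_t i = 0; i < N; i++)
    result = (result << 8U) | buf[i];
  return result;
}

// Little-endian packing, following packed_value<T>::operator=.
template <typename T>
void sketch_pack_le(unsigned char (&bytes)[sizeof(T)], T val) {
  for (std::size_t i = 0; i < sizeof(T); i++) {
    bytes[i] = static_cast<unsigned char>(val);
    val = static_cast<T>(val >> 8);
  }
}

// Little-endian unpacking, following packed_value<T>::operator T().
template <typename T>
T sketch_unpack_le(const unsigned char (&bytes)[sizeof(T)]) {
  T result = 0;
  for (std::size_t i = sizeof(T); i-- > 0;)
    result = static_cast<T>((result << 8) | bytes[i]);
  return result;
}
```

Packing `0x123456` into three bytes with `sketch_pack_be` gives `12 34 56`, whereas packing `0x11223344` with `sketch_pack_le<std::uint32_t>` gives `44 33 22 11`. The sketch uses `std::size_t` loop counters, which also avoids the signed/unsigned comparison in the original `packed_value` loop.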
+164 -111
@@ -4,10 +4,8 @@
#include "crn_pixel_format.h"
#include "crn_image.h"
namespace crnlib
{
namespace pixel_format_helpers
{
namespace crnlib {
namespace pixel_format_helpers {
const pixel_format g_all_pixel_formats[] =
{
PIXEL_FMT_DXT1,
@@ -24,167 +22,198 @@ namespace crnlib
PIXEL_FMT_DXT5_AGBR,
PIXEL_FMT_DXT1A,
PIXEL_FMT_ETC1,
PIXEL_FMT_ETC2,
PIXEL_FMT_ETC2A,
PIXEL_FMT_ETC1S,
PIXEL_FMT_ETC2AS,
PIXEL_FMT_R8G8B8,
PIXEL_FMT_L8,
PIXEL_FMT_A8,
PIXEL_FMT_A8L8,
PIXEL_FMT_A8R8G8B8
};
PIXEL_FMT_A8R8G8B8};
uint get_num_formats()
{
uint get_num_formats() {
return sizeof(g_all_pixel_formats) / sizeof(g_all_pixel_formats[0]);
}
pixel_format get_pixel_format_by_index(uint index)
{
pixel_format get_pixel_format_by_index(uint index) {
CRNLIB_ASSERT(index < get_num_formats());
return g_all_pixel_formats[index];
}
const char* get_pixel_format_string(pixel_format fmt)
{
switch (fmt)
{
case PIXEL_FMT_INVALID: return "INVALID";
case PIXEL_FMT_DXT1: return "DXT1";
case PIXEL_FMT_DXT1A: return "DXT1A";
case PIXEL_FMT_DXT2: return "DXT2";
case PIXEL_FMT_DXT3: return "DXT3";
case PIXEL_FMT_DXT4: return "DXT4";
case PIXEL_FMT_DXT5: return "DXT5";
case PIXEL_FMT_3DC: return "3DC";
case PIXEL_FMT_DXN: return "DXN";
case PIXEL_FMT_DXT5A: return "DXT5A";
case PIXEL_FMT_DXT5_CCxY: return "DXT5_CCxY";
case PIXEL_FMT_DXT5_xGxR: return "DXT5_xGxR";
case PIXEL_FMT_DXT5_xGBR: return "DXT5_xGBR";
case PIXEL_FMT_DXT5_AGBR: return "DXT5_AGBR";
case PIXEL_FMT_ETC1: return "ETC1";
case PIXEL_FMT_R8G8B8: return "R8G8B8";
case PIXEL_FMT_A8R8G8B8: return "A8R8G8B8";
case PIXEL_FMT_A8: return "A8";
case PIXEL_FMT_L8: return "L8";
case PIXEL_FMT_A8L8: return "A8L8";
default: break;
const char* get_pixel_format_string(pixel_format fmt) {
switch (fmt) {
case PIXEL_FMT_INVALID:
return "INVALID";
case PIXEL_FMT_DXT1:
return "DXT1";
case PIXEL_FMT_DXT1A:
return "DXT1A";
case PIXEL_FMT_DXT2:
return "DXT2";
case PIXEL_FMT_DXT3:
return "DXT3";
case PIXEL_FMT_DXT4:
return "DXT4";
case PIXEL_FMT_DXT5:
return "DXT5";
case PIXEL_FMT_3DC:
return "3DC";
case PIXEL_FMT_DXN:
return "DXN";
case PIXEL_FMT_DXT5A:
return "DXT5A";
case PIXEL_FMT_DXT5_CCxY:
return "DXT5_CCxY";
case PIXEL_FMT_DXT5_xGxR:
return "DXT5_xGxR";
case PIXEL_FMT_DXT5_xGBR:
return "DXT5_xGBR";
case PIXEL_FMT_DXT5_AGBR:
return "DXT5_AGBR";
case PIXEL_FMT_ETC1:
return "ETC1";
case PIXEL_FMT_ETC2:
return "ETC2";
case PIXEL_FMT_ETC2A:
return "ETC2A";
case PIXEL_FMT_ETC1S:
return "ETC1S";
case PIXEL_FMT_ETC2AS:
return "ETC2AS";
case PIXEL_FMT_R8G8B8:
return "R8G8B8";
case PIXEL_FMT_A8R8G8B8:
return "A8R8G8B8";
case PIXEL_FMT_A8:
return "A8";
case PIXEL_FMT_L8:
return "L8";
case PIXEL_FMT_A8L8:
return "A8L8";
default:
break;
}
CRNLIB_ASSERT(false);
return "?";
}
const char* get_crn_format_string(crn_format fmt)
{
switch (fmt)
{
case cCRNFmtDXT1: return "DXT1";
case cCRNFmtDXT3: return "DXT3";
case cCRNFmtDXT5: return "DXT5";
case cCRNFmtDXT5_CCxY: return "DXT5_CCxY";
case cCRNFmtDXT5_xGBR: return "DXT5_xGBR";
case cCRNFmtDXT5_AGBR: return "DXT5_AGBR";
case cCRNFmtDXT5_xGxR: return "DXT5_xGxR";
case cCRNFmtDXN_XY: return "DXN_XY";
case cCRNFmtDXN_YX: return "DXN_YX";
case cCRNFmtDXT5A: return "DXT5A";
case cCRNFmtETC1: return "ETC1";
default: break;
const char* get_crn_format_string(crn_format fmt) {
switch (fmt) {
case cCRNFmtDXT1:
return "DXT1";
case cCRNFmtDXT3:
return "DXT3";
case cCRNFmtDXT5:
return "DXT5";
case cCRNFmtDXT5_CCxY:
return "DXT5_CCxY";
case cCRNFmtDXT5_xGBR:
return "DXT5_xGBR";
case cCRNFmtDXT5_AGBR:
return "DXT5_AGBR";
case cCRNFmtDXT5_xGxR:
return "DXT5_xGxR";
case cCRNFmtDXN_XY:
return "DXN_XY";
case cCRNFmtDXN_YX:
return "DXN_YX";
case cCRNFmtDXT5A:
return "DXT5A";
case cCRNFmtETC1:
return "ETC1";
case cCRNFmtETC2:
return "ETC2";
case cCRNFmtETC2A:
return "ETC2A";
case cCRNFmtETC1S:
return "ETC1S";
case cCRNFmtETC2AS:
return "ETC2AS";
default:
break;
}
CRNLIB_ASSERT(false);
return "?";
}
component_flags get_component_flags(pixel_format fmt)
{
component_flags get_component_flags(pixel_format fmt) {
// These flags are for *uncooked* pixels, i.e. after adding Z to DXN maps, or converting YCC maps to RGB, etc.
uint flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagAValid | cCompFlagGrayscale;
switch (fmt)
{
switch (fmt) {
case PIXEL_FMT_DXT1:
case PIXEL_FMT_ETC1:
{
case PIXEL_FMT_ETC2:
case PIXEL_FMT_ETC1S: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid;
break;
}
case PIXEL_FMT_DXT1A:
{
case PIXEL_FMT_DXT1A: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagAValid;
break;
}
case PIXEL_FMT_DXT2:
case PIXEL_FMT_DXT3:
{
case PIXEL_FMT_DXT3: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagAValid;
break;
}
case PIXEL_FMT_DXT4:
case PIXEL_FMT_DXT5:
{
case PIXEL_FMT_ETC2A:
case PIXEL_FMT_ETC2AS: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagAValid;
break;
}
case PIXEL_FMT_DXT5A:
{
case PIXEL_FMT_DXT5A: {
flags = cCompFlagAValid;
break;
}
case PIXEL_FMT_DXT5_CCxY:
{
case PIXEL_FMT_DXT5_CCxY: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagLumaChroma;
break;
}
case PIXEL_FMT_DXT5_xGBR:
{
case PIXEL_FMT_DXT5_xGBR: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagNormalMap;
break;
}
case PIXEL_FMT_DXT5_AGBR:
{
case PIXEL_FMT_DXT5_AGBR: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagAValid | cCompFlagNormalMap;
break;
}
case PIXEL_FMT_DXT5_xGxR:
{
case PIXEL_FMT_DXT5_xGxR: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagNormalMap;
break;
}
case PIXEL_FMT_3DC:
{
case PIXEL_FMT_3DC: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagNormalMap;
break;
}
case PIXEL_FMT_DXN:
{
case PIXEL_FMT_DXN: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagNormalMap;
break;
}
case PIXEL_FMT_R8G8B8:
{
case PIXEL_FMT_R8G8B8: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid;
break;
}
case PIXEL_FMT_A8R8G8B8:
{
case PIXEL_FMT_A8R8G8B8: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagAValid;
break;
}
case PIXEL_FMT_A8:
{
case PIXEL_FMT_A8: {
flags = cCompFlagAValid;
break;
}
case PIXEL_FMT_L8:
{
case PIXEL_FMT_L8: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagGrayscale;
break;
}
case PIXEL_FMT_A8L8:
{
case PIXEL_FMT_A8L8: {
flags = cCompFlagRValid | cCompFlagGValid | cCompFlagBValid | cCompFlagAValid | cCompFlagGrayscale;
break;
}
default:
{
default: {
CRNLIB_ASSERT(0);
break;
}
@@ -192,11 +221,9 @@ namespace crnlib
return static_cast<component_flags>(flags);
}
crn_format convert_pixel_format_to_best_crn_format(pixel_format crn_fmt)
{
crn_format convert_pixel_format_to_best_crn_format(pixel_format crn_fmt) {
crn_format fmt = cCRNFmtDXT1;
switch (crn_fmt)
{
switch (crn_fmt) {
case PIXEL_FMT_DXT1:
case PIXEL_FMT_DXT1A:
fmt = cCRNFmtDXT1;
@@ -240,8 +267,19 @@ namespace crnlib
case PIXEL_FMT_ETC1:
fmt = cCRNFmtETC1;
break;
default:
{
case PIXEL_FMT_ETC2:
fmt = cCRNFmtETC2;
break;
case PIXEL_FMT_ETC2A:
fmt = cCRNFmtETC2A;
break;
case PIXEL_FMT_ETC1S:
fmt = cCRNFmtETC1S;
break;
case PIXEL_FMT_ETC2AS:
fmt = cCRNFmtETC2AS;
break;
default: {
CRNLIB_ASSERT(false);
break;
}
@@ -249,23 +287,39 @@ namespace crnlib
return fmt;
}
pixel_format convert_crn_format_to_pixel_format(crn_format fmt)
{
switch (fmt)
{
case cCRNFmtDXT1: return PIXEL_FMT_DXT1;
case cCRNFmtDXT3: return PIXEL_FMT_DXT3;
case cCRNFmtDXT5: return PIXEL_FMT_DXT5;
case cCRNFmtDXT5_CCxY: return PIXEL_FMT_DXT5_CCxY;
case cCRNFmtDXT5_xGxR: return PIXEL_FMT_DXT5_xGxR;
case cCRNFmtDXT5_xGBR: return PIXEL_FMT_DXT5_xGBR;
case cCRNFmtDXT5_AGBR: return PIXEL_FMT_DXT5_AGBR;
case cCRNFmtDXN_XY: return PIXEL_FMT_DXN;
case cCRNFmtDXN_YX: return PIXEL_FMT_3DC;
case cCRNFmtDXT5A: return PIXEL_FMT_DXT5A;
case cCRNFmtETC1: return PIXEL_FMT_ETC1;
default:
{
pixel_format convert_crn_format_to_pixel_format(crn_format fmt) {
switch (fmt) {
case cCRNFmtDXT1:
return PIXEL_FMT_DXT1;
case cCRNFmtDXT3:
return PIXEL_FMT_DXT3;
case cCRNFmtDXT5:
return PIXEL_FMT_DXT5;
case cCRNFmtDXT5_CCxY:
return PIXEL_FMT_DXT5_CCxY;
case cCRNFmtDXT5_xGxR:
return PIXEL_FMT_DXT5_xGxR;
case cCRNFmtDXT5_xGBR:
return PIXEL_FMT_DXT5_xGBR;
case cCRNFmtDXT5_AGBR:
return PIXEL_FMT_DXT5_AGBR;
case cCRNFmtDXN_XY:
return PIXEL_FMT_DXN;
case cCRNFmtDXN_YX:
return PIXEL_FMT_3DC;
case cCRNFmtDXT5A:
return PIXEL_FMT_DXT5A;
case cCRNFmtETC1:
return PIXEL_FMT_ETC1;
case cCRNFmtETC2:
return PIXEL_FMT_ETC2;
case cCRNFmtETC2A:
return PIXEL_FMT_ETC2A;
case cCRNFmtETC1S:
return PIXEL_FMT_ETC1S;
case cCRNFmtETC2AS:
return PIXEL_FMT_ETC2AS;
default: {
CRNLIB_ASSERT(false);
break;
}
@@ -277,4 +331,3 @@ namespace crnlib
} // namespace pixel_format
} // namespace crnlib
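One consequence of the two conversion functions in this file is that the new ETC1S/ETC2AS formats map one-to-one between `pixel_format` and `crn_format`, whereas `PIXEL_FMT_DXT1A` collapses onto `cCRNFmtDXT1` and therefore does not survive a round trip. The sketch below illustrates this with a cut-down pair of enums. All `Sketch*`/`SKETCH_*`/`sketch_*` names are hypothetical stand-ins for the crnlib types, and the fallback return values in the `default` paths are assumptions, since the original code asserts there.

```cpp
// Cut-down stand-ins for pixel_format and crn_format (hypothetical names).
enum SketchPixelFormat {
  SKETCH_PF_INVALID,
  SKETCH_PF_DXT1,
  SKETCH_PF_DXT1A,
  SKETCH_PF_ETC1S,
  SKETCH_PF_ETC2AS
};

enum SketchCrnFormat {
  SKETCH_CRN_DXT1,
  SKETCH_CRN_ETC1S,
  SKETCH_CRN_ETC2AS
};

// Mirrors the shape of convert_pixel_format_to_best_crn_format for a few formats.
inline SketchCrnFormat sketch_to_best_crn(SketchPixelFormat fmt) {
  switch (fmt) {
    case SKETCH_PF_DXT1:
    case SKETCH_PF_DXT1A:
      return SKETCH_CRN_DXT1;  // both DXT1 variants share one CRN format
    case SKETCH_PF_ETC1S:
      return SKETCH_CRN_ETC1S;
    case SKETCH_PF_ETC2AS:
      return SKETCH_CRN_ETC2AS;
    default:
      break;
  }
  return SKETCH_CRN_DXT1;  // assumed fallback; the original asserts here
}

// Mirrors the shape of convert_crn_format_to_pixel_format for the same formats.
inline SketchPixelFormat sketch_to_pixel(SketchCrnFormat fmt) {
  switch (fmt) {
    case SKETCH_CRN_DXT1:
      return SKETCH_PF_DXT1;
    case SKETCH_CRN_ETC1S:
      return SKETCH_PF_ETC1S;
    case SKETCH_CRN_ETC2AS:
      return SKETCH_PF_ETC2AS;
    default:
      break;
  }
  return SKETCH_PF_INVALID;  // assumed fallback; the original asserts here
}
```

With these definitions, converting `SKETCH_PF_ETC1S` to a CRN format and back returns `SKETCH_PF_ETC1S`, while `SKETCH_PF_DXT1A` comes back as `SKETCH_PF_DXT1`.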
+184 -116
@@ -5,10 +5,8 @@
#include "../inc/crnlib.h"
#include "../inc/dds_defs.h"
namespace crnlib
{
namespace pixel_format_helpers
{
namespace crnlib {
namespace pixel_format_helpers {
uint get_num_formats();
pixel_format get_pixel_format_by_index(uint index);
@@ -16,29 +14,25 @@ namespace crnlib
const char* get_crn_format_string(crn_format fmt);
inline bool is_grayscale(pixel_format fmt)
{
switch (fmt)
{
inline bool is_grayscale(pixel_format fmt) {
switch (fmt) {
case PIXEL_FMT_L8:
case PIXEL_FMT_A8L8:
return true;
default: break;
default:
break;
}
return false;
}
inline bool is_dxt1(pixel_format fmt)
{
inline bool is_dxt1(pixel_format fmt) {
return (fmt == PIXEL_FMT_DXT1) || (fmt == PIXEL_FMT_DXT1A);
}
// has_alpha() should probably be called "has_opacity()" - it indicates if the format encodes opacity
// because some swizzled DXT5 formats do not encode opacity.
inline bool has_alpha(pixel_format fmt)
{
switch (fmt)
{
inline bool has_alpha(pixel_format fmt) {
switch (fmt) {
case PIXEL_FMT_DXT1A:
case PIXEL_FMT_DXT2:
case PIXEL_FMT_DXT3:
@@ -49,43 +43,42 @@ namespace crnlib
case PIXEL_FMT_A8:
case PIXEL_FMT_A8L8:
case PIXEL_FMT_DXT5_AGBR:
case PIXEL_FMT_ETC2A:
case PIXEL_FMT_ETC2AS:
return true;
default: break;
default:
break;
}
return false;
}
inline bool is_alpha_only(pixel_format fmt)
{
switch (fmt)
{
inline bool is_alpha_only(pixel_format fmt) {
switch (fmt) {
case PIXEL_FMT_A8:
case PIXEL_FMT_DXT5A:
return true;
default: break;
default:
break;
}
return false;
}
inline bool is_normal_map(pixel_format fmt)
{
switch (fmt)
{
inline bool is_normal_map(pixel_format fmt) {
switch (fmt) {
case PIXEL_FMT_3DC:
case PIXEL_FMT_DXN:
case PIXEL_FMT_DXT5_xGBR:
case PIXEL_FMT_DXT5_xGxR:
case PIXEL_FMT_DXT5_AGBR:
return true;
default: break;
default:
break;
}
return false;
}
inline int is_dxt(pixel_format fmt)
{
switch (fmt)
{
inline int is_dxt(pixel_format fmt) {
switch (fmt) {
case PIXEL_FMT_DXT1:
case PIXEL_FMT_DXT1A:
case PIXEL_FMT_DXT2:
@@ -100,16 +93,19 @@ namespace crnlib
case PIXEL_FMT_DXT5_xGBR:
case PIXEL_FMT_DXT5_AGBR:
case PIXEL_FMT_ETC1:
case PIXEL_FMT_ETC2:
case PIXEL_FMT_ETC2A:
case PIXEL_FMT_ETC1S:
case PIXEL_FMT_ETC2AS:
return true;
default: break;
default:
break;
}
return false;
}
inline int get_fundamental_format(pixel_format fmt)
{
switch (fmt)
{
inline int get_fundamental_format(pixel_format fmt) {
switch (fmt) {
case PIXEL_FMT_DXT1A:
return PIXEL_FMT_DXT1;
case PIXEL_FMT_DXT5_CCxY:
@@ -117,38 +113,58 @@ namespace crnlib
case PIXEL_FMT_DXT5_xGBR:
case PIXEL_FMT_DXT5_AGBR:
return PIXEL_FMT_DXT5;
default: break;
default:
break;
}
return fmt;
}
inline dxt_format get_dxt_format(pixel_format fmt)
{
switch (fmt)
{
case PIXEL_FMT_DXT1: return cDXT1;
case PIXEL_FMT_DXT1A: return cDXT1A;
case PIXEL_FMT_DXT2: return cDXT3;
case PIXEL_FMT_DXT3: return cDXT3;
case PIXEL_FMT_DXT4: return cDXT5;
case PIXEL_FMT_DXT5: return cDXT5;
case PIXEL_FMT_3DC: return cDXN_YX;
case PIXEL_FMT_DXT5A: return cDXT5A;
case PIXEL_FMT_DXN: return cDXN_XY;
case PIXEL_FMT_DXT5_CCxY: return cDXT5;
case PIXEL_FMT_DXT5_xGxR: return cDXT5;
case PIXEL_FMT_DXT5_xGBR: return cDXT5;
case PIXEL_FMT_DXT5_AGBR: return cDXT5;
case PIXEL_FMT_ETC1: return cETC1;
default: break;
inline dxt_format get_dxt_format(pixel_format fmt) {
switch (fmt) {
case PIXEL_FMT_DXT1:
return cDXT1;
case PIXEL_FMT_DXT1A:
return cDXT1A;
case PIXEL_FMT_DXT2:
return cDXT3;
case PIXEL_FMT_DXT3:
return cDXT3;
case PIXEL_FMT_DXT4:
return cDXT5;
case PIXEL_FMT_DXT5:
return cDXT5;
case PIXEL_FMT_3DC:
return cDXN_YX;
case PIXEL_FMT_DXT5A:
return cDXT5A;
case PIXEL_FMT_DXN:
return cDXN_XY;
case PIXEL_FMT_DXT5_CCxY:
return cDXT5;
case PIXEL_FMT_DXT5_xGxR:
return cDXT5;
case PIXEL_FMT_DXT5_xGBR:
return cDXT5;
case PIXEL_FMT_DXT5_AGBR:
return cDXT5;
case PIXEL_FMT_ETC1:
return cETC1;
case PIXEL_FMT_ETC2:
return cETC2;
case PIXEL_FMT_ETC2A:
return cETC2A;
case PIXEL_FMT_ETC1S:
return cETC1S;
case PIXEL_FMT_ETC2AS:
return cETC2AS;
default:
break;
}
return cDXTInvalid;
}
inline pixel_format from_dxt_format(dxt_format dxt_fmt)
{
switch (dxt_fmt)
{
inline pixel_format from_dxt_format(dxt_format dxt_fmt) {
switch (dxt_fmt) {
case cDXT1:
return PIXEL_FMT_DXT1;
case cDXT1A:
@@ -165,16 +181,23 @@ namespace crnlib
return PIXEL_FMT_DXT5A;
case cETC1:
return PIXEL_FMT_ETC1;
default: break;
case cETC2:
return PIXEL_FMT_ETC2;
case cETC2A:
return PIXEL_FMT_ETC2A;
case cETC1S:
return PIXEL_FMT_ETC1S;
case cETC2AS:
return PIXEL_FMT_ETC2AS;
default:
break;
}
CRNLIB_ASSERT(false);
return PIXEL_FMT_INVALID;
}
inline bool is_pixel_format_non_srgb(pixel_format fmt)
{
switch (fmt)
{
inline bool is_pixel_format_non_srgb(pixel_format fmt) {
switch (fmt) {
case PIXEL_FMT_3DC:
case PIXEL_FMT_DXN:
case PIXEL_FMT_DXT5A:
@@ -183,15 +206,14 @@ namespace crnlib
case PIXEL_FMT_DXT5_xGBR:
case PIXEL_FMT_DXT5_AGBR:
return true;
default: break;
default:
break;
}
return false;
}
inline bool is_crn_format_non_srgb(crn_format fmt)
{
switch (fmt)
{
inline bool is_crn_format_non_srgb(crn_format fmt) {
switch (fmt) {
case cCRNFmtDXN_XY:
case cCRNFmtDXN_YX:
case cCRNFmtDXT5A:
@@ -200,66 +222,113 @@ namespace crnlib
case cCRNFmtDXT5_xGBR:
case cCRNFmtDXT5_AGBR:
return true;
default: break;
default:
break;
}
return false;
}
inline uint get_bpp(pixel_format fmt)
{
switch (fmt)
{
case PIXEL_FMT_DXT1: return 4;
case PIXEL_FMT_DXT1A: return 4;
case PIXEL_FMT_ETC1: return 4;
case PIXEL_FMT_DXT2: return 8;
case PIXEL_FMT_DXT3: return 8;
case PIXEL_FMT_DXT4: return 8;
case PIXEL_FMT_DXT5: return 8;
case PIXEL_FMT_3DC: return 8;
case PIXEL_FMT_DXT5A: return 4;
case PIXEL_FMT_R8G8B8: return 24;
case PIXEL_FMT_A8R8G8B8: return 32;
case PIXEL_FMT_A8: return 8;
case PIXEL_FMT_L8: return 8;
case PIXEL_FMT_A8L8: return 16;
case PIXEL_FMT_DXN: return 8;
case PIXEL_FMT_DXT5_CCxY: return 8;
case PIXEL_FMT_DXT5_xGxR: return 8;
case PIXEL_FMT_DXT5_xGBR: return 8;
case PIXEL_FMT_DXT5_AGBR: return 8;
default: break;
inline uint get_bpp(pixel_format fmt) {
switch (fmt) {
case PIXEL_FMT_DXT1:
return 4;
case PIXEL_FMT_DXT1A:
return 4;
case PIXEL_FMT_ETC1:
return 4;
case PIXEL_FMT_ETC2:
return 4;
case PIXEL_FMT_ETC2A:
return 8;
case PIXEL_FMT_ETC1S:
return 4;
case PIXEL_FMT_ETC2AS:
return 8;
case PIXEL_FMT_DXT2:
return 8;
case PIXEL_FMT_DXT3:
return 8;
case PIXEL_FMT_DXT4:
return 8;
case PIXEL_FMT_DXT5:
return 8;
case PIXEL_FMT_3DC:
return 8;
case PIXEL_FMT_DXT5A:
return 4;
case PIXEL_FMT_R8G8B8:
return 24;
case PIXEL_FMT_A8R8G8B8:
return 32;
case PIXEL_FMT_A8:
return 8;
case PIXEL_FMT_L8:
return 8;
case PIXEL_FMT_A8L8:
return 16;
case PIXEL_FMT_DXN:
return 8;
case PIXEL_FMT_DXT5_CCxY:
return 8;
case PIXEL_FMT_DXT5_xGxR:
return 8;
case PIXEL_FMT_DXT5_xGBR:
return 8;
case PIXEL_FMT_DXT5_AGBR:
return 8;
default:
break;
}
CRNLIB_ASSERT(false);
return 0;
};
inline uint get_dxt_bytes_per_block(pixel_format fmt)
{
switch (fmt)
{
case PIXEL_FMT_DXT1: return 8;
case PIXEL_FMT_DXT1A: return 8;
case PIXEL_FMT_DXT5A: return 8;
case PIXEL_FMT_ETC1: return 8;
case PIXEL_FMT_DXT2: return 16;
case PIXEL_FMT_DXT3: return 16;
case PIXEL_FMT_DXT4: return 16;
case PIXEL_FMT_DXT5: return 16;
case PIXEL_FMT_3DC: return 16;
case PIXEL_FMT_DXN: return 16;
case PIXEL_FMT_DXT5_CCxY: return 16;
case PIXEL_FMT_DXT5_xGxR: return 16;
case PIXEL_FMT_DXT5_xGBR: return 16;
case PIXEL_FMT_DXT5_AGBR: return 16;
default: break;
inline uint get_dxt_bytes_per_block(pixel_format fmt) {
switch (fmt) {
case PIXEL_FMT_DXT1:
return 8;
case PIXEL_FMT_DXT1A:
return 8;
case PIXEL_FMT_DXT5A:
return 8;
case PIXEL_FMT_ETC1:
return 8;
case PIXEL_FMT_ETC2:
return 8;
case PIXEL_FMT_ETC2A:
return 16;
case PIXEL_FMT_ETC1S:
return 8;
case PIXEL_FMT_ETC2AS:
return 16;
case PIXEL_FMT_DXT2:
return 16;
case PIXEL_FMT_DXT3:
return 16;
case PIXEL_FMT_DXT4:
return 16;
case PIXEL_FMT_DXT5:
return 16;
case PIXEL_FMT_3DC:
return 16;
case PIXEL_FMT_DXN:
return 16;
case PIXEL_FMT_DXT5_CCxY:
return 16;
case PIXEL_FMT_DXT5_xGxR:
return 16;
case PIXEL_FMT_DXT5_xGBR:
return 16;
case PIXEL_FMT_DXT5_AGBR:
return 16;
default:
break;
}
CRNLIB_ASSERT(false);
return 0;
}
enum component_flags
{
enum component_flags {
cCompFlagRValid = 1,
cCompFlagGValid = 2,
cCompFlagBValid = 4,
@@ -281,4 +350,3 @@ namespace crnlib
} // namespace pixel_format_helpers
} // namespace crnlib
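The byte counts returned by `get_dxt_bytes_per_block` above (8 for ETC1, ETC2 and ETC1S, 16 for ETC2A and ETC2AS) determine how much storage a mip level needs. A minimal sketch, assuming 4x4 blocks and that partial blocks at the right and bottom edges are padded to full blocks; `compressed_level_size` is a hypothetical helper for illustration and is not part of crnlib:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of crnlib): bytes needed to store one mip
// level of a 4x4 block-compressed format.
// bytes_per_block is the value get_dxt_bytes_per_block() reports for the
// format, e.g. 8 for PIXEL_FMT_ETC1S and 16 for PIXEL_FMT_ETC2AS.
inline std::size_t compressed_level_size(std::size_t width, std::size_t height,
                                         std::size_t bytes_per_block) {
  assert(width > 0 && height > 0);
  const std::size_t blocks_x = (width + 3) / 4;   // round up to whole blocks
  const std::size_t blocks_y = (height + 3) / 4;  // round up to whole blocks
  return blocks_x * blocks_y * bytes_per_block;
}
```

The same numbers agree with the `get_bpp` table: 8 bytes per 16-pixel block is 4 bits per pixel, and 16 bytes per block is 8 bits per pixel.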
+11 -22
@@ -6,8 +6,7 @@
#include "crn_winhdr.h"
#endif
#ifndef _MSC_VER
int sprintf_s(char *buffer, size_t sizeOfBuffer, const char *format, ...)
{
int sprintf_s(char* buffer, size_t sizeOfBuffer, const char* format, ...) {
if (!sizeOfBuffer)
return 0;
@@ -24,8 +23,7 @@ int sprintf_s(char *buffer, size_t sizeOfBuffer, const char *format, ...)
return CRNLIB_MIN(c, (int)sizeOfBuffer - 1);
}
int vsprintf_s(char *buffer, size_t sizeOfBuffer, const char *format, va_list args)
{
int vsprintf_s(char* buffer, size_t sizeOfBuffer, const char* format, va_list args) {
if (!sizeOfBuffer)
return 0;
@@ -39,22 +37,18 @@ int vsprintf_s(char *buffer, size_t sizeOfBuffer, const char *format, va_list ar
return CRNLIB_MIN(c, (int)sizeOfBuffer - 1);
}
char* strlwr(char* p)
{
char* strlwr(char* p) {
char* q = p;
while (*q)
{
while (*q) {
char c = *q;
*q++ = tolower(c);
}
return p;
}
char* strupr(char *p)
{
char* strupr(char* p) {
char* q = p;
while (*q)
{
while (*q) {
char c = *q;
*q++ = toupper(c);
}
@@ -62,31 +56,26 @@ char* strupr(char *p)
}
#endif // _MSC_VER
void crnlib_debug_break(void)
{
void crnlib_debug_break(void) {
CRNLIB_BREAKPOINT
}
#if CRNLIB_USE_WIN32_API
#include "crn_winhdr.h"
bool crnlib_is_debugger_present(void)
{
bool crnlib_is_debugger_present(void) {
return IsDebuggerPresent() != 0;
}
void crnlib_output_debug_string(const char* p)
{
void crnlib_output_debug_string(const char* p) {
OutputDebugStringA(p);
}
#else
bool crnlib_is_debugger_present(void)
{
bool crnlib_is_debugger_present(void) {
return false;
}
void crnlib_output_debug_string(const char* p)
{
void crnlib_output_debug_string(const char* p) {
puts(p);
}
#endif // CRNLIB_USE_WIN32_API
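The `strlwr`/`strupr` fallbacks above pass a plain `char` to `tolower`/`toupper`. The C standard requires that argument to be representable as `unsigned char` (or equal to `EOF`), so on platforms where `char` is signed, bytes above 0x7F are not safe to pass directly. A hedged sketch of a variant that casts first; `lower_in_place` is a hypothetical name and not a crnlib function:

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical variant of the strlwr fallback (not part of crnlib).
// Casting to unsigned char before calling std::tolower avoids undefined
// behaviour for negative char values.
inline char* lower_in_place(char* p) {
  assert(p != nullptr);
  for (char* q = p; *q; ++q)
    *q = static_cast<char>(std::tolower(static_cast<unsigned char>(*q)));
  return p;
}
```

The function lowercases the buffer in place and returns the same pointer, mirroring the shape of the fallback shown in the diff.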
+9 -8
@@ -65,11 +65,14 @@ const bool c_crnlib_big_endian_platform = !c_crnlib_little_endian_platform;
#define _strnicmp strncasecmp
#endif
inline bool crnlib_is_little_endian() { return c_crnlib_little_endian_platform; }
inline bool crnlib_is_big_endian() { return c_crnlib_big_endian_platform; }
inline bool crnlib_is_little_endian() {
return c_crnlib_little_endian_platform;
}
inline bool crnlib_is_big_endian() {
return c_crnlib_big_endian_platform;
}
inline bool crnlib_is_pc()
{
inline bool crnlib_is_pc() {
#ifdef CRNLIB_PLATFORM_PC
return true;
#else
@@ -77,8 +80,7 @@ inline bool crnlib_is_pc()
#endif
}
inline bool crnlib_is_x86()
{
inline bool crnlib_is_x86() {
#ifdef CRNLIB_PLATFORM_PC_X86
return true;
#else
@@ -86,8 +88,7 @@ inline bool crnlib_is_x86()
#endif
}
inline bool crnlib_is_x64()
{
inline bool crnlib_is_x64() {
#ifdef CRNLIB_PLATFORM_PC_X64
return true;
#else
+40 -83
@@ -8,13 +8,10 @@
//#define TEST_DECODER_TABLES
#endif
namespace crnlib
{
namespace crnlib {
namespace prefix_coding
{
bool limit_max_code_size(uint num_syms, uint8* pCodesizes, uint max_code_size)
{
namespace prefix_coding {
bool limit_max_code_size(uint num_syms, uint8* pCodesizes, uint max_code_size) {
const uint cMaxEverCodeSize = 34;
if ((!num_syms) || (num_syms > cMaxSupportedSyms) || (max_code_size < 1) || (max_code_size > cMaxEverCodeSize))
@@ -25,11 +22,9 @@ namespace crnlib
bool should_limit = false;
for (uint i = 0; i < num_syms; i++)
{
for (uint i = 0; i < num_syms; i++) {
uint c = pCodesizes[i];
if (c)
{
if (c) {
CRNLIB_ASSERT(c <= cMaxEverCodeSize);
num_codes[c]++;
@@ -43,8 +38,7 @@ namespace crnlib
uint ofs = 0;
uint next_sorted_ofs[cMaxEverCodeSize + 1];
for (uint i = 1; i <= cMaxEverCodeSize; i++)
{
for (uint i = 1; i <= cMaxEverCodeSize; i++) {
next_sorted_ofs[i] = ofs;
ofs += num_codes[i];
}
@@ -67,13 +61,11 @@ namespace crnlib
if (total == (1U << max_code_size))
return true;
do
{
do {
num_codes[max_code_size]--;
uint i;
for (i = max_code_size - 1; i; --i)
{
for (i = max_code_size - 1; i; --i) {
if (!num_codes[i])
continue;
num_codes[i]--;
@@ -88,21 +80,17 @@ namespace crnlib
uint8 new_codesizes[cMaxSupportedSyms];
uint8* p = new_codesizes;
for (uint i = 1; i <= max_code_size; i++)
{
for (uint i = 1; i <= max_code_size; i++) {
uint n = num_codes[i];
if (n)
{
if (n) {
memset(p, i, n);
p += n;
}
}
for (uint i = 0; i < num_syms; i++)
{
for (uint i = 0; i < num_syms; i++) {
const uint c = pCodesizes[i];
if (c)
{
if (c) {
uint ofs = next_sorted_ofs[c];
next_sorted_ofs[c] = ofs + 1;
@@ -113,16 +101,13 @@ namespace crnlib
return true;
}
bool generate_codes(uint num_syms, const uint8* pCodesizes, uint16* pCodes)
{
bool generate_codes(uint num_syms, const uint8* pCodesizes, uint16* pCodes) {
uint num_codes[cMaxExpectedCodeSize + 1];
utils::zero_object(num_codes);
for (uint i = 0; i < num_syms; i++)
{
for (uint i = 0; i < num_syms; i++) {
uint c = pCodesizes[i];
if (c)
{
if (c) {
CRNLIB_ASSERT(c <= cMaxExpectedCodeSize);
num_codes[c]++;
}
@@ -133,29 +118,24 @@ namespace crnlib
uint next_code[cMaxExpectedCodeSize + 1];
next_code[0] = 0;
for (uint i = 1; i <= cMaxExpectedCodeSize; i++)
{
for (uint i = 1; i <= cMaxExpectedCodeSize; i++) {
next_code[i] = code;
code = (code + num_codes[i]) << 1;
}
if (code != (1 << (cMaxExpectedCodeSize + 1)))
{
if (code != (1 << (cMaxExpectedCodeSize + 1))) {
uint t = 0;
for (uint i = 1; i <= cMaxExpectedCodeSize; i++)
{
for (uint i = 1; i <= cMaxExpectedCodeSize; i++) {
t += num_codes[i];
if (t > 1)
return false;
}
}
for (uint i = 0; i < num_syms; i++)
{
for (uint i = 0; i < num_syms; i++) {
uint c = pCodesizes[i];
if (c)
{
if (c) {
CRNLIB_ASSERT(next_code[c] <= cUINT16_MAX);
pCodes[i] = static_cast<uint16>(next_code[c]++);
@@ -166,8 +146,7 @@ namespace crnlib
return true;
}
bool generate_decoder_tables(uint num_syms, const uint8* pCodesizes, decoder_tables* pTables, uint table_bits)
{
bool generate_decoder_tables(uint num_syms, const uint8* pCodesizes, decoder_tables* pTables, uint table_bits) {
uint min_codes[cMaxExpectedCodeSize];
if ((!num_syms) || (table_bits > cMaxTableBits))
@@ -178,8 +157,7 @@ namespace crnlib
uint num_codes[cMaxExpectedCodeSize + 1];
utils::zero_object(num_codes);
for (uint i = 0; i < num_syms; i++)
{
for (uint i = 0; i < num_syms; i++) {
uint c = pCodesizes[i];
if (c)
num_codes[c]++;
@@ -192,14 +170,12 @@ namespace crnlib
uint total_used_syms = 0;
uint max_code_size = 0;
uint min_code_size = UINT_MAX;
for (uint i = 1; i <= cMaxExpectedCodeSize; i++)
{
for (uint i = 1; i <= cMaxExpectedCodeSize; i++) {
const uint n = num_codes[i];
if (!n)
pTables->m_max_codes[i - 1] = 0; //UINT_MAX;
else
{
else {
min_code_size = math::minimum(min_code_size, i);
max_code_size = math::maximum(max_code_size, i);
@@ -221,15 +197,13 @@ namespace crnlib
pTables->m_total_used_syms = total_used_syms;
if (total_used_syms > pTables->m_cur_sorted_symbol_order_size)
{
if (total_used_syms > pTables->m_cur_sorted_symbol_order_size) {
pTables->m_cur_sorted_symbol_order_size = total_used_syms;
if (!math::is_power_of_2(total_used_syms))
pTables->m_cur_sorted_symbol_order_size = math::minimum<uint>(num_syms, math::next_pow2(total_used_syms));
if (pTables->m_sorted_symbol_order)
{
if (pTables->m_sorted_symbol_order) {
crnlib_delete_array(pTables->m_sorted_symbol_order);
pTables->m_sorted_symbol_order = NULL;
}
@@ -240,11 +214,9 @@ namespace crnlib
pTables->m_min_code_size = static_cast<uint8>(min_code_size);
pTables->m_max_code_size = static_cast<uint8>(max_code_size);
for (uint i = 0; i < num_syms; i++)
{
for (uint i = 0; i < num_syms; i++) {
uint c = pCodesizes[i];
if (c)
{
if (c) {
CRNLIB_ASSERT(num_codes[c]);
uint sorted_pos = sorted_positions[c]++;
@@ -259,15 +231,12 @@ namespace crnlib
table_bits = 0;
pTables->m_table_bits = table_bits;
if (table_bits)
{
if (table_bits) {
uint table_size = 1 << table_bits;
if (table_size > pTables->m_cur_lookup_size)
{
if (table_size > pTables->m_cur_lookup_size) {
pTables->m_cur_lookup_size = table_size;
if (pTables->m_lookup)
{
if (pTables->m_lookup) {
crnlib_delete_array(pTables->m_lookup);
pTables->m_lookup = NULL;
}
@@ -277,8 +246,7 @@ namespace crnlib
memset(pTables->m_lookup, 0xFF, static_cast<uint>(sizeof(pTables->m_lookup[0])) * (1UL << table_bits));
for (uint codesize = 1; codesize <= table_bits; codesize++)
{
for (uint codesize = 1; codesize <= table_bits; codesize++) {
if (!num_codes[codesize])
continue;
@@ -289,13 +257,11 @@ namespace crnlib
const uint max_code = pTables->get_unshifted_max_code(codesize);
const uint val_ptr = pTables->m_val_ptrs[codesize - 1];
for (uint code = min_code; code <= max_code; code++)
{
for (uint code = min_code; code <= max_code; code++) {
const uint sym_index = pTables->m_sorted_symbol_order[val_ptr + code - min_code];
CRNLIB_ASSERT(pCodesizes[sym_index] == codesize);
for (uint j = 0; j < fillnum; j++)
{
for (uint j = 0; j < fillnum; j++) {
const uint t = j + (code << fillsize);
CRNLIB_ASSERT(t < (1U << table_bits));
@@ -314,24 +280,18 @@ namespace crnlib
pTables->m_table_max_code = 0;
pTables->m_decode_start_code_size = pTables->m_min_code_size;
if (table_bits)
{
if (table_bits) {
uint i;
for (i = table_bits; i >= 1; i--)
{
if (num_codes[i])
{
for (i = table_bits; i >= 1; i--) {
if (num_codes[i]) {
pTables->m_table_max_code = pTables->m_max_codes[i - 1];
break;
}
}
if (i >= 1)
{
if (i >= 1) {
pTables->m_decode_start_code_size = table_bits + 1;
for (uint i = table_bits + 1; i <= max_code_size; i++)
{
if (num_codes[i])
{
for (uint i = table_bits + 1; i <= max_code_size; i++) {
if (num_codes[i]) {
pTables->m_decode_start_code_size = i;
break;
}
@@ -350,7 +310,4 @@ namespace crnlib
} // namespace prefix_coding
} // namespace crnlib
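The core of `generate_codes` above is canonical code assignment: count how many symbols use each code length, derive the first code of each length with `code = (code + num_codes[i]) << 1`, then hand out consecutive codes per length. A stand-alone sketch of that step, assuming standard-library containers in place of crnlib types; `assign_canonical_codes` is a hypothetical name, and the completeness check performed by the original function is deliberately left out:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-alone sketch (not part of crnlib).
// code_sizes[i] is the code length in bits of symbol i, 0 meaning "unused".
// Returns one code per symbol; unused symbols get code 0.
inline std::vector<std::uint16_t> assign_canonical_codes(
    const std::vector<std::uint8_t>& code_sizes, unsigned max_code_size = 16) {
  // Histogram of code lengths.
  std::vector<unsigned> num_codes(max_code_size + 1, 0);
  for (std::uint8_t c : code_sizes) {
    assert(c <= max_code_size);
    if (c)
      num_codes[c]++;
  }

  // First code of each length, same recurrence as in the diff above.
  std::vector<unsigned> next_code(max_code_size + 1, 0);
  unsigned code = 0;
  for (unsigned i = 1; i <= max_code_size; i++) {
    next_code[i] = code;
    code = (code + num_codes[i]) << 1;
  }

  // Symbols of equal length receive consecutive codes in symbol order.
  std::vector<std::uint16_t> codes(code_sizes.size(), 0);
  for (std::size_t i = 0; i < code_sizes.size(); i++) {
    const std::uint8_t c = code_sizes[i];
    if (c)
      codes[i] = static_cast<std::uint16_t>(next_code[c]++);
  }
  return codes;
}
```

For code sizes `{2, 1, 3, 3}` this yields the codes `10`, `0`, `110` and `111` in binary, which form a prefix-free set.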
+15 -28
@@ -2,10 +2,8 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
namespace crnlib
{
namespace prefix_coding
{
namespace crnlib {
namespace prefix_coding {
const uint cMaxExpectedCodeSize = 16;
const uint cMaxSupportedSyms = 8192;
const uint cMaxTableBits = 11;
@@ -14,22 +12,18 @@ namespace crnlib
bool generate_codes(uint num_syms, const uint8* pCodesizes, uint16* pCodes);
class decoder_tables
{
class decoder_tables {
public:
inline decoder_tables() :
m_table_shift(0), m_table_max_code(0), m_decode_start_code_size(0), m_cur_lookup_size(0), m_lookup(NULL), m_cur_sorted_symbol_order_size(0), m_sorted_symbol_order(NULL)
{
inline decoder_tables()
: m_table_shift(0), m_table_max_code(0), m_decode_start_code_size(0), m_cur_lookup_size(0), m_lookup(NULL), m_cur_sorted_symbol_order_size(0), m_sorted_symbol_order(NULL) {
}
inline decoder_tables(const decoder_tables& other) :
m_table_shift(0), m_table_max_code(0), m_decode_start_code_size(0), m_cur_lookup_size(0), m_lookup(NULL), m_cur_sorted_symbol_order_size(0), m_sorted_symbol_order(NULL)
{
inline decoder_tables(const decoder_tables& other)
: m_table_shift(0), m_table_max_code(0), m_decode_start_code_size(0), m_cur_lookup_size(0), m_lookup(NULL), m_cur_sorted_symbol_order_size(0), m_sorted_symbol_order(NULL) {
*this = other;
}
decoder_tables& operator= (const decoder_tables& other)
{
decoder_tables& operator=(const decoder_tables& other) {
if (this == &other)
return *this;
@@ -37,14 +31,12 @@ namespace crnlib
memcpy(this, &other, sizeof(*this));
if (other.m_lookup)
{
if (other.m_lookup) {
m_lookup = crnlib_new_array<uint32>(m_cur_lookup_size);
memcpy(m_lookup, other.m_lookup, sizeof(m_lookup[0]) * m_cur_lookup_size);
}
if (other.m_sorted_symbol_order)
{
if (other.m_sorted_symbol_order) {
m_sorted_symbol_order = crnlib_new_array<uint16>(m_cur_sorted_symbol_order_size);
memcpy(m_sorted_symbol_order, other.m_sorted_symbol_order, sizeof(m_sorted_symbol_order[0]) * m_cur_sorted_symbol_order_size);
}
@@ -52,25 +44,21 @@ namespace crnlib
return *this;
}
inline void clear()
{
if (m_lookup)
{
inline void clear() {
if (m_lookup) {
crnlib_delete_array(m_lookup);
m_lookup = 0;
m_cur_lookup_size = 0;
}
if (m_sorted_symbol_order)
{
if (m_sorted_symbol_order) {
crnlib_delete_array(m_sorted_symbol_order);
m_sorted_symbol_order = NULL;
m_cur_sorted_symbol_order_size = 0;
}
}
inline ~decoder_tables()
{
inline ~decoder_tables() {
if (m_lookup)
crnlib_delete_array(m_lookup);
@@ -99,8 +87,7 @@ namespace crnlib
uint m_cur_sorted_symbol_order_size;
uint16* m_sorted_symbol_order;
inline uint get_unshifted_max_code(uint len) const
{
inline uint get_unshifted_max_code(uint len) const {
CRNLIB_ASSERT((len >= 1) && (len <= cMaxExpectedCodeSize));
uint k = m_max_codes[len - 1];
if (!k)
+104 -174
@@ -9,10 +9,9 @@
#define GENERATE_DEBUG_IMAGES 0
namespace crnlib
{
qdxt1::qdxt1(task_pool& task_pool) :
m_pTask_pool(&task_pool),
namespace crnlib {
qdxt1::qdxt1(task_pool& task_pool)
: m_pTask_pool(&task_pool),
m_main_thread_id(0),
m_canceled(false),
m_progress_start(0),
@@ -23,16 +22,13 @@ namespace crnlib
m_elements_per_block(0),
m_max_selector_clusters(0),
m_prev_percentage_complete(-1),
m_selector_clusterizer(task_pool)
{
m_selector_clusterizer(task_pool) {
}
qdxt1::~qdxt1()
{
qdxt1::~qdxt1() {
}
void qdxt1::clear()
{
void qdxt1::clear() {
m_main_thread_id = 0;
m_num_blocks = 0;
m_pBlocks = 0;
@@ -55,8 +51,7 @@ namespace crnlib
m_prev_percentage_complete = -1;
}
bool qdxt1::init(uint n, const dxt_pixel_block* pBlocks, const qdxt1_params& params)
{
bool qdxt1::init(uint n, const dxt_pixel_block* pBlocks, const qdxt1_params& params) {
clear();
CRNLIB_ASSERT(n && pBlocks);
@@ -75,8 +70,7 @@ namespace crnlib
const bool debugging = false;
image_u8 debug_img;
if ((m_params.m_hierarchical) && (m_params.m_num_mips))
{
if ((m_params.m_hierarchical) && (m_params.m_num_mips)) {
vec6F_clusterizer::training_vec_array& training_vecs = m_endpoint_clusterizer.get_training_vecs();
training_vecs.resize(m_num_blocks);
@@ -86,8 +80,7 @@ namespace crnlib
uint total_processed_blocks = 0;
uint next_progress_threshold = 512;
for (uint level = 0; level < m_params.m_num_mips; level++)
{
for (uint level = 0; level < m_params.m_num_mips; level++) {
const qdxt1_params::mip_desc& level_desc = m_params.m_mip_desc[level];
const uint num_chunks_x = (level_desc.m_block_width + cChunkBlockWidth - 1) / cChunkBlockWidth;
@@ -100,24 +93,19 @@ namespace crnlib
debug_img.resize(num_chunks_x * cChunkPixelWidth, num_chunks_y * cChunkPixelHeight);
float adaptive_tile_color_psnr_derating = 1.5f; // was 2.4f
if ((level) && (adaptive_tile_color_psnr_derating > .25f))
{
if ((level) && (adaptive_tile_color_psnr_derating > .25f)) {
adaptive_tile_color_psnr_derating = math::maximum(.25f, adaptive_tile_color_psnr_derating / powf(3.1f, static_cast<float>(level))); // was 3.0f
}
for (uint chunk_y = 0; chunk_y < num_chunks_y; chunk_y++)
{
for (uint chunk_x = 0; chunk_x < num_chunks_x; chunk_x++)
{
for (uint chunk_y = 0; chunk_y < num_chunks_y; chunk_y++) {
for (uint chunk_x = 0; chunk_x < num_chunks_x; chunk_x++) {
color_quad_u8 chunk_pixels[cChunkPixelWidth * cChunkPixelHeight];
for (uint y = 0; y < cChunkPixelHeight; y++)
{
for (uint y = 0; y < cChunkPixelHeight; y++) {
const uint pix_y = math::minimum<uint>(chunk_y * cChunkPixelHeight + y, level_height - 1);
const uint outer_block_index = level_desc.m_first_block + ((pix_y >> 2) * level_desc.m_block_width);
for (uint x = 0; x < cChunkPixelWidth; x++)
{
for (uint x = 0; x < cChunkPixelWidth; x++) {
const uint pix_x = math::minimum<uint>(chunk_x * cChunkPixelWidth + x, level_width - 1);
const uint block_index = outer_block_index + (pix_x >> 2);
@@ -130,8 +118,7 @@ namespace crnlib
}
}
struct layout_results
{
struct layout_results {
uint m_low_color;
uint m_high_color;
uint8 m_selectors[cChunkPixelWidth * cChunkPixelHeight];
@@ -140,8 +127,7 @@ namespace crnlib
};
layout_results layouts[cNumChunkTileLayouts];
for (uint l = 0; l < cNumChunkTileLayouts; l++)
{
for (uint l = 0; l < cNumChunkTileLayouts; l++) {
const uint width = g_chunk_tile_layouts[l].m_width;
const uint height = g_chunk_tile_layouts[l].m_height;
const uint x_ofs = g_chunk_tile_layouts[l].m_x_ofs;
@@ -185,8 +171,7 @@ namespace crnlib
double best_peak_snr = -1.0f;
uint best_encoding = 0;
for (uint e = 0; e < cNumChunkEncodings; e++)
{
for (uint e = 0; e < cNumChunkEncodings; e++) {
const chunk_encoding_desc& encoding_desc = g_chunk_encodings[e];
double total_error = 0;
@@ -211,8 +196,7 @@ namespace crnlib
//for (uint t = 0; t < encoding_desc.m_num_tiles; t++)
// peak_snr -= (double)layouts[encoding_desc.m_tiles[t].m_layout_index].m_penalty;
if (peak_snr > best_peak_snr)
{
if (peak_snr > best_peak_snr) {
best_peak_snr = peak_snr;
best_encoding = e;
}
@@ -222,8 +206,7 @@ namespace crnlib
const chunk_encoding_desc& encoding_desc = g_chunk_encodings[best_encoding];
for (uint t = 0; t < encoding_desc.m_num_tiles; t++)
{
for (uint t = 0; t < encoding_desc.m_num_tiles; t++) {
const chunk_tile_desc& tile_desc = encoding_desc.m_tiles[t];
uint layout_index = tile_desc.m_layout_index;
@@ -234,12 +217,10 @@ namespace crnlib
color_quad_u8 tile_pixels[cChunkPixelWidth * cChunkPixelHeight];
for (uint y = 0; y < tile_desc.m_height; y++)
{
for (uint y = 0; y < tile_desc.m_height; y++) {
const uint pix_y = y + tile_desc.m_y_ofs;
for (uint x = 0; x < tile_desc.m_width; x++)
{
for (uint x = 0; x < tile_desc.m_width; x++) {
const uint pix_x = x + tile_desc.m_x_ofs;
tile_pixels[x + y * tile_desc.m_width] = chunk_pixels[pix_x + pix_y * cChunkPixelWidth];
@@ -261,17 +242,19 @@ namespace crnlib
vec6F ev;
ev[0] = l[0]; ev[1] = l[1]; ev[2] = l[2];
ev[3] = h[0]; ev[4] = h[1]; ev[5] = h[2];
ev[0] = l[0];
ev[1] = l[1];
ev[2] = l[2];
ev[3] = h[0];
ev[4] = h[1];
ev[5] = h[2];
for (uint y = 0; y < (tile_desc.m_height >> 2); y++)
{
for (uint y = 0; y < (tile_desc.m_height >> 2); y++) {
uint block_y = chunk_y * cChunkBlockHeight + y + (tile_desc.m_y_ofs >> 2);
if (block_y >= level_desc.m_block_height)
continue;
for (uint x = 0; x < (tile_desc.m_width >> 2); x++)
{
for (uint x = 0; x < (tile_desc.m_width >> 2); x++) {
uint block_x = chunk_x * cChunkBlockWidth + x + (tile_desc.m_x_ofs >> 2);
if (block_x >= level_desc.m_block_width)
break;
@@ -293,8 +276,7 @@ namespace crnlib
} // y
} //t
if (total_processed_blocks >= next_progress_threshold)
{
if (total_processed_blocks >= next_progress_threshold) {
next_progress_threshold += 512;
if (!update_progress(total_processed_blocks, m_num_blocks - 1))
@@ -317,13 +299,9 @@ namespace crnlib
trace("%u ", encoding_hist[i]);
trace("\n");
#endif
}
else
{
for (uint block_index = 0; block_index < m_num_blocks; block_index++)
{
if ((block_index & 511) == 0)
{
} else {
for (uint block_index = 0; block_index < m_num_blocks; block_index++) {
if ((block_index & 511) == 0) {
if (!update_progress(block_index, m_num_blocks - 1))
return false;
}
@@ -340,8 +318,12 @@ namespace crnlib
vec6F ev;
ev[0] = l[0]; ev[1] = l[1]; ev[2] = l[2];
ev[3] = h[0]; ev[4] = h[1]; ev[5] = h[2];
ev[0] = l[0];
ev[1] = l[1];
ev[2] = l[2];
ev[3] = h[0];
ev[4] = h[1];
ev[5] = h[2];
m_endpoint_clusterizer.add_training_vec(ev, weight);
}
@@ -360,10 +342,8 @@ namespace crnlib
m_progress_start = 95;
m_progress_range = 5;
for (uint block_index = 0; block_index < m_num_blocks; block_index++)
{
if ((block_index & 511) == 0)
{
for (uint block_index = 0; block_index < m_num_blocks; block_index++) {
if ((block_index & 511) == 0) {
if (!update_progress(block_index, m_num_blocks - 1))
return false;
}
@@ -386,8 +366,7 @@ namespace crnlib
return true;
}
bool qdxt1::update_progress(uint value, uint max_value)
{
bool qdxt1::update_progress(uint value, uint max_value) {
if (!m_params.m_pProgress_func)
return true;
@@ -396,8 +375,7 @@ namespace crnlib
return true;
m_prev_percentage_complete = percentage;
if (!m_params.m_pProgress_func(m_params.m_progress_start + (percentage * m_params.m_progress_range) / 100U, m_params.m_pProgress_data))
{
if (!m_params.m_pProgress_func(m_params.m_progress_start + (percentage * m_params.m_progress_range) / 100U, m_params.m_pProgress_data)) {
m_canceled = true;
return false;
}
@@ -405,9 +383,7 @@ namespace crnlib
return true;
}
void qdxt1::pack_endpoints_task(uint64 data, void* pData_ptr)
{
pData_ptr;
void qdxt1::pack_endpoints_task(uint64 data, void*) {
const uint thread_index = static_cast<uint>(data);
crnlib::vector<color_quad_u8> cluster_pixels;
@@ -433,22 +409,18 @@ namespace crnlib
cluster_id cid;
const crnlib::vector<uint32>& indices = cid.m_cells;
for (uint cluster_index = 0; cluster_index < m_endpoint_cluster_indices.size(); cluster_index++)
{
for (uint cluster_index = 0; cluster_index < m_endpoint_cluster_indices.size(); cluster_index++) {
if (m_canceled)
return;
if ((cluster_index & cluster_index_progress_mask) == 0)
{
if (crn_get_current_thread_id() == m_main_thread_id)
{
if ((cluster_index & cluster_index_progress_mask) == 0) {
if (crn_get_current_thread_id() == m_main_thread_id) {
if (!update_progress(cluster_index, m_endpoint_cluster_indices.size() - 1))
return;
}
}
if (m_pTask_pool->get_num_threads())
{
if (m_pTask_pool->get_num_threads()) {
if ((cluster_index % (m_pTask_pool->get_num_threads() + 1)) != thread_index)
continue;
}
@@ -466,8 +438,7 @@ namespace crnlib
scoped_spinlock lock(m_cluster_hash_lock);
cluster_hash::const_iterator it(m_cluster_hash.find(cid));
if (it != m_cluster_hash.end())
{
if (it != m_cluster_hash.end()) {
CRNLIB_ASSERT(cid == it->first);
found = true;
@@ -475,8 +446,7 @@ namespace crnlib
}
}
if (found)
{
if (found) {
const uint16 low_color = static_cast<uint16>(found_endpoints);
const uint16 high_color = static_cast<uint16>((found_endpoints >> 16U));
@@ -485,22 +455,19 @@ namespace crnlib
const bool is_alpha_block = (low_color <= high_color);
for (uint block_iter = 0; block_iter < indices.size(); block_iter++)
{
for (uint block_iter = 0; block_iter < indices.size(); block_iter++) {
const uint block_index = indices[block_iter];
const color_quad_u8* pSrc_pixels = &m_pBlocks[block_index].m_pixels[0][0];
for (uint i = 0; i < cDXTBlockSize * cDXTBlockSize; i++)
{
for (uint i = 0; i < cDXTBlockSize * cDXTBlockSize; i++) {
dxt1_block& dxt_block = get_block(block_index);
dxt_block.set_low_color(static_cast<uint16>(low_color));
dxt_block.set_high_color(static_cast<uint16>(high_color));
uint mask = 0;
for (int i = 15; i >= 0; i--)
{
for (int i = 15; i >= 0; i--) {
mask <<= 2;
const color_quad_u8& c = pSrc_pixels[i];
@@ -511,16 +478,21 @@ namespace crnlib
uint selector = 0, best_dist = dist0;
if (dist1 < best_dist) { selector = 1; best_dist = dist1; }
if (dist2 < best_dist) { selector = 2; best_dist = dist2; }
if (!is_alpha_block)
{
uint dist3 = color::color_distance(m_params.m_perceptual, c, block_colors[3], false);
if (dist3 < best_dist) { selector = 3; }
if (dist1 < best_dist) {
selector = 1;
best_dist = dist1;
}
else
{
if (dist2 < best_dist) {
selector = 2;
best_dist = dist2;
}
if (!is_alpha_block) {
uint dist3 = color::color_distance(m_params.m_perceptual, c, block_colors[3], false);
if (dist3 < best_dist) {
selector = 3;
}
} else {
if (c.a < m_params.m_dxt1a_alpha_threshold)
selector = 3;
}
@@ -534,24 +506,20 @@ namespace crnlib
dxt_block.m_selectors[3] = static_cast<uint8>((mask >> 24) & 0xFF);
}
}
}
else
{
} else {
cluster_pixels.resize(indices.size() * cDXTBlockSize * cDXTBlockSize);
color_quad_u8* pDst = &cluster_pixels[0];
bool has_alpha_pixels = false;
for (uint block_iter = 0; block_iter < indices.size(); block_iter++)
{
for (uint block_iter = 0; block_iter < indices.size(); block_iter++) {
const uint block_index = indices[block_iter];
//const color_quad_u8* pSrc_pixels = &m_pBlocks[block_index].m_pixels[0][0];
const color_quad_u8* pSrc_pixels = (const color_quad_u8*)m_pBlocks[block_index].m_pixels;
for (uint i = 0; i < cDXTBlockSize * cDXTBlockSize; i++)
{
for (uint i = 0; i < cDXTBlockSize * cDXTBlockSize; i++) {
const color_quad_u8& src = pSrc_pixels[i];
if (src.a < m_params.m_dxt1a_alpha_threshold)
@@ -568,23 +536,19 @@ namespace crnlib
r.m_pSelectors = selectors.begin();
uint low_color, high_color;
if ((m_params.m_dxt_quality != cCRNDXTQualitySuperFast) || (has_alpha_pixels))
{
if ((m_params.m_dxt_quality != cCRNDXTQualitySuperFast) || (has_alpha_pixels)) {
p.m_pixels_have_alpha = has_alpha_pixels;
optimizer.compute(p, r);
low_color = r.m_low_color;
high_color = r.m_high_color;
}
else
{
} else {
dxt_fast::compress_color_block(cluster_pixels.size(), cluster_pixels.begin(), low_color, high_color, selectors.begin(), true);
}
const uint8* pSrc_selectors = selectors.begin();
for (uint block_iter = 0; block_iter < indices.size(); block_iter++)
{
for (uint block_iter = 0; block_iter < indices.size(); block_iter++) {
const uint block_index = indices[block_iter];
dxt1_block& dxt_block = get_block(block_index);
@@ -593,8 +557,7 @@ namespace crnlib
dxt_block.set_high_color(static_cast<uint16>(high_color));
uint mask = 0;
for (int i = 15; i >= 0; i--)
{
for (int i = 15; i >= 0; i--) {
mask <<= 2;
mask |= pSrc_selectors[i];
}
@@ -604,7 +567,6 @@ namespace crnlib
dxt_block.m_selectors[1] = static_cast<uint8>((mask >> 8) & 0xFF);
dxt_block.m_selectors[2] = static_cast<uint8>((mask >> 16) & 0xFF);
dxt_block.m_selectors[3] = static_cast<uint8>((mask >> 24) & 0xFF);
}
{
@@ -613,25 +575,21 @@ namespace crnlib
m_cluster_hash.insert(cid, low_color | (high_color << 16));
}
}
}
}
struct optimize_selectors_params
{
struct optimize_selectors_params {
CRNLIB_NO_COPY_OR_ASSIGNMENT_OP(optimize_selectors_params);
optimize_selectors_params(
crnlib::vector< crnlib::vector<uint> >& selector_cluster_indices) :
m_selector_cluster_indices(selector_cluster_indices)
{
crnlib::vector<crnlib::vector<uint> >& selector_cluster_indices)
: m_selector_cluster_indices(selector_cluster_indices) {
}
crnlib::vector<crnlib::vector<uint> >& m_selector_cluster_indices;
};
void qdxt1::optimize_selectors_task(uint64 data, void* pData_ptr)
{
void qdxt1::optimize_selectors_task(uint64 data, void* pData_ptr) {
const uint thread_index = static_cast<uint>(data);
optimize_selectors_params& task_params = *static_cast<optimize_selectors_params*>(pData_ptr);
@@ -640,22 +598,18 @@ namespace crnlib
block_categories[0].reserve(2048);
block_categories[1].reserve(2048);
for (uint cluster_index = 0; cluster_index < task_params.m_selector_cluster_indices.size(); cluster_index++)
{
for (uint cluster_index = 0; cluster_index < task_params.m_selector_cluster_indices.size(); cluster_index++) {
if (m_canceled)
return;
if ((cluster_index & 255) == 0)
{
if (crn_get_current_thread_id() == m_main_thread_id)
{
if ((cluster_index & 255) == 0) {
if (crn_get_current_thread_id() == m_main_thread_id) {
if (!update_progress(cluster_index, task_params.m_selector_cluster_indices.size() - 1))
return;
}
}
if (m_pTask_pool->get_num_threads())
{
if (m_pTask_pool->get_num_threads()) {
if ((cluster_index % (m_pTask_pool->get_num_threads() + 1)) != thread_index)
continue;
}
@@ -668,27 +622,22 @@ namespace crnlib
block_categories[0].resize(0);
block_categories[1].resize(0);
for (uint block_iter = 0; block_iter < selector_indices.size(); block_iter++)
{
for (uint block_iter = 0; block_iter < selector_indices.size(); block_iter++) {
const uint block_index = selector_indices[block_iter];
const dxt1_block& src_block = get_block(block_index);
if (!src_block.is_alpha_block())
block_categories[0].push_back(block_index);
else
{
else {
bool has_alpha_pixels = false;
if (m_params.m_dxt1a_alpha_threshold > 0)
{
if (m_params.m_dxt1a_alpha_threshold > 0) {
const color_quad_u8* pSrc_pixels = (const color_quad_u8*)m_pBlocks[block_index].m_pixels;
for (uint i = 0; i < cDXTBlockSize * cDXTBlockSize; i++)
{
for (uint i = 0; i < cDXTBlockSize * cDXTBlockSize; i++) {
const color_quad_u8& src = pSrc_pixels[i];
if (src.a < m_params.m_dxt1a_alpha_threshold)
{
if (src.a < m_params.m_dxt1a_alpha_threshold) {
has_alpha_pixels = true;
break;
}
@@ -705,16 +654,13 @@ namespace crnlib
dxt1_block blk;
utils::zero_object(blk);
for (uint block_type = 0; block_type <= 1; block_type++)
{
for (uint block_type = 0; block_type <= 1; block_type++) {
const crnlib::vector<uint>& block_indices = block_categories[block_type];
if (block_indices.size() <= 1)
continue;
for (uint y = 0; y < 4; y++)
{
for (uint x = 0; x < 4; x++)
{
for (uint y = 0; y < 4; y++) {
for (uint x = 0; x < 4; x++) {
uint best_s = 0;
uint64 best_error = 0xFFFFFFFFFFULL;
@@ -722,12 +668,10 @@ namespace crnlib
if (block_type == 1)
max_s = 3;
for (uint s = 0; s < max_s; s++)
{
for (uint s = 0; s < max_s; s++) {
uint64 total_error = 0;
for (uint block_iter = 0; block_iter < block_indices.size(); block_iter++)
{
for (uint block_iter = 0; block_iter < block_indices.size(); block_iter++) {
const uint block_index = block_indices[block_iter];
const color_quad_u8& orig_color = m_pBlocks[block_index].m_pixels[y][x];
@@ -742,8 +686,7 @@ namespace crnlib
total_error += error;
}
if (total_error < best_error)
{
if (total_error < best_error) {
best_error = total_error;
best_s = s;
}
@@ -754,8 +697,7 @@ namespace crnlib
} // x
} // y
for (uint block_iter = 0; block_iter < block_indices.size(); block_iter++)
{
for (uint block_iter = 0; block_iter < block_indices.size(); block_iter++) {
const uint block_index = block_indices[block_iter];
dxt1_block& dst_block = get_block(block_index);
@@ -767,20 +709,17 @@ namespace crnlib
} // cluster_index
}
bool qdxt1::generate_codebook_progress_callback(uint percentage_completed, void* pData)
{
bool qdxt1::generate_codebook_progress_callback(uint percentage_completed, void* pData) {
return static_cast<qdxt1*>(pData)->update_progress(percentage_completed, 100U);
}
bool qdxt1::create_selector_clusters(uint max_selector_clusters, crnlib::vector< crnlib::vector<uint> >& selector_cluster_indices)
{
bool qdxt1::create_selector_clusters(uint max_selector_clusters, crnlib::vector<crnlib::vector<uint> >& selector_cluster_indices) {
m_progress_start = m_progress_range;
m_progress_range = 33;
weighted_selector_vec_array selector_vecs(m_num_blocks);
for (uint block_iter = 0; block_iter < m_num_blocks; block_iter++)
{
for (uint block_iter = 0; block_iter < m_num_blocks; block_iter++) {
dxt1_block& dxt1_block = get_block(block_iter);
vec16F sv;
@@ -806,8 +745,7 @@ namespace crnlib
selector_vecs, max_selector_clusters, selector_cluster_indices, generate_codebook_progress_callback, this);
}
bool qdxt1::pack(dxt1_block* pDst_elements, uint elements_per_block, const qdxt1_params& params, float quality_power_mul)
{
bool qdxt1::pack(dxt1_block* pDst_elements, uint elements_per_block, const qdxt1_params& params, float quality_power_mul) {
CRNLIB_ASSERT(m_num_blocks);
m_main_thread_id = crn_get_current_thread_id();
@@ -831,24 +769,20 @@ namespace crnlib
const uint max_endpoint_clusters = math::clamp<uint>(static_cast<uint>(m_endpoint_clusterizer.get_codebook_size() * endpoint_quality), 96U, m_endpoint_clusterizer.get_codebook_size());
const uint max_selector_clusters = math::clamp<uint>(static_cast<uint>(m_max_selector_clusters * selector_quality), 128U, m_max_selector_clusters);
if (quality >= 1.0f)
{
if (quality >= 1.0f) {
m_endpoint_cluster_indices.resize(m_num_blocks);
for (uint i = 0; i < m_num_blocks; i++)
{
for (uint i = 0; i < m_num_blocks; i++) {
m_endpoint_cluster_indices[i].resize(1);
m_endpoint_cluster_indices[i][0] = i;
}
}
else
} else
m_endpoint_clusterizer.retrieve_clusters(max_endpoint_clusters, m_endpoint_cluster_indices);
// trace("endpoint clusters: %u\n", m_endpoint_cluster_indices.size());
uint total_blocks = 0;
uint max_blocks = 0;
for (uint i = 0; i < m_endpoint_cluster_indices.size(); i++)
{
for (uint i = 0; i < m_endpoint_cluster_indices.size(); i++) {
uint num = m_endpoint_cluster_indices[i].size();
total_blocks += num;
max_blocks = math::maximum(max_blocks, num);
@@ -880,12 +814,10 @@ namespace crnlib
if (quality >= 1.0f)
return true;
if (selector_cluster_indices.empty())
{
if (selector_cluster_indices.empty()) {
create_selector_clusters(max_selector_clusters, selector_cluster_indices);
if (m_canceled)
{
if (m_canceled) {
selector_cluster_indices.clear();
return false;
@@ -906,5 +838,3 @@ namespace crnlib
}
} // namespace crnlib
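The task functions in this file decide which worker handles a cluster with the test `(cluster_index % (m_pTask_pool->get_num_threads() + 1)) != thread_index`. A minimal standalone sketch of that round-robin ownership rule follows. It assumes that `get_num_threads()` counts helper threads only and that the `+ 1` accounts for the calling thread; `owns_cluster` is a hypothetical helper for illustration and is not part of crnlib.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of crnlib): returns true when the worker
// identified by thread_index should process cluster_index.
// num_helper_threads mirrors task_pool::get_num_threads() in the diff; the
// "+ 1" is assumed to account for the calling thread taking a share too.
inline bool owns_cluster(uint32_t cluster_index, uint32_t num_helper_threads, uint32_t thread_index) {
  assert(thread_index <= num_helper_threads);
  if (num_helper_threads == 0)
    return true;  // no helpers: the caller processes every cluster
  return (cluster_index % (num_helper_threads + 1)) == thread_index;
}
```

With 3 helper threads there are 4 workers, so worker 1 would take clusters 1, 5, 9 and so on. Each worker scans the full index range and skips clusters it does not own, which avoids a shared work queue at the cost of uneven load when cluster sizes differ.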
+14 -27
@@ -8,17 +8,13 @@
#include "crn_threaded_clusterizer.h"
#include "crn_dxt_image.h"
namespace crnlib
{
struct qdxt1_params
{
qdxt1_params()
{
namespace crnlib {
struct qdxt1_params {
qdxt1_params() {
clear();
}
void clear()
{
void clear() {
m_quality_level = cMaxQuality;
m_dxt_quality = cCRNDXTQualityUber;
m_perceptual = true;
@@ -33,8 +29,7 @@ namespace crnlib
m_progress_range = 100;
}
void init(const dxt_image::pack_params &pp, int quality_level, bool hierarchical)
{
void init(const dxt_image::pack_params& pp, int quality_level, bool hierarchical) {
m_dxt_quality = pp.m_quality;
m_hierarchical = hierarchical;
m_perceptual = pp.m_perceptual;
@@ -52,8 +47,7 @@ namespace crnlib
bool m_use_alpha_blocks;
bool m_hierarchical;
struct mip_desc
{
struct mip_desc {
uint m_first_block;
uint m_block_width;
uint m_block_height;
@@ -70,8 +64,7 @@ namespace crnlib
uint m_progress_range;
};
class qdxt1
{
class qdxt1 {
CRNLIB_NO_COPY_OR_ASSIGNMENT_OP(qdxt1);
public:
@@ -122,20 +115,16 @@ namespace crnlib
crnlib::vector<crnlib::vector<uint> > m_cached_selector_cluster_indices[qdxt1_params::cMaxQuality + 1];
struct cluster_id
{
cluster_id() : m_hash(0)
{
struct cluster_id {
cluster_id()
: m_hash(0) {
}
cluster_id(const crnlib::vector<uint>& indices)
{
cluster_id(const crnlib::vector<uint>& indices) {
set(indices);
}
void set(const crnlib::vector<uint>& indices)
{
void set(const crnlib::vector<uint>& indices) {
m_cells.resize(indices.size());
for (uint i = 0; i < indices.size(); i++)
@@ -146,13 +135,11 @@ namespace crnlib
m_hash = fast_hash(&m_cells[0], sizeof(m_cells[0]) * m_cells.size());
}
bool operator< (const cluster_id& rhs) const
{
bool operator<(const cluster_id& rhs) const {
return m_cells < rhs.m_cells;
}
bool operator== (const cluster_id& rhs) const
{
bool operator==(const cluster_id& rhs) const {
if (m_hash != rhs.m_hash)
return false;
+79 -159
@@ -10,10 +10,9 @@
#define QDXT5_DEBUGGING 0
namespace crnlib
{
qdxt5::qdxt5(task_pool& task_pool) :
m_pTask_pool(&task_pool),
namespace crnlib {
qdxt5::qdxt5(task_pool& task_pool)
: m_pTask_pool(&task_pool),
m_main_thread_id(0),
m_canceled(false),
m_progress_start(0),
@@ -24,16 +23,13 @@ namespace crnlib
m_elements_per_block(0),
m_max_selector_clusters(0),
m_prev_percentage_complete(-1),
m_selector_clusterizer(task_pool)
{
m_selector_clusterizer(task_pool) {
}
qdxt5::~qdxt5()
{
qdxt5::~qdxt5() {
}
void qdxt5::clear()
{
void qdxt5::clear() {
m_main_thread_id = 0;
m_num_blocks = 0;
m_pBlocks = 0;
@@ -56,8 +52,7 @@ namespace crnlib
m_prev_percentage_complete = -1;
}
bool qdxt5::init(uint n, const dxt_pixel_block* pBlocks, const qdxt5_params& params)
{
bool qdxt5::init(uint n, const dxt_pixel_block* pBlocks, const qdxt5_params& params) {
clear();
CRNLIB_ASSERT(n && pBlocks);
@@ -77,8 +72,7 @@ namespace crnlib
const bool debugging = true;
if ((m_params.m_hierarchical) && (m_params.m_num_mips))
{
if ((m_params.m_hierarchical) && (m_params.m_num_mips)) {
vec2F_clusterizer::training_vec_array& training_vecs = m_endpoint_clusterizer.get_training_vecs();
training_vecs.resize(m_num_blocks);
@@ -88,8 +82,7 @@ namespace crnlib
uint total_processed_blocks = 0;
uint next_progress_threshold = 512;
for (uint level = 0; level < m_params.m_num_mips; level++)
{
for (uint level = 0; level < m_params.m_num_mips; level++) {
const qdxt5_params::mip_desc& level_desc = m_params.m_mip_desc[level];
const uint num_chunks_x = (level_desc.m_block_width + cChunkBlockWidth - 1) / cChunkBlockWidth;
@@ -101,20 +94,16 @@ namespace crnlib
if (debugging)
debug_img.resize(num_chunks_x * cChunkPixelWidth, num_chunks_y * cChunkPixelHeight);
for (uint chunk_y = 0; chunk_y < num_chunks_y; chunk_y++)
{
for (uint chunk_x = 0; chunk_x < num_chunks_x; chunk_x++)
{
for (uint chunk_y = 0; chunk_y < num_chunks_y; chunk_y++) {
for (uint chunk_x = 0; chunk_x < num_chunks_x; chunk_x++) {
color_quad_u8 chunk_pixels[cChunkPixelWidth * cChunkPixelHeight];
for (uint y = 0; y < cChunkPixelHeight; y++)
{
for (uint y = 0; y < cChunkPixelHeight; y++) {
const uint pix_y = math::minimum<uint>(chunk_y * cChunkPixelHeight + y, level_height - 1);
const uint outer_block_index = level_desc.m_first_block + ((pix_y >> 2) * level_desc.m_block_width);
for (uint x = 0; x < cChunkPixelWidth; x++)
{
for (uint x = 0; x < cChunkPixelWidth; x++) {
const uint pix_x = math::minimum<uint>(chunk_x * cChunkPixelWidth + x, level_width - 1);
const uint block_index = outer_block_index + (pix_x >> 2);
@@ -127,8 +116,7 @@ namespace crnlib
}
}
struct layout_results
{
struct layout_results {
uint m_low_color;
uint m_high_color;
uint8 m_selectors[cChunkPixelWidth * cChunkPixelHeight];
@@ -137,8 +125,7 @@ namespace crnlib
};
layout_results layouts[cNumChunkTileLayouts];
for (uint l = 0; l < cNumChunkTileLayouts; l++)
{
for (uint l = 0; l < cNumChunkTileLayouts; l++) {
const uint width = g_chunk_tile_layouts[l].m_width;
const uint height = g_chunk_tile_layouts[l].m_height;
const uint x_ofs = g_chunk_tile_layouts[l].m_x_ofs;
@@ -165,8 +152,7 @@ namespace crnlib
double best_peak_snr = -1.0f;
uint best_encoding = 0;
for (uint e = 0; e < cNumChunkEncodings; e++)
{
for (uint e = 0; e < cNumChunkEncodings; e++) {
const chunk_encoding_desc& encoding_desc = g_chunk_encodings[e];
double total_error = 0;
@@ -184,8 +170,7 @@ namespace crnlib
float adaptive_tile_alpha_psnr_derating = 2.4f;
//if (level)
// adaptive_tile_alpha_psnr_derating = math::lerp(adaptive_tile_alpha_psnr_derating * .5f, .3f, math::maximum((level - 1) / float(m_params.m_num_mips - 2), 1.0f));
if ((level) && (adaptive_tile_alpha_psnr_derating > .25f))
{
if ((level) && (adaptive_tile_alpha_psnr_derating > .25f)) {
adaptive_tile_alpha_psnr_derating = math::maximum(.25f, adaptive_tile_alpha_psnr_derating / powf(3.0f, static_cast<float>(level)));
}
@@ -195,8 +180,7 @@ namespace crnlib
//for (uint t = 0; t < encoding_desc.m_num_tiles; t++)
// peak_snr -= (double)layouts[encoding_desc.m_tiles[t].m_layout_index].m_penalty;
if (peak_snr > best_peak_snr)
{
if (peak_snr > best_peak_snr) {
best_peak_snr = peak_snr;
best_encoding = e;
}
@@ -206,8 +190,7 @@ namespace crnlib
const chunk_encoding_desc& encoding_desc = g_chunk_encodings[best_encoding];
for (uint t = 0; t < encoding_desc.m_num_tiles; t++)
{
for (uint t = 0; t < encoding_desc.m_num_tiles; t++) {
const chunk_tile_desc& tile_desc = encoding_desc.m_tiles[t];
uint layout_index = tile_desc.m_layout_index;
@@ -219,12 +202,10 @@ namespace crnlib
color_quad_u8 tile_pixels[cChunkPixelWidth * cChunkPixelHeight];
for (uint y = 0; y < tile_desc.m_height; y++)
{
for (uint y = 0; y < tile_desc.m_height; y++) {
const uint pix_y = y + tile_desc.m_y_ofs;
for (uint x = 0; x < tile_desc.m_width; x++)
{
for (uint x = 0; x < tile_desc.m_width; x++) {
const uint pix_x = x + tile_desc.m_x_ofs;
uint a = chunk_pixels[pix_x + pix_y * cChunkPixelWidth][m_params.m_comp_index];
@@ -250,14 +231,12 @@ namespace crnlib
ev[0] = l[0];
ev[1] = h[0];
for (uint y = 0; y < (tile_desc.m_height >> 2); y++)
{
for (uint y = 0; y < (tile_desc.m_height >> 2); y++) {
uint block_y = chunk_y * cChunkBlockHeight + y + (tile_desc.m_y_ofs >> 2);
if (block_y >= level_desc.m_block_height)
continue;
for (uint x = 0; x < (tile_desc.m_width >> 2); x++)
{
for (uint x = 0; x < (tile_desc.m_width >> 2); x++) {
uint block_x = chunk_x * cChunkBlockWidth + x + (tile_desc.m_x_ofs >> 2);
if (block_x >= level_desc.m_block_width)
break;
@@ -273,8 +252,7 @@ namespace crnlib
} // y
} //t
if (total_processed_blocks >= next_progress_threshold)
{
if (total_processed_blocks >= next_progress_threshold) {
next_progress_threshold += 512;
if (!update_progress(total_processed_blocks, m_num_blocks - 1))
@@ -297,13 +275,9 @@ namespace crnlib
trace("%u ", encoding_hist[i]);
trace("\n");
#endif
}
else
{
for (uint block_index = 0; block_index < m_num_blocks; block_index++)
{
if ((block_index & 511) == 0)
{
} else {
for (uint block_index = 0; block_index < m_num_blocks; block_index++) {
if ((block_index & 511) == 0) {
if (!update_progress(block_index, m_num_blocks - 1))
return false;
}
@@ -344,10 +318,8 @@ namespace crnlib
m_progress_start = 95;
m_progress_range = 5;
for (uint block_index = 0; block_index < m_num_blocks; block_index++)
{
if ((block_index & 511) == 0)
{
for (uint block_index = 0; block_index < m_num_blocks; block_index++) {
if ((block_index & 511) == 0) {
if (!update_progress(block_index, m_num_blocks - 1))
return false;
}
@@ -369,8 +341,7 @@ namespace crnlib
return true;
}
bool qdxt5::update_progress(uint value, uint max_value)
{
bool qdxt5::update_progress(uint value, uint max_value) {
if (!m_params.m_pProgress_func)
return true;
@@ -379,8 +350,7 @@ namespace crnlib
return true;
m_prev_percentage_complete = percentage;
if (!m_params.m_pProgress_func(m_params.m_progress_start + (percentage * m_params.m_progress_range) / 100U, m_params.m_pProgress_data))
{
if (!m_params.m_pProgress_func(m_params.m_progress_start + (percentage * m_params.m_progress_range) / 100U, m_params.m_pProgress_data)) {
m_canceled = true;
return false;
}
@@ -388,9 +358,7 @@ namespace crnlib
return true;
}
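The callback argument in `update_progress` maps a local percentage into the slice of the overall progress bar given by `m_progress_start` and `m_progress_range`. A small sketch of that mapping is shown below. The function name `scale_progress` is hypothetical, and how `percentage` itself is derived from `value` and `max_value` is not visible in this hunk, so it is taken as an input here.

```cpp
#include <cstdint>

// Hypothetical helper: maps a local percentage (0..100) into the range
// [progress_start, progress_start + progress_range] using the same integer
// arithmetic as the expression passed to m_pProgress_func in the diff.
inline uint32_t scale_progress(uint32_t percentage, uint32_t progress_start, uint32_t progress_range) {
  return progress_start + (percentage * progress_range) / 100U;
}
```

For the final stage in `init`, which sets the start to 95 and the range to 5, a local value of 50 percent would be reported as 97 because the division truncates.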
void qdxt5::pack_endpoints_task(uint64 data, void* pData_ptr)
{
pData_ptr;
void qdxt5::pack_endpoints_task(uint64 data, void*) {
const uint thread_index = static_cast<uint>(data);
crnlib::vector<color_quad_u8> cluster_pixels;
@@ -412,22 +380,18 @@ namespace crnlib
cluster_index_progress_mask = math::maximum<uint>(cluster_index_progress_mask, 8);
cluster_index_progress_mask -= 1;
for (uint cluster_index = 0; cluster_index < m_endpoint_cluster_indices.size(); cluster_index++)
{
for (uint cluster_index = 0; cluster_index < m_endpoint_cluster_indices.size(); cluster_index++) {
if (m_canceled)
return;
if ((cluster_index & cluster_index_progress_mask) == 0)
{
if (crn_get_current_thread_id() == m_main_thread_id)
{
if ((cluster_index & cluster_index_progress_mask) == 0) {
if (crn_get_current_thread_id() == m_main_thread_id) {
if (!update_progress(cluster_index, m_endpoint_cluster_indices.size() - 1))
return;
}
}
if (m_pTask_pool->get_num_threads())
{
if (m_pTask_pool->get_num_threads()) {
if ((cluster_index % (m_pTask_pool->get_num_threads() + 1)) != thread_index)
continue;
}
@@ -440,15 +404,13 @@ namespace crnlib
color_quad_u8* pDst = &cluster_pixels[0];
for (uint block_iter = 0; block_iter < cluster_indices.size(); block_iter++)
{
for (uint block_iter = 0; block_iter < cluster_indices.size(); block_iter++) {
const uint block_index = cluster_indices[block_iter];
//const color_quad_u8* pSrc_pixels = &m_pBlocks[block_index].m_pixels[0][0];
const color_quad_u8* pSrc_pixels = (const color_quad_u8*)m_pBlocks[block_index].m_pixels;
for (uint i = 0; i < cDXTBlockSize * cDXTBlockSize; i++)
{
for (uint i = 0; i < cDXTBlockSize * cDXTBlockSize; i++) {
const color_quad_u8& src = pSrc_pixels[i];
*pDst++ = src;
@@ -463,21 +425,17 @@ namespace crnlib
uint low_color;
uint high_color;
if (m_params.m_dxt_quality != cCRNDXTQualitySuperFast)
{
if (m_params.m_dxt_quality != cCRNDXTQualitySuperFast) {
optimizer.compute(p, r);
low_color = r.m_first_endpoint;
high_color = r.m_second_endpoint;
}
else
{
} else {
dxt_fast::compress_alpha_block(cluster_pixels.size(), cluster_pixels.begin(), low_color, high_color, selectors.begin(), m_params.m_comp_index);
}
const uint8* pSrc_selectors = selectors.begin();
for (uint block_iter = 0; block_iter < cluster_indices.size(); block_iter++)
{
for (uint block_iter = 0; block_iter < cluster_indices.size(); block_iter++) {
const uint block_index = cluster_indices[block_iter];
dxt5_block& dxt_block = get_block(block_index);
@@ -492,21 +450,18 @@ namespace crnlib
}
}
struct optimize_selectors_params
{
struct optimize_selectors_params {
CRNLIB_NO_COPY_OR_ASSIGNMENT_OP(optimize_selectors_params);
optimize_selectors_params(
crnlib::vector< crnlib::vector<uint> >& selector_cluster_indices) :
m_selector_cluster_indices(selector_cluster_indices)
{
crnlib::vector<crnlib::vector<uint> >& selector_cluster_indices)
: m_selector_cluster_indices(selector_cluster_indices) {
}
crnlib::vector<crnlib::vector<uint> >& m_selector_cluster_indices;
};
void qdxt5::optimize_selectors_task(uint64 data, void* pData_ptr)
{
void qdxt5::optimize_selectors_task(uint64 data, void* pData_ptr) {
const uint thread_index = static_cast<uint>(data);
optimize_selectors_params& task_params = *static_cast<optimize_selectors_params*>(pData_ptr);
@@ -515,22 +470,18 @@ namespace crnlib
block_categories[0].reserve(2048);
block_categories[1].reserve(2048);
for (uint cluster_index = 0; cluster_index < task_params.m_selector_cluster_indices.size(); cluster_index++)
{
for (uint cluster_index = 0; cluster_index < task_params.m_selector_cluster_indices.size(); cluster_index++) {
if (m_canceled)
return;
if ((cluster_index & 255) == 0)
{
if (crn_get_current_thread_id() == m_main_thread_id)
{
if ((cluster_index & 255) == 0) {
if (crn_get_current_thread_id() == m_main_thread_id) {
if (!update_progress(cluster_index, task_params.m_selector_cluster_indices.size() - 1))
return;
}
}
if (m_pTask_pool->get_num_threads())
{
if (m_pTask_pool->get_num_threads()) {
if ((cluster_index % (m_pTask_pool->get_num_threads() + 1)) != thread_index)
continue;
}
@@ -543,8 +494,7 @@ namespace crnlib
block_categories[0].resize(0);
block_categories[1].resize(0);
for (uint block_iter = 0; block_iter < selector_indices.size(); block_iter++)
{
for (uint block_iter = 0; block_iter < selector_indices.size(); block_iter++) {
const uint block_index = selector_indices[block_iter];
const dxt5_block& src_block = get_block(block_index);
@@ -555,25 +505,20 @@ namespace crnlib
dxt5_block blk;
utils::zero_object(blk);
for (uint block_type = 0; block_type <= 1; block_type++)
{
for (uint block_type = 0; block_type <= 1; block_type++) {
const crnlib::vector<uint>& block_indices = block_categories[block_type];
if (block_indices.size() <= 1)
continue;
for (uint y = 0; y < cDXTBlockSize; y++)
{
for (uint x = 0; x < cDXTBlockSize; x++)
{
for (uint y = 0; y < cDXTBlockSize; y++) {
for (uint x = 0; x < cDXTBlockSize; x++) {
uint best_s = 0;
uint64 best_error = 0xFFFFFFFFFFULL;
for (uint s = 0; s < dxt5_block::cMaxSelectorValues; s++)
{
for (uint s = 0; s < dxt5_block::cMaxSelectorValues; s++) {
uint64 total_error = 0;
for (uint block_iter = 0; block_iter < block_indices.size(); block_iter++)
{
for (uint block_iter = 0; block_iter < block_indices.size(); block_iter++) {
const uint block_index = block_indices[block_iter];
const color_quad_u8& orig_color = m_pBlocks[block_index].m_pixels[y][x];
@@ -588,8 +533,7 @@ namespace crnlib
total_error += error;
}
if (total_error < best_error)
{
if (total_error < best_error) {
best_error = total_error;
best_s = s;
}
@@ -600,8 +544,7 @@ namespace crnlib
} // x
} // y
for (uint block_iter = 0; block_iter < block_indices.size(); block_iter++)
{
for (uint block_iter = 0; block_iter < block_indices.size(); block_iter++) {
const uint block_index = block_indices[block_iter];
dxt5_block& dst_block = get_block(block_index);
@@ -613,20 +556,16 @@ namespace crnlib
} // cluster_index
}
bool qdxt5::generate_codebook_progress_callback(uint percentage_completed, void* pData)
{
bool qdxt5::generate_codebook_progress_callback(uint percentage_completed, void* pData) {
return static_cast<qdxt5*>(pData)->update_progress(percentage_completed, 100U);
}
bool qdxt5::create_selector_clusters(uint max_selector_clusters, crnlib::vector< crnlib::vector<uint> >& selector_cluster_indices)
{
bool qdxt5::create_selector_clusters(uint max_selector_clusters, crnlib::vector<crnlib::vector<uint> >& selector_cluster_indices) {
weighted_selector_vec_array selector_vecs[2];
crnlib::vector<uint> selector_vec_remap[2];
for (uint block_type = 0; block_type < 2; block_type++)
{
for (uint block_iter = 0; block_iter < m_num_blocks; block_iter++)
{
for (uint block_type = 0; block_type < 2; block_type++) {
for (uint block_iter = 0; block_iter < m_num_blocks; block_iter++) {
dxt5_block& dxt5_block = get_block(block_iter);
if ((uint)dxt5_block.is_alpha6_block() != block_type)
continue;
@@ -636,24 +575,18 @@ namespace crnlib
bool uses_absolute_values = false;
for (uint y = 0; y < 4; y++)
{
for (uint x = 0; x < 4; x++)
{
for (uint y = 0; y < 4; y++) {
for (uint x = 0; x < 4; x++) {
const uint s = dxt5_block.get_selector(x, y);
float f;
if (dxt5_block.is_alpha6_block())
{
if (s >= 6)
{
if (dxt5_block.is_alpha6_block()) {
if (s >= 6) {
uses_absolute_values = true;
f = 0.0f;
}
else
} else
f = g_dxt5_alpha6_to_linear[s];
}
else
} else
f = g_dxt5_to_linear[s];
*pDst++ = f;
@@ -681,8 +614,7 @@ namespace crnlib
selector_cluster_indices.clear();
for (uint block_type = 0; block_type < 2; block_type++)
{
for (uint block_type = 0; block_type < 2; block_type++) {
if (selector_vecs[block_type].empty())
continue;
@@ -699,28 +631,23 @@ namespace crnlib
crnlib::vector<crnlib::vector<uint> > block_type_selector_cluster_indices;
if (!block_type)
{
if (!block_type) {
m_progress_start = m_progress_range;
m_progress_range = 16;
}
else
{
} else {
m_progress_start = m_progress_range + 16;
m_progress_range = 17;
}
if (!m_selector_clusterizer.create_clusters(
selector_vecs[block_type], max_clusters, block_type_selector_cluster_indices, generate_codebook_progress_callback, this))
{
selector_vecs[block_type], max_clusters, block_type_selector_cluster_indices, generate_codebook_progress_callback, this)) {
return false;
}
const uint first_cluster = selector_cluster_indices.size();
selector_cluster_indices.enlarge(block_type_selector_cluster_indices.size());
for (uint i = 0; i < block_type_selector_cluster_indices.size(); i++)
{
for (uint i = 0; i < block_type_selector_cluster_indices.size(); i++) {
crnlib::vector<uint>& indices = selector_cluster_indices[first_cluster + i];
indices.swap(block_type_selector_cluster_indices[i]);
@@ -732,8 +659,7 @@ namespace crnlib
return true;
}
bool qdxt5::pack(dxt5_block* pDst_elements, uint elements_per_block, const qdxt5_params& params)
{
bool qdxt5::pack(dxt5_block* pDst_elements, uint elements_per_block, const qdxt5_params& params) {
CRNLIB_ASSERT(m_num_blocks);
m_main_thread_id = crn_get_current_thread_id();
@@ -758,22 +684,18 @@ namespace crnlib
trace("max selector clusters: %u\n", max_selector_clusters);
#endif
if (quality >= 1.0f)
{
if (quality >= 1.0f) {
m_endpoint_cluster_indices.resize(m_num_blocks);
for (uint i = 0; i < m_num_blocks; i++)
{
for (uint i = 0; i < m_num_blocks; i++) {
m_endpoint_cluster_indices[i].resize(1);
m_endpoint_cluster_indices[i][0] = i;
}
}
else
} else
m_endpoint_clusterizer.retrieve_clusters(max_endpoint_clusters, m_endpoint_cluster_indices);
uint total_blocks = 0;
uint max_blocks = 0;
for (uint i = 0; i < m_endpoint_cluster_indices.size(); i++)
{
for (uint i = 0; i < m_endpoint_cluster_indices.size(); i++) {
uint num = m_endpoint_cluster_indices[i].size();
total_blocks += num;
max_blocks = math::maximum(max_blocks, num);
@@ -799,12 +721,10 @@ namespace crnlib
if (quality >= 1.0f)
return true;
if (selector_cluster_indices.empty())
{
if (selector_cluster_indices.empty()) {
create_selector_clusters(max_selector_clusters, selector_cluster_indices);
if (m_canceled)
{
if (m_canceled) {
selector_cluster_indices.clear();
return false;
+14 -37
@@ -8,17 +8,13 @@
#include "crn_dxt.h"
#include "crn_dxt_image.h"
namespace crnlib
{
struct qdxt5_params
{
qdxt5_params()
{
namespace crnlib {
struct qdxt5_params {
qdxt5_params() {
clear();
}
void clear()
{
void clear() {
m_quality_level = cMaxQuality;
m_dxt_quality = cCRNDXTQualityUber;
@@ -35,8 +31,7 @@ namespace crnlib
m_use_both_block_types = true;
}
void init(const dxt_image::pack_params &pp, int quality_level, bool hierarchical, int comp_index = 3)
{
void init(const dxt_image::pack_params& pp, int quality_level, bool hierarchical, int comp_index = 3) {
m_dxt_quality = pp.m_quality;
m_hierarchical = hierarchical;
m_comp_index = comp_index;
@@ -49,8 +44,7 @@ namespace crnlib
crn_dxt_quality m_dxt_quality;
bool m_hierarchical;
struct mip_desc
{
struct mip_desc {
uint m_first_block;
uint m_block_width;
uint m_block_height;
@@ -71,8 +65,7 @@ namespace crnlib
bool m_use_both_block_types;
};
class qdxt5
{
class qdxt5 {
CRNLIB_NO_COPY_OR_ASSIGNMENT_OP(qdxt5);
public:
@@ -123,20 +116,16 @@ namespace crnlib
crnlib::vector<crnlib::vector<uint> > m_cached_selector_cluster_indices[qdxt5_params::cMaxQuality + 1];
struct cluster_id
{
cluster_id() : m_hash(0)
{
struct cluster_id {
cluster_id()
: m_hash(0) {
}
cluster_id(const crnlib::vector<uint>& indices)
{
cluster_id(const crnlib::vector<uint>& indices) {
set(indices);
}
void set(const crnlib::vector<uint>& indices)
{
void set(const crnlib::vector<uint>& indices) {
m_cells.resize(indices.size());
for (uint i = 0; i < indices.size(); i++)
@@ -147,13 +136,11 @@ namespace crnlib
m_hash = fast_hash(&m_cells[0], sizeof(m_cells[0]) * m_cells.size());
}
bool operator< (const cluster_id& rhs) const
{
bool operator<(const cluster_id& rhs) const {
return m_cells < rhs.m_cells;
}
bool operator== (const cluster_id& rhs) const
{
bool operator==(const cluster_id& rhs) const {
if (m_hash != rhs.m_hash)
return false;
@@ -182,13 +169,3 @@ namespace crnlib
};
} // namespace crnlib
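The `cluster_id` struct in this header stores a hash next to the cell list so that `operator==` can reject most mismatches without comparing the vectors. A self-contained sketch of that pattern is given below. It assumes the part of `operator==` hidden by the hunk falls back to a full comparison of the cells, uses FNV-1a as a stand-in for crnlib's `fast_hash`, and uses the hypothetical name `cluster_key` with `std::vector` in place of `crnlib::vector`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for crnlib's fast_hash (assumption: any deterministic byte hash
// serves the purpose of this sketch).
inline uint32_t fnv1a(const void* data, size_t len) {
  const uint8_t* p = static_cast<const uint8_t*>(data);
  uint32_t h = 2166136261u;
  for (size_t i = 0; i < len; i++) {
    h ^= p[i];
    h *= 16777619u;
  }
  return h;
}

// Hypothetical analogue of cluster_id.
struct cluster_key {
  std::vector<uint32_t> cells;
  uint32_t hash = 0;

  explicit cluster_key(const std::vector<uint32_t>& indices)
      : cells(indices) {
    if (!cells.empty())
      hash = fnv1a(cells.data(), sizeof(cells[0]) * cells.size());
  }

  // Ordering uses the cells only, as in the diff.
  bool operator<(const cluster_key& rhs) const {
    return cells < rhs.cells;
  }

  // Cheap hash check first; the full comparison runs only on a hash match.
  bool operator==(const cluster_key& rhs) const {
    if (hash != rhs.hash)
      return false;
    return cells == rhs.cells;
  }
};
```

The hash only short-circuits the unequal case. Equal hashes still require the element-wise comparison, since two different cell lists can collide.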
+37 -82
@@ -2,12 +2,10 @@
// See Copyright Notice and license at the end of inc/crnlib.h
#pragma once
namespace crnlib
{
namespace crnlib {
// Returns pointer to sorted array.
template <typename T>
T* radix_sort(uint num_vals, T* pBuf0, T* pBuf1, uint key_ofs, uint key_size)
{
T* radix_sort(uint num_vals, T* pBuf0, T* pBuf1, uint key_ofs, uint key_size) {
CRNLIB_ASSERT_OPEN_RANGE(key_ofs, 0, sizeof(T));
CRNLIB_ASSERT_CLOSED_RANGE(key_size, 1, 4);
@@ -17,12 +15,10 @@ namespace crnlib
#define CRNLIB_GET_KEY(p) (*(uint*)((uint8*)(p) + key_ofs))
if (key_size == 4)
{
if (key_size == 4) {
T* p = pBuf0;
T* q = pBuf0 + num_vals;
for ( ; p != q; p++)
{
for (; p != q; p++) {
const uint key = CRNLIB_GET_KEY(p);
hist[key & 0xFF]++;
@@ -30,27 +26,21 @@ namespace crnlib
hist[512 + ((key >> 16) & 0xFF)]++;
hist[768 + ((key >> 24) & 0xFF)]++;
}
}
else if (key_size == 3)
{
} else if (key_size == 3) {
T* p = pBuf0;
T* q = pBuf0 + num_vals;
for ( ; p != q; p++)
{
for (; p != q; p++) {
const uint key = CRNLIB_GET_KEY(p);
hist[key & 0xFF]++;
hist[256 + ((key >> 8) & 0xFF)]++;
hist[512 + ((key >> 16) & 0xFF)]++;
}
}
else if (key_size == 2)
{
} else if (key_size == 2) {
T* p = pBuf0;
T* q = pBuf0 + (num_vals >> 1) * 2;
for ( ; p != q; p += 2)
{
for (; p != q; p += 2) {
const uint key0 = CRNLIB_GET_KEY(p);
const uint key1 = CRNLIB_GET_KEY(p + 1);
@@ -61,16 +51,13 @@ namespace crnlib
hist[256 + ((key1 >> 8) & 0xFF)]++;
}
if (num_vals & 1)
{
if (num_vals & 1) {
const uint key = CRNLIB_GET_KEY(p);
hist[key & 0xFF]++;
hist[256 + ((key >> 8) & 0xFF)]++;
}
}
else
{
} else {
CRNLIB_ASSERT(key_size == 1);
if (key_size != 1)
return NULL;
@@ -78,8 +65,7 @@ namespace crnlib
T* p = pBuf0;
T* q = pBuf0 + (num_vals >> 1) * 2;
for ( ; p != q; p += 2)
{
for (; p != q; p += 2) {
const uint key0 = CRNLIB_GET_KEY(p);
const uint key1 = CRNLIB_GET_KEY(p + 1);
@@ -87,8 +73,7 @@ namespace crnlib
hist[key1 & 0xFF]++;
}
if (num_vals & 1)
{
if (num_vals & 1) {
const uint key = CRNLIB_GET_KEY(p);
hist[key & 0xFF]++;
}
@@ -97,15 +82,13 @@ namespace crnlib
T* pCur = pBuf0;
T* pNew = pBuf1;
for (uint pass = 0; pass < key_size; pass++)
{
for (uint pass = 0; pass < key_size; pass++) {
const uint* pHist = &hist[pass << 8];
uint offsets[256];
uint cur_ofs = 0;
for (uint i = 0; i < 256; i += 2)
{
for (uint i = 0; i < 256; i += 2) {
offsets[i] = cur_ofs;
cur_ofs += pHist[i];
@@ -118,22 +101,18 @@ namespace crnlib
T* p = pCur;
T* q = pCur + (num_vals >> 1) * 2;
for ( ; p != q; p += 2)
{
for (; p != q; p += 2) {
uint c0 = (CRNLIB_GET_KEY(p) >> pass_shift) & 0xFF;
uint c1 = (CRNLIB_GET_KEY(p + 1) >> pass_shift) & 0xFF;
if (c0 == c1)
{
if (c0 == c1) {
uint dst_offset0 = offsets[c0];
offsets[c0] = dst_offset0 + 2;
pNew[dst_offset0] = p[0];
pNew[dst_offset0 + 1] = p[1];
}
else
{
} else {
uint dst_offset0 = offsets[c0]++;
uint dst_offset1 = offsets[c1]++;
@@ -142,8 +121,7 @@ namespace crnlib
}
}
if (num_vals & 1)
{
if (num_vals & 1) {
uint c = (CRNLIB_GET_KEY(p) >> pass_shift) & 0xFF;
uint dst_offset = offsets[c];
@@ -164,18 +142,15 @@ namespace crnlib
// Returns pointer to sorted array.
template <typename T, typename Q>
T* indirect_radix_sort(uint num_indices, T* pIndices0, T* pIndices1, const Q* pKeys, uint key_ofs, uint key_size, bool init_indices)
{
T* indirect_radix_sort(uint num_indices, T* pIndices0, T* pIndices1, const Q* pKeys, uint key_ofs, uint key_size, bool init_indices) {
CRNLIB_ASSERT_OPEN_RANGE(key_ofs, 0, sizeof(T));
CRNLIB_ASSERT_CLOSED_RANGE(key_size, 1, 4);
if (init_indices)
{
if (init_indices) {
T* p = pIndices0;
T* q = pIndices0 + (num_indices >> 1) * 2;
uint i;
for (i = 0; p != q; p += 2, i += 2)
{
for (i = 0; p != q; p += 2, i += 2) {
p[0] = static_cast<T>(i);
p[1] = static_cast<T>(i + 1);
}
@@ -191,12 +166,10 @@ namespace crnlib
#define CRNLIB_GET_KEY(p) (*(const uint*)((const uint8*)(pKeys + *(p)) + key_ofs))
#define CRNLIB_GET_KEY_FROM_INDEX(i) (*(const uint*)((const uint8*)(pKeys + (i)) + key_ofs))
if (key_size == 4)
{
if (key_size == 4) {
T* p = pIndices0;
T* q = pIndices0 + num_indices;
for ( ; p != q; p++)
{
for (; p != q; p++) {
const uint key = CRNLIB_GET_KEY(p);
hist[key & 0xFF]++;
@@ -204,27 +177,21 @@ namespace crnlib
hist[512 + ((key >> 16) & 0xFF)]++;
hist[768 + ((key >> 24) & 0xFF)]++;
}
}
else if (key_size == 3)
{
} else if (key_size == 3) {
T* p = pIndices0;
T* q = pIndices0 + num_indices;
for ( ; p != q; p++)
{
for (; p != q; p++) {
const uint key = CRNLIB_GET_KEY(p);
hist[key & 0xFF]++;
hist[256 + ((key >> 8) & 0xFF)]++;
hist[512 + ((key >> 16) & 0xFF)]++;
}
}
else if (key_size == 2)
{
} else if (key_size == 2) {
T* p = pIndices0;
T* q = pIndices0 + (num_indices >> 1) * 2;
for ( ; p != q; p += 2)
{
for (; p != q; p += 2) {
const uint key0 = CRNLIB_GET_KEY(p);
const uint key1 = CRNLIB_GET_KEY(p + 1);
@@ -235,16 +202,13 @@ namespace crnlib
hist[256 + ((key1 >> 8) & 0xFF)]++;
}
if (num_indices & 1)
{
if (num_indices & 1) {
const uint key = CRNLIB_GET_KEY(p);
hist[key & 0xFF]++;
hist[256 + ((key >> 8) & 0xFF)]++;
}
}
else
{
} else {
CRNLIB_ASSERT(key_size == 1);
if (key_size != 1)
return NULL;
@@ -252,8 +216,7 @@ namespace crnlib
T* p = pIndices0;
T* q = pIndices0 + (num_indices >> 1) * 2;
for ( ; p != q; p += 2)
{
for (; p != q; p += 2) {
const uint key0 = CRNLIB_GET_KEY(p);
const uint key1 = CRNLIB_GET_KEY(p + 1);
@@ -261,8 +224,7 @@ namespace crnlib
hist[key1 & 0xFF]++;
}
if (num_indices & 1)
{
if (num_indices & 1) {
const uint key = CRNLIB_GET_KEY(p);
hist[key & 0xFF]++;
@@ -272,15 +234,13 @@ namespace crnlib
T* pCur = pIndices0;
T* pNew = pIndices1;
for (uint pass = 0; pass < key_size; pass++)
{
for (uint pass = 0; pass < key_size; pass++) {
const uint* pHist = &hist[pass << 8];
uint offsets[256];
uint cur_ofs = 0;
for (uint i = 0; i < 256; i += 2)
{
for (uint i = 0; i < 256; i += 2) {
offsets[i] = cur_ofs;
cur_ofs += pHist[i];
@@ -293,25 +253,21 @@ namespace crnlib
T* p = pCur;
T* q = pCur + (num_indices >> 1) * 2;
for ( ; p != q; p += 2)
{
for (; p != q; p += 2) {
uint index0 = p[0];
uint index1 = p[1];
uint c0 = (CRNLIB_GET_KEY_FROM_INDEX(index0) >> pass_shift) & 0xFF;
uint c1 = (CRNLIB_GET_KEY_FROM_INDEX(index1) >> pass_shift) & 0xFF;
if (c0 == c1)
{
if (c0 == c1) {
uint dst_offset0 = offsets[c0];
offsets[c0] = dst_offset0 + 2;
pNew[dst_offset0] = static_cast<T>(index0);
pNew[dst_offset0 + 1] = static_cast<T>(index1);
}
else
{
} else {
uint dst_offset0 = offsets[c0]++;
uint dst_offset1 = offsets[c1]++;
@@ -320,8 +276,7 @@ namespace crnlib
}
}
if (num_indices & 1)
{
if (num_indices & 1) {
uint index = *p;
uint c = (CRNLIB_GET_KEY_FROM_INDEX(index) >> pass_shift) & 0xFF;
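Both `radix_sort` and `indirect_radix_sort` follow the same outline: build one 256-entry histogram per key byte in a single pass, then for each byte (least significant first) turn the histogram into starting offsets and scatter the elements into the other buffer. The sketch below shows that outline in simplified form. It sorts plain 32-bit keys, drops the two-at-a-time loop unrolling seen in the diff, and uses the hypothetical name `radix_sort_u32`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified least-significant-byte-first radix sort over 32-bit keys.
inline std::vector<uint32_t> radix_sort_u32(std::vector<uint32_t> vals) {
  const size_t n = vals.size();
  std::vector<uint32_t> tmp(n);

  // One histogram per key byte, all filled in a single pass over the input.
  uint32_t hist[4 * 256] = {0};
  for (uint32_t key : vals)
    for (unsigned pass = 0; pass < 4; pass++)
      hist[pass * 256 + ((key >> (pass * 8)) & 0xFF)]++;

  for (unsigned pass = 0; pass < 4; pass++) {
    const uint32_t* h = &hist[pass * 256];

    // Exclusive prefix sum: offsets[c] is where the first key with byte c goes.
    uint32_t offsets[256];
    uint32_t cur_ofs = 0;
    for (unsigned i = 0; i < 256; i++) {
      offsets[i] = cur_ofs;
      cur_ofs += h[i];
    }

    // Stable scatter into the second buffer, then swap buffer roles.
    const unsigned pass_shift = pass * 8;
    for (size_t i = 0; i < n; i++) {
      const uint32_t c = (vals[i] >> pass_shift) & 0xFF;
      tmp[offsets[c]++] = vals[i];
    }
    vals.swap(tmp);
  }
  return vals;
}
```

Because every pass is stable, ordering established by lower bytes survives the passes over higher bytes, which is why the histograms can be computed up front from the unsorted input.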
+44 -80
@@ -25,28 +25,24 @@
//#define rot(x,k) (((x)<<(k))|((x)>>(32-(k))))
#define rot(x, k) CRNLIB_ROTATE_LEFT(x, k)
namespace crnlib
{
namespace crnlib {
static const double cNorm = 1.0 / (double)0x100000000ULL;
kiss99::kiss99()
{
kiss99::kiss99() {
x = 123456789;
y = 362436000;
z = 521288629;
c = 7654321;
}
void kiss99::seed(uint32 i, uint32 j, uint32 k)
{
void kiss99::seed(uint32 i, uint32 j, uint32 k) {
x = i;
y = j;
z = k;
c = 7654321;
}
inline uint32 kiss99::next()
{
inline uint32 kiss99::next() {
x = 69069 * x + 12345;
y ^= (y << 13);
@@ -61,8 +57,7 @@ namespace crnlib
return (x + y + z);
}
inline uint32 ranctx::next()
{
inline uint32 ranctx::next() {
uint32 e = a - rot(b, 27);
a = b ^ rot(c, 17);
b = c + d;
@@ -71,30 +66,25 @@ namespace crnlib
return d;
}
void ranctx::seed(uint32 seed)
{
void ranctx::seed(uint32 seed) {
a = 0xf1ea5eed, b = c = d = seed;
for (uint32 i = 0; i < 20; ++i)
next();
}
well512::well512()
{
well512::well512() {
seed(0xDEADBE3F);
}
void well512::seed(uint32 seed[well512::cStateSize])
{
void well512::seed(uint32 seed[well512::cStateSize]) {
memcpy(m_state, seed, sizeof(m_state));
m_index = 0;
}
void well512::seed(uint32 seed)
{
void well512::seed(uint32 seed) {
uint32 jsr = utils::swap32(seed) ^ 0xAAC29377;
for (uint i = 0; i < cStateSize; i++)
{
for (uint i = 0; i < cStateSize; i++) {
SHR3;
seed = bitmix32c(seed);
@@ -103,13 +93,11 @@ namespace crnlib
m_index = 0;
}
void well512::seed(uint32 seed1, uint32 seed2, uint32 seed3)
{
void well512::seed(uint32 seed1, uint32 seed2, uint32 seed3) {
uint32 jsr = seed2;
uint32 jcong = seed3;
for (uint i = 0; i < cStateSize; i++)
{
for (uint i = 0; i < cStateSize; i++) {
SHR3;
seed1 = bitmix32c(seed1);
CONG;
@@ -119,8 +107,7 @@ namespace crnlib
m_index = 0;
}
inline uint32 well512::next()
{
inline uint32 well512::next() {
uint32 a, b, c, d;
a = m_state[m_index];
c = m_state[(m_index + 13) & 15];
@@ -135,18 +122,15 @@ namespace crnlib
return m_state[m_index];
}
random::random()
{
random::random() {
seed(12345, 65435, 34221);
}
random::random(uint32 i)
{
random::random(uint32 i) {
seed(i);
}
void random::seed(uint32 i1, uint32 i2, uint32 i3)
{
void random::seed(uint32 i1, uint32 i2, uint32 i3) {
m_ranctx.seed(i1 ^ i2 ^ i3);
m_kiss99.seed(i1, i2, i3);
@@ -157,43 +141,39 @@ namespace crnlib
urand32();
}
void random::seed(uint32 i)
{
void random::seed(uint32 i) {
uint32 jsr = i;
SHR3; SHR3;
SHR3;
SHR3;
uint32 jcong = utils::swap32(~jsr);
CONG; CONG;
CONG;
CONG;
uint32 i1 = SHR3 ^ CONG;
uint32 i2 = SHR3 ^ CONG;
uint32 i3 = SHR3 + CONG;
seed(i1, i2, i3);
}
uint32 random::urand32()
{
uint32 random::urand32() {
return m_kiss99.next() ^ (m_ranctx.next() + m_well512.next());
}
uint64 random::urand64()
{
uint64 random::urand64() {
uint64 result = urand32();
result <<= 32ULL;
result |= urand32();
return result;
}
uint32 random::fast_urand32()
{
uint32 random::fast_urand32() {
return m_well512.next();
}
uint32 random::bit()
{
uint32 random::bit() {
uint32 k = urand32();
return (k ^ (k >> 6) ^ (k >> 10) ^ (k >> 30)) & 1;
}
double random::drand(double l, double h)
{
double random::drand(double l, double h) {
CRNLIB_ASSERT(l <= h);
if (l >= h)
return l;
@@ -201,8 +181,7 @@ namespace crnlib
return math::clamp(l + (h - l) * (urand32() * cNorm), l, h);
}
float random::frand(float l, float h)
{
float random::frand(float l, float h) {
CRNLIB_ASSERT(l <= h);
if (l >= h)
return l;
@@ -212,8 +191,7 @@ namespace crnlib
return math::clamp<float>(r, l, h);
}
int random::irand(int l, int h)
{
int random::irand(int l, int h) {
CRNLIB_ASSERT(l < h);
if (l >= h)
return l;
@@ -236,8 +214,7 @@ namespace crnlib
return result;
}
int random::irand_inclusive(int l, int h)
{
int random::irand_inclusive(int l, int h) {
CRNLIB_ASSERT(h < cINT32_MAX);
return irand(l, h + 1);
}
@@ -253,8 +230,7 @@ namespace crnlib
The algorithm uses the ratio of uniforms method of A.J. Kinderman
and J.F. Monahan augmented with quadratic bounding curves.
*/
double random::gaussian(double mean, double stddev)
{
double random::gaussian(double mean, double stddev) {
double q, u, v, x, y;
/*
@@ -287,35 +263,29 @@ namespace crnlib
return (mean + stddev * v / u);
}
void random::test()
{
void random::test() {
}
fast_random::fast_random() :
jsr(0xABCD917A),
jcong(0x17F3DEAD)
{
fast_random::fast_random()
: jsr(0xABCD917A),
jcong(0x17F3DEAD) {
}
fast_random::fast_random(const fast_random& other) :
jsr(other.jsr), jcong(other.jcong)
{
fast_random::fast_random(const fast_random& other)
: jsr(other.jsr), jcong(other.jcong) {
}
fast_random::fast_random(uint32 i)
{
fast_random::fast_random(uint32 i) {
seed(i);
}
fast_random& fast_random::operator=(const fast_random& other)
{
fast_random& fast_random::operator=(const fast_random& other) {
jsr = other.jsr;
jcong = other.jcong;
return *this;
}
void fast_random::seed(uint32 i)
{
void fast_random::seed(uint32 i) {
jsr = i;
SHR3;
SHR3;
@@ -325,20 +295,17 @@ namespace crnlib
CONG;
}
uint32 fast_random::urand32()
{
uint32 fast_random::urand32() {
return SHR3 ^ CONG;
}
uint64 fast_random::urand64()
{
uint64 fast_random::urand64() {
uint64 result = urand32();
result <<= 32ULL;
result |= urand32();
return result;
}
int fast_random::irand(int l, int h)
{
int fast_random::irand(int l, int h) {
CRNLIB_ASSERT(l < h);
if (l >= h)
return l;
@@ -361,8 +328,7 @@ namespace crnlib
return result;
}
double fast_random::drand(double l, double h)
{
double fast_random::drand(double l, double h) {
CRNLIB_ASSERT(l <= h);
if (l >= h)
return l;
@@ -370,8 +336,7 @@ namespace crnlib
return math::clamp(l + (h - l) * (urand32() * cNorm), l, h);
}
float fast_random::frand(float l, float h)
{
float fast_random::frand(float l, float h) {
CRNLIB_ASSERT(l <= h);
if (l >= h)
return l;
@@ -382,4 +347,3 @@ namespace crnlib
}
} // namespace crnlib
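Reviewer note: `fast_random::urand32()` returns `SHR3 ^ CONG`, but the two macros are defined outside the lines shown in this diff. The sketch below assumes they follow George Marsaglia's 1999 formulation (a 17/13/5 xorshift on `jsr` and the 69069 linear congruential step on `jcong`); the actual crnlib definitions may differ. `shr3_cong` is a hypothetical struct used only for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for the SHR3/CONG macros, assuming Marsaglia's 1999 forms.
struct shr3_cong {
  uint32_t jsr;
  uint32_t jcong;

  // 3-shift xorshift register; a nonzero state never becomes zero.
  uint32_t shr3() {
    jsr ^= jsr << 17;
    jsr ^= jsr >> 13;
    jsr ^= jsr << 5;
    return jsr;
  }

  // Linear congruential step modulo 2^32 (unsigned wraparound).
  uint32_t cong() {
    jcong = 69069u * jcong + 1234567u;
    return jcong;
  }

  // Same combination as `return SHR3 ^ CONG;` in the diff.
  uint32_t urand32() { return shr3() ^ cong(); }
};
```

Two generators constructed with the same `jsr` and `jcong` values produce identical sequences, which is why the copy constructor and `operator=` in the diff only need to copy those two fields.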

Some files were not shown because too many files have changed in this diff.