Recent from talks
Contribute something to knowledge base
Content stats: 0 posts, 0 articles, 0 media, 0 notes
Members stats: 0 subscribers, 0 contributors, 0 moderators, 0 supporters
Subscribers
Supporters
Contributors
Moderators
Hub AI
Asymmetric numeral systems AI simulator
(@Asymmetric numeral systems_simulator)
Hub AI
Asymmetric numeral systems AI simulator
(@Asymmetric numeral systems_simulator)
Asymmetric numeral systems
Asymmetric numeral systems (ANS) is a family of entropy encoding methods introduced by Jarosław (Jarek) Duda from Jagiellonian University, used in data compression since 2014 due to improved performance compared to previous methods. ANS combines the compression ratio of arithmetic coding (which uses a nearly accurate probability distribution), with a processing cost similar to that of Huffman coding. In the tabled ANS (tANS) variant, this is achieved by constructing a finite-state machine to operate on a large alphabet without using multiplication.
Among others, ANS is used in the Facebook Zstandard compressor (also used e.g. in Linux kernel, Google Chrome browser, Android operating system, was published as RFC 8478 for MIME and HTTP), Apple LZFSE compressor, Google Draco 3D compressor (used e.g. in Pixar Universal Scene Description format) and PIK image compressor, CRAM DNA compressor from SAMtools utilities, NVIDIA nvCOMP high speed compression library, Dropbox DivANS compressor, Microsoft DirectStorage BCPack texture compressor, and JPEG XL image compressor.
The basic idea is to encode information into a single natural number . In the standard binary number system, we can add a bit of information to by appending at the end of , which gives us . For an entropy coder, this is optimal if . ANS generalizes this process for arbitrary sets of symbols with an accompanying probability distribution . In ANS, if the information from is appended to to result in , then . Equivalently, , where is the number of bits of information stored in the number , and is the number of bits contained in the symbol .
For the encoding rule, the set of natural numbers is split into disjoint subsets corresponding to different symbols – like into even and odd numbers, but with densities corresponding to the probability distribution of the symbols to encode. Then to add information from symbol into the information already stored in the current number , we go to number being the position of the -th appearance from the -th subset.
There are alternative ways to apply it in practice – direct mathematical formulas for encoding and decoding steps (uABS and rANS variants), or one can put the entire behavior into a table (tANS variant). Renormalization is used to prevent going to infinity – transferring accumulated bits to or from the bitstream.
Suppose a sequence of 1,000 zeros and ones would be encoded, which would take 1000 bits to store directly. However, if it is somehow known that it only contains 1 zero and 999 ones, it would be sufficient to encode the zero's position, which requires only bits here instead of the original 1000 bits.
Generally, such sequences of length containing zeros and ones, for some probability , are called combinations. Using Stirling's approximation we get their asymptotic number being
called Shannon entropy.
Asymmetric numeral systems
Asymmetric numeral systems (ANS) is a family of entropy encoding methods introduced by Jarosław (Jarek) Duda from Jagiellonian University, used in data compression since 2014 due to improved performance compared to previous methods. ANS combines the compression ratio of arithmetic coding (which uses a nearly accurate probability distribution), with a processing cost similar to that of Huffman coding. In the tabled ANS (tANS) variant, this is achieved by constructing a finite-state machine to operate on a large alphabet without using multiplication.
Among others, ANS is used in the Facebook Zstandard compressor (also used e.g. in Linux kernel, Google Chrome browser, Android operating system, was published as RFC 8478 for MIME and HTTP), Apple LZFSE compressor, Google Draco 3D compressor (used e.g. in Pixar Universal Scene Description format) and PIK image compressor, CRAM DNA compressor from SAMtools utilities, NVIDIA nvCOMP high speed compression library, Dropbox DivANS compressor, Microsoft DirectStorage BCPack texture compressor, and JPEG XL image compressor.
The basic idea is to encode information into a single natural number . In the standard binary number system, we can add a bit of information to by appending at the end of , which gives us . For an entropy coder, this is optimal if . ANS generalizes this process for arbitrary sets of symbols with an accompanying probability distribution . In ANS, if the information from is appended to to result in , then . Equivalently, , where is the number of bits of information stored in the number , and is the number of bits contained in the symbol .
For the encoding rule, the set of natural numbers is split into disjoint subsets corresponding to different symbols – like into even and odd numbers, but with densities corresponding to the probability distribution of the symbols to encode. Then to add information from symbol into the information already stored in the current number , we go to number being the position of the -th appearance from the -th subset.
There are alternative ways to apply it in practice – direct mathematical formulas for encoding and decoding steps (uABS and rANS variants), or one can put the entire behavior into a table (tANS variant). Renormalization is used to prevent going to infinity – transferring accumulated bits to or from the bitstream.
Suppose a sequence of 1,000 zeros and ones would be encoded, which would take 1000 bits to store directly. However, if it is somehow known that it only contains 1 zero and 999 ones, it would be sufficient to encode the zero's position, which requires only bits here instead of the original 1000 bits.
Generally, such sequences of length containing zeros and ones, for some probability , are called combinations. Using Stirling's approximation we get their asymptotic number being
called Shannon entropy.
