Hamming weight
The Hamming weight of a string is the number of symbols that are different from the zero-symbol of the alphabet used. It is thus equivalent to the Hamming distance from the all-zero string of the same length. For the most typical case, a given set of bits, this is the number of bits set to 1, or the digit sum of the binary representation of a given number and the ℓ₁ norm of a bit vector. In this binary case, it is also called the population count,[1] popcount, sideways sum,[2] or bit summation.[3]
| String | Hamming weight |
|---|---|
| 11101 | 4 |
| 11101000 | 4 |
| 00000000 | 0 |
| 678012340567 | 10 |

History and usage
The Hamming weight is named after the American mathematician Richard Hamming, although he did not originate the notion.[5] The Hamming weight of binary numbers was already used in 1899 by James W. L. Glaisher to give a formula for the number of odd binomial coefficients in a single row of Pascal's triangle.[6] Irving S. Reed introduced a concept, equivalent to Hamming weight in the binary case, in 1954.[7]
Hamming weight is used in several disciplines including information theory, coding theory, and cryptography. Examples of applications of the Hamming weight include:
- In modular exponentiation by squaring, the number of modular multiplications required for an exponent e is log2 e + weight(e). This is the reason that the public key value e used in RSA is typically chosen to be a number of low Hamming weight.[8]
- The Hamming weight determines path lengths between nodes in Chord distributed hash tables.[9]
- IrisCode lookups in biometric databases are typically implemented by calculating the Hamming distance to each stored record.[10]
- In computer chess programs using a bitboard representation, the Hamming weight of a bitboard gives the number of pieces of a given type remaining in the game, or the number of squares of the board controlled by one player's pieces, and is therefore an important contributing term to the value of a position.[11]
- Hamming weight can be used to efficiently compute find first set using the identity ffs(x) = pop(x ^ (x - 1)), as illustrated in the sketch following this list. This is useful on platforms such as SPARC that have hardware Hamming weight instructions but no hardware find first set instruction.[12][1]
- The Hamming weight operation can be interpreted as a conversion from the unary numeral system to binary numbers.[13]
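To make the find-first-set identity above concrete, here is a small self-contained C sketch. The helper names pop and ffs64 are illustrative, not taken from the references, and the popcount loop is the Wegner/Kernighan method described later in this article.

#include <stdint.h>
#include <stdio.h>

static int pop(uint64_t x)
{
    int count = 0;
    for (; x; count++)
        x &= x - 1;               // clear the lowest set bit each iteration
    return count;
}

static int ffs64(uint64_t x)       // returns the 1-based index of the lowest set bit, 0 if x == 0
{
    return x ? pop(x ^ (x - 1)) : 0;
}

int main(void)
{
    printf("%d\n", ffs64(0x50));   // 0101 0000: lowest set bit is bit 5, prints 5
    printf("%d\n", ffs64(1));      // prints 1
    return 0;
}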
Efficient implementation
The population count of a bitstring is often needed in cryptography and other applications. The Hamming distance of two words A and B can be calculated as the Hamming weight of A xor B.[1]
The problem of how to implement it efficiently has been widely studied. A single operation for the calculation, or parallel operations on bit vectors are available on some processors. For processors lacking those features, the best solutions known are based on adding counts in a tree pattern. For example, to count the number of 1 bits in the 16-bit binary number a = 0110 1100 1011 1010, these operations can be done:
| Expression | Binary | Decimal | Comment |
|---|---|---|---|
| a | 01 10 11 00 10 11 10 10 | 27834 | The original number |
| b0 = (a >> 0) & 01 01 01 01 01 01 01 01 | 01 00 01 00 00 01 00 00 | 1, 0, 1, 0, 0, 1, 0, 0 | Every other bit from a |
| b1 = (a >> 1) & 01 01 01 01 01 01 01 01 | 00 01 01 00 01 01 01 01 | 0, 1, 1, 0, 1, 1, 1, 1 | The remaining bits from a |
| c = b0 + b1 | 01 01 10 00 01 10 01 01 | 1, 1, 2, 0, 1, 2, 1, 1 | Count of 1s in each 2-bit slice of a |
| d0 = (c >> 0) & 0011 0011 0011 0011 | 0001 0000 0010 0001 | 1, 0, 2, 1 | Every other count from c |
| d2 = (c >> 2) & 0011 0011 0011 0011 | 0001 0010 0001 0001 | 1, 2, 1, 1 | The remaining counts from c |
| e = d0 + d2 | 0010 0010 0011 0010 | 2, 2, 3, 2 | Count of 1s in each 4-bit slice of a |
| f0 = (e >> 0) & 00001111 00001111 | 00000010 00000010 | 2, 2 | Every other count from e |
| f4 = (e >> 4) & 00001111 00001111 | 00000010 00000011 | 2, 3 | The remaining counts from e |
| g = f0 + f4 | 00000100 00000101 | 4, 5 | Count of 1s in each 8-bit slice of a |
| h0 = (g >> 0) & 0000000011111111 | 0000000000000101 | 5 | Every other count from g |
| h8 = (g >> 8) & 0000000011111111 | 0000000000000100 | 4 | The remaining counts from g |
| i = h0 + h8 | 0000000000001001 | 9 | Count of 1s in the entire 16-bit word |
Here, the operations are as in the C programming language, so X >> Y means to shift X right by Y bits, X & Y means the bitwise AND of X and Y, and + is ordinary addition. The best algorithms known for this problem are based on the concept illustrated above and are given here:[1]
//types and constants used in the functions below
//uint64_t is an unsigned 64-bit integer variable type (defined in C99 version of C language)
const uint64_t m1 = 0x5555555555555555; //binary: 0101...
const uint64_t m2 = 0x3333333333333333; //binary: 00110011..
const uint64_t m4 = 0x0f0f0f0f0f0f0f0f; //binary: 4 zeros, 4 ones ...
const uint64_t m8 = 0x00ff00ff00ff00ff; //binary: 8 zeros, 8 ones ...
const uint64_t m16 = 0x0000ffff0000ffff; //binary: 16 zeros, 16 ones ...
const uint64_t m32 = 0x00000000ffffffff; //binary: 32 zeros, 32 ones
const uint64_t h01 = 0x0101010101010101; //the sum of 256 to the power of 0,1,2,3...
//This is a naive implementation, shown for comparison,
//and to help in understanding the better functions.
//This algorithm uses 24 arithmetic operations (shift, add, and).
int popcount64a(uint64_t x)
{
x = (x & m1 ) + ((x >> 1) & m1 ); //put count of each 2 bits into those 2 bits
x = (x & m2 ) + ((x >> 2) & m2 ); //put count of each 4 bits into those 4 bits
x = (x & m4 ) + ((x >> 4) & m4 ); //put count of each 8 bits into those 8 bits
x = (x & m8 ) + ((x >> 8) & m8 ); //put count of each 16 bits into those 16 bits
x = (x & m16) + ((x >> 16) & m16); //put count of each 32 bits into those 32 bits
x = (x & m32) + ((x >> 32) & m32); //put count of each 64 bits into those 64 bits
return x;
}
//This uses fewer arithmetic operations than any other known
//implementation on machines with slow multiplication.
//This algorithm uses 17 arithmetic operations.
int popcount64b(uint64_t x)
{
x -= (x >> 1) & m1; //put count of each 2 bits into those 2 bits
x = (x & m2) + ((x >> 2) & m2); //put count of each 4 bits into those 4 bits
x = (x + (x >> 4)) & m4; //put count of each 8 bits into those 8 bits
x += x >> 8; //put count of each 16 bits into their lowest 8 bits
x += x >> 16; //put count of each 32 bits into their lowest 8 bits
x += x >> 32; //put count of each 64 bits into their lowest 8 bits
return x & 0x7f;
}
//This uses fewer arithmetic operations than any other known
//implementation on machines with fast multiplication.
//This algorithm uses 12 arithmetic operations, one of which is a multiply.
int popcount64c(uint64_t x)
{
x -= (x >> 1) & m1; //put count of each 2 bits into those 2 bits
x = (x & m2) + ((x >> 2) & m2); //put count of each 4 bits into those 4 bits
x = (x + (x >> 4)) & m4; //put count of each 8 bits into those 8 bits
return (x * h01) >> 56; //returns left 8 bits of x + (x<<8) + (x<<16) + (x<<24) + ...
}
The above implementations have the best worst-case behavior of any known algorithm. However, when a value is expected to have few nonzero bits, it may instead be more efficient to use algorithms that count these bits one at a time. As Wegner described in 1960,[14] the bitwise AND of x with x − 1 differs from x only in zeroing out the least significant nonzero bit: subtracting 1 changes the rightmost string of 0s to 1s, and changes the rightmost 1 to a 0. If x originally had n bits that were 1, then after only n iterations of this operation, x will be reduced to zero. The following implementation is based on this principle.
//This is better when most bits in x are 0
//This algorithm works the same for all data sizes.
//This algorithm uses 3 arithmetic operations and 1 comparison/branch per "1" bit in x.
int popcount64d(uint64_t x)
{
int count;
for (count=0; x; count++)
x &= x - 1;
return count;
}
Of interest here is the close relationship between popcount, find first set (FFS), and count leading zeros (CLZ).
If greater memory usage is allowed, we can calculate the Hamming weight faster than the above methods. With unlimited memory, we could simply create a large lookup table of the Hamming weight of every 64-bit integer. If we can store a lookup table of the Hamming weight of every 16-bit integer, we can do the following to compute the Hamming weight of every 32-bit integer.
static uint8_t wordbits[65536] = { /* bitcounts of integers 0 through 65535, inclusive */ };
//This algorithm uses 3 arithmetic operations and 2 memory reads.
int popcount32e(uint32_t x)
{
return wordbits[x & 0xFFFF] + wordbits[x >> 16];
}
//Optionally, the wordbits[] table could be filled using this function
void popcount32e_init(void)
{
uint32_t i;
uint16_t x;
int count;
for (i=0; i <= 0xFFFF; i++)
{
x = i;
for (count=0; x; count++) // borrowed from popcount64d() above
x &= x - 1;
wordbits[i] = count;
}
}
An algorithm based on the recurrence weight(i) = weight(i / 2) + (i mod 2) is given in Donovan & Kernighan:[15]
/* The weight of i can differ from the weight of i / 2
   only in the least significant bit of i */
void popcount32e_init(void)
{
    unsigned i;
    for (i = 1; i < sizeof wordbits / sizeof *wordbits; ++i)
        wordbits[i] = wordbits[i >> 1] + (i & 1);
}
Muła et al.[16] have shown that a vectorized version of popcount64b can run faster than dedicated instructions (e.g., popcnt on x64 processors).
The Harley–Seal algorithm[17] is one of the fastest that also only needs integer operations.[18]
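The core of Harley–Seal is a carry-save adder (CSA) that combines three words into a "sum" word of weight-1 bits and a "carry" word of weight-2 bits, so that full popcounts are needed only rarely. The following is a minimal sketch of that idea, not the full algorithm from the cited implementation: real versions process more words per iteration and use SIMD. The helper names csa and popcount_csa are illustrative, and popcount64c refers to the routine defined earlier in this section.

#include <stdint.h>
#include <stddef.h>

static void csa(uint64_t *h, uint64_t *l, uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t u = a ^ b;
    *h = (a & b) | (u & c);   // carry bits: each set bit here represents a count of 2
    *l = u ^ c;               // sum bits: each set bit here represents a count of 1
}

int popcount_csa(const uint64_t *data, size_t n)
{
    uint64_t ones = 0, twos = 0;
    int total = 0;
    size_t i;
    for (i = 0; i + 4 <= n; i += 4) {
        uint64_t twosA, twosB, fours;
        csa(&twosA, &ones, ones, data[i],     data[i + 1]);
        csa(&twosB, &ones, ones, data[i + 2], data[i + 3]);
        csa(&fours, &twos, twos, twosA, twosB);
        total += 4 * popcount64c(fours);      // each bit in 'fours' stands for 4 ones
    }
    for (; i < n; i++)                        // leftover words, counted directly
        total += popcount64c(data[i]);
    return total + 2 * popcount64c(twos) + popcount64c(ones);
}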
Minimum weight
In error-correcting coding, the minimum Hamming weight, commonly referred to as the minimum weight wmin of a code, is the weight of the lowest-weight nonzero code word. The weight w of a code word is the number of 1s in the word. For example, the word 11001010 has a weight of 4.
In a linear block code the minimum weight equals the minimum Hamming distance (dmin) and determines the error-correction capability of the code: a code with minimum distance dmin can detect up to dmin − 1 errors and correct up to ⌊(dmin − 1)/2⌋ errors.[19]
Language support
Some C compilers provide intrinsic functions that provide bit counting facilities. For example, GCC (since version 3.4 in April 2004) includes a builtin function __builtin_popcount that will use a processor instruction if available or an efficient library implementation otherwise.[20] LLVM-GCC has included this function since version 1.5 in June 2005.[21]
In the C++ Standard Library, the bit-array data structure bitset has a count() method that counts the number of bits that are set. In C++20, a new header <bit> was added, containing functions std::popcount and std::has_single_bit, taking arguments of unsigned integer types.
In Java, the growable bit-array data structure BitSet has a BitSet.cardinality() method that counts the number of bits that are set. In addition, there are Integer.bitCount(int) and Long.bitCount(long) functions to count bits in primitive 32-bit and 64-bit integers, respectively. Also, the BigInteger arbitrary-precision integer class also has a BigInteger.bitCount() method that counts bits.
In Python, the int type has a bit_count() method to count the number of bits set. This functionality was introduced in Python 3.10, released in October 2021.[22]
In Common Lisp, the function logcount, given a non-negative integer, returns the number of 1 bits. (For negative integers it returns the number of 0 bits in 2's complement notation.) In either case the integer can be a BIGNUM.
Starting in GHC 7.4, the Haskell base package has a popCount function available on all types that are instances of the Bits class (available from the Data.Bits module).[23]
MySQL's dialect of SQL provides BIT_COUNT() as a built-in function.[24]
Fortran 2008 has the standard, intrinsic, elemental function popcnt returning the number of nonzero bits within an integer (or integer array).[25]
Some programmable scientific pocket calculators feature special commands to calculate the number of set bits, e.g. #B on the HP-16C.[3]
Free Pascal has implemented popcnt since version 3.0.[26]
Processor support
- The IBM STRETCH computer in the 1960s calculated the number of set bits as well as the number of leading zeros as a by-product of all logical operations.[1]
- Cray supercomputers early on featured a population count machine instruction, rumoured to have been specifically requested by the U.S. government National Security Agency for cryptanalysis applications.[1]
- Control Data Corporation's (CDC) 6000 and Cyber 70/170 series machines included a population count instruction; in COMPASS, this instruction was coded as CXi.
- The 64-bit SPARC version 9 architecture defines a POPC instruction,[12][1] but most implementations do not implement it, requiring it be emulated by the operating system.[27]
- Donald Knuth's model computer MMIX, which is to replace MIX in his book The Art of Computer Programming, has had an SADD instruction since 1999. SADD a,b,c counts all bits that are 1 in b and 0 in c and writes the result to a.
- Compaq's Alpha 21264A, released in 1999, was the first Alpha-series CPU design that had the count extension (CIX).
- Analog Devices' Blackfin processors feature the ONES instruction to perform a 32-bit population count.[28]
- AMD's Barcelona architecture introduced the advanced bit manipulation (ABM) ISA, introducing the POPCNT instruction as part of the SSE4a extensions in 2007.
- Intel Core processors introduced a POPCNT instruction with the SSE4.2 instruction set extension, first available in a Nehalem-based Core i7 processor, released in November 2008.
- The ARM architecture introduced the VCNT instruction as part of the Advanced SIMD (NEON) extensions.
- The RISC-V architecture introduced the CPOP instruction as part of the Bit Manipulation (B) extension.[29]
References
[edit]- ^ a b c d e f g Warren Jr., Henry S. (2013) [2002]. Hacker's Delight (2 ed.). Addison Wesley - Pearson Education, Inc. pp. 81–96. ISBN 978-0-321-84268-8. 0-321-84268-5.
- ^ Knuth, Donald Ervin (2009). "Bitwise tricks & techniques; Binary Decision Diagrams". The Art of Computer Programming. Vol. 4, Fascicle 1. Addison–Wesley Professional. ISBN 978-0-321-58050-4. (NB. Draft of Fascicle 1b Archived 2016-03-12 at the Wayback Machine available for download.)
- ^ a b Hewlett-Packard HP-16C Computer Scientist Owner's Handbook (PDF). Hewlett-Packard Company. April 1982. 00016-90001. Archived (PDF) from the original on 2017-03-28. Retrieved 2017-03-28.
- ^ R.Ugalde, Laurence. "Population count the Fōrmulæ programming language". Fōrmulæ. Retrieved 2024-06-02.
- ^ Thompson, Thomas M. (1983). From Error-Correcting Codes through Sphere Packings to Simple Groups. The Carus Mathematical Monographs #21. The Mathematical Association of America. p. 33.
- ^ Glaisher, James Whitbread Lee (1899). "On the residue of a binomial-theorem coefficient with respect to a prime modulus". The Quarterly Journal of Pure and Applied Mathematics. 30: 150–156. (NB. See in particular the final paragraph of p. 156.)
- ^ Reed, Irving Stoy (1954). "A Class of Multiple-Error-Correcting Codes and the Decoding Scheme". IRE Professional Group on Information Theory. PGIT-4. Institute of Radio Engineers (IRE): 38–49.
- ^ Cohen, Gérard D.; Lobstein, Antoine; Naccache, David; Zémor, Gilles (1998). "How to improve an exponentiation black-box". In Nyberg, Kaisa (ed.). Advances in Cryptology – EUROCRYPT '98, International Conference on the Theory and Application of Cryptographic Techniques, Espoo, Finland, May 31 – June 4, 1998, Proceeding. Lecture Notes in Computer Science. Vol. 1403. Springer. pp. 211–220. doi:10.1007/BFb0054128. ISBN 978-3-540-64518-4.
- ^ Stoica, I.; Morris, R.; Liben-Nowell, D.; Karger, D. R.; Kaashoek, M. F.; Dabek, F.; Balakrishnan, H. (February 2003). "Chord: a scalable peer-to-peer lookup protocol for internet applications". IEEE/ACM Transactions on Networking. 11 (1): 17–32. Bibcode:2003ITNet..11...17S. doi:10.1109/TNET.2002.808407. S2CID 221276912.
Section 6.3: "In general, the number of fingers we need to follow will be the number of ones in the binary representation of the distance from node to query."
- ^ Kong, A.W.K.; Zhang, D.; Kamel, M.S. (February 2010). "An analysis of IrisCode". IEEE Transactions on Image Processing. 19 (2): 522–532. Bibcode:2010ITIP...19..522K. doi:10.1109/tip.2009.2033427. PMID 20083454.
- ^ Heinz, E.A. (September 1997). "How Darkthought plays chess". ICGA Journal. 20 (3): 166–176. doi:10.3233/icg-1997-20304. Updated and reprinted in Scalable Search in Computer Chess (Vieweg+Teubner Verlag, 2000), pp. 185–198, doi:10.1007/978-3-322-90178-1_13
- ^ a b SPARC International, Inc. (1992). "A.41: Population Count. Programming Note". The SPARC architecture manual: version 9 (Version 9 ed.). Englewood Cliffs, New Jersey, USA: Prentice Hall. pp. 205. ISBN 0-13-825001-4.
- ^ Blaxell, David (1978). Hogben, David; Fife, Dennis W. (eds.). "Record linkage by bit pattern matching". Computer Science and Statistics--Tenth Annual Symposium on the Interface. NBS Special Publication. 503. U.S. Department of Commerce / National Bureau of Standards: 146–156.
- ^ Wegner, Peter (May 1960). "A technique for counting ones in a binary computer". Communications of the ACM. 3 (5): 322. doi:10.1145/367236.367286. S2CID 31683715.
- ^ Donovan, Alan; Kernighan, Brian (2016). The Go Programming Language. Addison-Weseley. ISBN 978-0-13-419044-0.
- ^ Muła, Wojciech; Kurz, Nathan; Lemire, Daniel (January 2018). "Faster Population Counts Using AVX2 Instructions". Computer Journal. 61 (1): 111–120. arXiv:1611.07612. doi:10.1093/comjnl/bxx046. S2CID 540973.
- ^ "Sse-popcount/Popcnt-harley-seal.CPP at master · WojciechMula/Sse-popcount". GitHub.
- ^ Muła, Wojciech; Kurz, Nathan; Lemire, Daniel (2018). "Faster Population Counts Using AVX2 Instructions". The Computer Journal. 61: 111–120. arXiv:1611.07612. doi:10.1093/comjnl/bxx046.
- ^ Stern & Mahmoud, Communications System Design, Prentice Hall, 2004, p 477ff.
- ^ "GCC 3.4 Release Notes". GNU Project.
- ^ "LLVM 1.5 Release Notes". LLVM Project.
- ^ "What's New In Python 3.10". python.org.
- ^ "GHC 7.4.1 release notes". GHC documentation.
- ^ "Chapter 12.11. Bit Functions — MySQL 5.0 Reference Manual".
- ^ Metcalf, Michael; Reid, John; Cohen, Malcolm (2011). Modern Fortran Explained. Oxford University Press. p. 380. ISBN 978-0-19-960142-4.
- ^ "Free Pascal documentation popcnt". Retrieved 2019-12-07.
- ^ "JDK-6378821: bitCount() should use POPC on SPARC processors and AMD+10h". Java bug database. 2006-01-30.
- ^ Blackfin Instruction Set Reference (Preliminary ed.). Analog Devices. 2001. pp. 8–24. Part Number 82-000410-14.
- ^ Wolf, Claire (2019-03-22). "RISC-V "B" Bit Manipulation Extension for RISC-V, Draft v0.37" (PDF). Github.
Further reading
- Schroeppel, Richard C.; Orman, Hilarie K. (1972-02-29). "compilation". HAKMEM. By Beeler, Michael; Gosper, Ralph William; Schroeppel, Richard C. (report). Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. MIT AI Memo 239. (Item 169: Population count assembly code for the PDP/6-10.)
External links
- Aggregate Magic Algorithms. Optimized population count and other algorithms explained with sample code.
- Bit Twiddling Hacks. Several algorithms with code for counting bits set.
- Necessary and Sufficient, by Damien Wintour. Has code in C# for various Hamming weight implementations.
- Best algorithm to count the number of set bits in a 32-bit integer? (Stack Overflow)
Hamming weight
Fundamentals
Definition
The Hamming weight of a string over a finite alphabet is defined as the number of positions in the string whose symbols differ from the zero symbol of the alphabet.[8] In the binary case, which is the most common application, this corresponds to the number of 1s in the binary string.[5] This concept originated in the context of error-detecting and error-correcting codes, where it measures the nonzero components in codewords.[3] For a nonnegative integer n represented in binary as n = b_{k−1}…b_1b_0 with each b_i ∈ {0, 1}, the Hamming weight, denoted w(n), is given by w(n) = b_0 + b_1 + … + b_{k−1}.[8] It is also known as the population count or popcount in computer science contexts, referring to the same count of set bits. Unlike the bit length, which is the total number of bits required to represent n, the Hamming weight specifically counts only the 1s and ignores leading zeros.[5] Basic examples illustrate this: w(0) = 0, since its binary representation has no 1s; w(1) = 1 for binary 1; and w(13) = 3 for binary 1101, counting the three 1s.[8] For binary strings the same rule applies, e.g. w(01101) = 3, again counting the 1s regardless of length.

Relation to Hamming Distance
The Hamming distance between two equal-length binary strings x and y is the number of positions at which their corresponding bits differ, and it is equivalent to the Hamming weight of their bitwise exclusive-or (XOR), expressed as d(x, y) = w(x ⊕ y).[9] This relation arises because the XOR operation produces a binary string where each bit is 1 precisely when the bits in x and y differ at that position, and 0 otherwise. Thus, the Hamming weight of x ⊕ y, which counts the number of 1s, directly measures the symmetric difference in the bit positions between x and y. For instance, consider the binary strings 1011 and 1101; their XOR is 0110, which has Hamming weight 2, so d(1011, 1101) = 2. Similarly, for the decimal integers 5 (binary 101) and 3 (binary 011), the XOR is 110 (decimal 6), with Hamming weight 2, yielding d(5, 3) = 2.[9] This binary relation generalizes to q-ary strings over an alphabet of size q, where the Hamming distance counts the positions in which the symbols differ, analogous to the Hamming weight of the difference vector (componentwise in the underlying group structure, such as a finite field F_q), which counts the non-zero symbols.[10]
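As a concrete illustration, the following minimal C sketch computes the distance as the weight of the XOR. The helper name hamming_distance is illustrative, and the __builtin_popcountll intrinsic is a GCC/Clang extension assumed here for brevity.

#include <stdint.h>
#include <stdio.h>

static int hamming_distance(uint64_t a, uint64_t b)
{
    return __builtin_popcountll(a ^ b);       /* distance = weight of the XOR */
}

int main(void)
{
    printf("%d\n", hamming_distance(5, 3));   /* 101 XOR 011 = 110, prints 2 */
    return 0;
}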
Properties

Key Mathematical Properties
The Hamming weight function, denoted w(x) for a binary vector x, satisfies the subadditivity property under componentwise addition modulo 2 (equivalent to bitwise XOR): w(x ⊕ y) ≤ w(x) + w(y).[11] Equality holds if and only if the supports of x and y are disjoint, meaning there is no position where both have a 1 (i.e., x ∧ y = 0).[11] This follows from the fact that overlapping 1s in x and y cancel under XOR, reducing the weight of the sum compared to the separate weights. A related identity is w(x) + w(y) = w(x ⊕ y) + 2w(x ∧ y), which decomposes the weights into symmetric difference and intersection components.[8] Under bitwise operations, the Hamming weight exhibits monotonicity. Specifically, w(x ∧ y) ≤ w(x) and w(x ∧ y) ≤ w(y), since AND can only turn 1s to 0s without adding new ones.[8] Similarly, w(x ∨ y) ≥ w(x) and w(x ∨ y) ≥ w(y), reflecting the union of supports.[8] For the bitwise complement of an n-bit string, w(x̄) = n − w(x), as every 0 becomes 1 and vice versa.[8] For any x ∈ {0, 1}ⁿ, the bounds are 0 ≤ w(x) ≤ n, with the minimum achieved at the all-zero vector and the maximum at the all-one vector.[1] In a random binary string of length n where each bit is independently 1 with probability p, the expected Hamming weight is np.[8] Combinatorially, the Hamming weight connects to binomial coefficients: the number of binary strings of length n with exact weight k is C(n, k), representing the ways to choose the k positions for the 1s.[8] This distribution underlies volumes of Hamming balls in coding theory, where the size of a ball of radius r around a center is Σ_{i=0}^{r} C(n, i).
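These identities are easy to check numerically. The following sketch verifies them for random 32-bit words; the __builtin_popcount intrinsic is a GCC/Clang extension and the assumption of a 32-bit unsigned int are not part of the statements above.

#include <assert.h>
#include <stdlib.h>

int main(void)
{
    for (int i = 0; i < 100000; i++) {
        unsigned x = (unsigned)rand(), y = (unsigned)rand();
        int wx = __builtin_popcount(x), wy = __builtin_popcount(y);
        assert(__builtin_popcount(x ^ y) <= wx + wy);                                  /* subadditivity */
        assert(wx + wy == __builtin_popcount(x ^ y) + 2 * __builtin_popcount(x & y));  /* decomposition */
        assert(__builtin_popcount(~x) == 32 - wx);                                     /* complement, 32-bit words */
    }
    return 0;
}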
Minimum Weight in Error-Correcting Codes

In linear error-correcting codes over finite fields, the minimum weight of a code C is defined as the smallest Hamming weight among all nonzero codewords, that is, w_min = min{w(c) : c ∈ C, c ≠ 0}.[12] For linear codes, this minimum weight coincides with the minimum Hamming distance between any two distinct codewords, providing a key parameter for assessing code reliability.[8] The value of w_min fundamentally governs the performance of the code in noisy channels: it enables detection of up to w_min − 1 errors in a received word, as any such error pattern cannot transform one codeword into another.[13] Additionally, the code can correct up to ⌊(w_min − 1)/2⌋ errors through nearest-neighbor decoding, since spheres of that radius around codewords are disjoint.[14] A classic example is the binary Hamming code of length 7 and dimension 4, which achieves w_min = 3 and thus corrects single errors while detecting up to two.[15] The weight distribution of this code, which enumerates the number of codewords of each weight w, is given in the following table:

| Weight | Number of codewords |
|---|---|
| 0 | 1 |
| 3 | 7 |
| 4 | 7 |
| 7 | 1 |
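The distribution above can be checked directly by enumerating all 16 codewords of a systematic [7,4] Hamming code. The following sketch is not taken from the cited sources: the generator rows are one common choice (other equivalent matrices give the same distribution), and the GCC/Clang __builtin_popcount intrinsic is used to tally weights.

#include <stdio.h>

int main(void)
{
    /* Rows of a systematic [7,4] Hamming generator matrix as 7-bit masks
       (four data bits followed by three parity bits). */
    const unsigned G[4] = { 0x46 /* 1000110 */, 0x25 /* 0100101 */,
                            0x13 /* 0010011 */, 0x0F /* 0001111 */ };
    int dist[8] = { 0 };                       /* dist[w] = number of codewords of weight w */

    for (unsigned m = 0; m < 16; m++) {        /* every 4-bit message */
        unsigned c = 0;
        for (int i = 0; i < 4; i++)
            if (m & (1u << i))
                c ^= G[i];                     /* codeword = GF(2) combination of rows */
        dist[__builtin_popcount(c)]++;         /* Hamming weight of the codeword */
    }
    for (int w = 0; w <= 7; w++)
        if (dist[w])
            printf("weight %d: %d codewords\n", w, dist[w]);
    return 0;                                  /* prints counts 1, 7, 7, 1 for weights 0, 3, 4, 7 */
}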
Applications
In Coding Theory
In coding theory, the Hamming weight is fundamental to constructing linear error-correcting codes, as the minimum distance d of a code equals the minimum Hamming weight of its nonzero codewords, directly dictating the code's ability to correct up to ⌊(d − 1)/2⌋ errors.[15] For binary linear codes, the generator matrix G spans the codewords, and designs aim to maximize the minimum weight by selecting rows whose linear combinations avoid low-weight outputs; equivalently, the parity-check matrix H is engineered so that every set of d − 1 columns is linearly independent, ensuring no nonzero codeword of weight less than d exists.[8] Syndrome decoding algorithms exploit the Hamming weight to identify error patterns in linear codes, computing the syndrome s = Hr^T for a received vector r and seeking the minimum-weight error vector e (the coset leader) such that He^T = s, assuming low-weight errors are most probable.[14] This approach is efficient for bounded-distance decoding, where errors up to a certain weight are corrected by table lookup or algebraic search, and the weight constraint limits the search space to plausible error configurations. A prime example is the binary [7,4] Hamming code, a linear code that corrects single-bit errors using a parity-check matrix whose columns are all nonzero binary vectors of length 3.[3] For a single error in position i, the syndrome matches the i-th column of H, identifying and correcting the error by flipping the bit at position i (a small sketch of this procedure follows this paragraph); this weight-based mechanism reliably corrects only weight-1 patterns, as higher-weight errors alias to incorrect low-weight coset leaders, and an extended overall parity bit is needed to additionally detect double errors. In BCH codes, extensions of Hamming codes for multiple-error correction, the Hamming weight enumerator A(x) = Σ_i A_i x^i (where A_i is the number of codewords of weight i) quantifies the distribution of weights, enabling precise bounds on decoding error probability and code performance analysis via combinatorial identities.[20] For primitive binary BCH codes of length n = 2^m − 1 designed to correct t errors, the weight enumerator reveals that the minimum weight equals the designed distance 2t + 1, with explicit formulas for the A_i derived from cyclotomic cosets. In modern low-density parity-check (LDPC) codes, defined by sparse parity-check matrices, the Hamming weight informs iterative decoding performance, particularly in belief propagation algorithms where low-weight codewords or pseudo-codewords can trap decoding in local minima, causing error floors at high signal-to-noise ratios.[21] Gallager's analysis shows that regular LDPC codes with fixed column weight achieve near-Shannon-limit performance asymptotically, but practical irregular designs optimize degree distributions to elevate the minimum weight and suppress low-weight cycles, enhancing convergence speed and reliability in iterative message-passing.[21]
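The sketch below illustrates single-error correction for a (7,4) Hamming code under the classic column ordering in which column i of H is the binary representation of i (so positions 1, 2, and 4 carry parity bits). The helper name syndrome and the use of the all-zero codeword are illustrative simplifications, not the exact construction in the references.

#include <stdio.h>

static unsigned syndrome(unsigned r)            /* r carries bits r1..r7 in bit positions 1..7 */
{
    unsigned s = 0;
    for (unsigned i = 1; i <= 7; i++)
        if (r & (1u << i))
            s ^= i;                             /* XOR of the positions of the set bits */
    return s;                                   /* zero for a valid codeword */
}

int main(void)
{
    unsigned codeword = 0;                      /* the all-zero codeword is valid in any linear code */
    unsigned received = codeword ^ (1u << 5);   /* simulate a single-bit channel error at position 5 */
    unsigned s = syndrome(received);
    if (s)
        received ^= 1u << s;                    /* the syndrome names the erroneous position */
    printf("syndrome = %u, corrected: %s\n", s, received == codeword ? "yes" : "no");
    return 0;
}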
In computer science, the Hamming weight plays a key role in algorithmic designs that require efficient bit counting and comparison, such as in parallel computing architectures. For instance, parallel counters enable the comparison of Hamming weights between binary vectors by accumulating counts of set bits in logarithmic time, achieving O(log n) latency and O(n) complexity for n-bit vectors, which outperforms earlier methods based on sorting networks that incurred O(log² n) delay. These counters merge counting and thresholding operations, allowing applications like fixed-threshold checks (e.g., whether the weight exceeds k) or pairwise comparisons, and are particularly useful in resource-constrained environments due to their reduced hardware overhead. In neural processing engines, Hamming weight compression facilitates data-efficient parallel computations for convolutions in deep learning; by converting multiplications into sequences of bit-count compressions followed by additions, systems like NESTA process batches of inputs with up to 58.9% power savings and 23.7% delay reduction compared to traditional multiply-accumulate units, leveraging deferred residual handling to avoid carry propagation in parallel pipelines.[22][23] In cryptography, Hamming weight is central to cryptanalytic techniques and post-quantum secure systems. Linear cryptanalysis exploits low-Hamming-weight linear approximations of S-boxes to construct trails with high correlation; for example, in the PRESENT block cipher, single-bit trails—where input and output masks have Hamming weight one—enable efficient key recovery by concatenating approximations across rounds, yielding a bias of approximately 2^{-8.42} over six rounds due to the permutation layer preserving single-bit activity. This approach identifies vulnerabilities in substitution-permutation networks by focusing on approximations with minimal active bits, such as those with correlation ±2^{-2} for masks of weights 1 or 2. The McEliece cryptosystem, a code-based primitive resistant to quantum attacks, bases its security on the hardness of decoding general linear codes with low-Hamming-weight errors; specifically, recovering an error vector e of weight t (e.g., t=119 for n=6960) from the syndrome H·e^T = s^T is NP-hard, with information-set decoding attacks requiring at least 2^{128} operations for standard parameters, ensuring IND-CCA2 security levels beyond 256 bits against lattice and quantum threats.[24][25] Beyond algorithms and core cryptography, Hamming weight supports probabilistic data structures and optimization in computer science. Property-preserving hash functions for Hamming distance compress n-bit strings into O(tλ)-bit digests while preserving distance predicates (e.g., d(x,y) ≥ t), encoding inputs as sets where the symmetric difference corresponds to twice the distance, enabling secure evaluation under standard assumptions like the bilinear discrete logarithm problem for applications in verifiable computation. In distance-sensitive Bloom filters, the relative Hamming distance between filter representations approximates similarity for set membership queries; by hashing elements into bit arrays and measuring the proportion of differing bits (d(u,v) = sum(1(u_i ≠ v_i))/ℓ), these filters support locality-sensitive hashing with tunable false positive rates, using O(m/nℓ) space for n elements and ε-close matches. 
For machine learning, population count (Hamming weight) quantifies feature sparsity in binary or low-precision models, as in quantum annealing for feature selection, where the Hamming weight encodes the number of active features to promote sparse solutions while reducing dimensionality. The avalanche effect in block ciphers, a diffusion property, is quantified by the expected Hamming weight of output differences for single-bit input changes; for instance, the avalanche weight W_av(F, Δ) should approximate b/2 (where b is the output length) within a threshold t, as verified in ciphers like ASCON, where it reaches full diffusion after four rounds, ensuring each output bit flips with probability near 1/2.[26][27][28][29]

Implementations
Efficient Algorithms
The naive approach to computing the Hamming weight of an n-bit integer iterates through each bit position, using a bitwise AND with 1 to check if the bit is set, followed by a right shift to process the next bit. This method requires O(n) time in the worst case and O(1) space, making it straightforward but inefficient for large n. Table lookup methods enhance performance by precomputing Hamming weights for all values within small bit widths, typically 8 bits (yielding a 256-entry table). The input integer is masked and shifted to extract each chunk, with lookups summed to obtain the total weight. For a 32-bit integer, this involves four 8-bit lookups, achieving O(n/8) time and O(256) space. Here is example C code for an 8-bit table lookup on a 32-bit unsigned integer:

int table[256]; // Precomputed: table[i] = popcount of i (0 to 255)

void init_table(void)
{
    for (int i = 0; i < 256; i++)
        table[i] = (i & 1) + table[i >> 1]; // table[0] is 0; or use __builtin_popcount(i) if available
}

unsigned int hamming_weight(unsigned int x)
{
    return table[x & 0xFF] +
           table[(x >> 8) & 0xFF] +
           table[(x >> 16) & 0xFF] +
           table[(x >> 24) & 0xFF];
}
Brian Kernighan's method repeatedly clears the lowest set bit with x &= (x - 1), incrementing a counter until the input is zero. This runs in O(w) time, where w is the Hamming weight, offering advantages when the number of set bits is small compared to n, while using O(1) space.
Pseudocode for Kernighan's algorithm:
int hamming_weight(unsigned int x) {
int count = 0;
while (x != 0) {
x &= (x - 1);
count++;
}
return count;
}
Programming Language Support
In C and C++, compiler intrinsics and standard library features provide efficient support for computing the Hamming weight. The GCC and Clang compilers include the__builtin_popcount(unsigned int x) function, which returns the number of 1-bits in an unsigned integer, with variants like __builtin_popcountll for 64-bit values.[31] Additionally, C++20 introduces std::bitset<N>::count() in the <bitset> header, which counts the number of set bits in a fixed-size bitset.
#include <bitset>
#include <iostream>
int main() {
std::bitset<32> bits(0b1010);
std::cout << bits.count() << std::endl; // Outputs: 2
return 0;
}
In Python, the number of set bits in an integer x can be obtained with bin(x).count('1'), where bin(x) produces a binary string representation prefixed with '0b', and count('1') tallies the 1s. From Python 3.10 onward, the int type includes the bit_count() method, which directly returns the population count of 1s in the binary expansion of the integer's absolute value.[32]
x = 10 # Binary: 1010
print(bin(x).count('1')) # Outputs: 2 (pre-3.10)
print(x.bit_count()) # Outputs: 2 (3.10+)
In Java, the Integer class features the static bitCount(int i) method, which counts the 1-bits in the two's complement binary representation of an int; a corresponding Long.bitCount(long i) exists for 64-bit values.[33]
int x = 10; // Binary: 1010
System.out.println(Integer.bitCount(x)); // Outputs: 2
In Rust, unsigned integer types such as u32 and u64 include the count_ones() method, which returns the number of 1s in their binary representation as a u32. This method is available in the standard library and leverages hardware instructions where possible.[34]
let x: u32 = 0b1010;
println!("{}", x.count_ones()); // Outputs: 2
JavaScript does not provide a built-in population count for BigInt, but developers can implement it manually—such as by converting to a binary string with toString(2) and counting '1's—or use polyfills that extend BigInt.prototype with a popcount method for arbitrary-precision support.[35]
For arbitrary-precision arithmetic, the GNU Multiple Precision Arithmetic Library (GMP) offers low-level support via mpn_popcount(mp_srcptr src, mp_size_t size), which computes the Hamming weight over a limb array representing a multi-precision integer. Higher-level access is available through mpz_popcount(const mpz_t op) for mpz_t objects.[36]
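A minimal usage sketch of the higher-level GMP call (linking with -lgmp is assumed):

#include <gmp.h>
#include <stdio.h>

int main(void)
{
    mpz_t n;
    mpz_init_set_str(n, "ffffffffffffffffffffffffffffffff", 16); /* 128 one-bits */
    printf("%lu\n", (unsigned long)mpz_popcount(n));              /* prints 128 */
    mpz_clear(n);
    return 0;
}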
Hardware and Processor Support
Modern processors provide dedicated instructions for computing the Hamming weight, enabling efficient hardware acceleration of bit population counting operations. In the x86 architecture, the POPCNT instruction, introduced with Intel's Nehalem microarchitecture in November 2008 as part of the SSE4.2 extension, counts the number of set bits in a 32- or 64-bit register.[37] For example, the assembly instruction popcnt eax, ebx places the bit count of the value in EBX into EAX.[38] This instruction typically exhibits a latency of 3 cycles and a throughput of 1 instruction per cycle on Intel Skylake processors, though newer AMD Zen architectures achieve 1-cycle latency with higher throughput.[39]
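In C, the instruction is usually reached through the SSE4.2 intrinsics rather than inline assembly. A minimal sketch follows; compiling with -msse4.2 on GCC/Clang is assumed, and the 64-bit form requires a 64-bit target.

#include <nmmintrin.h>   /* SSE4.2 intrinsics, including _mm_popcnt_u32 and _mm_popcnt_u64 */
#include <stdio.h>

int main(void)
{
    unsigned v = 0x6CBAu;                         /* 0110 1100 1011 1010, the 16-bit example used earlier */
    printf("%d\n", _mm_popcnt_u32(v));            /* prints 9 */
    printf("%d\n", (int)_mm_popcnt_u64(0xFFULL)); /* prints 8 */
    return 0;
}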
In the ARM architecture, scalar popcount operations rely on combinations of instructions like CLZ (Count Leading Zeros) and CTZ (Count Trailing Zeros) variants for software emulation, but dedicated support appears in vector extensions. The ARMv8-A architecture includes the CNT instruction in the NEON SIMD extension, which performs population count per byte across vector elements.[40] For vectors, the VCNT instruction in NEON computes the Hamming weight for each 8-bit lane, facilitating parallel processing.
Other architectures offer similar dedicated support. PowerPC introduced the popcntb (population count bytes) instruction in Power ISA version 2.03, released in 2006, which counts set bits in each byte of a register.[41] In RISC-V, the ratified Bit Manipulation Extension (version 1.0.0) includes the CPOP instruction for scalar popcount, alongside CLZ in the Zbb subset, enabling direct hardware computation without multi-instruction sequences.[42]
Vector and GPU extensions further enhance parallel Hamming weight computation. Intel's AVX-512 introduces VPOPCNT and VPOPCNTD/Q instructions for 512-bit vectors, counting bits across multiple lanes in a single operation, building on scalar POPCNT for wider parallelism.[43] On NVIDIA GPUs, the CUDA programming model provides the intrinsic __popc() function for 32-bit integers and __popcll() for 64-bit, mapping to dedicated hardware popcount units with single-cycle execution on modern architectures like Volta and later.
These hardware features typically deliver 1-cycle latency for popcount operations on contemporary CPUs, significantly outperforming software alternatives in throughput-intensive workloads such as cryptography and data compression.[39]
History
Origins
The concept of counting the number of 1s in a binary string, equivalent to the modern Hamming weight, appeared in rudimentary forms before Hamming's work: James W. L. Glaisher used it in 1899 to count odd binomial coefficients, and early telegraphy and punched-tape systems relied on parity-like checks for error detection. The formalized notion of Hamming weight originated in 1950 with Richard W. Hamming, an American mathematician at Bell Laboratories, who developed it as part of his work on error-correcting codes for digital systems. Hamming introduced the weight of a binary vector as the number of its nonzero (1) entries to quantify the minimum distance between codewords, enabling the design of codes capable of detecting and correcting errors beyond simple parity schemes.[3] This innovation emerged from practical challenges in early computing environments, where unreliable equipment like relay-based machines and punched-card readers frequently introduced single-bit errors during unattended operations. Hamming's efforts were motivated by the limitations of existing parity-check methods in Bell Laboratories' relay computers, such as the Model 5 used for the Aberdeen Proving Grounds, prompting him to create codes that could automatically correct isolated errors while maintaining computational efficiency.[3] Hamming first detailed the weight concept in his 1950 paper "Error Detecting and Error Correcting Codes," published in the Bell System Technical Journal, where he applied it to construct systematic binary codes with specified minimum distances for error control in communication and computing applications.[44]
The concept of Hamming weight, initially rooted in error-correcting codes, evolved as it gained prominence in computer science and algorithm design, particularly for analyzing binary representations and combinatorial problems. This expansion highlighted its utility beyond coding theory, influencing early computer architecture for tasks involving binary data processing. By the 1980s, the notion extended to very-large-scale integration (VLSI) design, where it supported efficient implementations of error detection mechanisms, such as syndrome decoding for convolutional codes and parity computations in hardware circuits.[45] In these applications, calculating the Hamming weight enabled optimized parity generation, reducing circuit complexity while maintaining reliability in integrated systems. In contemporary usage, the operation has been formalized in international standards, with the ISO/IEC 9899:2024 (C23) standard introducing functions like stdc_count_ones in the <stdbit.h> header for portable bit counting across integer types.[46] Post-2010, hardware instructions for population count (POPCNT) became ubiquitous in processors from major vendors, driven by the demands of big data analytics, where it facilitates rapid computations in areas like data compression, similarity metrics, and machine learning feature extraction.[47]
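A minimal sketch of the C23 interface mentioned above, assuming a toolchain that already ships <stdbit.h>:

#include <stdbit.h>   /* C23 bit utilities */
#include <stdio.h>

int main(void)
{
    unsigned x = 0x6CBAu;                           /* 0110 1100 1011 1010 */
    printf("%u\n", (unsigned)stdc_count_ones(x));   /* type-generic macro; prints 9 */
    return 0;
}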
Terminologically, "Hamming weight" remains the preferred term in coding theory, reflecting its origins in measuring non-zero symbols in codewords, as detailed in foundational texts like F. J. MacWilliams and N. J. A. Sloane's The Theory of Error-Correcting Codes.[48] In contrast, computer science contexts often favor "population count," "popcount," or "bit count" to emphasize the straightforward enumeration of set bits, avoiding the metric connotations of "weight" tied to error patterns and minimum distances in codes. This distinction underscores the concept's dual heritage, with "weight" preserving coding theory's influence on metrics like code performance and error bounds.

References
- https://en.wikichip.org/wiki/population_count
