Binary prefix
from Wikipedia

Prefixes for decimal and binary multiples
Decimal                     Binary
Value     SI                Value     IEC               JEDEC
1000      k   kilo          1024      Ki  kibi          K  kilo
1000^2    M   mega          1024^2    Mi  mebi          M  mega
1000^3    G   giga          1024^3    Gi  gibi          G  giga
1000^4    T   tera          1024^4    Ti  tebi          T  tera
1000^5    P   peta          1024^5    Pi  pebi
1000^6    E   exa           1024^6    Ei  exbi
1000^7    Z   zetta         1024^7    Zi  zebi
1000^8    Y   yotta         1024^8    Yi  yobi
1000^9    R   ronna         1024^9    Ri  robi
1000^10   Q   quetta        1024^10   Qi  quebi

A binary prefix is a unit prefix that indicates a multiple of a unit of measurement by an integer power of two. The most commonly used binary prefixes are kibi (symbol Ki, meaning 2^10 = 1024), mebi (Mi, 2^20 = 1048576), and gibi (Gi, 2^30 = 1073741824). They are most often used in information technology as multipliers of bit and byte, when expressing the capacity of storage devices or the size of computer files.

The binary prefixes "kibi", "mebi", etc. were defined in 1999 by the International Electrotechnical Commission (IEC), in the IEC 60027-2 standard (Amendment 2). They were meant to replace the metric (SI) decimal power prefixes, such as "kilo" (k, 10^3 = 1000), "mega" (M, 10^6 = 1000000) and "giga" (G, 10^9 = 1000000000),[1] that were commonly used in the computer industry to indicate the nearest powers of two. For example, a memory module whose capacity was specified by the manufacturer as "2 megabytes" or "2 MB" would hold 2 × 2^20 = 2097152 bytes, instead of 2 × 10^6 = 2000000.

On the other hand, a hard disk whose capacity is specified by the manufacturer as "10 gigabytes" or "10 GB" holds 10 × 10^9 = 10000000000 bytes, or a little more than that, but less than 10 × 2^30 = 10737418240. A file whose size is listed as "2.3 GB" may have a size closer to 2.3 × 2^30 ≈ 2470000000 or to 2.3 × 10^9 = 2300000000, depending on the program or operating system providing that measurement. This kind of ambiguity is often confusing to computer system users and has resulted in lawsuits.[2][3] The IEC 60027-2 binary prefixes have been incorporated in the ISO/IEC 80000 standard and are supported by other standards bodies, including the BIPM, which defines the SI system,[1]: p.121 the US NIST,[4][5] and the European Union.

Prior to the 1999 IEC standard, some industry organizations, such as the Joint Electron Device Engineering Council (JEDEC), noted the common use of the terms kilobyte, megabyte, and gigabyte, and the corresponding symbols KB, MB, and GB in the binary sense, for use in storage capacity measurements. However, other computer industry sectors (such as magnetic storage) continued using those same terms and symbols with the decimal meaning. Since then, the major standards organizations have expressly disapproved the use of SI prefixes to denote binary multiples, and recommended or mandated the use of the IEC prefixes for that purpose, but the use of SI prefixes in this sense has persisted in some fields.

Definitions


Specific units of IEC 60027-2 A.2 and ISO/IEC 80000-13:2025
IEC prefix      Representations
Name   Symbol   Base 2   Base 1024   Value                              Base 10
kibi   Ki       2^10     1024^1      1024                               = 1.024×10^3
mebi   Mi       2^20     1024^2      1048576                            ≈ 1.049×10^6
gibi   Gi       2^30     1024^3      1073741824                         ≈ 1.074×10^9
tebi   Ti       2^40     1024^4      1099511627776                      ≈ 1.100×10^12
pebi   Pi       2^50     1024^5      1125899906842624                   ≈ 1.126×10^15
exbi   Ei       2^60     1024^6      1152921504606846976                ≈ 1.153×10^18
zebi   Zi       2^70     1024^7      1180591620717411303424             ≈ 1.181×10^21
yobi   Yi       2^80     1024^8      1208925819614629174706176          ≈ 1.209×10^24
robi   Ri       2^90     1024^9      1237940039285380274899124224       ≈ 1.238×10^27
quebi  Qi       2^100    1024^10     1267650600228229401496703205376    ≈ 1.268×10^30

In 2022, the International Bureau of Weights and Measures (BIPM) adopted the decimal prefixes ronna for 1000^9 and quetta for 1000^10.[6][7] In 2025, the prefixes robi (Ri, 1024^9) and quebi (Qi, 1024^10) were adopted by the IEC.[8]

Comparison of binary and decimal prefixes


The relative difference between the values in the binary and decimal interpretations increases, when using the SI prefixes as the base, from 2.4% for kibi vs. kilo to nearly 27% for the quebi vs. quetta.

Prefix (decimal, binary)   Binary ÷ Decimal   Decimal ÷ Binary
kilo, kibi                 1.024 (+2.4%)      0.9766 (−2.3%)
mega, mebi                 1.049 (+4.9%)      0.9537 (−4.6%)
giga, gibi                 1.074 (+7.4%)      0.9313 (−6.9%)
tera, tebi                 1.100 (+10.0%)     0.9095 (−9.1%)
peta, pebi                 1.126 (+12.6%)     0.8882 (−11.2%)
exa, exbi                  1.153 (+15.3%)     0.8674 (−13.3%)
zetta, zebi                1.181 (+18.1%)     0.8470 (−15.3%)
yotta, yobi                1.209 (+20.9%)     0.8272 (−17.3%)
ronna, robi                1.238 (+23.8%)     0.8078 (−19.2%)
quetta, quebi              1.268 (+26.8%)     0.7889 (−21.1%)
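The ratios in this comparison follow directly from 1024^n ÷ 1000^n. The following is a minimal Python sketch that recomputes them; the names `PREFIX_PAIRS` and `prefix_ratios` are illustrative and not taken from any standard.

```python
# Prefix name pairs in increasing order; n = 1 corresponds to kilo/kibi.
PREFIX_PAIRS = [
    ("kilo", "kibi"), ("mega", "mebi"), ("giga", "gibi"), ("tera", "tebi"),
    ("peta", "pebi"), ("exa", "exbi"), ("zetta", "zebi"), ("yotta", "yobi"),
    ("ronna", "robi"), ("quetta", "quebi"),
]


def prefix_ratios(n):
    """Return (binary / decimal, decimal / binary) for the n-th prefix pair."""
    binary = 1024 ** n
    decimal = 1000 ** n
    return binary / decimal, decimal / binary


for n, (si_name, iec_name) in enumerate(PREFIX_PAIRS, start=1):
    up, down = prefix_ratios(n)
    # Percentages are relative to the denominator of each ratio.
    print(f"{si_name:>6} {iec_name:>5}  "
          f"{up:.3f} ({(up - 1) * 100:+.1f}%)  "
          f"{down:.4f} ({(down - 1) * 100:+.1f}%)")
```

Because the ratio is 1.024^n, the gap compounds with every prefix step rather than growing by a fixed amount.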
 

History

Early prefixes

There are several numeral prefixes in the English language that are binary prefixes, such as semi-, hemi-, di-, tetra- and octo-.

The original metric system adopted by France in 1795 included two binary prefixes named double- (2×) and demi- (1/2×).[9] However, these were not retained when the SI prefixes were internationally adopted by the 11th CGPM conference in 1960.

Storage capacity

Main memory

Early computers used one of two addressing methods to access the system memory: binary (base 2) or decimal (base 10).[10] For example, the IBM 701 (1952) used a binary method and could address 2048 words of 36 bits each, while the IBM 702 (1953) used a decimal system and could address ten thousand 7-bit words.

By the mid-1960s, binary addressing had become the standard architecture in most computer designs, and main memory sizes were most commonly powers of two. This is the most natural configuration for memory, as every combination of states of its address lines maps to a valid address, allowing easy aggregation into a larger block of memory with contiguous addresses.

While early documentation specified those memory sizes as exact numbers such as 4096, 8192, or 16384 units (usually words, bytes, or bits), computer professionals also started using the long-established metric system prefixes "kilo", "mega", "giga", etc., defined to be powers of 10,[1] to mean instead the nearest powers of two; namely, 2^10 = 1024, 2^20 = 1024^2, 2^30 = 1024^3, etc.[11][12] The corresponding metric prefix symbols ("k", "M", "G", etc.) were used with the same binary meanings.[13][14] The symbol for 2^10 = 1024 could be written either in lower case ("k")[15][16][17] or in uppercase ("K"). The latter was often used intentionally to indicate the binary rather than decimal meaning.[18] This convention, which could not be extended to higher powers, was widely used in the documentation of the IBM 360 (1964)[18] and of the IBM System/370 (1972),[19] of the CDC 7600,[20] of the DEC PDP-11/70 (1975)[21] and of the DEC VAX-11/780 (1977).[citation needed]

In other documents, however, the metric prefixes and their symbols were used to denote powers of 10, but usually with the understanding that the values given were approximate, often truncated down. Thus, for example, a 1967 document by Control Data Corporation (CDC) abbreviated "2^16 = 64 × 1024 = 65536 words" as "65K words" (rather than "64K" or "66K"),[22] while the documentation of the HP 21MX real-time computer (1974) denoted 3 × 2^16 = 192 × 1024 = 196608 as "196K" and 2^20 = 1048576 as "1M".[23]
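The three readings of "K" described here can be compared side by side. The short Python sketch below is only an illustration; the function name `size_labels` and its dictionary keys are hypothetical.

```python
def size_labels(words):
    """Return the three 'K' labels a word count could receive."""
    return {
        "binary": f"{words // 1024}K",         # K = 1024, exact for powers of two
        "truncated": f"{words // 1000}K",      # K = 1000, rounded down
        "rounded": f"{round(words / 1000)}K",  # K = 1000, rounded to nearest
    }


print(size_labels(2 ** 16))      # the CDC example: 65536 words
print(size_labels(3 * 2 ** 16))  # the HP 21MX example: 196608 words
```

The "truncated" entry reproduces the "65K" and "196K" figures quoted in those documents.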

These three possible meanings of "k" and "K" ("1024", "1000", or "approximately 1000") were used loosely around the same time, sometimes by the same company. The HP 3000 business computer (1973) could have "64K", "96K", or "128K" bytes of memory.[24] The use of SI prefixes, and the use of "K" instead of "k" remained popular in computer-related publications well into the 21st century, although the ambiguity persisted. The correct meaning was often clear from the context; for instance, in a binary-addressed computer, the true memory size had to be either a power of 2, or a small integer multiple thereof. Thus a "512 megabyte" RAM module was generally understood to have 512 × 1024^2 = 536870912 bytes, rather than 512000000.

Hard disks


In specifying disk drive capacities, manufacturers have always used conventional decimal SI prefixes representing powers of 10. Storage in a rotating disk drive is organized in platters and tracks whose sizes and counts are determined by mechanical engineering constraints so that the capacity of a disk drive has hardly ever been a simple multiple of a power of 2. For example, the first commercially sold disk drive, the IBM 350 (1956), had 50 physical disk platters containing a total of 50000 sectors of 100 characters each, for a total quoted capacity of 5 million characters.[25]

Moreover, since the 1960s, many disk drives used IBM's disk format, where each track was divided into blocks of user-specified size; and the block sizes were recorded on the disk, subtracting from the usable capacity. For example, the IBM 3336 disk pack was quoted to have a 200-megabyte capacity, achieved only with a single 13030-byte block in each of its 808 × 19 tracks.

Decimal megabytes were used for disk capacity by the CDC in 1974.[26] The Seagate ST-412,[27] one of several types installed in the IBM PC/XT,[28] had a capacity of 10027008 bytes when formatted as 306 × 4 tracks and 32 256-byte sectors per track, which was quoted as "10 MB".[29] Similarly, a "300 GB" hard drive can be expected to offer only slightly more than 300 × 10^9 = 300000000000 bytes, not 300 × 2^30 (which would be about 322 × 10^9 bytes or "322 GB"). The first terabyte (SI prefix, 1000000000000 bytes) hard disk drive was introduced in 2007.[30] Decimal prefixes were generally used by information processing publications when comparing hard disk capacities.[31]

Some programs and operating systems, such as Microsoft Windows, still use "MB" and "GB" to denote binary prefixes even when displaying disk drive capacities and file sizes, as did Classic Mac OS. Thus, for example, the capacity of a "10 MB" (decimal "M") disk drive could be reported as "9.56 MB", and that of a "300 GB" drive as "279.4 GB". Some operating systems, such as Mac OS X,[32] Ubuntu,[33] and Debian,[34] have been updated to use "MB" and "GB" to denote decimal prefixes when displaying disk drive capacities and file sizes. Some manufacturers, such as Seagate Technology, have released recommendations stating that properly written software and documentation should specify clearly whether prefixes such as "K", "M", or "G" mean binary or decimal multipliers.[35][36]
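The reported figures above are the byte counts divided by a power of 1024 instead of a power of 1000. Here is a minimal Python sketch of that arithmetic, using the ST-412 geometry quoted earlier; the helper name `in_binary_units` is an assumption made for illustration.

```python
def in_binary_units(capacity_bytes, power):
    """Express a byte count in units of 1024**power bytes.

    power=2 gives the binary reading of "MB", power=3 that of "GB".
    """
    return capacity_bytes / 1024 ** power


# 306 cylinders x 4 heads, 32 sectors per track, 256 bytes per sector.
st412_bytes = 306 * 4 * 32 * 256

print(st412_bytes)
print(f"{in_binary_units(st412_bytes, 2):.2f} binary MB")
print(f"{in_binary_units(300 * 10**9, 3):.1f} binary GB")
```

The same drive therefore shows a smaller number under the binary reading even though no bytes are missing.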

Floppy disks


Floppy disks used a variety of formats, and their capacities were usually specified with SI-like prefixes "K" and "M" with either decimal or binary meaning. The capacity of the disks was often specified without accounting for the internal formatting overhead, leading to more irregularities.

The early 8-inch diskette formats could contain less than a megabyte with the capacities of those devices specified in kilobytes, kilobits or megabits.[37][38]

The 5.25-inch diskette sold with the IBM PC AT could hold 1200 × 1024 = 1228800 bytes, and thus was marketed as "1200 KB" with the binary sense of "KB".[39] However, the capacity was also quoted "1.2 MB",[40] which was a hybrid decimal and binary notation, since the "M" meant 1000 × 1024. The precise value was 1.2288 MB (decimal) or 1.171875 MiB (binary).

The 5.25-inch Apple Disk II had 256 bytes per sector, 13 sectors per track, 35 tracks per side, or a total capacity of 116480 bytes. It was later upgraded to 16 sectors per track, giving a total of 140 × 2^10 = 143360 bytes, which was described as "140KB" using the binary sense of "K".

The most recent version of the physical hardware, the "3.5-inch diskette" cartridge, had 720 512-byte blocks (single-sided). Since two blocks comprised 1024 bytes, the capacity was quoted "360 KB", with the binary sense of "K". On the other hand, the quoted capacity of "1.44 MB" of the High Density ("HD") version was again a hybrid decimal and binary notation, since it meant 1440 pairs of 512-byte sectors, or 1440 × 2^10 = 1474560 bytes. Some operating systems displayed the capacity of those disks using the binary sense of "MB", as "1.4 MB" (which would be 1.4 × 2^20 ≈ 1468000 bytes). User complaints forced both Apple[citation needed] and Microsoft[41] to issue support bulletins explaining the discrepancy.
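The hybrid notation is easiest to see by expressing one capacity in all three units. The following Python sketch is illustrative only; `floppy_views` and its keys are hypothetical names, and the input is the number of 1024-byte units on the disk.

```python
KIB = 1024


def floppy_views(kib_count):
    """Express a capacity of kib_count x 1024 bytes in three 'megabyte' senses."""
    total = kib_count * KIB
    return {
        "bytes": total,
        "hybrid_mb": total / (1000 * KIB),  # marketing "MB" = 1000 x 1024 bytes
        "decimal_mb": total / 1000 ** 2,    # SI megabyte
        "mib": total / KIB ** 2,            # IEC mebibyte
    }


print(floppy_views(1440))  # 3.5-inch HD diskette, quoted as "1.44 MB"
print(floppy_views(1200))  # 5.25-inch PC AT diskette, quoted as "1.2 MB"
```

Only the hybrid unit yields the round marketing figures; the decimal and binary readings give 1.47456 MB and 1.40625 MiB for the HD diskette.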

Optical disks


When specifying the capacities of optical compact discs, "megabyte" and "MB" usually meant 1024^2 bytes. Thus a "700-MB" (or "80-minute") CD has a nominal capacity of about 700 MiB, which is approximately 730 MB (decimal).[42]

On the other hand, capacities of other optical disc storage media like DVD, Blu-ray Disc, HD DVD and magneto-optical (MO) have been generally specified in decimal gigabytes ("GB"), that is, 1000^3 bytes. In particular, a typical "4.7 GB" DVD has a nominal capacity of about 4.7 × 10^9 bytes, which is about 4.38 GiB.[43]

Tape drives and media


Tape drive and media manufacturers have generally used SI decimal prefixes to specify the maximum capacity,[44][45] although the actual capacity would depend on the block size used when recording.

Data and clock rates


Computer clock frequencies are always quoted using SI prefixes in their decimal sense. For example, the internal clock frequency of the original IBM PC was 4.77 MHz, that is 4770000 Hz.

Similarly, digital information transfer rates are quoted using decimal prefixes. The Parallel ATA "100 MB/s" disk interface can transfer 100000000 bytes per second, and a "56 Kb/s" (or "56k") modem can encode or decode up to 56000 bits per second. Seagate specified the sustained transfer rate of some hard disk drive models with both decimal and IEC binary prefixes.[35] The standard sampling rate of music compact discs, quoted as 44.1 kHz, is indeed 44100 samples per second.[citation needed] A "1 Gb/s" Ethernet interface can receive or transmit up to 10^9 bits per second, or 125000000 bytes per second within each packet.

Decimal SI prefixes are also generally used for processor-memory data transfer speeds. A PCI-X bus with 66 MHz clock and 64 bits wide can transfer 66000000 64-bit words per second, or 4224000000 bit/s = 528000000 B/s, which is usually quoted as 528 MB/s. A PC3200 memory on a double data rate bus, transferring 8 bytes per cycle with a clock speed of 200 MHz has a bandwidth of 200000000 × 8 × 2 = 3200000000 B/s, which would be quoted as 3.2 GB/s.
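The two bandwidth figures above are plain products of clock rate, transfer width and transfers per cycle. The Python sketch below restates that arithmetic; the function name and parameter names are assumptions made for illustration.

```python
def bus_bandwidth(clock_hz, bytes_per_transfer, transfers_per_cycle=1):
    """Peak bandwidth in bytes per second for a simple synchronous bus."""
    return clock_hz * bytes_per_transfer * transfers_per_cycle


# PCI-X: 66 MHz clock, 64 bits (8 bytes) wide, one transfer per cycle.
pci_x = bus_bandwidth(66_000_000, 8)

# PC3200: 200 MHz clock, 8 bytes per transfer, double data rate.
pc3200 = bus_bandwidth(200_000_000, 8, transfers_per_cycle=2)

print(pci_x, pci_x / 10**6, "MB/s")
print(pc3200, pc3200 / 10**9, "GB/s")
```

Dividing by 10^6 and 10^9 rather than by powers of 1024 gives the quoted 528 MB/s and 3.2 GB/s.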

Ambiguous standards


The ambiguous usage of the prefixes "kilo" ("K" or "k"), "mega" ("M"), and "giga" ("G"), as meaning either powers of 1000 or (in computer contexts) powers of 1024, has been recorded in popular dictionaries,[46][47][48] and even in some obsolete standards, such as ANSI/IEEE 1084-1986,[49] ANSI/IEEE 1212-1991,[50] IEEE 610.10-1994,[51] and IEEE 100-2000.[52] Some of these standards specifically limited the binary meaning to multiples of "byte" ("B") or "bit" ("b").

Early binary prefix proposals


Before the IEC standard, several alternative proposals existed for unique binary prefixes, starting in the late 1960s. In 1996, Markus Kuhn proposed the extra prefix "di" and the symbol suffix or subscript "2" to mean "binary"; so that, for example, "one dikilobyte" would mean "1024 bytes", denoted "K₂B" or "K2B".[53]

In 1968, Donald Morrison proposed to use the Greek letter kappa (κ) to denote 1024, κ^2 to denote 1024^2, and so on.[54] (At the time, memory size was small, and only K was in widespread use.) In the same year, Wallace Givens responded with a suggestion to use bK as an abbreviation for 1024 and bK2 or bK^2 for 1024^2, though he noted that neither the Greek letter nor the lowercase letter b would be easy to reproduce on computer printers of the day.[55] Bruce Alan Martin of Brookhaven National Laboratory proposed that, instead of prefixes, binary powers of two be indicated by the letter B followed by the exponent, similar to E in decimal scientific notation. Thus one would write 3B20 for 3 × 2^20.[56] This convention is still used on some calculators to present binary floating-point numbers today.[57]

In 1969, Donald Knuth, who uses decimal notation like 1 MB = 1000 kB,[58] proposed that the powers of 1024 be designated as "large kilobytes" and "large megabytes", with abbreviations KKB and MMB.[59]

Consumer confusion


The ambiguous meanings of "kilo", "mega", "giga", etc., have caused significant consumer confusion, especially in the personal computer era. A common source of confusion was the discrepancy between the capacities of hard drives specified by manufacturers, using those prefixes in the decimal sense, and the numbers reported by operating systems and other software that used them in the binary sense, such as the Apple Macintosh in 1984. For example, a hard drive marketed as "1 TB" could be reported as having only "931 GB". The confusion was compounded by the fact that RAM manufacturers used the binary sense too.


The different interpretations of disk size prefixes led to class action lawsuits against digital storage manufacturers. These cases involved both flash memory and hard disk drives.

Early cases


Early cases (2004–2007) were settled prior to any court ruling with the manufacturers admitting no wrongdoing but agreeing to clarify the storage capacity of their products on the consumer packaging. Accordingly, many flash memory and hard disk manufacturers have disclosures on their packaging and web sites clarifying the formatted capacity of the devices or defining MB as 1 million bytes and 1 GB as 1 billion bytes.[60][61][62][63]

Willem Vroegh v. Eastman Kodak Company


On 20 February 2004, Willem Vroegh filed a lawsuit against Lexar Media, Dane–Elec Memory, Fuji Photo Film USA, Eastman Kodak Company, Kingston Technology Company, Inc., Memorex Products, Inc., PNY Technologies Inc., SanDisk Corporation, Verbatim Corporation, and Viking Interworks, alleging that their descriptions of the capacity of their flash memory cards were false and misleading.

Vroegh claimed that a 256 MB Flash Memory Device had only 244 MB of accessible memory. "Plaintiffs allege that Defendants marketed the memory capacity of their products by assuming that one megabyte equals one million bytes and one gigabyte equals one billion bytes." The plaintiffs wanted the defendants to use the customary values of 1024^2 for megabyte and 1024^3 for gigabyte. The plaintiffs acknowledged that the IEC and IEEE standards define a MB as one million bytes but stated that the industry has largely ignored the IEC standards.[64]

The parties agreed that manufacturers could continue to use the decimal definition so long as the definition was added to the packaging and web sites.[65] The consumers could apply for "a discount of ten percent off a future online purchase from Defendants' Online Stores Flash Memory Device".[66]

Orin Safier v. Western Digital Corporation


On 7 July 2005, an action entitled Orin Safier v. Western Digital Corporation, et al. was filed in the Superior Court for the City and County of San Francisco, Case No. CGC-05-442812. The case was subsequently moved to the Northern District of California, Case No. 05-03353 BZ.[67]

Although Western Digital maintained that their usage of units is consistent with "the indisputably correct industry standard for measuring and describing storage capacity", and that they "cannot be expected to reform the software industry", they agreed to settle in March 2006 with 14 June 2006 as the Final Approval hearing date.[68]

Western Digital offered to compensate customers with a gratis download of backup and recovery software that they valued at US$30. They also paid $500000 in fees and expenses to San Francisco lawyers Adam Gutride and Seth Safier, who filed the suit. The settlement called for Western Digital to add a disclaimer to their later packaging and advertising.[69][70][71] Western Digital included this footnote in their settlement: "Apparently, Plaintiff believes that he could sue an egg company for fraud for labeling a carton of 12 eggs a 'dozen', because some bakers would view a 'dozen' as including 13 items."[72]

Cho v. Seagate Technology (US) Holdings, Inc.


A lawsuit (Cho v. Seagate Technology (US) Holdings, Inc., San Francisco Superior Court, Case No. CGC-06-453195) was filed against Seagate Technology, alleging that Seagate overrepresented the amount of usable storage by 7% on hard drives sold between 22 March 2001 and 26 September 2007. The case was settled without Seagate admitting wrongdoing, but agreeing to supply those purchasers with gratis backup software or a 5% refund on the cost of the drives.[73]

Dinan et al. v. SanDisk LLC


On 22 January 2020, the district court of the Northern District of California ruled in favor of the defendant, SanDisk, upholding its use of "GB" to mean 1000000000 bytes.[2] The Ninth Circuit affirmed in February 2021.[3]

IEC 1999 Standard


In 1995, the International Union of Pure and Applied Chemistry's (IUPAC) Interdivisional Committee on Nomenclature and Symbols (IDCNS) proposed the prefixes "kibi" (short for "kilobinary"), "mebi" ("megabinary"), "gibi" ("gigabinary") and "tebi" ("terabinary"), with respective symbols "kb", "Mb", "Gb" and "Tb",[74] for binary multipliers. The proposal suggested that the SI prefixes should be used only for powers of 10; so that a disk drive capacity of "500 gigabytes", "0.5 terabytes", "500 GB", or "0.5 TB" should all mean 500 × 10^9 bytes, exactly or approximately, rather than 500 × 2^30 (= 536870912000) or 0.5 × 2^40 (= 549755813888).

The proposal was not accepted by IUPAC at the time, but was taken up in 1996 by the Institute of Electrical and Electronics Engineers (IEEE) in collaboration with the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC). The prefixes "kibi", "mebi", "gibi" and "tebi" were retained, but with the symbols "Ki" (with capital "K"), "Mi", "Gi" and "Ti" respectively.[75]

In January 1999, the IEC published this proposal, with additional prefixes "pebi" ("Pi") and "exbi" ("Ei"), as an international standard (IEC 60027-2 Amendment 2).[76][77][78] The standard reaffirmed the BIPM's position that the SI prefixes should always denote powers of 10. The third edition of the standard, published in 2005, added prefixes "zebi" and "yobi", thus matching all then-defined SI prefixes with binary counterparts.[79]

The harmonized ISO/IEC 80000-13:2025 standard cancels and replaces subclauses 3.8 and 3.9 of IEC 60027-2:2005 (those defining prefixes for binary multiples). The only significant change is the addition of explicit definitions for some quantities.[80] In 2009, the prefixes kibi-, mebi-, etc. were defined by ISO 80000-1 in their own right, independently of the kibibyte, mebibyte, and so on.

The BIPM standard JCGM 200:2012 "International vocabulary of metrology – Basic and general concepts and associated terms (VIM), 3rd edition" lists the IEC binary prefixes and states "SI prefixes refer strictly to powers of 10, and should not be used for powers of 2. For example, 1 kilobit should not be used to represent 1024 bits (2^10 bits), which is 1 kibibit."[81]

The IEC 60027-2 standard recommended that operating systems and other software be updated to use binary or decimal prefixes consistently, but incorrect usage of SI prefixes for binary multiples is still common. At the time, the IEEE decided that their standards would use the prefixes "kilo", etc. with their metric definitions, but allowed the binary definitions to be used in an interim period as long as such usage was explicitly pointed out on a case-by-case basis.[82]

Other standards bodies and organizations


The IEC standard binary prefixes are supported by other standardization bodies and technical organizations.

The United States National Institute of Standards and Technology (NIST) supports the ISO/IEC standards for "Prefixes for binary multiples" and has a web page[83] documenting them, describing and justifying their use. NIST suggests that in English, the first syllable of the name of the binary-multiple prefix should be pronounced in the same way as the first syllable of the name of the corresponding SI prefix, and that the second syllable should be pronounced as bee.[5] NIST has stated the SI prefixes "refer strictly to powers of 10" and that the binary definitions "should not be used" for them.[84]

As of 2014, the microelectronics industry standards body JEDEC describes the IEC prefixes in its online dictionary, but acknowledges that the SI prefixes and the symbols "K", "M" and "G" are still commonly used with the binary sense for memory sizes.[85][86]

On 19 March 2005, the IEEE standard IEEE 1541-2002 ("Prefixes for Binary Multiples") was elevated to a full-use standard by the IEEE Standards Association after a two-year trial period.[87][88] As of April 2008, the IEEE Publications division does not require the use of IEC prefixes in its major magazines such as Spectrum[89] or Computer.[90]

The International Bureau of Weights and Measures (BIPM), which maintains the International System of Units (SI), expressly prohibits the use of SI prefixes to denote binary multiples, and recommends the use of the IEC prefixes as an alternative since units of information are not included in the SI.[91][1]

The Society of Automotive Engineers (SAE) prohibits the use of SI prefixes with anything but a power-of-1000 meaning, but does not cite the IEC binary prefixes.[92]

The European Committee for Electrotechnical Standardization (CENELEC) adopted the IEC-recommended binary prefixes via the harmonization document HD 60027-2:2003-03.[93] The European Union (EU) has required the use of the IEC binary prefixes since 2007.[94]

Current practice

The 536870912-byte capacity of these RAM modules is stated as "512 MB" on the label.
GNOME's partition editor uses IEC prefixes to display partition sizes. The total capacity of the 120 × 10^9-byte disk is displayed as "111.79 GiB".
GNOME's system monitor uses IEC prefixes to show memory size and networking data rate.

Some computer industry participants, such as Hewlett-Packard (HP),[95] and IBM[96][97] have adopted or recommended IEC binary prefixes as part of their general documentation policies.

As of 2023, the use of SI prefixes with the binary meanings is still prevalent for specifying the capacity of the main memory of computers, of RAM, ROM, EPROM, and EEPROM chips and memory modules, and of the cache of computer processors. For example, a "512-megabyte" or "512 MB" memory module holds 512 MiB; that is, 512 × 2^20 bytes, not 512 × 10^6 bytes.[98][99][100][101]

JEDEC continues to include the customary binary definitions of "kilo", "mega", and "giga" in the document Terms, Definitions, and Letter Symbols,[102] and, as of 2010, still used those definitions in their memory standards.[103][104][105][106][107]

On the other hand, the SI prefixes with power-of-ten meanings are generally used for the capacity of external storage units, such as disk drives,[108][109][110][111][112] solid state drives, and USB flash drives,[63] except for some flash memory chips intended to be used as EEPROMs. However, some disk manufacturers have used the IEC prefixes to avoid confusion.[113] The decimal meaning of SI prefixes is usually also intended in measurements of data transfer rates and clock speeds.[citation needed]

Some operating systems and other software use either the IEC binary multiplier symbols ("Ki", "Mi", etc.)[114][115][116][117][118][119] or the SI multiplier symbols ("k", "M", "G", etc.) with decimal meaning. Some programs, such as the GNU ls command, let the user choose between binary or decimal multipliers. However, some continue to use the SI symbols with the binary meanings, even when reporting disk or file sizes. Some programs may also use "K" instead of "k", with either meaning.[120]
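A program that lets the user choose between the two conventions needs only one switch. The following is a minimal Python sketch of such a formatter, not the implementation used by any particular tool; `format_size` and its `binary` parameter are hypothetical names.

```python
IEC_SYMBOLS = ["Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"]
SI_SYMBOLS = ["k", "M", "G", "T", "P", "E", "Z", "Y"]


def format_size(num_bytes, binary=True):
    """Format a byte count with IEC (binary=True) or SI (binary=False) prefixes."""
    base = 1024 if binary else 1000
    symbols = IEC_SYMBOLS if binary else SI_SYMBOLS
    value = float(num_bytes)
    symbol = ""
    for candidate in symbols:
        if value < base:
            break  # the current unit is already the largest one that fits
        value /= base
        symbol = candidate
    return f"{value:.2f} {symbol}B"


print(format_size(536870912))                # IEC prefixes
print(format_size(536870912, binary=False))  # SI prefixes
```

Because the unit symbol changes together with the divisor ("MiB" versus "MB"), the output stays unambiguous whichever mode is selected.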

Other uses


While the binary prefixes are predominantly used with units of data, bits and bytes, they may be used with other units of measure. For example, in signal processing it may be convenient to use a binary prefix with the unit of frequency, hertz (Hz), to produce a unit such as the kibihertz (KiHz), which is equal to 1024 Hz.[121][122]

from Grokipedia
Binary prefixes are unit prefixes that denote multiples of a base unit by an integer power of two (2^n), primarily applied in computing to specify quantities of digital information such as bytes of memory or storage capacity. They provide a standardized alternative to the decimal-based SI prefixes (powers of 10), addressing the longstanding convention in computing where terms like "kilobyte" ambiguously referred to 1024 bytes (2^10) rather than 1000 bytes, due to the close approximation of 2^10 to 10^3. The International Electrotechnical Commission (IEC) formalized binary prefixes in Amendment 2 to IEC International Standard 60027-2 (published in January 1999 and revised in 2005), defining symbols and names such as kibi (Ki) for 2^10 = 1024, mebi (Mi) for 2^20 = 1 048 576, gibi (Gi) for 2^30 = 1 073 741 824, and extending to yobi (Yi) for 2^80. This system enables precise distinctions, for instance, between a kilobyte (kB = 1000 bytes) and a kibibyte (KiB = 1024 bytes), or a gigabyte (GB = 10^9 bytes, as used by storage manufacturers for marketing) versus a gibibyte (GiB = 2^30 bytes, common in RAM addressing).

The introduction of binary prefixes addressed a core ambiguity rooted in computing's binary architecture, where memory and file systems operate on powers of two, yet decimal prefixes led to discrepancies: a consumer hard drive labeled as 1 TB (10^12 bytes) holds only about 931 GiB. This prompted calls for clarity from standards bodies like the IEC and endorsements from the National Institute of Standards and Technology (NIST). Despite this, adoption remains uneven: while some software (e.g., certain Linux distributions) displays binary prefixes, most hardware vendors, operating systems like Windows, and file systems persist with overloaded decimal terms for binary quantities, sustaining confusion and resistance attributed to entrenched habits and commercial incentives favoring larger apparent capacities.

Technical Foundations

Definition and Notation

Binary prefixes are unit prefixes denoting multiples of base units by integer powers of two (2^n), distinct from the decimal prefixes of the International System of Units (SI), which use powers of ten (10^n). They address the need in computing to express quantities like memory and storage capacities that align with binary addressing schemes, where 1 kilobyte traditionally equals 1024 bytes (2^10) rather than 1000 bytes (10^3). The International Electrotechnical Commission (IEC) defined binary prefixes in Amendment 2 to IEC 60027-2, published in January 1999, to resolve ambiguities arising from the overloaded use of decimal prefixes in binary contexts. These prefixes combine the first two letters of the corresponding SI prefix with "bi" (from "binary"), yielding names like "kibi" for the 2^10 factor; each symbol is the capitalized SI symbol followed by a lowercase "i", such as "Ki" for kibi, followed by the unit symbol (e.g., KiB for kibibyte). The full name incorporates the unit, as in "kibibyte" (KiB = 2^10 bytes = 1024 bytes).
Factor   Prefix name   Symbol   Derivation
2¹⁰      kibi          Ki       kilobinary, (2¹⁰)¹
2²⁰      mebi          Mi       megabinary, (2¹⁰)²
2³⁰      gibi          Gi       gigabinary, (2¹⁰)³
2⁴⁰      tebi          Ti       terabinary, (2¹⁰)⁴
2⁵⁰      pebi          Pi       petabinary, (2¹⁰)⁵
2⁶⁰      exbi          Ei       exabinary, (2¹⁰)⁶
2⁷⁰      zebi          Zi       zettabinary, (2¹⁰)⁷
2⁸⁰      yobi          Yi       yottabinary, (2¹⁰)⁸
This notation ensures precision: for instance, 1 GiB = 1,073,741,824 bytes (2^30), while 1 GB (decimal) = 1,000,000,000 bytes, a difference of about 7.37% that binary prefixes make explicit. Usage typically pairs the prefix with bits (bit) or bytes (B), as in Kibit or MiB, with a byte being eight bits in standard computing contexts.
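Since the symbol alone determines the multiplier, a quantity written in this notation can be converted mechanically. The Python sketch below parses strings such as "1 GiB" or "4 Kibit"; the function `parse_quantity`, the accepted input format and the restriction to prefixes up to yobi/yotta are assumptions made for this example.

```python
import re
from fractions import Fraction

# Exponent of 1024 (IEC) or 1000 (SI) associated with each prefix symbol.
IEC = {"Ki": 1, "Mi": 2, "Gi": 3, "Ti": 4, "Pi": 5, "Ei": 6, "Zi": 7, "Yi": 8}
SI = {"k": 1, "M": 2, "G": 3, "T": 4, "P": 5, "E": 6, "Z": 7, "Y": 8}

# number, optional prefix, then the unit symbol "B" (byte) or "bit".
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*?)(B|bit)\s*$")


def parse_quantity(text):
    """Return (count, unit) where count is an exact number of bytes or bits."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized quantity: {text!r}")
    number, prefix, unit = match.groups()
    if prefix == "":
        factor = 1
    elif prefix in IEC:
        factor = 1024 ** IEC[prefix]
    elif prefix in SI:
        factor = 1000 ** SI[prefix]
    else:
        raise ValueError(f"unknown prefix: {prefix!r}")
    # Fraction keeps decimal inputs such as "1.5" exact.
    return Fraction(number) * factor, unit


gib, _ = parse_quantity("1 GiB")
gb, _ = parse_quantity("1 GB")
print(gib, gb, float(gib / gb))
```

In this sketch an uppercase "K" without the "i" is rejected rather than guessed at, which mirrors the point of the notation: the reader should never have to infer the base.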

Mathematical and Computational Basis

Binary prefixes quantify data in digital systems using multiples of powers of 2, specifically $2^{10n}$ for integer $n \geq 1$, to align with the binary architecture of computers. This choice arises because digital memory and storage operate on binary data, where addresses and capacities are expressed as $2^k$ units for $k$ address bits, enabling direct hardware mapping without conversion overhead. For instance, a system with 10 address bits can access $2^{10} = 1024$ locations, naturally scaling in binary increments. The factor $2^{10} = 1024$ exceeds the decimal $10^3 = 1000$ by about 2.4%, close enough to borrow the familiar names for human-readable notation while preserving exact binary alignment for operations like bit shifting and masking, which are fundamental to processors. In practice, the kibibyte (KiB) equals $1024 = 2^{10}$ bytes, the mebibyte (MiB) equals $2^{20}$ bytes, and higher prefixes follow as $2^{30}$ (GiB), $2^{40}$ (TiB), up to $2^{80}$ (YiB), as standardized for unambiguous measurement. Computationally, this basis simplifies allocation: page sizes (e.g., 4 KiB = $2^{12}$ bytes) and cache lines are powers of 2 to minimize alignment issues and enable fast modulo-$2^k$ arithmetic via bitwise AND operations. This structure contrasts with decimal prefixes by prioritizing efficiency in binary hardware over human convenience, as non-power-of-2 sizes would require additional circuitry for addressing, which can increase latency and power consumption. Binary scaling likewise simplifies indexing and bounds checking in operating systems.
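The bit-masking shortcuts mentioned above work only because the sizes are powers of two. Here is a minimal Python sketch, assuming a 4 KiB page; the helper names are illustrative and not drawn from any operating system API.

```python
PAGE_SIZE = 4 * 1024  # 4 KiB = 2**12 bytes


def is_power_of_two(n):
    """True when n has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def page_offset(address, page_size=PAGE_SIZE):
    """Equivalent to address % page_size, computed with a single AND."""
    assert is_power_of_two(page_size)
    return address & (page_size - 1)


def page_number(address, page_size=PAGE_SIZE):
    """Equivalent to address // page_size, computed with a right shift."""
    assert is_power_of_two(page_size)
    return address >> (page_size.bit_length() - 1)


def align_up(address, page_size=PAGE_SIZE):
    """Round address up to the next multiple of page_size."""
    assert is_power_of_two(page_size)
    return (address + page_size - 1) & ~(page_size - 1)


addr = 0x12345
print(hex(page_number(addr)), hex(page_offset(addr)), hex(align_up(addr)))
```

With a page size of 4000 bytes none of these masks would be valid, and each operation would need a genuine division or remainder instead.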

Comparison with Decimal Prefixes

Binary prefixes, such as kibi (Ki) denoting $2^{10} = 1024$, differ fundamentally from the decimal prefixes of the International System of Units (SI), where kilo denotes $10^3 = 1000$. This distinction arises because binary prefixes align with the base-2 architecture of digital computing, where data addressing and memory allocation operate in powers of two for efficiency in binary operations, whereas decimal prefixes reflect the base-10 counting system used in general measurement. The numerical divergence between corresponding prefixes grows with each step, since the ratio at the $n$-th prefix level is $1.024^n$. For instance, one mebibyte (MiB) equals $2^{20} = 1{,}048{,}576$ bytes, while one megabyte (MB) equals $10^6 = 1{,}000{,}000$ bytes, a relative difference of approximately 4.86%. At larger magnitudes, such as terabyte (TB = $10^{12}$ bytes) versus tebibyte (TiB = $2^{40} \approx 1.0995 \times 10^{12}$ bytes), the discrepancy reaches about 9.95%. These values are summarized below:

| Prefix level | Decimal value (SI, powers of 10) | Binary value (IEC, powers of $2^{10}$) | Relative difference (%) |
|---|---|---|---|
| Kilo/Kibi | $10^3 = 1{,}000$ | $2^{10} = 1{,}024$ | 2.4 |
| Mega/Mebi | $10^6 = 1{,}000{,}000$ | $2^{20} = 1{,}048{,}576$ | 4.86 |
| Giga/Gibi | $10^9 = 1{,}000{,}000{,}000$ | $2^{30} \approx 1.0737 \times 10^9$ | 7.37 |
| Tera/Tebi | $10^{12} = 1{,}000{,}000{,}000{,}000$ | $2^{40} \approx 1.0995 \times 10^{12}$ | 9.95 |

In computing applications, binary prefixes better represent actual hardware capacities, such as RAM modules sized in powers of two (e.g., 1 GiB = $2^{30}$ bytes), because binary addressing schemes minimize computational overhead. Decimal prefixes, conversely, predominate in contexts like storage device specifications, where manufacturers report capacities in base-10 multiples to align with SI conventions, often leading to user-perceived shortfalls when operating systems display volumes in binary units. The IEC formalized binary prefixes in 1998 precisely to resolve this mismatch, emphasizing that SI decimal prefixes are not intended for binary multiples.
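The relative differences quoted in this section can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; the function name and the list of level names are assumptions, not part of any standard.

```python
# Prefix levels in order: level n compares 1024**n with 1000**n.
LEVELS = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi"]


def relative_difference_percent(n: int) -> float:
    """How much larger the binary multiple 1024**n is than the decimal 1000**n, in percent."""
    return (1024**n - 1000**n) * 100 / 1000**n


for n, name in enumerate(LEVELS, start=1):
    print(f"{name:10s} {relative_difference_percent(n):5.2f} %")
```

Because the ratio is $1.024^n$, each additional prefix step multiplies the gap rather than adding a fixed amount.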

Historical Development

Early Computing and Prefix Usage

In the era of early electronic digital computers, from the late 1940s onward, binary representation dominated internal operations and addressing because electronic switches operate in two states, on or off, which aligns with powers of 2 for efficient hardware implementation. Memory and storage capacities were thus expressed as multiples of 2^n, with 2^10 = 1024 emerging as a fundamental unit because it closely approximated the decimal 1000, enabling the reuse of the SI prefix "kilo" (k) as a convenient shorthand without introducing awkward decimals in binary contexts. This adaptation was not a formal redefinition but a pragmatic convention driven by the need for binary alignment in hardware, where decimal multiples would complicate addressing and allocation; for instance, early magnetic and core memories were engineered in 1024-unit blocks to match word sizes and bus widths.
Documented applications of this usage appeared in mainframe specifications by the mid-1960s, such as the IBM System/360 series (announced 1964), which denoted memory capacities like 8K or 16K bytes, with 1K equal to 1024 bytes to reflect binary page and block sizing in its architecture. Likewise, minicomputers like the DEC PDP-8 (1965) employed "K" for 1024-word modules in core memory, standardizing the term across programming manuals and datasheets for the ferrite-core technology prevalent at the time. These practices ensured seamless integration with binary instructions and avoided the inefficiencies of non-power-of-2 granularities, establishing "kilo" as synonymous with 1024 in computational metrics without initial contention, as data volumes remained small and predominantly memory-focused.
This binary prefix convention extended to peripherals and software addressing, where quantities like 1K bits or words allowed direct address mapping and avoided overflow in early systems with limited address space; machines from the early 1950s onward used binary scaling that implicitly favored 1024-unit increments in engineering reports. Hardware constraints, such as the 1024-bit planes in core memory arrays, reinforced this as the default, predating any decimal alternatives in storage media where binary fidelity was paramount for error-free data handling.

Onset of Ambiguity in Data Storage

The divergence between binary and decimal interpretations of prefixes in data storage emerged prominently as hard disk drive (HDD) capacities scaled to the gigabyte level. Prior to this, smaller capacities in kilobytes and megabytes exhibited minimal practical discrepancy (e.g., 2^10 = 1024 vs. 10^3 = 1000 bytes for the kilobyte, a ~2.4% difference), which was often overlooked in early contexts where exact byte counts were specified. HDD manufacturers, however, consistently applied decimal (SI) definitions from the introduction of prefixed capacities, defining 1 GB as 10^9 bytes to align with metric conventions and facilitate marketing of higher nominal figures. A key early instance was IBM's 3380 drive, released in 1980 and advertised with 2.52 GB capacity (2.52 × 10^9 bytes), the first commercial HDD exceeding 1 GB under decimal reckoning.
This approach contrasted sharply with RAM and software conventions, where 1 GB denoted 2^30 = 1,073,741,824 bytes, rooted in binary addressing for memory allocation and file systems. Operating systems like Windows and Unix-derived systems displayed available storage using binary units, leading consumers to perceive a shortfall; a drive labeled 10 GB showed as roughly 9.31 GiB in the OS. By the early 1990s, this practice had proliferated among manufacturers, including IBM, exacerbating confusion as consumer PCs adopted GB-scale HDDs while RAM remained firmly binary-aligned. The gap at the gigabyte level (a GiB is about 7.37% larger than a GB, rising to about 10% at the terabyte level) fueled early complaints, as users compared advertised specifications against OS-reported capacities, highlighting the role of marketing incentives in favoring decimal units for storage over the binary norms of computing. Reports of user frustration appear in technical forums from the mid-1990s onward, predating formal standardization attempts.

Pre-IEC Proposals for Resolution

In 1968, as ambiguities in prefix usage became evident in computing contexts, where memory and storage were typically measured in powers of 1024 rather than 1000, proposals for distinct binary notations appeared in the Communications of the ACM. Wallace Givens recommended "bK" as a specific abbreviation for 1024 to differentiate it from decimal equivalents, arguing that this would reduce confusion in technical documentation without altering established SI conventions. The same publication featured additional suggestions, including Donald Morrison's advocacy for the Greek letter κ to denote 1024, with κ² representing 1,048,576 (1024²), extending to higher powers as needed; this approach used symbolic notation to signal binary scaling explicitly at a time when memory sizes were small but ambiguities already arose in scaling from, say, 32K to 64K systems. These ideas stemmed from practical concerns in programming and hardware documentation but lacked formal endorsement and saw limited implementation.
By the mid-1990s, renewed interest prompted further refinements. In 1996, Markus Kuhn outlined a systematic framework using a "di" prefix (from "dyadic," referencing base-2 structure) combined with SI terms, such as "dikilobyte" for 1024 bytes or "dimegabyte" for 1,048,576 bytes, alongside symbolic variants like k₂B (kilobyte with a binary subscript). Kuhn emphasized backward compatibility, recommending unabbreviated units like "dikilobyte" for readability and allowing subscripts for compact notation in displays; he critiqued ad-hoc solutions like uppercase K for binary as insufficiently precise, drawing on historical precedents including the 1968 ACM letters to argue for prefixes that preserved decimal SI integrity while enabling unambiguous binary expression. This proposal influenced later standardization discussions but remained voluntary prior to IEC adoption.

Standardization Efforts

IEC Standards of 1998-1999

In December 1998, the International Electrotechnical Commission (IEC) approved Amendment 2 to International Standard IEC 60027-2, Letter symbols to be used in electrical technology – Part 2: Telecommunications and electronics. This amendment, developed by IEC Technical Committee 25 (Quantities and units) with encouragement from the International Committee for Weights and Measures (CIPM) and the International Bureau of Weights and Measures (BIPM), introduced formal names and symbols for prefixes denoting binary multiples (powers of 2) to address longstanding ambiguity in computing contexts where "kilo" and similar terms had been overloaded to mean both 10^3 (SI decimal) and 2^10 (binary). The amendment specified these prefixes for use in data processing, data transmission, and digital quantities, recommending their application to units like the byte (e.g., 1 KiB = 1024 bytes).
Amendment 2 was published on January 29, 1999, marking the first international standardization of binary prefixes and extending coverage up to multiples of 2^60. The prefixes follow a consistent pattern: names are formed by adding "bi" to the first two letters of the corresponding SI prefix (e.g., "ki" from "kilo" gives "kibi"), and symbols consist of the SI prefix symbol followed by "i", with the kilo symbol capitalized (e.g., "Ki" from "k"). This design preserved SI prefixes strictly for decimal powers of 10 while providing unambiguous alternatives for the binary scales prevalent in addressing and storage. The defined prefixes are as follows:

| Binary factor | Prefix name | Symbol |
|---|---|---|
| 2¹⁰ | kibi | Ki |
| 2²⁰ | mebi | Mi |
| 2³⁰ | gibi | Gi |
| 2⁴⁰ | tebi | Ti |
| 2⁵⁰ | pebi | Pi |
| 2⁶⁰ | exbi | Ei |

These prefixes were positioned as a precise tool for technical documentation and specifications, with the IEC emphasizing their role in eliminating errors from prefix misuse, such as discrepancies in reported hard drive capacities where manufacturers used decimal interpretations while operating systems applied binary. The standard's content was later integrated into the second edition of IEC 60027-2 (published November 2000) before being reaffirmed and expanded in subsequent documents like IEC 80000-13 (2008).
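The naming pattern of the 1999 prefixes (first two letters of the SI name plus "bi", and the SI symbol followed by "i") is mechanical enough to express as a short sketch. The function and variable names here are hypothetical; the one wrinkle the code makes explicit is that the lowercase SI symbol "k" becomes an uppercase "K" in "Ki".

```python
# SI prefixes covered by the 1999 amendment, in increasing order.
SI_PREFIXES = [
    ("kilo", "k"), ("mega", "M"), ("giga", "G"),
    ("tera", "T"), ("peta", "P"), ("exa", "E"),
]


def iec_prefix(si_name: str, si_symbol: str, order: int) -> tuple[str, str, int]:
    """Derive the binary prefix name, symbol, and factor from an SI prefix."""
    name = si_name[:2] + "bi"            # "kilo" -> "ki" + "bi" = "kibi"
    symbol = si_symbol.upper() + "i"     # "k" -> "Ki", "M" -> "Mi"
    factor = 2 ** (10 * order)           # order 1 -> 2**10, order 2 -> 2**20, ...
    return name, symbol, factor


for order, (si_name, si_symbol) in enumerate(SI_PREFIXES, start=1):
    name, symbol, factor = iec_prefix(si_name, si_symbol, order)
    print(f"{name:5s} {symbol:3s} 2^{10 * order} = {factor}")
```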

Endorsements and Divergences by Other Organizations

The International Organization for Standardization (ISO) endorses binary prefixes through its joint standard with the International Electrotechnical Commission (IEC), ISO/IEC 80000-13:2008 (updated in subsequent editions), which defines names and symbols such as "kibi" (Ki) for 2^10, "mebi" (Mi) for 2^20, and "gibi" (Gi) for 2^30, specifically for use in contexts involving powers of two. This standard explicitly distinguishes binary prefixes from SI decimal prefixes to resolve ambiguities in data quantities.
The Institute of Electrical and Electronics Engineers (IEEE) aligns with binary prefixes via IEEE Std 1541-2021, which standardizes symbols like Ki, Mi, and Gi for binary multiples (2^(10n)), emphasizing their application to units such as the byte to ensure precise communication in electrical and electronics engineering. This standard preserves SI prefixes exclusively for decimal powers of ten while recommending binary prefixes for computational contexts, reflecting a goal of unambiguous notation without conflicting with metric conventions.
In contrast, the National Institute of Standards and Technology (NIST) acknowledges the IEC-defined binary prefixes but maintains that they fall outside the International System of Units (SI), where prefixes like kilo- and mega- strictly denote decimal multiples of 1000 and 1,000,000, respectively. NIST advises against using SI prefixes for binary quantities, citing potential confusion, and lists binary prefixes separately as a non-SI convention developed by the IEC for data processing and data transmission, without formal integration into the SI itself. The Bureau International des Poids et Mesures (BIPM), custodian of the SI, likewise confines SI prefixes to decimal bases, consistent with IEEE 1541-2021, and rejects binary interpretations of them to uphold the metric system's powers-of-ten structure.
National standards bodies, such as the British Standards Institution (BSI) through BS EN 80000-13:2008, have adopted the ISO/IEC framework, implementing identical provisions for binary prefixes in European contexts, though practical divergences persist in industry applications favoring decimal notation for storage capacities.

Barriers to Widespread Standardization

The adoption of IEC binary prefixes, formalized in 1998-1999, has faced significant resistance due to deeply entrenched historical conventions in computing, where terms like "kilobyte" have denoted 1024 bytes since the mid-20th century, predating formal standardization efforts. This legacy usage permeates software codebases, documentation, and user expectations, making retroactive changes costly and disruptive; for instance, revising millions of lines of existing programs and documentation to incorporate "kibibyte" (KiB) would require extensive testing and could introduce compatibility issues across ecosystems. Binary multiples aligned with powers of 2 emerged naturally from early computer architectures, as processors and memory addressed data in binary increments, fostering a convention that resisted later decimal-binary distinctions.
Commercial incentives, particularly in data storage hardware, have perpetuated decimal prefix usage, allowing manufacturers to advertise capacities in larger nominal figures, such as labeling a drive of 10^12 bytes as 1 terabyte rather than approximately 931 gibibytes (GiB), to enhance market appeal without altering physical specifications. Hard drive producers, including major firms like Seagate and Western Digital, had adopted this practice by the early 2000s, aligning with International System of Units (SI) decimal definitions to simplify global marketing and avoid the smaller numbers implied by binary equivalents, even though operating systems often interpret capacities in binary units. This divergence creates persistent consumer discrepancies, as OS-reported capacities fall short of advertised values by about 7-10% for gigabyte- to terabyte-scale drives, yet regulatory bodies have not imposed unified enforcement, allowing industry self-regulation to favor decimal reporting.
Practical challenges include the perceived clumsiness of binary prefix nomenclature, such as "mebibyte" (MiB), which lacks the familiarity of traditional terms and has led to developer and user pushback in software interfaces.
The absence of mandatory compliance across standards organizations (the IEC and IEEE endorse binary prefixes, while bodies like NIST clarify that they fall outside the core SI) exacerbates fragmentation, with voluntary adoption limited to niches like certain Linux distributions and scientific computing. Without coordinated international mandates or economic penalties, inertia from mixed decimal-binary applications in networking, RAM, and storage sustains ambiguity, hindering widespread standardization more than two decades after the IEC introduced the prefixes.

Controversies and Disputes

Origins of Prefix Misuse in Marketing

The application of decimal prefixes to hard drive capacities, diverging from the binary conventions prevalent in computing, emerged as manufacturers scaled production to gigabyte-level storage. This shift enabled advertising capacities using powers of ten, such as defining 1 GB as exactly 10^9 bytes, to produce larger, round numerical figures aligned with SI standards, in contrast to the binary usage (1 GB = 2^{30} = 1,073,741,824 bytes) inherited from early mainframe and RAM addressing. Multi-gigabyte drives were advertised under decimal metrics by the early 1990s to reflect physical byte counts in decimal multiples, which facilitated straightforward marketing of products like the 0664, a 1 GB model from 1990. The incentive stemmed from storage engineering practices, where sector sizes (typically 512 or 1024 bytes) and platter densities lent themselves to decimal totals for simplicity in specification sheets and sales literature, allowing claims of "1 GB" for drives with precisely 10^9 bytes rather than the larger 2^{30} expected by software users.
This decimal labeling effectively inflated advertised capacities by about 7.37% relative to binary expectations, as a 10^9-byte drive appeared as only 953.67 MiB (mebibytes) in operating systems employing binary division. By 1995, as average drive sizes exceeded 1 GB, the discrepancy fueled initial consumer awareness, though manufacturers defended it as adherence to SI purity over computing's historical approximation. Subsequent firms like Seagate and Western Digital adopted the convention industry-wide, embedding it in product datasheets by the mid-1990s to compete on headline numbers amid rapid areal density advances (doubling roughly every 18 months per the then-emerging Kryder's law).

Consumer Confusion and Empirical Evidence

The discrepancy between manufacturer-advertised storage capacities, calculated using decimal prefixes (e.g., 1 GB = 10^9 bytes), and the binary-based reporting in operating systems (e.g., 1 GiB ≈ 1.0737 GB) routinely results in users observing about 7-10% less capacity than expected, fostering perceptions of inadequate product performance. This arises because hardware vendors align with SI decimal standards for marketing, while software defaults to powers of two for memory addressing, a convention rooted in early computing architecture. Consumers, often unaware of these dual systems, interpret the shortfall as a defect, prompting support inquiries and returns.
Evidence of such confusion appears in legal actions, including a 2003 class action filed in the United States against major hard drive makers, Seagate among them, alleging deceptive overstatement of capacities by approximately 7% due to decimal usage. Similar complaints surged in consumer forums and help resources around that period, with users reporting "missing" space on newly purchased drives. Although no large-scale peer-reviewed surveys quantify rates, the persistence of explanatory articles from manufacturers, such as a Seagate entry addressing the issue since at least 2003, indicates recurrent user reports.
Common misconceptions attribute the observed shortfall to operating system overhead or false claims by manufacturers, rather than the underlying prefix ambiguity between decimal (TB = 10^{12} bytes) and binary (TiB = 2^{40} = 1,099,511,627,776 bytes) definitions. For instance, a drive advertised as 1 TB (1,000,000,000,000 bytes) by the manufacturer displays as approximately 0.909 TiB or 931 GiB in Windows and Linux operating systems using binary scaling, leading users to incorrectly blame software inefficiencies or deceptive advertising.
These misunderstandings are evidenced in online forums, such as Reddit and Quora discussions, where consumers frequently express frustration over the discrepancy without recognizing the role of binary prefix conventions in computation. Cognitive analyses highlight how parallel numeration systems (decimal for sales, binary for computation) impose additional mental processing demands, amplifying errors in capacity estimation and contributing to frustration in IT purchasing. Courts have increasingly rejected deception claims, as in a 2019 federal dismissal of a flash drive suit where the plaintiff alleged a 6.7% shortfall; the ruling held that packaging disclosures and industry norms suffice for reasonable understanding. This suggests that while the initial ambiguity generated verifiable backlash, heightened awareness via online resources has mitigated widespread confusion, though isolated misunderstandings endure among non-technical buyers.
One prominent legal challenge arose in the United States through lawsuits against hard drive manufacturers, alleging deceptive advertising due to discrepancies between advertised decimal-based capacities (using powers of 1000) and the binary-based measurements (powers of 1024) displayed by operating systems, resulting in perceived shortfalls of approximately 7% for gigabyte-scale drives. In a 2006 settlement, a manufacturer agreed to pay up to $300,000 in cash and provide product coupons totaling $2.5 million to affected consumers without admitting liability, following claims that an 80 GB drive yielded only 74.4 GB in Windows due to the prefix interpretation difference. Similar litigation targeted flash memory producers, with a 2007 class action suit claiming that companies including SanDisk overstated capacities by applying decimal definitions, inflating sizes by about 4-7% compared to binary expectations in consumer devices.
These cases underscored consumer reliance on historical binary conventions in computing, despite manufacturers' adherence to International System of Units (SI) decimal standards, but courts focused on marketing practices rather than mandating the adoption of binary prefixes like "GiB". Outcomes generally involved settlements offering refunds or discounts rather than injunctions against decimal labeling, reflecting judicial recognition of the ambiguity's roots in industry evolution but reluctance to override established metrological norms. No major U.S. federal ruling definitively resolved the binary-decimal debate, leaving persistent disputes without enforceable standardization of prefixes.

Current Adoption and Practices

Usage in Operating Systems and Software

In Windows, file sizes in File Explorer are computed using binary scaling, where 1 KB denotes 1024 bytes, reflecting historical conventions aligned with powers of two for memory and file allocation. Reported drive capacities also employ binary scaling, with 1 GB representing 1,073,741,824 bytes (2^{30} bytes), leading to discrepancies with manufacturer labels that use decimal scaling (1 TB = 10^{12} bytes). For instance, a drive advertised as 1 TB by the manufacturer displays as approximately 931 GB or 0.909 TB in Windows, as the operating system scales the capacity by powers of 1024 while keeping the traditional labels; this is not unique to Windows, as Linux distributions also use binary scaling for drive capacity reporting in tools like df. This dual approach, where hardware uses decimal but software uses binary, persists as of 2025, contributing to user confusion and misconceptions, such as attributing the shortfall to operating system overhead or false advertising rather than prefix differences. For clarity in such contexts, the tebibyte (TiB) is defined by the IEC as exactly 2^{40} bytes (1,099,511,627,776 bytes), distinguishing it from the terabyte (TB = 10^{12} bytes).
macOS, from version 10.6 onward, uniformly applies decimal prefixes for both file sizes and disk capacities in Finder and system utilities, defining 1 kB as 1000 bytes and 1 GB as 1,000,000,000 bytes to conform with SI decimal conventions and storage industry practices. This standardization, implemented by Apple in 2009, avoids binary multipliers for reported volumes, though underlying file systems like HFS+ or APFS may internally use binary block sizes.
Linux distributions predominantly utilize binary scaling in command-line tools and graphical interfaces. The df utility, for instance, scales by powers of 1024 with the -h flag, in units of KiB (1024 bytes), MiB (1,048,576 bytes), and equivalents since coreutils version 8.0 in 2010, though the --si option enables decimal output for compatibility with decimal-labeled hardware.
File managers in major Linux desktop environments display sizes using binary scaling, frequently adopting IEC prefixes such as KiB and MiB since the early 2010s, promoting precision in contexts like RAM and partition sizing.
[Figure: GNOME System Monitor memory size and network rate]
Adoption of formal IEC binary prefixes (KiB, MiB) in broader software remains selective, with open-source projects like the GNU coreutils and desktop environments embracing them for clarity, while proprietary applications and legacy code often retain the ambiguous KB = 1024 notation without the "i". As of 2025, surveys of Linux distributions indicate that binary scaling dominates for in-memory and file operations, but decimal prevails in network throughput displays to align with SI-based bandwidth standards. This variance underscores ongoing tensions between computational efficiency (powers of two) and interoperability with decimal hardware metrics.
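To make the three display conventions described in this section concrete, the following sketch formats the same byte count in each of them. It is not the code of any operating system or tool; the convention names, the label lists, and the `display_size` helper are assumptions made for illustration.

```python
# Each convention pairs a scaling base with the unit labels shown to the user.
CONVENTIONS = {
    "si":     (1000, ["B", "kB", "MB", "GB", "TB"]),       # decimal scaling, decimal labels
    "iec":    (1024, ["B", "KiB", "MiB", "GiB", "TiB"]),   # binary scaling, IEC labels
    "legacy": (1024, ["B", "KB", "MB", "GB", "TB"]),       # binary scaling, traditional labels
}


def display_size(num_bytes: int, convention: str) -> str:
    """Scale num_bytes by the convention's base and attach the matching label."""
    base, labels = CONVENTIONS[convention]
    value = float(num_bytes)
    index = 0
    while value >= base and index < len(labels) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {labels[index]}"


one_terabyte_drive = 10**12  # capacity as labeled by the manufacturer
for convention in CONVENTIONS:
    print(f"{convention:7s} {display_size(one_terabyte_drive, convention)}")
```

The "legacy" and "iec" rows show the same number with different labels, which is the source of the apparent shortfall: the value 931.32 is correct in GiB but is easily misread when labeled GB.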

Application in Hardware and Data Rates

Binary prefixes are applied in hardware specifications for components structured around binary addressing, including random-access memory (RAM) and CPU caches. RAM capacities align with powers of two, where 1 GiB equals 2^{30} bytes (1,073,741,824 bytes), as defined for semiconductor memory by standards bodies. For example, a 32-bit processor's addressable RAM is 4 GiB (2^{32} bytes). CPU caches, such as L1 instruction and data caches, are commonly sized at 32 KiB or 64 KiB per core, reflecting binary multiples for efficient addressing.
In storage hardware like hard disk drives (HDDs) and solid-state drives (SSDs), manufacturers label capacities with decimal prefixes (e.g., 1 TB = 10^{12} bytes) to comply with SI conventions, but operating system reporting often uses binary interpretations, resulting in displayed capacities like 931 GiB for a 1 TB drive, or approximately 0.909 TiB, where 1 TiB = 2^{40} bytes (1,099,511,627,776 bytes). Binary prefixes offer precision here, as storage allocation in file systems operates on binary blocks, with differences growing at larger scales (e.g., 1 TB is about 0.909 TiB, a shortfall of roughly 9.05%).
Data rates in hardware, including network bandwidth and storage transfer speeds, predominantly employ decimal prefixes per telecommunications standards; Gigabit Ethernet, for instance, is specified at 10^9 bits per second. Binary prefixes, while standardized for data transmission to denote powers of two (e.g., 1 Kibit = 2^{10} bits), see limited adoption in hardware specifications, though software tools may report throughput in MiB/s to distinguish it from decimal MB/s and align with binary data units. This selective use underscores the role of binary prefixes in clarifying binary-aligned measurements amid decimal dominance in rate specifications.
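The two calculations touched on above, the size of an address space and the conversion of a decimal line rate into binary throughput units, can be sketched as follows. The function names are hypothetical, and the rate conversion deliberately ignores protocol overhead, so it gives an upper bound rather than a measured throughput.

```python
def address_space_gib(address_bits: int) -> float:
    """Bytes addressable with the given number of address bits, expressed in GiB."""
    return 2**address_bits / 2**30


def bits_per_second_to_mib_per_second(bits_per_second: float) -> float:
    """Convert a line rate in bit/s (decimal prefixes) to MiB/s (binary prefix)."""
    bytes_per_second = bits_per_second / 8
    return bytes_per_second / 2**20


def bits_per_second_to_mb_per_second(bits_per_second: float) -> float:
    """Convert a line rate in bit/s to decimal MB/s for comparison."""
    return bits_per_second / 8 / 10**6


gigabit = 10**9  # 1 Gbit/s, a decimal rate
print(address_space_gib(32))                                   # 32-bit address space in GiB
print(round(bits_per_second_to_mb_per_second(gigabit), 2))     # decimal MB/s
print(round(bits_per_second_to_mib_per_second(gigabit), 2))    # binary MiB/s
```

The same link therefore appears as 125 MB/s in decimal units and as roughly 119 MiB/s in binary units, which is why a tool's choice of unit label matters when comparing against a specification.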

Persistent Industry Resistance and Rationales

Despite the IEC's introduction of binary prefixes in Amendment 2 to its 60027-2 standard in 1999, adoption remains limited, with most software, operating systems, and documentation continuing to use traditional SI-derived prefixes like KB for binary multiples of bytes. This persistence stems from historical precedents dating to the mid-1960s, when binary logic in computer architectures prompted practical approximations using powers of 2 under SI-like labels (e.g., kilo for 2¹⁰) due to the absence of dedicated binary units at the time. Industry rationales emphasize entrenched practices and inertia, as retrofitting millions of lines of code, user interfaces, and legacy systems to distinguish "kibibyte" (KiB = 2¹⁰ bytes) from "kilobyte" (kB = 10³ bytes) would impose significant engineering and training costs without immediate benefits.
Operating systems like Windows and many Linux distributions retain binary interpretations for RAM and file allocations to align with hardware addressing (powers of 2), prioritizing compatibility over terminological precision. Storage manufacturers, conversely, adhere to decimal prefixes for device capacities per SI conventions, as a 1 TB drive equates to exactly 10¹² bytes, avoiding binary labels that would show smaller numbers than the marketed volumes, by approximately 7-10% at the scales that operating systems display.
Further resistance arises from the perceived unfamiliarity and lack of enforcement of IEC terms, with no regulatory mandates compelling change and minimal institutional efforts to promote them. Proponents of traditional usage contend that contextual clarity suffices in most scenarios, as the roughly 2.4% divergence between 1000 and 1024 rarely impacts practical computations, and that forced adoption could exacerbate short-term user errors during any rollout. This combination of legacy alignment, economic disincentives for storage marketing, and voluntary standards has sustained ambiguous prefix application over strict differentiation.

Extended Applications

Non-Storage Contexts in Computing

In random-access memory (RAM), capacities adhere to binary multiples as defined by JEDEC standards, with the prefix "mega" (M) specifying 2^{20} (1,048,576) bytes for semiconductor memory modules. This convention ensures alignment with binary addressing, where modules are manufactured in powers of two, such as 8 GB equating to 8 × 2^{30} bytes (approximately 8.59 × 10^9 bytes). Operating systems reflect this in reporting; Linux kernels, for instance, interpret "kB" in /proc/meminfo as 1,024 bytes, while Windows displays RAM usage with MB denoting 1,048,576 bytes.
CPU caches similarly employ binary units due to their hierarchical, power-of-two structure. L1 caches commonly range from 16 KiB to 64 KiB per core (2^{14} to 2^{16} bytes), L2 from 256 KiB to 1 MiB (2^{18} to 2^{20} bytes), and shared L3 up to tens of MiB, as verifiable through system queries like Linux's lscpu command, which outputs sizes in KiB. This usage stems from cache line transfers, typically 64 bytes (2^6), which minimize overhead in binary-aligned architectures.
[Figure: GNOME System Monitor displaying memory size in binary units and network rate in decimal units]
In contrast, non-memory data rates like network bandwidth universally adopt decimal prefixes per industry standards, where 1 Mbps equals exactly 1,000,000 bits per second. The IEC endorses decimal notation here to distinguish it from binary contexts, avoiding ambiguity in transmission metrics. Software contexts, such as memory allocation in programming (e.g., resource limits expressed in MiB for 2^{20} bytes), also favor binary for precision in heap and buffer sizing, though explicit IEC symbols (KiB, MiB) see limited uptake outside technical documentation, with conventional MB/GB often implying binary values. This persistence reflects hardware realities over standardized nomenclature, prioritizing functional accuracy in binary-dominant operations.
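Because the same letters can imply different multipliers depending on the source, a parser for size strings has to commit to one interpretation. The sketch below takes the strict reading (SI symbols are decimal, IEC symbols are binary); the `UNITS` table and `parse_size` function are hypothetical and do not mirror any particular library.

```python
import re

# Strict interpretation: SI symbols are powers of 1000, IEC symbols are powers of 1024.
UNITS = {
    "B": 1,
    "kB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4,
    "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4,
}

_SIZE_PATTERN = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*")


def parse_size(text: str) -> int:
    """Parse strings such as '64 KiB' or '1.5 GB' into a byte count."""
    match = _SIZE_PATTERN.fullmatch(text)
    if match is None or match.group(2) not in UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return round(float(number) * UNITS[unit])


print(parse_size("64 KiB"), parse_size("8 GiB"), parse_size("1 MB"))
```

A strict parser like this would misread sources that label binary quantities with decimal symbols, such as the "kB" values in /proc/meminfo mentioned above, so real tools usually need a per-source override for the multiplier.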

Implications for Data Measurement Accuracy

The use of decimal prefixes (e.g., kilobyte as 1000 bytes) for quantities that are inherently binary multiples (e.g., 1024 bytes in RAM addressing or file system blocks) introduces systematic discrepancies in data measurement, with relative errors escalating from approximately 2.4% at the kilobyte level (1024/1000) to about 10% at the terabyte level (2^40 / 10^12 ≈ 1.0995). This mismatch arises because memory hardware and software operate on base-2 addressing and allocation, rendering decimal approximations imprecise for exact capacity reporting and producing apparent shortfalls when binary systems display advertised values. In storage devices, a drive marketed as 1 TB (10^12 bytes) is shown as only about 931 GiB (2^30 bytes per GiB) by operating systems using binary conventions, potentially causing errors in space allocation or in user expectations of available storage; no bytes are missing, but a full tebibyte would require about 99.5 billion more bytes (2^40 − 10^12 = 99,511,627,776). Binary prefixes like tebibyte (TiB = 2^40 bytes) eliminate this offset by aligning nomenclature directly with hardware realities, enabling more accurate forensic analysis and backup verification in environments handling petabyte-scale data.
For dynamic measurements such as memory usage, ambiguous prefixes compound errors in monitoring tools; for instance, assuming 1 MB = 10^6 bytes in a binary context can misstate reported utilization by about 4.9% (2^20 / 10^6 ≈ 1.0486), risking suboptimal resource provisioning or undetected overflows in embedded systems. Standards bodies emphasize that reserving SI decimal prefixes strictly for powers of 10 keeps measurements consistent, while binary prefixes match the base-2 quantities actually used in computation, reducing the propagation of inaccuracies in chained calculations like cumulative data transfers or archival checks.
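The percentages quoted in this article follow from one ratio, and it helps to separate the two ways of stating the gap. Writing $n$ for the prefix level (1 for kilo/kibi, 2 for mega/mebi, and so on; the symbols $r_n$ and $s_n$ are introduced here only for illustration), the excess of the binary unit over the decimal unit and the shortfall of the decimal unit relative to the binary one are:

$$
r_n = \frac{2^{10n}}{10^{3n}} - 1 = 1.024^{\,n} - 1,
\qquad
s_n = 1 - \frac{10^{3n}}{2^{10n}} = 1 - 1.024^{-n}.
$$

For $n = 1, 2, 3, 4$ this gives $r_n \approx 2.40\%,\ 4.86\%,\ 7.37\%,\ 9.95\%$ and $s_n \approx 2.34\%,\ 4.63\%,\ 6.87\%,\ 9.05\%$. The first series answers "how much larger is a TiB than a TB", the second "how much smaller does a 1 TB drive look when displayed in TiB", which is why figures near 7% and near 10% both appear in discussions of the same drives.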
