Windows-1250
| MIME / IANA | windows-1250 |
|---|---|
| Alias(es) | cp1250 (Code page 1250) |
| Languages | Czech, Polish, Slovak, Hungarian, Slovene, Serbo-Croatian (Latin script), Montenegrin, Romanian (before 1993 spelling reform), Turkmen, Rotokas, Albanian, English, German, Irish, Luxembourgish, Dutch |
| Created by | Microsoft |
| Standard | WHATWG Encoding Standard |
| Classification | extended ASCII, Windows-125x |
| Other related encoding | ISO-8859-2 |
Windows-1250 is a code page used under Microsoft Windows to represent texts in Central European and Eastern European languages that use the Latin script. It is primarily used for Czech.[1] It is also used for Polish (which Windows-1257 can also encode), Slovak, Hungarian, Slovene (also covered by Windows-1257), Serbo-Croatian (Latin script), Romanian (before a 1993 spelling reform) and Albanian (also covered by Windows-1252). It may also be used for German, though it lacks the uppercase ẞ.[a] German-language texts encoded in Windows-1250 and in Windows-1252 are identical.
Windows-1250 has been replaced by UTF-8 to a far greater extent than Windows-1252 has. As of March 2025, less than 0.05% of all web pages use Windows-1250.[2][3][4]
Windows-1250 is similar to ISO-8859-2 and contains all of its printable characters plus additional ones. However, a few of them are rearranged (unlike Windows-1252, which keeps all printable characters from ISO-8859-1 in the same place). Most of the rearrangements seem to have been made to keep characters shared with Windows-1252 in the same place, but three of the moved characters (Ą, Ľ, ź) cannot be explained this way: they do not occur in Windows-1252 and could have kept their ISO-8859-2 positions if ˇ had been placed elsewhere, e.g. at 9F.
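The three moved letters can be checked mechanically. The following is a minimal sketch, assuming Python's standard-library codecs `cp1250` and `iso8859_2` are acceptable stand-ins for Windows-1250 and ISO-8859-2; the helper name `byte_positions` is invented for this illustration.

```python
import unicodedata


def byte_positions(encoding):
    """Map each character in the upper half (0x80-0xFF) to its byte value."""
    table = {}
    for value in range(0x80, 0x100):
        try:
            char = bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            continue  # this byte is undefined in the codec
        table[char] = value
    return table


cp1250 = byte_positions("cp1250")
latin2 = byte_positions("iso8859_2")

# Characters present in both encodings but stored at different byte values,
# recorded as (ISO-8859-2 byte, Windows-1250 byte).
moved = {
    char: (latin2[char], cp1250[char])
    for char in cp1250
    if char in latin2 and latin2[char] != cp1250[char]
}

# U+0104, U+013D, U+017A are the three letters singled out in the text.
for char in "\u0104\u013d\u017a":
    old, new = moved[char]
    print(f"{unicodedata.name(char)}: 0x{old:02X} -> 0x{new:02X}")
```

The `moved` dictionary also contains the other relocated characters, so it can serve as a quick remapping reference when comparing the two tables.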
IBM uses code page 1250 (CCSID 1250 and euro sign extended CCSID 5346) for Windows-1250.[5][6][7][8][9][10][11]
Character set
The following table shows Windows-1250. Each character is shown with its Unicode equivalent.
|  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0x | NUL | SOH | STX | ETX | EOT | ENQ | ACK | BEL | BS | HT | LF | VT | FF | CR | SO | SI |
| 1x | DLE | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | EM | SUB | ESC | FS | GS | RS | US |
| 2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
| 6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | DEL |
| 8x | € |  | ‚ |  | „ | … | † | ‡ |  | ‰ | Š | ‹ | Ś | Ť | Ž | Ź |
| 9x |  | ‘ | ’ | “ | ” | • | – | — |  | ™ | š | › | ś | ť | ž | ź |
| Ax | NBSP | ˇ | ˘ | Ł | ¤ | Ą | ¦ | § | ¨ | © | Ş | « | ¬ | SHY | ® | Ż |
| Bx | ° | ± | ˛ | ł | ´ | µ | ¶ | · | ¸ | ą | ş | » | Ľ | ˝ | ľ | ż |
| Cx | Ŕ | Á | Â | Ă | Ä | Ĺ | Ć | Ç | Č | É | Ę | Ë | Ě | Í | Î | Ď |
| Dx | Đ | Ń | Ň | Ó | Ô | Ő | Ö | × | Ř | Ů | Ú | Ű | Ü | Ý | Ţ | ß |
| Ex | ŕ | á | â | ă | ä | ĺ | ć | ç | č | é | ę | ë | ě | í | î | ď |
| Fx | đ | ń | ň | ó | ô | ő | ö | ÷ | ř | ů | ú | ű | ü | ý | ţ | ˙ |
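The empty cells in rows 8x and 9x are the bytes with no assigned character. As a rough cross-check, the sketch below regenerates those rows from Python's `cp1250` codec, which is assumed here to follow the same mapping as the table; other implementations may treat the unassigned bytes differently. The helper names `decode_byte` and `table_row` are hypothetical.

```python
def decode_byte(value, encoding="cp1250"):
    """Return the character for one byte, or None if the codec has no mapping."""
    try:
        return bytes([value]).decode(encoding)
    except UnicodeDecodeError:
        return None


def table_row(high_nibble):
    """Return the 16 cells of one table row, e.g. high_nibble=0x8 for row 8x."""
    return [decode_byte((high_nibble << 4) | low) for low in range(16)]


undefined = [value for value in range(256) if decode_byte(value) is None]
print("undefined bytes:", [f"0x{value:02X}" for value in undefined])

# Print rows 8x and 9x as code points so the output stays ASCII-only.
for high in (0x8, 0x9):
    cells = ["----" if c is None else f"{ord(c):04X}" for c in table_row(high)]
    print(f"{high:X}x", " ".join(cells))
```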
Notes
- [a] In 2017, the Council for German Orthography officially adopted a capital ⟨ẞ⟩, before support for German was complete. Fully compatible with ISO/IEC 8859-1 for German texts.
References
1. "Distribution of Content Languages among websites that use Windows-1250". w3techs.com. Retrieved 2022-10-23.
2. "Historical trends in the usage of character encodings for websites, October 2022". w3techs.com.
3. "Frequently Asked Questions". w3techs.com.
4. "Distribution of Character Encodings among websites that use Czech". w3techs.com. Retrieved 2022-10-23.
5. "Code page 1250 information document". Archived from the original on 2016-03-03.
6. "CCSID 1250 information document". Archived from the original on 2016-03-27.
7. "CCSID 5346 information document". Archived from the original on 2014-11-29.
8. Code Page CPGID 01250 (PDF), IBM
9. Code Page CPGID 01250 (txt), IBM
10. International Components for Unicode (ICU), ibm-1250_P100-1995.ucm, 2002-12-03
11. International Components for Unicode (ICU), ibm-5346_P100-1998.ucm, 2002-12-03
12. Steele, Shawn (1998), CP1250 to Unicode table, Unicode Consortium, CP1250.TXT
Windows-1250
Introduction
Overview
Windows-1250 is an 8-bit, single-byte character encoding, designated code page 1250 by Microsoft. It maps 256 code points to characters, and the first 128 code points are identical to ASCII, covering the basic Latin letters, digits, and symbols.[3] This design ensures backward compatibility with ASCII while extending support to additional glyphs required for regional scripts.[4]
Developed as part of Microsoft's efforts to internationalize the Windows operating system, Windows-1250 was introduced in the 1990s to accommodate Central and Eastern European languages that use the Latin alphabet augmented with diacritical marks and special characters.[2] It primarily serves languages such as Polish, Czech, Slovak, Hungarian, Slovene, Croatian, Serbian (in Latin script), and Romanian, enabling proper representation of their native texts in software and documents.[2]
As a member of the Windows-125x family of code pages, which is intended for non-East Asian scripts, Windows-1250 parallels encodings like Windows-1252, which targets Western European languages, but focuses on the orthographic needs of its own region.[3]
Purpose and Scope
Windows-1250 was designed to provide an efficient single-byte encoding for Central European languages within Windows applications, addressing the limitations of ASCII and early ISO standards that lacked sufficient support for accented characters in these languages.[5][1]
The scope of Windows-1250 is limited to 256 code points, prioritizing Latin-based scripts with the diacritics required for languages such as Polish, Czech, Slovak, and Hungarian. It does not support non-Latin scripts like Cyrillic or Greek, which are instead covered by other code pages such as Windows-1251.[3][1]
This encoding is optimized for text processing in Windows environments, with an emphasis on compatibility for file input/output and display in regional settings so that legacy applications are handled seamlessly.[5] Compared to ASCII, Windows-1250 maintains full compatibility in the 0x00–0x7F range while extending the 0x80–0xFF byte values to include the accented letters and symbols needed by the target Central European languages.[5]
History and Development
Origins in Microsoft Code Pages
Windows-1250 was developed by Microsoft in the early 1990s as part of the internationalization efforts for Windows 3.1, released in April 1992, to support Central and Eastern European languages using the Latin script, such as Czech, Hungarian, Polish, and Slovak.[2] The code page built upon earlier OEM code pages like CP852, which had been used in DOS and OS/2 environments for similar linguistic regions, extending the character repertoire to better accommodate Windows-specific needs while maintaining compatibility with legacy systems.[6] Introduced in non-English versions of Windows 3.1 and Windows for Workgroups 3.11, it represented a shift toward region-specific encodings in Microsoft's ecosystem, diverging from the single ANSI code page used in the English edition.[7]
The code page drew on extensions to the ISO 646 standard and on ISO/IEC 8859-2 (Latin-2), but Microsoft customized it for integration with Windows font rendering technologies, including the TrueType fonts newly introduced in Windows 3.1, which enabled scalable typography for the accented characters common in Central European languages.[7] Rather than adhering strictly to international standards, these adaptations prioritized practical usability in Windows applications, such as improved support for diacritics and punctuation in graphical user interfaces. First prominently documented in 1993 alongside Windows NT 3.1, it was named "Windows Central European" to highlight its regional focus and distinguish it from ANSI-compliant alternatives.[2]
As part of the broader Windows-125x series of single-byte code pages, in which the "1250" designation denotes Central European coverage, Windows-1250 contrasted with Windows-1252, which targeted Western European languages.[7] This series emerged from Microsoft's need to provide localized text handling without relying solely on emerging Unicode standards, ensuring backward compatibility while expanding global reach in the pre-Windows 95 era.[2]
Standardization and Evolution
Windows-1250 was formally documented and registered with the Internet Assigned Numbers Authority (IANA) in 1996 under the MIME charset name "windows-1250", enabling its use in internet protocols and email standards.[2] Following its initial development in the 1990s, Windows-1250 underwent minor updates, most notably in 1998, when Microsoft added the euro sign (€) to several code pages, including 1250, ahead of the introduction of the euro currency. This update was integrated into Windows NT 4.0 service packs and later extended to Windows 95 and 98 for improved compatibility with Central European fonts and symbols.[7]
The encoding has seen no major revisions since its inception, remaining largely stable under Microsoft's proprietary control, which allowed for tweaks diverging from international standards like ISO/IEC 8859-2, such as the addition of characters for better Windows font rendering.
With the release of Windows XP in 2001, Microsoft began emphasizing Unicode (UTF-16 internally) for new applications, marking the start of deprecation trends for legacy code pages like Windows-1250, though full support persisted for backward compatibility in non-Unicode programs.[5][3] By the 2010s, Microsoft explicitly recommended UTF-8 over code pages for consistent internationalization in Windows applications, as outlined in developer guidelines promoting Unicode to avoid locale-specific inconsistencies.[3] Despite this shift, Windows-1250 continues to serve as the default ANSI code page for Central European locales (e.g., Polish, Czech) in Windows as of 2025, ensuring legacy software compatibility while encouraging migration to UTF-8 via system settings.[8][3]
Technical Specifications
Encoding Structure
Windows-1250 is an 8-bit single-byte character encoding scheme that defines 256 code points, corresponding to byte values 0x00 to 0xFF.[5][2] This fixed structure allows a straightforward mapping of characters to bytes, which suited legacy systems where processing efficiency was paramount. The encoding is designed as a superset of the 7-bit US-ASCII standard, ensuring compatibility with basic Latin text and control characters.[5]
The code point allocation divides the 256 slots into two ranges. Bytes 0x00 through 0x7F align directly with ASCII: control characters (such as 0x00 for null and 0x1F for unit separator) and printable basic Latin characters (from 0x20 space to 0x7E tilde). Bytes 0x80 through 0xFF are reserved for language-specific extensions, primarily targeting Central and Eastern European scripts that use the Latin alphabet.[5][9] This split maintains interoperability with ASCII-based systems while providing room for additional glyphs, such as accented letters and diacritics, without altering the foundational 128-character set.[5]
As a single-byte encoding, Windows-1250 represents each character with exactly one 8-bit byte, with none of the multi-byte sequences found in variable-length schemes like UTF-8. This fixed-width design simplified parsing and rendering in early computing environments, where variable-length encodings could add overhead to string manipulation and display routines.[5][2]
A notable feature is its deviation from ISO standards in the 0x80–0x9F range, where positions that ISO/IEC 8859-2 reserves for control codes are reassigned to printable characters (for instance, 0x8A maps to the uppercase Š, S with caron) to better accommodate practical needs in Windows applications.[9][2] Windows-1250 also requires no byte-order mark (BOM), because its single-byte format makes byte order and endianness irrelevant; text streams in this encoding can be processed directly without preamble indicators.[5][2]
Byte Values and Assignment
Windows-1250 employs an 8-bit structure in which byte values 0x00 through 0x7F correspond directly to the ASCII standard, ensuring compatibility with basic Latin text. Bytes 0x00 to 0x1F are assigned to control characters identical to those in ASCII, such as 0x00 for NULL and 0x0A for line feed. Bytes 0x20 to 0x7E map to the printable ASCII characters, including 0x20 for space and 0x7A for lowercase 'z'. The byte 0x7F is the delete control character.[10]
In the extended range 0x80 to 0xFF, Windows-1250 assigns 123 characters (all printable), primarily supporting Central European scripts through modified Latin letters and typographic symbols. Notably, 27 positions within 0x80 to 0x9F are redefined from the C1 control codes used in ISO/IEC 8859-1 and ISO/IEC 8859-2 to printable characters, such as 0x82 for the single low-9 quotation mark (U+201A) and 0x8D for the Latin capital letter T with caron (U+0164). The remaining five bytes in this subrange (0x81, 0x83, 0x88, 0x90, and 0x98) are undefined.[10][11]
The assignments in 0xA0 to 0xBF cover spacing characters, standalone diacritical marks, symbols, and accented letters in both cases, exemplified by 0xA0 for the non-breaking space (U+00A0) and 0xA1 for the caron (U+02C7). The range 0xC0 to 0xFF predominantly covers uppercase and lowercase letters with various accents along with a few additional symbols, such as 0xC0 for the Latin capital letter R with acute (U+0154) and 0xFF for the dot above (U+02D9). These mappings enable representation of characters unique to languages like Polish, Czech, and Hungarian.[10]
| Byte Range | Description of Assignments | Examples |
|---|---|---|
| 0x00–0x1F | ASCII control characters | 0x00: NULL (U+0000), 0x09: Horizontal tab (U+0009) |
| 0x20–0x7E | Printable ASCII characters | 0x21: Exclamation mark (!, U+0021), 0x41: Latin capital letter A (A, U+0041) |
| 0x7F | Delete control | 0x7F: Delete (U+007F) |
| 0x80–0x9F | Typographic symbols, letters with carons/acutes, and undefined positions (27 redefined from controls) | 0x82: Single low-9 quotation mark (‚, U+201A), 0x91: Left single quotation mark (‘, U+2018), 0x9A: Latin small letter s with caron (š, U+0161) |
| 0xA0–0xBF | Non-breaking space, diacritics, symbols, and accented letters | 0xA0: Non-breaking space (NBSP, U+00A0), 0xA2: Breve (˘, U+02D8), 0xA3: Latin capital letter L with stroke (Ł, U+0141), 0xAF: Latin capital letter Z with dot above (Ż, U+017B) |
| 0xC0–0xFF | Accented uppercase and lowercase letters, additional symbols | 0xC0: Latin capital letter R with acute (Ŕ, U+0154), 0xE0: Latin small letter r with acute (ŕ, U+0155), 0xFF: Dot above (˙, U+02D9) |
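The ranges in the table lend themselves to a small lookup helper. This is only an illustrative sketch: `describe` is a hypothetical function, and it assumes Python's `cp1250` codec and the `unicodedata` module as the source of mappings and character names.

```python
import unicodedata


def describe(value):
    """Return (range label, code point, character name) for one Windows-1250 byte."""
    if not 0 <= value <= 0xFF:
        raise ValueError("a single-byte encoding only has values 0x00-0xFF")
    if value <= 0x1F:
        label = "0x00-0x1F"
    elif value <= 0x7E:
        label = "0x20-0x7E"
    elif value == 0x7F:
        label = "0x7F"
    elif value <= 0x9F:
        label = "0x80-0x9F"
    elif value <= 0xBF:
        label = "0xA0-0xBF"
    else:
        label = "0xC0-0xFF"
    try:
        char = bytes([value]).decode("cp1250")
    except UnicodeDecodeError:
        # One of the unassigned positions in the 0x80-0x9F block.
        return label, None, "undefined"
    # Control characters have no Unicode name, so fall back to a placeholder.
    return label, f"U+{ord(char):04X}", unicodedata.name(char, "<control>")


for sample in (0x0A, 0x82, 0x8D, 0xA1, 0xC0, 0xFF, 0x81):
    print(f"0x{sample:02X}", describe(sample))
```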
Character Coverage
Basic Latin and Punctuation
The Basic Latin and Punctuation range in Windows-1250 corresponds to byte values 0x00 through 0x7F, encompassing 128 characters that form the core of the encoding.[4] This range is defined identically to the US-ASCII standard, ensuring backward compatibility with systems and software designed for 7-bit ASCII text processing.[4] As a single-byte encoding, Windows-1250 assigns these bytes directly to characters without alteration, facilitating the representation of basic English-language content in international contexts.[5]
The printable characters within this range include the 26 uppercase letters (A–Z, bytes 0x41–0x5A), 26 lowercase letters (a–z, bytes 0x61–0x7A), and the 10 digits (0–9, bytes 0x30–0x39).[9] Punctuation and symbols in this range include the exclamation mark (!) at 0x21, quotation mark (") at 0x22, number sign (#) at 0x23, dollar sign ($) at 0x24, percent sign (%) at 0x25, ampersand (&) at 0x26, apostrophe (') at 0x27, parentheses ( and ) at 0x28 and 0x29, asterisk (*) at 0x2A, plus sign (+) at 0x2B, comma (,) at 0x2C, hyphen-minus (-) at 0x2D, period (.) at 0x2E, slash (/) at 0x2F, colon (:) at 0x3A, semicolon (;) at 0x3B, less-than (<), equals (=), and greater-than (>) at 0x3C–0x3E, question mark (?) at 0x3F, at sign (@) at 0x40, square brackets [ and ] at 0x5B and 0x5D, backslash (\) at 0x5C, caret (^) at 0x5E, underscore (_) at 0x5F, grave accent (`) at 0x60, curly braces { and } at 0x7B and 0x7D, vertical bar (|) at 0x7C, and tilde (~) at 0x7E. These provide the essential tools for sentence structure, mathematical notation, and formatting in text.[9] Additionally, the space character at 0x20 serves as a fundamental delimiter.[9]
Control characters occupy bytes 0x00–0x1F and 0x7F, supporting non-printable functions critical for data interchange and display.[4] Notable examples include the null character (NUL) at 0x00 for string termination, line feed (LF) at 0x0A for vertical spacing in Unix-style line endings, carriage return (CR) at 0x0D for horizontal positioning in DOS-style endings, horizontal tab (HT) at 0x09 for columnar alignment, and delete (DEL) at 0x7F for erasing positions in teletype operations.[9] These controls enable consistent text formatting across files, terminals, and protocols.[5]
By mirroring US-ASCII in this foundational block, Windows-1250 supports the seamless integration of English terminology and symbols into documents primarily using Central European scripts, promoting interoperability without requiring translation for basic elements.[3]
Central European Extensions
Windows-1250 extends the Basic Latin and punctuation characters of the ASCII standard with 128 additional code points in the range 0x80–0xFF to support the diverse Latin-based scripts of Central and Eastern Europe. Of these, five code points (0x81, 0x83, 0x88, 0x90, 0x98) are undefined. These extensions focus on accented letters and diacritics essential for accurate representation of regional orthographies, enabling proper rendering of text in multiple languages without resorting to Unicode or other multi-byte encodings in legacy Windows environments.[10][2]
The character set includes key uppercase accented letters such as Á, Č, Ď, É, Ě, Í, Ň, Ó, Ř, Š, Ť, Ú, Ů, Ý, and Ž, along with their lowercase counterparts á, č, ď, é, ě, í, ň, ó, ř, š, ť, ú, ů, ý, and ž. These characters are grouped primarily by case, with uppercase forms concentrated in the 0xC0–0xDF range and lowercase in 0xE0–0xFF, while symbols and further letter variants are interspersed in the 0x80–0xBF blocks for compatibility with existing software layouts.[10]
Coverage is tailored to several languages, including Polish with characters like Ł and ł for the unique "w" sound, Czech and Slovak featuring ř and ů for specific phonetic distinctions, and Hungarian incorporating ő and ű for rounded vowels. Other supported languages include Croatian, Romanian, Serbian (Latin), and Slovenian, for a total of about 100 diacritic extensions that prioritize Latin alphabet variations over non-Latin scripts. This selection supports everyday text in these regions, such as literature, signage, and official documents.[10][2]
Windows-1250 also carries typographic elements such as the single low-9 quotation mark ‚, double low-9 quotation mark „, en dash –, and em dash —, which facilitate professional typesetting in European publications. The euro sign € was not part of the original design, which emphasized regional linguistic needs; it was added later at 0x80 via Windows updates in the late 1990s.[10][12]
Compatibility and Mappings
Relation to ISO/IEC 8859-2
Windows-1250 and ISO/IEC 8859-2 are both single-byte character encodings primarily designed to support Central and Eastern European languages that use Latin-based scripts with diacritical marks, such as Polish, Czech, Slovak, Hungarian, and Croatian.[7] Windows-1250 includes all the printable characters of ISO/IEC 8859-2 plus additional ones, but a few are rearranged, including key diacritics like acute accents, carons, and ogoneks.[13][14] For instance, the character Š (Latin capital letter S with caron, U+0160) appears in both encodings but is mapped to byte 0xA9 in ISO/IEC 8859-2 and 0x8A in Windows-1250. Specific rearrangements include Ą, Ľ, and ź.[13][14] A significant structural difference arises in the byte range 0x80–0x9F, which ISO/IEC 8859-2 reserves for non-printable C1 control codes, such as line tabulation set (0x8A) and single shift two (0x8E), to maintain compatibility with international standards for data interchange.[14] In contrast, Windows-1250 repurposes 27 of these 32 slots—leaving five undefined, similar to other Windows code pages—for printable characters tailored to graphical user interfaces, including typographic symbols like the en dash (– at 0x96), em dash (— at 0x97), and additional diacritics such as Ž (Latin capital letter Z with caron, U+017D at 0x8E).[13][7] This redefinition enhances support for common punctuation and quotes in Windows applications but introduces incompatibilities with pure ISO-compliant systems. 
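Because the same letter can sit at different byte values in the two encodings, conversion has to go through a character-level mapping rather than copying bytes. A minimal sketch, assuming Python's `iso8859_2` and `cp1250` codecs and a hypothetical helper named `latin2_to_cp1250`:

```python
def latin2_to_cp1250(data: bytes) -> bytes:
    """Transcode ISO/IEC 8859-2 bytes to Windows-1250 by going through Unicode.

    Raises UnicodeEncodeError if the input contains C1 control codes
    (0x80-0x9F), which Python's cp1250 codec cannot encode.
    """
    return data.decode("iso8859_2").encode("cp1250")


# 0xA9 and 0xBB are S with caron and t with caron in ISO/IEC 8859-2.
latin2_bytes = bytes([0xA9, 0xBB, 0x61])
cp1250_bytes = latin2_to_cp1250(latin2_bytes)
print(cp1250_bytes.hex())

# Reading the ISO bytes directly as Windows-1250 raises no error,
# but silently yields different characters (here: copyright sign, guillemet).
misread = latin2_bytes.decode("cp1250")
print([f"U+{ord(c):04X}" for c in misread])

# In ISO/IEC 8859-2 the byte 0x8A is a C1 control, not a letter.
c1_control = bytes([0x8A]).decode("iso8859_2")
```

Going through Unicode keeps the remapping table out of the application code; the cost is that input containing C1 controls must be handled explicitly, for example by catching the encode error.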
Beyond the C1 range, Windows-1250 adds unique glyphs not present in the standard, such as the euro sign (€ at 0x80) and single angle quotation marks (‹ at 0x8B and › at 0x9B, U+2039 and U+203A).[13] While Windows-1250 includes all core printable characters from ISO/IEC 8859-2, the positional shifts for a few characters mean it is not a strict superset; direct byte-to-byte conversion can result in mojibake for mismatched characters.[14] A few characters, such as Ľ (Latin capital letter L with caron, U+013D, at 0xA5 in ISO/IEC 8859-2 versus 0xBC in Windows-1250), require explicit remapping during conversion to preserve fidelity.[13][14] The Internet Assigned Numbers Authority (IANA) maintains them as distinct registered character sets, underscoring that they are not interchangeable without transformation.[15]
Historically, the divergence traces to ISO/IEC 8859-2's finalization as an international standard in 1987, aimed at uniform encoding across diverse systems.[16] Windows-1250 emerged in 1992 as part of Microsoft Windows 3.1, positioned as a "Windows Latin-2" variant with optimizations for the operating system's text rendering and localization needs, including the printable C1 extensions to better accommodate UI elements like dialog boxes and menus.[7][2] This evolution reflects Microsoft's proprietary adaptations for enhanced usability in regional Windows deployments, finalized around 1993.[7]
Unicode Conversion Guidelines
Windows-1250 provides a direct one-to-one mapping from its defined characters to Unicode code points, drawing mainly on the Latin-1 Supplement (U+0080 to U+00FF) for accented Latin letters and the Latin Extended-A block (U+0100 to U+017F) for additional Central European letters such as those with carons and rings.[10] For instance, the byte 0xC1 maps to U+00C1 (LATIN CAPITAL LETTER A WITH ACUTE), while 0x8A maps to U+0160 (LATIN CAPITAL LETTER S WITH CARON).[17] This structure ensures that the 251 defined characters in Windows-1250, including the ASCII-compatible bytes 0x00 to 0x7F, each correspond to a unique Unicode scalar value without overlap or ambiguity.[10]
Converting from Windows-1250 to Unicode involves the code page's Windows-specific additions, such as 0x8A (Š, U+0160), which occupies a position that ISO/IEC 8859-2 reserves for a control code. These pose no significant challenge: all 251 defined characters map losslessly to Unicode, and only the 5 undefined bytes (such as 0x81) have no assigned character.[10] No lossy conversions are required, because the Windows-1250 repertoire is a strict subset of Unicode's Basic Multilingual Plane.[11] Round-trip compatibility is complete: encoding a Unicode string to Windows-1250 and decoding it back recovers the exact original, provided only supported characters are used.[10]
Practical guidelines for conversion emphasize established APIs and detection methods. In Windows environments, the MultiByteToWideChar function with the code page identifier 1250 converts Windows-1250 bytes to UTF-16 wide characters, handling the mapping automatically for strings or buffers.[18] To detect Windows-1250 in files or streams, which carry no byte-order mark (BOM), rely on heuristics such as the presence of bytes in the 0x80–0xFF range typical of Central European text, combined with validation against the undefined byte values.[19] For web applications or modern interoperability, convert to UTF-8 as the target encoding, which supports display and transmission of these characters across platforms and preserves the diacritics of languages like Polish and Czech.[10]
Usage and Applications
In Windows Operating Systems
Windows-1250 serves as the default ANSI code page for Central European locales, such as Czech, Hungarian, Polish, Slovak, and Slovenian, in Microsoft Windows operating systems from Windows 95 through Windows 11.[3][20] In these locales, it is used by applications like Notepad for saving and opening text files in ANSI format and by Windows Explorer for handling non-Unicode file names, ensuring compatibility with legacy text-based operations.[5][21]
The encoding is integrated into Windows through the code page APIs, where it is identified by the numeric code page identifier 1250. Developers pass this identifier to functions like MultiByteToWideChar and WideCharToMultiByte to convert between Windows-1250 and Unicode strings.[3] Font support is provided by system fonts such as Arial CE (the Central European variant) and Tahoma, which include glyphs for the extended Latin characters defined in Windows-1250.[22] Users can switch code pages via regional settings in the Control Panel, under the non-Unicode language options, to accommodate different locales without reinstalling the OS.[5]
In early Windows versions, such as Windows 95, Windows-1250 was essential for maintaining compatibility with MS-DOS applications using OEM code page 852, bridging the gap between console and graphical interfaces.[7] However, starting with Windows NT, Microsoft shifted internal string storage to UTF-16 (initially UCS-2), deprecating code pages like Windows-1250 for core system operations while retaining them for legacy support.[23] As of 2025, Windows-1250 remains available in the Win32 API for backward-compatible applications, though Microsoft documentation strongly recommends migrating to Unicode to avoid encoding limitations and ensure global compatibility.[5][1]
In Web and Legacy Software
Windows-1250 is declared as the character encoding in HTML documents using the <meta charset="windows-1250"> tag, allowing browsers to correctly interpret text content for Central and Eastern European languages.[24] This declaration ensures proper rendering of characters such as accented letters in Polish, Czech, and Hungarian. Modern web browsers, including Google Chrome and Microsoft Edge, provide support for Windows-1250 primarily to handle legacy websites, decoding pages that specify this encoding via HTTP headers or meta tags.[25] The official IANA MIME name for this encoding is "windows-1250", which is used in HTTP Content-Type headers like text/html; charset=windows-1250 to signal the encoding to clients.[2]
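On the receiving side, a client has to read the charset label from the Content-Type header before decoding the body. The sketch below shows one way to do that with Python's standard library; the function names `charset_from_content_type` and `decode_body` are hypothetical, and Python's codec registry is only assumed to be a reasonable approximation of how a browser resolves encoding labels (browsers follow their own label table).

```python
import codecs
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Extract the charset parameter from a Content-Type header value."""
    message = Message()
    message["Content-Type"] = header_value
    # Returns the lowercased charset parameter, or `default` if none is given.
    return message.get_content_charset(default)


def decode_body(body, header_value):
    """Decode response bytes using the charset declared in the header."""
    label = charset_from_content_type(header_value)
    # codecs.lookup resolves aliases; "windows-1250" resolves to "cp1250".
    return body.decode(codecs.lookup(label).name)


header = "text/html; charset=windows-1250"
body = bytes([0x44, 0x6F, 0x62, 0x72, 0xFD])  # "Dobr" + y with acute (0xFD)
text = decode_body(body, header)
print(charset_from_content_type(header), [f"U+{ord(c):04X}" for c in text])
```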
In legacy software from the 1990s and 2000s, Windows-1250 was commonly employed in applications targeting Central and Eastern European users, including older email clients that processed messages with regional diacritics and databases such as pre-Unicode versions of Oracle, which supported it as the EE8MSWIN1250 character set for storing and retrieving text data. Mis-detection of this encoding in such systems often results in mojibake, where the bytes are interpreted under another charset such as Windows-1252, so that a character such as "Ł" (byte 0xA3) is displayed as "£".[26]
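The mechanics of such mojibake are easy to reproduce. The following is a small sketch, assuming the wrongly assumed charset is Windows-1252 (Python codec `cp1252`); `misread` and `repair` are hypothetical helper names. The repair only works when every byte involved is defined in both code pages.

```python
def misread(text, actual="cp1250", assumed="cp1252"):
    """Encode text in its real charset, then decode it with the wrong one."""
    return text.encode(actual).decode(assumed)


def repair(garbled, actual="cp1250", assumed="cp1252"):
    """Undo a misread by reversing the wrong decode and applying the right one."""
    return garbled.encode(assumed).decode(actual)


original = "\u0141\u00f3d\u017a"  # the Polish city name Lodz, with diacritics
garbled = misread(original)
print([f"U+{ord(c):04X}" for c in garbled])

# Decoding the same bytes as UTF-8 does not produce other letters at all:
# the bytes are not valid UTF-8, so replacement characters (U+FFFD) appear.
as_utf8 = original.encode("cp1250").decode("utf-8", errors="replace")
print([f"U+{ord(c):04X}" for c in as_utf8])
```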
As of 2025, Windows-1250 is rarely used in new web development due to the dominance of UTF-8, which accounts for over 97% of websites, while Windows-1250 appears on less than 0.1% of sites with known encodings.[27] However, it persists in some Eastern European government and enterprise systems for backward compatibility with legacy data and applications. The W3C recommends against specifying legacy encodings like Windows-1250 in HTTP headers for new content, favoring UTF-8 to ensure universal interoperability.[28] In Unix-like environments, tools such as GNU iconv facilitate conversions involving Windows-1250, for example, using the command iconv -f WINDOWS-1250 -t UTF-8 input.txt > output.txt to migrate legacy files to modern standards.
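Where iconv is not available, the same migration can be approximated in a few lines of Python. This is a sketch under stated assumptions: `convert_file` is a hypothetical helper, the input is assumed to really be Windows-1250, and the whole file is read into memory, so it suits small legacy files rather than large data sets.

```python
from pathlib import Path


def convert_file(source, target, source_encoding="cp1250"):
    """Rewrite a Windows-1250 text file as UTF-8, leaving line endings untouched."""
    raw = Path(source).read_bytes()
    # Decoding raises UnicodeDecodeError if the file contains one of the
    # bytes that Windows-1250 leaves undefined, which flags a wrong guess.
    text = raw.decode(source_encoding)
    Path(target).write_bytes(text.encode("utf-8"))
    return len(text)


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        count = convert_file(sys.argv[1], sys.argv[2])
        print(f"converted {count} characters")
```

Working on bytes rather than in text mode avoids any newline translation, so CRLF line endings from Windows-era files are preserved in the UTF-8 output.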