Latin script
| Latin (Roman) | |
|---|---|
| Period | c. 700 BC – present |
| Direction | Left-to-right |
| Languages | See List of Latin-script alphabets |
| ISO 15924 | Latn (215), Latin |
| Unicode alias | Latin |
| Unicode | See Latin characters in Unicode |
The Latin script, also known as the Roman script, is a writing system based on the letters of the classical Latin alphabet, derived from a form of the Greek alphabet which was in use in the ancient Greek city of Cumae in Magna Graecia. The Greek alphabet was altered by the Etruscans, and subsequently their alphabet was altered by the Ancient Romans. Several Latin-script alphabets exist, which differ in graphemes, collation and phonetic values from the classical Latin alphabet.
The Latin script is the basis of the International Phonetic Alphabet (IPA), and the 26 most widespread letters are the letters contained in the ISO basic Latin alphabet, which are the same letters as the English alphabet.
Latin script is the basis for the largest number of alphabets of any writing system[1] and is the most widely adopted writing system in the world. Latin script is used as the standard method of writing the languages of Western and Central Europe, most of sub-Saharan Africa, the Americas, and Oceania, as well as many languages in other parts of the world.
Name
The script is either called Latin script or Roman script, in reference to its origin in ancient Rome (though some of the capital letters are Greek in origin). In the context of transliteration, the term "romanization" (British English: "romanisation") is often found.[2][3] Unicode uses the term "Latin"[4] as does the International Organization for Standardization (ISO).[5]
The numeral system is called the Roman numeral system, and the collection of the elements is known as the Roman numerals. The numbers 1, 2, 3 ... are Latin/Roman script numbers for the Hindu–Arabic numeral system.
ISO basic Latin alphabet
| Uppercase Latin alphabet | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Lowercase Latin alphabet | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z |
The use of the letters I and V for both consonants and vowels proved inconvenient as the Latin alphabet was adapted to Germanic and Romance languages. W originated as a doubled V (VV) used to represent the voiced labial–velar approximant /w/ found in Old English as early as the 7th century. It came into common use in the later 11th century, replacing the letter wynn ⟨Ƿ ƿ⟩, which had been used for the same sound. In the Romance languages, the minuscule form of V was a rounded u; from this was derived a rounded capital U for the vowel in the 16th century, while a new, pointed minuscule v was derived from V for the consonant. In the case of I, a word-final swash form, j, came to be used for the consonant, with the unswashed form restricted to vowel use. Such conventions were erratic for centuries. J was introduced into English for the consonant in the 17th century (it had been rare as a vowel), but it was not universally considered a distinct letter in the alphabetic order until the 19th century.
Spread
Latin-script alphabets are also used extensively in some areas where the script has no official status, owing to unofficial second languages, such as French in Morocco and English in Egypt, and to Latin transliteration of the official script, such as pinyin in China.
The Latin alphabet spread, along with Latin, from the Italian Peninsula to the lands surrounding the Mediterranean Sea with the expansion of the Roman Empire. The eastern half of the Empire, including Greece, Turkey, the Levant, and Egypt, continued to use Greek as a lingua franca, but Latin was widely spoken in the western half, and as the western Romance languages evolved out of Latin, they continued to use and adapt the Latin alphabet.
Middle Ages
With the spread of Western Christianity during the Middle Ages, the Latin alphabet was gradually adopted by the peoples of Northern Europe who spoke Celtic languages (displacing the Ogham alphabet) or Germanic languages (displacing earlier Runic alphabets) or Baltic languages, as well as by the speakers of several Uralic languages, most notably Hungarian, Finnish and Estonian.
The Latin script also came into use for writing the West Slavic languages and several South Slavic languages, as the people who spoke them adopted Roman Catholicism. The speakers of East Slavic languages generally adopted Cyrillic along with Orthodox Christianity. The Serbian language uses both scripts, with Cyrillic predominating in official communication and Latin elsewhere, as determined by the Law on Official Use of the Language and Alphabet.[6]
Since the 16th century
As late as 1500, the Latin script was limited primarily to the languages spoken in Western, Northern, and Central Europe. The Orthodox Christian Slavs of Eastern and Southeastern Europe mostly used Cyrillic, and the Greek alphabet was in use by Greek speakers around the eastern Mediterranean. The Arabic script was widespread within Islam, both among Arabs and non-Arab nations like the Iranians, Indonesians, Malays, and Turkic peoples. Most of the rest of Asia used a variety of Brahmic scripts or the Chinese script.
Through European colonization the Latin script has spread to the Americas, Oceania, parts of Asia, Africa, and the Pacific, in forms based on the Spanish, Portuguese, English, French, German and Dutch alphabets.
It is used for many Austronesian languages, including the languages of the Philippines and the Malaysian and Indonesian languages, replacing earlier Arabic and indigenous Brahmic alphabets. Latin letters served as the basis for the forms of the Cherokee syllabary developed by Sequoyah; however, the sound values are completely different.
Under Portuguese missionary influence, a Latin alphabet was devised for the Vietnamese language, which had previously used Chinese characters. This Latin-based alphabet replaced Chinese characters in administration in the 19th century under French rule. Portuguese and other European missionaries, who arrived in Goa on the west coast of India in the sixteenth and seventeenth centuries, introduced the Roman script for the Konkani language, an Indo-Aryan language.[7]
Since the 19th century
In the late 19th century, the Romanians returned to the Latin alphabet, dropping the Romanian Cyrillic alphabet. Romanian is one of the Romance languages.
Since the 20th century
In 1928, as part of Mustafa Kemal Atatürk's reforms, the new Republic of Turkey adopted a Latin alphabet for the Turkish language, replacing a modified Arabic alphabet. Most of the Turkic-speaking peoples of the former USSR, including Tatars, Bashkirs, Azeri, Kazakh, Kyrgyz and others, had their writing systems replaced by the Latin-based Uniform Turkic alphabet in the 1930s; but, in the 1940s, all were replaced by Cyrillic.
After the collapse of the Soviet Union in 1991, three of the newly independent Turkic-speaking republics, Azerbaijan, Uzbekistan, Turkmenistan, as well as Romanian-speaking Moldova, officially adopted Latin alphabets for their languages. Kyrgyzstan, Iranian-speaking Tajikistan, and the breakaway region of Transnistria kept the Cyrillic alphabet, chiefly due to their close ties with Russia.
In the 1930s and 1940s, the majority of Kurds replaced the Arabic script with two Latin alphabets. Although only the official Kurdish government uses an Arabic alphabet for public documents, the Latin Kurdish alphabet remains widely used throughout the region by the majority of Kurdish-speakers.
In 1957, the People's Republic of China introduced a script reform to the Zhuang language, changing its orthography from Sawndip, a writing system based on Chinese, to a Latin script alphabet that used a mixture of Latin, Cyrillic, and IPA letters to represent both the phonemes and tones of the Zhuang language, without the use of diacritics. In 1982 this was further standardised to use only Latin script letters.
With the collapse of the Derg and subsequent end of decades of Amharic assimilation in 1991, various ethnic groups in Ethiopia dropped the Geʽez script, which was deemed unsuitable for languages outside of the Semitic branch.[8] In the following years the Kafa,[9] Oromo,[10] Sidama,[11] Somali,[11] and Wolaitta[11] languages switched to Latin while there is continued debate on whether to follow suit for the Hadiyya and Kambaata languages.[12]
21st century
On 15 September 1999 the authorities of Tatarstan, Russia, passed a law to make the Latin script a co-official writing system alongside Cyrillic for the Tatar language by 2011.[13] A year later, however, the Russian government overruled the law and banned Latinization on its territory.[14]
In 2015, the government of Kazakhstan announced that a Kazakh Latin alphabet would replace the Kazakh Cyrillic alphabet as the official writing system for the Kazakh language by 2025.[15] There are also talks about switching from the Cyrillic script to Latin in Ukraine,[16] Kyrgyzstan,[17][18] and Mongolia.[19] Mongolia, however, has since opted to revive the Mongolian script instead of switching to Latin.[20]
In October 2019, Inuit Tapiriit Kanatami (ITK), the national organization for Inuit in Canada, announced that it would introduce a unified writing system for the Inuit languages in the country. The writing system is based on the Latin alphabet and is modeled after the one used in the Greenlandic language.[21]
On 12 February 2021 the government of Uzbekistan announced it will finalize the transition from Cyrillic to Latin for the Uzbek language by 2023. Plans to switch to Latin originally began in 1993 but subsequently stalled and Cyrillic remained in widespread use.[22][23]
At present the Crimean Tatar language uses both Cyrillic and Latin. The use of Latin was originally approved by Crimean Tatar representatives after the Soviet Union's collapse[24] but was never implemented by the regional government. After Russia's annexation of Crimea in 2014 the Latin script was dropped entirely. Nevertheless, Crimean Tatars outside of Crimea continue to use Latin and on 22 October 2021 the government of Ukraine approved a proposal endorsed by the Mejlis of the Crimean Tatar People to switch the Crimean Tatar language to Latin by 2025.[25]
As of July 2020, 2.6 billion people (36% of the world population) used the Latin alphabet.[26]
International standards
By the 1960s, it became apparent to the computer and telecommunications industries in the First World that a non-proprietary method of encoding characters was needed. The International Organization for Standardization (ISO) encapsulated the Latin alphabet in their (ISO/IEC 646) standard. To achieve widespread acceptance, this encapsulation was based on popular usage.
As the United States held a preeminent position in both industries during the 1960s, the standard was based on the already published American Standard Code for Information Interchange, better known as ASCII, which included in the character set the 26 × 2 (uppercase and lowercase) letters of the English alphabet. Later standards issued by the ISO, for example ISO/IEC 10646 (Unicode Latin), have continued to define the 26 × 2 letters of the English alphabet as the basic Latin alphabet with extensions to handle other letters in other languages.
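The 26 × 2 letters described above occupy two contiguous ranges in ASCII, which Unicode carries over unchanged. The following is a minimal Python sketch of that layout; the helper name `is_basic_latin_letter` is hypothetical and chosen only for illustration.

```python
import string

# The 26 x 2 letters of the basic Latin alphabet, as exposed by Python's stdlib.
UPPER = string.ascii_uppercase  # 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
LOWER = string.ascii_lowercase  # 'abcdefghijklmnopqrstuvwxyz'


def is_basic_latin_letter(ch: str) -> bool:
    """Return True if ch is one of the 52 basic Latin letters (hypothetical helper)."""
    return len(ch) == 1 and ("A" <= ch <= "Z" or "a" <= ch <= "z")


# In ASCII the two cases sit in contiguous ranges exactly 0x20 apart.
assert [ord(c) for c in "AZaz"] == [0x41, 0x5A, 0x61, 0x7A]
assert all(ord(lo) - ord(up) == 0x20 for up, lo in zip(UPPER, LOWER))

print(is_basic_latin_letter("W"))  # True
print(is_basic_latin_letter("ñ"))  # False: a letter outside the basic set
```

Letters outside these ranges, such as ⟨ñ⟩ or ⟨ß⟩, are the "extensions" that later standards add on top of the basic set.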
National standards
The DIN standard DIN 91379 specifies a subset of Unicode letters, special characters, and sequences of letters and diacritic signs to allow the correct representation of names and to simplify data exchange in Europe. This specification supports all official languages of the European Union and European Free Trade Association countries (thus also the Greek and Cyrillic scripts), plus the German minority languages. To allow the transliteration of names in other writing systems to the Latin script according to the relevant ISO standards, all necessary combinations of base letters and diacritic signs are provided.[27] Efforts are being made to further develop it into a European CEN standard.[28]
As used by various languages
In the course of its use, the Latin alphabet was adapted for use in new languages, sometimes representing phonemes not found in languages that were already written with the Roman characters. To represent these new sounds, extensions were therefore created, be it by adding diacritics to existing letters, by joining multiple letters together to make ligatures, by creating completely new forms, or by assigning a special function to pairs or triplets of letters. These new forms are given a place in the alphabet by defining an alphabetical order or collation sequence, which can vary with the particular language.
Letters
Some examples of new letters to the standard Latin alphabet are the Runic letters wynn ⟨Ƿ ƿ⟩ and thorn ⟨Þ þ⟩, and the letter eth ⟨Ð/ð⟩, which were added to the alphabet of Old English. Another Irish letter, the insular g, developed into yogh ⟨Ȝ ȝ⟩, used in Middle English. Wynn was later replaced with the new letter ⟨w⟩, eth and thorn with ⟨th⟩, and yogh with ⟨gh⟩. Although the four are no longer part of the English or Irish alphabets, eth and thorn are still used in the modern Icelandic alphabet, while eth is also used by the Faroese alphabet.
Some West, Central and Southern African languages use a few additional letters that have sound values similar to those of their equivalents in the IPA. For example, Adangme uses the letters ⟨Ɛ ɛ⟩ and ⟨Ɔ ɔ⟩, and Ga uses ⟨Ɛ ɛ⟩, ⟨Ŋ ŋ⟩ and ⟨Ɔ ɔ⟩. Hausa uses ⟨Ɓ ɓ⟩ and ⟨Ɗ ɗ⟩ for implosives, and ⟨Ƙ ƙ⟩ for an ejective. Africanists have standardized these into the African reference alphabet.
Dotted and dotless I — ⟨İ i⟩ and ⟨I ı⟩ — are two forms of the letter I used by the Turkish, Azerbaijani, and Kazakh alphabets.[29] The Azerbaijani language also has ⟨Ə ə⟩, which represents the near-open front unrounded vowel.
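The two I pairs matter in software because the default, language-neutral Unicode case mapping pairs ⟨I⟩ with ⟨i⟩. The sketch below, in Python, shows one way to apply the Turkish and Azerbaijani pairing instead; `turkish_lower` and `turkish_upper` are hypothetical helper names, and real applications would more likely rely on a locale-aware library.

```python
# Default (language-neutral) case mapping pairs I with i,
# which is wrong for Turkish and Azerbaijani text.
TR_LOWER = str.maketrans({"I": "ı", "İ": "i"})
TR_UPPER = str.maketrans({"i": "İ", "ı": "I"})


def turkish_lower(text: str) -> str:
    """Lowercase text using the dotted/dotless I pairs (hypothetical helper)."""
    return text.translate(TR_LOWER).lower()


def turkish_upper(text: str) -> str:
    """Uppercase text using the dotted/dotless I pairs (hypothetical helper)."""
    return text.translate(TR_UPPER).upper()


print("ISPARTA".lower())          # isparta  (default mapping, dotted i)
print(turkish_lower("ISPARTA"))   # ısparta  (dotless ı preserved)
print(turkish_upper("istanbul"))  # İSTANBUL (dotted capital İ)

# Without special handling, lowercasing İ yields two code points:
# 'i' followed by U+0307 COMBINING DOT ABOVE.
print(len("İ".lower()))           # 2
```

Remapping the four I forms before calling the built-in methods keeps every other letter on the default mapping.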
Multigraphs
A digraph is a pair of letters used to write one sound or a combination of sounds that does not correspond to the written letters in sequence. Examples are ⟨ch⟩, ⟨ng⟩, ⟨rh⟩, ⟨sh⟩, ⟨ph⟩, ⟨th⟩ in English, and ⟨ij⟩, ⟨ee⟩, ⟨ch⟩ and ⟨ei⟩ in Dutch. In Dutch the ⟨ij⟩ is capitalized as ⟨IJ⟩ or the ligature ⟨Ĳ⟩, but never as ⟨Ij⟩, and it often takes the appearance of a ligature ⟨ĳ⟩ very similar to the letter ⟨ÿ⟩ in handwriting.
A trigraph is made up of three letters, like the German ⟨sch⟩, the Breton ⟨c'h⟩ or the Milanese ⟨oeu⟩. In the orthographies of some languages, digraphs and trigraphs are regarded as independent letters of the alphabet in their own right. The capitalization of digraphs and trigraphs is language-dependent, as only the first letter may be capitalized, or all component letters simultaneously (even for words written in title case, where letters after the digraph or trigraph are left in lowercase).
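Language-dependent capitalization of multigraphs can be illustrated with the Dutch ⟨ij⟩. This is a minimal Python sketch that assumes only the word-initial digraph needs special treatment; `capitalize_dutch` is a hypothetical helper, not part of any standard library.

```python
def capitalize_dutch(word: str) -> str:
    """Capitalize a Dutch word, treating an initial 'ij' digraph as one letter.

    Hypothetical helper: it handles only the word-initial digraph and
    ignores rarer edge cases.
    """
    if word[:2].lower() == "ij":
        return "IJ" + word[2:].lower()
    return word[:1].upper() + word[1:].lower()


print("ijsselmeer".capitalize())       # Ijsselmeer (not the Dutch convention)
print(capitalize_dutch("ijsselmeer"))  # IJsselmeer
print(capitalize_dutch("amsterdam"))   # Amsterdam
```

The generic `str.capitalize` only uppercases the first code point, which is why a language-specific rule is needed here.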
Ligatures
A ligature is a fusion of two or more ordinary letters into a new glyph or character. Examples are ⟨Æ æ⟩ (from ⟨AE⟩, called ash), ⟨Œ œ⟩ (from ⟨OE⟩, sometimes called oethel or eðel), the abbreviation ⟨&⟩ (from Latin: et, lit. 'and', called ampersand), and ⟨ẞ ß⟩ (from ⟨ſʒ⟩ or ⟨ſs⟩, the archaic medial form of ⟨s⟩, followed by an ⟨ʒ⟩ or ⟨s⟩, called sharp S or eszett).
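How such fused forms behave in text processing depends on whether they are encoded as letters or as presentation forms. The Python sketch below uses only the standard `unicodedata` module and built-in string methods to show the difference; it is an illustration rather than a complete treatment.

```python
import unicodedata

# æ and œ are encoded as letters in their own right: compatibility
# normalization (NFKD) leaves them unchanged.
print(unicodedata.normalize("NFKD", "æœ") == "æœ")  # True

# A purely typographic ligature such as U+FB01 folds back to its parts.
print(unicodedata.normalize("NFKD", "\ufb01"))      # fi

# ß has no single-character uppercase in the default case mapping,
# although the capital form ẞ (U+1E9E) lowercases to it.
print("ß".upper())                                  # SS
print("ẞ".lower())                                  # ß
print("Straße".casefold() == "STRASSE".casefold())  # True
```

Case folding is the safer comparison for text containing ⟨ß⟩, since a round trip through `upper()` and `lower()` does not restore the original character.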
Diacritics
A diacritic, in some cases also called an accent, is a small symbol that can appear above or below a letter, or in some other position, such as the umlaut sign used in the German characters ⟨ä⟩, ⟨ö⟩, ⟨ü⟩ or the Romanian characters ă, â, î, ș, ț. Its main function is to change the phonetic value of the letter to which it is added, but it may also modify the pronunciation of a whole syllable or word, indicate the start of a new syllable, or distinguish between homographs such as the Dutch words een (pronounced [ən]), meaning "a" or "an", and één (pronounced [e:n]), meaning "one". As with the pronunciation of letters, the effect of diacritics is language-dependent.
English is the only major modern European language that requires no diacritics for its native vocabulary.[note 1] Historically, in formal writing, a diaeresis was sometimes used to indicate the start of a new syllable within a sequence of letters that could otherwise be misinterpreted as being a single vowel (e.g., "coöperative", "reëlect"), but modern writing styles either omit such marks or use a hyphen to indicate a syllable break (e.g. "co-operative", "re-elect").[note 2][30]
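In Unicode, most letter-plus-diacritic combinations can be decomposed into a base letter followed by combining marks. The following Python sketch uses the standard `unicodedata` module to show that decomposition; `strip_diacritics` is a hypothetical helper name.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# 'ä' decomposes into the base letter plus U+0308 COMBINING DIAERESIS.
for ch in unicodedata.normalize("NFD", "ä"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

print(strip_diacritics("één"))     # een
print(strip_diacritics("façade"))  # facade
```

Stripping marks is lossy: the Dutch words één and een become identical, which is exactly the distinction the diacritic exists to preserve.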
Collation
Some modified letters, such as the symbols ⟨å⟩, ⟨ä⟩, and ⟨ö⟩, may be regarded as new individual letters in themselves, and assigned a specific place in the alphabet for collation purposes, separate from that of the letter on which they are based, as is done in Swedish. In other cases, such as with ⟨ä⟩, ⟨ö⟩, ⟨ü⟩ in German, this is not done; letter-diacritic combinations being identified with their base letter. The same applies to digraphs and trigraphs. Different diacritics may be treated differently in collation within a single language. For example, in Spanish, the character ⟨ñ⟩ is considered a letter, and sorted between ⟨n⟩ and ⟨o⟩ in dictionaries, but the accented vowels ⟨á⟩, ⟨é⟩, ⟨í⟩, ⟨ó⟩, ⟨ú⟩, ⟨ü⟩ are not separated from the unaccented vowels ⟨a⟩, ⟨e⟩, ⟨i⟩, ⟨o⟩, ⟨u⟩.
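The Swedish, German and Spanish rules above can be sketched as a sort key built from an alphabet string. This is a simplified Python illustration, not a full collation algorithm: `make_sort_key` and the alphabet constants are hypothetical, tie-breaking between accented and unaccented forms is ignored, and production code would more likely use a dedicated collation library.

```python
import unicodedata

SPANISH = "abcdefghijklmnñopqrstuvwxyz"    # ñ is a letter between n and o
SWEDISH = "abcdefghijklmnopqrstuvwxyzåäö"  # å, ä, ö follow z
GERMAN = "abcdefghijklmnopqrstuvwxyz"      # ä, ö, ü sort with a, o, u


def make_sort_key(alphabet: str):
    """Build a sort key for a given alphabet order (simplified, hypothetical).

    Letters listed in `alphabet` keep their own rank; any other letter is
    reduced to its base letter by dropping combining marks.
    """
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def base(ch: str) -> str:
        if ch in rank:
            return ch
        stripped = "".join(
            c for c in unicodedata.normalize("NFD", ch)
            if not unicodedata.combining(c)
        )
        return stripped or ch

    def key(word: str) -> list[int]:
        # NFC ensures letters such as ñ are single code points before lookup.
        normalized = unicodedata.normalize("NFC", word.lower())
        return [rank.get(base(ch), len(alphabet)) for ch in normalized]

    return key


print(sorted(["oso", "ñu", "nube", "nácar"], key=make_sort_key(SPANISH)))
# ['nácar', 'nube', 'ñu', 'oso']
print(sorted(["öl", "zebra", "äpple", "apa", "år"], key=make_sort_key(SWEDISH)))
# ['apa', 'zebra', 'år', 'äpple', 'öl']
print(sorted(["Zug", "Ärger", "Apfel"], key=make_sort_key(GERMAN)))
# ['Apfel', 'Ärger', 'Zug']
```

Sorting by raw code point instead would place every accented letter after ⟨z⟩, regardless of language.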
Capitalization
The languages that use the Latin script today generally use capital letters to begin paragraphs and sentences and proper nouns. The rules for capitalization have changed over time, and different languages have varied in their rules for capitalization. Old English, for example, was rarely written with even proper nouns capitalized; whereas Modern English of the 18th century frequently had all nouns capitalized, in the same way that Modern German is written today, e.g. German: Alle Schwestern der alten Stadt hatten die Vögel gesehen, lit. 'All of the Sisters of the old City had seen the Birds'.
Romanization
Words from languages natively written with other scripts, such as Arabic or Chinese, are usually transliterated or transcribed when embedded in Latin-script text or in multilingual international communication, a process termed romanization.
Whilst the romanization of such languages is used mostly at unofficial levels, it has been especially prominent in computer messaging where only the limited seven-bit ASCII code is available on older systems. However, with the introduction of Unicode, romanization is now becoming less necessary. Keyboards used to enter such text may still restrict users to romanized text, as only ASCII or Latin-alphabet characters may be available.
Notes
- ^ In formal English writing, however, diacritics are often preserved on many loanwords, such as "café", "naïve", "façade", "jalapeño" or the German prefix "über-".
- ^ As an example, an article containing a diaeresis in "coöperate" and a cedilla in "façade" as well as a circumflex in the word "crêpe": Grafton, Anthony (23 October 2006). "Books: The Nutty Professors, The history of academic charisma". The New Yorker.
References
Citations
- ^ Haarmann 2004, p. 96.
- ^ "Search results | BSI Group". Bsigroup.com. Retrieved 12 May 2014.[permanent dead link]
- ^ "Romanisation_systems". Pcgn.org.uk. Archived from the original on 27 June 2014. Retrieved 12 May 2014.
- ^ "ISO 15924 – Code List in English". Unicode.org. Archived from the original on 26 May 2013. Retrieved 22 July 2013.
- ^ "Search – ISO". International Organization for Standardization. Archived from the original on 13 May 2014. Retrieved 12 May 2014.
- ^ "Zakon O Službenoj Upotrebi Jezika I Pisama" (PDF). Ombudsman.rs. 17 May 2010. Archived from the original (PDF) on 14 July 2014. Retrieved 5 July 2014.
- ^ Jain, Danesh; Cardona, George (26 July 2007). The Indo-Aryan Languages. Routledge. p. 804. ISBN 978-1-135-79710-2. Retrieved 3 August 2025.
- ^ Smith, Lahra (2013). "Review of Making Citizens in Africa: Ethnicity, Gender, and National Identity in Ethiopia". African Studies. 125 (3): 542–544. doi:10.1080/00083968.2015.1067017. S2CID 148544393. Archived from the original on 16 November 2021. Retrieved 16 November 2021 – via Taylor & Francis.
- ^ Pütz, Martin (1997). Language Choices: Conditions, constraints, and consequences. John Benjamins Publishing. p. 216. ISBN 9789027275844.
- ^ Gemeda, Guluma (18 June 2018). "The History and Politics of the Qubee Alphabet". Ayyaantuu. Archived from the original on 16 November 2021. Retrieved 16 November 2021.
- ^ a b c Yohannes, Mekonnen (2021). Language Policy in Ethiopia. Vol. 24. p. 33. doi:10.1007/978-3-030-63904-4. ISBN 978-3-030-63903-7. S2CID 234114762. Archived from the original on 22 February 2021. Retrieved 16 November 2021 – via Springer Link.
- ^ Pasch, Helma (2008). "Competing scripts: The Introduction of the Roman Alphabet in Africa" (PDF). International Journal of the Sociology of Language (191): 8. Archived (PDF) from the original on 16 November 2021. Retrieved 16 November 2021 – via ResearchGate.
- ^ Andrews, Ernest (2018). Language Planning in the Post-Communist Era: The Struggles for Language Control in the New Order in Eastern Europe, Eurasia and China. Springer. p. 132. ISBN 978-3-319-70926-0.
- ^ Faller, Helen (2011). Nation, Language, Islam: Tatarstan's Sovereignty Movement. Central European University Press. p. 131. ISBN 978-963-9776-84-5.
- ^ "Kazakh language to be converted to Latin alphabet – MCS RK". Kazinform. 30 January 2015. Archived from the original on 19 February 2017. Retrieved 28 September 2015.
- ^ "Klimkin welcomes discussion on switching to Latin alphabet in Ukraine". UNIAN. 27 March 2018. Archived from the original on 3 October 2021. Retrieved 5 August 2019.
- ^ Goble, Paul (12 October 2017). "Moscow Bribes Bishkek to Stop Kyrgyzstan From Changing to Latin Alphabet". Jamestown. Archived from the original on 21 February 2021. Retrieved 5 August 2019.
- ^ Rickleton, Chris (13 September 2019). "Kyrgyzstan: Latin (alphabet) fever takes hold". Eurasianet. Archived from the original on 2 July 2021. Retrieved 16 September 2019.
- ^ Mikovic, Nikola (2 March 2019). "Russian Influence in Mongolia is Declining". Global Security Review. Archived from the original on 24 February 2021. Retrieved 5 August 2019.
- ^ Tang, Didi (20 March 2020). "Mongolia abandons Soviet past by restoring alphabet". The Times. ISSN 0140-0460. Archived from the original on 22 April 2021. Retrieved 2 March 2021.
- ^ "Canadian Inuit Get Common Written Language". High North News (8 October 2019). Archived from the original on 17 August 2021. Retrieved 8 October 2019.
- ^ Sands, David (12 February 2021). "Latin lives! Uzbeks prepare latest switch to Western-based alphabet". The Washington Times. Archived from the original on 15 February 2021. Retrieved 15 February 2021.
- ^ "Uzbekistan Aims For Full Transition To Latin-Based Alphabet By 2023". Radio Free Europe/Radio Liberty. 12 February 2021. Archived from the original on 31 December 2022. Retrieved 15 February 2021.
- ^ Kuzio, Taras (2007). Ukraine - Crimea - Russia: Triangle of Conflict. Columbia University Press. p. 106. ISBN 978-3-8382-5761-7.
- ^ "Cabinet approves Crimean Tatar alphabet based on Latin letters". Ukrinform. 22 October 2021. Archived from the original on 7 October 2021. Retrieved 17 November 2021.
- ^ "The world's scripts and alphabets". WorldStandards. Archived from the original on 9 August 2020. Retrieved 11 August 2020.
- ^ "DIN 91379:2022-08: Characters and defined character sequences in Unicode for the electronic processing of names and data exchange in Europe, with CD-ROM". Beuth Verlag. Archived from the original on 19 August 2022. Retrieved 19 August 2022.
- ^ Koordinierungsstelle für IT-Standards (KoSIT). "String.Latin+ 1.2: eine kommentierte und erweiterte Fassung der DIN SPEC 91379. Inklusive einer umfangreichen Liste häufig gestellter Fragen. Herausgegeben von der Fachgruppe String.Latin. (zip, 1.7 MB)" [String.Latin+ 1.2: Commented and extended version of DIN SPEC 91379.] (in German). Archived from the original on 19 January 2022. Retrieved 19 March 2022.
- ^ "Localize Your Font: Turkish i". Glyphs. Archived from the original on 28 January 2021. Retrieved 28 January 2021.
- ^ "The New Yorker's odd mark — the diaeresis". 16 December 2010. Archived from the original on 16 December 2010. Retrieved 8 March 2022.
Sources
- Haarmann, Harald (2004). Geschichte der Schrift [History of Writing] (in German) (2nd ed.). München: C. H. Beck. ISBN 978-3-406-47998-4.
Further reading
- Boyle, Leonard E. 1976. "Optimist and recensionist: 'Common errors' or 'common variations.'" In Latin script and letters A.D. 400–900: Festschrift presented to Ludwig Bieler on the occasion of his 70th birthday. Edited by John J. O'Meara and Bernd Naumann, 264–74. Leiden, The Netherlands: Brill.
- Morison, Stanley. 1972. Politics and script: Aspects of authority and freedom in the development of Graeco-Latin script from the sixth century B.C. to the twentieth century A.D. Oxford: Clarendon.
External links
- Unicode collation chart—Latin letters sorted by shape
- Diacritics Project – All you need to design a font with correct accents Archived 23 October 2018 at the Wayback Machine
Origins and Early Development
Proto-Latin and Etruscan Influences
The Latin script emerged through the adaptation of the Etruscan alphabet by speakers of early Latin in central Italy during the 8th to 7th centuries BC, reflecting direct borrowing of letter forms and writing conventions to represent Indo-European Italic phonemes.[11] The Etruscan system, comprising 26 letters derived from the Cumaean (western Greek) variant used in the Greek colony of Cumae near Naples, provided the visual and structural template, with early Latin reducing this to approximately 21 characters by eliminating Greek aspirates (such as theta, phi, and chi) that lacked equivalents in Latin's sound inventory.[2][12] This selective retention prioritized utility for Latin's velar and sibilant distinctions, though initial ambiguities persisted, such as using a single "C" for both /k/ and /g/ sounds until the introduction of "G" around 230 BC.[13]
Proto-Latin inscriptions, the earliest attestations of this adapted script, date from the 7th century BC and showcase Etruscan-derived features like reversed letter orientations, right-to-left directionality, and occasional boustrophedon (alternating direction) layouts inherited from Etruscan practice.[14][15] The Praeneste fibula, a gold brooch unearthed near modern Palestrina, bears the inscription "Manios med fhefhaked Numasioi" (interpreted as "Manius made me for Numerius"), confirmed genuine through metallurgical and paleographic analysis, marking it as the oldest known Latin text with angular, monumental letter forms mirroring southern Etruscan styles.[16] Subsequent artifacts, such as the 6th-century BC Duenos inscription on a vase, further illustrate these traits, with letters like the early "F" (resembling Etruscan digamma) and "S" (lunate form) evidencing unstandardized variants before classical regularization.[17]
Etruscan influence extended beyond morphology to orthographic habits, including the use of three sibilant signs (later unified in Latin) and numeral systems, facilitating the script's role in recording votive, funerary, and dedicatory texts amid Rome's growing dominance over neighboring Italic groups.[18]
Archaic and Classical Forms
The archaic forms of the Latin script appeared in the mid-7th century BC, derived from the Etruscan adaptation of western Greek alphabets.[19] The earliest known inscription is on the Praeneste fibula, dating to around 650 BC, bearing the text "MANIOS MED FHEFHAKED NUMASIOI," which translates roughly to "Manius made me for Numerius."[20] This artifact demonstrates early letter forms with angular strokes suited for metal engraving, including variants like a reversed S and a digamma-like F.[21] Another key example is the Duenos inscription on a ceramic vessel from Rome, dated to the 6th century BC, featuring three lines of text in a more developed but still irregular script.[22]
The archaic Latin alphabet comprised 21 letters: A, B, C, D, E, F, Z, H, I, K, L, M, N, O, P, Q, R, S, T, V, X, with C serving dual duty for both /k/ and /g/ sounds.[19] Z was included initially but later dropped due to the rarity of the /z/ phoneme in Latin.[23] Letter shapes exhibited variability, often more monumental and less refined than later versions, with some inscriptions showing right-to-left directionality or boustrophedon style in transitional phases.[7]
Transition to classical forms occurred during the 3rd to 1st centuries BC, marked by orthographic reforms including the introduction of G around 230 BC to distinguish /g/ from /k/, with G taking the place of Z in the sequence.[23] By the 1st century BC, Y had been added and Z restored for transcribing Greek loanwords, expanding the inventory to 23 letters.[7] This period saw standardization driven by expanding Roman administration and literacy, reducing archaic variations.
Classical Latin script, solidified by the late Republic, featured formal monumental styles such as capitalis quadrata, characterized by geometric proportions and serifs, used for stone inscriptions from the 1st century BC onward.[24] Rustic capitals emerged for papyrus documents, with narrower, more condensed forms for efficient writing.[25] These majuscule scripts lacked distinct minuscules, relying on all-caps for clarity in public and literary contexts, reflecting the script's adaptation to imperial needs.[24]
Historical Evolution
Medieval Adaptations
During the early Middle Ages, following the decline of the Western Roman Empire, the Latin script fragmented into regional variants derived from late antique forms such as uncial and half-uncial, adapting to local scribal practices and vernacular influences in monastic scriptoria across Europe.[26] These adaptations prioritized legibility for copying religious texts amid varying linguistic needs, with scribes in isolated regions developing distinctive letterforms to accommodate phonetic distinctions in emerging Romance and Germanic languages.[27]
One prominent early adaptation was the Insular script, originating in Ireland around the 7th century and spreading to Anglo-Saxon England by the 8th century, characterized by its rounded minuscules, elongated ascenders and descenders, and insular majuscules for initials.[28] Derived from half-uncial, it was employed for both Latin manuscripts and Old English or Irish glosses, persisting in Ireland until the late Middle Ages and facilitating the preservation of patristic works during the Hiberno-Scottish mission.[26] Its aesthetic emphasized verticality and decorative ligatures, reflecting Celtic artistic traditions, though it gradually yielded to Carolingian influences in continental contacts.[29]
The most influential medieval reform occurred during the Carolingian Renaissance, when Charlemagne's educational initiatives from 789 onward promoted a standardized minuscule script to unify liturgical and scholarly texts across the Frankish Empire.[27] Initiated around 778 at Corbie Abbey and refined by Alcuin of York after his arrival in 781, the Carolingian minuscule featured clear, proportional lowercase letters with consistent ascenders and descenders, developing from earlier Merovingian cursives while drawing on Insular and Roman models for uniformity.[30] By approximately 820, it dominated scriptoria from England to Italy, enabling efficient production of codices and serving as a precursor to modern lowercase forms due to its readability on parchment.[31]
From the 12th century, Gothic scripts evolved as denser alternatives to Carolingian minuscule, particularly in northern Europe, with textualis forms featuring angular strokes, fused letters, and reduced counter spaces to fit more text per page amid rising demand for legal and theological manuscripts.[32] Originating in the Frankish-Anglo-Saxon-German regions, these "blackletter" styles, including littera textualis, prioritized angularity for quill efficiency on paper and vellum, spreading via university centers like Paris and Bologna by the 13th century.[33] Regional subtypes, such as the rounded Rotunda in Italy and the rigid forms in Germany, adapted to local printing presses later, but in manuscript form, they reflected pragmatic responses to scribal workload rather than aesthetic revival of antiquity.[32]
Renaissance Standardization
The Renaissance marked a pivotal phase in the standardization of the Latin script, driven by Italian humanists' efforts to revive classical Roman letterforms amid a broader revival of antiquity. In the late 14th and early 15th centuries, scholars rejected the angular, condensed Gothic scripts prevalent in medieval Europe, which they viewed as obscuring textual clarity, and instead modeled new handwriting styles on surviving ancient Roman inscriptions and Carolingian minuscule manuscripts. This humanist minuscule, characterized by rounded, proportionate lowercase letters with distinct ascenders and descenders, emerged around 1400 in Florence and Padua, emphasizing legibility and aesthetic fidelity to antiquity.[34][35]
Poggio Bracciolini (1380–1459), a Florentine scribe and papal secretary, played a central role in this reform by meticulously copying classical texts in a reformed script that revived the clarity of Carolingian models while eliminating Gothic abbreviations and flourishes. Developed while Poggio worked under patrons like Coluccio Salutati, the script featured smaller minim heights, careful letter spacing, and a return to antique proportions, influencing subsequent scribes and laying groundwork for printed typefaces. His approach prioritized empirical recovery of ancient forms from rediscovered manuscripts, such as those he unearthed in monastic libraries, over medieval innovations.[36][35]
The invention of the movable-type printing press by Johannes Gutenberg circa 1440 accelerated this standardization by enabling mass reproduction of uniform letterforms. Initial European imprints, like Gutenberg's 1455 Bible, employed blackletter (Gothic) types derived from regional manuscripts, but Italian printers swiftly adopted Roman types based on humanist minuscule for Latin classics. In 1465, Arnold Pannartz and Conrad Sweynheym at Subiaco near Rome produced the first books in roman typeface, including editions of Cicero, which featured upright capitals inspired by imperial Roman inscriptions and lowercase letters mirroring Poggio's script. This shift propagated standardized Latin script across printed works, fixing the 23-letter classical alphabet (A–Z without J, U, and W as distinct letters) in durable metal type.[37][38]
Further refinement came through the Venetian printer Aldus Manutius (c. 1449–1515), who collaborated with the punchcutter Francesco Griffo. Griffo cut the roman type used for Pietro Bembo's De Aetna (1495–96) and later the first italic typeface, which debuted in the Aldine octavo Virgil of 1501 and slanted its letters to emulate swift humanist cursive while maintaining readability. Manutius's Aldine Press standardized roman and italic pairings in compact octavo editions of the classics, introducing consistent punctuation like the semicolon and parentheses to enhance textual flow. By the early 16th century, these innovations had supplanted regional variations and established the Latin script's modern skeletal structure, with serif roman for body text and italic for emphasis. That structure spread via trade and scholarship and brought lasting uniformity to European typography.[39][40]
Enlightenment and National Orthographies
The Enlightenment era, spanning roughly the late 17th to late 18th centuries, marked a concerted effort to apply rational principles to vernacular orthographies, adapting the Latin script to national languages through grammars, dictionaries, and academies that emphasized uniformity, etymology, and phonetic representation where feasible. Influenced by the prestige of classical Latin's perceived logical structure, European scholars produced orthographic manuals and rules that reduced inconsistencies arising from medieval scribal variations and dialectal diversity, a process facilitated by the spread of printing presses. This rationalist approach prioritized clarity for emerging national literatures and administrative needs, often favoring conservative forms that preserved historical spellings over radical phonetic reforms, though debates on simplification persisted.[41][42][43]
In England, Samuel Johnson's A Dictionary of the English Language, published on April 15, 1755, established authoritative spellings for over 42,000 words, codifying forms like "receive" and "believe" based on prevailing usage and etymological roots rather than strict phonetics, thereby stabilizing English orthography amid ongoing variability. This work influenced subsequent printers and educators, embedding Latin-derived conventions into standard practice despite criticisms from reformers advocating phonetic alignment. Similarly, in France, the Académie Française's Dictionnaire, first published in 1694 and revised in 1718 and 1740, imposed rules favoring etymological consistency, such as retaining silent letters in words like parfait, to align vernacular writing with classical models while suppressing regional variants.[44]
Across German-speaking regions, Enlightenment figures like Johann Christoph Gottsched promoted orthographic reforms. In his 1748 Grundlegung einer deutschen Sprachkunst, Gottsched advocated simplified spellings and consistent use of the Latin script's basic letters, though full national standardization awaited later unification efforts; his work drew on Latin grammar traditions to argue for logical vowel representation without diacritics.
In Spain, the Real Academia Española, founded in 1713, issued its first orthographic guidelines in the 1740s, standardizing accents and conventions for Castilian to counter phonetic drift, reflecting Enlightenment ideals of purity and rationality. These national initiatives collectively reinforced the Latin script's dominance in Europe by embedding it in codified systems that balanced tradition with reform, laying groundwork for 19th-century expansions.[44][43]
Mechanisms of Global Spread
Roman Empire and Early Christianity
The Latin script served as the foundational writing system for Roman imperial administration, military records, legal edicts, and monumental inscriptions throughout the Empire's expansion from 27 BCE onward. Accompanying conquests and colonization, it disseminated from the Italian Peninsula to provinces in Gaul, Hispania, Britannia, North Africa, and the eastern frontiers, where local elites adopted it for communication in Latin alongside indigenous systems.[45][46] Refined over time through epigraphic use on coins, milestones, and public works, the script had by the 1st century CE achieved a standardized classical form of 23 letters (without the later additions J, U, and W), enabling efficient recording of laws, senatorial decrees, and historical accounts.[47][48]
In everyday governance and trade, the script's utility in rendering the Latin language, spoken by approximately 50–100 million people at the Empire's peak around 150 CE, facilitated bureaucratic cohesion across diverse regions, supplanting or coexisting with scripts like Greek in the East and Punic in Africa.[45] Dedicatory inscriptions on aqueducts, roads, and monuments (e.g., the 2nd-century CE Trajan's Column) exemplified its monumental application, with letter proportions and serifs evolving for legibility in stone carving.[47] This widespread epigraphy, numbering in the tens of thousands of surviving examples from the imperial era, underscores the script's role in asserting Roman cultural dominance and literacy, estimated at 10–20% among urban males.[46]
Early Christianity, emerging in the 1st century CE within a predominantly Greek-speaking eastern milieu, initially relied on Greek script for scriptures and liturgy, but Latin usage gained traction in the western provinces by the late 2nd century as converts from Roman society sought vernacular accessibility. Tertullian (c. 155–240 CE), a North African theologian, produced the earliest substantial body of Christian prose in Latin, including treatises like Apologeticus (c. 197 CE), which defended the faith against pagan critiques using the script's established imperial conventions.[49] This shift reflected practical pressures: the Church's growth among Latin-speaking provincials necessitated translations of Greek texts, fostering script adaptation for doctrinal works and epistles.
A landmark in this adoption was the Vulgate translation of the Bible by Eusebius Hieronymus (St. Jerome), commissioned by Pope Damasus I in 382 CE and substantially completed by 405 CE, which rendered Hebrew, Aramaic, and Greek sources into idiomatic Latin using the contemporary script.[50] The Vulgate's four Gospels and Old Testament revisions standardized orthography and phrasing for ecclesiastical use, circulating in codices that preserved the script amid rising illiteracy after the crises of the 3rd century.[50] By the 4th and 5th centuries, as the Western Empire fragmented after 395 CE, Christian communities in Rome, Carthage, and Gaul employed the Latin script for conciliar acts (e.g., Council of Nicaea records adapted westward) and patristic writings, ensuring its continuity in monastic and liturgical contexts where Greek waned.[51] This ecclesiastical entrenchment, independent of imperial patronage after Constantine's 313 CE Edict of Milan, positioned the script as a vector for theological transmission, with scribes refining uncial and half-uncial forms for parchment durability.[49]
European Colonialism and Missions
European colonial expansion from the late 15th century onward disseminated the Latin script to the Americas, Africa, parts of Asia, and Oceania, primarily through administrative imposition, educational systems, and religious missions.[52] Spanish and Portuguese colonizers, beginning with Christopher Columbus's voyages in 1492, established viceroyalties in the Americas where Latin script became the medium for governance, legal documents, and literacy instruction. In regions like Mexico and Peru, Franciscan and Dominican friars arrived shortly after conquest and, by the 1540s, had developed orthographies for indigenous languages such as Nahuatl and Quechua, using Latin letters to facilitate evangelization and record native grammars.[53]
Catholic missions played a pivotal role in entrenching Latin script literacy among indigenous populations, often prioritizing conversion over preservation of pre-existing recording systems like Mesoamerican pictographs or Andean quipus. In the Philippines, acquired by Spain in 1565, Augustinian and Jesuit missionaries supplanted the Baybayin script with Latin-based orthographies for Tagalog and other Austronesian languages, enabling the printing of doctrinas and catechisms by 1593.[54] Portuguese efforts in Brazil from 1500 similarly introduced Latin script, with Jesuit colleges establishing schools that taught reading and writing in Portuguese orthography to both settlers and natives by the mid-16th century.[52]
In Africa, the Latin script's adoption accelerated during the 19th-century Scramble for Africa, when British, French, and Belgian colonial administrations, alongside Protestant and Catholic missionaries, standardized it for over 2,000 African languages lacking prior widespread scripts.[55] Mission stations, such as those run by the Church Missionary Society in Nigeria from 1845, produced vernacular Bibles and primers in Latin letters, displacing or marginalizing indigenous systems like Ajami in favor of romanization for administrative efficiency and proselytization.[55] By independence in the mid-20th century, Latin script dominated official orthographies across sub-Saharan Africa, reflecting the intertwined colonial and missionary legacies.[52]
Protestant missions in the 19th and early 20th centuries further propelled this trend in Oceania and residual Asian outposts, with figures like Samuel Marsden establishing schools in New Zealand from 1814 that used Latin script for Maori orthographies developed by Thomas Kendall.[52] This pattern underscored how European powers leveraged the script's phonetic adaptability and association with Christianity to consolidate control, resulting in its entrenchment even after decolonization.
19th-20th Century National Reforms
In the 19th century, Romania transitioned from the Cyrillic alphabet, inherited from Orthodox Church influences, to a Latin-based script to emphasize its Romance linguistic roots and distinguish it from Slavic neighbors. This re-latinization process accelerated after the 1848 revolutions, with intellectuals advocating for phonetic alignment with Latin origins; the Romanian Academy formalized the Latin alphabet's adoption in 1862, standardizing spelling rules that incorporated diacritics like ă, â, î, and ț to represent unique phonemes.[56]
During the early 20th century, Norway implemented orthographic reforms to align written Danish-influenced Bokmål more closely with spoken urban varieties, while developing Nynorsk as a rural-based standard. The 1907 reform replaced many Danish "soft" consonant spellings with "hard" ones (p, t, k for b, d, g) to match Norwegian pronunciation. The 1917 reform further reduced Danish elements, introduced "å" in place of "aa", and promoted convergence between the two forms to foster national unity after independence from Sweden in 1905.[57]
In Turkey, Mustafa Kemal Atatürk's 1928 language reform replaced the Arabic-based Ottoman script with a Latin alphabet tailored to Turkish phonology, including letters like ç, ğ, ı, ö, ş, and ü. Announced in August 1928 and enacted by law on November 1, the change aimed to boost literacy (from under 10% to over 20% within a year) by simplifying writing and severing ties to Arabic religious texts, with mandatory implementation in education and public use by 1929.[58][59]
The Soviet Union pursued a latinization campaign from the mid-1920s to early 1930s, targeting non-Slavic ethnic groups to eradicate illiteracy and counter Cyrillic-associated Russian imperialism and Orthodox influence. New Latin-derived alphabets, such as Yanalif for Turkic languages, were developed for over 40 languages, reaching millions through literacy drives. By 1936–1937, however, Stalin reversed the policy amid geopolitical shifts, mandating a switch to Cyrillic to reinforce Soviet unity, leaving only temporary gains in Yakut and some others before full Cyrillization.[60][61]
Vietnam's adoption of the Latin-based Quốc ngữ script, originally devised by 17th-century Portuguese missionaries, gained momentum under French colonial rule in the late 19th and early 20th centuries as a tool for administration and education, replacing the complex Chữ Nôm and Chữ Hán systems. By the 1910s, it had supplanted traditional scripts in newspapers and schools, and it gained full official status after independence in 1945, driven by its phonetic efficiency for tonal Vietnamese despite initial resistance from Confucian elites.[62][63]
Germany's orthographic efforts included the 1901 conference, which standardized some spellings but brought limited immediate change, and culminated in the 1996 reform, which simplified rules for compounds, capitalization, and the "ss/ß" distinction and was implemented from 1998 amid public debate over tradition versus clarity.[64]
Post-1945 Adoptions and Digital Globalization
Following the dissolution of the Soviet Union in 1991, several Turkic-speaking former republics initiated transitions from the Cyrillic alphabet to Latin-based scripts as part of national identity assertions and modernization efforts. Uzbekistan began a gradual shift to a Latin alphabet in 1993, with a final draft approved in 2019, though Cyrillic remains in parallel use.[65] Turkmenistan completed its full adoption of a Latin script by 1993, replacing Cyrillic entirely for official purposes.[66] Azerbaijan transitioned between 1991 and 2001, establishing a Latin alphabet standardized in 1996.[67] These reforms, motivated by distancing from Russian influence and aligning with Turkey's 1928 Latinization, affected populations of over 60 million across these states, though implementation varied in completeness.[68]
In Southeast Asia, post-colonial independence reinforced Latin script usage. Indonesia, upon declaring independence in 1945, standardized the Latin alphabet for Bahasa Indonesia, building on Dutch colonial precedents and replacing the earlier Arabic-derived Jawi script in official contexts.[69] Vietnam's Democratic Republic adopted the Latin-based Quốc ngữ as the national script in 1945, supplanting chữ Nôm and classical Chinese characters amid literacy campaigns that raised adult literacy from under 20% in the 1930s to over 90% by the 2000s. These adoptions facilitated administrative unification and education in newly sovereign states, with Latin's phonetic simplicity aiding rapid dissemination compared to logographic or abjad systems.
The advent of digital technologies from the mid-20th century amplified Latin script's global reach through encoding standards favoring its structure. The American Standard Code for Information Interchange (ASCII), ratified in 1963, allocated 128 code points primarily to unaccented Latin letters, digits, and English punctuation, enabling efficient early computing in English-dominant environments.[70] This 7-bit system underpinned ARPANET protocols and personal computers, embedding Latin primacy in software, keyboards, and data transmission. Unicode, introduced in 1991, expanded to over 149,000 characters by 2023 but retained ASCII compatibility via the UTF-8 encoding, which uses a single byte for each Basic Latin character and multi-byte sequences for all others, thus preserving efficiency for Latin-heavy content.[71]
Digital globalization entrenched Latin dominance as the internet proliferated from the 1990s, with over 50% of global websites using Latin scripts by 2001 due to U.S.-led infrastructure and English as the de facto digital lingua franca.[72] UTF-8's adoption as the web standard by 2008 minimized barriers for Latin users, while non-Latin scripts faced higher costs in font rendering and input methods, contributing to English's share of online content exceeding 50% although English is the first language of only about 5% of the world's population.[73] Kazakhstan's ongoing Cyrillic-to-Latin transition, targeting completion by 2025, explicitly cites enhanced digital integration and Turkic alignment as rationales, reflecting the link between script choice and technological interoperability.[74] This dynamic has spurred romanization in auxiliary roles, such as Pinyin for Chinese in global tech interfaces, underscoring Latin's role in bridging linguistic divides without supplanting native scripts.
Core Alphabetic Structure
ISO Basic Latin Alphabet
The ISO Basic Latin Alphabet consists of 26 uppercase letters (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z) and their 26 lowercase counterparts (a b c d e f g h i j k l m n o p q r s t u v w x y z), totaling 52 characters without diacritics, ligatures, or other modifications.[75] This set represents the minimal, unextended form of the Latin script standardized for international compatibility, particularly in computing and data interchange.[76] It matches the English alphabet and excludes both the accented letters used in languages such as French (e.g., é) and additional letters such as the German ß, treating only the base forms as canonical.[52]
Standardized through efforts beginning in the 1960s, the alphabet emerged as part of ISO/IEC 646, a 7-bit character encoding designed to ensure consistent representation of Latin letters across national variants of telegraphic and computing codes.[75] Prior to this, variations in national standards (e.g., differing symbols for punctuation) complicated interoperability; the basic Latin set provided a neutral core, assigning the uppercase letters to hexadecimal code points 41–5A and the lowercase letters to 61–7A in both ASCII and ISO/IEC 646 IRV (International Reference Version).[76] This standardization facilitated the global adoption of digital text processing by prioritizing the 26-letter inventory over locale-specific extensions.[52]
In practice, the ISO Basic Latin Alphabet underpins the Unicode Basic Latin block (U+0000–U+007F), which extends it with control characters, digits, and basic punctuation but preserves the alphabetic core for rendering in environments lacking support for extended scripts.[75] It is employed verbatim in English orthography and serves as the foundational repertoire for many romanization systems, in which languages normally written in other scripts are transcribed using only these letters to minimize encoding complexity.[76] Languages with fuller Latin usage, such as Portuguese or Dutch, rely on this base while adding diacritics as needed, but the ISO set ensures baseline portability in plain-text applications.[52]
| Uppercase | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Lowercase | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z |
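
The code-point layout described above can be checked with a short Python sketch. It is an illustration only, not part of any standard: the names `UPPER`, `LOWER`, and `is_basic_latin_letter` are hypothetical, and the script assumes only the hexadecimal ranges 41–5A and 61–7A quoted in the text. It rebuilds the 52 letters from those ranges and shows that a basic letter occupies one byte in UTF-8 while an accented letter does not.

```python
import string

# Hexadecimal code-point ranges quoted in the text (assumed inclusive).
UPPER = range(0x41, 0x5A + 1)  # A-Z
LOWER = range(0x61, 0x7A + 1)  # a-z


def is_basic_latin_letter(ch: str) -> bool:
    """Return True if ch is one of the 52 unaccented letters A-Z or a-z."""
    return len(ch) == 1 and (ord(ch) in UPPER or ord(ch) in LOWER)


# Rebuild both rows of the table from the code points alone.
upper = "".join(chr(cp) for cp in UPPER)
lower = "".join(chr(cp) for cp in LOWER)

# The reconstructed rows match Python's own ASCII letter constants.
assert upper == string.ascii_uppercase
assert lower == string.ascii_lowercase

# Each lowercase letter sits exactly 0x20 above its uppercase partner.
assert all(ord(lo) - ord(up) == 0x20 for up, lo in zip(upper, lower))

# Basic letters take one byte in UTF-8; the accented letter é (U+00E9) takes two.
print(len("A".encode("utf-8")), len("\u00e9".encode("utf-8")))  # prints: 1 2

# ß (U+00DF) lies outside both ranges, so it is not a basic letter.
print(is_basic_latin_letter("z"), is_basic_latin_letter("\u00df"))  # prints: True False
```

The fixed 0x20 offset between the two ranges means that case conversion for these letters can be done by toggling a single bit, which is one reason the layout was convenient for early hardware and software.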