from Wikipedia

Dash

The dash is a punctuation mark consisting of a long horizontal line. It is similar in appearance to the hyphen but is longer and sometimes set higher above the baseline. The most common versions are the en dash (–), generally longer than the hyphen but shorter than the minus sign; the em dash (—), longer than either the en dash or the minus sign; and the horizontal bar (―), whose length varies across typefaces but tends to be between those of the en and em dashes.[a]

Typical uses of dashes are to mark a break in a sentence, to set off an explanatory remark (much as parentheses do), or to show spans of time or ranges of values.

The em dash is sometimes used as a leading character to identify the source of a quoted text.

History

1622 Okes-print of Othello, p. 19. Note use of dashes.

In the early 17th century, in Okes-printed plays of William Shakespeare, dashes are attested that indicate a thinking pause, interruption, mid-speech realization, or change of subject.[1] The dashes are variously longer (as in King Lear reprinted 1619) or composed of hyphens --- (as in Othello printed 1622); moreover, the dashes are often, but not always, prefixed by a comma, colon, or semicolon.[2][3][1][4]

In 1733, in Jonathan Swift's On Poetry, the terms break and dash are attested as names for such marks:[5]

Blot out, correct, insert, refine,
Enlarge, diminish, interline;
Be mindful, when Invention fails;
To scratch your Head, and bite your Nails.

Your poem finish'd, next your Care
Is needful, to transcribe it fair.
In modern Wit all printed Trash, is
Set off with num'rous Breaks⸺and Dashes

Types of dash

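Usage varies both within English and within other languages, but the usual conventions for the most common dashes in printed English text are these:

  • An em dash (set closed) or a spaced en dash can mark a break in a sentence, and a pair of either can set off a parenthetical remark. For example: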

Glitter, felt, yarn, and buttons—his kitchen looked as if a clown had exploded.
A flock of sparrows—some of them juveniles—alighted and sang.

Glitter, felt, yarn, and buttons – his kitchen looked as if a clown had exploded.
A flock of sparrows – some of them juveniles – alighted and sang.

  • An en dash, but not an em dash, indicates spans or differentiation, where it may replace "and", "to", or "through".[6] For example:

The French and Indian War (1754–1763) was fought in western Pennsylvania and along the present US–Canada border

— Edwards, pp. 81–101.

  • An em dash or horizontal bar, but not an en dash, is used to set off the source of a direct quotation. For example:

Seven social sins: politics without principles, wealth without work, pleasure without conscience, knowledge without character, commerce without morality, science without humanity, and worship without sacrifice.

  • A horizontal bar (also called quotation dash)[7] or the em dash, but not the en dash, introduces quoted text.
  • In informal contexts, a hyphen-minus (-) is often used as a substitute for an en dash, as is a pair of hyphen-minuses (--) for an em dash, because the hyphen-minus symbol is readily available on most keyboards.[8] The autocorrection facility of word-processing software often corrects these to the typographically correct form of dash.

Figure dash

The figure dash (U+2012 FIGURE DASH) has the same width as a numerical digit (many computer fonts have digits of equal width[9]). It is used within numbers, such as the phone number 555‒0199, especially in columns so as to maintain alignment. In contrast, the en dash (U+2013 EN DASH) is generally used for a range of values.[10]

The minus sign (U+2212 MINUS SIGN) glyph is generally set a little higher, so as to be level with the horizontal bar of the plus sign. In informal usage, the hyphen-minus - (U+002D - HYPHEN-MINUS), provided as standard on most keyboards, is often used instead of the figure dash.

In TeX, the standard fonts have no figure dash; however, the digits normally all have the same width as the en dash, so an en dash can be a substitution for the figure dash. In XeLaTeX, one can use \char"2012.[11] The Linux Libertine font also has the figure dash glyph.

En dash

The en dash, en rule, or nut dash[12] is traditionally half the width of an em dash.[13][14] In modern fonts, the length of the en dash is not standardized, and the en dash is often more than half the width of the em dash.[15] The widths of en and em dashes have also been specified as being equal to those of the uppercase letters N and M, respectively,[16][17] and at other times to the widths of the lower-case letters.[15][18]

Usage

The three main uses of the en dash are:

  1. to connect symmetric items, such as the two ends of a range or two competitors or alternatives
  2. to contrast values or illustrate a relationship between two things
  3. to compound attributes, where one of the connected items is itself a compound

Ranges of values

The en dash is commonly used to indicate a closed range of values – a range with clearly defined and finite upper and lower boundaries – roughly signifying what might otherwise be communicated by the word "through" in American English, or "to" in International English.[19] This may include ranges such as those between dates, times, or numbers.[20][21][22][23] Various style guides restrict this range indication style to only parenthetical or tabular matter, requiring "to" or "through" in running text. Preference for hyphen vs. en dash in ranges varies. For example, the APA style (named after the American Psychological Association) uses an en dash in ranges, but the AMA style (named after the American Medical Association) uses a hyphen:

En dash range style (e.g., APA[b]) | Hyphen range style (e.g., AMA[b]) | Running text spell-out
June–July 1967 | June-July 1967 | June and July 1967
1:15–2:15 p.m. | 1:15-2:15 PM | 1:15 to 2:15 p.m.
For ages 3–5 | For ages 3-5 | For ages 3 through 5
pp. 38–55 | pp 38-55 | pages 38 through 55
President Jimmy Carter (1977–81) | President Jimmy Carter (1977-81) | President Jimmy Carter, in office from 1977 to 1981

Some style guides (including the Guide for the Use of the International System of Units (SI) and the AMA Manual of Style) recommend that, when a number range might be misconstrued as subtraction, the word "to" should be used instead of an en dash. For example, "a voltage of 50 V to 100 V" is preferable to using "a voltage of 50–100 V". Relatedly, in ranges that include negative numbers, "to" is used to avoid ambiguity or awkwardness (for example, "temperatures ranged from −18 °C to −34 °C"). It is also considered poor style (best avoided) to use the en dash in place of the words "to" or "and" in phrases that follow the forms from X to Y and between X and Y.[21][22]
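The range conventions above can be sketched as a small formatter. This is only an illustration under stated assumptions: the function names `format_number` and `format_range` and the `running_text` flag are hypothetical, and the rule encoded here (en dash for compact ranges, "to" in running text or when a bound is negative) is one reading of the guidance above, not the wording of any particular style guide.

```python
EN_DASH = "\u2013"  # en dash, used for compact closed ranges
MINUS = "\u2212"    # true minus sign, used for negative bounds


def format_number(value: int) -> str:
    """Render an integer, using U+2212 rather than hyphen-minus for negatives."""
    return f"{MINUS}{abs(value)}" if value < 0 else str(value)


def format_range(low: int, high: int, unit: str = "", running_text: bool = False) -> str:
    """Join two bounds with an en dash, or with the word "to" when that is safer.

    The word is chosen for running text and whenever a bound is negative,
    because a dash next to a minus sign is easy to misread as subtraction.
    """
    suffix = f" {unit}" if unit else ""
    if running_text or low < 0 or high < 0:
        return f"{format_number(low)}{suffix} to {format_number(high)}{suffix}"
    return f"{low}{EN_DASH}{high}{suffix}"


print(format_range(38, 55))                          # compact page range
print(format_range(50, 100, "V", running_text=True))  # spelled out with "to"
print(format_range(-18, -34, "°C"))                  # negative bounds force "to"
```

Repeating the unit after each bound mirrors the "50 V to 100 V" form quoted above; a house style that states the unit only once would need a different suffix rule.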

Relationships and connections

The en dash is used to contrast values or illustrate a relationship between two things.[20][23] Examples of this usage include:

  • Australia beat American Samoa 31–0.
  • Radical–Unionist coalition
  • Boston–Hartford route
  • New York–London flight (however, it may be argued that New York–to-London flight is more appropriate because New York is a single name composed of two valid words [see "Attributive compounds" below]; with a single en dash, the phrase is ambiguous and could mean either Flight from New York to London or New flight from York to London; the ambiguity is reduced mid-sentence, though, because the capital N in "New" marks it as part of a proper noun). If dash–hyphen use becomes too unwieldy or difficult to understand, the sentence can be rephrased for clarity and readability; for example, "The flight from New York to London was a pleasant experience".[23]
  • Mother–daughter relationship
  • The Supreme Court voted 5–4 to uphold the decision.

A distinction is often made between "simple" attributive compounds (written with a hyphen) and other subtypes (written with an en dash); at least one authority considers name pairs, where the paired elements carry equal weight, as in the Taft–Hartley Act to be "simple",[21] while others consider an en dash appropriate in instances such as these[24][25][26] to represent the parallel relationship, as in the McCain–Feingold bill or Bose–Einstein statistics. When an act of the U.S. Congress is named using the surnames of the senator and representative who sponsored it, the hyphen-minus is used in the short title; thus, the short title of Public Law 111–203 is "The Dodd-Frank Wall Street Reform and Consumer Protection Act", with a hyphen-minus rather than an en dash between "Dodd" and "Frank".[27] However, there is a difference between something named for a parallel/coordinate relationship between two people – for example, Satyendra Nath Bose and Albert Einstein – and something named for a single person who had a compound surname, which may be written with a hyphen or a space but not an en dash – for example, the Lennard-Jones potential [hyphen] is named after one person (John Lennard-Jones), as are Bence Jones proteins and Hughlings Jackson syndrome. Copyeditors use dictionaries (general, medical, biographical, and geographical) to confirm the eponymity (and thus the styling) for specific terms, given that no one can know them all offhand.

Preference for an en dash instead of a hyphen in these coordinate/relationship/connection types of terms is a matter of style, not inherent orthographic "correctness"; both are equally "correct", and each is the preferred style in some style guides. For example, the American Heritage Dictionary of the English Language, the AMA Manual of Style, and Dorland's medical reference works use hyphens, not en dashes, in coordinate terms (such as "blood-brain barrier"), in eponyms (such as "Cheyne-Stokes respiration", "Kaplan-Meier method"), and so on. In other styles, such as AP style or Chicago style, the en dash is used to link two closely related entities in formal writing.

Attributive compounds

In English, the en dash is usually used instead of a hyphen in compound (phrasal) attributives in which one or both elements is itself a compound, especially when the compound element is an open compound, meaning it is not itself hyphenated. This manner of usage may include such examples as:[21][22][28][29]

  • The hospital–nursing home connection (the connection between the hospital and the nursing home, not a home connection between the hospital and nursing)
  • A nursing home–home care policy (a policy about the nursing home and home care)
  • Pre–Civil War era
  • Pulitzer Prize–winning novel
  • New York–style pizza
  • The non–San Francisco part of the world
  • The post–World War II era
    • (Compare post-war era, which, if not fully compounded (postwar), takes a hyphen, not an en dash. The difference is that war is not an open compound, whereas World War II is.)
  • Trans–New Guinea languages
  • The ex–prime minister
  • a long–focal length camera
  • water ice–based bedrock
  • The pro-conscription–anti-conscription debate
  • Public-school–private-school rivalries

The disambiguating value of the en dash in these patterns was illustrated by Strunk and White in The Elements of Style with the following example: When Chattanooga News and Chattanooga Free Press merged, the joint company was inaptly named Chattanooga News-Free Press (using a hyphen), which could be interpreted as meaning that their newspapers were news-free.[30]

An exception to the use of en dashes is usually made when prefixing an already-hyphenated compound; an en dash is generally avoided as a distraction in this case. Examples of this include:[30]

An en dash can be retained to avoid ambiguity, but whether any ambiguity is plausible is a judgment call. AMA style retains the en dashes in the following examples:[31]

  • non–self-governing
  • non–English-language journals
  • non–group-specific blood
  • non–Q-wave myocardial infarction
  • non–brain-injured subjects

Differing recommendations

As discussed above, the en dash is sometimes recommended instead of a hyphen in compound adjectives where neither part of the adjective modifies the other—that is, when each modifies the noun, as in love–hate relationship.

The Chicago Manual of Style (CMOS), however, limits the use of the en dash to two main purposes:

  • First, use it to indicate ranges of time, money, or other amounts, or in certain other cases where it replaces the word "to".
  • Second, use it in place of a hyphen in a compound adjective when one of the elements of the adjective is an open compound, or when two or more of its elements are compounds, open or hyphenated.[32]

That is, the CMOS favors hyphens in instances where some other guides suggest en dashes, with the 16th edition explaining that "Chicago's sense of the en dash does not extend to between", to rule out its use in "US–Canadian relations".[33]

In these two uses, en dashes normally do not have spaces around them. Some make an exception when they believe avoiding spaces may cause confusion or look odd. For example, compare "12 June – 3 July" with "12 June–3 July".[34] However, other authorities disagree and state there should be no space between an en dash and adjacent text. These authorities would not use a space in, for example, "11:00 a.m.⁠–⁠1:00 p.m."[35] or "July 9–August 17".[36][37]

Parenthetic and other uses at the sentence level

En dashes can be used instead of pairs of commas that mark off a nested clause or phrase. They can also be used around parenthetical expressions – such as this one – rather than the em dashes preferred by some publishers.[38][8]

The en dash can also signify a rhetorical pause. For example, an opinion piece from The Guardian is entitled:

Who is to blame for the sweltering weather? My kids say it's boomers – and me[39]

In these situations, en dashes must have a single space on each side.[8]

Typography

Spacing

In most uses of en dashes, such as when used in indicating ranges, they are typeset closed up to the adjacent words or numbers. Examples include "the 1914–18 war" or "the Dover–Calais crossing". It is only when en dashes are used in setting off parenthetical expressions – such as this one – that they take spaces around them.[40] For more on the choice of em versus en in this context, see En dash versus em dash.

Encoding and substitution

When an en dash is unavailable in a particular character encoding environment—as in the ASCII character set—there are some conventional substitutions. Often two consecutive hyphens are the substitute.

The en dash is encoded in Unicode as U+2013 (decimal 8211) and represented in HTML by the named character entity &ndash;.

The en dash is sometimes used as a substitute for the minus sign when the minus sign character is unavailable, since the en dash is usually the same width as a plus sign and is often available when the minus sign is not; see below. For example, the original 8-bit Macintosh Character Set had an en dash, useful for the minus sign, years before Unicode with a dedicated minus sign was available. The hyphen-minus is usually too narrow to make a typographically acceptable minus sign. However, the en dash cannot be used for a minus sign in programming languages because the syntax usually requires a hyphen-minus.
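The point about programming languages can be illustrated in Python, chosen here purely as an example language. The sketch below checks whether an expression compiles and shows one way to map dash look-alikes to the hyphen-minus before parsing a number copied from typeset text. The helper names and the particular set of look-alikes are assumptions made for this illustration.

```python
# Dash-like characters that may stand in for a minus sign in copied text.
# The choice of these three is an assumption made for this sketch.
MINUS_LOOKALIKES = {
    "\u2012": "-",  # figure dash
    "\u2013": "-",  # en dash
    "\u2212": "-",  # minus sign
}
_TABLE = str.maketrans(MINUS_LOOKALIKES)


def parse_signed_int(text: str) -> int:
    """Parse an integer after mapping dash look-alikes to hyphen-minus."""
    return int(text.strip().translate(_TABLE))


def is_valid_expression(source: str) -> bool:
    """Report whether Python accepts `source` as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


print(is_valid_expression("5 - 3"))       # hyphen-minus as the operator
print(is_valid_expression("5 \u2013 3"))  # en dash in the same position
print(parse_signed_int("\u201318"))       # en dash used as a sign
```

Normalizing before parsing is a pragmatic choice for data entry; it would be the wrong choice for text in which the distinction between the dashes carries meaning.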

Itemization mark

Either the en dash or the em dash may be used as a bullet at the start of each item in a bulleted list.

Em dash

The em dash, em rule, or mutton dash[12] is longer than an en dash. The character is called an em dash because it is one em wide, a length that varies depending on the font size. One em is the same length as the font's height (which is typically measured in points). So in 9-point type, an em dash is nine points wide, while in 24-point type the em dash is 24 points wide. By comparison, the en dash, with its 1 en width, is in most fonts either a half-em wide[41] or the width of an upper-case "N".[42]

The em dash is encoded in Unicode as U+2014 (decimal 8212) and represented in HTML by the named character entity &mdash;.
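The code points and decimal values given for the en dash and em dash can be cross-checked with Python's standard library. The following is a minimal sketch; the helper name `describe_entity` is hypothetical, while `html.unescape` and `unicodedata.name` are standard-library functions.

```python
import html
import unicodedata


def describe_entity(entity: str):
    """Decode a named HTML entity and report its code point, decimal value, and Unicode name."""
    char = html.unescape(entity)
    return f"U+{ord(char):04X}", ord(char), unicodedata.name(char)


for entity in ("&ndash;", "&mdash;"):
    print(entity, *describe_entity(entity))
```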

Usage

The em dash is used in several ways. It is primarily used in places where a set of parentheses or a colon might otherwise be used,[43] and it can also show an abrupt change in thought (or an interruption in speech) or be used where a full stop (period) is too strong and a comma is too weak (much like a semicolon). Em dashes are also used to set off summaries or definitions.[44] Common uses and definitions are cited below with examples.

Colon-like use

Simple equivalence (or near-equivalence) of colon and em dash
  • Three alkali metals are the usual substituents: sodium, potassium, and lithium.
  • Three alkali metals are the usual substituents—sodium, potassium, and lithium.
Inversion of the function of a colon
  • These are the colors of the flag: red, white, and blue.
  • Red, white, and blue—these are the colors of the flag.

Parenthesis-like use

Simple equivalence (or near-equivalence) of paired parenthetical marks
  • Compare parentheses with em dashes:
    • Three alkali metals (sodium, potassium, and lithium) are the usual substituents.
    • Three alkali metals—sodium, potassium, and lithium—are the usual substituents.
  • Compare commas, em dashes and parentheses (respectively) when no internal commas intervene:
    • The food, which was delicious, reminded me of home.
    • The food—which was delicious—reminded me of home.
    • The food (which was delicious) reminded me of home.
Subtle differences in punctuation

It may indicate an interpolation stronger than that demarcated by parentheses, as in the following from Nicholson Baker's The Mezzanine (the degree of difference is subjective).

  • "At that age I once stabbed my best friend, Fred, with a pair of pinking shears in the base of the neck, enraged because he had been given the comprehensive sixty-four-crayon Crayola box—including the gold and silver crayons—and would not let me look closely at the box to see how Crayola had stabilized the built-in crayon sharpener under the tiers of crayons."

Interruption of a speaker

Interruption by someone else
  • "But I'm trying to explain that I—"
    "I'm aware of your mitigating circumstances, but your negative attitude was excessive."

In a related use, it may visually indicate the shift between speakers when they overlap in speech. For example, the em dash is used this way in Joseph Heller's Catch-22:

  • He was Cain, Ulysses, the Flying Dutchman; he was Lot in Sodom, Deirdre of the Sorrows, Sweeney in the nightingales among trees. He was the miracle ingredient Z-147. He was—
    "Crazy!" Clevinger interrupted, shrieking. "That's what you are! Crazy!"
    "—immense. I'm a real, slam-bang, honest-to-goodness, three-fisted humdinger. I'm a bona fide supraman."
Self-interruption
Either an ellipsis or an em dash can indicate aposiopesis, the rhetorical device by which a sentence is stopped short not because of interruption, but because the speaker is too emotional or pensive to continue. Because the ellipsis is the more common choice, an em dash for this purpose may be ambiguous in expository text, as many readers would assume interruption, although it may be used to indicate great emotion in dramatic monologue.
  • Long pause:
    • In Early Modern English texts and afterward, em dashes have been used to add long pauses (as noted in Joseph Robertson's 1785 An Essay on Punctuation):

Lord Cardinal! if thou think'st on heaven's bliss,
Hold up thy hand, make signal of that hope.—
He dies, and makes no sign!

Quotation

Quotation mark–like use
The dash is a popular way to mark quoted dialogue in literature.

This is a quotation dash. It may be distinct from an em dash in its coding (see horizontal bar). It may be used to indicate turns in a dialogue, in which case each dash starts a paragraph.[46] It replaces other quotation marks and was preferred by authors such as James Joyce:[47]

—O saints above! miss Douce said, sighed above her jumping rose. I wished I hadn't laughed so much. I feel all wet.
—O, miss Douce! miss Kennedy protested. You horrid thing!
Attribution of quote source

The Walrus and the Carpenter
Were walking close at hand;
They wept like anything to see
Such quantities of sand:
"If this were only cleared away,"
They said, "it would be grand!"

— Lewis Carroll, Through the Looking-Glass

Redaction

An em dash may be used to indicate omitted letters in a word redacted to an initial or single letter or to fillet a word, by leaving the start and end letters whilst replacing the middle letters with a dash or dashes (for censorship or simply data anonymization). It may also censor the end letter. In this use, it is sometimes doubled.

  • It was alleged that D—— had been threatened with blackmail.

Three em dashes might be used to indicate a completely missing word.[48]

Itemization mark

Either the en dash or the em dash may be used as a bullet at the start of each item in a bulleted list, but a plain hyphen is more commonly used.

Repetition

Three em dashes one after another can be used in a footnote, endnote, or another form of bibliographic entry to indicate repetition of the same author's name as that of the previous work,[48] which is similar to the use of id.
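The repetition convention can be sketched as a small pass over a list of bibliography entries. This is an illustrative sketch only: the function name `mark_repeated_authors`, the (author, title) tuple layout, and the placeholder entries are all assumptions, and the output format is not taken from any particular citation style.

```python
THREE_EM = "\u2014" * 3  # three consecutive em dashes


def mark_repeated_authors(entries):
    """Replace an author's name with three em dashes when it repeats the previous entry's author."""
    result = []
    previous = None
    for author, title in entries:
        shown = THREE_EM if author == previous else author
        result.append(f"{shown}. {title}.")
        previous = author
    return result


# Placeholder entries, not real references.
sample = [
    ("Author A", "First Work"),
    ("Author A", "Second Work"),
    ("Author B", "Another Work"),
]
for line in mark_repeated_authors(sample):
    print(line)
```

The comparison is against the immediately preceding entry only, so the list is assumed to be sorted by author beforehand.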

Typographic details

Spacing and substitution

According to most American sources (such as The Chicago Manual of Style) and some British sources (such as The Oxford Guide to Style), an em dash should always be set closed, meaning it should not be surrounded by spaces. But the practice in some parts of the English-speaking world, including the style recommended by The New York Times Manual of Style and Usage for printed newspapers and the AP Stylebook, sets it open, separating it from its surrounding words by using spaces or hair spaces (U+200A) when it is being used parenthetically.[49][50] The AP Stylebook rejects the use of the open em dash to set off introductory items in lists. However, the "space, en dash, space" sequence is the predominant style in German and French typography. (See En dash versus em dash below.)

In Canada, The Canadian Style: A Guide to Writing and Editing, The Oxford Canadian A to Z of Grammar, Spelling & Punctuation: Guide to Canadian English Usage (2nd ed.), Editing Canadian English, and the Canadian Oxford Dictionary all specify that an em dash should be set closed when used between words, a word and numeral, or two numerals.

The Australian government's Style Manual for Authors, Editors and Printers (6th ed.), also specifies that em dashes inserted between words, a word and numeral, or two numerals, should be set closed. A section on the 2-em rule (⸺) also explains that the 2-em can be used to mark an abrupt break in direct or reported speech, but a space is used before the 2-em if a complete word is missing, while no space is used if part of a word exists before the sudden break. Two examples of this are as follows:

  • I distinctly heard him say, "Go away or I'll ——".
  • It was alleged that D—— had been threatened with blackmail.

Approximating the em dash with two or three hyphens

When an em dash is unavailable in a particular character encoding environment—as in the ASCII character set—it has usually been approximated as consecutive double (--) or triple (---) hyphen-minuses. The two-hyphen em dash proxy is perhaps more common, being a widespread convention in the typewriting era. (It is still described for hard copy manuscript preparation in The Chicago Manual of Style as of the 16th edition, although the manual conveys that typewritten manuscript and copyediting on paper are now dated practices.) The three-hyphen em dash proxy was popular with various publishers because the sequence of one, two, or three hyphens could then correspond to the hyphen, en dash, and em dash, respectively.
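The three-hyphen convention described above, in which runs of one, two, or three hyphen-minuses stand for the hyphen, en dash, and em dash respectively, can be sketched as a simple text conversion. The function name `hyphens_to_dashes` is hypothetical, and leaving runs of four or more hyphens untouched is a design choice of this sketch rather than part of the convention.

```python
import re

# Run length of hyphen-minuses mapped to the character it stands for.
_DASHES = {1: "-", 2: "\u2013", 3: "\u2014"}


def hyphens_to_dashes(text: str) -> str:
    """Map runs of one, two, or three hyphen-minuses to hyphen, en dash, em dash."""
    def replace(match):
        run = match.group(0)
        # Longer runs (for example, ASCII rules) are left unchanged.
        return _DASHES.get(len(run), run)

    return re.sub(r"-+", replace, text)


print(hyphens_to_dashes("pp. 38--55"))
print(hyphens_to_dashes("a well-known mark---or so it seems"))
```

Text that follows the two-hyphen typewriter convention instead would need the mapping for a run of two changed to the em dash.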

Because early comic book letterers were not aware of the typographic convention of replacing a typewritten double hyphen with an em dash, the double hyphen became traditional in American comics. This practice has continued despite the development of computer lettering.[51][52]

Usage in AI-generated text

In April 2025, Rolling Stone reported on the growing perception that the em dash is a hallmark of AI-generated writing, particularly by ChatGPT. The article noted how this idea spread through social media, where users began referring to it as the "ChatGPT hyphen" and advised avoiding it in order to appear more human. However, several writers defended the em dash as a legitimate and expressive punctuation mark with a long history in human writing.[53][54] New York Times Magazine editor Nitsuh Abebe has theorized that growing unfamiliarity with em dashes represents writing conceptually shifting from edited media toward casual emails and text messages.[55] An OpenAI spokesperson stated that while ChatGPT may favor the em dash, its style depends on prompts and is not a reliable indicator of machine authorship.[56][57][58]

En dash vis-à-vis em dash

These comparisons of the hyphen (-), n, en dash (–), m, and em dash (—), in various 12-point fonts, illustrate the typical relationship between lengths ("- n – m —"). In some fonts, the en dash is not much longer than the hyphen, and in Lucida Grande, the en dash is actually shorter than the hyphen.

The en dash is wider than the hyphen but not as wide as the em dash. An em width is defined as the point size of the currently used font, since the M character is not always the width of the point size.[59] In running text, various dash conventions are employed: an em dash—like so—or a spaced em dash — like so — or a spaced en dash – like so – can be seen in contemporary publications.

Various style guides and national varieties of languages prescribe different guidance on dashes. Dashes have been cited as being treated differently in the US and the UK, with the former preferring the use of an em dash with no additional spacing and the latter preferring a spaced en dash.[38] As examples of the US style, The Chicago Manual of Style and The Publication Manual of the American Psychological Association recommend unspaced em dashes. Style guides outside the US are more variable. For example, The Elements of Typographic Style by Canadian typographer Robert Bringhurst recommends the spaced en dash – like so – and argues that the length and visual magnitude of an em dash "belongs to the padded and corseted aesthetic of Victorian typography".[8] In the United Kingdom, the spaced en dash is the house style for certain major publishers, including the Penguin Group, the Cambridge University Press, and Routledge. However, this convention is not universal. The Oxford Guide to Style (2002, section 5.10.10) acknowledges that the spaced en dash is used by "other British publishers" but states that the Oxford University Press, like "most US publishers", uses the unspaced em dash. Fowler's Modern English Usage, saying that it is summarising the New Hart's Rules, describes the principal uses of the em dash as "a single dash used to introduce an explanation or expansion" and "a pair of dashes used to indicate asides and parentheses", without stipulating whether it should be spaced but giving only unspaced examples.[60]

The en dash – always with spaces in running text when, as discussed in this section, indicating a parenthesis or pause – and the spaced em dash both have a certain technical advantage over the unspaced em dash. Most typesetting and word processing expects word spacing to vary to support full justification. Alone among punctuation that marks pauses or logical relations in text, the unspaced em dash disables this for the words it falls between. This can cause uneven spacing in the text, but can be mitigated by the use of thin spaces, hair spaces, or even zero-width spaces on the sides of the em dash. This provides the appearance of an unspaced em dash, but allows the words and dashes to break between lines. The spaced em dash risks introducing excessive separation of words. In full justification, the adjacent spaces may be stretched, and the separation of words further exaggerated. En dashes may also be preferred to em dashes when text is set in narrow columns, such as in newspapers and similar publications, since the en dash is smaller. In such cases, its use is based purely on space considerations and is not necessarily related to other typographical concerns.
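The mitigation mentioned above, placing hair spaces on either side of an em dash, can be sketched as a text transformation. The function name `pad_em_dashes` is hypothetical, and whether a given layout engine actually breaks or justifies lines at U+200A is outside what this sketch checks.

```python
import re

EM_DASH = "\u2014"
HAIR_SPACE = "\u200a"


def pad_em_dashes(text: str, space: str = HAIR_SPACE) -> str:
    """Surround each run of em dashes with `space`, replacing any whitespace already there."""
    pattern = rf"\s*({EM_DASH}+)\s*"
    return re.sub(pattern, lambda m: f"{space}{m.group(1)}{space}", text)


padded = pad_em_dashes("an em dash\u2014like so\u2014in running text")
print(repr(padded))
```

Matching a run of em dashes as one unit keeps doubled dashes together, and because a hair space counts as whitespace in the pattern, applying the function twice gives the same result as applying it once.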

On the other hand, a spaced en dash may be ambiguous when it is also used for ranges, for example, in dates or between geographical locations with internal spaces.

Horizontal bar

The horizontal bar (U+2015 HORIZONTAL BAR), also known as a quotation dash, is used to introduce quoted text. This is the standard method of printing dialogue in some languages. The em dash is equally suitable if the quotation dash is unavailable or is contrary to the house style being used.

There is no support in the standard TeX fonts, but one can use \hbox{---}\kern-.5em--- or an em dash.

Swung dash

The swung dash (U+2053 SWUNG DASH) resembles a lengthened tilde and is used to separate alternatives or approximates. In dictionaries, it is frequently used to stand in for the term being defined. A dictionary entry providing an example for the term henceforth might employ the swung dash as follows:

henceforth (adv.) from this time forth; from now on; "⁓ she will be known as Mrs. Smith"

Unicode

In the following tables, the "Em and 5×" column uses a capital M as a standard comparison to demonstrate the vertical position of different Unicode dash characters. "5×" means that there are five copies of this type of dash.

Unicode dash characters

This table lists characters with property Dash=yes in Unicode.[61]

Code | Em and 5× | Name | Remark
U+002D - | M----- | hyphen-minus | The ASCII hyphen. Sometimes this is used in groups to indicate different types of dash. In programming languages it is used as the minus sign.
U+058A ֊ | | Armenian hyphen |
U+05BE ־ | | Hebrew punctuation maqaf |
U+1400 | | Canadian syllabics hyphen |
U+1806 | | MONGOLIAN TODO SOFT HYPHEN |
U+2010 | M‐‐‐‐‐ | hyphen | The character that can be used to unambiguously represent a hyphen.
U+2011 | M‑‑‑‑‑ | non-breaking hyphen | Also called "hard hyphen",[citation needed] denotes a hyphen after which no word wrapping may apply. This is the case where the hyphen is part of a trigraph or tetragraph denoting a specific sound (like in the Swiss placename "S-chanf"), or where specific orthographic rules prevent a line break (like in German compounds of single-letter abbreviations and full nouns, as "E-Mail").
U+2012 | M‒‒‒‒‒ | figure dash | Similar to an en dash, but with exactly the width of a digit in the chosen typeface. The vertical position may also be centered on the zero digit, and thus higher than the en dash and em dash, which are appropriate for use with lowercase text in a vertical position similar to the hyphen. The figure dash may therefore be preferred to the en dash for indicating a closed range of values.[62]
U+2013 | M––––– | en dash |
U+2014 | M————— | em dash |
U+2015 | M――――― | horizontal bar |
U+2053 | M⁓⁓⁓⁓⁓ | swung dash |
U+207B | M⁻⁻⁻⁻⁻ | superscript minus | Usually is used together with superscripted numbers.
U+208B | M₋₋₋₋₋ | subscript minus | Usually is used together with subscripted numbers.
U+2212 | M−−−−− | minus sign | An arithmetic operation used in mathematics to represent subtraction or negative numbers. Its glyph is consistent with the glyph of the plus sign, and it is centred on the zero digit, unlike the ASCII hyphen-minus and U+2010 HYPHEN, that (especially the latter) are designed to match lowercase letters and are inconsistent with arithmetic operators.
U+2E17 | | DOUBLE OBLIQUE HYPHEN | Used in ancient Near-Eastern linguistics.
U+2E1A | | HYPHEN WITH DIAERESIS | Used mostly in German dictionaries and indicates umlaut of the stem vowel of a plural form.
U+2E3A | | two-em dash | Supplemental Punctuation.
U+2E3B | | three-em dash |
U+2E40 | | DOUBLE HYPHEN | Used in the transcription of old German manuscripts.
U+2E5D | | OBLIQUE HYPHEN | Used in medieval European manuscripts.[63]
U+301C | M〜〜〜〜〜 | WAVE DASH | Wavy lines found in some East Asian character sets. Typographically, they have the width of one CJK character frame (fullwidth form), and follow the direction of the text, being horizontal for horizontal text, and vertical for columnar. They are used as dashes, and occasionally as emphatic variants of the katakana vowel extender mark.
U+3030 | M〰〰〰〰〰 | wavy dash |
U+30A0 | | KATAKANA-HIRAGANA DOUBLE HYPHEN |
U+FE31 | | PRESENTATION FORM FOR VERTICAL EM DASH | Compatibility characters used in East Asian typography.
U+FE32 | | PRESENTATION FORM FOR VERTICAL EN DASH |
U+FE58 | M﹘﹘﹘﹘﹘ | SMALL EM DASH |
U+FE63 | M﹣﹣﹣﹣﹣ | SMALL HYPHEN-MINUS |
U+FF0D | M----- | FULLWIDTH HYPHEN-MINUS |
U+10D6E 𐵮 | | GARAY HYPHEN |
U+10EAD 𐺭 | | YEZIDI HYPHENATION MARK |

This table lists characters similar to dashes, but with property Dash=no in Unicode.

Code M and 5× Name Remark
U+005F _ M_____ low line ASCII underscore, usually a horizontal line below the baseline (i.e. a spacing underscore). It is commonly used within URLs and identifiers in programming languages, where a space-like separation between parts is desired but a real space is not appropriate. As usual for ASCII characters, this character shows a considerable range of glyphic variation; therefore, whether sequences of this character connect depends on the font used. See also U+FF3F _ FULLWIDTH LOW LINE
U+007E ~ M~~~~~ tilde Used in programming languages (e.g. for the bitwise NOT operator in C and C++). Its glyphic representation varies; for punctuation in running text, the use of more specific characters (see above) is therefore preferred.
U+00AD soft hyphen Used to indicate where a line may break, as in a compound word or between syllables.
U+00AF ¯ M¯¯¯¯¯ macron A horizontal line positioned at cap height usually having the same length as U+005F _ LOW LINE. It is a spacing character, related to the diacritic mark "macron". A sequence of such characters is not expected to connect, unlike U+203E OVERLINE.
U+02C9 ˉ Mˉˉˉˉˉ modifier letter macron A phonetic symbol (a line applied above the base letter).
U+02CD ˍ Mˍˍˍˍˍ modifier letter low macron A phonetic symbol (a line applied below the base letter).
U+02D7 ˗ M˗˗˗˗˗ modifier letter minus sign A variant of the minus sign used in phonetics to mark a retracted or backed articulation. It may show small end-serifs.
U+02DC ˜ M˜˜˜˜˜ small tilde A spacing clone of tilde diacritic mark.
U+06D4 ۔ Arabic full stop
U+1428 Canadian syllabics final short horizontal stroke
U+1B78 Balinese musical symbol left-hand open pang
U+203E M‾‾‾‾‾ overline A character similar to U+00AF ¯ MACRON, but a sequence of such characters usually connects.
U+2043 M⁃⁃⁃⁃⁃ hyphen bullet A short horizontal line used as a list bullet.
U+223C M∼∼∼∼∼ tilde operator Used in mathematics. Its ends are not curved as much as those of the regular tilde. In TeX and LaTeX, this character can be expressed using the math mode command $\sim$.
U+23AF M⎯⎯⎯⎯⎯ horizontal line extension Miscellaneous Technical (Unicode block). "Used for extension of arrows".[64] Can be used in sequences to generate long connected horizontal lines.
U+23BA M⎺⎺⎺⎺⎺ horizontal scan line-1 "Refer to old, low-resolution technology for terminals, with only 9 scan lines per fixed-size character glyph". The scan line-5 is unified with U+2500 BOX DRAWINGS LIGHT HORIZONTAL.[64]
U+23BB M⎻⎻⎻⎻⎻ horizontal scan line-3
U+23BC M⎼⎼⎼⎼⎼ horizontal scan line-7
U+23BD M⎽⎽⎽⎽⎽ horizontal scan line-9
U+23E4 M⏤⏤⏤⏤⏤ straightness Miscellaneous Technical (Unicode block). Represents line straightness in technical context.
U+2500 M───── box drawings light horizontal Box-drawing characters. Several similar characters from one Unicode block used to draw horizontal lines.
U+2501 M━━━━━ box drawings heavy horizontal
U+268A monogram for yang I Ching monogram symbols, often used to form trigrams or hexagrams.
U+268B monogram for yin
U+2796 M➖➖➖➖➖ heavy minus sign Unicode symbols.
U+2E0F paragraphos Ancient Greek textual symbol, usually displayed by a long low line.
U+3161 HANGUL LETTER EU Hangul letters used in Korean to denote the sound [ɯ]. The halfwidth form is a compatibility character used in East Asian typography.
U+1173 HANGUL JUNGSEONG EU
U+FFDA HALFWIDTH HANGUL LETTER EU
U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK Japanese chōonpu, used in Japanese to indicate a long vowel. The halfwidth form is a compatibility character used in East Asian typography.
U+FF70 HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK
U+4E00 CJK UNIFIED IDEOGRAPH-4E00 Chinese character for "one", used in various East Asian languages.
U+A4FE Lisu punctuation comma Looks like a sequence of a hyphen and a full stop (period).
U+FF5E M~~~~~ Fullwidth tilde Compatibility character used in East Asian typography.
U+10110 𐄐 Aegean number ten
U+10191 𐆑 roman uncia sign A symbol for an ancient Roman unit of length.
U+1104B 𑁋 Brahmi punctuation line Symbols for Brahmi script
U+11052 𑁒 Brahmi number one Brahmi digit for "one".
U+110BE 𑂾 Kaithi section mark A symbol for Kaithi that indicates the end of a sentence.
U+1CE1F 𜸟 M𜸟𜸟𜸟𜸟𜸟 Large type piece crossbar Used to form a large text character in legacy computing.
U+1D360 𝍠 Counting rod unit digit one Counting rod digit for "one", used in calculation in ancient East Asia.
U+1D372 𝍲 Ideographic tally mark one A tally mark based on segments from the Chinese character "正".
U+1D2E5 𝋥 Mayan numeral five Mayan number for "five".
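
The difference between the two tables can be explored programmatically. The following Python sketch uses the standard unicodedata module to print the official name and general category of a few characters from the tables. One assumption should be stated clearly: unicodedata does not expose the Dash property itself, so the general category (Pd, "dash punctuation") serves here only as a rough proxy. Several Dash=yes characters, such as U+2212 MINUS SIGN, belong to other categories. The sample list and the helper name describe are illustrative choices, not part of any standard.

```python
import unicodedata

# A few code points taken from the two tables (illustrative selection).
SAMPLES = [
    0x002D, 0x2010, 0x2012, 0x2013, 0x2014, 0x2015, 0x2212,  # listed as Dash=yes
    0x005F, 0x007E, 0x00AD, 0x2043,                          # listed as Dash=no
]


def describe(code_point: int) -> tuple[str, str]:
    """Return (Unicode name, general category) for a code point."""
    ch = chr(code_point)
    return unicodedata.name(ch), unicodedata.category(ch)


for cp in SAMPLES:
    name, category = describe(cp)
    print(f"U+{cp:04X}  {category}  {name}")
```

The general category separates most look-alikes (the low line is Pc, the soft hyphen is Cf), but it is not a substitute for the Dash property; a complete check would need the property data published with the Unicode Character Database.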

In other languages


In many languages, such as Polish, the em dash is used as an opening quotation mark. There is no matching closing quotation mark; typically a new paragraph will be started, introduced by a dash, for each turn in the dialogue.[citation needed]

Corpus studies indicate that em dashes are more commonly used in Russian than in English.[65] In Russian, the em dash is used for the present copula (meaning 'am/is/are'), which is unpronounced in spoken Russian.

In French and Italian, em or en dashes can be used as parentheses (brackets), but the use of a second dash as a closing parenthesis is optional. When a closing dash is not used, the sentence is ended with a period (full stop) as usual. Dashes are, however, much less common than parentheses.[citation needed]

In Spanish, em dashes can be used to mark off parenthetical phrases. Unlike in English, the em dashes are spaced like brackets, i.e., there is a space between main sentence and dash, but not between parenthetical phrase and dash.[66] For example: "Llevaba la fidelidad a su maestro —un buen profesor— hasta extremos insospechados." (In English: 'He took his loyalty to his teacher – a good teacher – to unsuspected extremes.')[67]

from Grokipedia
The dash is a punctuation mark consisting of a long horizontal line. It is similar in appearance to the hyphen but is longer and serves different functions, such as indicating a break in a sentence or denoting ranges. There are several variants, including the em dash (the longest, used for parenthetical interruptions), the en dash (shorter, for ranges and connections), and the figure dash (same width as digits, for use in numbers). The dash has been part of English printing since the early 17th century and is encoded in Unicode for digital use. Dashes are not typically spaced in modern styles, though practices vary by language and style guide.

Overview and History

Definition and Purpose

The dash is a punctuation mark consisting of a long horizontal line, typically longer than a hyphen, employed in writing to signal interruptions, ranges, or relational links within text. Unlike the shorter hyphen, which primarily connects words or syllables, the dash provides a more pronounced visual and rhetorical separation, allowing writers to structure sentences with greater flexibility.

Its purposes span syntactic, semantic, and stylistic dimensions. Syntactically, the dash denotes breaks or abrupt shifts in sentence flow, functioning as a stronger alternative to commas or parentheses for parenthetical elements. Semantically, it clarifies connections or spans, such as between related concepts or numerical extents, enhancing precision in expression. Stylistically, the dash adds emphasis or dramatic pause, drawing attention to inserted ideas and contributing to the overall rhythm of prose.

The word "dash" derives from Middle English dasche, rooted in the verb dasshen meaning "to strike violently" or "to move swiftly," evoking the mark's resemblance to a hasty, bold stroke. This etymology underscores its origins in scribal and printing practices, where it addressed the need for versatile punctuation amid evolving textual demands. In contemporary usage, the dash's adaptability shines through examples like introducing an explanatory aside—she hesitated, unsure of the path ahead—or bridging ideas in phrases such as "cause–effect dynamics." These applications illustrate its capacity to balance clarity and expressiveness without rigid formality.

Historical Origins and Evolution

The dash emerged in English printing during the early 17th century, serving to denote pauses, interruptions, or shifts in thought within sentences. Evidence of its use appears in editions of William Shakespeare's plays, such as Othello, printed in 1622 by Nicholas Okes, and King Lear from 1619, where dashes of varying lengths, often longer than hyphens, marked dramatic breaks or hesitations in dialogue. These early instances distinguished the dash from the hyphen, which was primarily employed for word division, establishing the dash as a versatile tool for expressive rhythm in printed text.

By the 18th century, the dash had become more integrated into literary and grammatical discourse, reflecting evolving conventions in punctuation. In Jonathan Swift's 1733 poem On Poetry: A Rhapsody, the mark is explicitly referenced as a "break" or "dash" to convey abrupt stylistic changes, underscoring its role in poetic structure. Printers like John Baskerville contributed to refined typographic practices during this period, producing works such as his 1757 edition of Virgil that emphasized clear spacing for readability, though the dash itself predated his innovations in transitional typefaces. Grammars of the era, including Robert Lowth's influential A Short Introduction to English Grammar (1762), discussed pauses but did not introduce the dash anew, instead building on its established presence in print.

Standardization advanced in the 19th century amid the expansion of industrial printing and consistent typefounding, with the en dash and em dash defined by their widths relative to the lowercase letters "n" and "m" in a given typeface. This convention, rooted in metal type measurements, became the Victorian-era norm for distinguishing dash variants, as noted in typographic treatises that prescribed their use for clarity in complex sentences. The Chicago Manual of Style, first issued in 1906 by the University of Chicago Press, further codified these conventions, recommending the em dash for interruptions and the en dash for ranges, influencing American publishing standards through subsequent editions.
In the 20th century, mechanical limitations shaped the dash's evolution, particularly with typewriters that omitted dedicated keys for en and em dashes, prompting writers to improvise using single hyphens for en dashes or double hyphens (--) for em dashes, a practice that persisted in early digital word processing. The shift to digital typography in the late 20th century resolved these issues through the development of Unicode, a universal standard initiated by engineers at Xerox and Apple in 1987 and first released in 1991, which assigned specific code points to the en dash (U+2013) and em dash (U+2014) for accurate cross-platform rendering. This adoption marked a pivotal milestone, enabling the dash's precise reproduction in digital text and revitalizing its typographic potential.

Hyphen versus Dash

The hyphen is the shortest of these punctuation marks, noticeably narrower than an en dash in most typefaces, while the en dash is medium-length (roughly the width of a capital "N") and the em dash is the longest, equivalent to the width of a capital "M." These physical distinctions originated in traditional typesetting, where the marks were sized relative to letter widths to maintain visual harmony.

Functionally, the hyphen joins elements within words or compounds, such as in "well-known," or divides words at line ends, whereas dashes (en and em) serve to indicate ranges, connections between distinct items, or interruptions in thought. For instance, a hyphen links "mother-in-law" as a single unit, but an en dash might connect opposing teams like "Yankees–Red Sox," and an em dash could break a sentence for emphasis, as in "The decision—final as it was—changed everything."

Confusion between hyphens and dashes arose historically from the limitations of typewriter keyboards, which lacked dedicated keys for en and em dashes, leading typists to approximate them with a single hyphen or a double hyphen (--). This practice persisted into early digital typing, fostering overuse of hyphens in place of proper dashes even after dedicated characters became available.

Major style guides such as The Chicago Manual of Style recommend using distinct marks for their specific roles rather than substituting hyphens, noting that such misuse can obscure meaning or reduce clarity. In informal writing, like emails or posts, hyphens often replace dashes (e.g., "pages 10-20" instead of "10–20"), but formal contexts insist on proper usage to avoid ambiguity, such as mistaking a compound word for a range.

To visually distinguish them, examine the mark's length relative to surrounding letters in proportional fonts, where hyphens appear notably shorter than en or em dashes; in fixed-width (monospace) fonts, however, all may render at uniform lengths, requiring context or character inspection to identify.
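
The typewriter-era approximations described above can be converted back into proper dashes mechanically. The following Python sketch is a simple heuristic under stated assumptions, not an established tool: the function name typewriter_to_dashes and the two rules it applies (double or triple hyphens become an em dash, a single hyphen between digits becomes an en dash) are illustrative choices.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"


def typewriter_to_dashes(text: str) -> str:
    """Replace typewriter-style approximations with real dash characters."""
    # Two or three hyphens, with any surrounding spaces, become an unspaced em dash.
    text = re.sub(r"\s*-{2,3}\s*", EM_DASH, text)
    # A single hyphen directly between two digits is treated as a range (en dash).
    text = re.sub(r"(?<=\d)-(?=\d)", EN_DASH, text)
    return text


print(typewriter_to_dashes("See pages 10-20 -- the well-known chapter."))
```

The hyphen in "well-known" is left alone because it sits between letters. The digit rule is deliberately naive: it would also rewrite phone numbers and ISO dates such as 2024-01-05, and the double-hyphen rule would rewrite command-line flags, so a real converter would need more context than this sketch uses.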

Overview of Dash Variants

Dash variants in typography are classified primarily by their relative lengths, which are defined in terms of standard typographic units such as the width of specific characters in a given font. The figure dash (‒) is the shortest, matching the width of a single digit for precise alignment in numerical contexts. The en dash (–) measures approximately the width of a lowercase "n," serving as an intermediate length. The em dash (—) is the longest among common dashes, equivalent to the width of a lowercase "m." The horizontal bar (―) extends further, often spanning the width of multiple ems, while the swung dash (⁓) is a wavy variant roughly comparable in baseline length to an em dash but with undulations for stylistic distinction.

These variants fulfill distinct primary roles in typesetting: the figure dash aids in alignments, particularly for tabular or numerical data; the en dash denotes ranges or connections between elements; the em dash indicates breaks or interruptions in text flow; the horizontal bar appears in musical notations or to introduce quoted material; and the swung dash approximates values or substitutes for repeated terms in a tilde-like manner. In English typography, the em dash and en dash dominate usage due to their versatility, while the figure dash, horizontal bar, and swung dash remain niche, appearing mainly in specialized contexts like data presentation or notation systems.

To illustrate relative widths, the following table compares the variants using descriptive approximations and sample glyphs (actual rendering varies by font):
Variant | Relative Width Description | Sample Glyph
Figure Dash | Digit width (≈0.5 em) | ‒
En Dash | "n" width (≈0.5 em) | –
Em Dash | "m" width (1 em) | —
Horizontal Bar | Full line (2–3 ems) | ―
Swung Dash | Em-like with waves | ⁓
These variants emerged historically from the practical demands of metal type and early printing, where specific lengths addressed alignment issues in tables (as with the figure dash) or notational precision, evolving alongside the standardization of en and em units in the 19th century.

Figure Dash

Characteristics and Dimensions

The figure dash (U+2012, ‒) is a typographic character specifically designed with a width equivalent to a single digit (0–9) in the given font, ensuring precise alignment in numerical contexts without altering overall line proportions. This dimension makes it generally shorter than the en dash (U+2013), which spans the width of the letter "N". In fonts featuring fixed-width digits, such as monospace typefaces, the figure dash matches the uniform character width exactly, facilitating seamless integration in tabular data.

Design variations exist across typefaces. In some, the figure dash glyph visually resembles the en dash, and implementations adhering strictly to digit width are uncommon in contemporary digital fonts. The figure dash is intended for numerical contexts, such as separating grouped numbers, where matching the digit width preserves alignment. For instance, in proportional fonts like Times New Roman, its rendering approximates the width of a standard digit like "0," whereas in Courier it aligns uniformly due to the font's monospaced nature.

Due to its specialized role, the figure dash remains rare in modern software and font libraries, where it is frequently substituted with the en dash or hyphen-minus (U+002D) for simplicity, despite potential misalignment in numerical columns. As part of the broader dash family, it occupies a niche focused on numerical precision rather than prose punctuation.

Usage in Numbered Lists and Tables

The figure dash (‒) is primarily employed within tables to separate components in numeric data, such as product codes or coordinates (e.g., 555‒0199 where supported), ensuring precise alignment in structured formats. It is also useful in numbered lists, where it divides multi-part identifiers without disrupting visual flow. In practice, however, hyphens are commonly used for phone numbers and serial numbers.

In tables, the figure dash facilitates the alignment of numeric columns by serving as a digit-width separator, which preserves even spacing across entries in spreadsheets, databases, or technical reports. For instance, when listing product codes or coordinates in tabular data, it prevents the misalignment that could occur with wider dashes, enhancing readability in dense layouts. Some style guides note the figure dash's utility in tables for maintaining alignment and clarity in numerical data, which underscores its role in computational outputs or formatted lists where consistent digit alignment is essential.

Compared to the en dash, the figure dash offers an advantage in narrow columns by avoiding visual distortion, as its width matches that of numerals rather than spanning half an em. This property, inherent to its design as a digit-equivalent separator, supports seamless integration in proportional fonts.

A modern challenge with the figure dash lies in its limited accessibility on standard keyboards, often leading to substitutions with the more readily available hyphen-minus (-) in digital composition. Despite this, typographic software and style-conscious editing continue to advocate for its proper insertion to uphold alignment standards.

En Dash

Usage in Ranges and Connections

The en dash serves primarily to indicate spans or ranges of values, such as numerical sequences, dates, scores, or measurements, where it replaces words like "to" or "through" for conciseness. For instance, it denotes page ranges like 10–20 in citations or book references, date spans such as January 1–5, sports scores like 3–2, and temperature scales like 0–100°C. This usage emphasizes a continuous interval rather than discrete items, distinguishing it from a hyphen, which connects compound terms.

In denoting relationships and connections between related entities, the en dash links items that imply a mutual or directional association, sometimes across more than two elements. Examples include flight routes like New York–London or conceptual ties such as a parent–child relationship in diagrams. It functions as a "strong hyphen" to signal these pairings without implying subordination.

For directional or from-to relations, the en dash clearly marks progression or extent, such as time periods like 9 a.m.–5 p.m. or alphabetical indexes like A–Z. This application is common in schedules, itineraries, and navigational contexts, where it substitutes for prepositions to streamline expression.

Style guides like The Chicago Manual of Style specify no spaces around the en dash in these contexts to maintain visual flow and readability. For example, ranges appear as 2001–02 rather than with intervening spaces, aligning with typographic preferences for unspaced en dashes in printing.

A frequent error involves substituting an em dash for ranges, which disrupts the intended span notation and confuses it with interruptive punctuation, or using a hyphen, which is shorter and suited only for word joins. Proper adherence to en dash conventions avoids these issues in professional writing.

Usage in Compounds and Attribution

The en dash serves a specialized role in attributive compounds, particularly when connecting multi-word modifiers where one element is an open compound or phrase that would otherwise create ambiguity with a hyphen. For example, in "U.S.–Canada border dispute," the en dash clearly links the full phrases, ensuring the reader interprets the modifier as a cohesive unit rather than separate elements. This usage, recommended by The Chicago Manual of Style, acts as a "strong hyphen" to bridge complex structures and maintain readability in dense prose.

In role attributions, the en dash denotes dual identities or relational connections, such as "editor–author partnership" or "mother–daughter relationship," highlighting the interplay between the terms without implying subordination. This application extends to descriptive links in compounds, where it underscores the equal footing of the elements involved. The Chicago Manual of Style endorses this for clarity in compounds involving relational nouns, distinguishing it from simpler hyphenated forms.

Style guides diverge on these practices: The Chicago Manual of Style advocates the en dash to resolve potential ambiguities, as in "post–World War I treaties" versus a hyphenated alternative that might confuse phrasing, while the AP Stylebook eschews en dashes entirely, favoring hyphens for all compound modifiers to simplify production in news publishing. For instance, AP would render "U.S.-Canada border" with hyphens, potentially sacrificing nuance in multi-word units. This difference has led to varied implementations, with Chicago-influenced publishing resolving ambiguities like "civil rights–era activism" through the en dash's visual emphasis.

Parenthetic uses of the en dash at the sentence level are rare and typically involve embedding complex attributive compounds for mild insertions, such as "The 19th-century–early 20th-century shift marked a pivotal change." Here, it preserves the compound's integrity within the aside, avoiding disruption from parentheses or em dashes. This usage evolved from early 20th-century typographic practice, where en dashes gained traction for precise relational phrasing.

Typographic Rendering and Spacing

In American English typography, the en dash is conventionally rendered without spaces on either side when connecting words or elements, as in word–word for ranges or attributions, following style guides like The Chicago Manual of Style. In British English typography, particularly for parenthetical or interruptive uses, the en dash is often spaced (word – word) to provide visual clarity and distinguish it from hyphens, as some British style guides recommend. This spacing emphasizes readability by creating a more pronounced break in the text flow, preventing confusion with compound words and enhancing scannability in printed or digital formats.

The en dash's width is nominally equivalent to the width of a lowercase "n" in the typeface, approximately half the width of an em dash, which promotes proportional consistency across fonts while adapting to each design's metrics. Kerning adjustments are commonly applied during font rendering to refine spacing around the en dash, especially with numerals or curved letters, to avoid optical crowding and maintain even visual rhythm. In practice, this ensures the en dash integrates seamlessly into body text without disrupting line harmony.

When proper glyphs are unavailable, the hyphen-minus (-) acts as a standard fallback for the en dash, though its shorter length can compromise aesthetic precision. Double hyphens (--) occasionally substitute in plain-text contexts, particularly for approximating longer dashes, but this is less ideal for the en dash's mid-length form. The en dash also serves occasionally as an itemization mark in bulleted lists, offering a minimalist alternative to dots or other symbols, such as "– Item one," for straightforward hierarchical presentation.

In web typography, the en dash is rendered in HTML via the Unicode character – or the entity &ndash;, guaranteeing cross-platform consistency and preventing fallback to hyphens in varied browsers. This approach aligns with W3C standards for typographic accuracy in digital typography.

Encoding and Substitutions

The en dash is assigned the Unicode code point U+2013 in the General Punctuation block. This encoding ensures consistent representation across platforms supporting Unicode; the character's decimal value is 8211. For keyboard input, users on Windows systems can insert the en dash by holding Alt and typing 0150 on the numeric keypad, provided the active code page supports it. On macOS, the shortcut Option + hyphen generates the character directly in most applications. In HTML and web contexts, it is represented by the named entity &ndash; or the numeric entity &#8211;.

In plain text environments without full Unicode support, such as older ASCII-based systems, the en dash is commonly substituted with the hyphen-minus (U+002D, a single "-"), which serves as a basic approximation despite its shorter length. Alternatively, two consecutive hyphens (--) may be used, though this convention originates from typesetting practices like TeX and is more typically associated with the em dash; such substitutions can lead to inconsistent rendering in legacy software or terminals limited to 7-bit ASCII.

Most major typefaces provide robust support for U+2013, ensuring proper display in standard Latin-based text. However, fonts optimized for non-Latin scripts may lack the en dash glyph, resulting in fallback to the hyphen or a generic dash form.

In programming and markup languages, the en dash is handled through specific conventions for reliable output. In LaTeX, typing two hyphens (--) automatically produces the en dash in text mode, with \textendash available for explicit insertion.
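
The encoding facts above can be checked with a few lines of Python using only the standard library. This is a minimal sketch: the helper ascii_fallback is a hypothetical name introduced for illustration, and the assumption that the Alt+0150 input corresponds to byte 150 of the Windows-1252 code page is stated in the comments.

```python
import html

EN_DASH = "\u2013"

# Code point: U+2013 in hexadecimal is 8211 in decimal.
assert ord(EN_DASH) == 0x2013 == 8211

# The named and numeric HTML entities decode to the same character.
assert html.unescape("&ndash;") == EN_DASH
assert html.unescape("&#8211;") == EN_DASH

# Assumption: Alt+0150 maps to byte 150 (0x96) in the Windows-1252 code page.
assert bytes([150]).decode("cp1252") == EN_DASH

# In UTF-8 the en dash is encoded as three bytes.
print(EN_DASH.encode("utf-8"))


def ascii_fallback(text: str) -> str:
    """Substitute a hyphen-minus for each en dash when only ASCII is available."""
    return text.replace(EN_DASH, "-")


print(ascii_fallback("pages 10\u201320"))
```

The fallback is lossy: once a range such as 10–20 has been flattened to 10-20, nothing in the text records that the mark was originally an en dash.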

Em Dash

Usage for Parentheticals and Interruptions

The em dash serves as a versatile punctuation mark for enclosing parenthetical information within a sentence, functioning similarly to parentheses but with greater emphasis on the interruption. For instance, it can set off nonessential clauses or phrases that provide additional detail without altering the main sentence structure, such as "The project—delayed by unforeseen circumstances—will proceed as planned." This usage draws attention to the interpolated material more dramatically than commas, making it ideal for explanatory or digressive elements in narrative or expository writing.

In dialogue and narrative contexts, the em dash indicates interruptions or abrupt halts in speech, signaling a break in the speaker's thought or an external disruption. Common in fiction, it appears at the end of an incomplete sentence, as in: "I can't believe you would—" she stammered, cut off by the slamming door. This conveys tension or hesitation more forcefully than an ellipsis, which suggests trailing off rather than sudden cessation.

For integrating quotations, the em dash facilitates attribution or interruption within quoted material, allowing seamless insertion of narrative commentary. An example is: "The decision is final," the judge declared—though his voice wavered with hesitation. Here, the dash separates the spoken words from the descriptive aside, maintaining flow while highlighting reluctance or contradiction. This approach is preferred in styles emphasizing clarity in reported speech over traditional commas.

The em dash can also enclose redacted or sensitive information, masking portions of text while preserving sentence integrity, such as "The document——portions withheld for security——revealed key findings." This application treats the omitted content as a parenthetical insertion, drawing the reader's focus to the surrounding context without disrupting readability.

Style guides vary on spacing around the em dash in these uses: The Chicago Manual of Style recommends no spaces for a closed appearance, as in "word—word," to ensure tight integration, while MLA style permits rendering as two hyphens (--) without spaces in manuscripts but prefers the true em dash unspaced in final publications. These conventions prioritize visual cohesion, differing from the spaced en dash that some styles use for parenthetical breaks.

Usage for Introductions and Lists

The em dash functions similarly to a colon by introducing lists or explanations, but it imparts a more informal and emphatic tone, creating a seamless yet dramatic transition within the sentence. For instance, a writer might construct a sentence like "She needed only a few essentials—milk, eggs, and bread—to complete the recipe," where the dash draws attention to the enumerated items without the formality of a colon. This usage is endorsed by style guides such as The Chicago Manual of Style, which notes that the em dash can substitute for a colon in introducing amplifying material, particularly when emphasizing the forthcoming content (Chicago Manual of Style, 17th ed., section 6.91). The Punctuation Guide similarly highlights its role in emphasizing conclusions or expansions, stating that "the em dash can be used in place of a colon when you want to emphasize the conclusion of your sentence," offering flexibility in narrative pacing.

In itemizing series within a sentence, the em dash can mark a break before the final element, adding rhythmic emphasis and avoiding the rigidity of commas alone. Consider the construction "The flag's colors represent passion, purity—and liberty," where the dash heightens the impact of the concluding term, evoking a sense of culmination. Merriam-Webster's guide to punctuation describes this as a way to introduce or amplify material in lists, exemplified by "Chocolate chip, oatmeal raisin, snickerdoodle—these are my favorite types of cookies," which uses the dash to summarize a preceding series with vivid flair (Merriam-Webster, "Em Dashes"). This technique is particularly effective in prose for building momentum, as it integrates the list fluidly rather than isolating it.

The em dash also enables repetition for emphasis or correction, reinforcing ideas through immediate restatement or clarification. An example is "The best option available? None—the perfect one," where the repeated structure underscores certainty and corrects any prior ambiguity. New Hart's Rules advocates this application, instructing writers to "use a dash to introduce an explanation, amplification, paraphrase, particularisation or correction of what immediately precedes it," promoting a dynamic flow that sustains reader engagement without abrupt stops (Butterworth, New Hart's Rules, p. 107). In literature, this repetition via dash often conveys emotional intensity, as seen in F. Scott Fitzgerald's The Great Gatsby, where phrases like "I hope she'll be a fool—that's the best thing a girl in this world can be—a beautiful little fool" use successive dashes to layer emphasis and irony (Fitzgerald, The Great Gatsby, ch. 1).

Compared to the colon, the em dash provides advantages in creative contexts by offering a softer, more conversational integration that enhances flow over stark delineation. While a colon signals a formal expectation ("She needed three things: milk, eggs, bread"), the dash blends the introduction more organically, fostering a narrative voice that feels immediate and less didactic. The Editor's Manual explains that "a colon is quieter; a dash is more emphatic and dramatic," yet in fiction, this drama translates to fluid momentum rather than interruption, aligning with Hart's preference for dashes in achieving "dynamic flow" in literary expression (Ritter, "Colon vs. Dash"). This versatility makes the em dash a favored tool in novels for maintaining tonal subtlety while amplifying key revelations or enumerations.

Typographic Details and Approximations

The em dash is conventionally rendered without spaces on either side, as in word—word. This unspaced presentation creates a seamless integration with surrounding text, emphasizing the dash's role as an inline interruption. Some style guides, however, recommend thin spaces before and after the em dash, such as word — word, to enhance readability in certain contexts.

The width of the em dash corresponds to one em, a unit equal to the font's point size (and historically associated with the width of a capital M), ensuring proportional consistency across font sizes. This full-em measurement allows for balanced visual weight in text composition, distinguishing it from the narrower en dash at half an em. In justified text blocks, adjustments may be applied to the em dash (tighter or looser spacing relative to adjacent letters) to maintain even line lengths and prevent awkward gaps.

Prior to digital typesetting, the em dash was approximated in typewriters and plain text using two consecutive hyphens (--), which provided a visual proxy roughly matching the intended length. Three hyphens (---) occasionally served as an alternative for emphasis, though two remained the standard convention. The introduction of word processors in the late 20th century enabled direct insertion of the proper em dash glyph, reducing reliance on these substitutions and standardizing typographic accuracy.

Across typefaces, the em dash typically appears as a straight horizontal line, but display fonts may incorporate subtle curves or stylistic flourishes at the ends to harmonize with the overall aesthetic. In digital environments like web pages, the em dash is rendered via the HTML entity &mdash; or the numeric character reference &#8212; (hexadecimal &#x2014;), with CSS properties such as font-family, font-size, and text-rendering influencing its appearance and integration into layouts.
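The typewriter approximations and HTML encodings described above can be illustrated with a minimal Python sketch. The function names `hyphens_to_em_dash` and `em_dash_to_html` are hypothetical, chosen only for this example, and the rule of treating runs of exactly two or three hyphens as an em dash is an assumption rather than a standard.

```python
import re

EM_DASH = "\u2014"  # U+2014 EM DASH


def hyphens_to_em_dash(text: str) -> str:
    """Replace runs of exactly two or three hyphens with an em dash.

    Single hyphens (as in "well-known") and longer runs (as in a
    "----" divider) are left untouched.
    """
    return re.sub(r"(?<!-)-{2,3}(?!-)", EM_DASH, text)


def em_dash_to_html(text: str) -> str:
    """Encode each em dash as the named HTML entity."""
    return text.replace(EM_DASH, "&mdash;")


if __name__ == "__main__":
    typed = "word--word and word --- word"
    converted = hyphens_to_em_dash(typed)
    print(converted)
    print(em_dash_to_html(converted))
```

A rule this simple would also rewrite command-line flags such as `--verbose`, so real text needs some way of skipping code spans before conversion.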

Applications in AI-Generated Text

Large language models (LLMs) often exhibit a pronounced tendency to overuse em dashes in generated text, particularly for dramatic pauses and parenthetical insertions in narrative or explanatory prose. This pattern emerges prominently in outputs from models such as GPT-4o and Gemini, where em dashes serve to enhance flow and emphasis, mimicking sophisticated human writing styles found in training corpora. For instance, AI-generated narratives frequently insert em dashes to break thoughts or add asides, resulting in denser punctuation than typical casual human writing.

Such overuse can be attributed to biases in training data, which predominantly draws from web-scraped texts, books, and articles rich in em dash usage for rhetorical effect. LLMs replicate these patterns without the contextual nuance humans apply, leading to repetitive or exaggerated application in non-literary contexts like emails or reports. Studies indicate that this replication contributes to detection challenges, as experts in 2023 struggled to differentiate AI-generated abstracts from human ones, with em dashes cited as a perceived but unreliable marker. Additionally, research highlights the role of punctuation marks like em dashes in how LLMs process context, where they act as structural anchors for coherence but can introduce semantic ambiguities if over-relied upon.

By 2025, advancements in LLMs have begun addressing some inconsistencies, such as variable spacing around em dashes, which are usually set without spaces in standard typography but were inconsistently spaced in earlier outputs due to tokenization quirks. On November 14, 2025, OpenAI updated ChatGPT to better follow custom instructions on em dash usage and formatting, including avoiding overuse and adhering to spacing rules, as announced by CEO Sam Altman. Prompting techniques, including explicit instructions to adhere to a particular style guide, have proven effective in reducing overuse and aligning AI prose with human norms.
For example, an unedited AI output might read: "The experiment failed—spectacularly so—due to unforeseen variables," while a prompted version revises to: "The experiment failed, spectacularly so, due to unforeseen variables," substituting commas for variety. Linguistics analyses from 2023 to 2025 emphasize these interventions to mitigate training data artifacts, promoting more balanced punctuation in AI-assisted writing. A public debate has emerged on social media platforms like X (formerly Twitter) regarding whether the use of em dashes signals AI-generated text. Users have pushed back against such claims, defending em dashes as longstanding elements of standard English grammar employed by human writers well before the rise of AI, and asserting that AI merely replicates human styles. This discussion underscores the limitations of using punctuation as a reliable detector for AI content.
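As a rough illustration of the two ideas above, the sketch below measures em dash density and rewrites a paired parenthetical with commas, reproducing the revision shown in the example sentence. Both function names are hypothetical, the density measure is an arbitrary choice made for illustration, and, as the discussion notes, a count of this kind is not a reliable indicator of AI authorship.

```python
import re

EM_DASH = "\u2014"  # U+2014 EM DASH


def em_dashes_per_100_words(text: str) -> float:
    """Return the number of em dashes per 100 words (0.0 for empty text)."""
    words = re.findall(r"\w+", text)
    if not words:
        return 0.0
    return 100.0 * text.count(EM_DASH) / len(words)


def paired_dashes_to_commas(sentence: str) -> str:
    """Rewrite an unspaced parenthetical set off by two em dashes with commas."""
    pattern = rf"{EM_DASH}([^{EM_DASH}]+){EM_DASH}"
    return re.sub(pattern, r", \1, ", sentence)


if __name__ == "__main__":
    draft = "The experiment failed\u2014spectacularly so\u2014due to unforeseen variables"
    print(round(em_dashes_per_100_words(draft), 1))
    print(paired_dashes_to_commas(draft))
```

The comma rewrite assumes unspaced dashes and a parenthetical in mid-sentence; a single dash introducing a list or conclusion is deliberately left alone.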

Comparisons and Variants

En Dash versus Em Dash

The en dash (–) and em dash (—) are the two primary dashes in English punctuation, distinguished primarily by their length and function. The en dash, approximately the width of a capital N, serves to indicate connections or spans, such as linking related elements or denoting ranges without implying a break in thought. In contrast, the em dash, roughly the width of a capital M and thus longer, functions to separate or interrupt, creating a stronger pause or setting off supplementary information. This length-based distinction underscores their roles: the en dash links (shorter for continuity), while the em dash separates (longer for emphasis or division).

Common confusions arise in informal digital writing, where the en dash is frequently replaced by a hyphen (-) due to keyboard limitations or lack of typographic awareness, leading to reduced clarity in connecting spans. Similarly, the em dash is sometimes interchanged with the en dash or double hyphens (--), which can blur interruptions and affect readability by weakening the intended rhetorical pause. Such substitutions are prevalent in plain-text environments like email or social media, where precise rendering is not prioritized, potentially causing misinterpretation of linked versus separated ideas.

Style guides differ markedly in their treatment of these dashes. The Chicago Manual of Style maintains a strict distinction, recommending the en dash exclusively for spans and connections (e.g., bridging open compounds) and reserving the em dash for interruptions or parenthetical elements, viewing hyphens as unsuitable for the former. Conversely, the Associated Press (AP) Stylebook eschews the en dash entirely, favoring hyphens for ranges and connections while employing the em dash only for abrupt breaks or emphasis, reflecting a simpler approach suited to journalistic brevity. This contrast highlights how formal book publishing (Chicago) prioritizes typographic precision, whereas news writing (AP) emphasizes simplicity and compatibility.
To decide between the two, consider the semantic intent: opt for the en dash when indicating "to" or "between" in connections, such as denoting a relationship between entities, and use the em dash for pauses that disrupt flow. For instance, in ambiguous phrasing where a span might be read as an interruption, the en dash clarifies linkage (e.g., distinguishing a directional connection from a sudden interruption), while the em dash resolves cases where a connection could mimic a break by enforcing separation. This simple test, whether the dash bridges (en) or breaks (em), avoids overlap and enhances precision, in line with guidelines that emphasize contextual role over mere substitution.
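The "bridges versus breaks" test can be sketched for one narrow case: numeric ranges typed with a hyphen, which a Chicago-style workflow would set with an en dash (AP style would leave the hyphen in place). The function name `hyphen_ranges_to_en_dash` and its matching rule are assumptions made for this example, not part of any style guide.

```python
import re

EN_DASH = "\u2013"  # U+2013 EN DASH

# Two digit groups joined by one hyphen, with no further digit or hyphen on
# either side, so that ISO dates such as 2025-11-14 are left unchanged.
_RANGE = re.compile(r"(?<![\d-])(\d+)-(\d+)(?![\d-])")


def hyphen_ranges_to_en_dash(text: str) -> str:
    """Replace the hyphen in simple numeric ranges with an en dash."""
    return _RANGE.sub(rf"\1{EN_DASH}\2", text)


if __name__ == "__main__":
    print(hyphen_ranges_to_en_dash("See pages 10-20 on the war of 1914-1918."))
```

A pattern like this cannot tell a range from other two-part numbers (a score, a part number, a short phone extension), so its output still needs an editor's eye.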

Horizontal Bar and Swung Dash

The horizontal bar (Unicode U+2015, ―) is a typographic character used to introduce quoted text in some styles, known as a quotation dash, and is often wider than an em dash. It may also appear in musical notation to represent multi-measure rests, spanning the measure width with a number indicating duration. Due to limited font support and its niche role, the horizontal bar is frequently substituted with an em dash in plain text environments, providing a comparable but shorter approximation of its length and function.

The swung dash (Unicode U+2053, ⁓), characterized by its wavy, oscillating form, has historical roots in lexicography, where it replaces the entry word in examples to avoid repetition, such as standing in for the headword in definitions. This character, less common in everyday writing, is often rendered with the tilde (~, U+007E) as a substitute in plain text, digital glossaries, or programming contexts where approximate equivalence or placeholders are needed, retaining its utility even though the tilde is smaller and flatter.

Encoding and International Aspects

Unicode Representation

The Unicode Standard encodes various dash-like characters to support typographic and compatibility needs across scripts and legacy systems. The core dash characters, including the hyphen-minus (U+002D), figure dash (U+2012), en dash (U+2013), em dash (U+2014), and horizontal bar (U+2015), were present by Unicode version 1.1 (June 1993), which standardized punctuation from earlier character sets. The swung dash (U+2053) was added later, in Unicode 4.0 (April 2003), to address compatibility with additional typographic traditions.

Most of these characters reside in the General Punctuation block (U+2000–U+206F), which consolidates dashes, hyphens, and related marks for broad interoperability, while the hyphen-minus appears in the Basic Latin block (U+0000–U+007F) due to its foundational role in ASCII. The wave dash (U+301C), a related character used in East Asian typography, is encoded in the CJK Symbols and Punctuation block (U+3000–U+303F) and was also present in Unicode 1.1. Related characters include the hyphen (U+2010), intended as an unambiguous hyphen without the overloaded meanings of the hyphen-minus, also from Unicode 1.1. In non-Unicode environments, such as early ASCII systems, multiple hyphen-minus characters (e.g., --) often served as approximations for longer dashes.

The following table provides a quick reference for the code points, official names, and common HTML decimal entities for these characters:
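| Code Point | Name | HTML Entity (Decimal) |
| U+002D | HYPHEN-MINUS | &#45; |
| U+2010 | HYPHEN | &#8208; |
| U+2012 | FIGURE DASH | &#8210; |
| U+2013 | EN DASH | &#8211; (&ndash;) |
| U+2014 | EM DASH | &#8212; (&mdash;) |
| U+2015 | HORIZONTAL BAR | &#8213; |
| U+2053 | SWUNG DASH | &#8275; |
| U+301C | WAVE DASH | &#12316; |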
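The code points, names, and decimal entities in the table can be cross-checked with Python's standard `unicodedata` module. This is a minimal sketch: the list `DASH_CODE_POINTS` and the helper `dash_rows` are names introduced here for illustration only.

```python
import unicodedata

# Code points taken from the table above.
DASH_CODE_POINTS = [0x002D, 0x2010, 0x2012, 0x2013, 0x2014, 0x2015, 0x2053, 0x301C]


def dash_rows():
    """Return (code point label, Unicode name, decimal HTML reference) tuples."""
    return [
        (f"U+{cp:04X}", unicodedata.name(chr(cp)), f"&#{cp};")
        for cp in DASH_CODE_POINTS
    ]


if __name__ == "__main__":
    for label, name, entity in dash_rows():
        print(f"{label}\t{name}\t{entity}")
```

The decimal reference is simply the code point written in base 10, which is why U+2014 (hexadecimal 2014) becomes &#8212;.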

Usage in Non-English Languages

In French typography, the em dash (tiret cadratin) is commonly employed with spaces on both sides to denote dialogue turns (for example, — Bonjour — comment allez-vous ?), distinguishing it from the spaced en dash (tiret demi-cadratin) used for ranges such as 2020 – 2025. This spaced convention aligns with guidelines from the Imprimerie Nationale, emphasizing clarity in literary and formal texts.

German typography favors the en dash (Gedankenstrich or Halbgeviertstrich) with surrounding spaces for sentence breaks and parentheticals, such as Das ist – obwohl überraschend – wahr, while hyphens remain the standard for hyphenated compounds. This spaced approach, detailed in German publishing standards, contrasts with unspaced usage in some informal contexts but maintains precision in academic and journalistic writing.

In Spanish, the em dash (raya) is used unspaced to signal interruptions or dialogue shifts, for example, —No lo sé—interrumpió ella, as prescribed by the Real Academia Española (RAE), with no opening punctuation like English quotation marks. Regional variations may include slight adaptations in spacing for emphasis, but the core unspaced usage persists across the Spanish-speaking world.

East Asian languages exhibit distinct dash preferences, minimizing Western em and en dashes. Japanese typography relies on the wave dash (U+301C, 〜) for continuations and ranges, such as 午後5時〜7時, often in informal or enumerative contexts, per guidelines from the Japanese Standards Association. In Chinese, a double em dash (——) serves for breaks and asides, as in 他——突然停顿——继续说, following the GB/T 15834-2011 standard; the doubled length avoids confusion with the character for "one" (一).

As of 2025, multilingual publishing has also shaped practice, with style guides like the European Commission's using unspaced em dashes for parentheticals in English texts while adapting to cross-lingual needs.
