Extended vocal technique
from Wikipedia

Vocalists are capable of producing a variety of extended technique sounds. These alternative singing techniques have been used extensively in the 20th century, especially in art song and opera. Particularly famous examples of extended vocal technique can be found in the music of Luciano Berio, John Cage, George Crumb, Peter Maxwell Davies, Hans Werner Henze, György Ligeti, Demetrio Stratos, Meredith Monk, Giacinto Scelsi, Arnold Schoenberg, Salvatore Sciarrino, Karlheinz Stockhausen, Tim Foust, Avi Kaplan, and Trevor Wishart.

Timbral techniques

Phrasing

Spoken

Spoken text is frequently employed. The Italian term "parlato" has a similar meaning.

Rapping

Sprechgesang

Sprechgesang is a combination of singing and speaking. It is closely associated with Arnold Schoenberg (particularly his Pierrot Lunaire, which uses Sprechgesang for its entire duration) and the Second Viennese School. Schoenberg notated Sprechgesang by placing a small cross through the stem of a note, indicating approximate pitch. In more modern music, “Sprechgesang” is frequently simply written over a passage of music.

Inhaling

In ingressive singing, sound is produced while the singer is inhaling. It is used in experimental contemporary classical compositions, such as the 2006 chamber opera Ursularia by Nicholas DeMaison, for its ability to produce a variety of extreme high and low pitches impossible to create in typical exhaled vocals.[1] In popular music styles, ingressive singing can be combined with vocal distortion techniques for extreme metal vocals like death metal growls. In beatboxing, it is used to create certain percussion sounds (like the "inward K snare"). A careful mixture of ingressive and egressive sounds allows a beatboxer to sustain a rhythmic phrase indefinitely without needing to pause for breath.[2]

Pitch

Falsetto

A vocal technique allowing the singer to sing notes higher than their modal vocal range.

Glottal sounds

A "frying"-type sound may be produced by means of the glottis. This technique has been frequently used by Meredith Monk.

Yodelling

Yodelling is performed by rapidly alternating between a singer's chest and head voice.

Ululation

A long, wavering, high-pitched vocal sound resembling a howl with a trilling quality. It is produced by emitting a high-pitched loud voice accompanied with a rapid back-and-forth movement of the tongue and the uvula. Ululation is practiced in certain styles of singing, as well as in communal ritual events, used to express strong emotion.

Squeaking

The thyrohyoid muscle can be tensed when speaking to shorten the vocal cords, allowing the use of an extremely high pitch but with a different timbre and resonance compared to a typical falsetto. This technique was originated and named by the rapper 645AR.[3][4]

Reverberation

Vocal tremolo

A vocal tremolo is performed by rapidly pulsing the air expelled from the singer's lungs while singing a pitch. These pulses usually occur from 4–8 times per second.[citation needed]
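
As a rough illustration of the pulse rates mentioned above, the following Python sketch models tremolo as a periodic loudness envelope applied to a steady pitch. The function names, the sinusoidal envelope shape, and the default values (a 220 Hz pitch, 6 pulses per second, 50% depth) are assumptions chosen for illustration, not measurements of any singer.

```python
import math


def tremolo_envelope(t, pulse_rate_hz=6.0, depth=0.5):
    """Loudness multiplier at time t (seconds); stays between 1 - depth and 1."""
    # Assumed sinusoidal pulsing; real breath pulses are less regular.
    return 1.0 - depth * 0.5 * (1.0 + math.sin(2.0 * math.pi * pulse_rate_hz * t))


def tremolo_tone(pitch_hz=220.0, pulse_rate_hz=6.0, depth=0.5,
                 duration_s=1.0, sample_rate=8000):
    """Return samples of a sine tone whose loudness pulses at pulse_rate_hz."""
    samples = []
    for i in range(int(duration_s * sample_rate)):
        t = i / sample_rate
        carrier = math.sin(2.0 * math.pi * pitch_hz * t)
        samples.append(tremolo_envelope(t, pulse_rate_hz, depth) * carrier)
    return samples
```

Changing `pulse_rate_hz` between 4 and 8 covers the range quoted above; the pitch of the carrier stays fixed while only its loudness pulses.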

Vocal trill

A vocal trill is performed by adding singing vibrato while performing a vocal tremolo.[clarification needed]

Rekuhkara

Harmonics

Overtones

By manipulating the vocal cavity, overtones may be produced.[5] Although used in the traditional music of Mongolia, Tuva, and Tibet, overtones have also been used in the contemporary compositions of Karlheinz Stockhausen (Stimmung),[6] as well as in the work of David Hykes.[7]

Undertones

By carefully controlling the configuration of the vocal cords, a singer may obtain "undertones", which arise from period doubling, tripling, or a higher degree of multiplication;[citation needed] this may give rise to tones that roughly coincide with those of an inverse harmonic series. Although the octave below is the most frequently used undertone, a twelfth below and other lower undertones are also possible. This technique has been used most notably by Joan La Barbara.[1] However, undertones may be generated by processes that involve more than the vocal folds.[citation needed] For instance, the ventricular folds (also called the false vocal folds) may be recruited, probably by solely aerodynamic forces, and made to vibrate with the vocal folds, generating undertones like those found in Tibetan low-pitched chant.[citation needed]
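
The arithmetic behind the octave and the twelfth below can be sketched in a few lines of Python. This is a minimal illustration that assumes an idealized inverse harmonic series, in which period multiplication by n yields exactly f0/n; the function names and the 440 Hz example are hypothetical.

```python
import math


def undertone_series(fundamental_hz, max_divisor=4):
    """Return (divisor, frequency) pairs for f0/2, f0/3, ..., f0/max_divisor."""
    return [(n, fundamental_hz / n) for n in range(2, max_divisor + 1)]


def interval_below_in_semitones(divisor):
    """Distance below the sung pitch, in equal-tempered semitones."""
    return 12.0 * math.log2(divisor)


for n, freq in undertone_series(440.0):
    # Period doubling (n = 2) is an octave below; tripling (n = 3) is about
    # 19 semitones below, i.e. an octave plus a fifth (a twelfth).
    print(n, round(freq, 1), round(interval_below_in_semitones(n), 2))
```

Under this idealization, period doubling lands exactly an octave below the sung pitch, while tripling lands slightly more than 19 equal-tempered semitones below it.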

Multiphonics

By overstressing or by asymmetrically contracting the laryngeal muscles, a multiphonic or chord may be produced.[citation needed] This technique features in the 1968 composition Versuch über Schweine by the German composer Hans Werner Henze. In voice pathology, there are various descriptions of somewhat similar effects, such as those found in patients with diplophonia, a condition that produces a "double voice", i.e., two or even more simultaneous pitches.[citation needed]

Distortion

Screaming

Growling

Buccal speech

A form of alaryngeal speech that has a high pitch that can be used for speaking and singing. It is most familiar as the voice of Donald Duck.

Non-vocal sounds

Besides producing sounds with the mouth, singers can be required to clap or snap their fingers, shuffle their feet, or slap their body. This is usually notated by writing the appropriate word over a note. These gestures are sometimes written on a separate one-line staff as well.[citation needed]

Artificial timbral changes

Inhalation of gases

Inhaled helium is occasionally used to drastically change the timbre of the voice. When inhaled, helium changes the resonant properties of the human vocal tract, resulting in a very high, squeaky voice. In Salvatore Martirano's composition L’s GA the singer is required to inhale from a helium mask.

Conversely, an unnaturally low voice may be achieved by asking the singer to inhale sulfur hexafluoride. This technique carries higher risk than helium inhalation due to the gas's heavy weight causing it to settle in the lungs, where it displaces oxygen.[8]
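
Why a light gas raises the voice's resonances and a heavy gas lowers them can be sketched with a simple tube model. The Python example below assumes the vocal tract behaves like a uniform tube closed at one end (a textbook quarter-wavelength resonator), a hypothetical tract length of 0.17 m, and approximate room-temperature sound speeds for the pure gases. A singer's breath is a mixture of gas and air, so real shifts are smaller than these figures suggest.

```python
# Approximate speeds of sound in pure gases near room temperature (assumed values).
SPEED_OF_SOUND_M_PER_S = {
    "air": 343.0,
    "helium": 1007.0,
    "sulfur hexafluoride": 134.0,
}


def tube_resonances(speed_of_sound, tract_length_m=0.17, count=3):
    """Resonances of a tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * speed_of_sound / (4.0 * tract_length_m)
            for n in range(1, count + 1)]


for gas, speed in SPEED_OF_SOUND_M_PER_S.items():
    # Every resonance scales in direct proportion to the speed of sound.
    print(gas, [round(f) for f in tube_resonances(speed)])
```

In this model only the resonances of the tract move with the gas; the model says nothing about the rate at which the vocal folds themselves vibrate.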

During the production of the album Eternal Atake 2, the rapper Lil Uzi Vert was recorded in the studio inhaling a gas that appeared to be nitrous oxide, an anesthetic that is used recreationally and that creates an unnaturally deep voice when inhaled.[9] The album heavily uses low-pitched vocals which sound identical to the effects of nitrous oxide on the voice,[10] though it is unconfirmed if these were created by inhaling gas, or through digital manipulation to give a similar effect.

Artificial vocal enhancement

Amplification, such as a microphone or even a megaphone, possibly with electronic distortion of the voice, is frequently used in contemporary composition. Through the use of various electronic distortion techniques, vocal enhancement possibilities are nearly unlimited. A good example of this technique can be found in much of the music written and performed by Laurie Anderson.

Since the invention of Auto-Tune in the 1990s, it and other digital pitch correction methods have been widely used in commercial music genres to create expressive effects, popularized by artists in many genres including the pop singer Cher, the R&B singer T-Pain, and the hip-hop artist Future. Contemporary raï and Berber folk music, which historically placed emphasis on glissando, embraced Auto-Tune to enhance the beauty of vocals.[11]

Singing into the piano

There are a number of pieces which require a singer to lean over a (sometimes amplified) piano and sing directly into the strings. If the strings are not damped, the effect is to start audible sympathetic vibrations in the piano. By far the most famous piece to use this technique is Ancient Voices of Children by George Crumb.[citation needed]

from Grokipedia
Extended vocal technique encompasses a diverse array of non-traditional methods of vocal production employed in twentieth- and twenty-first-century music, extending beyond conventional singing to generate unconventional timbres, pitches, and textures through techniques such as multiphonics, glottal stops, inhalation sounds, vocal fry, and speech-singing hybrids like Sprechstimme. These approaches treat the voice as a versatile instrument capable of mimicking other sounds or exploring physiological limits, often requiring amplification for clarity and projection in performance.

Extended vocal techniques emerged in the early twentieth century amid avant-garde movements such as Futurism and Dadaism, which challenged traditional notions of musical beauty and vocal norms. Pioneering works include Arnold Schoenberg's Pierrot Lunaire (1912), which introduced Sprechstimme, a half-spoken, half-sung delivery, to evoke emotional intensity and rhythmic precision. In the United States, composers like Charles Ives incorporated spoken elements and shouts in pieces such as Charlie Rutlage (1920), while post-World War II experimentalism, influenced by John Cage's emphasis on indeterminate sounds, had further expanded the palette with phonetic explorations by 1958.

Key figures in the mid-century advancement include Luciano Berio, whose Sequenza III (1966) for solo voice, written for Cathy Berberian, integrated laughter, whispers, and coughs to blur boundaries between speech, song, and noise, analyzing the voice through physiological categories like phonation modes and adjustments. The 1970s saw formalized study through ensembles like the Extended Vocal Techniques Ensemble (EVTE) at the University of California, San Diego, which cataloged techniques into monophonic (e.g., whistles), multiphonic (e.g., simultaneous pitches via ventricular folds), and miscellaneous categories (e.g., clicks, sighs), as documented in their 1974 and 1975 lexicons.

Contemporary applications span genres, from experimental opera to electroacoustic works, with pedagogical resources now advocating safe practice to mitigate vocal strain and emphasizing gradual training in vocal control and manipulation, including when drawing from non-Western traditions. Notable performers, including Joan La Barbara, have pushed these techniques into improvisational and multimedia contexts, highlighting the voice's potential for timbral innovation and expressive depth.

Introduction

Definition and scope

Extended vocal technique encompasses a range of nontraditional vocal practices that deviate from the conventions of Western operatic singing, focusing instead on experimental and non-Western approaches to sound production. These techniques involve altering the natural sound of the voice through innovative methods, producing unconventional sounds that expand the instrument's sonic palette beyond melodic or lyrical norms. Defined as extranormal sounds that challenge established parameters of vocal production, extended vocal techniques include practices like growls, whispers, and percussive vocalizations, drawing from diverse traditions to transcend linguistic and tonal constraints.

The scope of extended vocal technique emphasizes timbral exploration, multiphonics (simultaneous production of multiple pitches), distortion effects like screams and growls, and integration with other media such as electronics or instruments, applied across contemporary classical, experimental, and other genres. In contemporary classical music, these techniques appear in works that prioritize sonic innovation, while experimental contexts use them for emotional depth, and some traditions incorporate elements such as overtone singing for harmonic complexity. They play a pivotal role in 20th- and 21st-century composition by enabling composers to evoke abstract emotions, abstract forms, and interdisciplinary dialogues, and are either notated precisely or left to performer discretion.

Central to these practices are phonation modes, which describe variations in vocal fold behavior that differentiate modal phonation (balanced, neutral production) from extended modes such as breathy (airy quality), pressed (tense sound), and flow phonation (forceful expression). Physiologically, extended techniques rely on the larynx, which houses the vocal folds (layered structures that vibrate to modulate airflow), along with resonators like the vocal tract, which shape and amplify sounds without requiring excessive strain when properly managed. These elements allow for controlled manipulation of tension, closure, and resonance to achieve diverse timbres safely. Emerging in the early 20th century, such techniques marked a shift toward vocal experimentation.

Historical development

Extended vocal techniques trace their origins to non-Western traditions, including Tuvan throat singing (khoomei), which has been practiced for generations as a means to mimic and connect with the environment. Similarly, Inuit katajjaq, or throat games, developed as an indigenous practice among women in Inuit communities, serving as playful vocal competitions and cultural expressions with roots predating European contact. These techniques, involving multiphonic production and rhythmic interplay, prefigure modern extensions by emphasizing the voice's timbral and non-lyrical potential.

In the early 20th century, Western composers adopted and adapted such ideas amid avant-garde movements. The Italian Futurists, particularly Luigi Russolo in his 1913 manifesto The Art of Noises, promoted noise-intoners (intonarumori) to liberate music from traditional harmony, inspiring vocal experimentation by integrating industrial and everyday sounds into performance. This evolved in the 1920s Dada scene, where Kurt Schwitters' Ursonate (1922–1932) exemplified sound poetry through abstract vocal utterances, rejecting semantic meaning in favor of phonetic and rhythmic exploration. By the 1950s, John Cage further propelled these innovations with a 1958 work employing chance operations and unconventional phonations to treat the voice as an indeterminate instrument.

Post-World War II developments saw extended techniques permeate diverse genres. The Fluxus movement of the 1960s, through happenings led by figures like Nam June Paik, incorporated spontaneous vocal emissions in multimedia performances, challenging conventional musical boundaries under Cage's influence. In the 1970s, French spectralism, pioneered by Gérard Grisey, applied acoustic analysis to vocal and instrumental sounds in cycles like Les Espaces Acoustiques (1976), emphasizing harmonic spectra and microtonal inflections. These approaches were integrated into minimalism, via repetitive vocal motifs in works by composers like Steve Reich, and into noise genres, where distorted phonations amplified raw timbres. A key milestone was Luciano Berio's Sequenza III (1966), a virtuosic score for female voice that demanded whispers, sighs, laughter, and multiphonics, marking a pinnacle of theatrical and technical expansion.

The 21st century witnessed further evolution through digital integration and global fusion. Electronic music employed sampling to layer and manipulate extended vocals, while some artists blended non-Western influences with electronic processing on albums of the early 2000s, creating immersive, hybrid soundscapes that fused organic techniques with algorithmic alterations. This era's innovations continue to democratize and hybridize vocal expression across cultural boundaries.

Natural timbral techniques

Phrasing and articulation variations

Phrasing and articulation variations in extended vocal techniques modify the temporal and textural delivery of the voice, emphasizing rhythmic experimentation and breath control to transcend conventional melodic delivery. These approaches integrate spoken rhythms, irregular intonations, and consonantal articulations, allowing performers to evoke percussive or atmospheric qualities through altered phrasing structures.

Spoken elements form a foundational aspect, incorporating recitation with musical inflection to bridge linguistic and musical domains. Sprechstimme, pioneered by Arnold Schoenberg in his 1912 composition Pierrot Lunaire, exemplifies this as a half-spoken style where performers intone notated pitches without sustaining them, adhering strictly to rhythmic notation while adopting speech-like inflections across a range from E♭3 to G♯5. Schoenberg specified that the melody "should definitely not be sung" but must be "transformed into a speech melody," distinguishing it from held sung notes or neutral speech by combining precise rhythm with gliding pitch transitions. This technique employs head voice and bright timbre to convey satirical or expressionist tones, and in performance its dramatic impact is heightened by high pitch and tense phonation.

Closely related, Sprechgesang blends speech and song through irregular pitches and fluid delivery, creating an intermediate vocal mode distinct from fully intoned singing. It facilitates expressive textual phrasing in 20th-century works by allowing performers to shift seamlessly between recitative-like speech and melodic contours, often to underscore emotional or abstract narratives. Unlike Sprechstimme's rhythmic precision, Sprechgesang prioritizes interpretive flexibility in pitch approximation, enabling applications beyond expressionism into broader contemporary vocal experimentation.

Rapping and beatboxing represent rhythmic extensions of spoken phrasing, leveraging percussive articulation and patterned speech to construct dense, non-melodic structures. In rapping, performers employ rhythmic speech with pitch variations to delineate phrasing, such as dropping contours at line ends to mimic exaggerated declamation, as in Kendrick Lamar's "Vice City" (2015), where this reinforces verse organization alongside rhyme and rhythm. Beatboxing amplifies this through vocal imitation of percussion, using non-syllabic sounds like inward clap snares and bilabial trills to generate continuous rhythmic streams, often inhaled for seamless phrasing and polyphonic illusions that disguise the voice's linguistic origins. These techniques, rooted in hip-hop, extend vocal delivery by alternating timbres, such as growls for bass, to create layered, instrument-mimicking patterns.

Inhaled phrasing introduces inverse phonation, where sound emerges during inhalation via partially adducted vocal folds, producing ethereal, breathy effects that destabilize traditional phrasing. This ingressive technique yields harsher, less resonant timbres with weaker harmonics compared to egressive phonation, fostering unstable and otherworldly expressions in experimental contexts. Notable applications include a 1968 work by Helmut Lachenmann, which notates ingressive pitches with glissandi and alternations for dramatic contrast, and Georges Aperghis's Récitations pour voix seule (1978), where continuous airflow shifts enhance textual abstraction.

Specific articulations further diversify phrasing by embedding consonants into melodic lines for percussive enhancement. Glottal stops, created by abrupt vocal fold closure, deliver sharp onsets that punctuate phrasing, as in Bobby McFerrin's Play (1992), where they define tempo and rhythmic precision within improvisational melodies. Fricatives, involving turbulent airflow, add frictional texture for subtle percussion, evident in Linda Sharrock's Black Woman (1969), where they contribute to micro-rhythmic complexity and emotional intensity. Plosives, such as bilabial or alveolar bursts, integrate explosive attacks into phrasing, allowing McFerrin to simulate basslines and snares, thereby merging rhythmic drive with melodic flow in extended vocal performance.

Pitch and register extensions

Extended vocal techniques encompass a variety of methods to expand the singer's pitch range and manipulate register transitions beyond the conventional range, enabling access to frequencies and timbres that challenge traditional boundaries. These approaches often involve altering the vibration patterns of the vocal folds to produce lighter, higher, or more unstable registers, which can extend the usable range from sub-bass growls to very high whistles. Such extensions are integral to many genres, allowing performers to evoke emotional depth or mimic non-human sounds while maintaining control over intonation.

Falsetto and head voice represent light register productions that facilitate singing above the modal (chest) range, typically achieved by thinning the vocal folds and reducing their closure for a breathier, flute-like quality. In falsetto, the arytenoid cartilages approximate loosely, allowing only the edges of the vocal folds to vibrate, which produces pitches often exceeding the upper limits of chest voice, commonly up to E5 or higher in trained sopranos. Head voice, a related but fuller mechanism, engages the cricothyroid muscles more actively to elongate and tense the folds, enabling smoother transitions and greater dynamic control, including seamless shifts from modal to head register in rapid scalic passages. These techniques demand precise breath support to avoid strain, and their use to enhance agility has long been documented.

The whistle register extends pitch capabilities into ultra-high frequencies, often above C6, through partial vibration of the vocal folds' ligamentous edges, creating a piercing, flutelike tone akin to a bird call. This register relies on extreme tension and minimal mass in the folds, with airflow modulated to sustain notes in this range, and is prevalent in pop and operatic singing; for instance, Mariah Carey's use of whistle tones in recordings from 1991 popularized the technique, reaching frequencies around 2500 Hz. Physiologically, it involves the thyroarytenoid muscles relaxing while the cricothyroid tilts the thyroid cartilage, and training focuses on gradual ascent to prevent vocal damage, as evidenced in studies of professional sopranos. Whistle register's clarity arises from its harmonic structure, where higher partials briefly interact with the fundamental to amplify presence without distortion.

Yodelling involves rapid, voluntary shifts between chest and head registers, producing a distinctive "break" that alternates between full, resonant low tones and airy highs, often within a single phrase. Rooted in Alpine folk traditions and paralleled in African pastoral songs like those of the Fulani people, the technique exploits the passaggio, the transition zone between registers, by abruptly relaxing the vocal folds to flip from thick to thin vibration modes. Performers in American music adapted yodelling, achieving shifts at around 300-400 Hz, while modern applications emphasize its rhythmic and melodic versatility. This method requires strong diaphragmatic control to maintain pitch accuracy during flips, and ethnographic studies highlight its cultural role in signaling across distances.

Ululation, a high-pitched trilling warble, extends pitch expression through rapid oscillations of the tongue or uvula, generating a sustained, wavering note in the upper register for celebratory or emotive calls. Common in Middle Eastern (e.g., zaghrouta) and African traditions like those of the Berber or Zulu peoples, it typically occupies frequencies from 1000-2000 Hz and serves ritualistic purposes, such as wedding cheers or warrior summons, dating back to ancient communal practices. The technique involves a fixed high pitch with a superimposed trill, achieved by rapid articulator movement, and its emotional intensity stems from the vibrational feedback in the singer's resonators. Vocal analyses confirm its non-pathological nature when executed with proper hydration, distinguishing it from strained screams.

Glottal sounds, including vocal fry, creak, and multiphonic glottalizations, enable microtonal pitch effects and sub-register extensions by inducing irregular or partial fold closures, producing low-frequency rattles or clustered tones below the typical speaking range. Vocal fry, characterized by a creaky quality at 20-70 Hz, results from relaxed arytenoid approximation and is used for eerie undertones, as in Diamanda Galás's experimental works; creak similarly adds gritty multipitch layers through asynchronous fold vibration. These techniques, explored in 20th-century composition, allow for quarter-tone inflections and harmonic clusters without external aids, with applications in pieces like György Ligeti's vocal explorations. Their integration enhances textural depth, though prolonged use necessitates monitoring to avoid laryngeal fatigue.

Resonance and vibration effects

Extended vocal techniques that manipulate resonance and vibration enhance timbral depth by altering the acoustic properties of the vocal tract and introducing oscillatory elements, often creating buzzing, fluttering, or shifting tonal colors without fundamentally changing pitch or register. These methods leverage the singer's control over diaphragmatic pressure, articulator movement, and resonator shaping to produce effects that add texture and expressivity in contemporary music. Such techniques are distinct from natural vibrato, emphasizing intentional rapidity or non-laryngeal sources for artistic effect.

Vocal tremolo involves rapid pitch oscillation, typically achieved through controlled diaphragmatic pulses that create a wavering effect faster than standard vibrato, often exceeding 8 Hz in rate and perceived as a wide, intentional fluctuation rather than a natural embellishment. Unlike vibrato, which pulses at 5-8 Hz to enrich tone through subtle variations in pitch, loudness, and timbre, tremolo prioritizes speed and extent for dramatic emphasis, sometimes resembling a slowed yodel or integrating register shifts in extended applications. This technique, rooted in appoggio breathing for stability, appears in modern compositions to evoke tension or mimic instrumental effects.

The vocal trill extends traditional articulation by rapidly alternating between two notes, often employing diaphragmatic control to produce a fluttering effect; in extended contexts, this can involve non-adjacent intervals beyond the standard half or whole step, creating microtonal or dissonant textures for heightened expressivity. Performed with relaxed laryngeal freedom, the trill enhances agility and can span wider intervals in contemporary works, distinguishing it from ornamental usage by its integration into melodic lines as a timbral device.

Rekuhkara, a traditional Ainu throat-singing practice, produces overtone-like effects through interactive modulation: one performer provides a sustained vocal tone while the other, with a closed glottis, shapes the sound using their vocal tract to generate buzzing overtones and harmonic variations. This dyadic technique emphasizes resonance manipulation for collective buzzing effects, was historically used in games rather than solo performance, and has influenced experimental vocal explorations.

Lip and tongue trills introduce non-laryngeal vibrations into the vocal line, where rapid fluttering of the lips or tongue against airflow creates a periodic buzz that adds rhythmic texture and resonance enhancement, often integrated as semi-occluded vocal tract exercises in performance. These articulator-driven oscillations, akin to a "purr" or "squeak," promote even phonation and can be voiced to blend with sustained tones, providing timbral layering in avant-garde pieces without relying on vocal fold distortion.

Formant shifting achieves timbral alteration through deliberate adjustments to the vocal tract's shape, such as modifying mouth opening, tongue position, or lip rounding, which repositions formant frequencies to emphasize different harmonics without altering the fundamental pitch. In singing, this technique, often termed formant tuning, aligns the lowest formants with specific partials to boost projection and color, as seen in vowel modifications that shift F1 and F2 for brighter or darker resonances in high registers. These resonance and vibration effects can be combined with distortion techniques to yield a fuller, more complex sound profile in experimental compositions.

Harmonic and overtone production

Overtone singing involves the selective amplification of partials within the vocal spectrum to produce multiple audible pitches simultaneously, creating a drone accompanied by a distinct melody. In Tuvan throat singing, known as khoomei, the sygyt style exemplifies this by generating a fundamental drone around 100-200 Hz while emphasizing higher harmonics, often the 6th to 12th partials, to form a whistle-like melody up to 2000-3000 Hz. This technique relies on precise vocal tract adjustments to merge formants, enhancing specific harmonics while suppressing others, as demonstrated in acoustic analyses of professional Tuvan performers. The kargyraa style, in contrast, produces a lower-pitched layer through reinforced subharmonics, but maintains a biphonic structure with a steady drone and a secondary pitch derived from partial amplification.

The physiological mechanism underlying harmonic and overtone production centers on vocal tract filtering, where singers modify the shape of the vocal tract, including the lips, so that it acts as a filter that boosts selected harmonics from the glottal source spectrum. This filtering effect, akin to formant tuning in speech, allows for the perceptual isolation of individual harmonics, enabling polyphonic textures from a single voice. In Karlheinz Stockhausen's 1968 composition Stimmung, performers employ "vowel overtone singing" by transitioning through phonetic vowels to systematically tune the vocal tract, isolating overtones of a B-flat fundamental for extended explorations across the ensemble. Such techniques extend pitch and register capabilities, facilitating multiphonics in which independent pitch layers emerge.

Multiphonics in the voice arise from the simultaneous vibration of the true vocal folds and adjacent structures, producing two or more independent pitches. Engagement of the ventricular (false) folds alongside the true folds creates a biphonic output, with the false folds oscillating at a frequency offset from the primary glottal tone, often resulting in intervals like octaves or fifths. Aerodynamic and glottographic studies confirm this self-sustained mode, where airflow interacts nonlinearly between fold layers to sustain distinct pitches without external aids. This method, though demanding on laryngeal control, has been integrated into experimental vocal works.

Undertones, or perceived subharmonics, emerge from nonlinear vocal fold vibrations that generate frequencies below the fundamental, such as an F0/2 pattern sounding an octave lower. These occur through asymmetric or period-doubling oscillations in the vocal folds, where glottal closure irregularities produce subharmonic components, as observed in high-speed imaging of normal vocal folds under controlled subglottal pressure. Though rare in conventional singing, undertones appear in experimental contexts to evoke dissonant or spectral depth, leveraging the voice's capacity for chaotic yet controlled bifurcations.

In some vocal practices, intentional distortion of harmonics yields inharmonic partials, altering the standard series to produce metallic or buzzing timbres through subtle perturbations in fold vibration or tract resonance. Nonlinear phenomena, such as deterministic chaos in glottal airflow, are used intentionally to disrupt harmonic alignment and create non-periodic spectra, as evidenced in spectrographic analyses of contemporary performers. This approach, prominent in works exploring vocal noise thresholds, prioritizes timbral innovation over pitch clarity, drawing from the voice's inherent sensitivity to aerodynamic instabilities.
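
The relationship between a drone and the partials available for an overtone melody can be made concrete with a short Python sketch. It assumes an ideal harmonic series (partial n sounds at n times the drone frequency) and maps each partial to its nearest equal-tempered note with A4 = 440 Hz; the function names and the 110 Hz drone are hypothetical choices for illustration.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz, a4_hz=440.0):
    """Name of the equal-tempered note closest to freq_hz (e.g. 'A4')."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def melody_partials(drone_hz, lowest=6, highest=12):
    """List (partial number, frequency, nearest note) for the chosen partials."""
    return [(n, n * drone_hz, nearest_note(n * drone_hz))
            for n in range(lowest, highest + 1)]


for n, freq, note in melody_partials(110.0):
    # Partials of a harmonic series only approximate tempered pitches.
    print(n, freq, note)
```

With a 200 Hz drone the 12th partial falls at 2400 Hz, which is consistent with the 2000-3000 Hz melody range quoted above.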

Distortion and noise generation

Distortion and noise generation in extended vocal techniques involve deliberate manipulations of the larynx and vocal tract to produce abrasive, gritty sounds characterized by irregular vibrations and added noise components, often through heightened tension or supraglottic involvement. These methods contrast with smoother phonation by emphasizing roughness and subharmonics, enabling expressive intensity in genres like heavy metal and experimental performance.

Screaming techniques, including fry screaming, rely on high-tension closure of the vocal folds to generate piercing high frequencies, achieved via irregularly spaced glottal pulses. Fry screaming specifically produces a brighter, less loud sound than growls through non-linear interactions in the vocal folds and tract, with variations classified as high, mid, or low based on vocal tract shaping; mid fry is prevalent in modern heavy metal for its balanced intensity. Subglottal pressure is modulated to sustain these pulses without excessive strain, allowing controlled distortion that amplifies harmonic interactions for a noisier spectrum.

Growling, particularly the death growl, creates subharmonic distortion by engaging the false vocal folds (ventricular folds) alongside the true folds, resulting in a low, guttural rumble common in extreme metal genres. This technique vibrates supraglottic structures to add layers of noise, reducing glottal flow impedance while increasing airflow for a thick, aggressive texture. The false fold involvement produces simultaneous harmonic and subharmonic components, distinguishing it from cleaner phonation.

Buccal speech generates muffled distortion by producing sound with a closed glottis, using the cheek as a vicarious air chamber to form a neoglottis, which yields a high fundamental frequency around 323 Hz and reduced intelligibility reminiscent of cartoonish voices. This method traps air between the cheek and the upper jaw, creating a creaky, obstructed sound without relying on typical glottal vibration.

Vocal fry extension prolongs the creaky voice register for sustained low-end noise, involving irregular glottal closure to mimic percussive effects in beatboxing and experimental theater. In beatboxing, it produces breathy, low-frequency rumbles with durations up to 552 ms through open glottis and velar fricatives, enhancing rhythmic texture in performance art.

Safety considerations for these distortion techniques highlight physiological risks like vocal fold swelling or hemorrhaging from overuse, but proper training mitigates damage by focusing on breath support and supraglottic control rather than true fold strain. The TWANG method, involving nasal resonance and epilaryngeal narrowing, promotes safe belting and distortion by clustering formants for efficient projection without excessive tension. Longitudinal studies of professional extreme vocalists show sustained health over 14 years with technique adherence, emphasizing warm-ups and monitoring for hoarseness.

Non-vocal and imitative sounds

Non-laryngeal sound production

Non-laryngeal sound production encompasses techniques that generate audible sounds through oral, nasal, or supraglottic mechanisms without primary involvement of vocal fold vibration. These methods expand the sonic palette of the voice by using alternative airstreams and articulatory gestures, such as lingual egressive or pulmonic ingressive flows, to create percussive, resonant, or tonal effects distinct from traditional phonation. In extended vocal practice, they allow performers to mimic rhythmic or environmental elements while preserving vocal health by avoiding glottal strain.

Percussive oral sounds, including lip smacks, tongue clicks, and bilabial trills, rely on rapid articulatory closures and non-pulmonic airstreams to produce drum-like or rhythmic effects. Lip smacks, or "lip pops," involve a voiceless lingual egressive labial stop in which the lips and tongue form a closure, followed by a quick release of trapped air for a sharp, popping sound. Tongue clicks generate voiceless lingual egressive alveolar trills through tongue-body constrictions that force air outward, creating clicking or rolling percussions often used to simulate snare drums. Bilabial trills, known as "lip rolls," employ lateral lip vibrations with lingual egressive airflow, producing a buzzing or fluttering sound that can integrate into polyrhythmic patterns without vocal fold engagement. These techniques, observed via real-time MRI in beatboxers, highlight their precision and versatility in rhythmic imitation.

Nasal ingressive sounds involve inhaled airflow through the nose to produce noisy or resonant effects, drawing from indigenous traditions where such mechanisms enhance communal or ritualistic expression. In Inuit katajjait (throat games), nasal sounds dominate certain stylistic subfamilies, comprising about 37% of recorded repertoires, and combine with ingressive (inhaled) patterns to create layered, competitive vocalizations primarily performed by women for social bonding. These inhaled nasal noises, often voiceless or whispered, arise from velum lowering during inhalation, generating snorting timbres that emphasize endurance. Similar ingressive nasal elements appear in some African vocal practices, such as nasalized clicks, where pulmonic or lingual ingressive airflow vents through the nose for extended sonic variety.

Whistle tones, produced via lip or tongue shaping without vocal fold vibration, yield pure, sine-wave-like pitches extending beyond typical human vocal ranges, often exceeding 2000 Hz. The mechanism functions as a Helmholtz resonator, with the oral cavity acting as a chamber bounded by lip and tongue orifices that modulate the resonant frequency for tonal control. In extended vocal contexts, performers adjust tongue position and lip pursing to tune these whistles, creating flute-like melodies or harmonic overtones in improvisational or poetic settings. This non-phonatory approach allows seamless integration into vocal works, as seen in Joan La Barbara's explorations, where whistling complements rhythmic breathing for spatial effects.

Throat bass and egressive grunts use bursts of subglottal pressure, employing supraglottic structures such as the epiglottis or ventricular folds to modulate airflow for low-frequency rumbles. Throat bass emerges from epiglottal constriction and retraction during egressive pulmonic flow, producing a deep, bass-like vibration involving the false vocal folds and often subharmonic glottal participation, as evidenced in aerodynamic studies of beatboxers. Egressive grunts involve abrupt releases of subglottal pressure, creating guttural bursts that add percussive depth to rhythmic sequences. These techniques prioritize intraoral dynamics, enabling bass effects in performance without risking vocal fold fatigue.

In performance, non-laryngeal sounds integrate into sound poetry to evoke bodily immediacy and sonic experimentation, as exemplified by Henri Chopin's audiopoems. In Vibrespace (1963), Chopin captured lip smacks, grunts, and breaths via close-miked recordings, then manipulated them through tape looping and speed variations, defamiliarizing the body as a sound factory. These elements, layered without traditional articulation, highlight the voice's para-linguistic potential, influencing later works by emphasizing raw, non-semantic vibrations.
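The Helmholtz-resonator account of whistle tones given earlier in this section can be made concrete with the classical single-neck formula f = (c / 2π) · sqrt(A / (V · L)), where A is the opening area, V the cavity volume, and L the effective neck length. The sketch below is an idealization: a whistling mouth has two orifices and soft walls, and every numeric value here is a hypothetical illustration rather than a measurement.

```python
import math

def helmholtz_frequency(opening_radius_m, cavity_volume_m3, neck_length_m,
                        speed_of_sound=343.0):
    """Resonant frequency (Hz) of an ideal single-neck Helmholtz resonator."""
    area = math.pi * opening_radius_m ** 2
    return (speed_of_sound / (2 * math.pi)) * math.sqrt(
        area / (cavity_volume_m3 * neck_length_m))

# Hypothetical mouth geometry: 3 mm lip opening radius, 1 cm effective neck.
# Moving the tongue forward shrinks the cavity and raises the whistle pitch.
for volume_cm3 in (10.0, 5.0, 2.5):
    f = helmholtz_frequency(0.003, volume_cm3 * 1e-6, 0.01)
    print(f"cavity {volume_cm3:>4} cm^3 -> about {f:.0f} Hz")
```

Because frequency scales with the inverse square root of cavity volume, halving the volume raises the pitch by a factor of √2 (roughly a tritone), which is consistent with whistlers tuning pitch mainly by tongue position.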

Vocal imitations of instruments or environments

Vocal imitations of instruments represent a key aspect of extended vocal technique, where performers replicate the timbres, articulations, and improvisational qualities of musical instruments using only the voice. In jazz, scat singing exemplifies this approach, with singers employing nonsense syllables to mimic the rapid, bebop-style runs of brass instruments. Pioneered in early jazz and later elevated to virtuosic levels, scat transforms the human voice into a flexible horn, allowing for melodic phrasing, bends, and rhythmic complexity that echo instrumental solos.

Another method involves vocal multiphonics, the simultaneous production of multiple pitches, which can create harmonically rich tones. Achieved through precise control of the vocal tract, this technique creates a layered quality, often used in contemporary compositions to blend vocal and instrumental sonorities. Performers in experimental vocal ensembles explore these multiphonics to evoke complex timbres without physical instruments.

Environmental imitations extend this to natural phenomena, employing breathy and percussive vocal elements to recreate ambient sounds. Whispers, produced by forcing air through a narrowed oral cavity, simulate wind howls by generating turbulent, continuous noise with varying intensity and pitch modulation. This technique draws on non-laryngeal breath control to produce ethereal gusts, often layered in performance to build atmospheric depth. Similarly, bird calls are imitated through rapid tongue trills, whistles, and syrinx-inspired shaping of the vocal tract, replicating the high-frequency chirps and warbles of birds via controlled airflow and oral resonance adjustments.

In performance art and historical media, Foley-style vocals apply these imitations to everyday and dramatic effects, using the mouth and body to reproduce non-vocal sounds in real time. Performers in radio dramas and live theater generate footsteps with rhythmic lip smacks or tongue clicks, and object sounds such as door creaks through vocal crepitation and friction. This vocal Foley enhances narrative immersion, as in early 20th-century broadcasts where actors vocalized environmental cues to compensate for absent visuals.

Historically, Italian Futurists in the 1910s pioneered vocal simulations of machinery through "parole in libertà," onomatopoeic poetry that mimicked industrial noises like engine roars and clanging metal. In Filippo Tommaso Marinetti's works, performers declaimed explosive syllables and buzzes to evoke wartime machinery, treating the voice as a raw, percussive tool in live recitations that blurred poetry and noise art. This approach influenced vocal practices by prioritizing mechanical timbres over melodic tradition.

In cultural contexts, Australian Aboriginal songlines demonstrate vocal evocation of landscapes, where singers use rhythmic chants, yodels, and idiomatic calls to map and animate terrain features like rivers, rocks, and winds. These oral traditions encode environmental details through vocal patterns that mimic natural echoes and animal cries, fostering a relational bond between performer and country while preserving navigational and ecological knowledge across generations.

Artificial and external modifications

Chemical and physiological alterations

Extended vocal techniques involving chemical and physiological alterations primarily focus on temporary modifications to the voice through substances or bodily interventions, altering pitch and timbre without relying on technological aids.

Inhalation of gases like helium produces a characteristic high-pitched squeak by changing the density of the medium through which sound travels. Helium, being significantly less dense than air, increases the speed of sound from approximately 343 m/s in air to about 1,000 m/s, raising the resonant frequencies of the vocal tract and amplifying higher harmonics while attenuating lower ones. Conversely, sulfur hexafluoride (SF6), which is about five times denser than air, slows the speed of sound to around 140 m/s, lowering resonant frequencies and resulting in a deepened, bass-like voice quality. These effects stem from the gases' influence on sound propagation in the vocal tract rather than from direct changes to the vocal folds themselves.

Temporary adjustments to the vocal folds, such as through hydration or induced swelling, can alter phonation by modifying mucosal wave and closure patterns. Adequate systemic and topical hydration, via increased water intake, steam inhalation, or humidified environments, reduces phonation threshold pressure and promotes efficient vocal fold vibration, potentially yielding a clearer or more resonant tone, while dehydration introduces breathiness and instability. Pre-performance rituals like sipping warm fluids or using mucolytics help maintain optimal hydration.

Hormonal and medicinal interventions have historically enabled profound vocal range extensions, particularly in contexts like the 18th-century castrati tradition. Castration before puberty halted testosterone-driven laryngeal growth, preserving a high voice with adult lung capacity for sustained, powerful high ranges that exceeded typical female capabilities. During puberty, surging sex hormones like testosterone in males elongate and thicken the vocal folds, dropping the pitch by up to an octave and shifting toward a chestier resonance. In modern medicinal use, corticosteroids temporarily reduce vocal fold inflammation to restore range and clarity during acute laryngitis, while testosterone therapy in men can deepen pitch by approximately 49 Hz after 12 months, though such changes are not always reversible.

These alterations carry significant risks and raise ethical concerns, particularly for non-medical applications. Inhaling helium or SF6 displaces oxygen, risking hypoxia, dizziness, loss of consciousness, and even cerebral gas embolism or asphyxiation in prolonged or high-volume exposures; SF6's density also makes it harder to expel from the lungs. Vocal fold manipulation via irritants can lead to chronic inflammation, nodules, or hemorrhage, while hormonal interventions like castration caused lifelong physiological and psychological harm and are now condemned as unethical. Corticosteroid use, though beneficial in the short term, risks dependency, immune suppression, and rebound inflammation. Ethically, performers must weigh artistic innovation against potentially irreversible damage, with medical oversight recommended to mitigate long-term vocal impacts.

In experimental contexts in the 1970s, gas inhalation was employed for surreal vocal effects to challenge conventional sound perception. Performers in New York City's new music scene used it to distort voices into ethereal, "haywire" timbres during improvisational events, enhancing the disorienting, immersive quality of these spectacles.
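The gas effect described earlier in this section follows from the fact that the resonances of an air column scale in proportion to the speed of sound. As a rough sketch, the vocal tract can be idealized as a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n − 1)·c / (4L). The tract length of 0.17 m is an assumed typical adult value, the function name is illustrative, and the speeds are the approximate figures quoted above; a real tract is never filled with pure gas, so actual shifts are smaller.

```python
def tube_formants(speed_of_sound, tract_length_m=0.17, count=3):
    """First `count` resonances (Hz) of an ideal quarter-wave tube."""
    return [(2 * n - 1) * speed_of_sound / (4 * tract_length_m)
            for n in range(1, count + 1)]

# Approximate speeds of sound in m/s, as quoted in the text above.
GAS_SPEEDS = {"air": 343.0, "helium": 1000.0, "sulfur hexafluoride": 140.0}

for gas, c in GAS_SPEEDS.items():
    print(f"{gas:>20}: {[round(f) for f in tube_formants(c)]} Hz")
```

In this idealization the vocal fold vibration rate is left untouched; only the resonances move, which matches the point that the gases act on sound propagation in the tract rather than on the folds.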

Technological and electronic enhancements

Technological enhancements of the voice trace their origins to mid-20th-century analog methods, particularly the tape manipulation practices of musique concrète pioneered by Pierre Schaeffer at the Groupe de Recherches Musicales in Paris during the 1950s. These techniques involved recording natural sounds, including human voices, onto magnetic tape and then physically altering the tape through cutting, splicing, speed variation, and reversal to create abstracted sonic textures. This approach marked an early shift from live performance to studio-based composition, allowing vocal elements to be fragmented, layered, and transformed into non-literal sound objects.

A pivotal development in vocal processing emerged with the vocoder, invented in 1938 by engineer Homer Dudley as a speech analysis-synthesis device to compress voice signals for telecommunications. By the 1960s and 1970s, musicians adapted analog vocoders, such as the Sennheiser VSM-201, to blend human vocals with synthesized carriers, producing robotic timbres that extended the voice's timbral range in electronic music. This effect, which analyzes vocal formants and imposes them on a carrier input, enabled performers to create synthetic, machine-like vocal identities, as in Wendy Carlos's recordings. The transition to digital vocoders in the 1980s further refined this, allowing real-time modulation for live settings.

Alongside the vocoder, Auto-Tune, released in 1997 by Andy Hildebrand, a former Exxon engineer who adapted signal-processing methods from seismic data analysis to music, revolutionized pitch correction and its deliberate use as an effect. Originally designed to automatically adjust off-pitch notes to the nearest scale degree, it produces the "T-Pain effect" or "hard tuning" when set to rapid retune speeds, creating quantized, stepwise vocal glides that extend pitch precision into artificial territory. Daft Punk popularized this in the late 1990s and 2000s, employing Auto-Tune alongside vocoders on albums like Discovery (2001) to achieve their signature filtered vocal style, blending correction with expressive robotic aesthetics.

Delay and reverb effects further expand vocal spatiality, simulating artificial echo chambers to create depth and immersion without physical acoustics. Analog delay units, like tape-based machines from the 1950s, repeated vocal phrases with gradual degradation, while digital reverbs, emerging in the 1970s with devices such as the EMT 250, modeled room acoustics to envelop the voice in vast, otherworldly environments. In performance, these processors allow singers to generate polyphonic illusions, as in the cascading echoes of Kate Bush's 1985 recordings, where reverb extends a single voice into a choral expanse. Modern plugins in digital audio workstations such as Logic enable precise control over decay times and pre-delay, enhancing live and recorded vocal projection.

Sampling and looping technologies facilitate real-time vocal layering, transforming the solo voice into intricate, multi-tracked compositions. Pedalboards like the Boss RC-300 Loop Station capture and overdub vocal snippets instantaneously, allowing performers to build harmonic densities on stage. Imogen Heap exemplifies this in live performances and tracks like "Hide and Seek" (2005), where she layers breathy vocals into dense, emotive textures using custom setups, extending the voice's rhythmic and textural capabilities beyond linear singing. This method draws on 1980s sampler innovations but achieves immediacy through foot-controlled hardware.

Software environments like Max/MSP, created by Miller Puckette in the 1980s and commercialized by Cycling '74 in 1997, enable bespoke live vocal processing. Users program patches that shift and otherwise manipulate the voice, processing vocals in real time via microphone input to generate ethereal or fragmented extensions. Composers such as Natasha Barrett have employed Max/MSP for immersive acousmatic works, where the voice is deconstructed into micro-sounds and reassembled, bridging analog tape legacies with computational precision. The platform's modular nature supports integration with external hardware, fostering hybrid setups for vocal exploration.

The broader historical shift from analog to digital vocal processing, accelerating with affordable DAWs and VST plugins, democratized these enhancements. Analog tape's tactile manipulations gave way to non-destructive digital tools, such as Auto-Tune EFX and Waves OVox, which combine vocoding, harmonizing, and effects in plugin form for seamless studio and live use. By 2025, AI-driven tools such as real-time voice synthesis in software like Enhance Speech or custom neural networks further extend vocal capabilities, enabling automatic timbre morphing and generative extensions in live performances.
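The "hard tuning" behavior described earlier in this section, in which a sung glide is quantized into steps, can be sketched as snapping each detected frequency to the nearest allowed note. This is a simplified illustration under stated assumptions (equal temperament, A4 = 440 Hz, instantaneous retune, pitch already detected); it is not the algorithm used by any commercial product, and the function and constant names are hypothetical.

```python
import math

CHROMATIC = frozenset(range(12))
C_MAJOR = frozenset({0, 2, 4, 5, 7, 9, 11})  # pitch classes, C = 0

def snap_to_scale(freq_hz, pitch_classes=CHROMATIC):
    """Return the frequency of the nearest allowed equal-tempered note."""
    midi = 69 + 12 * math.log2(freq_hz / 440.0)  # fractional MIDI note number
    # Search one octave around the input so any non-empty scale has a candidate.
    candidates = [m for m in range(math.floor(midi) - 6, math.ceil(midi) + 7)
                  if m % 12 in pitch_classes]
    nearest = min(candidates, key=lambda m: abs(m - midi))
    return 440.0 * 2 ** ((nearest - 69) / 12)

# A smooth upward glide from 220 Hz becomes a staircase of discrete notes.
glide = [220.0 + 3.0 * step for step in range(10)]
for f in glide:
    print(f"{f:6.1f} Hz -> {snap_to_scale(f, C_MAJOR):6.1f} Hz")
```

A slower retune speed would correspond to moving only part of the way toward the target on each analysis frame, which is what makes ordinary pitch correction less audible than the stepped effect.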

Interactions with instruments or spaces

Extended vocal techniques often involve physical interactions between the voice and instruments or architectural spaces to produce hybrid timbres through acoustic coupling. Performers direct vocalizations into the body of an instrument, such as a piano with its sustaining pedal engaged, to excite sympathetic vibrations in the strings, creating ethereal, echoing overtones that blend the human voice with the instrument's natural resonances. This technique, known as singing into the piano, generates a shimmering aura of sound in which the vocal formants interact with the piano's harmonic series, amplifying and altering the original vocal sound without electronic intervention. A seminal example appears in George Crumb's Ancient Voices of Children (1970), where the singer delivers phonetic sounds directly into an amplified piano, causing the undamped strings to vibrate sympathetically and produce haunting, child-like echoes that evoke Lorca's poetry.

Similar effects can occur with other resonant objects, such as tubes or drums, where the voice's airflow and pressure couple with the object's eigenmodes to yield new spectral content, shifting formants and introducing metallic or hollow qualities to the sound. In wind instruments, hybrid production arises through techniques like vocalizing into the bell or tube, where the performer's breath and voiced sounds interact with the instrument's bore, modifying airflow and creating textures that merge vocal harmonics with instrumental overtones.

Spatial techniques use room acoustics to shape vocal output, with performers positioning themselves to exploit reverberation, echoes, or standing waves for amplification and timbral distortion. In enclosed environments like caves, the voice's energy reflects off hard surfaces, extending decay times and coupling formants with the space's modal frequencies to produce immersive, layered soundscapes. Pauline Oliveros pioneered such integrations in her 1970s Deep Listening exercises, where groups improvised vocally in resonant spaces like the 45-second-reverberant cistern at Fort Worden, treating the architecture as an active collaborator that altered pitch perception and ensemble cohesion through acoustic feedback. Megaphones serve as portable resonators in these practices, funneling the voice to emphasize higher frequencies and introduce lo-fi distortion via mechanical compression and reflection, enhancing projection while imparting a raw, amplified edge to extended vocalizations. Oliveros' collaborative setups further exemplify this, as in her Expanded Instrument System improvisations, where voices and instruments like trombones or accordions respond to spatial cues, fostering symbiotic acoustics that evolve timbres in real time.

These interactions highlight coupling as a core phenomenon: the vocal tract's resonances align with the frequencies of an object or space, boosting specific harmonics and yielding novel, site-specific sounds central to experimental vocal performance.
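The sympathetic-vibration effect of singing into a piano can be approximated with simple arithmetic: an undamped string tends to respond when one of its partials lies close to a harmonic of the sung note. The sketch below assumes ideal harmonic strings in equal temperament (real piano strings are slightly inharmonic), a fixed number of harmonics, and an arbitrary tolerance of 5 cents; all names and thresholds are illustrative assumptions.

```python
import math

def midi_to_hz(midi_note):
    """Equal-tempered frequency with A4 (MIDI 69) = 440 Hz."""
    return 440.0 * 2 ** ((midi_note - 69) / 12)

def cents_apart(freq_a, freq_b):
    """Absolute interval between two frequencies in cents."""
    return abs(1200 * math.log2(freq_a / freq_b))

def responding_strings(voice_f0, voice_harmonics=4, string_partials=4,
                       tolerance_cents=5.0):
    """Map each piano key (MIDI 21-108) to the (voice harmonic, string partial)
    pairs that nearly coincide with a harmonic of the sung pitch."""
    matches = {}
    for midi_note in range(21, 109):
        f_string = midi_to_hz(midi_note)
        for h in range(1, voice_harmonics + 1):
            for k in range(1, string_partials + 1):
                if cents_apart(h * voice_f0, k * f_string) <= tolerance_cents:
                    matches.setdefault(midi_note, []).append((h, k))
    return matches

# Singing A3 (220 Hz) into a piano with the sustaining pedal held down.
for note, pairs in sorted(responding_strings(220.0).items()):
    print(f"MIDI {note} ({midi_to_hz(note):7.2f} Hz): matches {pairs}")
```

Under these assumptions the responding strings cluster on octaves and fifths related to the sung pitch, which is one way to picture the halo of resonance described above.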

Notable practitioners

Pioneers and historical figures

Luigi Russolo, an Italian Futurist painter and composer, laid foundational groundwork for extended vocal techniques through his 1913 manifesto The Art of Noises, which advocated incorporating urban and industrial sounds, including human cries, whispers, and shouts, into musical expression to expand beyond traditional tonality. This document influenced subsequent vocal experimentation by challenging performers to emulate noise through the voice, blurring the line between music and raw sound production.

Cathy Berberian, an American singer born in 1925, emerged as a pivotal figure by collaborating closely with composers such as her husband Luciano Berio, pushing boundaries through innovative performances that integrated speech, theater, and unconventional vocalization. Her seminal work Stripsody (1966), which she composed and premiered, exemplifies her contributions by employing multiphonics, onomatopoeic imitations, and graphic notation derived from comic strips to produce layered vocal effects such as whistles, growls, and simultaneous tones. Berberian's versatility in works like Berio's Sequenza III (1966) further demonstrated her role in legitimizing extended techniques, influencing generations of vocalists to explore timbral and gestural possibilities.

Joan La Barbara, an American vocalist and composer born in 1947, is renowned for pioneering extended vocal techniques in the 1970s and 1980s, developing methods like multiphonics, circular singing, and glottal clicks to expand the voice's timbral possibilities. Her compositions, such as Hearing You Are I See You Are (1980), explore the voice as an instrument through layered recordings and non-verbal sounds, often in collaboration with other composers. La Barbara's work emphasized the physiological and acoustic limits of the voice.

Meredith Monk, an American composer, singer, and choreographer born in 1942, pioneered the integration of extended vocal techniques into multimedia performance starting in the 1960s, developing a personal lexicon of sounds including purrs and overtone-like multiphonics drawn from global traditions and personal invention. Her 1979 premiere of Dolmen Music, a vocal ensemble piece evoking ancient rituals through cyclical chants and layered harmonies, marked a high point in her early explorations, emphasizing the voice's capacity for non-verbal narrative and spatial resonance without instruments. Monk's innovations, as seen in earlier works like 16 Millimeter Earrings (1966), established extended voice as a holistic medium for interdisciplinary art, prioritizing intuitive expression over linguistic content.

In the late 20th century, Tuvan throat singer Kongar-ool Ondar (1962–2013) played a crucial role in reviving and globalizing the ancient practice of khöömei, a multiphonic technique producing a drone and overtone melodies simultaneously, which had been suppressed under Soviet policies. Ondar, honored as Tuva's People's Throat Singer in 1992, popularized khöömei internationally through collaborations with Western artists like Paul McCartney and Béla Fleck, and through performances on platforms such as The Late Show with David Letterman in 1999, introducing audiences to styles like sygyt (whistled overtones) and kargyraa (subharmonics). His efforts helped preserve Tuvan heritage while inspiring cross-cultural vocal experimentation.

Contemporary performers and innovators

In the late 20th and early 21st centuries, contemporary performers have expanded extended vocal techniques into diverse genres, integrating traditional elements with experimental and electronic innovations to create immersive sonic landscapes.

Björk has been a pivotal figure in blending natural vocal extensions with electronic processing, particularly in her 2004 album Medúlla, which relies almost exclusively on human voices for instrumentation. The album features intricate vocal layering, where multiple voices build gradually to form complex textures, as in "Komið," where approximately five distinct voices emerge to exchange ostinati on the pitches C and G, fostering harmonic stasis rather than progression. Distortion and primal vocalizations, such as grunts and wails, add raw physicality, evident in "Ancestors," where collaborations with Inuit throat singer Tanya Tagaq create cacophonous layers that emphasize the body's role in sound production. Beatboxing by artists like Rahzel further innovates the sound, as in "Who Is It," where a single-take beatbox track merges with Björk's melodic lines to evoke bodily metaphors and non-teleological structures. These techniques culminate in emergent processes, where dissonant elements resolve into consonant patterns over time, redefining pop music's vocal possibilities.

Mike Patton exemplifies versatility in extreme and experimental applications across rock and experimental contexts, particularly from the 1990s onward in projects like Faith No More and his solo work. His six-octave range enables seamless shifts between crooning, scatting, and harsher styles, pushing the voice into distorted territory. In John Zorn's "Litany IV" from the Moonchild series, Patton bridges metal vocalizations with structural innovations, producing effects through harsh timbres that challenge conventional phrasing. This approach influences his broader oeuvre, where rapid timbral changes create narrative tension in tracks like those on Faith No More's Angel Dust (1992), integrating influences from noise genres.

Anna-Maria Hefele has popularized polyphonic overtone singing in the 2010s through accessible demonstrations that highlight its harmonic potential in contemporary settings. This technique involves producing a fundamental drone while isolating overtones to create simultaneous pitches, resulting in ethereal, multi-note harmonies without additional instruments. Hefele's method relies on precise vocal tract adjustments, such as tongue positioning to filter harmonics, allowing for fluid transitions between notes in polyphonic contexts, as showcased in her viral performances. Her work extends the technique beyond its ethnic roots to global audiences via digital platforms, emphasizing its meditative and textural qualities.

Sainkho Namtchylak, active since the 1980s, fuses Tuvan throat singing with experimental music, employing diphonic techniques to evoke shamanistic and natural soundscapes. Her throat singing produces multiple tones simultaneously, a low drone paired with high harmonics, integrated into improvisational frameworks that incorporate growls, bleats, and gurgles for non-vocal imitations. In albums like Stepmother City (2005), she blends these with electronic textures, drawing on Siberian traditions to create compositions that challenge Western vocal norms. Namtchylak's performances often feature extended multiphonics in live settings, as in collaborations that merge throat techniques with modern composition, expanding the voice's expressive range across cultures.

Recent trends in the 2020s include AI-assisted vocal processing, which supports harmony generation and enhancement in production. Tools like AI harmonizers can analyze input vocals to produce supportive layers, facilitating complex textures for performers. These innovations build on earlier pioneers' foundations, democratizing access to advanced vocal extensions through digital enhancement.
