Utau

UTAU is a Japanese singing synthesizer application created by Ameya/Ayame (飴屋/菖蒲). It is similar to the VOCALOID software, but it is distributed as shareware rather than under third-party licensing.

In March 2008, Ameya/Ayame released UTAU as a free, advanced support tool, distributed as shareware and downloadable from its main website. The name UTAU (歌う), literally 'to sing' in Japanese, has its origin in the activity of "Jinriki Bōkaroido" (人力ボーカロイド; Manual Vocaloid), in which people edit an existing vocal track, extract phonemes, adjust their pitch, and reassemble them to create a Vocaloid-esque singing voice. UTAU was originally created to assist this process using concatenative synthesis. It can use WAV files provided by the user, so a singing voice can be synthesized by entering song lyrics and a melody. UTAU came bundled with AQUEST's voice synthesizer "AquesTalk", which generates the voice samples of the default voicebank, Utane Uta (nicknamed Defoko, meaning 'Default Girl' in Japanese), on the program's initial launch and then deletes itself. Voices made for UTAU are officially called "UTAU" as well, though they are colloquially known as "UTAUloids", a reference to VOCALOID. They are also called "voicebanks" (more common in English-speaking areas) or "(voice) libraries" (in Japan). Many voicebanks have been developed by independent users. They are normally distributed directly by their creators via internet download, but some are sold as part of commercial projects.

UTAU is primarily a Japanese program, and many of its voices are created specifically for the Japanese language. However, because users can make their own voicebanks, the userbase has devised methods that allow voicebanks to sing in other languages. X-SAMPA is often used for English and other non-Japanese voicebanks, though other phonetic systems, such as ARPABET and various custom systems, are sometimes used.

UTAU's project files are saved with the ".ust" (Utau Sequence Text) extension. These files can be freely distributed, allowing different UTAU voices to sing the same piece. Producers have developed several methods of producing their sound banks, so results vary from one voicebank to another. UTAU also supports the MIDI and .vsq formats.
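
As an illustration of how a text-based sequence file might be read, the following minimal sketch parses an INI-like layout into a list of notes. The layout is an assumption made for the example: the section names (`#SETTING`, `#0000`, `#TRACKEND`) and the keys (`Length`, `Lyric`, `NoteNum`) are not specified in the text above, and the helper names `parse_ust` and `notes` are hypothetical.

```python
def parse_ust(text):
    """Parse INI-style sequence text into a list of (section_name, fields) pairs.

    The layout is an assumption for this sketch: '[name]' headers followed
    by 'Key=Value' lines.
    """
    sections = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = {}
            sections.append((line[1:-1], current))
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key] = value
    return sections


def notes(sections):
    """Keep only numbered note sections such as '#0000' (assumed naming)."""
    result = []
    for name, fields in sections:
        if name.startswith("#") and name[1:].isdigit():
            result.append({
                "lyric": fields.get("Lyric", ""),
                "length": int(fields.get("Length", "0")),
                "note_num": int(fields.get("NoteNum", "0")),
            })
    return result


# Hypothetical sample data, written only for this example.
SAMPLE = """[#SETTING]
Tempo=120.00
[#0000]
Length=480
Lyric=a
NoteNum=60
[#0001]
Length=240
Lyric=ka
NoteNum=62
[#TRACKEND]
"""

if __name__ == "__main__":
    for note in notes(parse_ust(SAMPLE)):
        print(note)
```

Because the sequence is plain text and holds no audio, the same note list could in principle be rendered with any voicebank, which is consistent with the files being shared so that different voices sing the same piece.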

According to screenshots posted on Twitter, Ameya/Ayame added Unicode support in a newer, unreleased version of UTAU. The corresponding backend support, along with support for several other audio encodings, has already been released, while frontend support had yet to be released as of September 2020. Ameya also updated UTAU to be compatible with 64-bit systems.

The editor can place notes, enter phonemes, and change pitch and volume on a piano roll. Only one track can be created in UTAU, and notes cannot overlap, because a single voice, like a human singer, cannot sing two different things at the same time. By default, only notes are displayed on the piano roll, but display settings can be changed to show the pitch curve, volume intensity, envelope, and flags. UTAU uses flags to change aspects of the voice, such as applying low-pass and high-pass filters or reducing or adding breathiness. The available flags differ depending on the resampler used. The score created in the editor and the data in the voicebank are processed by a resampler and a wavtool. Only one resampler can be used in a single .ust file. A formant filter is used to control changes in voice quality; it can be turned off.
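
The single-track, no-overlap rule can be expressed as a simple check over note time spans. This is a minimal sketch, not UTAU's own logic: it assumes each note is a hypothetical `(start, length)` pair measured in ticks, and the function name `is_monophonic` is invented for the example.

```python
def is_monophonic(notes):
    """Return True if no two notes overlap in time.

    Each note is a (start, length) tuple in ticks; this representation
    is an assumption made for the sketch.
    """
    end = float("-inf")
    for start, length in sorted(notes):
        if start < end:
            # This note begins before an earlier note has finished.
            return False
        end = max(end, start + length)
    return True


if __name__ == "__main__":
    print(is_monophonic([(0, 480), (480, 240)]))  # back-to-back notes
    print(is_monophonic([(0, 480), (240, 240)]))  # second note starts early
```

Tracking the latest end time seen so far, rather than comparing only neighbouring notes, also catches a long note that overlaps a note several positions later.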

The audio file to load is found by matching the symbols on the note against the audio file names in the voice library. However, a prefix.map file can change which subfolder the sample is taken from. The pitch of the synthesized voice is adjusted according to the difference between the pitch of the original sound file and the pitch of the note in the editor. UTAU uses formant filters to prevent extreme changes in voice quality; these can be disabled. Batch processing is used to generate multiple notes at once, and cache files are created during this process. Depending on the resampler, the number of cache files may increase. There are menu settings to delete cache files when the program is closed or after a certain period of time.
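
The lookup and pitch-adjustment steps described above can be sketched as follows. This is an illustration under stated assumptions, not UTAU's implementation: pitches are taken to be MIDI-style note numbers in twelve-tone equal temperament, the `prefix_map` dictionary is a simplified stand-in for a prefix.map file (its real format is not described here), and the names `resolve_sample` and `pitch_ratio` are hypothetical.

```python
def resolve_sample(lyric, note_num, prefix_map=None):
    """Return the relative path of the sample to load for a note.

    The sample is found by matching the note's lyric to a file name.
    prefix_map is a simplified, hypothetical stand-in for prefix.map:
    a dict from note number to the subfolder that holds the sample.
    """
    file_name = f"{lyric}.wav"
    subfolder = (prefix_map or {}).get(note_num, "")
    return f"{subfolder}/{file_name}" if subfolder else file_name


def pitch_ratio(source_note, target_note):
    """Frequency ratio needed to move a sample from its recorded pitch
    to the note's pitch, assuming equal temperament (12 semitones per octave).
    """
    return 2.0 ** ((target_note - source_note) / 12.0)


if __name__ == "__main__":
    # Hypothetical mapping: note 72 is served from a "high" subfolder.
    mapping = {72: "high"}
    print(resolve_sample("ka", 60, mapping))
    print(resolve_sample("ka", 72, mapping))
    print(pitch_ratio(60, 72))
```

A larger gap between the recorded pitch and the target pitch means a larger ratio and a stronger shift, which is the situation in which drawing samples from a different subfolder or applying a formant filter would matter most.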

There are built-in plugins that can automatically merge vowels, as well as the "Omakase/A la carte" settings, which can add automatic pitch and vibrato to an entire file. User-created plugins can also be added to the software. The colors of the editor can be changed in the setting.ini file.
