Wikivoyage:Romanization

Romanization is the process of mapping a script into the Latin alphabet used for English. This page provides guidelines for romanization on Wikivoyage.

As a rule of thumb, romanization should let the casual reader guess at the pronunciation and let the expert pronounce it correctly.

General

 * For article titles, use the name most commonly used in English for a place, regardless of local character sets. See Naming conventions for details.
 * In article content, use the correct diacritics at least the first time the name is given.
 * When linking to destinations, the local script is unnecessary clutter. That is, the Tokyo article is linked to as "Tokyo", not "Tokyo (東京)"; if somebody wants or needs to know the script and reading, it's just a click away.

Places
The following guidelines apply to any places in listings (See, Do, Eat, Drink, Sleep, etc.).


 * If a place has a common English name, use it, but always provide the local script and correct romanization in parentheses.
 * Example: Xilin Pagoda (西林塔 Xīlíntǎ) ...
 * If a place has no English name at all, use the romanization as the name and give the script in parentheses.
 * Example: Khrop Khrueang (ครบเครื่อง), Ari Samphan Soi 10. Famous for its kuay tiao yam bok rice noodles...

When using listings templates, the alt parameter can be used to specify the local script and its romanization, and they will automatically be formatted correctly. Examples:



Terms
The following guidelines apply to any local terms used in other content (the introductory glosses of any section, cultural coverage, etc.). They apply where a term is first introduced; in later mentions, leave out the parentheses and their contents.


 * If you want to give the local name of any term, but use its English name, place both local script and romanization in parentheses after the English.
 * Example: One Japanese specialty worth seeking out is eel (うなぎ unagi) ...
 * If you want to use the romanized term, italicize it and provide the local script in parentheses. If you want to include a literal translation, this can also go within the parentheses, but in quotes.
 * Example: In Thailand, Western-style black tea is known as chaa ron (ชาร้อน, lit. "tea hot") ...

Languages
If you find an article (or phrasebook) that does not follow the conventions below, and you're unable to fix it, please tag it with the tag, which looks like this:

Chinese
Chinese romanization is complicated by the vast variety of dialects used and some intractable political difficulties. Rules of thumb are:


 * For articles about mainland China, use Hanyu pinyin romanization and simplified form Chinese characters.


 * For articles about Hong Kong and Macau, use Cantonese with Yale romanization and traditional Chinese characters. However, if the most commonly used name is under a different system, use that and not Yale.


 * For articles about Taiwan, use Wade-Giles romanization (without the necessary apostrophes) for older and well-known place names and either Hanyu pinyin or Tongyong pinyin for lesser known placenames (depending on which political party is controlling the locality, but we won't delve into that mess here). The Chinese characters included should be in traditional format.


 * Use tone marks, not tone numbers. Use the tone converter if necessary. If you don't know the tones, leave them out and somebody will add them later.
 * 中国 is Zhōngguó, not Zhong1 Guo2


 * Use space between words, not between every syllable. Don't capitalize each syllable.
 * 南大街 is Nándàjiē (South-Great-Street), not Nan Da Jie or NanDaJie
 * 天河北路 is Tiānhé Běilù (Heaven-Lake North-Road)


 * Do not use tone marks for article titles, but give them in the intro.
 * Shanghai (上海 Shànghǎi) is a city.


 * Place parentheses around Chinese characters and their pinyin readings. Do not use bold or quote marks.
 * Beer (啤酒 píjiǔ) is very common in China.
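
The tone-mark rule above can be illustrated with a short sketch. This is not the tone converter mentioned above; it is a minimal Python helper whose names (`mark_syllable`, `numbers_to_marks`) are invented here for illustration. It assumes the usual placement convention: the mark goes on a or e if present, on the o in ou, and otherwise on the last vowel. It does not fix spacing or capitalization.

```python
import re

# Tone-marked forms of each pinyin vowel, indexed by tone 1-4.
TONE_MARKS = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}


def mark_syllable(syllable: str, tone: int) -> str:
    """Return one pinyin syllable with its tone mark applied."""
    s = syllable.replace("v", "ü")  # "v" is a common keyboard stand-in for ü
    if tone not in (1, 2, 3, 4):
        return s  # tone 5 (neutral) carries no mark
    lower = s.lower()
    if "a" in lower:
        idx = lower.index("a")
    elif "e" in lower:
        idx = lower.index("e")
    elif "ou" in lower:
        idx = lower.index("o")
    else:
        vowels = [i for i, ch in enumerate(lower) if ch in TONE_MARKS]
        if not vowels:
            return s
        idx = vowels[-1]
    marked = TONE_MARKS[lower[idx]][tone - 1]
    if s[idx].isupper():
        marked = marked.upper()
    return s[:idx] + marked + s[idx + 1:]


def numbers_to_marks(text: str) -> str:
    """Replace numbered syllables such as 'Zhong1' with marked ones."""
    return re.sub(
        r"([A-Za-zü]+?)([1-5])",
        lambda m: mark_syllable(m.group(1), int(m.group(2))),
        text,
    )


print(numbers_to_marks("Zhong1guo2"))
print(numbers_to_marks("Shang4hai3"))
```

Numbered input is consumed one syllable at a time, so words should already be spaced and capitalized as described above before conversion.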

See also: Wikipedia:Wikipedia:Manual of Style (China-related articles)

Hebrew
Hebrew romanization is highly nonstandard and complicated by the existence of numerous dialects with varying pronunciations. The closest to an official standard is the United Nations romanization (ISO 259), which is particularly useful for the traveller, as it is widely used in maps and signage.


 * Use United Nations romanization, with the following three exceptions:
 * Use "h" for het (ח), not ẖ (h-underscore)
 * Use "tz" for tzadi (ץ צ), not ts or ẕ (z-underscore)
 * Use "ei" for tzeire malei (אֵי), not e

Examples: Petah Tikva (פֶּתַח תִּקְוָה), Bnei Brak (בְּנֵי בְּרַק)

The surrogates above are widely used in Israel itself, and are easily supported by PCs for display and entry.

That said, some places in Israel have well known English names that differ from the Hebrew: thus Jerusalem (not Yerushalayim) and Hebron (not Hevron).

Indic
Romanization of Indic scripts is complicated by the existence of several different romanization systems and the use of Indic scripts by multiple languages.


 * In general, use the ISO 15919 romanization.
 * For anusvara:
 * Use "ṅ" before k, kh, g and gh.
 * Use "ñ" before c, ch, j and jh.
 * Use "ṇ" before ṭ, ṭh, ḍ and ḍh.
 * Use "n" before t, th, d and dh.
 * Use "m" before p, ph, b and bh.
 * Otherwise, use "ṁ".
 * For article titles, use the Hunterian transliteration.
 * In both cases, when the source script does not indicate the removal of the inherent vowel and it is not pronounced in the original source language, such unpronounced inherent vowels should be removed.
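
The anusvara rules above amount to a lookup on the consonant that follows. The sketch below is only an illustration: `resolve_anusvara` and `nfc` are hypothetical helper names, the input is assumed to be an already romanized word in which every anusvara is written "ṁ", and the sample words are illustrative inputs rather than examples taken from this page.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to NFC so dotted letters compare reliably."""
    return unicodedata.normalize("NFC", text)


ANUSVARA = nfc("ṁ")

# Aspirated and unaspirated consonants of a class take the same nasal,
# so the first letter of the following consonant is enough to decide.
NASAL_BEFORE = {
    nfc(first): nfc(nasal)
    for first, nasal in {
        "k": "ṅ", "g": "ṅ",
        "c": "ñ", "j": "ñ",
        "ṭ": "ṇ", "ḍ": "ṇ",
        "t": "n", "d": "n",
        "p": "m", "b": "m",
    }.items()
}


def resolve_anusvara(word: str) -> str:
    """Replace each 'ṁ' according to the consonant that follows it."""
    word = nfc(word)
    out = []
    for i, ch in enumerate(word):
        if ch == ANUSVARA:
            following = word[i + 1:i + 2].lower()
            out.append(NASAL_BEFORE.get(following, ANUSVARA))
        else:
            out.append(ch)
    return "".join(out)


print(resolve_anusvara("hiṁdī"))
print(resolve_anusvara("haṁsa"))
```

Anything not listed in the table, including a word-final anusvara, falls through to "ṁ", matching the "otherwise" rule.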

However, some places in South Asia (where Indic scripts are used) have well-known English names that differ from the ISO 15919 or Hunterian transliteration: thus Delhi (not Dilli) and Ahmedabad (not Amdavad). All cities of Karnataka should use the pre-2007 names: thus Bangalore (not Bengaluru) and Shimoga (not Shivamogga).

Tamil
Since the Tamil script does not indicate the voicing of certain consonants, they should be romanized according to their usual pronunciation in the word concerned. In native Tamil words, a consonant is always voiced between two vowels unless it is doubled.

Tamil ச் should always be romanized as c unless it is pronounced as an S, when it should be romanized as s.

Japanese
For Japanese, Hepburn (written by an American for foreigners) has been the de facto standard of romanization for the past 100 years, especially in publications geared to foreigners, while the official standard, Kunrei (written by Japanese for Japanese), is used very little. Thus:


 * Use Hepburn romanization (specifically Modified/Revised Hepburn, which is the most common style).
 * Indicate long vowels with macrons, except in article titles.
 * Always romanize ん as n, but create a redirect from the m form if common. For example, 群馬 is Gunma, but you can still search for Gumma.
 * Syllabic n ん is written n' (with apostrophe) when followed by a vowel or y, but not when followed by another n or another consonant. Hence 山陽 is San'yō, but こんにちは is konnichiwa and 新橋 is Shinbashi.
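
The apostrophe rule for syllabic n can be sketched as follows. This is a minimal illustration, not a kana converter: the hypothetical `join_morae` function assumes the word has already been split into Hepburn-romanized morae, with syllabic ん given as a standalone "n".

```python
# Letters that trigger the apostrophe when they follow syllabic n.
VOWELS_AND_Y = set("aiueoāīūēōy")


def join_morae(morae: list[str]) -> str:
    """Join romanized morae, writing syllabic n as n' before a vowel or y."""
    parts = []
    for i, mora in enumerate(morae):
        parts.append(mora)
        following = morae[i + 1] if i + 1 < len(morae) else ""
        if mora == "n" and following and following[0] in VOWELS_AND_Y:
            parts.append("'")
    return "".join(parts)


print(join_morae(["sa", "n", "yō"]))
print(join_morae(["ko", "n", "ni", "chi", "wa"]))
print(join_morae(["shi", "n", "ba", "shi"]))
```

Because ん is always passed in as "n", the sketch also keeps Gunma and Shinbashi in their n forms rather than the m forms.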

See also: Wikipedia:Wikipedia:Manual of Style (Japan-related articles)

Korean
Korean romanization is comparatively straightforward.


 * For South Korea, use the Revised Romanization of Korean, officially adopted in 2000 for governmental use on signs etc. and widely used elsewhere too.
 * Use Hong's Hangul Conversion Tools to convert hangeul to Revised Romanization and back.
 * Note the McCune-Reischauer form in content and add a redirect if the old form is well known (e.g. Pusan → Busan, Inchon → Incheon, Cheju → Jeju).


 * For North Korea, use standard McCune-Reischauer, which remains the country's sole official system. (However, do not use the North Korean variant of MR.)
 * Omit breves and apostrophes from article titles (e.g. Pyongyang), but leave them in content (P'yŏngyang).


 * For personal names, use the nameholder's preference or common convention (e.g. Kim Il-sung, not "Kim Il-sŏng", and Syngman Rhee, not "I Seung-man"). The same applies to Hangul, kimchi and taekwondo.

Lao
There are two main ways to turn the Lao script into the Latin alphabet: either French-style spellings like Houeisay (formally the BGN/PCGN system), or English-style spellings like Huay Xai (Library of Congress aka LC-ALA style). While government documents seem to prefer the French style, Wikivoyage uses the English-style/Library of Congress system, because it's easier for English speakers to pronounce and is becoming more common. Notable exception: the capital is Vientiane, not Viangchan.

Mention the French-style romanization in the article intro. If it's so different that it's unrecognizable, e.g. Oudomxai for Muang Xay, consider adding it in parentheses next to inbound links as well.

See also: Wikipedia:Romanization of Lao

Russian
Russian romanization is fairly straightforward if you know Russian Cyrillic, which is also phonetic. As with Thai, there is no universally used transliteration system, even for transliteration into English. In particular, transliteration varies by target language, so don't use German or Finnish spellings when transliterating into English. Sometimes transcription is used instead of transliteration.

There are some general commonalities:


 * Although the name most commonly used in English may be close to the transliteration, they are not always the same. For example, Санкт-Петербург is Saint Petersburg in English, not the transliterated version Sankt-Peterburg. Use the English version in the title and in the text, and give the Russian and an accurate transcription (if different from the English name) in parentheses where the name of the place or term is introduced in the text.
 * The exception is the word Москва. While Москва is the name of the city and the river flowing through it, the standard English use is Moscow for the city and Moskva (the direct transliteration) for the river.
 * Every vowel in Russian is given its own syllable (there are no diphthongs). When the same letter repeats (whether vowel or consonant), in transliterations it must be written as many times as it appears (this is more common with people's names than place names).
 * For example, Пицца would be most accurately transliterated as "Pitstsa". The best transcription – writing words as they are pronounced – would be "Pizza". In scenarios like this, either one works, just be consistent.
 * Romanizing "soft vowels" (Я/я, Е/е, И/и, Ё/ё, Ю/ю): At the start of a word, include the initial "y", except for И.
 * Hence Екатеринбург Yekaterinburg, not Ekaterinburg, but Иркутск Irkutsk, not Yirkutsk.
 * In the middle of words, the only soft vowels always written with an initial "y" are Я and Ю.
 * That is, write Ryazan, not Razan
 * Different systems will write an initial "y" for E in the middle of words or not. Generally, it is not needed (the meaning is not lost between Dostoyevsky and Dostoevsky). If either version is closer to the name commonly used in English, use that.
 * Ё can be written as "yo", "o", or "e" (the latter two in the middle of words). Default to the first except for names that are commonly written in English with an "e" (like Mikhail Gorbachev). This arises from the two dots over Ё usually being omitted in Russian media.
 * The hard vowel Ы is usually written as a "y".
 * Hard sign (Ъ) and soft sign (Ь) are both written as an apostrophe ('), except for titles (e.g. use Kazan for searches but Kazan' in text). If a vowel other than Я or Ю immediately follows a soft sign, write a "y" instead of an apostrophe.
 * The two letters Ш and Щ should be written as "sh" and "shch", respectively. Older sources might use diacritics (š for Ш and šč for Щ). The letter Ц is always written as "ts", never "tz".
 * The letter Ж is always "zh", although older sources might use a diacritic and write ž. In some rare cases, there is an established English transcription that uses "J" (e.g. Nijinsky, not Nizhinsky).
 * Russian lacks a single letter for the English letter "J". Russian usually writes this sound (which almost always appears in foreign names) with the digraph ДЖ. This can be written either as "Dzh", or as "J". Try to avoid unnecessary confusion.
 * Таджикистан could be written as either Tajikistan or "Tadzhikistan"; use the former as that's what is used in English.
 * When romanizing initials, "Дж." will always romanize to "J".
 * The letter Х should always be written "kh". Russian lacks a sound for "h", so a lone "h" should never appear in a romanization. The letter К is always "k". The letter Ч is always "ch" as in "cheese".
 * The letter Й represents the Y consonant (as in "toy"), and is usually omitted from transliterations except at the start of words or after a vowel (other than Ы and И, since these would give a redundant double I or double Y).
 * When at the end of a word and after either Ы or И, only write the letter "y" once. There is no need to write out "yy" or "iy" or "yi", since in Russian these would all be pronounced as "y".
 * For example, Йогурт is written "yogurt" while Чайковский is written either as "Chaykovsky" (which is the direct letter-for-letter transcription) or as "Tchaikovsky" (the common Anglicization).
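
Several of the letter rules above can be combined into a small table-driven sketch. It is deliberately partial and the function name `transliterate` is invented here: it always writes Ё as "yo" and ДЖ as "dzh", writes hard and soft signs as apostrophes (the in-text form), does not apply the rule for a vowel following a soft sign, and knows nothing about established English names such as Moscow or Tajikistan, which still take precedence.

```python
# Letter-for-letter defaults; Е, Й and word-final ИЙ are handled in code.
BASE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ж": "zh",
    "з": "z", "и": "i", "к": "k", "л": "l", "м": "m", "н": "n",
    "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "'", "ы": "y", "ь": "'", "э": "e", "ю": "yu", "я": "ya",
    "ё": "yo",
}
VOWELS = set("аеёиоуыэюя")


def transliterate(text: str) -> str:
    """Romanize Russian text using a subset of the rules listed above."""
    chars = [c.lower() for c in text]
    out = []
    for i, ch in enumerate(chars):
        prev = chars[i - 1] if i > 0 else ""
        nxt = chars[i + 1] if i + 1 < len(chars) else ""
        after = chars[i + 2] if i + 2 < len(chars) else ""
        word_start = not prev.isalpha()
        if ch == "е":
            # Initial "y" only at the start of a word.
            piece = "ye" if word_start else "e"
        elif ch == "й":
            # Kept at the start of a word or after a vowel other than Ы/И.
            keep = word_start or (prev in VOWELS and prev not in ("ы", "и"))
            piece = "y" if keep else ""
        elif ch == "и" and nxt == "й" and not after.isalpha():
            # Word-final ИЙ collapses to a single "y".
            piece = "y"
        else:
            piece = BASE.get(ch, ch)
        if text[i].isupper():
            piece = piece.capitalize()
        out.append(piece)
    return "".join(out)


print(transliterate("Екатеринбург"))
print(transliterate("Чайковский"))
print(transliterate("Рязань"))
```

For article titles, the apostrophes produced for hard and soft signs would still need to be removed, as described above for Kazan.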

Thai
Thai romanization is generally a mess, with several incompatible 'standards' and lots of completely nonstandard off-the-cuff attempts. In general:


 * Use the most common English name for article names.


 * When in doubt, use the Royal Thai General System of Transcription (RTGS), used in road signs, time tables and government publications, and the closest thing there is to an official standard.
 * Hence Khao San Road, not "Kao Sarn", and Ayutthaya, not "Ayodhya"
 * In particular, Ko (not Koh) for islands
 * Always use RTGS if providing the pronunciation after a Thai name, e.g. กรุงเทพฯ Krung Thep for the city known in English as Bangkok

See also: Wikipedia:Wikipedia:Manual of Style (Thailand-related articles)

Vietnamese
Vietnamese is actually written in the Latin alphabet, but the vast slew of tonal diacritics makes handling it a little difficult.


 * Strip diacritics from article titles, but include them in the article body.
 * Hanoi (Hà Nội) is the capital of Vietnam...


 * Syllables should be separated in article titles and content, but include the combined form in the article body if it's a reasonably common alternative.
 * Nha Trang is a nice town and xe om motorbikes are dangerous
 * Ha Long Bay (Halong Bay) is pretty
 * Five exceptions to the rule: Dalat, Danang, Hanoi, Saigon and Vietnam itself
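
Stripping diacritics for a title can be sketched with Python's standard `unicodedata` module. The helper name `strip_diacritics` is invented here, and the sketch only removes marks: it keeps syllables separated, so the joined exceptions listed above (such as Hanoi) would still have to be applied by hand.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return text with Vietnamese tone and vowel marks removed."""
    # đ/Đ have no Unicode decomposition, so map them explicitly.
    text = text.replace("đ", "d").replace("Đ", "D")
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return unicodedata.normalize("NFC", stripped)


print(strip_diacritics("Hà Nội"))
print(strip_diacritics("Đà Nẵng"))
```

The explicit handling of đ matters because decomposition alone would leave it unchanged in a title.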