ISO 15924
ISO 15924, Codes for the representation of names of scripts, defines two sets of codes for a number of writing systems (scripts). Each script is given both a four-letter code and a numeric one.[1] A script is defined as a "set of graphic characters used for the written form of one or more languages".[1]
Where possible, the codes are derived from ISO 639-2 when the name of a script and the name of a language using that script are identical (for example, Gujarātī: ISO 639 guj, ISO 15924 Gujr). Preference is given to the 639-2 Bibliographical codes, which differs from the otherwise often preferred use of the Terminological codes.[1]
4-letter ISO 15924 codes are incorporated into the Language Subtag Registry for IETF language tags and so can be used in file formats that make use of such language tags. For example, they can be used in HTML and XML to help Web browsers determine which typeface to use for foreign text. This way one could differentiate, for example, between Serbian written in the Cyrillic (sr-Cyrl) or Latin (sr-Latn) script, or mark romanized text as such.
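As a rough illustration of how the script code sits inside a language tag, the sketch below pulls the four-letter subtag out of tags such as sr-Cyrl. The function name `script_subtag` is hypothetical, and the parsing is deliberately simplified: it assumes a well-formed tag of the shape language, optional three-letter extended-language subtags, then an optional script subtag, and it does not handle private-use or grandfathered tags.

```python
import re

# Simplified patterns (assumption: ASCII-only, well-formed tags).
_SCRIPT = re.compile(r"[A-Za-z]{4}")   # four letters: script subtag
_EXTLANG = re.compile(r"[A-Za-z]{3}")  # three letters: extended-language subtag


def script_subtag(tag):
    """Return the script subtag of a language tag in title case, or None."""
    subtags = tag.split("-")
    # The first subtag is the language; a script subtag, if present, follows
    # it directly or after any extended-language subtags.
    for subtag in subtags[1:]:
        if _SCRIPT.fullmatch(subtag):
            return subtag.capitalize()
        if not _EXTLANG.fullmatch(subtag):
            break  # region, variant, etc.: no script subtag in this tag
    return None


print(script_subtag("sr-Cyrl"))  # Cyrl
print(script_subtag("sr-Latn"))  # Latn
print(script_subtag("sr"))       # None
```

A production system would more likely rely on a full language-tag parser than on this positional check.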
Maintenance
ISO has appointed the Unicode Consortium as the Registration Authority (RA) for the standard. In 2004, the RA appointed Michael Everson to act as Registrar. The Registrar works with a Joint Advisory Committee (JAC) in developing and implementing the standard.[2] The JAC contains six members: the Registrar, one member from the Library of Congress, one from Standards Norway, one from the French Encyclopaedia Universalis, an officer of Unicode, and a member of Unicode. These individuals represent the interests of the ISO 15924 RA, the ISO 639-2 RA, ISO Technical Committee 37, ISO Technical Committee 46, and the ISO Coded Character Set Sub-Committee, ISO/IEC JTC1/SC2.[3]
Script codes
Numeric ranges
- 000–099 Hieroglyphic and cuneiform scripts
- 100–199 Right-to-left alphabetic scripts
- 200–299 Left-to-right alphabetic scripts
- 300–399 Alphasyllabic scripts
- 400–499 Syllabic scripts
- 500–599 Ideographic scripts
- 600–699 Undeciphered scripts
- 700–799 Shorthands and other notations[4]
- 800–899 (unassigned)
- 900–999 Private use, alias, special codes[5]
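Because each range covers exactly one hundred numbers, the label for a numeric code can be found from its hundreds digit. The following is a minimal sketch of that lookup; the names `RANGE_LABELS` and `numeric_range_label` are made up for illustration, and the labels are copied from the list above.

```python
# Labels for each block of one hundred numeric codes, in list order.
RANGE_LABELS = [
    "Hieroglyphic and cuneiform scripts",   # 000-099
    "Right-to-left alphabetic scripts",     # 100-199
    "Left-to-right alphabetic scripts",     # 200-299
    "Alphasyllabic scripts",                # 300-399
    "Syllabic scripts",                     # 400-499
    "Ideographic scripts",                  # 500-599
    "Undeciphered scripts",                 # 600-699
    "Shorthands and other notations",       # 700-799
    "(unassigned)",                         # 800-899
    "Private use, alias, special codes",    # 900-999
]


def numeric_range_label(number):
    """Return the range label for a numeric code given as int or digit string."""
    value = int(number)  # int("050") == 50, so zero-padded strings work
    if not 0 <= value <= 999:
        raise ValueError(f"numeric codes run from 000 to 999, got {number!r}")
    return RANGE_LABELS[value // 100]


print(numeric_range_label(215))    # Left-to-right alphabetic scripts
print(numeric_range_label("050"))  # Hieroglyphic and cuneiform scripts
```

The label describes the numeric block only; the direction recorded for an individual script in the list of codes can differ (Tfng, number 120, is listed there as L-to-R).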
Special codes
- Qaaa–Qabx (900–949): 50 codes reserved for private use.
- Zinh 994 : Code for inherited script
- Zmth 995 : Mathematical notation
- Zsym 996 : Symbols
- Zxxx 997 : Code for unwritten documents
- Zyyy 998 : Code for undetermined script
- Zzzz 999 : Code for uncoded script
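The private-use block pairs 50 four-letter codes with 50 numbers. The sketch below maps one to the other under the assumption that the numbers are assigned in alphabetical order of the letter codes, which is consistent with the three pairs given in this article (Qaaa 900, Qaai 908, Qabx 949). The function name `private_use_number` is hypothetical.

```python
def private_use_number(code):
    """Return the numeric code (900-949) for a private-use code, else None.

    Assumes Qaaa..Qabx are numbered sequentially in alphabetical order.
    """
    if len(code) != 4 or not (code.isascii() and code.isalpha()):
        return None
    code = code.capitalize()  # normalise case, e.g. "QAAI" -> "Qaai"
    if not code.startswith("Qa"):
        return None
    # Treat the last two letters as a two-digit base-26 number.
    offset = (ord(code[2]) - ord("a")) * 26 + (ord(code[3]) - ord("a"))
    return 900 + offset if offset < 50 else None


print(private_use_number("Qaaa"))  # 900
print(private_use_number("Qaai"))  # 908
print(private_use_number("Qabx"))  # 949
print(private_use_number("Qaby"))  # None
```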
List of codes
This list of codes is from the ISO 15924:2004 standard.[6]
Code | No. | Name | Unicode alias | Direction | Unicode version | Unicode characters | Remark |
---|---|---|---|---|---|---|---|
Adlm | 166 | Adlam | | R-to-L | | | Approved for inclusion in a future version of the Unicode Standard[7][8] |
Afak | 439 | Afaka | | L-to-R | | | Not in Unicode, proposal under review by the Unicode Technical Committee[7] |
Aghb | 239 | Caucasian Albanian | Caucasian Albanian | L-to-R | 7.0 | 53 | Ancient/historic |
Ahom | 338 | Ahom, Tai Ahom | Ahom | L-to-R | 8.0 | 57 | Ancient/historic |
Arab | 160 | Arabic | Arabic | R-to-L | 1.0 | 1,257 | |
Aran | 161 | Arabic (Nastaliq variant) | | R-to-L | | | Typographic variant of Arabic |
Armi | 124 | Imperial Aramaic | Imperial Aramaic | R-to-L | 5.2 | 31 | Ancient/historic |
Armn | 230 | Armenian | Armenian | L-to-R | 1.0 | 93 | |
Avst | 134 | Avestan | Avestan | R-to-L | 5.2 | 61 | Ancient/historic |
Bali | 360 | Balinese | Balinese | L-to-R | 5.0 | 121 | |
Bamu | 435 | Bamum | Bamum | L-to-R | 5.2 | 657 | |
Bass | 259 | Bassa Vah | Bassa Vah | L-to-R | 7.0 | 36 | Ancient/historic |
Batk | 365 | Batak | Batak | L-to-R | 6.0 | 56 | |
Beng | 325 | Bengali | Bengali | L-to-R | 1.0 | 93 | |
Bhks | 334 | Bhaiksuki | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[7] |
Blis | 550 | Blissymbols | | L-to-R | | | Not in Unicode, proposal in initial/exploratory stage[7] |
Bopo | 285 | Bopomofo | Bopomofo | L-to-R | 1.0 | 70 | |
Brah | 300 | Brahmi | Brahmi | L-to-R | 6.0 | 109 | Ancient/historic |
Brai | 570 | Braille | Braille | L-to-R | 3.0 | 256 | |
Bugi | 367 | Buginese | Buginese | L-to-R | 4.1 | 30 | |
Buhd | 372 | Buhid | Buhid | L-to-R | 3.2 | 20 | |
Cakm | 349 | Chakma | Chakma | L-to-R | 6.1 | 67 | |
Cans | 440 | Unified Canadian Aboriginal Syllabics | Canadian Aboriginal | L-to-R | 3.0 | 710 | |
Cari | 201 | Carian | Carian | L-to-R | 5.1 | 49 | Ancient/historic |
Cham | 358 | Cham | Cham | L-to-R | 5.1 | 83 | |
Cher | 445 | Cherokee | Cherokee | L-to-R | 3.0 | 172 | |
Cirt | 291 | Cirth | | L-to-R | | | Not in Unicode |
Copt | 204 | Coptic | Coptic | L-to-R | 1.0 | 137 | Ancient/historic, Disunified from Greek in 4.1 |
Cprt | 403 | Cypriot | Cypriot | R-to-L | 4.0 | 55 | Ancient/historic |
Cyrl | 220 | Cyrillic | Cyrillic | L-to-R | 1.0 | 434 | |
Cyrs | 221 | Cyrillic (Old Church Slavonic variant) | | L-to-R | | | Not in Unicode |
Deva | 315 | Devanagari (Nagari) | Devanagari | L-to-R | 1.0 | 154 | |
Dsrt | 250 | Deseret (Mormon) | Deseret | L-to-R | 3.1 | 80 | |
Dupl | 755 | Duployan shorthand, Duployan stenography | Duployan | L-to-R | 7.0 | 143 | |
Egyd | 070 | Egyptian demotic | | R-to-L | | | Not in Unicode |
Egyh | 060 | Egyptian hieratic | | R-to-L | | | Not in Unicode |
Egyp | 050 | Egyptian hieroglyphs | Egyptian Hieroglyphs | L-to-R | 5.2 | 1,071 | Ancient/historic |
Elba | 226 | Elbasan | Elbasan | L-to-R | 7.0 | 40 | Ancient/historic |
Ethi | 430 | Ethiopic (Geʻez) | Ethiopic | L-to-R | 3.0 | 495 | |
Geok | 241 | Khutsuri (Asomtavruli and Nuskhuri) | Georgian | L-to-R | | | Unicode groups Geok and Geor together as "Georgian" |
Geor | 240 | Georgian (Mkhedruli) | Georgian | L-to-R | 1.0 | 127 | For Unicode, see also Geok |
Glag | 225 | Glagolitic | Glagolitic | L-to-R | 4.1 | 94 | Ancient/historic |
Goth | 206 | Gothic | Gothic | L-to-R | 3.1 | 27 | Ancient/historic |
Gran | 343 | Grantha | Grantha | L-to-R | 7.0 | 85 | Ancient/historic |
Grek | 200 | Greek | Greek | L-to-R | 1.0 | 516 | |
Gujr | 320 | Gujarati | Gujarati | L-to-R | 1.0 | 85 | |
Guru | 310 | Gurmukhi | Gurmukhi | L-to-R | 1.0 | 79 | |
Hanb | 503 | Han with Bopomofo (alias for Han + Bopomofo) | | L-to-R | | | See Hani, Bopo |
Hang | 286 | Hangul (Hangŭl, Hangeul) | Hangul | L-to-R | 1.0 | 11,739 | Hangul syllables relocated in 2.0 |
Hani | 500 | Han (Hanzi, Kanji, Hanja) | Han | L-to-R | 1.0 | 81,734 | |
Hano | 371 | Hanunoo (Hanunóo) | Hanunoo | L-to-R | 3.2 | 21 | |
Hans | 501 | Han (Simplified variant) | | L-to-R | | | Subset Hani |
Hant | 502 | Han (Traditional variant) | | L-to-R | | | Subset Hani |
Hatr | 127 | Hatran | Hatran | R-to-L | 8.0 | 26 | Ancient/historic |
Hebr | 125 | Hebrew | Hebrew | R-to-L | 1.0 | 133 | |
Hira | 410 | Hiragana | Hiragana | L-to-R | 1.0 | 91 | |
Hluw | 080 | Anatolian Hieroglyphs (Luwian Hieroglyphs, Hittite Hieroglyphs) | Anatolian Hieroglyphs | L-to-R | 8.0 | 583 | Ancient/historic |
Hmng | 450 | Pahawh Hmong | Pahawh Hmong | L-to-R | 7.0 | 127 | |
Hrkt | 412 | Japanese syllabaries (alias for Hiragana + Katakana) | Katakana or Hiragana | L-to-R | | | See Hira, Kana |
Hung | 176 | Old Hungarian (Hungarian Runic) | Old Hungarian | R-to-L | 8.0 | 108 | Ancient/historic |
Inds | 610 | Indus (Harappan) | | R-to-L | | | Not in Unicode, proposal in initial/exploratory stage[7] |
Ital | 210 | Old Italic (Etruscan, Oscan, etc.) | Old Italic | L-to-R | 3.1 | 36 | Ancient/historic |
Jamo | 284 | Jamo (alias for Jamo subset of Hangul) | | L-to-R | | | Subset Hang |
Java | 361 | Javanese | Javanese | L-to-R | 5.2 | 90 | |
Jpan | 413 | Japanese (alias for Han + Hiragana + Katakana) | | L-to-R | | | See Hani, Hira and Kana |
Jurc | 510 | Jurchen | | L-to-R | | | Not in Unicode |
Kali | 357 | Kayah Li | Kayah Li | L-to-R | 5.1 | 47 | |
Kana | 411 | Katakana | Katakana | L-to-R | 1.0 | 300 | |
Khar | 305 | Kharoshthi | Kharoshthi | R-to-L | 4.1 | 65 | Ancient/historic |
Khmr | 355 | Khmer | Khmer | L-to-R | 3.0 | 146 | |
Khoj | 322 | Khojki | Khojki | L-to-R | 7.0 | 61 | Ancient/historic |
Kitl | 505 | Khitan large script | | L-to-R | | | Not in Unicode |
Kits | 288 | Khitan small script | | T-to-B | | | Not in Unicode |
Knda | 345 | Kannada | Kannada | L-to-R | 1.0 | 87 | |
Kore | 287 | Korean (alias for Hangul + Han) | | L-to-R | | | See Hani and Hang |
Kpel | 436 | Kpelle | | L-to-R | | | Not in Unicode, proposal in initial/exploratory stage[7] |
Kthi | 317 | Kaithi | Kaithi | L-to-R | 5.2 | 66 | Ancient/historic |
Lana | 351 | Tai Tham (Lanna) | Tai Tham | L-to-R | 5.2 | 127 | |
Laoo | 356 | Lao | Lao | L-to-R | 1.0 | 67 | |
Latf | 217 | Latin (Fraktur variant) | | L-to-R | | | Typographic variant of Latin |
Latg | 216 | Latin (Gaelic variant) | | L-to-R | | | Typographic variant of Latin |
Latn | 215 | Latin | Latin | L-to-R | 1.0 | 1,349 | See Latin script in Unicode |
Leke | 364 | Leke | | L-to-R | | | Not in Unicode |
Lepc | 335 | Lepcha (Róng) | Lepcha | L-to-R | 5.1 | 74 | |
Limb | 336 | Limbu | Limbu | L-to-R | 4.0 | 68 | |
Lina | 400 | Linear A | Linear A | L-to-R | 7.0 | 341 | Ancient/historic |
Linb | 401 | Linear B | Linear B | L-to-R | 4.0 | 211 | Ancient/historic |
Lisu | 399 | Lisu (Fraser) | Lisu | L-to-R | 5.2 | 48 | |
Loma | 437 | Loma | | L-to-R | | | Not in Unicode, proposal in initial/exploratory stage[7] |
Lyci | 202 | Lycian | Lycian | L-to-R | 5.1 | 29 | Ancient/historic |
Lydi | 116 | Lydian | Lydian | R-to-L | 5.1 | 27 | Ancient/historic |
Mahj | 314 | Mahajani | Mahajani | L-to-R | 7.0 | 39 | Ancient/historic |
Mand | 140 | Mandaic, Mandaean | Mandaic | R-to-L | 6.0 | 29 | |
Mani | 139 | Manichaean | Manichaean | R-to-L | 7.0 | 51 | Ancient/historic |
Marc | 332 | Marchen | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[7][8] |
Maya | 090 | Mayan hieroglyphs | | | | | Not in Unicode |
Mend | 438 | Mende Kikakui | Mende Kikakui | R-to-L | 7.0 | 213 | |
Merc | 101 | Meroitic Cursive | Meroitic Cursive | R-to-L | 6.1 | 90 | Ancient/historic |
Mero | 100 | Meroitic Hieroglyphs | Meroitic Hieroglyphs | R-to-L | 6.1 | 32 | Ancient/historic |
Mlym | 347 | Malayalam | Malayalam | L-to-R | 1.0 | 100 | |
Modi | 324 | Modi, Moḍī | Modi | L-to-R | 7.0 | 79 | Ancient/historic |
Mong | 145 | Mongolian | Mongolian | T-to-B | 3.0 | 153 | Includes Clear, Manchu scripts |
Moon | 218 | Moon (Moon code, Moon script, Moon type) | | | | | Not in Unicode, proposal in initial/exploratory stage[7] |
Mroo | 199 | Mro, Mru | Mro | L-to-R | 7.0 | 43 | |
Mtei | 337 | Meitei Mayek (Meithei, Meetei) | Meetei Mayek | L-to-R | 5.2 | 79 | |
Mult | 323 | Multani | Multani | L-to-R | 8.0 | 38 | Ancient/historic |
Mymr | 350 | Myanmar (Burmese) | Myanmar | L-to-R | 3.0 | 223 | |
Narb | 106 | Old North Arabian (Ancient North Arabian) | Old North Arabian | R-to-L | 7.0 | 32 | Ancient/historic |
Nbat | 159 | Nabataean | Nabataean | R-to-L | 7.0 | 40 | Ancient/historic |
Newa | 333 | Newa, Newar, Newari, Nepāla lipi | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[7][8] |
Nkgb | 420 | Nakhi Geba ('Na-'Khi ²Ggŏ-¹baw, Naxi Geba) | | L-to-R | | | Not in Unicode, proposal in initial/exploratory stage[7] |
Nkoo | 165 | N’Ko | NKo | R-to-L | 5.0 | 59 | |
Nshu | 499 | Nüshu | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[9][8] |
Ogam | 212 | Ogham | Ogham | | 3.0 | 29 | Ancient/historic |
Olck | 261 | Ol Chiki (Ol Cemet’, Ol, Santali) | Ol Chiki | L-to-R | 5.1 | 48 | |
Orkh | 175 | Old Turkic, Orkhon Runic | Old Turkic | R-to-L | 5.2 | 73 | Ancient/historic |
Orya | 327 | Oriya | Oriya | L-to-R | 1.0 | 90 | |
Osge | 219 | Osage | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[7][8] |
Osma | 260 | Osmanya | Osmanya | L-to-R | 4.0 | 40 | |
Palm | 126 | Palmyrene | Palmyrene | R-to-L | 7.0 | 32 | Ancient/historic |
Pauc | 263 | Pau Cin Hau | Pau Cin Hau | L-to-R | 7.0 | 57 | |
Perm | 227 | Old Permic | Old Permic | L-to-R | 7.0 | 43 | Ancient/historic |
Phag | 331 | Phags-pa | Phags-pa | T-to-B | 5.0 | 56 | Ancient/historic |
Phli | 131 | Inscriptional Pahlavi | Inscriptional Pahlavi | R-to-L | 5.2 | 27 | Ancient/historic |
Phlp | 132 | Psalter Pahlavi | Psalter Pahlavi | R-to-L | 7.0 | 29 | Ancient/historic |
Phlv | 133 | Book Pahlavi | | R-to-L | | | Not in Unicode |
Phnx | 115 | Phoenician | Phoenician | R-to-L | 5.0 | 29 | Ancient/historic |
Piqd | 293 | Klingon (KLI pIqaD) | | L-to-R | | | Rejected for inclusion in the Unicode Standard[10][11] |
Plrd | 282 | Miao (Pollard) | Miao | L-to-R | 6.1 | 133 | |
Prti | 130 | Inscriptional Parthian | Inscriptional Parthian | R-to-L | 5.2 | 30 | Ancient/historic |
Qaaa | 900 | Reserved for private use (start) | | | | | Not in Unicode |
Qaai | 908 | (Private use) | | | | | Not in Unicode (Before version 5.2, this was used instead of Zinh) |
Qabx | 949 | Reserved for private use (end) | | | | | Not in Unicode |
Rjng | 363 | Rejang (Redjang, Kaganga) | Rejang | L-to-R | 5.1 | 37 | |
Roro | 620 | Rongorongo | | | | | Not in Unicode, proposal in initial/exploratory stage[7] |
Runr | 211 | Runic | Runic | L-to-R | 3.0 | 86 | Ancient/historic |
Samr | 123 | Samaritan | Samaritan | R-to-L | 5.2 | 61 | |
Sara | 292 | Sarati | | | | | Not in Unicode |
Sarb | 105 | Old South Arabian | Old South Arabian | R-to-L | 5.2 | 32 | Ancient/historic |
Saur | 344 | Saurashtra | Saurashtra | L-to-R | 5.1 | 81 | |
Sgnw | 095 | SignWriting | SignWriting | T-to-B | 8.0 | 672 | |
Shaw | 281 | Shavian (Shaw) | Shavian | L-to-R | 4.0 | 48 | |
Shrd | 319 | Sharada, Śāradā | Sharada | L-to-R | 6.1 | 94 | |
Sidd | 302 | Siddham, Siddhaṃ, Siddhamātṛkā | Siddham | L-to-R | 7.0 | 92 | Ancient/historic |
Sind | 318 | Khudawadi, Sindhi | Khudawadi | L-to-R | 7.0 | 69 | |
Sinh | 348 | Sinhala | Sinhala | L-to-R | 3.0 | 110 | |
Sora | 398 | Sora Sompeng | Sora Sompeng | L-to-R | 6.1 | 35 | |
Sund | 362 | Sundanese | Sundanese | L-to-R | 5.1 | 72 | |
Sylo | 316 | Syloti Nagri | Syloti Nagri | L-to-R | 4.1 | 44 | |
Syrc | 135 | Syriac | Syriac | R-to-L | 3.0 | 77 | |
Syre | 138 | Syriac (Estrangelo variant) | | R-to-L | | | Typographic variant of Syriac |
Syrj | 137 | Syriac (Western variant) | | R-to-L | | | Typographic variant of Syriac |
Syrn | 136 | Syriac (Eastern variant) | | R-to-L | | | Typographic variant of Syriac |
Tagb | 373 | Tagbanwa | Tagbanwa | L-to-R | 3.2 | 18 | |
Takr | 321 | Takri, Ṭākrī, Ṭāṅkrī | Takri | L-to-R | 6.1 | 66 | |
Tale | 353 | Tai Le | Tai Le | L-to-R | 4.0 | 35 | |
Talu | 354 | New Tai Lue | New Tai Lue | L-to-R | 4.1 | 83 | |
Taml | 346 | Tamil | Tamil | L-to-R | 1.0 | 72 | |
Tang | 520 | Tangut | | L-to-R | | | Approved for inclusion in a future version of the Unicode Standard[7][8] |
Tavt | 359 | Tai Viet | Tai Viet | L-to-R | 5.2 | 72 | |
Telu | 340 | Telugu | Telugu | L-to-R | 1.0 | 96 | |
Teng | 290 | Tengwar | | L-to-R | | | Not in Unicode |
Tfng | 120 | Tifinagh (Berber) | Tifinagh | L-to-R | 4.1 | 59 | |
Tglg | 370 | Tagalog (Baybayin, Alibata) | Tagalog | L-to-R | 3.2 | 20 | |
Thaa | 170 | Thaana | Thaana | R-to-L | 3.0 | 50 | |
Thai | 352 | Thai | Thai | L-to-R | 1.0 | 86 | |
Tibt | 330 | Tibetan | Tibetan | L-to-R | 2.0 | 207 | Added in 1.0, removed in 1.1 and reintroduced in 2.0 |
Tirh | 326 | Tirhuta | Tirhuta | L-to-R | 7.0 | 82 | |
Ugar | 040 | Ugaritic | Ugaritic | L-to-R | 4.0 | 31 | Ancient/historic |
Vaii | 470 | Vai | Vai | L-to-R | 5.1 | 300 | |
Visp | 280 | Visible Speech | | L-to-R | | | Not in Unicode |
Wara | 262 | Warang Citi (Varang Kshiti) | Warang Citi | L-to-R | 7.0 | 84 | |
Wole | 480 | Woleai | | R-to-L | | | Not in Unicode, proposal in initial/exploratory stage[7] |
Xpeo | 030 | Old Persian | Old Persian | L-to-R | 4.1 | 50 | Ancient/historic |
Xsux | 020 | Cuneiform, Sumero-Akkadian | Cuneiform | L-to-R | 5.0 | 1,234 | Ancient/historic |
Yiii | 460 | Yi | Yi | L-to-R | 3.0 | 1,220 | |
Zinh | 994 | Code for inherited script | Inherited | Inherited | | 563 | |
Zmth | 995 | Mathematical notation | | L-to-R | | | Not a 'script' in Unicode |
Zsym | 996 | Symbols | | | | | Not a 'script' in Unicode |
Zsye | 993 | Symbols (emoji variant) | | | | | Not a 'script' in Unicode |
Zxxx | 997 | Code for unwritten documents | | | | | Not a 'script' in Unicode |
Zyyy | 998 | Code for undetermined script | Common | | | 7,179 | |
Zzzz | 999 | Code for uncoded script | Unknown | | | 993,309 | All other code points |
Relations to other standards
The following standards are referred to as indispensable by ISO 15924.
- ISO 639-2:1998 Codes for the representation of names of languages — Part 2: Alpha-3 code
- ISO/IEC 9541-1:1991 Information technology — Font information interchange — Part 1: Architecture
- ISO/IEC 10646-1:2000 Information technology — Universal Multiple-Octet Coded Character Set (UCS)
For the definitions of font and glyph, the standard refers to ISO/IEC 9541-1:1991.
Unicode defines some 100 scripts. Through a link called the "Property Value Alias", Unicode maintains a 1:1 correspondence between each script it defines and that script's ISO 15924 code. See Script (Unicode).
References
- ↑ 1.0 1.1 1.2 Lua error in package.lua at line 80: module 'strict' not found.
- ↑ Unicode - ISO 15924 Registration Authority
- ↑ Lua error in package.lua at line 80: module 'strict' not found.
- ↑ In July 2010, Duployan shorthand was assigned code 755, even though the 700–799 range still carried its original designation of (unassigned). Shortly thereafter, Revision 1.1 clarified that codes in the 700s were reserved for "Shorthands and other notations", although that revision is only provisional until it can be confirmed by governing committees.
- ↑ Lua error in package.lua at line 80: module 'strict' not found.
- ↑ Lua error in package.lua at line 80: module 'strict' not found.
- ↑ 7.00 7.01 7.02 7.03 7.04 7.05 7.06 7.07 7.08 7.09 7.10 7.11 7.12 7.13 7.14 Lua error in package.lua at line 80: module 'strict' not found.
- ↑ 8.0 8.1 8.2 8.3 8.4 8.5 Lua error in package.lua at line 80: module 'strict' not found.
- ↑ Lua error in package.lua at line 80: module 'strict' not found.
- ↑ Lua error in package.lua at line 80: module 'strict' not found.
- ↑ Lua error in package.lua at line 80: module 'strict' not found.