From the human point of view, a Musa text is a sequence of half letters. But from a digital point of view, a Musa text is a sequence of full letters. In other words, the Musa encoding is essentially Alphabet gait. Not only is this encoding complete and compact, but even unsophisticated rendering engines will still produce legible output from it.
Musa is encoded in the Private Use Area of Unicode, starting at E000 and ending at E2FF: three full pages. We also use the page E3xx for Musa Markup (explained on the next page). This makes Musa compatible with Unicode (and the SIL PUA) but not part of it. Since Unicode assignments are not allowed to change once they are made, it won't be appropriate to encode Musa directly in Unicode until Musa itself stops evolving.
The first Musa codepoint, E000, has a special meaning as the end-of-text character. It indicates to computers that there's no more text to display or transmit. The ASCII equivalent is ETX (0003), while in the C language, strings end with NUL (0000). The rest of the first four rows of the code chart, E001-E03F, are used to encode Musa shapes (not letters). This collection includes a few shape variants, so that the bare shapes can be used as half letters, as short versions, as keycaps, as shapes on blocks and tiles, and in other possible ways. Here's a list of them:
Codepoint | Shape | Unicode Name |
---|---|---|
E001 | | MUSA YA SHAPE |
E002 | | MUSA FI SHAPE |
E003 | | MUSA FA SHAPE |
E004 | | MUSA FU SHAPE |
E005 | | MUSA YU SHAPE |
E006 | | MUSA NU SHAPE |
E007 | | MUSA MU SHAPE |
E008 | | MUSA PU SHAPE |
E009 | | MUSA NA SHAPE |
E00A | | MUSA PA SHAPE |
E00B | | MUSA KA SHAPE |
E00C | | MUSA TA SHAPE |
E00D | | MUSA SA SHAPE |
E00E | | MUSA WA SHAPE |
E00F | | MUSA MA SHAPE |
E010 | | MUSA LU SHAPE |
E011 | | MUSA WI SHAPE |
E012 | | MUSA SI SHAPE |
E013 | | MUSA TI SHAPE |
E014 | | MUSA KI SHAPE |
E015 | | MUSA PI SHAPE |
E016 | | MUSA NI SHAPE |
E017 | | MUSA SU SHAPE |
E018 | | MUSA KU SHAPE |
E019 | | MUSA TU SHAPE |
E01A | | MUSA RI SHAPE |
E01B | | MUSA TURNED KA SHAPE |
E01C | | MUSA TURNED TA SHAPE |
E01D | | MUSA TURNED SA SHAPE |
E01E | | MUSA TURNED WA SHAPE |
E01F | | MUSA TURNED NA SHAPE |
E020 | | MUSA TURNED PA SHAPE |
E021 | | MUSA TURNED WI SHAPE |
E022 | | MUSA BOTTOM SA SHAPE |
E023 | | MUSA BOTTOM KA SHAPE |
E024 | | MUSA TOP SA SHAPE |
E025 | | MUSA TOP KA SHAPE |
E026 | | MUSA SEMI NU SHAPE |
E027 | | MUSA SEMI MU SHAPE |
E028 | | MUSA SEMI PU SHAPE |
E029 | | MUSA SEMI NA SHAPE |
E02A | | MUSA SEMI PA SHAPE |
E02B | | MUSA SEMI KA SHAPE |
E02C | | MUSA SEMI TA SHAPE |
E02D | | MUSA SEMI SA SHAPE |
E02E | | MUSA SEMI WA SHAPE |
E02F | | MUSA SEMI MA SHAPE |
E030 | | MUSA SEMI LU SHAPE |
E031 | | MUSA SEMI WI SHAPE |
E032 | | MUSA SEMI SI SHAPE |
E033 | | MUSA SEMI TI SHAPE |
E034 | | MUSA SEMI KI SHAPE |
E035 | | MUSA SEMI PI SHAPE |
E036 | | MUSA SEMI NI SHAPE |
E037 | | MUSA SEMI SU SHAPE |
E038 | | MUSA SEMI KU SHAPE |
E039 | | MUSA SEMI TU SHAPE |
E03A | | MUSA SEMI RI SHAPE |
E03B | | MUSA TWIN KA SHAPE |
E03C | | MUSA TWIN TA SHAPE |
E03D | | MUSA TWIN SA SHAPE |
E03E | | MUSA TWIN WI SHAPE |
E03F | | MUSA TWIN WA SHAPE |
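The ranges described so far can be summarized in a small classifier. This is only a sketch: the names `MusaKind` and `classify` are hypothetical, and the letter area is taken to be everything from E040 through E2FF, even though some of those codepoints are unused or reserved for other purposes.

```python
from enum import Enum


class MusaKind(Enum):
    """Coarse categories of codepoints, following the ranges in the text."""
    END_OF_TEXT = "end-of-text"
    SHAPE = "shape"
    LETTER_AREA = "letter area"
    MARKUP = "markup"
    NOT_MUSA = "not Musa"


def classify(codepoint: int) -> MusaKind:
    """Map a codepoint to the Musa range it falls in (hypothetical helper)."""
    if codepoint == 0xE000:
        return MusaKind.END_OF_TEXT
    if 0xE001 <= codepoint <= 0xE03F:
        return MusaKind.SHAPE
    # Assumption: everything else up to E2FF is treated as the letter area,
    # although some of these codepoints are unused.
    if 0xE040 <= codepoint <= 0xE2FF:
        return MusaKind.LETTER_AREA
    if 0xE300 <= codepoint <= 0xE3FF:
        return MusaKind.MARKUP
    return MusaKind.NOT_MUSA


if __name__ == "__main__":
    for cp in (0xE000, 0xE01B, 0xE116, 0xE300, 0x0041):
        print(f"{cp:04X}: {classify(cp).value}")
```

A check like this could be used, for example, to decide whether a run of text needs a Musa font at all.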
The rest of the E0xx page, the entire E1xx page, and most of the E2xx page encode letters, not shapes. Here is the complete set of codepoints; some are unused.
Row | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
E00_ | | | | | | | | | | | | | | | | |
E01_ | | | | | | | | | | | | | | | | |
E02_ | | | | | | | | | | | | | | | | |
E03_ | | | | | | | | | | | | | | | | |
E04_ | | | | | | | | | | | | | | | | |
E05_ | | | | | | | | | | | | | | | | |
E06_ | | | | | | | | | | | | | | | | |
E07_ | | | | | | | | | | | | | | | | |
E08_ | | | | | | | | | | | | | | | | |
E09_ | | | | | | | | | | | | | | | | |
E0A_ | | | | | | | | | | | | | | | | |
E0B_ | | | | | | | | | | | | | | | | |
E0C_ | | | | | | | | | | | | | | | | |
E0D_ | | | | | | | | | | | | | | | | |
E0E_ | | | | | | | | | | | | | | | | |
E0F_ | | | | | | | | | | | | | | | | |
Con | +0 | +1 | +2 | +3 | +4 | +5 | +6 | +7 | +8 | +9 | +A | +B | +C | +D | +E | +F | +10 | +11 | +12 | +13 | +14 | +15 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
E100 | | | | | | | | | | | | | | | | | | | | | | |
E116 | | | | | | | | | | | | | | | | | | | | | | |
E12C | | | | | | | | | | | | | | | | | | | | | | |
E142 | | | | | | | | | | | | | | | | | | | | | | |
E158 | | | | | | | | | | | | | | | | | | | | | | |
E16E | | | | | | | | | | | | | | | | | | | | | | |
E184 | | | | | | | | | | | | | | | | | | | | | | |
E19A | | | | | | | | | | | | | | | | | | | | | | |
E1B0 | | | | | | | | | | | | | | | | | | | | | | |
E1C6 | | | | | | | | | | | | | | | | | | | | | | |
E1DC | | | | | | | | | | | | | | | | | | | | | | |
E1F2 | | | | | | | | | | | | | | | | | | | | | | |
E208 | | | | | | | | | | | | | | | | | | | | | | |
E21E | | | | | | | | | | | | | | | | | | | | | | |
E234 | | | | | | | | | | | | | | | | | | | | | | |
E24A | | | | | | | | | | | | | | | | | | | | | | |
E260 | | | | | | | | | | | | | | | | | | | | | | |
E276 | | | | | | | | | | | | | | | | | | | | | | |
E28C | | | | | | | | | | | | | | | | | | | | | | |
E2A2 | | | | | | | | | | | | | | | | | | | | | | |
E2B8 | | | | | | | | | | | | | | | | | | | | | | |
E2CE | | | | | | | | | | | | | | | | | | | | | | |
Con | +0 | +1 | +2 | +3 | +4 | +5 | +6 | +7 | +8 | +9 | +A | +B | +C | +D | +E | +F | +10 | +11 | +12 | +13 | +14 | +15 |
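The hexadecimal labels at the left of a row and at the top of a column combine to give the code point of the letter in each cell: in the first table the column digit fills in the underscore of the row label, and in the second table the column offset is added to the row label. For instance, the Musa n is at code point E116.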
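The arithmetic for the second table can be sketched as follows. The rows in that table start at E100 and advance by 16 hex (22 decimal), with column offsets from +0 to +15 hex. The function names `cell_codepoint` and `locate` are hypothetical, chosen only for this illustration.

```python
FIRST_ROW = 0xE100   # label of the first row in the second table
LAST_ROW = 0xE2CE    # label of the last row in the second table
ROW_STRIDE = 0x16    # 22 columns per row: offsets +0 through +15 (hex)


def cell_codepoint(row_label: int, column_offset: int) -> int:
    """Add a column offset to a row label to get the cell's code point."""
    if not FIRST_ROW <= row_label <= LAST_ROW:
        raise ValueError("row label outside the table")
    if (row_label - FIRST_ROW) % ROW_STRIDE != 0:
        raise ValueError("not a row label of the table")
    if not 0 <= column_offset < ROW_STRIDE:
        raise ValueError("column offset must be between +0 and +15 (hex)")
    return row_label + column_offset


def locate(codepoint: int) -> tuple[int, int]:
    """Inverse lookup: return (row label, column offset) for a code point."""
    if not FIRST_ROW <= codepoint < LAST_ROW + ROW_STRIDE:
        raise ValueError("code point outside the table")
    row_index, column_offset = divmod(codepoint - FIRST_ROW, ROW_STRIDE)
    return FIRST_ROW + row_index * ROW_STRIDE, column_offset


if __name__ == "__main__":
    row, col = locate(0xE116)
    print(f"row {row:04X}, column +{col:X}")
```

With these definitions, `locate(0xE116)` yields the row labeled E116 and column +0, matching the example of the Musa n.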
The double-wide Musa logo is at E232, as if it were spelled by its two components. The Musa colon is at E1FD, as if it were spelled by two circles. E12D-E131 and E13B-E141 are used by our virtual keyboards to display control characters.
The Musa dot letter is encoded at E040, separately from the normal Unicode space at 0020. The rule is that the space between Musa text and other text is the normal space, but the dot is used within Musa text. This confounds line-breaking algorithms that aren't aware of Musa, with the result that lines of Musa text in Alphabet gait come out justified. Gaits with larger glyphs may have to leave an extra space or two at the right side of a line.
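The spacing rule can be illustrated with a short sketch. It assumes, purely for illustration, that text arrives with ordinary spaces everywhere and that a space counts as "within Musa text" when the characters on both sides of it are Musa codepoints in E001-E2FF. The helper names `is_musa` and `apply_dot_rule` are hypothetical.

```python
MUSA_DOT = "\uE040"  # the Musa dot letter
SPACE = "\u0020"     # the normal Unicode space


def is_musa(ch: str) -> bool:
    """Assumption: any codepoint in E001-E2FF counts as Musa text."""
    return 0xE001 <= ord(ch) <= 0xE2FF


def apply_dot_rule(text: str) -> str:
    """Replace a space with the Musa dot when both neighbors are Musa."""
    out = []
    for i, ch in enumerate(text):
        inside_musa = (
            ch == SPACE
            and 0 < i < len(text) - 1
            and is_musa(text[i - 1])
            and is_musa(text[i + 1])
        )
        out.append(MUSA_DOT if inside_musa else ch)
    return "".join(out)


if __name__ == "__main__":
    sample = "\uE116\uE117 \uE118 abc def"
    converted = apply_dot_rule(sample)
    print([f"{ord(c):04X}" for c in converted])
```

In this sketch, the space between the two Musa words becomes E040, while the space between Musa text and "abc", and the space inside the non-Musa text, stay as 0020.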
The Hentrax Musa Element font includes all the Musa codepoints, even if they don't correspond to a Musa letter. You can type these invalid letters in the Editor by selecting the Hentrax font.
In Musa, the gaits are implemented using OpenType Advanced Typography, which specifies substitutions or positionings of glyphs in certain circumstances. For example, in Kana gait, a sequence of consonant+vowel is replaced by the corresponding kana. The feature set is rich enough for everything Musa needs, mostly ligatures and contextual alternates.
Since gaits are implemented as fonts, there's no need for special treatment during text entry, transmission or storage. Musa text can be searched and sorted without regard for gait, and foreign words in text that can't be written in the gait of the text will appear in Alphabet gait. On the next page, we'll explain how Musa Markup gives you a way to embed the gait in the text without changing the letters.
Musa fonts share a common naming format: a font name, the word Musa, and then a gait keyword, like Dushan Musa Alphabet or Zhouhei Musa Fangzi, followed by a style (Regular, Bold, Italic, ...). The possible gait keywords are:
The conventional extension for sites completely in Musa will be .musa or the single Musa logo letter at E232. However, there isn't yet a Musa superdomain, so your site could be musa.mysite.com or mysite.com/musa, for example.
© 2002-2024 The Musa Academy | musa@musa.bet | 25sep23 |