Digital Musa

Encoding

From the human point of view, a Musa text is a sequence of half letters. But from a digital point of view, a Musa text is a sequence of full letters. In other words, the Musa encoding is essentially Alphabet gait. Not only is this encoding complete and compact, but even unsophisticated rendering engines will still produce legible output.

Musa is encoded in the Private Use Area of Unicode, starting at E000 and ending at E2FF. This makes Musa compatible with Unicode (and the SIL PUA) but not part of it. Since characters can't be changed once they're encoded in Unicode, it won't be appropriate to encode Musa directly in Unicode until Musa stops evolving.

The entire E2xx page, from E200 to E2FF, is reserved for Musa markup: codes that control how Musa is displayed. We'll tell you about markup on the next page.
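As a minimal sketch of the ranges described so far, here is how a program might test whether a character belongs to Musa and whether it is markup. The function and constant names are our own illustrations, not part of any Musa library, and dropping markup to get bare text is an assumption about how you might want to use it.

```python
# Ranges taken from the text: Musa occupies E000-E2FF, markup is the E2xx page.
MUSA_FIRST = 0xE000
MUSA_LAST = 0xE2FF
MARKUP_FIRST = 0xE200


def is_musa(ch: str) -> bool:
    """True if the character lies anywhere in the Musa range."""
    return MUSA_FIRST <= ord(ch) <= MUSA_LAST


def is_musa_markup(ch: str) -> bool:
    """True if the character lies in the page reserved for markup."""
    return MARKUP_FIRST <= ord(ch) <= MUSA_LAST


def strip_markup(text: str) -> str:
    """Drop markup codes and keep everything else (illustrative helper)."""
    return "".join(ch for ch in text if not is_musa_markup(ch))
```

For example, `strip_markup("\ue110\ue201\ue111")` keeps the two characters outside the E2xx page and drops the one inside it.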

The first Musa codepoint, E000, has a special meaning as the end-of-text character. It indicates to computers that there's no more text to display or transmit. The ASCII equivalent is ETX (0003), while the C language ends its strings with NUL (0000). We're not yet using the rest of the E00x line.
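A reader of Musa text could honor the end-of-text character along these lines. This is only a sketch: the helper name is hypothetical, and discarding whatever follows E000 is our reading of "no more text to display or transmit".

```python
END_OF_TEXT = "\ue000"  # the first Musa codepoint


def until_end_of_text(text: str) -> str:
    """Return the part of text before the first end-of-text character.

    If there is no end-of-text character, the whole string is returned.
    """
    head, _, _ = text.partition(END_OF_TEXT)
    return head
```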

The next three lines, E010 to E03F, are used to encode Musa shapes (not letters). This collection includes a few shape variants, so that they can be used as half-letters, as short versions, as keycaps, as shapes on blocks and tiles, and for other possible uses of the bare shapes. Here's a list of them:

Codepoint  Unicode Name
E010       MUSA KA TOP SHAPE
E011       MUSA SA TOP SHAPE
E012       MUSA SHORT YUMU SHAPE
E013       MUSA KA BOTTOM SHAPE
E014       MUSA SA BOTTOM SHAPE
E015       MUSA SHORT YULU SHAPE
E016       MUSA WI BOTTOM SHAPE
E017       MUSA SHORT YUSI SHAPE
E018       MUSA SHORT YUTI SHAPE
E019       MUSA SHORT YUKI SHAPE
E01A       MUSA SHORT YUPI SHAPE
E01B       MUSA SHORT YUNI SHAPE
E01D       MUSA SHORT YUKU SHAPE
E01F       MUSA SHORT YURI SHAPE
E021       MUSA YA SHAPE
E022       MUSA FI SHAPE
E023       MUSA FA SHAPE
E024       MUSA FU SHAPE
E025       MUSA YU SHAPE
E026       MUSA NU SHAPE
E027       MUSA MU SHAPE
E028       MUSA PU SHAPE
E029       MUSA NA SHAPE
E02A       MUSA PA SHAPE
E02B       MUSA KA SHAPE
E02C       MUSA TA SHAPE
E02D       MUSA SA SHAPE
E02E       MUSA WA SHAPE
E02F       MUSA MA SHAPE
E030       MUSA LU SHAPE
E031       MUSA WI SHAPE
E032       MUSA SI SHAPE
E033       MUSA TI SHAPE
E034       MUSA KI SHAPE
E035       MUSA PI SHAPE
E036       MUSA NI SHAPE
E037       MUSA SU SHAPE
E038       MUSA KU SHAPE
E039       MUSA TU SHAPE
E03A       MUSA RI SHAPE
E03B       MUSA TWIN KA SHAPE
E03C       MUSA TWIN TA SHAPE
E03D       MUSA TWIN SA SHAPE
E03E       MUSA TWIN WI SHAPE

The rest of the E0 page and the entire E1 page encode letters, not shapes. Here is the complete set of codepoints; some are unused and some others are no longer used.

[Code chart: rows E01_ to E0F_ and E10_ to E1F_, columns _0 to _F, one glyph per cell]

The hexadecimal label at the left of each row and the one at the top of each column add up to give the code point of the letter in each cell. For instance, the Musa n, in row E11_ and column _0, is at code point E110.
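The arithmetic behind the chart labels can be sketched as follows. The function name is ours, and the label strings are assumed to look like the chart's own (E11_ for a row, _0 for a column).

```python
def cell_code_point(row_label: str, column_label: str) -> int:
    """Combine a row label such as 'E11_' with a column label such as '_0'."""
    # The row label supplies the first three hex digits, so shift it left
    # by one hex digit (4 bits) to make room for the column digit.
    row = int(row_label.rstrip("_"), 16) << 4
    # The column label supplies the last hex digit.
    column = int(column_label.lstrip("_"), 16)
    return row + column


# Row E11_ plus column _0 gives the code point of the Musa n.
print(format(cell_code_point("E11_", "_0"), "04X"))  # prints E110
```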

The double-wide Musa logo is at E17E, as if it were spelled by its two components. The Musa colon is at E155, as if it were spelled by two circles.

The Musa dot letter is encoded at E040 separately from the normal Unicode space at 0020. The rule is that the space between Musa text and other text is the normal space, but the dot is used within Musa text. That confounds the non-Musa end-of-line algorithms so that lines of Musa text in Alphabet gait are justified. Gaits with larger glyphs may have to leave an extra space or two at the right side of a line.
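The spacing rule can be sketched as a small normalizer that turns a normal space into the Musa dot only when it sits between two Musa characters. The helper names are hypothetical, and treating every codepoint from E000 to E2FF (markup included) as "Musa text" is a simplifying assumption.

```python
MUSA_DOT = "\ue040"  # the Musa dot letter
SPACE = " "          # the normal Unicode space, 0020


def is_musa(ch: str) -> bool:
    """Assumption: anything in E000-E2FF counts as Musa text here."""
    return 0xE000 <= ord(ch) <= 0xE2FF


def normalize_spaces(text: str) -> str:
    """Use the dot within Musa text and keep the normal space elsewhere."""
    chars = list(text)
    for i in range(1, len(chars) - 1):
        # A space with Musa on both sides is inside Musa text.
        if chars[i] == SPACE and is_musa(chars[i - 1]) and is_musa(chars[i + 1]):
            chars[i] = MUSA_DOT
    return "".join(chars)
```

A space with Musa on only one side, such as the one between a Musa word and a Latin word, is left as the normal space.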

Digital Gaits

In Musa, the gaits are implemented using OpenType Advanced Typography, which specifies substitutions or positionings of glyphs in certain circumstances. For example, in Kana gait, a sequence of consonant+vowel is replaced by the corresponding kana. The feature set is rich enough for everything Musa needs, mostly ligatures and contextual alternates.
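To give a feel for what such a substitution does, here is a toy model of a ligature lookup in plain Python. It is not how a font is actually built: the glyph names and the table are invented for illustration, and a real Musa font defines its own rules inside its OpenType tables.

```python
# Hypothetical glyph names: each (consonant, vowel) pair maps to one kana glyph.
KANA_LIGATURES = {
    ("k", "a"): "ka_kana",
    ("s", "i"): "si_kana",
}


def apply_ligatures(glyphs, ligatures):
    """Replace each listed pair of adjacent glyphs with its single ligature glyph."""
    shaped = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if len(pair) == 2 and pair in ligatures:
            shaped.append(ligatures[pair])  # two glyphs become one
            i += 2
        else:
            shaped.append(glyphs[i])        # no rule applies, keep the glyph
            i += 1
    return shaped
```

Note that only the displayed glyphs change: the input sequence, which stands for the stored letters, is left untouched.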

Since gaits are implemented as fonts, there's no need for special treatment during text entry, transmission or storage. Musa text can be searched and sorted without regard for gait, and foreign words in text that can't be written in the gait of the text will appear in Alphabet gait. On the next page, we'll explain how Musa Markup gives you a way to embed the gait in the text without changing the letters.

Musa fonts share a common naming format: a font name, the word Musa, and then a gait keyword, like Dushan Musa Alphabet or Zhou Musa Fangzi, followed by a style (Regular, Bold, Italic, ...). The possible gait keywords are:

Domains

The conventional extension for sites completely in Musa will be .musa or the single Musa letter at E17E. However, there isn't yet a Musa superdomain, so your site could be musa.mysite.com or mysite.com/musa, for example.




© 2002-2022 The Musa Academy musa@musa.bet 05jul22