Punctuation

In addition to letters, the Musa script includes punctuation.

End of Word: Space with Dot

In many fonts, Musa uses a dot in the space at the end of a word. This helps you read the separation between words as different from the separation between letters. We don't always need a dot: for example, in Akshara fonts, words are separated by the break in the stroke linking the letters. We also don't use the dot in Alphabet or Ligature fonts with kerning, since there isn't enough whitespace inside each word to be confused with the space between words.

A Dot is one cell wide, like all the letters. Musa fonts are fixed-width (monospace): every glyph is designed to be the same width. You don't need to break a line at the end of a word or to mark a word that is split between lines; just write letters all the way to the end of the line and then write the next letter on a new line. This means that lines of Musa text are always both left- and right-justified, and "soft" newlines are not embedded in the text.
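Here's a minimal Python sketch of that line-breaking rule. It assumes every character occupies exactly one cell, and it uses the middle dot "·" as a hypothetical stand-in for the Musa Dot, since the real glyph isn't reproduced here. The function name wrap_cells is ours, not part of any Musa tool.

```python
def wrap_cells(cells: str, width: int) -> list[str]:
    """Break a run of one-cell glyphs into lines of exactly `width` cells.

    Word boundaries are ignored on purpose: the line simply ends when the
    cells run out, and the next glyph starts the following line.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    return [cells[i:i + width] for i in range(0, len(cells), width)]


# "·" is only a stand-in for the Musa Dot (assumption for illustration).
lines = wrap_cells("the·quick·brown·fox·", 8)
```

Every line except possibly the last comes out exactly 8 cells wide, which is why the text is justified on both sides without any extra spacing.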

End of Sentence: Double Dot

One Dot ends a word; two consecutive Dots end a sentence. As you saw on the Intonation page, the end of a sentence is almost always marked with intonation. But after the intonation, we still write a Double Dot to mark the end of the sentence.
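As a rough illustration, here's a Python sketch that strings words and sentences together with Dots. It rests on two assumptions of ours: "·" stands in for the Musa Dot, and the Double Dot is read as the last word's own Dot plus one more. Intonation marks are left out entirely.

```python
DOT = "·"  # hypothetical stand-in for the Musa Dot


def punctuate(sentences: list[list[str]]) -> str:
    """Join words with a Dot after each, and add a second Dot per sentence."""
    pieces = []
    for words in sentences:
        # One Dot closes every word...
        sentence = "".join(word + DOT for word in words)
        # ...and one more Dot makes the Double Dot that closes the sentence.
        pieces.append(sentence + DOT)
    return "".join(pieces)


text = punctuate([["the", "cat"], ["it", "sat"]])
```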

New Paragraph: Portal

In Musa, a new paragraph is marked with a symbol that we call a Portal. It's just a black rectangle the size of a Musa domino. It's used where, in older typography, we might have used a pilcrow ¶, a section sign §, or even a fleuron ❦. For example, here's a text from around 1500 AD showing paragraph markers in use.

It's typed using a shortcut, as described on the Keyboard page. The shortcut inserts six ASCII spaces (U+0020) followed by the Portal rectangle, and finally a Musa space (U+E040). The result is to insert both whitespace and a very dark mark at the beginning of a new paragraph - hard to miss! In addition, the use of ASCII spaces means renderers will often break the line there, so the new paragraph will start on a new line, but it doesn't have to - that helps keep Musa text justified. If the prior paragraph ends by chance at the end of the line, the new paragraph will be indented.
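The sequence that the shortcut produces can be sketched in Python like this. The two space code points come from the paragraph above; the Portal's own code point isn't given on this page, so the sketch substitutes U+25AE (BLACK VERTICAL RECTANGLE) purely as a visible placeholder. The function names are ours.

```python
ASCII_SPACE = "\u0020"
MUSA_SPACE = "\uE040"        # Musa space, as given in the text
PORTAL_STAND_IN = "\u25AE"   # placeholder only: the real Portal code point is not stated here


def paragraph_break(portal: str = PORTAL_STAND_IN) -> str:
    """Six ASCII spaces, then the Portal, then one Musa space."""
    return ASCII_SPACE * 6 + portal + MUSA_SPACE


def join_paragraphs(paragraphs: list[str]) -> str:
    """Prefix every paragraph with the break sequence; no hard newlines are added."""
    return "".join(paragraph_break() + body for body in paragraphs)
```

Because the break is made of ordinary characters rather than a newline, a renderer is free to wrap at the ASCII spaces or to keep going on the same line.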

Outlines

Outlines and other forms of structured text, like laws and contracts, use section numbers at the left margin, followed by two normal Musa spaces. Here's an example, in Roman but with Musa punctuation:

8Vertebrates
80Fish
81Amphibians
82Reptiles
83Mammals
830Monotremes
831Marsupials
832Placentals
84Birds
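The example suggests that each added digit opens one more level of nesting, so a section's parent is its number with the last digit removed. That reading is our inference from the sample, not a stated rule. Under that assumption, a short Python sketch can recover the hierarchy (the helper names are hypothetical):

```python
from typing import Optional


def parent(number: str) -> Optional[str]:
    """Return the enclosing section number, or None for a one-digit number."""
    return number[:-1] or None


def indent_outline(entries: list[tuple[str, str]]) -> list[str]:
    """Indent each entry by its depth relative to the shortest number present."""
    base = min(len(number) for number, _ in entries)
    return ["  " * (len(number) - base) + number + " " + title
            for number, title in entries]


sample = [("8", "Vertebrates"), ("83", "Mammals"), ("830", "Monotremes")]
rows = indent_outline(sample)
```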

Where do spaces go between words?

We mentioned above that a dot marks the end of a word, without discussing when a word ends. There are two types of difficult case: the first when words are too long, and the second when words are too short.

A famous example of the first is German, which is known for its very long compound words. The rule of thumb for Musa is to split the word in front of every stressed syllable, whether the stress is primary or secondary. For example, the Viennese company name DDSG stands for Donaudampfschiffahrtsgesellschaft, which is currently spelled as one word, even though its abbreviation breaks it up into Donau Dampf Schiffahrts Gesellschaft (Danube Steam Shipping Company). The three final words each start with "secondary" stress, but Musa would spell them as individual words, even though only the last is inflected.

In English, we distinguish between a greenhouse and a green house, and here again, stress is a good indicator of the word boundary. But there are languages without (lexical) stress, like French, or with words with more than one stressed syllable, like Swedish, so stress isn't a definitive marker. In general, lexical items usually have their own words. Green and house each have their own meanings, but the meaning of greenhouse can't be predicted from them: it's its own word.

The other tricky case concerns small grammatical words, the kind of words that we usually don't stress at all in English. Our current spelling attaches prefixes and suffixes directly to their base words, and so does Musa: cat-s demand-ed un-reason-able help-ing-s. Our current spelling separates prepositions and articles, and so does Musa: in the room. But the word for the in Swedish is a suffix that Musa attaches to the base word, and Japanese has postpositions that Musa attaches to the base word. We don't have a definitive rule.

In general, we separate words to make the text easier to read - they don't represent anything phonetic. Most of the time - but not always - a word boundary does represent the boundary between syllables, like a Break. But sometimes a syllable crosses a word boundary, as we'll discuss now.

Contraction, Enchainment, Elision, Liaison, and Crasis

In many languages, words in sequence are pronounced differently than they would be separately. The general linguistic term for this is sandhi. In general, Musa writes those changes.

In English, the most obvious case is the word a, which is pronounced and written an when the following word starts with a vowel: a pear, but an apple. The word the also changes its pronunciation - from thə pear to thē apple - with no change in spelling. In Musa, we spell both changes:

   

Most of the time, in most languages, words in speech aren't separated: cold rain sounds just like coal drain. When two consecutive words both contribute to a single syllable, that's called enchainment. For example, fine art usually sounds like fy nart, and true sport sounds like truce port: one syllable spans both words. But when we write, we insert spaces between lexical words to make it easier to parse, and in Musa, we do this even if the space falls in the middle of a syllable.

For example, in English, unstressed am/is/are have/has/had will/would did often combine with a preceding pronoun as in I'm you've he'll, and unstressed not often combines with a preceding auxiliary verb as in don't won't can't to form contractions. Musa separates the two words with a space, even if one or both of the words changes to form a single syllable:

I'm  you're  he's 

I'll  you've  he'd 

don't  doesn't  can't 

Note that English words like doesn't have the apostrophe between the n and the t, where the o was removed, but in Musa, we put the space before the n, because that separates the two words.
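To show where the boundary falls, here's a Python sketch that splits Roman-letter contractions at the word boundary rather than at the apostrophe. It's only an illustration in English spelling: real Musa text is written phonetically, and the pattern and function name are ours. The 's ending is deliberately left out, because spelling alone can't tell a contraction (he's) from a possessive.

```python
import re

# Base word, then either n't or an apostrophe clitic ('m, 're, 've, 'll, 'd).
_CONTRACTION = re.compile(r"^(?P<base>.+?)(?P<clitic>n't|'(?:m|re|ve|ll|d))$",
                          re.IGNORECASE)


def split_contraction(token: str) -> tuple[str, ...]:
    """Split a contraction into (base, clitic); return (token,) if it isn't one."""
    match = _CONTRACTION.match(token)
    if match is None:
        return (token,)
    return (match["base"], match["clitic"])
```

For doesn't this gives does and n't, with the break before the n, matching the placement of the Musa space described above.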

The English possessive case is also written with an apostrophe, but it's not a contraction. In Musa, we don't write that apostrophe: the possessive the boy's bike is written just like the plural the boys bike: . But the possessive is not written the same as the contraction:

John's home (possessive: the home of John) 

John's home (contraction: John is home) 

Likewise, the words its - the possessive of it - and the word it's - the contraction of it is or it has - are spelled differently, just as they are now in English:

its (possessive: its home = the home of it) 

it's (contraction: it's home = it is home) 

English also has "reduced contractions" like wanna gonna gotta woulda coulda shoulda, that sometimes even coexist with normal contractions like could've. But we consider these reduced contractions to be complete words on their own, and we write them as a single word.

French is much more complicated, but we're not going to explain it here - we're just going to tell you how to write it, assuming you already know how to pronounce it. French uses a lot of enchaînement, with words blending together, but they are written with spaces between the words, even if the words change form:

cher ami 

bon ami 

French and Catalan also feature elision, one of two ways to prevent consecutive vowels: the final vowel of one word is elided before another word beginning with a vowel, and the two words are combined. The first word is usually a small word like le, ce, se, de, ne, me, te, que, je (all ending in a schwa), or one of their compounds like quelque. We write the first word as a single letter:

(French) l'ami 
(Catalan) l'amic 

In les hommes, the word les is one of many French words that are now spelled with a silent final consonant: the z (spelled s in French) at the end of les is normally silent. But when the subsequent word starts with a vowel, in order to separate the two vowels, the hitherto silent consonant becomes audible, and we prefix it to the following word. This is called liaison - the other way French prevents consecutive vowels. In Musa, we don't write the final consonant when it's silent. Compare les hommes - where the first word ends with a z - with both les femmes and les Halles, where the s of les isn't pronounced at all:

  
les femmes les hommes les Halles

French, Portuguese, Catalan, and Italian have a form of contraction called crasis, where two words combine to form a single word, and we use neither a space nor an apostrophe.

(French) de + les = des 
(Portuguese) de + os = dos 
(Catalan) de + els = dels 
(Italian) di + gli = degli 

Sometimes, you'll see a whole series of crases, elisions, liaisons, and/or contractions:

il n'y en a pas 

Defective Punctuation

In addition to the system just presented, Musa offers a second, simpler system of punctuation for use when there isn't enough data for full punctuation. This might arise, for example, as a result of transcription of existing Roman text, as opposed to text written directly in Musa.

In this defective system of punctuation, the period, question mark and exclamation point are simply transcribed using double accents:

The . period is transcribed as a double Level accent 
The ? question mark is transcribed as a double Rising accent 
The ! exclamation point is transcribed as a double Falling accent 
The use of one of these three signs indicates that the sentence's punctuation is defective.

Punctuation within a sentence uses some other signs:

The , comma is a low rising accent 
The _ underscore is a low falling accent 
The - hyphen is a high level accent 
The ‐ dash is a low level accent 
The ; semicolon is a level accent above a comma 
The : colon is a level accent above an underscore 
The / slash is a slash above a level accent 
The \ backslash is a backslash above a level accent 
The . point (not a full stop period) is a Break 
Quotes "" ‘’ «» use a high rising accent  to open and a high falling accent  to close.
Parentheses ( ), braces { }, and brackets [ ] use paired rising and falling accents with either a high long mark  or a low long mark . For embedded quotes or braces, use the other type.
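For a transcription tool, these correspondences could be kept in a simple lookup table. The Python sketch below covers only part of the lists above (no dash, quotes, or brackets) and records the accent names as plain descriptions, since the Musa code points aren't given here. The table and function names are hypothetical.

```python
# Sentence-final marks become double accents.
SENTENCE_FINAL = {
    ".": "double Level accent",
    "?": "double Rising accent",
    "!": "double Falling accent",
}

# Marks inside a sentence; "." here is the point that is not a full stop.
WITHIN_SENTENCE = {
    ",": "low rising accent",
    "_": "low falling accent",
    "-": "high level accent",
    ";": "level accent above a comma",
    ":": "level accent above an underscore",
    "/": "slash above a level accent",
    "\\": "backslash above a level accent",
    ".": "Break",
}


def describe(mark: str, sentence_final: bool = False):
    """Return the description of the Musa sign for a Roman mark, or None if unlisted."""
    table = SENTENCE_FINAL if sentence_final else WITHIN_SENTENCE
    return table.get(mark)
```

The sentence_final flag is needed because the same Roman character "." maps to two different signs depending on whether it ends the sentence.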

Musa text can be bold, italic or underlined, for example for text cited from a different source or quotations.

musa@musa.bet

19feb25