vkethana.com

Inventing my own writing system for English, VJScript


By the end of this post, you will be able to read and create your own English sentences in VJScript, a writing system I invented. As a sneak peek, here’s what it looks like:

[Image: a sample sentence in VJScript]
[Image: the word 'complex' in VJScript]

Why invent your own writing system?

English spelling reform movements, as you may know, are nothing new. Figures ranging from Benjamin Franklin to the Mormons to Teddy Roosevelt have tried their hand at English spelling reform. Why did these people spend years of their lives and (in the case of Teddy Roosevelt) risk ridicule from the press to reform English spelling?

Because it sucks. As one English teacher put it, the language “seems like a bright and inestimable jewel wrapped up in a nasty rag unworthy to be touched… [the alphabet’s] actual distinct sounds are by far insufficient for the purposes of English itself – being scarcely half the number required.” 1

As the quote (colorfully) illustrates, the English alphabet does not have enough letters to represent all of the language's sounds. For example, we are told in school that English has five vowels (a, e, i, o, u, and sometimes y). In reality, American English has 14 or 15 distinct vowel sounds, and other varieties have even more. Compare the “i” in “fight” to the “i” in “fit”, or the “u” in “tube” to the “u” in “under.”

Fig. 1: A vowel chart of standard California English. (Way more than 5 vowels!)

Introducing VJScript

VJScript fixes these issues with three main design principles: consonants and vowels are treated differently, no silent letters, and one-to-one correspondence between letters and sounds. VJScript does not have a single alphabet but rather two “buckets”, one for consonants and one for vowels.

Here's the vowel inventory:
Vowel Example IPA (for linguists and nitpickers)
EE beet /biːt/
I bit /bɪt/
EI bait /beɪt/
E bet /bɛt/
AE bat /bæt/
AW bought /bɔːt/ or /bɑːt/ (represents both ɑ and ɔ)
OA boat /boʊt/
Ø book /bʊk/
U but, comma /bʌt/, /kɑmə/
AI bite /baɪt/
OW bout /baʊt/
OY boy /bɔɪ/
O boot /buːt/
R bird, nurse /bərd/, /nərs/

And now the consonants:
Consonant Example IPA (for linguists and nitpickers)
K kite /kaɪt/
G good /ɡʊd/
NG sing /sɪŋ/
CH choose /tʃuːz/
J jump /dʒʌmp/
T type, better /taɪp/, /ˈbɛɾɚ/
TH think, this /θɪŋk/, /ðɪs/
D do /duː/
N nice /naɪs/
P pop /pɒp/
B bob /bɒb/
F fun /fʌn/
M mill /mɪl/
Y yes /jɛs/
R run /rʌn/
L look /lʊk/
V voice /vɔɪs/
S sit /sɪt/
Z zoo /zuː/
SH shine /ʃaɪn/
ZH measure /ˈmɛʒər/

The consonant and vowel inventories do not perfectly map one-to-one with English sounds. (But then again, neither does regular English spelling.) For example, the voiced “th” in “this” and the unvoiced “th” in “throw”, which are really different phonemes, are both represented with “TH”. Also, “CH” and “J” are given unique letters even though “CH” = “T” + “SH” and “J” = “D” + “ZH”.

How to write words in VJScript

To write a word, first split it up into its vowels and consonants. Take the word “alphabet” as an example. “Alphabet” = “AE” + “LF” + “U” + “B” + “E” + “T”. Notice that consonant clusters (“LF”) are written together and all silent letters are removed (there are none here).
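The splitting step can be sketched in a few lines of Python. This is only an illustration, not the script linked later in the post: the function name `split_segments` is made up here, the input is assumed to be a word already broken into VJScript letters (one per sound), and "R" is always treated as a consonant for simplicity.

```python
# Vowel letters from the vowel inventory above. "R" is left out on purpose:
# this sketch always treats it as a consonant (a simplifying assumption).
VOWELS = {"EE", "I", "EI", "E", "AE", "AW", "OA", "Ø", "U", "AI", "OW", "OY", "O"}


def split_segments(letters):
    """Group a word's VJScript letters into vowels and consonant clusters."""
    segments = []
    cluster = []  # consonants collected since the last vowel
    for letter in letters:
        if letter in VOWELS:
            if cluster:
                # A vowel ends the current consonant cluster.
                segments.append("".join(cluster))
                cluster = []
            segments.append(letter)
        else:
            cluster.append(letter)
    if cluster:
        segments.append("".join(cluster))
    return segments


print(split_segments(["AE", "L", "F", "U", "B", "E", "T"]))
# ['AE', 'LF', 'U', 'B', 'E', 'T']
```

The output matches the hand-made split of "alphabet" above, with "L" and "F" merged into the cluster "LF".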

Now for every vowel that appears after a consonant, write it on top of the consonant. Vowels at the beginning of words, like the “AE” in alphabet, remain unchanged. So “alphabet” becomes:

[Image: the word "alphabet" in VJScript]
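As a rough plain-text approximation of the stacking rule, the Python sketch below prints a word as two rows: stacked vowels on top, everything else on the baseline. The names `stack_vowels` and `render_rows` are hypothetical, the input is assumed to be a list of VJScript letters (one per sound), "R" is always kept on the baseline, and two vowels in a row are simply joined together. None of this is taken from the actual VJScript tooling.

```python
VOWELS = {"EE", "I", "EI", "E", "AE", "AW", "OA", "Ø", "U", "AI", "OW", "OY", "O"}


def stack_vowels(letters):
    """Return a list of [base, top] cells for one word."""
    cells = []
    previous_was_vowel = False
    for letter in letters:
        if letter not in VOWELS:
            # Consonants always start a new cell on the baseline.
            cells.append([letter, ""])
            previous_was_vowel = False
        elif not cells:
            # A word-initial vowel stays on the baseline.
            cells.append([letter, ""])
            previous_was_vowel = True
        elif previous_was_vowel:
            # Second vowel in a row: join it to wherever the first one went.
            slot = 1 if cells[-1][1] else 0
            cells[-1][slot] += letter
        else:
            # A vowel after a consonant goes on top of that consonant.
            cells[-1][1] = letter
            previous_was_vowel = True
    return cells


def render_rows(cells):
    """Turn cells into (top_row, base_row) strings with aligned columns."""
    top_row, base_row = [], []
    for base, top in cells:
        width = max(len(base), len(top))
        top_row.append(top.ljust(width))
        base_row.append(base.ljust(width))
    return " ".join(top_row).rstrip(), " ".join(base_row).rstrip()


top, base = render_rows(stack_vowels(["AE", "L", "F", "U", "B", "E", "T"]))
print(top)
print(base)
```

For "alphabet", the "U" ends up above the "F" and the "E" above the "B", while the initial "AE" stays on the baseline. The columns only line up in a monospaced font.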

The advantage of separating vowels and consonants like this is that someone who has never seen the writing system before can guess the meaning of the sentence by scanning over the consonants, which remain largely unchanged. Recall the example from earlier:

[Image: a sample sentence in VJScript]

Practice words

Take out a pen and paper and try writing the following words. See if your representation matches with the one I came up with:

1) "Apple" (hint: recall that "l" is not considered a vowel) [Image: the word 'apple' in VJScript]
2) "Banana" [Image: the word 'banana' in VJScript]
3) "Complex" (hint: when there's more than one consonant in a row, only the last one receives a vowel)

The word "complex" is tricky because it contains the consonant cluster "PL." I never specified a rule for where to place vowels when there's a consonant cluster, so there are at least two ways to represent this word, the only difference being the location of the vowel E:

[Image: the word 'complex' in VJScript] [Image: the word 'complex' in VJScript]

Some scrapped versions that look cool

This version used vowel diacritics that appear both above and below consonants:

[Image: a writing system combining English consonants with Hindi vowels]

This one put all the vowel diacritics on top of the consonants. Also, I added a rule that all words ending in a consonant had to have a special marker (్) on top of the final letter:

[Image: a second version, which has all the diacritics on top]

Later on, I tried mixing in letters from the International Phonetic Alphabet (e.g. “ð” for the voiced “th”):

[Image: a final version, which uses some IPA letters (= “This is a sentence.”)]

Additional Information (May 25th)

I showed VJScript to a few other people and posted about it on Hacker News. Some people pointed out a few mistakes in the original post, so I’ve clarified some things below:

Paragraph-Long Example in VJScript

As extra practice, here’s a longer, paragraph-length example of VJScript.

  I
T H S    I S    U    F L L

AE U   AE             EE EI     I
P  R G R  F    I N    V  J  S K R P T.

              AI I          I
AI    AE M    R  T N G    T H S

OA      AE                   AE
S     T H  T    U T H R S    K  N

         O    EE
L R N    T    R  D    I T.

                        AE     OA
AE T    F R S T    I    H  N D R  T

AI        AE
M     E K S  M P L S.

OW E              EE E     EE    I AI
H  W V R    AI    R  S N T L     D Z  N D

         I               AI   AW
A    S K R P T    I N    P  T H  N

O       OA EI        U    OA        AW E
T    AW T  M  T    T H    H  L    P R  S S.

O    AE      AI               I OA
Y    C  N    F  N D    I T    B L.

  AE            AO      EE I          I
T H  N G K S    F  R    R  D N G    T H S!
Translation: "This is a full paragraph in VJScript. I am writing this so that others can learn to read it. At first I handwrote my examples. However I recently designed a script in Python to automate the whole process. You can find it below [in the next section titled "Software to Generate VJScript"]. Thanks for reading this!"

Software to Generate VJScript

I’ve written a simple script to generate VJScript. Note that this code assumes you’ve already split the word into consonants and vowels. Check it out here.

Update: My Twitter friend Daniel B. Gray has written a better script which accepts ordinary English spelling, converts the words into phonetic transcription using the NLP library NLTK, and then turns that into VJScript. It also has a neat web GUI built with Flask: the link to try it for yourself is here.
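To give a feel for the middle step of such a pipeline, here is a minimal Python sketch that maps a phonetic transcription onto VJScript letters. It is not Daniel's code. It assumes the transcription arrives as ARPAbet symbols with optional stress digits (the notation used by the CMU Pronouncing Dictionary, which NLTK can load), and the table `ARPABET_TO_VJ` and function `arpabet_to_vjscript` are names invented for this example. "H" and "W" do not appear in the consonant table above but are used in the paragraph-long example, so they are included here as an assumption.

```python
# Hypothetical ARPAbet -> VJScript table, built from the inventories in this post.
ARPABET_TO_VJ = {
    # Vowels
    "IY": "EE", "IH": "I", "EY": "EI", "EH": "E", "AE": "AE",
    "AO": "AW", "AA": "AW",   # AW covers both ɔ and ɑ
    "OW": "OA", "UH": "Ø",
    "AH": "U",                # covers both "but" and the last vowel of "comma"
    "AY": "AI", "AW": "OW", "OY": "OY", "UW": "O", "ER": "R",
    # Consonants
    "K": "K", "G": "G", "NG": "NG", "CH": "CH", "JH": "J", "T": "T",
    "TH": "TH", "DH": "TH",   # voiced and unvoiced "th" share one letter
    "D": "D", "N": "N", "P": "P", "B": "B", "F": "F", "M": "M", "Y": "Y",
    "R": "R", "L": "L", "V": "V", "S": "S", "Z": "Z", "SH": "SH", "ZH": "ZH",
    "HH": "H", "W": "W",      # assumed: not listed in the consonant table
}


def arpabet_to_vjscript(phones):
    """Map ARPAbet symbols (stress digits allowed) to VJScript letters."""
    # Unknown symbols raise KeyError, which is fine for a sketch.
    return [ARPABET_TO_VJ[phone.rstrip("012")] for phone in phones]


# Hand-typed transcriptions, used here only as sample input.
print(arpabet_to_vjscript(["AE1", "L", "F", "AH0", "B", "EH2", "T"]))
# ['AE', 'L', 'F', 'U', 'B', 'E', 'T']
print(arpabet_to_vjscript(["DH", "IH1", "S"]))
# ['TH', 'I', 'S']
```

The resulting letters still need to be split and stacked as described in "How to write words in VJScript".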

Gliding Vowels

Gliding vowels, also called diphthongs, are combinations of two vowels in a single syllable, e.g. the “ia” in “California”. (When it’s three vowels, it’s called a triphthong.)

The rule for writing diphthongs/triphthongs is to simply combine the vowels in question. So “California” becomes:

AE I AW   EEU
K  L F  R N

Note that some diphthongs which are common in English, e.g. the “AI” in “write”, were given their own unique digraphs.

The letter “R”

Careful readers might notice that “R” appears as both a vowel (as in “bird”) and a consonant (“read”). This ends up making things simpler in the long run by preventing unnecessary diphthongs. For example, if we had a rule that “ər” (as in “bird”) is written “UR”, then the word “script” would have to be written:

  URI
S K   PT


Instead of:

    I
S K R P T.

Conclusion

Considering that people can’t even agree on the right way to spell the word “gray”, English spelling is unlikely to ever undergo a comprehensive reform. After all, the language was never regulated by a central authority the way French and Spanish are (MLA doesn’t count), which makes launching a set of reforms like this very difficult. Still, I don’t feel that designing a writing system is a waste of time. It forced me to better understand how writing works in general, and it gave me an excuse to go down several Wikipedia rabbit holes.

Thanks for reading this post. If you get any good ideas from it or find any mistakes, please reach out over email or leave a comment below.


  1. Source: Page 44 of the preface to An English-Telugu Dictionary by P. Sankaranarayana, 1900. The entire preface (link) is really a treat to read. The author, a language tutor from British India, spends a solid two or three pages roasting the English “alphabet” and, like me, concludes by proposing a writing system of his own invention.