Building a Multilingual Command Line: John Samuel

A multilingual command line lets users interact with a computer from the command line in more than one language. Until now, command-line interfaces have mostly offered commands that are English-like or are mnemonics built from English words. This makes the interface difficult for new users who do not speak English. The major question, then, is how to help non-English speakers work with the command-line interface in their native languages. Is there really a need to rewrite the command-line emulators so that they support multiple human languages? This article looks at different ways to extend them to support multiple languages.

This is part 2 of the series on multilingual application development. Part 1 of the series proposed a rethinking of the command-line.

Introduction

A recent article in Wired¹ discussed how most programming languages are designed for English speakers and do little to help those who do not speak English. The keywords of programming languages use the English (Latin) alphabet and are mostly derived from English words. For example, to define a function in a language like Python, you use the keyword def (derived from the word 'define'). Similarly, to define a class, you use the keyword class, another English word. Imagine a non-English speaker who decides to learn programming. Unlike in mathematics, this person has to learn these English(-like) keywords. It becomes even more challenging if the speaker's language does not use the English (Latin) alphabet. One may say that most developers know or speak a little English. This has been a valid argument for a long time, but as internet use increases across the world, it is time to rethink developers' interaction with computers.

Most developers, unless they work only in an Integrated Development Environment (IDE), come across the command-line interface. The command-line interface lets the user type commands² to perform a given task on the computer. Examples include mkdir to create a directory, cd to change directory, and ps to list processes. These commands are abbreviations or short forms of English words or phrases, just like the programming-language keywords discussed above. A detailed study of the learning curve of non-English speakers starting out on the command line would be interesting. However, one could ask another question here: "What if native-language support were brought to the command line, so that people could write commands in their native languages?"

Taking the example of Bash³, a shell and command language for Unix/Linux-based systems, let's explore how a shell can be extended to offer a multilingual interface. The idea is to let speakers of any language interact with the computer using commands in their native language, as transparently as possible. The existing commands have been written by developers across the world, which leaves three possible solutions:

Solution 1: Ask the developers and maintainers to support their tools, applications or commands in multiple languages and release executables in multiple languages.
Solution 2: Modify the shell (e.g., Bash) source code to allow accepting translations of existing commands, in such a manner that when a user types a command in their native language, the shell searches for this command in all the translations and if it finds one, checks whether the translation is mapped to any existing command.
Solution 3: Extend the shell (e.g., Bash) in a transparent manner, similar to the approach taken by Ubiquity^4,5 that does not require any modification to the shell (like browser in the case of Ubiquity) nor to the individual applications.
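
To make the lookup described in Solution 2 more concrete, here is a minimal sketch that approximates it from a startup file instead of inside the shell's source. It relies on the command_not_found_handle hook that Bash 4.0 and later call when a command name is not found. The TRANSLATIONS table and the French words in it are illustrative assumptions, not an agreed vocabulary.

```shell
#!/usr/bin/env bash
# Hypothetical translation table: native-language word -> existing command.
declare -A TRANSLATIONS=(
  [liste]="ls"
  [affiche]="cat"
  [supprime]="rm"
)

# Bash 4.0+ calls this function, if it is defined, whenever a command
# name cannot be found. $1 is the unknown word, the rest are its arguments.
command_not_found_handle() {
  local word=$1
  shift
  local target=""
  # Look the word up in the table (an empty key is not a valid subscript).
  if [[ -n $word ]]; then
    target=${TRANSLATIONS[$word]:-}
  fi
  # Run the mapped command only if the mapping exists and the command does too.
  if [[ -n $target ]] && command -v "$target" >/dev/null 2>&1; then
    "$target" "$@"
  else
    printf '%s: command not found\n' "$word" >&2
    return 127
  fi
}
```

With this in place, typing affiche notes.txt would run cat notes.txt, while an unknown word still fails with the usual exit status 127. Because the hook only fires for names that do not already exist, a translated word can never shadow a real command.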

Solutions 1 and 2 require a tremendous amount of work to translate all the existing commands, their arguments (or options), etc. In real life, however, we do not use all the available commands. Why not look at the commands that are used regularly and build a transparent multilingual solution around them? This is what brings us to Solution 3.

Ubiquity^4,5, for example, opted for Solution 3 and considered several languages in its study. One interesting finding from this work is that one cannot assume a generic word order. In English, imperatives (or commands) are of the form Verb + Subject + Object, Verb + Object or just Verb^6,7. Take, for example, some English imperative sentences from the perspective of the command line.

list all processes
create a file
delete a file

If it were French, the above imperative sentences may look like

affiche tous les processus
crée un fichier
supprime un fichier

However, these imperative sentences follow a different order in languages like Malayalam, where the object comes before the verb.

പ്രക്രിയകൾ കാണിക്കുക
ഫയൽ സൃഷ്ടിക്കുക
ഫയൽ ഇല്ലാതാക്കുക

Thus, the word order cannot be assumed, and it's important to be flexible. Figure 1 shows the first possibility, where the verb comes first and the object(s) come at the second place.

Fig 1: Command to list files, directories, processes or network connections

The other possibility is shown in Figure 2, where the object comes first.

Looking at some of the commands that people normally use on the command line, we can obtain possible objects and the actions on them, as shown in Figure 3.

Development

Let's explore a very simple way to implement this idea. We want to implement commands as shown in Figure 1, i.e., the action verb comes first and the object comes second. To test these commands, we add them to the .bashrc^8,9 file in the home directory. The idea is to create a new word by combining the action word and the object. From then on, the user types these new combined words.


        alias listfile="ls"
        alias createfile="touch"
        alias deletefile="rm"
        alias showfile="cat"

        alias listdirectory="ls"
        alias createdirectory="mkdir"
        alias deletedirectory="rmdir"
        alias showdirectory="ls"

This approach is simple and can be used to create any number of commands. The user has to remember the four action words create, show, delete and list as well as the objects file and directory.
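
Since every combined word follows the same pattern, the aliases could also be generated from a lookup table rather than written by hand. The following is only a sketch: the array names FILE_COMMANDS and DIRECTORY_COMMANDS are hypothetical, and associative arrays need Bash 4.0 or later.

```shell
#!/usr/bin/env bash
# Hypothetical lookup tables: action word -> existing command, per object.
declare -A FILE_COMMANDS=(
  [list]="ls" [create]="touch" [delete]="rm" [show]="cat"
)
declare -A DIRECTORY_COMMANDS=(
  [list]="ls" [create]="mkdir" [delete]="rmdir" [show]="ls"
)

# Define <action>file for every action word, e.g. createfile -> touch.
for action in "${!FILE_COMMANDS[@]}"; do
  alias "${action}file=${FILE_COMMANDS[$action]}"
done

# Define <action>directory for every action word, e.g. createdirectory -> mkdir.
for action in "${!DIRECTORY_COMMANDS[@]}"; do
  alias "${action}directory=${DIRECTORY_COMMANDS[$action]}"
done
```

Adding a new object, or the same table in another language, then only means adding another array and another loop.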

However, long commands like listdirectory and createfile may not appeal to everyone, since these words do not exist in any dictionary. We wish to create commands that are very close to human language.

So, the second approach is to separate action words and objects, so that the user can run the following type of commands.

create directory dir1
show directory dir1
delete directory dir1
create file file1
show file file1
...

For this purpose, we may use bash functions and aliases.


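          # Dispatch on the object word, then pass the remaining arguments on.
          function deleteaction() {
            if [[ $1 == "file" ]]
            then
              shift
              # Quote "$@" so that names containing spaces stay intact.
              rm "$@"
            elif [[ $1 == "directory" ]]
            then
              shift
              rmdir "$@"
            fi
          }

          alias delete="deleteaction"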

The above code shows how an alias is used to call a function. Though it only shows how to delete files and directories, the same pattern can be used for working with network connections, processes, etc. But what if we want to repeat this for the French language?


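          # Same dispatch as the English version, with French words.
          function supprimeraction() {
            if [[ $1 == "fichier" ]]
            then
              shift
              rm "$@"
            elif [[ $1 == "répertoire" ]]
            then
              shift
              rmdir "$@"
            fi
          }

          # The alias must point at the function, not at itself.
          alias supprimer="supprimeraction"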

As you can see, the code is similar to the English version; we have only replaced the action words and objects with their French translations. Now we can run the following commands in French.

supprimer répertoire rép1
supprimer fichier f1

One may ask why the complete word needs to be written at all. Why not write commands in the following way?

s r rép1
s f f1

The above code can be easily modified to obtain these commands.
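
One possible modification, given here only as a sketch, is to let each branch accept either the full word or its first letter, and to add a one-letter entry point. It uses plain functions instead of the alias used earlier, and the error message is an assumption of this example.

```shell
#!/usr/bin/env bash
# Accept both the full object word and its one-letter short form.
supprimer() {
  local object=${1:-}
  # Drop the object word so that only the targets remain in "$@".
  (( $# > 0 )) && shift
  case $object in
    fichier|f)
      rm -- "$@"
      ;;
    répertoire|r)
      rmdir -- "$@"
      ;;
    *)
      printf 'supprimer: objet inconnu: %s\n' "$object" >&2
      return 1
      ;;
  esac
}

# One-letter form of the action word itself.
s() {
  supprimer "$@"
}
```

Both supprimer répertoire rép1 and s r rép1 then end up calling rmdir on rép1. Short forms are convenient but increase the risk of clashing with existing one-letter commands or aliases, so the choice of letters would need care.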

Finally, how about the commands in Figure 2, where the object comes first. For this purpose, let's check these commands in the Malayalam language, where the object comes first and the action word comes later. For example, we want to have the following commands for creating and deleting directories.

ഡയറക്ടറി സൃഷ്ടിക്കുക ഡ1
ഡയറക്ടറി ഇല്ലാതാക്കുക ഡ1

To be able to run the above commands, the above code has been modified. As can be seen, all relevant operations related to directory have been grouped under a single function.


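          # Object-first order: dispatch on the action word that follows.
          function ഡയറക്ടറിപ്രവർത്തനങ്ങൾ() {
            if [[ $1 == "സൃഷ്ടിക്കുക" ]]
            then
              shift
              # Quote "$@" so that names containing spaces stay intact.
              mkdir "$@"
            elif [[ $1 == "ഇല്ലാതാക്കുക" ]]
            then
              shift
              rmdir "$@"
            fi
          }

          alias ഡയറക്ടറി="ഡയറക്ടറിപ്രവർത്തനങ്ങൾ"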
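
The English, French and Malayalam functions above share one shape: look at the first two words, pick an existing command, and pass the rest along. As a rough sketch of how this could be generalized, the two-word phrases can be kept in a single table so that verb-first and object-first languages are handled by the same code. The names ACTIONS and dispatch are hypothetical, and the table is deliberately tiny.

```shell
#!/usr/bin/env bash
# Hypothetical phrase table: "<first word> <second word>" -> existing command.
# Word order is simply whatever order the language uses.
declare -A ACTIONS=(
  ["create directory"]="mkdir"
  ["delete directory"]="rmdir"
  ["delete file"]="rm"
  ["supprimer répertoire"]="rmdir"
  ["supprimer fichier"]="rm"
  ["ഡയറക്ടറി സൃഷ്ടിക്കുക"]="mkdir"
  ["ഡയറക്ടറി ഇല്ലാതാക്കുക"]="rmdir"
)

# Look up the first two words and run the mapped command on the rest.
dispatch() {
  local first=${1:-} second=${2:-}
  local target=${ACTIONS["$first $second"]:-}
  if [[ -z $target ]]; then
    printf 'unknown command: %s %s\n' "$first" "$second" >&2
    return 1
  fi
  shift 2
  "$target" "$@"
}

# One thin wrapper per leading word, whether it is a verb or an object.
create()    { dispatch create "$@"; }
delete()    { dispatch delete "$@"; }
supprimer() { dispatch supprimer "$@"; }
ഡയറക്ടറി() { dispatch ഡയറക്ടറി "$@"; }
```

Supporting another language then means adding rows to the table and one wrapper per leading word, without writing a new if/elif chain each time.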

Another question that remains to be answered is how to ensure that the arguments (options) to the commands are also taken into account. For example, a person may want to pass -R (recursive) to traverse a directory recursively. Such options still work with the functions and aliases above, because the remaining arguments are passed through unchanged, but the functions need some modification if we want to support translations of the options as well.
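
As a hedged illustration of what such a modification might look like, the sketch below rewrites translated option names into the real flags before calling the underlying command. The French option names --récursif and --tout, the table OPTION_TRANSLATIONS and the function names are all assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical French option names mapped to real ls flags.
declare -A OPTION_TRANSLATIONS=(
  ["--récursif"]="-R"
  ["--tout"]="-a"
)

# Copy the arguments into the global array TRANSLATED_ARGS, replacing
# every known translated option and leaving everything else untouched.
translate_options() {
  TRANSLATED_ARGS=()
  local arg
  for arg in "$@"; do
    if [[ -n $arg ]] && [[ -n ${OPTION_TRANSLATIONS[$arg]:-} ]]; then
      TRANSLATED_ARGS+=("${OPTION_TRANSLATIONS[$arg]}")
    else
      TRANSLATED_ARGS+=("$arg")
    fi
  done
}

# French "list" command that accepts both translated and original options.
lister() {
  translate_options "$@"
  ls "${TRANSLATED_ARGS[@]}"
}
```

Here lister --récursif dir1 would run ls -R dir1, and untranslated flags such as -l pass through as before. One limitation of this simple approach is that a file whose name happens to equal a translated option would be rewritten as well.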

Conclusion

There are several possible ways to build a multilingual command line. This article presented the possible solutions and focused in detail on one in which existing commands need not be modified or translated. This transparent solution may help reduce the learning curve for students who are new to the command line and have just started their studies in computer science. However, a detailed user evaluation is still required to understand whether these changes are indeed helpful to non-English-speaking terminal users.

Another major challenge is to encourage application developers to think multilingual by design, so that users do not face language barriers when using their applications.

References

1. Coding Is for Everyone—as Long as You Speak English
2. Rethinking the command-line
3. Bash
4. Ubiquity
5. Ubiquity: Designing a Multilingual Natural Language Interface, Michael Yoshitaka Erlewine, SIGIR Workshop on Information Access in a Multilingual World, July 23, 2009
6. Imperative Verbs
7. Imperative mood
8. .bashrc
9. Bash Startup Files