Implementing a Custom Language Succinctly®
by Vassili Kaplan

CHAPTER 7

Localization


“Programs must be written for people to read, and only incidentally for machines to execute.”

Abelson & Sussman

In this chapter we are going to see how to write CSCS programs so that they can be localized in any human language. You’ll see how we can supply keyword translations in a configuration file so that the keywords from different languages can be used interchangeably. Not only that, we’ll also see how to translate a CSCS program written with the keywords in one human language to another.

Adding translations for the keywords

To add translations for keywords, we use the configuration file. You can start with the one that is automatically created by Visual Studio (or Xamarin Studio); it is usually called App.config. Code Listing 68 shows an excerpt from the CSCS configuration file.

Code Listing 68: An excerpt from the configuration file

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <configSections>
    <section name="Languages"
             type="System.Configuration.NameValueSectionHandler" />
    <section name="Synonyms"
             type="System.Configuration.NameValueSectionHandler" />
    <section name="Spanish"
             type="System.Configuration.NameValueSectionHandler" />
    <section name="Russian"
             type="System.Configuration.NameValueSectionHandler" />
  </configSections>
  <appSettings>

    <add key="maxLoops" value="100000" />
    <add key="dictionaryPath" value="scripts/" />
    <add key="errorsPath" value="scripts/errors.txt" />
    <add key="language" value="ru" />
  </appSettings>

  <Languages>
    <add key="languages" value="Synonyms,Spanish,Russian" />
  </Languages>
  <Synonyms>
    <add key="copy"     value ="cp" />
    <add key="del"      value ="rm" />
    <add key="dir"      value ="ls" />

    <add key="include"  value ="import" />
    <add key="move"     value ="mv" />
    <add key="print"    value ="writenl" />
    <add key="read"     value ="scan" />
  </Synonyms>
  <Spanish>
    <add key="if"       value ="si" />
    <add key="else"     value ="sino" />
    <add key="for"      value ="para" />
    <add key="while"    value ="mientras" />
    <add key="function" value ="función" />
    <add key="size"     value ="tamaño" />
    <add key="print"    value ="imprimir" />
  </Spanish>
  <Russian>
    <add key="if"       value ="если" />
    <add key="else"     value ="иначе" />
    <add key="for"      value ="для" />
    <add key="while"    value ="пока" />
    <add key="return"   value ="вернуть" />
    <add key="function" value ="функция" />
    <add key="include"  value ="включить" />

    <add key="return"   value ="вернуться" />
  </Russian>

</configuration>

The configuration file contains translations of the keywords to Spanish and Russian, plus a “Synonyms” section that lets us define alternative English names for keywords. Anyone who has worked with both Windows and macOS (or any other Unix-based system) has probably confused dir with ls, copy with cp, or findstr with grep. The Synonyms section in the configuration file allows such names to be used interchangeably.

To make it work, we introduce a new module, Translation, to support all of the localization concepts. For each keyword that may have a translation in the configuration file (a translation is optional), we call the static Translation.Add method. See the implementation of the Interpreter.ReadConfig method and a short version of the Translation.Add method in Code Listing 69.

Code Listing 69: The implementation of the Interpreter.ReadConfig and Translation.Add methods

void ReadConfig()
{

  // Read the section once; the cast yields null if the section is missing.
  var languagesSection = ConfigurationManager.GetSection("Languages") as
                         NameValueCollection;
  if (languagesSection == null || languagesSection.Count == 0) {
    return;
  }

  string errorsPath = ConfigurationManager.AppSettings["errorsPath"];
  Translation.Language = ConfigurationManager.AppSettings["language"];
  Translation.LoadErrors(errorsPath);

  string dictPath = ConfigurationManager.AppSettings["dictionaryPath"];
  string baseLanguage = Constants.ENGLISH;
  string languages = languagesSection["languages"];
  string[] supportedLanguages = languages.Split(",".ToCharArray());

  foreach(string lang in supportedLanguages) {
    string language = Constants.Language(lang);
    Dictionary<string, string> tr1 =
        Translation.KeywordsDictionary(baseLanguage, language);
    Dictionary<string, string> tr2 =
        Translation.KeywordsDictionary(language, baseLanguage);

    Translation.TryLoadDictionary(dictPath, baseLanguage, language);
    var languageSection = ConfigurationManager.GetSection(lang) as
                          NameValueCollection;

    Translation.Add(languageSection, Constants.IF, tr1, tr2);
    Translation.Add(languageSection, Constants.FOR, tr1, tr2);
    Translation.Add(languageSection, Constants.WHILE, tr1, tr2);
    Translation.Add(languageSection, Constants.BREAK, tr1, tr2);
    Translation.Add(languageSection, Constants.CONTINUE, tr1, tr2);

    // More keywords go here...

    // Special dealing for else, catch, etc. since they are not separate
    // functions but are part of the if and try statement blocks.
    Translation.AddSubstatement(languageSection, Constants.ELSE,
                                Constants.ELSE_LIST, tr1, tr2);
    Translation.AddSubstatement(languageSection, Constants.ELSE_IF,
                                Constants.ELSE_IF_LIST, tr1, tr2);
    Translation.AddSubstatement(languageSection, Constants.CATCH,
                                Constants.CATCH_LIST, tr1, tr2);
  }
}

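// Simplified definition of the Translation.Add method (the full version
// is in Code Listing 71):
public static void Add(NameValueCollection langDictionary, string origName,
       Dictionary<string, string> tr1, Dictionary<string, string> tr2)
{
  // Look up the translation of the keyword in the configuration section.
  string translation = langDictionary[origName];
  ParserFunction origFunction = ParserFunction.GetFunction(origName);
  ParserFunction.RegisterFunction(translation, origFunction);
}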

The Translation.Add method takes a function that is already registered with the parser under the origName string and registers it again under a new name (translation). Therefore, whether the parser gets the origName token or the translation token, the same function is invoked.
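The idea of one function being reachable under several names can be tried outside the parser. The following is a minimal sketch, not the CSCS implementation: AliasRegistry, Register, AddAlias, and Call are hypothetical names, and a plain delegate stands in for ParserFunction.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the parser's function registry; the real
// ParserFunction class in CSCS is richer than this.
public static class AliasRegistry
{
    private static readonly Dictionary<string, Func<string, string>> s_functions =
        new Dictionary<string, Func<string, string>>();

    public static void Register(string name, Func<string, string> function)
    {
        s_functions[name] = function;
    }

    // Registers an already known function under one more name.
    public static void AddAlias(string origName, string translation)
    {
        Func<string, string> function;
        if (!s_functions.TryGetValue(origName, out function)) {
            throw new ArgumentException("Unknown function [" + origName + "]");
        }
        s_functions[translation] = function;
    }

    public static string Call(string name, string arg)
    {
        return s_functions[name](arg);
    }
}

public static class AliasDemo
{
    public static void Main()
    {
        AliasRegistry.Register("print", text => "printed: " + text);
        AliasRegistry.AddAlias("print", "imprimir");

        // Both names resolve to the same delegate.
        Console.WriteLine(AliasRegistry.Call("print", "hola"));
        Console.WriteLine(AliasRegistry.Call("imprimir", "hola"));
    }
}
```

Because both dictionary entries point to the same delegate, there is no second copy of the function to keep in sync.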

Code Listing 70 contains an example of using the keywords in Spanish in CSCS. The CSCS code there contains a size function (“tamaño” in Spanish). We haven’t shown the implementation of the size function, but it just returns the number of elements in an array.

Code Listing 70: An example of using Spanish keywords in CSCS

números = {"uno", "dos", "tres", "cuatro", "cinco", "seis"};
para (i = 1; i <= tamaño(números); i++) {
  si (i % 2 == 0) {
    imprimir(números[i - 1], " es par");
  } sino {
    imprimir(números[i - 1], " es impar");
  }
}

// Output:
uno es impar
dos es par
tres es impar
cuatro es par
cinco es impar
seis es par

Adding translations for function bodies

Now let’s see how to teach the parser about translations of words other than CSCS keywords. The full version of the Translation.Add method, together with the data structures it relies on, is in Code Listing 71.

Code Listing 71: Adding translations to the parser

public class Translation
{

  private static HashSet<string> s_nativeWords = new HashSet<string>();
  private static HashSet<string> s_tempWords   = new HashSet<string>();

  private static Dictionary<string, string>
    s_spellErrors  = new Dictionary<string, string>();
  private static Dictionary<string, Dictionary<string, string>>
    s_keywords     = new Dictionary<string, Dictionary<string, string>>();
  private static Dictionary<string, Dictionary<string, string>>
    s_dictionaries = new Dictionary<string, Dictionary<string, string>>();
  private static Dictionary<string, Dictionary<string, string>>
    s_errors       = new Dictionary<string, Dictionary<string, string>>();

  // The default user language. Can be changed in settings.
  private static string s_language = Constants.ENGLISH;
  public static string Language { set { s_language = value; } }

  public static void Add(NameValueCollection langDictionary,
                         string origName,
                         Dictionary<string, string> translations1,
                         Dictionary<string, string> translations2) {
    AddNativeKeyword(origName);

    string translation = langDictionary[origName];
    if (string.IsNullOrWhiteSpace(translation)) {
      // No translation is provided for this function.
      translations1[origName] = origName;
      translations2[origName] = origName;
      return;
    }

    AddNativeKeyword(translation);
    translations1[origName] = translation;
    translations2[translation] = origName;

    // IndexOfAny returns -1 when nothing is found, so test for >= 0
    // to also catch a white space at the very first position.
    if (translation.IndexOfAny((" \t\r\n").ToCharArray()) >= 0) {
      throw new ArgumentException("Translation of [" + translation +
                                  "] contains white spaces");
    }
    ParserFunction origFunction = ParserFunction.GetFunction(origName);
    Utils.CheckNotNull(origName, origFunction);
    ParserFunction.RegisterFunction(translation, origFunction);

    // Also add the translation to the list of functions after which
    // there can be a space (besides a parenthesis).
    if (Constants.FUNCT_WITH_SPACE.Contains(origName)) {
        Constants.FUNCT_WITH_SPACE.Add(translation);
    }
    if (Constants.FUNCT_WITH_SPACE_ONCE.Contains(origName)) {
        Constants.FUNCT_WITH_SPACE_ONCE.Add(translation);
    }
  }

  

  public static void AddNativeKeyword(string word) {
    s_nativeWords.Add(word);
    AddSpellError(word);
  }


  public static void AddTempKeyword(string word) {
    s_tempWords.Add(word);
    AddSpellError(word);
  }
  

  public static void AddSpellError(string word) {
    if (word.Length > 2) {
      s_spellErrors[word.Substring(0, word.Length - 1)] = word;
      s_spellErrors[word.Substring(1)] = word;
    }
  }

}

For each pair of languages we have two dictionaries, one for each direction of translation. In addition to the keywords, you can add translations of any other words in the configuration file. We skip the code that loads these custom translations, but you can find it in the accompanying source code.
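The two-dictionaries-per-language-pair layout could look roughly like the sketch below. It is an illustration under assumptions: the KeywordTables class, its Get and AddPair methods, and the "from-->to" key format are all hypothetical and are not taken from the CSCS sources.

```csharp
using System;
using System.Collections.Generic;

public static class KeywordTables
{
    // Outer key: a language pair; inner dictionary: word -> translated word.
    private static readonly Dictionary<string, Dictionary<string, string>> s_keywords =
        new Dictionary<string, Dictionary<string, string>>();

    // Returns the dictionary for one direction, creating it on first use.
    public static Dictionary<string, string> Get(string fromLang, string toLang)
    {
        string key = fromLang + "-->" + toLang; // assumed key format
        Dictionary<string, string> dict;
        if (!s_keywords.TryGetValue(key, out dict)) {
            dict = new Dictionary<string, string>();
            s_keywords[key] = dict;
        }
        return dict;
    }

    // Stores a translation in both directions at once.
    public static void AddPair(string lang1, string word1,
                               string lang2, string word2)
    {
        Get(lang1, lang2)[word1] = word2;
        Get(lang2, lang1)[word2] = word1;
    }
}

public static class KeywordTablesDemo
{
    public static void Main()
    {
        KeywordTables.AddPair("en", "while", "es", "mientras");
        Console.WriteLine(KeywordTables.Get("en", "es")["while"]);
        Console.WriteLine(KeywordTables.Get("es", "en")["mientras"]);
    }
}
```

Keeping a separate dictionary per direction makes each lookup a single hash access, at the cost of storing every pair twice.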

Code Listing 72 shows the implementation of the TranslateFunction, which translates any custom function to the language supplied.

Code Listing 72: The implementation of the function translations

class TranslateFunction : ParserFunction
{
  protected override Variable Evaluate (ParsingScript script)
  {
    string language = Utils.GetToken(script, Constants.TOKEN_SEPARATION);
    string funcName = Utils.GetToken(script, Constants.TOKEN_SEPARATION);

    ParserFunction function = ParserFunction.GetFunction(funcName);
    CustomFunction custFunc = function as CustomFunction;
    Utils.CheckNotNull(funcName, custFunc);

    string body = Utils.BeautifyScript(custFunc.Body, custFunc.Header);
    string translated = Translation.TranslateScript(body, language);
    Translation.PrintScript(translated);

    return new Variable(translated);
  }

}

public class Translation
{

  public static string TranslateScript(string script, string fromLang,
                                       string toLang) {
    StringBuilder result = new StringBuilder();
    StringBuilder item = new StringBuilder();

    Dictionary<string, string> keywordsDict =
                               KeywordsDictionary(fromLang, toLang);
    Dictionary<string, string> transDict =
                               TranslationDictionary(fromLang, toLang);
    bool inQuotes = false;

    for (int i = 0; i < script.Length; i++) {
      char ch = script [i];
      inQuotes = ch == Constants.QUOTE ? !inQuotes : inQuotes;

      if (inQuotes) {
        result.Append (ch);
        continue;
      }


      if (!Constants.TOKEN_SEPARATION.Contains(ch)) {
        item.Append(ch);
        continue;
      }


      if (item.Length > 0) {
        string token = item.ToString();
        string translation = string.Empty;
        if (toLang == Constants.ENGLISH) {
          ParserFunction func = ParserFunction.GetFunction(token);
          if (func != null) {
            translation = func.Name;
          }
        }


        if (string.IsNullOrEmpty(translation) &&
            !keywordsDict.TryGetValue(token, out translation) &&
            !transDict.TryGetValue(token, out translation)) {
          translation = token;
        }
        result.Append(translation);
        item.Clear();
      }


      result.Append(ch);
    }


    return result.ToString();
  }

}
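The token-by-token loop of TranslateScript can be tried in isolation. The sketch below is a simplified variant, not the CSCS code: MiniTranslator and its separator list are assumptions, only one dictionary is consulted, and, unlike Code Listing 72, it also flushes a token that ends the script without a trailing separator.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class MiniTranslator
{
    // Assumed separator set; CSCS keeps its own in Constants.TOKEN_SEPARATION.
    private const string Separators = " \t\r\n(){}[];,+-*/%=<>!&|";

    public static string Translate(string script,
                                   Dictionary<string, string> dict)
    {
        StringBuilder result = new StringBuilder();
        StringBuilder item = new StringBuilder();
        bool inQuotes = false;

        foreach (char ch in script) {
            if (ch == '"') {
                Flush(item, result, dict);
                inQuotes = !inQuotes;
                result.Append(ch);
                continue;
            }
            if (inQuotes) {          // String literals are copied verbatim.
                result.Append(ch);
                continue;
            }
            if (Separators.IndexOf(ch) < 0) {
                item.Append(ch);     // Still inside a token.
                continue;
            }
            Flush(item, result, dict);
            result.Append(ch);
        }
        Flush(item, result, dict);   // A token may end the script.
        return result.ToString();
    }

    // Appends the translation of the collected token, or the token itself.
    private static void Flush(StringBuilder item, StringBuilder result,
                              Dictionary<string, string> dict)
    {
        if (item.Length == 0) {
            return;
        }
        string token = item.ToString();
        string translation;
        result.Append(dict.TryGetValue(token, out translation) ? translation : token);
        item.Clear();
    }
}

public static class MiniTranslatorDemo
{
    public static void Main()
    {
        var toEnglish = new Dictionary<string, string> {
            { "si", "if" }, { "sino", "else" }, { "imprimir", "print" }
        };
        Console.WriteLine(MiniTranslator.Translate(
            "si (x) { imprimir(\"si\"); } sino { }", toEnglish));
    }
}
```

Note that the word si inside the string literal is left alone, while the keyword si outside the quotes is translated.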

Figure 7 shows an example of running the show and translate commands in CSCS. The different colors you see there are produced by the Translation.PrintScript method. This is where the Variable.IsNative property is used: all “native” functions (implemented in C#) are printed in one color, and all other functions and variables (implemented in CSCS) are printed in another. The implementation of the show function is very similar to that of the translate function in Code Listing 72, so we skip it.

Figure 7: Show and translate functions run in CSCS

Adding translations for error messages

Now let’s see how to add translations to the parser for the error messages. We’ll also suggest corrections for possible spelling errors. Here we use a simplified version that only detects an incorrect, missing, or extra first or last letter of a word. Code Listing 73 shows the loading of error messages in different languages, and a sample file in English and German.
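The simplified spelling check can be sketched as follows. The AddWord method mirrors what AddNativeKeyword and AddSpellError do in Code Listing 71; the lookup side is an assumption, since the implementation of Translation.TryFindError is not shown in this chapter, and SpellHints and FindCandidate are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class SpellHints
{
    private static readonly HashSet<string> s_words = new HashSet<string>();
    private static readonly Dictionary<string, string> s_spellErrors =
        new Dictionary<string, string>();

    // Remembers the word and its two "one letter missing" variants.
    public static void AddWord(string word)
    {
        s_words.Add(word);
        if (word.Length > 2) {
            s_spellErrors[word.Substring(0, word.Length - 1)] = word;
            s_spellErrors[word.Substring(1)] = word;
        }
    }

    // Hypothetical lookup: tries the token as typed, then without its
    // last letter, then without its first letter.
    public static string FindCandidate(string token)
    {
        if (string.IsNullOrEmpty(token)) {
            return null;
        }
        string[] forms = {
            token,
            token.Substring(0, token.Length - 1),
            token.Substring(1)
        };
        foreach (string form in forms) {
            string candidate;
            if (s_words.Contains(form)) {
                return form;        // Extra letter at one end.
            }
            if (s_spellErrors.TryGetValue(form, out candidate)) {
                return candidate;   // Missing or wrong letter at one end.
            }
        }
        return null;
    }
}

public static class SpellHintsDemo
{
    public static void Main()
    {
        SpellHints.AddWord("fibonacci");
        Console.WriteLine(SpellHints.FindCandidate("fibonaccii")); // fibonacci
        Console.WriteLine(SpellHints.FindCandidate("ibonacci"));   // fibonacci
    }
}
```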

Code Listing 73: Loading error messages in different languages

public class Translation
{

  public static void LoadErrors(string filename)
  {
    if (!File.Exists(filename)) {
      return;
    }


    Dictionary<string, string> dict = GetDictionary(Constants.ENGLISH,
                                                    s_errors);
    string[] lines = Utils.GetFileLines(filename);
    foreach (string line in lines) {
      // Split on the first '=' only, so that a message may itself
      // contain an equals sign.
      string[] tokens = line.Split("=".ToCharArray(), 2,
                                   StringSplitOptions.RemoveEmptyEntries);
      if (tokens.Length < 1 || tokens[0].StartsWith("#")) {
        continue;
      }
      if (tokens.Length == 1) {
        // A line without '=' names the language of the entries below.
        dict = GetDictionary(tokens[0].Trim(), s_errors);
        continue;
      }
      dict[tokens[0].Trim()] = tokens[1].Trim();
    }
    }
  }

}

// Sample contents of the errors.txt file in English and German.

en
parseToken        = Couldn't parse [{0}] (not registered as a function).
parseTokenExtra   = Did you mean [{0}]?
errorLine         = Line {0}: [{1}]
errorFile         = File: {0}.

de
parseToken        = [{0}] konnte nicht analysiert werden (nicht als Funktion registriert).
parseTokenExtra   = Meinen Sie [{0}]?
errorLine         = Zeile {0}: [{1}]
errorFile         = Datei: {0}.

Code Listing 8 uses the Utils.ThrowException function, which throws an exception whose message is in the language configured as the user language in the configuration file. The implementation of the Utils.ThrowException function is in Code Listing 74.

Code Listing 74: Implementations of Utils.ThrowException and Translation.GetErrorString

public class Utils
{

  public static void ThrowException(ParsingScript script, string excName1,
                         string errorToken = "", string excName2 = "")
  {
    string msg = Translation.GetErrorString(excName1);

    if (!string.IsNullOrWhiteSpace(errorToken)) {
      msg = string.Format(msg, errorToken);
      string candidate = Translation.TryFindError(errorToken, script);

      if (!string.IsNullOrWhiteSpace(candidate) &&
          !string.IsNullOrWhiteSpace(excName2)) {
        string extra = Translation.GetErrorString(excName2);
        msg += " " + string.Format(extra, candidate);
      }
    }

    if (!string.IsNullOrWhiteSpace(script.Filename)) {
      string fileMsg = Translation.GetErrorString("errorFile");
      msg += Environment.NewLine + string.Format(fileMsg, script.Filename);
    }

    int lineNumber = -1;
    string line = script.GetOriginalLine(out lineNumber);
    if (lineNumber > 0) {
      string lineMsg = Translation.GetErrorString("errorLine");
      msg += string.IsNullOrWhiteSpace(script.Filename) ?
                                    Environment.NewLine : " ";
      msg += string.Format(lineMsg, lineNumber + 1, line.Trim());
    }
    throw new ArgumentException(msg);
  }

}

public class Translation
{

  public static string GetErrorString(string key)
  {
    string result = null;
    Dictionary<string, string> dict = GetDictionary(s_language, s_errors);
    if (dict.TryGetValue (key, out result)) {
        return result;
    }
    if (s_language != Constants.ENGLISH) {
      dict = GetDictionary(Constants.ENGLISH, s_errors);
      if (dict.TryGetValue(key, out result)) {
        return result;
      }
    }
    return key;
  }

}
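The errors file format and the fallback to English can be exercised together in a few lines. This is a sketch under assumptions: ErrorCatalog, Load, and Get are hypothetical names, the messages come from an in-memory list instead of a file, and "en" is assumed to be the code of the fallback language.

```csharp
using System;
using System.Collections.Generic;

public static class ErrorCatalog
{
    // language code -> (message key -> message template)
    private static readonly Dictionary<string, Dictionary<string, string>> s_errors =
        new Dictionary<string, Dictionary<string, string>>();

    private static Dictionary<string, string> GetDictionary(string language)
    {
        Dictionary<string, string> dict;
        if (!s_errors.TryGetValue(language, out dict)) {
            dict = new Dictionary<string, string>();
            s_errors[language] = dict;
        }
        return dict;
    }

    public static void Load(IEnumerable<string> lines)
    {
        Dictionary<string, string> dict = GetDictionary("en");
        foreach (string line in lines) {
            string trimmed = line.Trim();
            if (trimmed.Length == 0 || trimmed[0] == '#') {
                continue;                       // Blank line or comment.
            }
            int eq = trimmed.IndexOf('=');
            if (eq < 0) {
                dict = GetDictionary(trimmed);  // A language header such as "de".
                continue;
            }
            dict[trimmed.Substring(0, eq).Trim()] = trimmed.Substring(eq + 1).Trim();
        }
    }

    // Looks up the key in the given language, then in English,
    // and finally returns the key itself.
    public static string Get(string language, string key)
    {
        string result;
        if (GetDictionary(language).TryGetValue(key, out result)) {
            return result;
        }
        if (GetDictionary("en").TryGetValue(key, out result)) {
            return result;
        }
        return key;
    }
}

public static class ErrorCatalogDemo
{
    public static void Main()
    {
        ErrorCatalog.Load(new[] {
            "en",
            "parseTokenExtra = Did you mean [{0}]?",
            "errorFile = File: {0}.",
            "de",
            "parseTokenExtra = Meinen Sie [{0}]?"
        });
        // Found in German:
        Console.WriteLine(string.Format(
            ErrorCatalog.Get("de", "parseTokenExtra"), "fibonacci"));
        // Missing in German, so the English text is used:
        Console.WriteLine(string.Format(
            ErrorCatalog.Get("de", "errorFile"), "test.cscs"));
    }
}
```

Returning the key itself as the last resort means that a missing translation degrades to a terse but still identifiable message instead of an exception.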

Consider the following script with a typo in “fibonacci” (an additional “i” at the end):

  b = 10;

  c = fibonaccii(b);

Here is what our parser prints when running that script with the user language set to German (the “language” parameter in the configuration file from Code Listing 68, set to de instead of the ru shown there) and with the errors.txt file from Code Listing 73 loaded:

[fibonaccii] konnte nicht analysiert werden (nicht als Funktion registriert).

Meinen Sie [fibonacci]?

Zeile 2: [c = fibonaccii(b);]

You can also catch more advanced spelling errors, beyond problems with the first and last letters, for example by using the Levenshtein distance[15] between strings.
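As an illustration of that suggestion, here is a minimal sketch of the classic dynamic-programming Levenshtein distance together with a helper that picks the closest known word. It is not part of CSCS; the Levenshtein class, the Closest helper, and the maxDistance threshold are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class Levenshtein
{
    // Number of single-character insertions, deletions, and substitutions
    // needed to turn a into b.
    public static int Distance(string a, string b)
    {
        int[,] d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) {
            d[i, 0] = i;
        }
        for (int j = 0; j <= b.Length; j++) {
            d[0, j] = j;
        }
        for (int i = 1; i <= a.Length; i++) {
            for (int j = 1; j <= b.Length; j++) {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1,      // deletion
                             d[i, j - 1] + 1),     // insertion
                    d[i - 1, j - 1] + cost);       // substitution
            }
        }
        return d[a.Length, b.Length];
    }

    // Returns the known word closest to the token, or null if none is
    // within maxDistance edits.
    public static string Closest(string token, IEnumerable<string> words,
                                 int maxDistance)
    {
        string best = null;
        int bestDistance = maxDistance + 1;
        foreach (string word in words) {
            int distance = Distance(token, word);
            if (distance < bestDistance) {
                best = word;
                bestDistance = distance;
            }
        }
        return best;
    }
}

public static class LevenshteinDemo
{
    public static void Main()
    {
        string[] functions = { "factorial", "fibonacci", "print" };
        Console.WriteLine(Levenshtein.Closest("fibonaccii", functions, 2)); // fibonacci
    }
}
```

This version costs time and memory proportional to the product of the two string lengths for every candidate, which is acceptable for a few hundred registered function names but would need pruning for a large vocabulary.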

Getting line numbers where errors occur

How do we know which line the error occurred on? (“Zeile 2” means “Line 2” in German.) Most of the information is already in the char2Line data structure that was loaded in the Utils.ConvertToScript method (see Code Listing 42). But when the error occurs we only know the character position at which parsing stopped, so we still need to map that position to a line. Code Listing 75 does this with a binary search.

I am not particularly proud of this method of finding the line numbers (even though it works), so hopefully you can come up with a better idea.

Code Listing 75: Implementation of ParsingScript.GetOriginalLineNumber function

public int GetOriginalLineNumber()
{
  if (m_char2Line == null || m_char2Line.Count == 0) {
    return -1;
  }

  int pos = m_scriptOffset + m_from;
  List<int> lineStart = m_char2Line.Keys.ToList();
  int lower = 0;
  int index = lower;

  if (pos <= lineStart[lower]) { // First line.
    return m_char2Line[lineStart[lower]];
  }


  int upper = lineStart.Count - 1;
  if (pos >= lineStart[upper]) { // Last line.
    return m_char2Line[lineStart[upper]];
  }

  while (lower <= upper) {
    index = (lower + upper) / 2;
    int guessPos = lineStart[index];
    

    if (pos == guessPos) {
      break;
    }


    if (pos < guessPos) {
      if (index == 0 || pos > lineStart[index - 1]) {
        break;
      }
      upper = index - 1;
    } else {
      lower = index + 1;
    }
  }

  return m_char2Line[lineStart[index]];
}
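One possible alternative is to let the framework do the search. The sketch below uses List<int>.BinarySearch, which returns the index of a match or, when there is none, the bitwise complement of the index of the next larger element. It assumes, as Code Listing 75 appears to, that each key of char2Line is the last character position belonging to a line; LineLookup and FindLine are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LineLookup
{
    // char2Line maps the last character position of each line to its
    // line number (an assumption about the data layout).
    public static int FindLine(Dictionary<int, int> char2Line, int pos)
    {
        if (char2Line == null || char2Line.Count == 0) {
            return -1;
        }
        // Dictionary keys have no guaranteed order, so sort them first.
        // Real code would build this list once and cache it.
        List<int> positions = char2Line.Keys.OrderBy(p => p).ToList();

        int index = positions.BinarySearch(pos);
        if (index < 0) {
            index = ~index;                 // First position larger than pos.
        }
        if (index >= positions.Count) {
            index = positions.Count - 1;    // Past the end: use the last line.
        }
        return char2Line[positions[index]];
    }
}

public static class LineLookupDemo
{
    public static void Main()
    {
        var char2Line = new Dictionary<int, int> {
            { 10, 0 }, { 20, 1 }, { 30, 2 }
        };
        Console.WriteLine(LineLookup.FindLine(char2Line, 15)); // 1
        Console.WriteLine(LineLookup.FindLine(char2Line, 99)); // 2
    }
}
```

The result matches the hand-written loop for positions before the first key, after the last key, exactly on a key, and between two keys, while leaving the index arithmetic to the library.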

Conclusion

In this chapter we saw how to write CSCS scripts with keywords in any human language by supplying the translations in a configuration file. We also saw how to report programming mistakes with error messages in different languages.

In the next chapter we are finally going to talk about testing and how to run CSCS from a shell prompt.
