left-icon

Implementing a Custom Language Succinctly®
by Vassili Kaplan

Previous
Chapter

of
A
A
A

CHAPTER 2

The Split-and-Merge Algorithm

The Split-and-Merge Algorithm


“To iterate is human, to recurse divine.”

L. Peter Deutsch

All the parsing of the CSCS language that we are going to develop in this book is based on the Split-and-Merge algorithm to parse a custom string. We will start with pure mathematical expressions, and then generalize them into any custom CSCS language expression.

Description of the Split-and-Merge algorithm

The description that I provide here is based on the Split-and-Merge algorithm previously published in the MSDN Magazine,[11] ACCU Overload,[12] and in the CODE Magazine.[13]

Note that in the discussion that follows, we suppose that the parsing string doesn’t contain any extra blank spaces, tabs, new lines, etc. If this is not the case, we need first to get rid of all these extra characters. (The code that gets rid of these extra characters is available in the accompanying source code).

We could have removed these unneeded blanks at the same time that we parsed the script, but this would have unnecessarily made the algorithm look more complex than it should be.

The algorithm consists of two steps: as the algorithm name implies, the first part is a Split, and the second one is a Merge.

Splitting an expression

In the first step, the string containing an expression is split into a list of so-called “variables.” Each variable consists, primarily, of a number or a string and an action that must be applied to it. The numbers are internally represented as doubles, so in particular, they can be integers or Booleans.

For numbers, the actions can be all possible operations we can do with the numbers, for example, +, , *, /, %, or ^ (power, 2^3 means 2 to the power of 3; that is, 8) and Boolean comparisons (&&, ||).

For strings, the actions can be a + (meaning string concatenation) or Boolean comparisons.

For convenience, we denote the action of the last variable in the list as ). It could’ve been any character, but we chose ) to be sure that it doesn’t represent any other action. We call this special action a “null” action. It has the lowest priority.

The separation criteria for splitting the string into tokens are mathematical and logical operators or parentheses. Note that the priority of the operators does not matter in the first step.

Let’s see an example of applying the first step to some expressions:

Split (“3+5”) = {Variable(3, “+“), Variable(5, “)“)}

Split (“11-2*5”) = {Variable(11, “-“), Variable(2, “*“), Variable(5, “)“)}

Easy! Now, what about parentheses? As soon as we get an opening parentheses in the expression string, we recursively apply the whole Split-and-Merge algorithm to the expression in parentheses and replace the expression in parentheses with the calculated result. For example:

Split (“3*(2+2)”) = {Variable(3, “*“),Variable(SplitAndMerge(“2+2”), “)“)} ={Variable(3, “*“), Variable(4, “)“)}

We apply the same procedure for any function in the expression to be parsed: first we apply recursively the whole Split-and-Merge algorithm to the function, replacing it with the calculated result.

For example, consider parsing an expression 1+sin(3-3). As soon as we get the opening parenthesis in the (3-3) sub-expression, we apply the whole algorithm to 3-3, which yields 0 as a result. Then our expression is reduced to 1+sin(0). Then we calculate the sine of 0, which is 0, reducing the expression to 1+0. Splitting this expression into a list of variables leads to two variables: Variable(1, “+”) and Variable (0, “)”) as a result.

Note that at the end of the first step, all of the parentheses and functions will be eliminated (by possibly recursive calls to the whole Split-and-Merge algorithm).

Merging the list of variables

The priorities of the actions start counting in the second step, when we merge the list of variables created in the previous step. Code Listing 1Error! Reference source not found. contains the priorities of some actions that I defined.

Code Listing 1: Priorities of the actions

static int GetPriority(string action)
{
  switch (action) {
    case "++":
    case "--"return 10;
    case "^" : return 9;
    case "%" :
    case "*" :
    case "/" : return 8;
    case "+" :
    case "-" : return 7;
    case "<" :
    case ">" :
    case ">=":
    case "<="return 6;
    case "==":
    case "!="return 5;
   case "&&"return 4;
    case "||"return 3;
    case "+=":
    case "-=":
    case "*=":
    case "/=":
    case "%=":
    case "=" : return 2;
  }
  return 0;
}

Note that the last variable action ) has the lowest priority of 0, since it isn’t included in the switch statement.

The list of variables is merged one by one, starting from the first variable in the list. Merging means applying the action of the left variable to the values of the left and right variables.

The resulting variable will have the same action as the right variable. The variables can only be merged if the priority of the action of the left variable is greater or equal than the priority of the action of the right variable. As soon as we have only one variable left, its value will be the final result of the whole calculation.

For example, since the priority of the + action is greater than the priority of the null action, we have:

Merge {Variable(3, “+“), Variable(5, “)“)} = Variable(3 + 5, “)”) = Variable(8, “)”) = 8.

Now what happens if two variables cannot be merged because the priority of the left variable is lower than the priority of the right variable? Then we temporarily move to the next-right variable, in order to try to merge it with the variable next to it, and so on, recursively. As soon as the right variable has been merged with the variable on its right, we return back to the original, left variable, and try to remerge it with the newly created right variable.

Eventually we will be able to merge the whole list, since sooner or later we will reach the last variable of the list that has the lowest priority, and therefore, can be merged with any variable on its left.

Let’s see an example of merging a list of variables (11, “-“), (2, “*“), (5, “)“) from the Split section. Note that the priority of * is greater than the priority of - and therefore, we cannot merge the first and second variables right away, but must first merge the second and the third variables:

Merge {Variable(11, “-“), Variable(2, “*“), Variable(5, “)“)} =

Merge {Variable(11, “-“), Merge {Variable(2, “*“), Variable(5, “)“)} =

Merge {Variable(11, “-“), Variable(2 * 5, “)“) } =

Merge {Variable(11, “-“), Variable(10, “)“) } =

Variable(11 - 10, “)“) } = Variable(1, “)“) } = 1.

The implementation of the Split

Let’s see the implementation of the Split-and-Merge algorithm in C#.

First we need to define a few auxiliary data structures. Code Listing 2 contains the ParsingScript class (the definition is incomplete there—we are going to see more functions and methods from the ParsingString class later on). It is a wrapper over the expression being parsed. It contains the actual string being parsed (m_data) and the pointer to the character where we currently are (m_from). It also has a few convenience functions for accessing and manipulating the string being parsed.

Code Listing 2: The ParsingScript class

public class ParsingScript
{
  private string m_data; // Contains the whole script.
  private int    m_from; // A pointer to the char being processed.

  public ParsingScript(string data, int from = 0)
  {
    m_data = data;
    m_from = from;

  }

  public char Current { get { return m_data[m_from]} }


  public bool StillValid()           { return m_from < m_data.Length; }  

  public char CurrentAndForward()    { return m_data[m_from++]}
  public void Forward(int delta = 1) { m_from += delta; }
  
  public string FromPrev(int backChars = 1,

                int maxChars = Constants.MAX_CHARS_TO_SHOW) {
    return Substr(m_from - backChars, maxChars);
  }

  public char TryCurrent() { 
    return m_from < m_data.Length ? m_data[m_from] : Constants.EMPTY; 
  }

  public char TryPrev() { 
    return m_from >1 ? m_data[m_from - 1] : Constants.EMPTY;
  }


  public char TryPrevPrev() {
    return m_from >2 ? m_data[m_from - 2] : Constants.EMPTY;
  }

  public void MoveForwardIf(char expected,

                            char expected2 = Constants.EMPTY) {
    if (StillValid() && (Current == expected || Current == expected2)) {
      Forward();
    }
  }

}

The Constants.Empty character is defined as ‘\0’.

The main task of the first step is to split the string into a list of variables. The Variable class is defined in Code Listing 3.

Basically, a variable is a wrapper over either a number (including integers, doubles, and Booleans), a string, or a list of other variables.

Code Listing 3: The Variable class

public class Variable
{
  public enum VarType { NONE, NUMBER, STRING, ARRAY };

  public Variable()         { Type = VarType.NONE; }
  public Variable(double d) { Value = d; }
  public Variable(bool d)   { Value = d ? 1.0 : 0.0}
  public Variable(string s) { String = s; }

  public double          Value  {
    get { return m_value; }
    set { m_value = value;  Type = VarType.NUMBER; } }
    
  public string          String {
    get { return m_string; }
    set { m_string = value; Type = VarType.STRING; } }
    
  public List<Variable>  Tuple  {
    get { return m_tuple; }
    set { m_tuple = value; Type = VarType.ARRAY; } }

  

  public void            SetAsArray() {
    Type = VarType.ARRAY;
    if (m_tuple == null) { m_tuple = new List<Variable>(); }
  }
    
  public string          Action   { getset}
  public VarType         Type     { getset}
  public bool            IsReturn { getset}


  public static Variable EmptyInstance = new Variable();

  private double         m_value;
  private string         m_string;
  private List<Variable> m_tuple;

}

Code Listing 4 shows the main function to split an expression into a list of variables.

Code Listing 4: The Split part of the algorithm

static List<Variable> Split(ParsingScript script, char[] to)
{
  List<Variable> listToMerge = new List<Variable>();

  StringBuilder item = new StringBuilder();
  

  do { // Main processing cycle of the first part.
    char ch = script.CurrentAndForward();
    string action = null;

    bool keepCollecting = StillCollecting(item.ToString(),

                                          to, script, ref action);
    if (keepCollecting)  {
    // The char still belongs to the token being collected.
      item.Append(ch);

      bool goForMore = script.StillValid() && !to.Contains(script.Current);
      if (goForMore) {
        continue;
      }
    }

    string token = item.ToString();
    if (action != null && action.Length > 1) {
      script.Forward(action.Length - 1);
    }

    // We are done getting the next token. The GetValue() call may
    // recursively call this method. This can happen if the extracted item
    // is a function or if the next item is starting with START_ARG '('.
    ParserFunction func = new ParserFunction(script, token,

                                             ch, ref action);
    Variable current    = func.GetValue(script);

    if (action == null)  {
      action = UpdateAction(script, to);
    }
    current.Action = action;
    listToMerge.Add(current);

    item.Clear();
  } while (script.StillValid() && !to.Contains(script.Current));

  // This happens when called recursively inside of a math expression:

  script.MoveForwardIf(Constants.END_ARG); // END_ARG = ')'

  return listToMerge;
}

The item variable holds the current token, adding characters to it one by one as soon as they are read from the string being parsed. To figure out whether we need to add the next character to the current token, the StillCollecting method is invoked. We show it in Code Listing 5.

Code Listing 5: The StillCollecting method to check if we are still collecting the current token

static bool StillCollecting(string item, char[] to,

                            ParsingScript script, ref string action)
{
  char prev = script.TryPrevPrev();
  char ch   = script.TryPrev();
  char next = script.TryCurrent();

  if (to.Contains(ch) || ch == Constants.START_ARG ||
                         ch == Constants.START_GROUP ||
                       next == Constants.EMPTY)      {
    return false;
  }

  // Otherwise, if it's an action (+, -, *, etc.) or a space
  // we're done collecting current token.
  if ((action = Utils.ValidAction(script.FromPrev())) != null ||
      (item.Length > 0 && ch == Constants.SPACE)) {
    return false;
  }

  return true;
}

The StillCollecting method uses the following constants defined in the Constants class:

const char START_ARG     = '(';

const char START_GROUP   = '{';

Obviously, you are free to redefine them as you wish. The condition next == Constants.EMPTY means that there are no more characters in the script being parsed; that is, the ch argument is already the last character in the script.

The StillCollecting method in Code Listing 5 also invokes the Utils.VaildAction method to find the action that follows the token that we just extracted.

We show the VailidAction and UpdateAction methods in Code Listing 6.

Code Listing 6: The methods to return an action if the passed string starts with it

public static string ValidAction(string rest)
{

  foreach (string action in Constants.ACTIONS) {
    if (rest.StartsWith(action)) {
      return action;
    }
  }
  return null;
}

static string UpdateAction(ParsingScript script, char[] to)
{
  // We look for a valid action till we get to the END_ARG ')'
  // or we pass the end of the string.
  if (!script.StillValid() || script.Current == Constants.END_ARG ||
                              to.Contains(script.Current)) {
    return Constants.NULL_ACTION;
  }

  string action = Utils.ValidAction(script.Rest);

  int advance = action == null ? 0 : action.Length;
  script.Forward(advance);
  return action == null ? Constants.NULL_ACTION : action;
}

The Constants.ACTIONS is a string array containing all actions we want to define:

public static string[] OPER_ACTIONS = { "+=""-=""*=""/=",

                                        "%=""&=""|=""^="};
public static string[] MATH_ACTIONS = { "&&""||""==""!=""<="">=",

                   "++""--", "%""*""/""+""-""^""<"">""="};
// Actions: always decreasing by the number of characters.
public static string[] ACTIONS=(OPER_ACTIONS.Union(MATH_ACTIONS)).ToArray();

The actual dealing with every extracted token in the Split method (Code Listing 4) is in the following 2 lines of code:

    ParserFunction func = new ParserFunction(script, token, ch, ref action);
    Variable current    = func.GetValue(script);

If the extracted token is not a function previously registered with the parser, this code will try to convert the extracted token to a double (and throw an exception if the conversion is not possible).

Otherwise, a previously registered function will be called, which can in turn recursively invoke the the whole Split-and-Merge algorithm. The appropriate ParserFunction will be selected based on the token being processed. How do we select the appropriate function? This is what we’ll explore next.

Virtual constructor idiom

To choose the appropriate function, I decided to use the virtual constructor idiom. Virtual constructor? Is it legal? James Coplien described a trick that simulates a virtual constructor. He described this idiom to be used in C++, but it can be easily used in any other object-oriented language as well, so here we are going to use this idiom in C#.

In C# the same functionality is often implemented using a factory method that uses an additional factory class to produce the required object. But Coplien’s design pattern doesn’t need an extra class, and produces instead the required object on the fly. According to Coplien:[14]

“The virtual constructor idiom is used when the type of an object needs to be determined from the context in which the object is constructed.”

This is exactly the case when we associate each token being parsed with a function at runtime. The main advantage of this idiom is that we can easily add a new function to the parser without modifying the main algorithm implementation code.

The main idea of this idiom is to use the ParserFunction constructor to create the m_impl member of the object, which is a ParserFunction itself:

   private ParserFunction m_impl;

So the constructor of ParserFunction initializes the m_impl member with the appropriate class, deriving from ParserFunction. See the implementation in Code Listing 7.

Code Listing 7: The ParserFunction base class

public class ParserFunction

{
  // A "normal" Constructor
  public ParserFunction() { m_impl = this}

  // A "virtual" Constructor
  internal ParserFunction(ParsingScript script, string token,

                          char ch, ref string action) {
    if (token.Length == 0 &&

       (ch == Constants.START_ARG || !script.StillValid ())) {
      // There is no function, just an expression in parentheses.
      m_impl = s_idFunction;
      return;
    }


    m_impl = GetFunction(token);
    if (m_impl != null) {
      return;
    }

    if (m_impl == s_strOrNumFunction &&

        string.IsNullOrWhiteSpace(token))  {
      string restData = ch.ToString () + script.Rest;
      throw new ArgumentException("Couldn't parse [" + restData + "]");
    }         
 
    // Function not found, parse this as a string in quotes or a number.
    s_strOrNumFunction.Item = token;
    m_impl = s_strOrNumFunction;
 }

  public Variable GetValue(ParsingScript script)
  {
    return m_impl.Evaluate(script);
  }

  protected virtual Variable Evaluate(ParsingScript script)
  {
    // The real implementation will be in the derived classes.
    return new Variable();
  }

  public static ParserFunction GetFunction(string item)
  {
    ParserFunction impl;

    if (s_functions.TryGetValue(item, out impl)) {
    // Function exists and is registered (e.g. pi, exp, or a variable)
        return impl.NewInstance();
    }
    return null;
  }

  // Derived classes may want to return a new instance in order to
  // not to use the same object in calculations.
  public virtual ParserFunction NewInstance() {
    return this;
  }

  public static void RegisterFunction(string name,

                                      ParserFunction function) {
    s_functions[name] = function;

  }

  private static Dictionary<stringParserFunction> s_functions =

             new Dictionary<stringParserFunction>();

  private static StringOrNumberFunction s_strOrNumFunction =
             new StringOrNumberFunction();
  private static IdentityFunction s_idFunction =
             new IdentityFunction();

Notice that the ParserFunction has a missing curly brace. This is because the class listing continues later on, in particular in Listings 8 and 9.

The Evaluate method (called from the base class GetValue method) will use the actual ParserFunction object.

So each class deriving from the ParserFunction class is expected to override the Evaluate method: this is where the real action takes place. We’ll see plenty examples later on.

Special parser functions

Different parser functions can be registered with the parser using the RegisterFunction method in Code Listing 7. A dictionary is used to map all of the parser functions to their names:

  private static Dictionary<stringParserFunction> s_functions =

             new Dictionary<stringParserFunction>();

There are two special functions deriving from the ParserFunction class:

  private static StringOrNumberFunction s_strOrNumFunction =
             new StringOrNumberFunction();
  private static IdentityFunction s_idFunction =
             new IdentityFunction();

They are used explicitly in the ParserFunction constructor, and their implementation is provided in Code Listing 8.

Code Listing 8: IdentityFunction and StringOrNumberFunction implementations

class IdentityFunction : ParserFunction
{
  protected override Variable Evaluate(ParsingScript script)
  {
    return script.ExecuteTo(Constants.END_ARG); // END_ARG == ')'
  }
}

class StringOrNumberFunction : ParserFunction
{
  protected override Variable Evaluate(ParsingScript script)
  {
    // First check if the passed expression is a string between quotes.
    if (Item.Length > 1 &&
      Item[0] == Constants.QUOTE &&
      Item[Item.Length - 1]  == Constants.QUOTE) {
      return new Variable(Item.Substring(1, Item.Length - 2));
    }

    // Otherwise this should be a number.
    double num;
    if (!Double.TryParse(Item, out num)) {
      Utils.ThrowException(script, "parseToken");
    }
    return new Variable(num);
  }


  public string Item { private getset}
}

The IdentityFunction is used when we have an expression in parentheses. The whole Split-and-Merge algorithm is called on that expression in parentheses, because the ParsingScript.ExecuteTo function is just a wrapper over, it as you can see in Code Listing 9.

Code Listing 9: The implementation of ParsingScript Execute, ExecuteTo, and ExecuteFrom functions

public Variable ExecuteTo(char to = '\0')
{
  return ExecuteFrom(Pointer, to);
}


public Variable ExecuteFrom(int from, char to = '\0')
{
  Pointer = from;
  char[] toArray = to == '\0' ? Constants.END_PARSE_ARRAY :
                                to.ToString().ToCharArray();
  return Execute(toArray);
}


public Variable Execute(char[] toArray)
{
  if (!m_data.EndsWith(Constants.END_STATEMENT.ToString())) {
    m_data += Constants.END_STATEMENT;
  }
  return Parser.SplitAndMerge(this, toArray);
}

The StringOrNumberFunction is a “Catch All” function that is called when no function is found that corresponds to the current token. Therefore, it yields either a number or a string in quotes. Otherwise, an exception is thrown.

Registering a function with the parser

A typical implementation of a function deriving from the ParserFunction is shown in Code Listing 10.

In order to use this function with the parser, one needs to register it using the RegisterFunction static method (refer to Code Listing 7).

Code Listing 10: An example of a ParserFunction: the exponential function

class ExpFunction : ParserFunction
{
  protected override Variable Evaluate(ParsingScript script)
  {
    Variable result = script.ExecuteTo(Constants.END_ARG);
    result.Value = Math.Exp(result.Value);
    return result;
  }
}

Basically, to register a function with the parser, there are three steps:

  1. Find an appropriate function name and define it a s a constant (I define all the constants in the Constants class):

public const string EXP = "exp";

  1. Implement a new class, having ParserFunction as its base class, and override its Evaluate method (as in Code Listing 10).
  1. Register this function with the Parser:

ParserFunction.RegisterFunction(Constants.EXP, new ExpFunction());

That’s it! After completing these steps, you can add any function you wish and use it with the parser. For example, we can now use the exponential function that we defined previously in expressions like 1+exp(10), and so on.

The Implementation of the Merge

In the second part of the algorithm, we merge the list of variables created in the previous step one by one, applying recursion if we can’t merge items right away.

See Code Listing 11 for the implementation details.

From the outside, the Merge method is called with the mergeOneOnly argument set to false, so it will not return before completing the whole merging part.

When the Merge method is called recursively (when the left and right cells cannot be merged with each other because of their priorities), the Merge method will be called recursively with the mergeOneOnly parameter set to true.

This is because we want to return to where we were as soon as we complete one merge in the MergeCells method (“cell” is a synonym of “variable” in our context).

Code Listing 11: The implementation of the Merge part of the algorithm

static Variable Merge(Variable current, ref int index,

               List<Variable> listToMerge, bool mergeOneOnly = false)
{

  while (index < listToMerge.Count)  {
    Variable next = listToMerge[index++];

    while (!CanMergeCells(current, next)) {
      // If we cannot merge cells yet, go to the next cell and merge
      // next cells first. E.g. if we have 1+2*3, we first merge next
      // cells, i.e. 2*3, getting 6, and then we can merge 1+6.
      Merge(next, ref index, listToMerge, true /* mergeOneOnly */);
    }

    MergeCells(current, next);
    if (mergeOneOnly)  {
      break;
    }
  }

  return current;
}

We merge variables one by one, and for each pair we invoke the CanMergeCells method, shown in Code Listing 12. This method just compares priorities of the actions of each cell. We saw the priorities of each action in Code Listing 1.

Code Listing 12 also shows the implementation of the MergeCells method. There is a special treatment of the variables Break and Continue. It consists in not doing nothing. We will see the Break and Continue variables in more detail in the next chapter.

Code Listing 12: The implementation of the CanMerge and MergeCells methods

static bool CanMergeCells(Variable leftCell, Variable rightCell)
{
  return GetPriority(leftCell.Action) >= GetPriority(rightCell.Action);
}

static void MergeCells(Variable leftCell, Variable rightCell)
{

  if (leftCell.Type == Variable.VarType.BREAK ||
      leftCell.Type == Variable.VarType.CONTINUE) {
    // Done!
    return;
  }

  if (leftCell.Type  == Variable.VarType.NUMBER) {
    MergeNumbers(leftCell, rightCell);
  }  else {
    MergeStrings(leftCell, rightCell);
  }

  leftCell.Action = rightCell.Action;
}

Depending on the cells being merged, they will be merged either as strings or as numbers. Code Listing 13 shows the implementation of the MergeNumbers method.

Code Listing 13: The implementation of the MergeNumbers method

static void MergeNumbers(Variable leftCell, Variable rightCell)
{
  switch (leftCell.Action) {
    case "^": leftCell.Value = Math.Pow(leftCell.Value, rightCell.Value);
      break;
    case "%": leftCell.Value %= rightCell.Value;
      break;
    case "*": leftCell.Value *= rightCell.Value;
      break;
    case "/":
      if (rightCell.Value == 0.0) {
        throw new ArgumentException("Division by zero");
      }
      leftCell.Value /= rightCell.Value;
      break;
    case "+":
      if (rightCell.Type != Variable.VarType.NUMBER) {
        leftCell.String = leftCell.AsString() + rightCell.String;
      } else {
        leftCell.Value += rightCell.Value;
      }
      break;
    case "-":  leftCell.Value -= rightCell.Value;
      break;
    case "<":  leftCell.Value = Convert.ToDouble(                                       

                                  leftCell.Value < rightCell.Value);
      break;
    case ">":  leftCell.Value = Convert.ToDouble(                                                  

                                  leftCell.Value > rightCell.Value);
      break;
    case "<=": leftCell.Value = Convert.ToDouble(                                       

                                  leftCell.Value <= rightCell.Value);
      break;
    case ">=": leftCell.Value = Convert.ToDouble(                                       

                                  leftCell.Value >= rightCell.Value);
      break;
    case "==": leftCell.Value = Convert.ToDouble(                                       

                                  leftCell.Value == rightCell.Value);
      break;
    case "!=": leftCell.Value = Convert.ToDouble(                                       

                                  leftCell.Value != rightCell.Value);
      break;
    case "&&": leftCell.Value = Convert.ToDouble(
                                  Convert.ToBoolean(leftCell.Value) &&

                                  Convert.ToBoolean(rightCell.Value));
      break;
    case "||": leftCell.Value = Convert.ToDouble(
                                  Convert.ToBoolean(leftCell.Value) ||

                                  Convert.ToBoolean(rightCell.Value));
      break;
  }
}

Code Listing 14 shows the implementation of the MergeStrings method. Note that this method may result in a variable of the Number type, in case the action is a comparison operator (for example, “a” > “z” leads to a variable of type Number and value 0). Note that we represent Boolean values internally as integers with 0 corresponding to “false” and 1 corresponding to “true.”

Code Listing 14: The implementation of the MergeStrings method

static void MergeStrings(Variable leftCell, Variable rightCell)
{
  switch (leftCell.Action) {
    case "+": leftCell.String = leftCell.AsString() + rightCell.AsString();
         break;
    case "<":  leftCell.Value = Convert.ToDouble(
         string.Compare(leftCell.AsString(), rightCell.AsString()) < 0);
         break;
    case ">":  leftCell.Value = Convert.ToDouble(
         string.Compare(leftCell.AsString(), rightCell.AsString()) > 0);
         break;
    case "<=": leftCell.Value = Convert.ToDouble(
         string.Compare(leftCell.AsString(), rightCell.AsString()) <0);
         break;
    case ">=": leftCell.Value = Convert.ToDouble(
         string.Compare(leftCell.AsString(), rightCell.AsString()) >0);
         break;
    case "==": leftCell.Value = Convert.ToDouble(
         string.Compare(leftCell.AsString(), rightCell.AsString()) == 0);
         break;
    case "!=": leftCell.Value = Convert.ToDouble(
         string.Compare(leftCell.AsString(), rightCell.AsString()) != 0);
         break;
    defaultthrow new ArgumentException("Can't perform action [" +
                       leftCell.Action + "] on strings");        
  }
}

The MergeStrings method heavily relies on the Variable.AsString method, which will be shown later on (you can jump to Code Listing 26 to check it out).

Note that at the end of the Merge method, only one Variable will be left and returned back. That Variable will contain the final result of the calculation.

Conclusion

We have seen how to parse a mathematical expression, containing any number of functions (previously registered with the parser) and any number of nested parentheses.

Note that the code we developed in this chapter is still not quite bullet-proof. Some of the items I took away in order to make the algorithm look easier and concentrate on the remaining items in later chapters. Some of these missing items that we are going to catch later on are:

  • The “short-circuit” evaluation (when we stop calculating a complex logical condition once we know for sure that it will be true or false, regardless of the remaining terms)
  • Processing logical negation
  • Processing a string in quotes
  • Processing arrays and dictionaries
  • Error handling

Prefix and postfix operators, assignments, etc.

Scroll To Top
Disclaimer
DISCLAIMER: Web reader is currently in beta. Please report any issues through our support system. PDF and Kindle format files are also available for download.

Previous

Next



You are one step away from downloading ebooks from the Succinctly® series premier collection!
A confirmation has been sent to your email address. Please check and confirm your email subscription to complete the download.