CHAPTER 5
“The primary duty of an exception handler is to get the error out of the lap of the programmer and into the surprised face of the user.”
Verity Stob
In this chapter we are going to see how to implement custom functions and methods in CSCS. We’ll also see how to throw and catch an exception, since implementing an exception stack requires information about the functions and methods being called (“on the stack”).
Related functions and methods are usually implemented together, in the same file, so it would be a nice feature to include code from different files. We’ll see how to do that next.
To implement file inclusion, we use the same approach we already used with other functions: we put the implementation in a class deriving from the ParserFunction class.
The IncludeFile class derives from the ParserFunction class (see also the UML diagram in Figure 4). Check out its implementation in Code Listing 40.
Code Listing 40: The implementation of the IncludeFile class
class IncludeFile : ParserFunction out char2Line); char2Line); Constants.END_LINE.ToString(), lines); |
The Utils.GetItem method is mostly a wrapper over the Parser.SplitAndMerge method. In addition, it takes care of string expressions between quotes and converts an expression between curly braces to an array.
Code Listing 41: The implementation of the GetItem method
public static Variable GetItem(ParsingScript script) Variable value = new Variable(); Constants.END_GROUP, out isList); else { script.MoveForwardIf(Constants.SPACE); |
In addition, the IncludeFile.Evaluate method invokes the Utils.ConvertToScript method, shown in Code Listing 42.
It’s actually one of the key methods for understanding how the CSCS language works. It performs the first, preprocessing step on the script to be parsed. Basically, the method translates the passed string to another string that our parser can understand. Among other things, this method removes all the text in comments and all unnecessary blanks, like spaces, tabs, or new lines.
Code Listing 42: The implementation of the Utils.ConvertToScript method
public static string ConvertToScript(string source, out Dictionary<int, int> char2Line) { inQuotes = !inQuotes; previous != Constants.NEXT_ARG && spaceOK); // Same for “groups”. Nonzero means there are some unmatched // parentheses or curly braces. |
The Utils.ConvertToScript method uses an auxiliary char2Line dictionary. It keeps a reference to the original line numbers of the script being parsed, so that when an exception is thrown (or the code is simply wrong), the user knows on which line the problem occurred. We’ll see it in more detail in the “Getting line numbers where errors occur” section in Chapter 7: Localization.
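To make the role of such a dictionary concrete, here is a minimal sketch of the idea. It is not the actual CSCS code: the LineMapSketch class and both of its methods are hypothetical, and the only preprocessing done here is removing line breaks. For each original line we record the position of its last character in the converted string, which is enough to recover a line number later.

```csharp
using System.Collections.Generic;
using System.Text;

public static class LineMapSketch
{
    // Removes line breaks and records, for the last character kept from
    // each non-empty line, the 1-based number of the line it came from.
    public static string StripNewLines(string source,
                                       out Dictionary<int, int> char2Line)
    {
        char2Line = new Dictionary<int, int>();
        var sb = new StringBuilder(source.Length);
        int line = 1;

        foreach (char ch in source)
        {
            if (ch == '\n')
            {
                Record(char2Line, sb.Length - 1, line);
                line++;
                continue;
            }
            if (ch == '\r')
            {
                continue;
            }
            sb.Append(ch);
        }
        Record(char2Line, sb.Length - 1, line);
        return sb.ToString();
    }

    // Empty lines add no characters, so an existing entry is never overwritten.
    static void Record(Dictionary<int, int> char2Line, int position, int line)
    {
        if (position >= 0 && !char2Line.ContainsKey(position))
        {
            char2Line[position] = line;
        }
    }

    // Returns the original line for a position in the converted script:
    // the line of the nearest recorded end-of-line position at or after it.
    public static int GetLine(Dictionary<int, int> char2Line, int position)
    {
        int bestPos = int.MaxValue;
        int bestLine = -1;
        foreach (KeyValuePair<int, int> entry in char2Line)
        {
            if (entry.Key >= position && entry.Key < bestPos)
            {
                bestPos = entry.Key;
                bestLine = entry.Value;
            }
        }
        return bestLine;
    }
}
```

With this mapping, an error detected at any character of the converted script can be reported against the line the user actually wrote.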
If you want to define a new style for comments, this method is the place to do it. This method also lets the user type different quote characters: the “ and ” characters are both replaced by the " character, which is the only one that our parser understands.
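The comment and quote handling just described can be sketched as follows. This is a simplified illustration under stated assumptions, not the real Utils.ConvertToScript: the CommentSketch class and its StripComments method are hypothetical, only // comments are recognized, and escaped quotes inside strings are not handled.

```csharp
using System.Text;

public static class CommentSketch
{
    // Drops "//" comments and turns curly quotes into straight ones.
    // Text inside quotes is copied unchanged, so "//" inside a string survives.
    public static string StripComments(string source)
    {
        var sb = new StringBuilder(source.Length);
        bool inQuotes = false;
        bool inComment = false;

        for (int i = 0; i < source.Length; i++)
        {
            char ch = source[i];
            if (ch == '\u201C' || ch == '\u201D')
            {
                ch = '"'; // Both curly quotes become the straight quote.
            }

            if (inComment)
            {
                // A comment runs to the end of the line; keep the line break.
                if (ch == '\n')
                {
                    inComment = false;
                    sb.Append(ch);
                }
                continue;
            }

            if (ch == '"')
            {
                inQuotes = !inQuotes;
            }
            else if (!inQuotes && ch == '/' &&
                     i + 1 < source.Length && source[i + 1] == '/')
            {
                inComment = true;
                continue;
            }
            sb.Append(ch);
        }
        return sb.ToString();
    }
}
```

Supporting another comment style would mean adding one more branch next to the check for the two slashes.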
How do we know which spaces must be removed and which not? First of all, if we are inside of quotes, we leave everything as is, because the expression is just a string value.
In other cases, we leave at most one space between tokens. For some functions we need spaces to separate tokens, but for others, spaces aren’t needed and tokens are separated by other characters, for example, by an opening parenthesis.
An example of a function that requires a space as a separation token is a change directory function: cd C:\Windows. A function that doesn’t require a space to separate tokens is any mathematical function, for instance, sin(2*10). A space between the sine function and the opening parenthesis is never needed.
Code Listing 43 contains auxiliary functions used in the ConvertToScript method that determine whether we need to keep a space or not.
Code Listing 43: The implementation of the keeping space functions
public static bool EndsWithFunction(string buffer, List<string> functions) public static bool KeepSpaceOnce(StringBuilder sb, char next) |
As you can see, we distinguish between two types of functions that require a space as a separation token. See Code Listing 44.
Code Listing 44: Functions allowing spaces as separation tokens
// Functions that allow a space separator after them, on top of the // parentheses. The function arguments may have spaces as well, // e.g. “copy a.txt b.txt” public static List<string> FUNCT_WITH_SPACE = new List<string> { FINDFILES, FINDSTR, FUNCTION, MKDIR, MORE, MOVE, PRINT, READFILE, RUN, SHOW, STARTSRV, TAIL, TRANSLATE, WRITE, WRITELINE, WRITENL // parentheses, but only once, i.e. function arguments are not allowed // to have spaces between them e.g. return a*b; throw exc; |
Basically, we want to maintain two modes of operation for our language—a command-line language (or a “shell” language, in Unix terms), which uses mostly spaces as a separation criterion between tokens, and a regular, scripting language, which uses parentheses as a separation criterion.
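As an illustration of how the two modes can be told apart during preprocessing, here is a hedged sketch of the space-keeping decision. The EndsWithFunction signature matches the one in Code Listing 43, but its body, the SpaceSketch class, the KeepSpace helper, and the short function list are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class SpaceSketch
{
    // Hypothetical subset of the shell-style functions whose arguments
    // are separated by spaces.
    static readonly List<string> FunctionsWithSpace =
        new List<string> { "cd", "copy", "print" };

    // True if the text collected so far ends with one of the given names
    // as a whole token, so "cd" matches but "abcd" does not.
    public static bool EndsWithFunction(string buffer, List<string> functions)
    {
        foreach (string name in functions)
        {
            if (!buffer.EndsWith(name, StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }
            int before = buffer.Length - name.Length - 1;
            if (before < 0 || !IsNameChar(buffer[before]))
            {
                return true;
            }
        }
        return false;
    }

    static bool IsNameChar(char ch)
    {
        return char.IsLetterOrDigit(ch) || ch == '_';
    }

    // Decides whether a space that follows the buffer must be kept.
    public static bool KeepSpace(string buffer)
    {
        return EndsWithFunction(buffer, FunctionsWithSpace);
    }
}
```

Here the space in cd C:\Windows would be kept, while a space after sin would be dropped, because sin is not in the list.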
Let’s return to the implementation of the IncludeFile class in Code Listing 40. Once we’ve converted the string containing the script to a format that we understand (in Code Listing 42), we go over each statement of the script and apply the whole Split-and-Merge algorithm to each statement.
To move to the next statement in the script, we use the ParsingScript.GoToNextStatement auxiliary method (as seen in Code Listing 18). In particular, it deals with cases when the last processed statement is also the last one in a group of statements (between the curly braces), or when we need to get rid of the character separating different statements (defined as END_STATEMENT = ';' in the Constants class).
To throw an exception, we use the same approach we already used with other control flow functions: we implement the ThrowFunction class as a class deriving from the ParserFunction class. The ThrowFunction class is included in Figure 4, and its implementation is in Code Listing 45.
Code Listing 45: The implementation of the ThrowFunction class
class ThrowFunction : ParserFunction |
This class is registered with the parser as follows:
public const string THROW = "throw";
ParserFunction.RegisterFunction(Constants.THROW, new ThrowFunction());
This means that as soon as our parser sees something like:
throw "Critical exception!";
in CSCS code, our C# code will throw an exception. How can we catch it?
To catch an exception, we must have a try block. The processing of the catch block follows the processing of the try block. The class TryBlock is also derived from the ParserFunction class; see Figure 4. Its implementation is in Code Listing 46. The main functionality is in the Interpreter class, where we can reuse the already implemented ProcessBlock (Code Listing 17), SkipBlock (Code Listing 19), and SkipRestBlocks (Code Listing 20) methods.
Code Listing 46: The implementation of the try and catch functionality
class TryBlock : ParserFunction internal Variable ProcessTry(ParsingScript script) Exception exception = null; currentStackLevel); new Variable(exception.Message + excStack)); |
The TryBlock class is registered with the parser as follows:
public const string TRY = "try";
ParserFunction.RegisterFunction(Constants.TRY, new TryBlock());
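The overall control flow of throw, try, and catch can be sketched without the parser. In this simplified model, which is an assumption made for illustration and not the code of Code Listing 46, the try and catch blocks are plain C# delegates, and the TryCatchSketch class with its Variables table is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class TryCatchSketch
{
    // Stand-in for the interpreter's variable table.
    public static readonly Dictionary<string, string> Variables =
        new Dictionary<string, string>();

    // What a CSCS "throw <message>;" statement boils down to:
    // a regular C# exception carrying the message.
    public static void Throw(string message)
    {
        throw new Exception(message);
    }

    // Runs the try block. Only if it fails, stores the exception message
    // under the name given in the catch clause and runs the catch block.
    public static void ProcessTry(Action tryBlock, string exceptionName,
                                  Action catchBlock)
    {
        Exception exception = null;
        try
        {
            tryBlock();
        }
        catch (Exception exc)
        {
            exception = exc;
        }

        if (exception == null)
        {
            return; // Nothing was thrown: the catch block is skipped.
        }

        Variables[exceptionName] = exception.Message;
        catchBlock();
    }
}
```

A call such as ProcessTry(() => Throw("Critical exception!"), "exc", handler) makes the message available to the handler under the name exc.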
When we catch an exception, we then also create an exception stack; see Code Listing 47.
Code Listing 47: The implementation of the Interpreter.CreateExceptionStack method
static string CreateExceptionStack(string exceptionName, int lowestStackLevel) { foreach (ParserFunction.StackLevel stackLevel in stack) { |
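The idea of building an exception stack can be sketched as follows. This is an assumption-laden illustration rather than the method from Code Listing 47: the call stack is modeled as a Stack<string> of function names, the ExceptionStackSketch class is hypothetical, and the output format is only loosely modeled on the results shown at the end of this chapter.

```csharp
using System.Collections.Generic;
using System.Text;

public static class ExceptionStackSketch
{
    // Lists the functions that were entered after the try block started.
    // lowestStackLevel is the stack depth at the moment the try block began;
    // frames at or below that depth are not part of the exception stack.
    public static string CreateExceptionStack(Stack<string> callStack,
                                              string exceptionName,
                                              int lowestStackLevel)
    {
        var sb = new StringBuilder(" --> " + exceptionName);
        int level = callStack.Count;

        // Stack<T> enumerates from the most recent call to the oldest one.
        foreach (string functionName in callStack)
        {
            if (level-- <= lowestStackLevel)
            {
                break;
            }
            sb.Append("\n  " + functionName + "()");
        }
        return sb.ToString();
    }
}
```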
In order to use the exception data in CSCS, we add a variable containing exception information. This variable is an instance of the GetVarFunction class, which we add to the parser:
ParserFunction.AddGlobalOrLocalVariable(exceptionName, excFunc);
The excFunc variable is of type GetVarFunction; see its implementation in Code Listing 48. The GetVarFunction class is just a wrapper over the exception being thrown. We register it with the parser using the exception name, so that as soon as the CSCS code accesses the exception by its name, it gets the exception information that we supplied. You can easily add more fancy things to the exception information there, like having separate fields for the exception name and the exception stack. We’ll see some examples of exceptions at the end of this chapter.
Code Listing 48: The implementation of the GetVarFunction class
class GetVarFunction : ParserFunction
{
  internal GetVarFunction(Variable value)
  {
    m_value = value;
  }
  protected override Variable Evaluate(ParsingScript script)
  {
    return m_value;
  }
  private Variable m_value;
}
To make everything work, we have added a few new data structures to the ParserFunction class. In particular, the StackLevel class contains all the local variables used inside of a CSCS function; see Code Listing 49.
The Stack<StackLevel> s_locals member holds a stack having the local variables for each function being called on the stack. The AddLocalVariable and AddStackLevel methods add a new local variable and a new StackLevel, respectively.
Note: This is the place where you want to disallow local names if there is already a global variable or function with the same name.
The Dictionary<string, ParserFunction> s_functions member holds all the global variables and functions (in CSCS all variables and functions are the same; both derive from the ParserFunction class). The keys to the dictionary are the function or variable names. The RegisterFunction and AddGlobal methods both add a new variable or a function. There is also the isNative Boolean flag, which indicates whether the function is implemented natively in C# or is a custom function implemented in CSCS.
When trying to associate a function or a variable name with the actual function or variable, the GetFunction method is called. Note that it first searches the local names: they take precedence over the global names.
Code Listing 49: The implementation of the global and local variables in the ParserFunction class
public class ParserFunction public class StackLevel { new Dictionary<string, ParserFunction>(); // Local variables: get { return s_locals; } } public static ParserFunction GetFunction(string item) // (e.g. pi, exp, or a variable). public static bool FunctionExists(string item) public static void AddGlobalOrLocalVariable(string name, ParserFunction function) { public static void RegisterFunction(string name, ParserFunction function, static void AddGlobal(string name, ParserFunction function, } public static void AddLocalVariables(StackLevel locals) public static void AddStackLevel(string name) public static void PopLocalVariables() public static int GetCurrentStackLevel() public static void InvalidateStacksAfterLevel(int level) public static void PopLocalVariable(string name) } |
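The lookup order described above (locals of the current function first, then globals) can be sketched in isolation. This is a simplified model and not the ParserFunction code: values are plain doubles instead of ParserFunction objects, and the ScopeSketch class and its TryGetValue method are hypothetical.

```csharp
using System.Collections.Generic;

public static class ScopeSketch
{
    // One entry per CSCS function currently being executed.
    public class StackLevel
    {
        public string Name;
        public Dictionary<string, double> Variables =
            new Dictionary<string, double>();
    }

    static readonly Dictionary<string, double> s_globals =
        new Dictionary<string, double>();
    static readonly Stack<StackLevel> s_locals = new Stack<StackLevel>();

    public static void AddGlobal(string name, double value)
    {
        s_globals[name] = value;
    }

    public static void AddStackLevel(string name)
    {
        s_locals.Push(new StackLevel { Name = name });
    }

    public static void PopLocalVariables()
    {
        s_locals.Pop();
    }

    // Outside of any function there is no local scope, so the variable
    // becomes a global one.
    public static void AddGlobalOrLocalVariable(string name, double value)
    {
        if (s_locals.Count == 0)
        {
            s_globals[name] = value;
            return;
        }
        s_locals.Peek().Variables[name] = value;
    }

    // Local names of the current function take precedence over globals.
    public static bool TryGetValue(string name, out double value)
    {
        if (s_locals.Count > 0 &&
            s_locals.Peek().Variables.TryGetValue(name, out value))
        {
            return true;
        }
        return s_globals.TryGetValue(name, out value);
    }
}
```

After a function returns and its stack level is popped, the same name resolves to the global value again.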
To implement custom methods and functions in CSCS, we need two classes, FunctionCreator and CustomFunction, both deriving from the ParserFunction class; see Figure 4. The FunctionCreator class is shown in Code Listing 50. We register it with the parser as follows:
public const string FUNCTION = "function";
ParserFunction.RegisterFunction(Constants.FUNCTION, new FunctionCreator());
Note: No worries, we’ll see very soon how to redefine a function name in the configuration file.
So a typical function in our language looks like:
function functionName(param1, param2, ..., paramN) {
// Function Body;
}
Code Listing 50: The implementation of the FunctionCreator class
class FunctionCreator : ParserFunction Constants.END_GROUP); false /* not native */); |
First, the FunctionCreator.Evaluate method calls an auxiliary Utils.GetToken method, which extracts the functionName in the function definition. Then, the Utils.GetFunctionSignature auxiliary function gets all the function arguments; see Code Listing 51.
Note that we do not have explicit types in our language: the types are deduced on the fly from the context. Therefore, the result of the Utils.GetFunctionSignature function is an array of strings, like arg1, arg2, …, argN. An example of a function signature is: function power(a, n).
Code Listing 51: The implementation of the GetFunctionSignature method
public static string[] GetFunctionSignature(ParsingScript script) |
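A minimal sketch of extracting a signature is shown below. It works on a plain string rather than on a ParsingScript, so the SignatureSketch class and this overload are assumptions made for illustration only.

```csharp
using System;

public static class SignatureSketch
{
    // Extracts the parameter names from text such as "power(a, n)".
    public static string[] GetFunctionSignature(string declaration)
    {
        int open = declaration.IndexOf('(');
        int close = declaration.IndexOf(')', open + 1);
        if (open < 0 || close < 0)
        {
            throw new ArgumentException("Couldn't extract function signature");
        }

        string inside = declaration.Substring(open + 1, close - open - 1);
        if (inside.Trim().Length == 0)
        {
            return new string[0]; // A function without parameters.
        }

        string[] args = inside.Split(',');
        for (int i = 0; i < args.Length; i++)
        {
            args[i] = args[i].Trim();
        }
        return args;
    }
}
```

Because there are no explicit types, the names alone are all that needs to be stored for each parameter.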
The auxiliary Utils.GetBodyBetween method extracts the actual body of the function in Code Listing 52. As method arguments, we pass the open character as Constants.START_GROUP (which I defined as { ), and the close character as Constants.END_GROUP (which I defined as }).
Code Listing 52: The implementation of the GetBodyBetween method
public static string GetBodyBetween(ParsingScript script, char open, char close) { |
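Extracting a body between matching delimiters amounts to counting nesting depth while ignoring delimiters inside quotes. The sketch below shows that idea on a plain string with an explicit position; the BodySketch class and this signature are hypothetical and differ from the ParsingScript-based method in Code Listing 52.

```csharp
using System;
using System.Text;

public static class BodySketch
{
    // Returns the text between the first open character found at or after
    // 'pointer' and its matching close character. On return, 'pointer' is
    // positioned just past the matching close character.
    public static string GetBodyBetween(string script, ref int pointer,
                                        char open, char close)
    {
        var sb = new StringBuilder();
        int depth = 0;
        bool inQuotes = false;

        for (; pointer < script.Length; pointer++)
        {
            char ch = script[pointer];
            if (ch == '"')
            {
                inQuotes = !inQuotes;
            }
            else if (!inQuotes && ch == open)
            {
                if (depth++ == 0)
                {
                    continue; // Skip the outermost open character.
                }
            }
            else if (!inQuotes && ch == close)
            {
                if (--depth == 0)
                {
                    pointer++;
                    return sb.ToString();
                }
            }

            if (depth > 0)
            {
                sb.Append(ch);
            }
        }
        throw new ArgumentException("Couldn't find a matching " + close);
    }
}
```

Nested groups stay inside the returned body, so inner blocks such as loops or conditions can be processed later by the same routine.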
Basically, the FunctionCreator class creates a new instance of the CustomFunction class (see its implementation in Code Listing 53) and registers it with the parser.
Code Listing 53: The implementation of the CustomFunction class
class CustomFunction : ParserFunction "] arguments mismatch: " + m_args.Length + " declared, " + functionArgs.Count + " supplied"); new GetVarFunction(functionArgs[i]); return Constants.FUNCTION + " " + Name + " " + } |
We call the Utils.GetArgs auxiliary function to extract the arguments (defined in Code Listing 25).
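Calling a custom function comes down to two steps: checking that the number of supplied arguments matches the declared signature, and binding each argument to its parameter name as a local variable. The following sketch illustrates both. It is not the CustomFunction class itself: the body is modeled as a C# delegate, values are doubles, the CustomFunctionSketch class is hypothetical, and the beginning of the error message is an assumption.

```csharp
using System;
using System.Collections.Generic;

public class CustomFunctionSketch
{
    readonly string m_name;
    readonly string[] m_args;
    readonly Func<Dictionary<string, double>, double> m_body;

    public CustomFunctionSketch(string name, string[] args,
        Func<Dictionary<string, double>, double> body)
    {
        m_name = name;
        m_args = args;
        m_body = body;
    }

    public double Run(List<double> functionArgs)
    {
        if (functionArgs.Count != m_args.Length)
        {
            throw new ArgumentException("Function [" + m_name +
                "] arguments mismatch: " + m_args.Length + " declared, " +
                functionArgs.Count + " supplied");
        }

        // Bind each supplied value to the declared parameter name.
        var locals = new Dictionary<string, double>();
        for (int i = 0; i < m_args.Length; i++)
        {
            locals[m_args[i]] = functionArgs[i];
        }

        // The body only sees the local variables of this call.
        return m_body(locals);
    }
}
```

For example, a power function declared with the parameters a and n and the body Math.Pow(locals["a"], locals["n"]) returns 1024 when called with 2 and 10.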
We’ll see the usage of the m_parentOffset and m_parentScript members (and the properties of the latter: Char2Line, Filename, and OriginalScript) in the “Getting line numbers where errors occur” section in Chapter 7: Localization.
Let’s now see custom functions and exceptions in action. As an example, we’ll implement the factorial function. The factorial, denoted as n!, is defined as follows:
n! = 1 * 2 * 3 * … * (n - 1) * n.
There is a special definition when n = 0: 0! = 1. The factorial is not defined for negative numbers. Note that we can also define the factorial recursively: n! = (n - 1)! * n.
Code Listing 54: The CSCS code for the factorial recursive implementation
function factorial(n) function isInteger(candidate) { return candidate == round(candidate); } function factorialHelper(n) |
The implementation of the recursive version of the factorial in CSCS is shown in Code Listing 54. The factorial function uses the isInteger CSCS function to check if the passed parameter is an integer. This function is implemented in Code Listing 54 as well. It calls the round function, which was already implemented in C# (see Code Listing 34).
These are the results of running the CSCS script of Code Listing 54:
factorial(0)=1
factorial(10)=3628800
Caught exception: Factorial is for nonnegative integers only (n=blah) --> exc
factorial()
factorialHelper()
In the Caught exception clause you can see the exception stack, which was produced by the CreateExceptionStack method (see Code Listing 47). One can also add the line numbers where the exception occurred. We’ll see more about that in Chapter 7 Localization.
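For comparison, the same logic can be written directly in C#. This is only a sketch for checking the numbers above: the FactorialSketch class is hypothetical, and the way the work is split between Factorial and FactorialHelper is an assumption based on the function names in Code Listing 54.

```csharp
using System;

public static class FactorialSketch
{
    public static bool IsInteger(double candidate)
    {
        return candidate == Math.Round(candidate);
    }

    // Validates the argument, then delegates to the recursive helper.
    public static double Factorial(double n)
    {
        if (!IsInteger(n) || n < 0)
        {
            throw new ArgumentException(
                "Factorial is for nonnegative integers only (n=" + n + ")");
        }
        return FactorialHelper(n);
    }

    // n! = (n - 1)! * n, with 0! = 1! = 1.
    static double FactorialHelper(double n)
    {
        return n <= 1 ? 1 : n * FactorialHelper(n - 1);
    }
}
```

Factorial(0) returns 1 and Factorial(10) returns 3628800, matching the CSCS results, while a negative or fractional argument raises an exception.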
In this chapter we continued adding functionality to the CSCS language: we saw how to include files, how to throw and catch exceptions, how to add local and global variables, and how to implement custom functions in CSCS. We also saw some examples of writing custom functions.
In the next chapter we are going to see how to develop some data structures to be used in CSCS.