TREE PATTERN MATCHING AND TRANSFORMATION RULES v.05 Mathematical expression trees that are not able to be changed are not very useful. Therefore, computer algebra systems provide ways to replace parts of a tree such as literal substitution and pattern matching transformation rules. In this worksheet, the variables x, y and z will be turned into symbols. In> Symbols(x, y, z) %output,preserve="false" Result: True . %/output ATOMIC EXPRESSION TREES A tree that only has one node is called an atomic expression tree and is abbreviated as just "atom". Examples of atoms are numerals, strings, and strings that have had their quotes removed from their ends. The following example shows the tree for the atom '2'. %mathpiper ViewTreeParts(2, Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: . %/output The "Atom?" procedure can be used to determine if a tree is an atom or not. In the following example, '2' is an atom, and the string "Box" is an atom. However, the expression 1 + 2 is not atom because its tree has more than one node. %mathpiper Print(Atom?(2)); Print(Atom?("Box")); Print(Atom?('(1 + 2))); %/mathpiper %output,preserve="false" Result: True Side Effects: True True False . %/output The "ToAtom" procedure can be used to remove the quotes from the ends of strings and the result is an atom which is not evaluated. For example, the following code removes the quotes from the string "Box" and returns the resulting atom 'Box' unevaluated. In> ToAtom("Box") %output,preserve="false" Result: Box . %/output The "ToAtom" procedure is often used to convert strings into unevaluated operators for the purposes of substituting and pattern matching. SUBSTITUTING TREES FOR LITERAL PARTS OF A TREE A capability that is often used in a computer algebra system is substituting a tree for parts of a tree that are specified literally. Lets experiment with substituting parts of an arithmetic expression tree that contains numerals and a single variable. In> tree1 := '((1*2 + x)==(3*(4 - 5/x))) %output,preserve="false" Result: 1*2 + x == 3*(4 - 5/x) . %/output %mathpiper ViewTreeParts(tree1, Scale:1.5, FontSize:20, ShowPositions:True); %/mathpiper %output,preserve="false" Result: . %/output The "Substitute" procedure is used to substitute a tree for parts of a tree that can be specified literally. The calling format for "Substitute" is: Substitute(from, to) tree The argument "from" specifies the parts of the tree to match, the argument "to" is the replacement tree, and the argument "tree" is the original tree that substitutions will be applied to. This procedure searches a tree from its root to its leaves looking for all parts of the tree that literally match the tree that was given as an argument. In the following code, '2222' is substituted for the numeral '2' using "Substitute". The "Substitute" procedure works with a copy of the original tree and returns this copy as its result. %mathpiper tree2 := Substitute(2, 2222) tree1; ViewTreeParts(tree2, Scale:1.5, FontSize:20, NodeHighlight:["1,1,2"]); %/mathpiper %output,preserve="false" Result: . %/output The the tree 5*5 can be substituted for the subtree 1 * 2. %mathpiper tree2 := Substitute('(1*2), '(5*5)) tree1; ViewTreeParts(tree2, Scale:1.5, FontSize:20, NodeHighlight:["1,1", "1,1,1", "1,1,2"]); %/mathpiper %output,preserve="false" Result: . %/output The operator '-' can be substituted for the operator '+' at position 1 using the "ToAtom" procedure. %mathpiper tree2 := Substitute('+, '-) tree1; ViewTreeParts(tree2, Scale:1.5, FontSize:20, NodeHighlight:["1"]); %/mathpiper %output,preserve="false" Result: . %/output The "Substitute" procedure replaces all occurrences of the part of a tree it is searching for. In the following example, all occurrences of the variable 'x' are replaced with the variable 'z'. %mathpiper Substitute(x, z) tree1; %/mathpiper %output,preserve="false" Result: 1*2 + z == 3*(4 - 5/z) . %/output The "Substitute" procedure can also be used to substitute parts of code trees. In the following example, a frozen tree is assigned to the variable "expression". %mathpiper expression := ''{ result := []; index := 14; While(index <=? 33) { result := Append(result, index); index++; }; result; }; ViewTreeParts(expression, Scale:1.5, ShowPositions:True, NodeHighlight:["1,3", "1,3,1"], ArcsAutoHighlight:False, IncludeExpression:False, Code:True); %/mathpiper %output,preserve="false" Result: . %/output In the result of the following substitution, notice how 'While' has been replaced by 'Until', and the operator "<=?" had been replaced by the operator ">?". %mathpiper Substitute('While, 'Until) Substitute(ToAtom("<=?"), ToAtom(">?")) expression; %/mathpiper %output,preserve="false" Result: '' { result := []; index := 14; Until(index >? 33) { result := Append(result,index); index++; } result; } . %/output LOCATING PARTS OF TREES USING PATTERN MATCHING Specifying parts of trees to locate literally is useful but limited. A much more powerful way to specify parts of trees to locate is with pattern matching. Pattern matching also searches a tree top to bottom like the "Substitute" procedure does. LITERAL PATTERNS The simplest kinds of patterns are literal trees that work just like the literal trees that the "Substitute" procedure uses to locate parts of trees. In the following examples, a numeral, a variable, and an operator are each located in a tree using literal pattern matching. The "Pattern" option of the "ViewTreeParts" procedure is used to pass the pattern that this procedure should use when searching the given tree. The following code locates all occurrences of the numeral '3' in the three. In this case there is just one occurrence. %mathpiper ViewTreeParts(tree1, Pattern: 3, Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: . %/output The following example locates the two occurences of the unknown 'x' that are in the tree. %mathpiper ViewTreeParts(tree1, Pattern: x, Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: . %/output Finally, this example locates the two occurrences of the operator '*' that are in the tree. %mathpiper ViewTreeParts(tree1, Pattern: ''*, Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: . %/output MATCHING PARTS OF TREES BY THEIR ATTRIBUTES ATTRIBUTES, PROPERTIES, AND RELATIONS The real power in pattern matching comes from its ability to match a part of a tree based on the part's attributes. Everything that exists, including parts of trees, have attributes that can be used to describe it. There are two kinds of attributes, which are properties and relations. Examples of properties include color, shape, odd, and prime. Some properties of the sun are it is yellow and it is spherical. Some properties of the number 7 are it is odd and it is prime. Examples of relations include heaver-than, hotter-than, less-than, and greater-than. Some relations of the sun are it is heavier than Jupiter, and it is hotter than Venus. Some relations of the number 7 are it is greater-than 3 and it is less-than 9. In expression trees there are structural relations between each operator and its operand or operands. In code trees there are structural relations between procedures and their arguments. In logic, the designations of properties and relations are called predicates. In MathPiper, all predicates end with a question mark character '?' so they can be easily identified. For example, "Even?", "List?", and "GreaterThan?" are procedures that are predicates and "<?" and ">?" are operators that are predicates. Predicates that have a single argument specify properties. Examples of predicates that specify properties are "Integer?", "Even?", "Prime?", and "Atom?". Predicates that have two or more arguments specify relations. Examples of predicates that specify relations are "<?" and ">?". Structural relations are usually specified using trees that contain pattern variables, but they can also be specified with literal trees. PATTERN VARIABLES MATCHING TREE PARTS USING PROPERTIES The way MathPiper enables parts of a tree to be specified using properties is with PATTERN VARIABLES. A pattern variable is defined by using any underscore constant that DOES NOT BEGIN with an underscore character. Examples of pattern variables include x_ , n_, var_, and num_. Underscore constants that BEGIN with an underscore character are simply constants that have no additional meaning associated with them. So, _x, _n, _var, and _num are constants that are not pattern variables. A pattern variable that ends with an underscore will match any part of a tree regardless of the part's properties, so n_ will match any tree part. A pattern variable that matches a tree part using one or more properties can be created by appending the names of one or more single argument predicates to it. For example, n_Atom? matches all parts of a tree that have the property of being an atom. %mathpiper ViewTreeParts(tree1, Pattern:(n_Atom?), ArcsAutoHighlight:False, Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: . %/output The pattern variable n_Integer? matches all nodes in a tree that have the property of being an integer. %mathpiper ViewTreeParts(tree1, Pattern:(n_Integer?), ArcsAutoHighlight:False, Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: . %/output The pattern variable n_Even?_Integer? matches all nodes in a tree that have the property of being even and the property of being an integer. Notice that multiple predicates in a pattern variable are separated by underscore characters. %mathpiper ViewTreeParts(tree1, Pattern:(n_Even?_Integer?), ArcsAutoHighlight:False, Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: . %/output MATCHING TREE PARTS USING STRUCTURAL RELATIONS The way MathPiper enables parts of a tree to be specified using structural relations is with operators and their operands. For example, the structural relation "(1 * 2)" can be matched using the literal pattern '(1 * 2). %mathpiper ViewTreeParts(tree1, Pattern:''(1 * 2), Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: . %/output All multiplication structural relations in the tree can be matched by using pattern variables for the multiplication operator's operands. Both of the pattern variables that are used in the following example match any subtrees that may be used as operands of any multiplication operator in the tree. %mathpiper ViewTreeParts(tree1, Pattern:(n_ * m_), Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: . %/output Patterns that have the ability to match based on both structural relations and properties of tree parts can be created by combining operators with pattern variables that have predicates associated with them. In the following code only multiplication operations that have second operands that are integers are matched. %mathpiper ViewTreeParts(tree1, Pattern:(n_ * m_Integer?), Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: . %/output POST PREDICATES When a pattern matches a part of a tree, all of the pattern variables in the pattern have the part of the tree they matched assigned to them as a value. The pattern then provides an optional method of determining if the part of the tree that the pattern matched has additional properties or relations that could not be specified in the pattern due to lack of space. This additional matching method is called a POST PREDICATE, and it is created using the "::" POST PREDICATE OPERATOR. The post-predicate operator uses the pattern as its first operand, and its second operand can contain any code that returns True or False when it is evaluated. <pattern>::<post predicate code> Post predicate code is typically used to check the matched tree parts (that were assigned to the pattern variables) for additional properties and relations. However, all of the property checking that is done in the pattern could be moved into the post-predicate if desired. In the following code, the "Echo" procedure is used in the post-predicate to print each value that is assigned to the predicate variable n_ as this variable matches various parts of the tree. The "Echo" procedure can be used by itself here because it always returns "True". %mathpiper ViewTreeParts(tree1, Pattern:( (n_Even?_Integer?)::Echo(n) ), Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: Side Effects: 2 4 . %/output The following code contains a pattern with a post predicate that only matches even integers that are greater than 3. %mathpiper ViewTreeParts(tree1, Pattern:( (n_Even?_Integer?)::(n >? 3) ), Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: . %/output Sometimes it is useful to place one or more "Echo" procedures into a post predicate without having them affect the post-predicate checking code. This can be done by using a code sequence. %mathpiper ViewTreeParts(tree1, Pattern:( (n_Even?_Integer?)::{Echo(n); n >? 3;} ), Scale:1.5, FontSize:20); %/mathpiper %output,preserve="false" Result: Side Effects: 2 4 . %/output TRANSFORMATION RULE = PATTERN MATCHING + TREE PART REPLACEMENT In previous sections we saw that locating parts of trees using pattern matching is more powerful than locating them literally like the "Substitute" procedure does. However, locating a part of a tree using pattern matching would not be very useful without the ability to also replace the matched part with another tree. Transformation rules combine the power of pattern matching with the ability to replace the matched tree part with another tree. MathPiper contains two kinds of transformation rules which are global transformation rules and local transformation rules. Local transformation rules will be covered first. LOCAL TRANSFORMATION RULES Local transformation rules consist of a pattern followed the "<-" local transformation rule operator followed by the tree that will replace the tree part if the pattern matches it: pattern <- replacement_tree A local transformation rule can also have a post predicate: pattern :: post_predicate <- replacement_tree Either the "/:" single-pass local transformation rule operator or the "/::" multiple-pass local transformation rule operator can be used to apply one or more local transformation rules to a tree. The transformation rules are passed to these operators in a list. The difference between these two operators is the "/:" operator will only make one pass through the list of local transformation rules it is given while the "/::" operator will continue making passes through the transformation rules until the tree does not change anymore. Lets start with the "/:" operator. The following code replaces all numbers and variables in the expression that is assigned to "tree3" with English equivalents. %mathpiper tree3 := '((1*2 + x)==(3*(4 - 5/x))); tree3 /: [ 1 <- "One", 2 <- "Two", 3 <- "Three", 4 <- "Four", 5 <- "Five", x <- "The Unknown" ]; %/mathpiper %output,preserve="false" Result: "One"*"Two" + "The Unknown" == "Three"*("Four" - "Five"/"The Unknown") . %/output Since the "/:" operator only makes one pass through the local transformation rules it is given, only one of the additions in the following example will be matched and rewritten. %mathpiper tree3 := '(1 + 2 + 3); tree3 /: [ q_Number? + r_Number? <- q+r, ]; %/mathpiper %output,preserve="false" Result: 3 + 3 . %/output The "/::" operator will continue making passes through the transformation rules it is given until the tree stops changing. In the following code this has the effect of fully evaluating the tree. %mathpiper tree3 := '(1 + 2 + 3); tree3 /:: [ q_Number? + r_Number? <- q+r, ]; %/mathpiper %output,preserve="false" Result: 6 . %/output Only five transformation rules are needed to process trees that contain arithmetic operators. Notice that the transformation rule for the division operator contains a post-predicate that prevents the rule from matching if the operator's denominator is 0. %mathpiper tree3 := '(1 + 2/2*4 + 5^6); tree3 /:: [ q_Number? ^ r_Number? <- q^r, (q_Number? / r_Number?)::(r !=? 0) <- q/r, q_Number? * r_Number? <- q*r, q_Number? + r_Number? <- q+r, q_Number? - r_Number? <- q-r, ]; %/mathpiper %output,preserve="false" Result: 15630 . %/output GLOBAL TRANSFORMATION RULES AND THE MATHPIPER RULEBASE Global transformation rules consist of a pattern followed the "<--" global transformation rule operator followed by the tree that will replace the tree part if the pattern matches it: pattern <-- replacement_tree A global transformation rule can also have a post predicate: pattern :: post-predicate <-- replacement_tree MathPiper consists of a large database of global transformation rules called a RULEBASE. At the time it is defined every global transformation rule is placed into a section of the rulebase which is determined by the operator or procedure that is the dominant node in the rule's pattern. For example, all rules related to the "+" operator are grouped together, all rules related to the "*" operator are grouped together, etc. Within a given group, the sequence in which the transformation rules are applied can be specified by an optional precedence number using the "#" precedence operator: precedence # pattern post_predicate <-- replacement_tree If a precedence number is not specified, then the precedence of the transformation rule is set to 0. Rules that have the same precedence can be tried in any order. An example of how global transformation rules are used to handle an operator are the transformation rules for the "+" operator. The following are some of the rules in the rulebase that handle this operator: 50 # x_Number? + y_Number? <-- AddN(x,y); 100 # +x_ <-- x; 100 # 0 + x_ <-- x; 100 # x_ + 0 <-- x; 100 # x_ + n_Constant?*(x_) <-- (n+1)*x; 100 # n_Constant?*(x_) + x_ <-- (n+1)*x; 101 # x_ + - y_ <-- x-y; 101 # x_ + (- y_)/(z_) <-- x-(y/z); 101 # (- y_)/(z_) + x_ <-- x-(y/z); 101 # (- x_) + y_ <-- y-x; 102 # x_ + y_NegativeNumber? <-- x-(-y); 102 # x_ + y_NegativeNumber? * z_ <-- x-((-y)*z); 102 # x_ + (y_NegativeNumber?)/(z_) <-- x-((-y)/z); 102 # (y_NegativeNumber?)/(z_) + x_ <-- x-((-y)/z); 102 # (x_NegativeNumber?) + y_ <-- y-(-x); 150 # n1_ / d_ + n2_ / d_ <-- (n1+n2)/d; 210 # x_Number? + (y_Number? / z_Number?) <--(x*z+y)/z; 210 # (y_Number? / z_Number?) + x_Number? <--(x*z+y)/z; 210 # (x_Number? / v_Number?) + (y_Number? / z_Number?) <--(x*z+y*v)/(v*z); HOW EXPRESSIONS ARE EVALUATED IN MATHPIPER Every time an expression is evaluated in MathPiper, each part in its tree is tested (starting at the root node and working downwards) to determine if any of the global transformation rules in the rulebase match it. If a rule has its pattern match a part of the tree, it replaces this part with another tree. The tree that contains the replaced tree part is then searched again to determine if any of its parts match any of the rules in the rulebase. Further matches result in further replacements, and this process continues until no more rules in the rulebase match any of the parts in the tree. The final unchanging version of the tree is then returned as the result of the evaluation. This process is similar to the way the "/::" operator works except it works on every expression that is evaluated by MathPiper while the "/::" operator only works on the expression that is passed to it.