subtree_substituting_and_pattern

TREE PATTERN MATCHING AND TRANSFORMATION RULES
v.05

Mathematical expression trees that are not able to
be changed are not very useful. Therefore,
computer algebra systems provide ways to replace
parts of a tree such as literal substitution and
pattern matching transformation rules.

In this worksheet, the variables x, y and z
will be turned into symbols.

In> Symbols(x, y, z)

    %output,preserve="false"
      Result: True
.   %/output


ATOMIC EXPRESSION TREES

A tree that only has one node is called an atomic
expression tree and is abbreviated as just "atom".
Examples of atoms are numerals, strings, and
strings that have had their quotes removed from
their ends. The following example shows the tree
for the atom '2'.

%mathpiper

ViewTreeParts(2, Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




The "Atom?" procedure can be used to determine if
a tree is an atom or not. In the following
example, '2' is an atom, and the string "Box" is
an atom. However, the expression 1 + 2 is not atom
because its tree has more than one node.

%mathpiper

Print(Atom?(2));
Print(Atom?("Box"));
Print(Atom?('(1 + 2)));

%/mathpiper

    %output,preserve="false"
      Result: True
      
      Side Effects:
      True
      True
      False
      
.   %/output




The "ToAtom" procedure can be used to remove the
quotes from the ends of strings and the result is
an atom which is not evaluated. For example, the
following code removes the quotes from the string
"Box" and returns the resulting atom 'Box'
unevaluated.

In> ToAtom("Box")

    %output,preserve="false"
      Result: Box
.   %/output

The "ToAtom" procedure is often used to convert
strings into unevaluated operators for the
purposes of substituting and pattern matching.
 



SUBSTITUTING TREES FOR LITERAL PARTS OF A TREE

A capability that is often used in a computer
algebra system is substituting a tree for parts of
a tree that are specified literally. Lets
experiment with substituting parts of an
arithmetic expression tree that contains numerals
and a single variable.

In> tree1 := '((1*2 + x)==(3*(4 - 5/x)))

    %output,preserve="false"
      Result: 1*2 + x == 3*(4 - 5/x)
.   %/output


%mathpiper

ViewTreeParts(tree1, Scale:1.5, FontSize:20, ShowPositions:True);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




The "Substitute" procedure is used to substitute a
tree for parts of a tree that can be specified
literally. The calling format for "Substitute" is:

Substitute(from, to) tree

The argument "from" specifies the parts of the
tree to match, the argument "to" is the
replacement tree, and the argument "tree" is the
original tree that substitutions will be applied
to.

This procedure searches a tree from its root to
its leaves looking for all parts of the tree that
literally match the tree that was given as an
argument. In the following code, '2222' is
substituted for the numeral '2' using
"Substitute". The "Substitute" procedure works
with a copy of the original tree and returns this
copy as its result.

%mathpiper

tree2 := Substitute(2, 2222) tree1;

ViewTreeParts(tree2, Scale:1.5, FontSize:20, NodeHighlight:["1,1,2"]);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




The the tree 5*5 can be substituted for the
subtree 1 * 2.

%mathpiper

tree2 := Substitute('(1*2), '(5*5)) tree1;

ViewTreeParts(tree2, Scale:1.5, FontSize:20, NodeHighlight:["1,1", "1,1,1", "1,1,2"]);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




The operator '-' can be substituted for the
operator '+' at position 1 using the "ToAtom"
procedure.

%mathpiper

tree2 := Substitute('+, '-) tree1;

ViewTreeParts(tree2, Scale:1.5, FontSize:20, NodeHighlight:["1"]);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




The "Substitute" procedure replaces all
occurrences of the part of a tree it is searching
for. In the following example, all occurrences of
the variable 'x' are replaced with the variable
'z'.

%mathpiper

Substitute(x, z) tree1;

%/mathpiper

    %output,preserve="false"
      Result: 1*2 + z == 3*(4 - 5/z)
.   %/output




The "Substitute" procedure can also be used to
substitute parts of code trees. In the following
example, a frozen tree is assigned to the variable
"expression".

%mathpiper

expression := 
''{
    result := [];
    
    index := 14;
    
    While(index <=? 33)
    {
        result := Append(result, index);
        
        index++;
    };
    
    result;
};

ViewTreeParts(expression, Scale:1.5, ShowPositions:True, NodeHighlight:["1,3", "1,3,1"], ArcsAutoHighlight:False, IncludeExpression:False, Code:True);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




In the result of the following substitution,
notice how 'While' has been replaced by 'Until',
and the operator "<=?" had been replaced by the
operator ">?".

%mathpiper

Substitute('While, 'Until) Substitute(ToAtom("<=?"), ToAtom(">?")) expression;

%/mathpiper

    %output,preserve="false"
      Result: 
      ''
      {
        result := [];
      
        index := 14;
      
        Until(index >? 33)
        {
          result := Append(result,index);
      
          index++;
        }
      
        result;
      }
.   %/output




LOCATING PARTS OF TREES USING PATTERN MATCHING

Specifying parts of trees to locate literally is
useful but limited. A much more powerful way to
specify parts of trees to locate is with pattern
matching. Pattern matching also searches a tree
top to bottom like the "Substitute" procedure
does.


LITERAL PATTERNS

The simplest kinds of patterns are literal trees
that work just like the literal trees that the
"Substitute" procedure uses to locate parts of
trees.

In the following examples, a numeral, a variable,
and an operator are each located in a tree using
literal pattern matching. The "Pattern" option of
the "ViewTreeParts" procedure is used to pass the
pattern that this procedure should use when
searching the given tree.

The following code locates all occurrences of the
numeral '3' in the three. In this case there is
just one occurrence.

%mathpiper

ViewTreeParts(tree1, Pattern: 3, Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output



The following example locates the two occurences
of the unknown 'x' that are in the tree.

%mathpiper

ViewTreeParts(tree1, Pattern: x, Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output



Finally, this example locates the two occurrences
of the operator '*' that are in the tree.

%mathpiper

ViewTreeParts(tree1, Pattern: ''*, Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




MATCHING PARTS OF TREES BY THEIR ATTRIBUTES

ATTRIBUTES, PROPERTIES, AND RELATIONS

The real power in pattern matching comes from its
ability to match a part of a tree based on the
part's attributes. Everything that exists,
including parts of trees, have attributes that can
be used to describe it. There are two kinds of
attributes, which are properties and relations.

Examples of properties include color, shape, odd,
and prime. Some properties of the sun are it is
yellow and it is spherical. Some properties of the
number 7 are it is odd and it is prime.

Examples of relations include heaver-than,
hotter-than, less-than, and greater-than. Some
relations of the sun are it is heavier than
Jupiter, and it is hotter than Venus. Some
relations of the number 7 are it is greater-than 3
and it is less-than 9.

In expression trees there are structural relations
between each operator and its operand or operands.
In code trees there are structural relations
between procedures and their arguments.

In logic, the designations of properties and
relations are called predicates. In MathPiper, all
predicates end with a question mark character '?'
so they can be easily identified. For example,
"Even?", "List?", and "GreaterThan?" are
procedures that are predicates and "<?" and ">?"
are operators that are predicates.

Predicates that have a single argument specify
properties. Examples of predicates that specify
properties are "Integer?", "Even?", "Prime?", and
"Atom?".

Predicates that have two or more arguments specify
relations. Examples of predicates that specify
relations are "<?" and ">?".

Structural relations are usually specified using
trees that contain pattern variables, but they can
also be specified with literal trees.



PATTERN VARIABLES

MATCHING TREE PARTS USING PROPERTIES

The way MathPiper enables parts of a tree to be
specified using properties is with PATTERN
VARIABLES. A pattern variable is defined by using
any underscore constant that DOES NOT BEGIN with
an underscore character. Examples of pattern
variables include x_ , n_, var_, and num_.

Underscore constants that BEGIN with an
underscore character are simply constants that
have no additional meaning associated with them.
So, _x, _n, _var, and _num are constants that are
not pattern variables.

A pattern variable that ends with an underscore
will match any part of a tree regardless of the
part's properties, so n_ will match any tree part.
A pattern variable that matches a tree part using
one or more properties can be created by appending
the names of one or more single argument
predicates to it. For example, n_Atom? matches all
parts of a tree that have the property of being an
atom.

%mathpiper

ViewTreeParts(tree1, Pattern:(n_Atom?), ArcsAutoHighlight:False, Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




The pattern variable n_Integer? matches all nodes
in a tree that have the property of being an
integer.

%mathpiper

ViewTreeParts(tree1, Pattern:(n_Integer?), ArcsAutoHighlight:False, Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




The pattern variable n_Even?_Integer? matches all
nodes in a tree that have the property of being
even and the property of being an integer. Notice
that multiple predicates in a pattern variable are
separated by underscore characters.

%mathpiper

ViewTreeParts(tree1, Pattern:(n_Even?_Integer?), ArcsAutoHighlight:False, Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




MATCHING TREE PARTS USING STRUCTURAL RELATIONS

The way MathPiper enables parts of a tree to be
specified using structural relations is with
operators and their operands. For example, the
structural relation "(1 * 2)" can be matched using
the literal pattern '(1 * 2).

%mathpiper

ViewTreeParts(tree1, Pattern:''(1 * 2), Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




All multiplication structural relations in the
tree can be matched by using pattern variables for
the multiplication operator's operands. Both of
the pattern variables that are used in the
following example match any subtrees that may be
used as operands of any multiplication operator in
the tree.

%mathpiper

ViewTreeParts(tree1, Pattern:(n_ * m_), Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




Patterns that have the ability to match based on
both structural relations and properties of tree
parts can be created by combining operators with
pattern variables that have predicates associated
with them. In the following code only
multiplication operations that have second
operands that are integers are matched.

%mathpiper

ViewTreeParts(tree1, Pattern:(n_ * m_Integer?), Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output





POST PREDICATES

When a pattern matches a part of a tree, all of
the pattern variables in the pattern have the part
of the tree they matched assigned to them as a
value. The pattern then provides an optional
method of determining if the part of the tree that
the pattern matched has additional properties or
relations that could not be specified in the
pattern due to lack of space. This additional
matching method is called a POST PREDICATE, and it
is created using the "::" POST PREDICATE OPERATOR.

The post-predicate operator uses the pattern as
its first operand, and its second operand can
contain any code that returns True or False when
it is evaluated. 

<pattern>::<post predicate code>

Post predicate code is typically used to check the
matched tree parts (that were assigned to the
pattern variables) for additional properties and
relations. However, all of the property checking
that is done in the pattern could be moved into
the post-predicate if desired.

In the following code, the "Echo" procedure is
used in the post-predicate to print each value
that is assigned to the predicate variable n_ as
this variable matches various parts of the tree.
The "Echo" procedure can be used by itself here
because it always returns "True".

%mathpiper

ViewTreeParts(tree1, Pattern:( (n_Even?_Integer?)::Echo(n) ), Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
      
      Side Effects:
      2 
      4 
      
.   %/output




The following code contains a pattern with a post
predicate that only matches even integers that are
greater than 3.

%mathpiper

ViewTreeParts(tree1, Pattern:( (n_Even?_Integer?)::(n >? 3) ), Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
.   %/output




Sometimes it is useful to place one or more "Echo"
procedures into a post predicate without having
them affect the post-predicate checking code. This
can be done by using a code sequence.

%mathpiper

ViewTreeParts(tree1, Pattern:( (n_Even?_Integer?)::{Echo(n); n >? 3;} ), Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
      Result: 
      
      Side Effects:
      2 
      4 
      
.   %/output




TRANSFORMATION RULE = PATTERN MATCHING + TREE PART REPLACEMENT

In previous sections we saw that locating parts of
trees using pattern matching is more powerful than
locating them literally like the "Substitute"
procedure does. However, locating a part of a tree
using pattern matching would not be very useful
without the ability to also replace the matched
part with another tree. Transformation rules
combine the power of pattern matching with the
ability to replace the matched tree part with
another tree.

MathPiper contains two kinds of transformation
rules which are global transformation rules and
local transformation rules. Local transformation
rules will be covered first.




LOCAL TRANSFORMATION RULES

Local transformation rules consist of a pattern
followed the "<-" local transformation rule
operator followed by the tree that will replace
the tree part if the pattern matches it:

pattern <- replacement_tree

A local transformation rule can also have a post
predicate:

pattern :: post_predicate <- replacement_tree

Either the "/:" single-pass local transformation
rule operator or the "/::" multiple-pass local
transformation rule operator can be used to apply
one or more local transformation rules to a tree.
The transformation rules are passed to these
operators in a list. 

The difference between these two operators is the
"/:" operator will only make one pass through the
list of local transformation rules it is given
while the "/::" operator will continue making
passes through the transformation rules until the
tree does not change anymore.

Lets start with the "/:" operator. The following
code replaces all numbers and variables in the
expression that is assigned to "tree3" with
English equivalents.

%mathpiper

tree3 := '((1*2 + x)==(3*(4 - 5/x)));

tree3 /:
   [
      1 <- "One",
      2 <- "Two",
      3 <- "Three",
      4 <- "Four",
      5 <- "Five",
      x <- "The Unknown"
   ];
   
%/mathpiper

    %output,preserve="false"
      Result: "One"*"Two" + "The Unknown" == "Three"*("Four" - "Five"/"The Unknown")
.   %/output




Since the "/:" operator only makes one pass
through the local transformation rules it is
given, only one of the additions in the following
example will be matched and rewritten.

%mathpiper

tree3 := '(1 + 2 + 3);

tree3 /:
   [
     q_Number? + r_Number? <- q+r,
   ];
   
%/mathpiper

    %output,preserve="false"
      Result: 3 + 3
.   %/output




The "/::" operator will continue making passes
through the transformation rules it is given until
the tree stops changing. In the following code
this has the effect of fully evaluating the tree.

%mathpiper

tree3 := '(1 + 2 + 3);

tree3 /::
   [
     q_Number? + r_Number? <- q+r,
   ];
   
%/mathpiper

    %output,preserve="false"
      Result: 6
.   %/output




Only five transformation rules are needed to
process trees that contain arithmetic operators.
Notice that the transformation rule for the
division operator contains a post-predicate that
prevents the rule from matching if the operator's
denominator is 0.

%mathpiper

tree3 := '(1 + 2/2*4 + 5^6);

tree3 /::
   [
     q_Number? ^ r_Number? <- q^r,
     (q_Number? / r_Number?)::(r !=? 0) <- q/r,
     q_Number? * r_Number? <- q*r,
     q_Number? + r_Number? <- q+r,
     q_Number? - r_Number? <- q-r,
   ];
   
%/mathpiper

    %output,preserve="false"
      Result: 15630
.   %/output




GLOBAL TRANSFORMATION RULES AND THE MATHPIPER RULEBASE

Global transformation rules consist of a pattern
followed the "<--" global transformation rule
operator followed by the tree that will replace
the tree part if the pattern matches it:

pattern <-- replacement_tree

A global transformation rule can also have a post
predicate:

pattern :: post-predicate <-- replacement_tree

MathPiper consists of a large database of global
transformation rules called a RULEBASE. At the
time it is defined every global transformation
rule is placed into a section of the rulebase
which is determined by the operator or procedure
that is the dominant node in the rule's pattern.
For example, all rules related to the "+" operator
are grouped together, all rules related to the "*"
operator are grouped together, etc.

Within a given group, the sequence in which the
transformation rules are applied can be specified
by an optional precedence number using the "#"
precedence operator:

precedence # pattern post_predicate <-- replacement_tree

If a precedence number is not specified, then the
precedence of the transformation rule is set to 0.
Rules that have the same precedence can be tried
in any order.

An example of how global transformation rules are
used to handle an operator are the transformation
rules for the "+" operator. The following are some
of the rules in the rulebase that handle this
operator:

50 # x_Number? + y_Number? <-- AddN(x,y);
100 # +x_ <-- x;
100 # 0 + x_ <-- x;
100 # x_ + 0 <-- x;
100 # x_ + n_Constant?*(x_) <-- (n+1)*x;
100 # n_Constant?*(x_) + x_ <-- (n+1)*x;
101 # x_ + - y_ <-- x-y;
101 # x_ + (- y_)/(z_) <-- x-(y/z);
101 # (- y_)/(z_) + x_  <-- x-(y/z);
101 # (- x_) + y_ <-- y-x;
102 # x_ + y_NegativeNumber? <-- x-(-y);
102 # x_ + y_NegativeNumber? * z_ <-- x-((-y)*z);
102 # x_ + (y_NegativeNumber?)/(z_) <-- x-((-y)/z);
102 # (y_NegativeNumber?)/(z_) + x_ <-- x-((-y)/z);
102 # (x_NegativeNumber?) + y_ <-- y-(-x);
150 # n1_ / d_ + n2_ / d_ <-- (n1+n2)/d;
210 # x_Number? + (y_Number? / z_Number?) <--(x*z+y)/z;
210 # (y_Number? / z_Number?) + x_Number? <--(x*z+y)/z;
210 # (x_Number? / v_Number?) + (y_Number? / z_Number?) <--(x*z+y*v)/(v*z);




HOW EXPRESSIONS ARE EVALUATED IN MATHPIPER

Every time an expression is evaluated in
MathPiper, each part in its tree is tested
(starting at the root node and working downwards)
to determine if any of the global transformation
rules in the rulebase match it. If a rule has its
pattern match a part of the tree, it replaces this
part with another tree.

The tree that contains the replaced tree part is
then searched again to determine if any of its
parts match any of the rules in the rulebase.
Further matches result in further replacements,
and this process continues until no more rules in
the rulebase match any of the parts in the tree.
The final unchanging version of the tree is then
returned as the result of the evaluation. This
process is similar to the way the "/::" operator
works except it works on every expression that is
evaluated by MathPiper while the "/::" operator
only works on the expression that is passed to it.