THE HIDDEN RULES MATHEMATICIANS USE TO SOLVE ELEMENTARY ALGEBRA EQUATIONS
v.05

MATHEMATICIANS DON'T KNOW HOW THEY DO MATH

In the early 1970s, Artificial Intelligence (AI)
researchers at the University of Edinburgh, led by
Dr. Alan Bundy, started a research project named
PRESS (PRolog Equation Solving System) that
involved determining how computers can be "taught"
to solve elementary algebra equations like humans
do. The first thing his research group did was to
try to determine how mathematicians solve
equations.

The group discovered that a significant number of
the rules that mathematicians used to solve
equations were not written down anywhere. They
were not in any textbooks, nor were they in any
journals or research papers. As the researchers
dug deeper, they also discovered the rules did not
have names, and they were not taught explicitly.
The researchers concluded that the mathematicians
were using these rules unconsciously.

These discoveries lead to the question: If
mathematicians don't consciously know a
significant number of the rules they use to solve
equations, how do they teach students how to solve
equations? I think the way persistent students
"learn" these unwritten rules is by solving an
enormous number of equations over many years.


WHAT STUDENTS ARE BEING TAUGHT IS NOT MATHEMATICS

Scott Gray, the creator of the O'Reilly School of
Technology, discovered that what is taught in the
American system of teaching mathematics is not
mathematics. Here is his description of how he
made this discovery while teaching mathematics at
The Ohio State University:

"I [...] started giving my students oral exams
after their written exams. This was a long and
arduous process, because I had to schedule an hour
for testing for each student. However, I
discovered something that depressed the hell out
of me. None of my students knew what they were
talking about. Even students who got perfect
scores on my written exams didn't really
understand what it was that they were doing.

It became clear that students were simply
emulating calculation techniques, without
understanding where those techniques came from, or
how to create them themselves. Then it became
clear to me why my review sessions [which were
offered outside of class at night] were so
popular. In those sessions, students would ask me
to solve every type of problem they could find in
the text book. Even though I'd have them try the
problems before showing them the solution, they
were really preparing a decision matrix for a
matching game. If the problem was like this, then
they would do this; if it was like that then
they'd do that, and so on. I also realized that
the problems I was asking them to do, were
designed with this system in mind. It seemed that
most of the calculation techniques were designed
to help students pass tests, but DID NOT
ILLUMINATE THE TRUE NATURE OF THE MATHEMATICAL
STRUCTURES. [emphasis mine]

In the American system of teaching mathematics, we
are actually teaching algorithms for getting
answers from synthetically designed problems, but
not teaching students the art and science of
mathematics. Sure, some students get through
school, and become great mathematicians despite
this system, but we are losing most students
through attrition."
http://blog.oreillyschool.com/2010/05/the-story-of-the-oreilly-school-of-technology---part-2.html


DISCOVERING THE HIDDEN RULES MATHEMATICIANS
USE TO DO MATH

Why were AI researchers the first group in history
to discover that mathematicians don't know how
they do math? I think it's because computers were
the first "students" in history that absolutely
refused to learn any mathematics that was not
taught explicitly. The researchers then spent
years discovering and naming the unwritten rules
that mathematicians used to solve elementary
algebra equations. When they "taught" computers
these rules by encoding them into a program named
PRESS, the computers were able to solve these
equations like a human typically would.


AN EXPLANATION OF THE MYSTERIOUS HIDDEN RULES

The rest of this worksheet explains the
fundamentals of the mysterious hidden rules
mathematicians most likely use to solve elementary
algebra equations.


THE STRUCTURE AND SEMANTICS OF MATHEMATICAL EXPRESSIONS

All mathematical expressions (hereafter referred to
as just expressions) have a STRUCTURE which can be
unambiguously represented by an expression tree.

SEMANTICS refers to the meaning of an expression
when it is evaluated. However, before an
expression can be evaluated, a DOMAIN OF DISCOURSE
(which is also called a WORLD or a UNIVERSE) must
be specified from which the values will be
obtained. The VALUES in a domain of discourse are
also called OBJECTS. In elementary algebra, the
domain of discourse is the set of real numbers,
and each real number is an object in this domain.
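
The difference between structure and semantics can
be sketched in Python. This is only an
illustration, not how MathPiper represents
expressions: the nested-tuple encoding, the OPS
table, and the evaluate function are assumptions
made for this sketch. The nesting of the tuples
plays the role of the expression tree, and
evaluate gives the tree a meaning by taking values
from the real numbers.

```python
import operator

# Assumed encoding: an operator node is (operator, left, right),
# a string is a variable leaf, and a number is a constant leaf.
OPS = {"+": operator.add, "*": operator.mul}

def evaluate(tree, env):
    """Evaluate an expression tree, taking variable values from env."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]
    return tree

# Structure of 2*x + 3: the '+' node is the root of the tree.
tree = ("+", ("*", 2, "x"), 3)

# Semantics: the value of the tree when x is the real number 4.0.
value = evaluate(tree, {"x": 4.0})
```

The same tree can be evaluated with a different
value for x, which shows that the structure is
fixed while the meaning depends on the objects
chosen from the domain of discourse.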


LAWS OF ELEMENTARY ALGEBRA

All elementary algebra books state laws of
elementary algebra for real numbers that specify
the characteristics of operator symbols such as
'+' and '*'. Some of these laws are as follows:

For every q, r, s ∈ ℝ

q + r = r + q
(commutative law of addition)

q + (r + s) = (q + r) + s
(associative law of addition)

q*r = r*q
(commutative law of multiplication)

q*(r*s) = (q*r)*s
(associative law of multiplication)

q*(r + s) = q*r + q*s
(r + s)*q = r*q + s*q
(distributive law)

q + 0 = q
0 + q = q
(additive identity)

q * 1 = q
1 * q = q
(multiplicative identity)


DEFINITIONS OF ELEMENTARY ALGEBRA

DEFINITIONS define new operators in terms of
operators that have been specified by laws or
defined by other definitions. The main purpose of
definitions is to provide a shorthand way to
represent more complex expressions. Any defined
operator can be UNDEFINED, which means it can be
expanded back into the more fundamental operators
and definitions that it is shorthand for.

Some examples of defined operators are the unary
minus operator, the binary minus operator, and the
division operator. The following are definitions
for these operators:

For every q, r ∈ ℝ

-q := -1 * q
(definition of the unary minus operator)

q - r := q + (-r)
(definition of the binary minus operator)

q/r := q * (1/r) for r != 0
(definition of the division operator)
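
As a rough illustration of what it means to
undefine an operator, the following Python sketch
expands the three defined operators above. The
tuple encoding and the node names "neg" (unary
minus) and "recip" (standing for 1/r) are
hypothetical names chosen for this sketch, and are
not taken from MathPiper.

```python
def undefine(tree):
    """Expand unary minus, binary minus, and division in a tree."""
    if not isinstance(tree, tuple):
        return tree                       # a leaf has nothing to expand
    op, *args = tree
    args = [undefine(arg) for arg in args]
    if op == "neg":                       # -q := -1 * q
        return ("*", -1, args[0])
    if op == "-":                         # q - r := q + (-r)
        return ("+", args[0], ("*", -1, args[1]))
    if op == "/":                         # q/r := q * (1/r)
        return ("*", args[0], ("recip", args[1]))
    return (op, *args)

# a - b/c becomes a + -1*(b * (1/c))
expanded = undefine(("-", "a", ("/", "b", "c")))
```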


RULES OF ELEMENTARY ALGEBRA

Laws and definitions state truths about the domain
of discourse. However, they cannot be used
directly in problem solving processes. It is
rules, not laws, that are the tools used during
problem solving.

All elementary algebra books state laws and
definitions of elementary algebra. However, none
of them explicitly states all of the rules of
algebra. The rules of elementary algebra are
classified into two types: object-level rules and
meta-level rules.


OBJECT-LEVEL RULES TRANSFORM AN EXPRESSION

Expressions that refer to objects in the domain of
discourse are called OBJECT-LEVEL expressions.
Each of the above laws of algebra is an
object-level expression which gives rise to two
object-level rules, one that results from reading
the law LEFT-TO-RIGHT, and one that results from
reading the law RIGHT-TO-LEFT.

Both of a law's rules can be used to transform an
equation into an equivalent equation that has the
same meaning. For example, the left-to-right
reading of the commutative law of addition
produces a rule that states: any subtree in an
expression that matches the pattern q + r which is
on the left side of the law's '=' sign can be
replaced by the pattern r + q which is on the
right side of this sign without changing the
meaning of the expression.

Conversely, the right-to-left reading of the
commutative law of addition produces a rule that
states: any subtree in an expression that matches
the pattern r + q which is on the right side of
the law's '=' sign can be replaced by the
pattern q + r which is on the left side of this
sign without changing the meaning of the
expression.
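
The two readings of a law can be sketched in
Python as a pair of (pattern, replacement) rules.
This is a minimal sketch under stated assumptions:
the tuple encoding, the convention that names
ending in "_" are pattern variables, and the
functions match, substitute, rules_from_law, and
apply_at_root are all inventions of this sketch.
For brevity a rule is only applied at the root of
a tree, not at an arbitrary subtree.

```python
def match(pattern, tree, bindings):
    """Return bindings that make pattern equal to tree, or None."""
    if isinstance(pattern, str) and pattern.endswith("_"):
        if pattern in bindings:
            return bindings if bindings[pattern] == tree else None
        return {**bindings, pattern: tree}
    if (isinstance(pattern, tuple) and isinstance(tree, tuple)
            and len(pattern) == len(tree)):
        for sub_pattern, sub_tree in zip(pattern, tree):
            bindings = match(sub_pattern, sub_tree, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == tree else None

def substitute(template, bindings):
    """Fill the pattern variables of template with their bound subtrees."""
    if isinstance(template, tuple):
        return tuple(substitute(part, bindings) for part in template)
    return bindings.get(template, template)

def rules_from_law(lhs, rhs):
    """A law lhs = rhs yields a left-to-right and a right-to-left rule."""
    return (lhs, rhs), (rhs, lhs)

def apply_at_root(rule, tree):
    """Rewrite tree with rule if its pattern matches, else return tree."""
    pattern, template = rule
    bindings = match(pattern, tree, {})
    return tree if bindings is None else substitute(template, bindings)

# q + r = r + q
comm_l2r, comm_r2l = rules_from_law(("+", "q_", "r_"), ("+", "r_", "q_"))
# q*(r + s) = q*r + q*s
dist_l2r, dist_r2l = rules_from_law(
    ("*", "q_", ("+", "r_", "s_")),
    ("+", ("*", "q_", "r_"), ("*", "q_", "s_")))

swapped = apply_at_root(comm_l2r, ("+", "a", "b"))
expanded = apply_at_root(dist_l2r, ("*", "a", ("+", "x", 1)))
factored = apply_at_root(dist_r2l, expanded)
```

The distributive law makes the two readings easy
to tell apart: read left-to-right it expands a
product, and read right-to-left it factors a sum.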

The following example shows the eight rules that
the identity laws listed above give rise to:

In> RulesByType("Type", "TEXTBOOK")

    %output,preserve="false"
      Result: True
      
      Side Effects:
      q_*1 <- q
      q_ + 0 <- q
      0 + q_ <- q
      1*q_ <- q
      q_ <- q*1
      q_ <- q + 0
      q_ <- 0 + q
      q_ <- 1*q
.   %/output




AN EQUATION CAN'T BE SOLVED BY RANDOMLY APPLYING
OBJECT-LEVEL RULES TO IT

Object-level rules are used when solving an
equation. However, much of the knowledge of how to
solve elementary algebra equations involves
knowing which object-level rule to apply to an
equation, and which subtree to apply it to, at
each step.

One might think that an equation could be solved
by randomly applying object-level rules to it
until a solution is reached. However, the number
of trees that an equation can be transformed into
is so large that the probability of arriving at a
tree that represents a solution to the equation is
very low.
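
To get a feel for how quickly the number of trees
grows, the following Python sketch counts the
equations that can be reached from
-20 == x*-4 - x*6 using only the eight identity
rules. It is a rough illustration under stated
assumptions: the tuple encoding and all of the
function names are invented for this sketch, and a
real rule set would contain many more rules than
these eight.

```python
def rewrites_at_root(tree):
    """All results of applying one identity rule to the root of tree."""
    # Rules that introduce an identity element always match.
    results = [("+", tree, 0), ("+", 0, tree), ("*", tree, 1), ("*", 1, tree)]
    if isinstance(tree, tuple):
        op, left, right = tree
        # Rules that remove an identity element match only some trees.
        if op == "+" and right == 0:
            results.append(left)
        if op == "+" and left == 0:
            results.append(right)
        if op == "*" and right == 1:
            results.append(left)
        if op == "*" and left == 1:
            results.append(right)
    return results

def successors(tree):
    """All trees made by applying one identity rule to one subtree."""
    found = set(rewrites_at_root(tree))
    if isinstance(tree, tuple):
        op, left, right = tree
        found |= {(op, new, right) for new in successors(left)}
        found |= {(op, left, new) for new in successors(right)}
    return found

def equation_successors(equation):
    """Apply one rule to one subtree on either side of the equation."""
    _, lhs, rhs = equation
    return ({("==", new, rhs) for new in successors(lhs)}
            | {("==", lhs, new) for new in successors(rhs)})

def reachable(equation, steps):
    """All equations reachable in at most the given number of steps."""
    seen = {equation}
    frontier = {equation}
    for _ in range(steps):
        frontier = {nxt for cur in frontier
                    for nxt in equation_successors(cur)} - seen
        seen |= frontier
    return seen

EQUATION = ("==", -20, ("-", ("*", "x", -4), ("*", "x", 6)))
one_step = len(equation_successors(EQUATION))   # 8 subtrees * 4 rules = 32
two_steps = len(reachable(EQUATION, 2))
```

Each of the 8 subtrees in the starting equation
can have an identity element introduced in 4
ways, so a single step already produces 32
different equations, and each of those has more
subtrees than the equation it came from.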

In the following example, the identity rules that
were introduced earlier are randomly applied five
times to an equation to show the strange equations
that usually result from using object-level rules
without guidance:

%mathpiper
Show(StepsView(
        SolveSteps(
            '(-20 == _x*-4-_x*6), 
            _x, 
            MaximumSteps:Infinity, 
            ControlRandom:True, 
            StepsToTake:5), 
        ShowTree:True, 
        ShowPositions:False,
        ShowRuleName:False,
        ShowRule:True,
        Scale:1.2), returnContent:True);
%/mathpiper

    %output,preserve="false"
Result: 

.   %/output




THE META-LEVEL OF ELEMENTARY ALGEBRA

In elementary algebra, object-level expressions
have the real numbers as their domain of
discourse. However, when humans solve elementary
algebra equations, the domain of discourse they
use consists of object-level elementary algebra
expression trees, not real numbers. Any language
that refers to object-level elementary algebra
expression trees is part of what is called the the
META-LEVEL of elementary algebra.

The AI researchers discussed earlier discovered
that when humans solve equations, much of the
reasoning they do is done unconsciously at the
meta-level of elementary algebra. Features of an
equation's expression tree that they unconsciously
notice include:

- The number of copies of the unknown in the tree.
- The length of the path between two copies of the
unknown.
- Which object-level rules match a given subtree
in the tree.
- The structural effect each object-level rule
produces in a tree.
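
The first two features in the list above can be
computed directly from a tree. The following
Python sketch does so for the example equation
a*x + b*(c*x) = 0. The tuple encoding and the
functions positions and path_length are
assumptions of this sketch. A position is written
as the sequence of child numbers followed from the
root, where 1 is the left child and 2 is the right
child.

```python
def positions(tree, unknown, here=()):
    """Return the position of every copy of the unknown in tree."""
    if tree == unknown:
        return [here]
    found = []
    if isinstance(tree, tuple):
        # tree[0] is the operator, so children are numbered from 1.
        for index, child in enumerate(tree[1:], start=1):
            found += positions(child, unknown, here + (index,))
    return found

def path_length(first, second):
    """Number of edges on the tree path between two positions."""
    common = 0
    while (common < min(len(first), len(second))
           and first[common] == second[common]):
        common += 1
    return (len(first) - common) + (len(second) - common)

# a*x + b*(c*x) == 0
equation = ("==", ("+", ("*", "a", "x"), ("*", "b", ("*", "c", "x"))), 0)

copies = positions(equation, "x")       # [(1, 1, 2), (1, 2, 2, 2)]
number_of_copies = len(copies)          # 2
distance = path_length(*copies)         # 5
```

The path length is found by dropping the part of
the two positions that they share and then
counting the remaining steps, because the path
between two nodes in a tree goes up to their
closest common ancestor and back down.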


THE STRUCTURAL EFFECTS OBJECT-LEVEL RULES PRODUCE

Each object-level rule produces a structural
effect on a tree that is useful for solving an
equation. Grouped by structural effect, the rules
fall into three categories: 1) OTHER-SIDE rules,
2) ELIMINATE-UNKNOWN rules, and 3) CLOSER-UNKNOWNS
rules.


UNBURYING THE UNKNOWN WITH OTHER-SIDE RULES

OTHER-SIDE rules remove the dominant operator node
that is on the left side of an equation by
placing its inverse operation on the right side of
the equation. The AI researchers called these
rules "isolation" rules. The following are some of
the other-side rules for elementary algebra:

In> RulesByType("StructuralEffect", "OTHERSIDE")

    %output,preserve="false"
      Result: True
      
      Side Effects:
      q_/r_ = rhs_ :: UnknownIn?(q) <- q = rhs*r
      q_/r_ = rhs_ :: UnknownIn?(r) <- rhs*r = q followed by r = q/rhs
      q_^r_ = rhs_ :: UnknownIn?(q) <- q = rhs^(1/r)
      q_ - r_ = rhs_ :: UnknownIn?(q) <- q = rhs + r
      q_ - r_ = rhs_ :: UnknownIn?(r) <- -r = rhs - q
      q_ + r_ = rhs_ :: UnknownIn?(q) <- q = rhs - r
      q_ + r_ = rhs_ :: UnknownIn?(r) <- r = rhs - q
      q_*r_ = rhs_ :: UnknownIn?(q) <- q = rhs/r
      q_*r_ = rhs_ :: UnknownIn?(r) <- r = rhs/q
      -lhs_ = rhs_ :: The unknown is in lhs. <- lhs = -rhs
.   %/output




The other-side rules work by: 1) locating the
dominant operator which is on the left side of the
equation, 2) determining which of its operand
subtrees contains the unknown, and then 3)
applying the operator's inverse operation, with
the operand subtree that does not contain the
unknown as its operand, to both sides of the
equation. The net effect is that the operand
subtree that does not contain the unknown is moved
to the other side of the equation.

The following example shows an other-side
rule being used to move "+ b" to the other side of
an equation:

%mathpiper

Show(StepsView(
        SolveSteps(
            _a + _b == _c, 
            _a, 
            MaximumSteps:Infinity), 
        ShowTree:True,
        ShowPositions:False,
        ShowRuleName:False,
        ShowRule:True,
        Scale:1.2), returnContent:True);

%/mathpiper

    %output,preserve="false"
Result: 

.   %/output




Each time an other-side rule is applied to the left
side of an equation, it brings the unknown closer
to the top of the subtree it is in. Repeatedly
applying other-side rules to this subtree will
eventually bring the unknown to the top of the
tree, and this is called UNBURYING the unknown.
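
The unburying process can be sketched in Python as
a loop that keeps applying an other-side rule
until the unknown is alone on the left side. This
is a minimal sketch, assuming the left side
contains exactly one copy of the unknown. The
tuple encoding and the functions contains,
other_side, and unbury are inventions of this
sketch, and only the one-step rules for '+', '-',
'*', and '/' are included.

```python
def contains(tree, unknown):
    """True if the unknown occurs anywhere in tree."""
    if tree == unknown:
        return True
    return (isinstance(tree, tuple)
            and any(contains(child, unknown) for child in tree[1:]))

def other_side(lhs, rhs, unknown):
    """Remove the dominant operator of lhs by inverting it on rhs."""
    op, q, r = lhs
    if contains(q, unknown):
        # q + r = rhs -> q = rhs - r, and likewise for '-', '*', '/'.
        inverse = {"+": "-", "-": "+", "*": "/", "/": "*"}[op]
        return q, (inverse, rhs, r)
    if op in ("+", "*"):
        # q + r = rhs -> r = rhs - q, and q*r = rhs -> r = rhs/q.
        inverse = {"+": "-", "*": "/"}[op]
        return r, (inverse, rhs, q)
    raise NotImplementedError("two-step rules for '-' and '/' are omitted")

def unbury(lhs, rhs, unknown):
    """Apply other-side rules until the unknown is the whole left side."""
    while lhs != unknown:
        lhs, rhs = other_side(lhs, rhs, unknown)
    return lhs, rhs

# a + b*c/d == 0, solved for b
lhs = ("+", "a", ("/", ("*", "b", "c"), "d"))
solution = unbury(lhs, 0, "b")
```

Here three other-side rules are applied in turn,
removing '+', then '/', then '*', which gives
b = ((0 - a)*d)/c.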

The following example shows other-side rules being
used to unbury the unknown "b":

%mathpiper

Show(StepsView(
        SolveSteps(
            _a + _b * _c / _d == 0, 
            _b,
            MaximumSteps:Infinity, UnburySimple:True), 
        ShowTree:True, 
        ShowPositions:False,
        ShowRuleName:False,
        ShowRule:True,
        Scale:1.2), returnContent:True);

%/mathpiper

    %output,preserve="false"
Result: 

.   %/output




ELIMINATING A COPY OF THE UNKNOWN USING ELIMINATE-UNKNOWN RULES

The other-side rules only work if an equation
contains a SINGLE COPY of the unknown. If an
equation contains two or more copies of the
unknown, then copies of the unknown need to be
eliminated from the tree until only a single copy
of the unknown remains.

Some of the rules of algebra have the structural
effect of eliminating one copy of the unknown from
the subtree they match. The AI researchers called
these rules "collection" rules, but here they are
called ELIMINATE-UNKNOWN rules.

Eliminate-unknown rules are applied to the left side of
an equation until only a single copy of the
unknown remains. Then the unknown can be unburied.
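
As an illustration, the following Python sketch
implements four eliminate-unknown rules and checks
that applying one reduces the number of copies of
the unknown from two to one. The tuple encoding
and the functions count_copies and
eliminate_unknown are assumptions of this sketch,
and the rules are only tried at the root of the
tree.

```python
def count_copies(tree, unknown):
    """Number of copies of the unknown in tree."""
    if tree == unknown:
        return 1
    if isinstance(tree, tuple):
        return sum(count_copies(child, unknown) for child in tree[1:])
    return 0

def is_product_with(tree, unknown):
    """True if tree has the form r*unknown."""
    return isinstance(tree, tuple) and tree[0] == "*" and tree[2] == unknown

def eliminate_unknown(tree, unknown):
    """Apply one eliminate-unknown rule at the root, or return None."""
    if not isinstance(tree, tuple):
        return None
    op, left, right = tree
    if op == "+" and left == unknown and right == unknown:
        return ("*", 2, unknown)                       # q + q -> 2*q
    if op == "*" and left == unknown and right == unknown:
        return ("^", unknown, 2)                       # q*q -> q^2
    if op == "+" and is_product_with(left, unknown):
        if right == unknown:                           # r*q + q -> (r + 1)*q
            return ("*", ("+", left[1], 1), unknown)
        if is_product_with(right, unknown):            # q*s + r*s -> (q + r)*s
            return ("*", ("+", left[1], right[1]), unknown)
    return None

before = ("+", ("*", "a", "c"), ("*", "b", "c"))       # a*c + b*c
after = eliminate_unknown(before, "c")                 # (a + b)*c
```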

The following are some of the eliminate-unknown
rules for elementary algebra:

In> RulesByType("StructuralEffect", "ELIMINATE?UNKNOWN")

    %output,preserve="false"
      Result: True
      
      Side Effects:
      q_*s_ + r_*s_ :: Unknown?(s) <- (q + r)*s
      r_*q_ + q_ :: Unknown?(q) <- (r + 1)*q
      q_ + q_ :: Unknown?(q) <- 2*q
      q_*q_ :: Unknown?(q) <- q^2
.   %/output




The following example shows an eliminate-unknown
rule being used to eliminate one copy of the
unknown "c":

%mathpiper

Show(StepsView(
        SolveSteps(
            _a * _c + _b * _c == _d, 
            _c, 
            MaximumSteps:1), 
        ShowTree:True,
        ShowPositions:False,
        ShowRuleName:False,
        ShowRule:True,
        Scale:1.2), returnContent:True);

%/mathpiper

    %output,preserve="false"
Result: 

.   %/output




UNKNOWNS THAT ARE TOO FAR APART TO ELIMINATE

Sometimes the length of the path between two
copies of the unknown is too long for any
eliminate-unknown rule to be applied. For example,
the length of the path between the two copies of
the unknown in the equation a*x + b*(c*x) = 0 is
5, and the longest path any of the above
eliminate-unknown rules will match is 4:

%mathpiper

ViewTreeParts(_a*_x + _b*(_c*_x) == _0, 
                Path:["1,1,2", "1,2,2,2"], 
                PathNumbers:False,
                Scale:1.5, FontSize:20);

%/mathpiper

    %output,preserve="false"
Result: 

.   %/output




BRINGING COPIES OF THE UNKNOWN CLOSER TOGETHER

Some of the rules of algebra have the structural
effect of bringing two copies of the unknown
closer together along a path. The AI researchers
called these rules "attraction" rules, but here
they are called CLOSER-UNKNOWNS rules.

If an equation has more than one copy of the
unknown, but they are too far apart for any of the
eliminate-unknown rules to be used, the
closer-unknowns rules are tried. If one of the
closer-unknowns rules matches a subtree, it is
applied and then the eliminate-unknown rules are
tried again. This process is continued until all
copies of the unknown are close enough for the
eliminate-unknown rules to match.
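
The process described above can be sketched in
Python with one closer-unknowns rule and one
eliminate-unknown rule. This is a simplified
sketch, not the procedure MathPiper uses: the
tuple encoding and the functions closer_unknowns,
eliminate, and collect are assumptions of this
sketch, and the rules are only tried on a sum and
its two operands.

```python
def closer_unknowns(tree, unknown):
    """q*(r*s) -> (q*r)*s when s is the unknown, else None."""
    if (isinstance(tree, tuple) and tree[0] == "*"
            and isinstance(tree[2], tuple) and tree[2][0] == "*"
            and tree[2][2] == unknown):
        q, (_, r, s) = tree[1], tree[2]
        return ("*", ("*", q, r), s)
    return None

def eliminate(tree, unknown):
    """q*s + r*s -> (q + r)*s when s is the unknown, else None."""
    if (isinstance(tree, tuple) and tree[0] == "+"
            and all(isinstance(part, tuple) and part[0] == "*"
                    and part[2] == unknown for part in tree[1:])):
        return ("*", ("+", tree[1][1], tree[2][1]), unknown)
    return None

def collect(tree, unknown):
    """Try to eliminate; if that fails, bring the copies closer and retry."""
    while True:
        result = eliminate(tree, unknown)
        if result is not None:
            return result
        if not (isinstance(tree, tuple) and tree[0] == "+"):
            return None
        op, left, right = tree
        new_left = closer_unknowns(left, unknown)
        new_right = closer_unknowns(right, unknown)
        if new_left is None and new_right is None:
            return None                  # no rule matches, so give up
        tree = (op,
                left if new_left is None else new_left,
                right if new_right is None else new_right)

# a*x + b*(c*x): the copies of x are 5 edges apart, so eliminate fails.
far_apart = ("+", ("*", "a", "x"), ("*", "b", ("*", "c", "x")))
collected = collect(far_apart, "x")      # (a + b*c)*x
```

The closer-unknowns rule rewrites b*(c*x) as
(b*c)*x, which shortens the path between the two
copies of x from 5 to 4. The eliminate-unknown
rule then matches and leaves a single copy of x.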

The following are some of the closer-unknowns
rules for elementary algebra:

In> RulesByType("StructuralEffect", "CLOSER?UNKNOWNS")

    %output,preserve="false"
      Result: True
      
      Side Effects:
      q_*(r_*s_) :: TwoCopiesOfTheUnknownWillBecomeCloserAlongAPath?() <- (q*r)*s
      q_ + (r_ + s_) :: TwoCopiesOfTheUnknownWillBecomeCloserAlongAPath?() <- (q + r) + s
      (q_*r_)*s_ :: TwoCopiesOfTheUnknownWillBecomeCloserAlongAPath?() <- q*(r*s)
      (q_ + r_) + s_ :: TwoCopiesOfTheUnknownWillBecomeCloserAlongAPath?() <- q + (r + s)
      q_ + (r_ + s_) :: UnknownIn?(q) &? (UnknownNotIn?(r) &? UnknownIn?(s)) <- q + (s + r)
      (q_ + r_) + s_ :: UnknownIn?(q) &? (UnknownNotIn?(r) &? UnknownIn?(s)) <- (r + q) + s
.   %/output




In the following example, a closer-unknowns rule
is used on the previous equation to bring the two
copies of the unknown closer together so that an
eliminate-unknown rule can be applied to them:

%mathpiper

Show(StepsView(
        SolveSteps(
            _a * _x + _b * (_c * _x) == _0, 
            _x, 
            MaximumSteps:2), 
        ShowTree:True,
        ShowPositions:False,
        ShowRuleName:False,
        ShowRule:True,
        Scale:1.2), returnContent:True);

%/mathpiper

    %output,preserve="false"
Result: 

.   %/output




META-LEVEL RULES GUIDE THE EQUATION SOLVING
PROCESS

Rules that refer to the expression trees of
object-level expressions are called META-LEVEL
rules. The meta-level rules of elementary algebra
guide the equation solving process by inspecting
the expression tree of the equation that is being
solved at each step of the solving process, and
then determining which object-level rule to apply.

The following is a flowchart of the order in which
the meta-level rules are applied during the
equation solving process:

  START
    |
    v
RemoveAllNegationOperatorsThatHaveNumberOperands
    |
    v
MoveUnknownsToLHS
    |
    v
    +<-------------------------------------+
    |                                      |
    v                                      |
UndefineNegationOperators -------(match)-->|
    |                                      |
    v                                      |
UndefineSubtractionOperators ----(match)-->|
    |                                      |
    v                                      |
Evaluate ------------------------(match)-->|
    |                                      |
    v                                      |
EliminateUnknown ----------------(match)-->|
    |                                      |
    v                                      |
MoveUnknownsToRightmostPositons -(match)-->|
    |                                      |
    v                                      |
HigherUnknowns ------------------(match)-->|
    |                                      |
    v                                      |
CloserUnknownHorizontal ---------(match)-->|
    |                                      |
    v                                      |
CloserUnknownPath ---------------(match)-->|
    |                                      |
    v                                      |
MakeCoefficientsExplicit --------(match)-->+
    |
    v
UnburyUnknown
    |
    v
 SOLUTION
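
The control flow of the flowchart can be sketched
in Python as a loop over an ordered list of rules.
The sketch assumes that each "(match)" arrow means
"apply the rule and go back to the top of the
loop", and that the final step is only reached
when none of the rules in the loop match. The
solve function and the three small stand-in rules
are hypothetical, and are far simpler than the
meta-level rules named in the flowchart.

```python
def solve(equation, unknown, setup_rules, loop_rules, unbury, max_steps=100):
    """Apply rules in flowchart order: setup once, loop, then unbury."""
    for rule in setup_rules:                  # applied once before the loop
        equation = rule(equation, unknown) or equation
    for _ in range(max_steps):
        for rule in loop_rules:               # tried from top to bottom
            rewritten = rule(equation, unknown)
            if rewritten is not None:         # (match): back to the top
                equation = rewritten
                break
        else:                                 # no rule matched: leave the loop
            return unbury(equation, unknown)
    raise RuntimeError("step limit reached")

# Stand-in rules that return a rewritten equation or None.
def undefine_subtraction(equation, unknown):
    op, lhs, rhs = equation
    if isinstance(lhs, tuple) and lhs[0] == "-":      # q - r -> q + -1*r
        return (op, ("+", lhs[1], ("*", -1, lhs[2])), rhs)
    return None

def eliminate_unknown(equation, unknown):
    op, lhs, rhs = equation
    if lhs == ("+", unknown, unknown):                # q + q -> 2*q
        return (op, ("*", 2, unknown), rhs)
    return None

def unbury_unknown(equation, unknown):
    op, lhs, rhs = equation
    if isinstance(lhs, tuple) and lhs[0] == "*" and lhs[2] == unknown:
        return (op, unknown, ("/", rhs, lhs[1]))      # q*r = rhs -> r = rhs/q
    return equation

# x + x == 6
solution = solve(("==", ("+", "x", "x"), 6), "x",
                 setup_rules=[],
                 loop_rules=[undefine_subtraction, eliminate_unknown],
                 unbury=unbury_unknown)
```

Restarting from the top after every match means
that rules near the top of the list always get the
first chance to act on each new tree, so the order
of the list is part of the solving strategy.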