THE HIDDEN RULES MATHEMATICIANS USE TO SOLVE ELEMENTARY ALGEBRA EQUATIONS
v.05

MATHEMATICIANS DON'T KNOW HOW THEY DO MATH

In the early 1970s, Artificial Intelligence (AI) researchers at the University of Edinburgh, led by Dr. Alan Bundy, started a research project named PRESS (PRolog Equation Solving System) that involved determining how computers can be "taught" to solve elementary algebra equations like humans do. The first thing his research group did was to try to determine how mathematicians solve equations. The group discovered that a significant number of the rules that mathematicians used to solve equations were not written down anywhere. They were not in any textbooks, nor were they in any journals or research papers. As the researchers dug deeper, they also discovered that the rules did not have names and were not taught explicitly. The researchers concluded that the mathematicians were using these rules unconsciously.

These discoveries led to the question: If mathematicians don't consciously know a significant number of the rules they use to solve equations, how do they teach students how to solve equations? I think the way persistent students "learn" these unwritten rules is by solving an enormous number of equations over many years.

WHAT STUDENTS ARE BEING TAUGHT IS NOT MATHEMATICS

Scott Gray, who was the creator of the O'Reilly School of Technology, discovered that what is taught in the American system of teaching mathematics is not mathematics. Here is his description of how he made this discovery while teaching mathematics at The Ohio State University:

"I [...] started giving my students oral exams after their written exams. This was a long and arduous process, because I had to schedule an hour for testing for each student. However, I discovered something that depressed the hell out of me. None of my students knew what they were talking about.
Even students who got perfect scores on my written exams didn't really understand what it was that they were doing. It became clear that students were simply emulating calculation techniques, without understanding where those techniques came from, or how to create them themselves.

Then it became clear to me why my review sessions [which were offered outside of class at night] were so popular. In those sessions, students would ask me to solve every type of problem they could find in the text book. Even though I'd have them try the problems before showing them the solution, they were really preparing a decision matrix for a matching game. If the problem was like this, then they would do this; if it was like that then they'd do that, and so on. I also realized that the problems I was asking them to do, were designed with this system in mind. It seemed that most of the calculation techniques were designed to help students pass tests, but DID NOT ILLUMINATE THE TRUE NATURE OF THE MATHEMATICAL STRUCTURES. [emphasis mine]

In the American system of teaching mathematics, we are actually teaching algorithms for getting answers from synthetically designed problems, but not teaching students the art and science of mathematics. Sure, some students get through school, and become great mathematicians despite this system, but we are losing most students through attrition."

http://blog.oreillyschool.com/2010/05/the-story-of-the-oreilly-school-of-technology---part-2.html

DISCOVERING THE HIDDEN RULES MATHEMATICIANS USE TO DO MATH

Why were AI researchers the first group in history to discover that mathematicians don't know how they do math? I think it’s because computers were the first "students" in history that absolutely refused to learn any mathematics that was not taught explicitly. The researchers then spent years discovering and naming the unwritten rules that mathematicians used to solve elementary algebra equations.
When they "taught" computers these rules by encoding them into a program named PRESS, the computers were able to solve these equations like a human typically would.

AN EXPLANATION OF THE MYSTERIOUS HIDDEN RULES

The rest of this worksheet explains the fundamentals of the mysterious hidden rules mathematicians most likely use to solve elementary algebra equations.

THE STRUCTURE AND SEMANTICS OF MATHEMATICAL EXPRESSIONS

All mathematical expressions (hereafter referred to as just expressions) have a STRUCTURE which can be unambiguously represented by an expression tree. SEMANTICS refers to the meaning of an expression when it is evaluated. However, before an expression can be evaluated, a DOMAIN OF DISCOURSE (which is also called a WORLD or a UNIVERSE) must be specified from which the values will be obtained. The VALUES in a domain of discourse are also called OBJECTS. In elementary algebra, the domain of discourse is the set of real numbers, and each real number is an object in this domain.

LAWS OF ELEMENTARY ALGEBRA

All elementary algebra books state laws of elementary algebra for real numbers that specify the characteristics of operator symbols such as '+' and '*'. Some of these laws are as follows:

For every q, r, s ∈ ℝ

q + r = r + q                (commutative law of addition)
q + (r + s) = (q + r) + s    (associative law of addition)
q*r = r*q                    (commutative law of multiplication)
q*(r*s) = (q*r)*s            (associative law of multiplication)
q*(r + s) = q*r + q*s
(r + s)*q = r*q + s*q        (distributive law)
q + 0 = q
0 + q = q                    (additive identity)
q * 1 = q
1 * q = q                    (multiplicative identity)

DEFINITIONS OF ELEMENTARY ALGEBRA

DEFINITIONS define new operators in terms of operators that have been specified by laws or defined by other definitions. The main purpose of definitions is to provide a shorthand way to represent more complex expressions. All definitions can be undefined into the more fundamental operators and definitions that they are shorthand for.
Some examples of defined operators are the unary minus operator, the binary minus operator, and the division operator. The following are definitions for these operators:

For every q, r, s ∈ ℝ

-q := -1 * q                  (definition of the unary minus operator)
q - r := q + (-r)             (definition of the binary minus operator)
q/r := q * (1/r) for r != 0   (definition of the division operator)

RULES OF ELEMENTARY ALGEBRA

Laws and definitions state truths about the domain of discourse; however, they cannot be used directly in problem-solving processes. It is rules, not laws, that are the tools used during problem solving. All elementary algebra books state laws and definitions of elementary algebra, but none of them explicitly state all of the rules of algebra. The rules of elementary algebra are classified into two types: object-level rules and meta-level rules.

OBJECT-LEVEL RULES TRANSFORM AN EXPRESSION

Expressions that refer to objects in the domain of discourse are called OBJECT-LEVEL expressions. Each of the above laws of algebra is an object-level expression which gives rise to two object-level rules, one that results from reading the law LEFT-TO-RIGHT, and one that results from reading the law RIGHT-TO-LEFT. Both of a law's rules can be used to transform an equation into an equivalent equation that has the same meaning.

For example, the left-to-right reading of the commutative law of addition produces a rule that states: any subtree in an expression that matches the pattern q + r, which is on the left side of the law's '=' sign, can be replaced by the pattern r + q, which is on the right side of this sign, without changing the meaning of the expression.
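The left-to-right rule just described can be sketched in a few lines of Python. This is only an illustration and not MathPiper or PRESS code: the nested-tuple tree representation and the names commute_add and apply_at are assumptions made for this example.

```python
# Illustrative sketch only (not MathPiper or PRESS code).
# Assumed representation: a tree is (operator, left, right); leaves are strings.

def commute_add(tree):
    """Left-to-right reading of q + r = r + q, tried at the root of `tree`."""
    if isinstance(tree, tuple) and tree[0] == "+":
        _, q, r = tree
        return ("+", r, q)
    return None  # the pattern q + r does not match this subtree


def apply_at(tree, path, rule):
    """Apply `rule` to the subtree reached by `path` (1 = left operand, 2 = right)."""
    if not path:
        result = rule(tree)
        return tree if result is None else result
    op, left, right = tree
    if path[0] == 1:
        return (op, apply_at(left, path[1:], rule), right)
    return (op, left, apply_at(right, path[1:], rule))


# (a + b) * c, with the rule applied to the left operand of '*'
expr = ("*", ("+", "a", "b"), "c")
rewritten = apply_at(expr, [1], commute_add)
print(rewritten)  # ('*', ('+', 'b', 'a'), 'c')
```

Only the matched subtree changes; the rest of the tree is rebuilt unchanged, which is why the rewritten expression has the same meaning as the original.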
Conversely, the right-to-left reading of the commutative law of addition produces a rule that states: any subtree in an expression that matches the pattern r + q, which is on the right side of the law's '=' sign, can be replaced by the pattern q + r, which is on the left side of this sign, without changing the meaning of the expression.

The following example shows the eight rules that the identity laws listed above give rise to:

In> RulesByType("Type", "TEXTBOOK")

%output,preserve="false"
Result: True

Side Effects:
q_*1 <- q
q_ + 0 <- q
0 + q_ <- q
1*q_ <- q
q_ <- q*1
q_ <- q + 0
q_ <- 0 + q
q_ <- 1*q
. %/output

AN EQUATION CAN'T BE SOLVED BY RANDOMLY APPLYING OBJECT-LEVEL RULES TO IT

Object-level rules are used when solving an equation. However, much of the knowledge of how to solve elementary algebra equations involves knowing which object-level rule to apply to an equation, and which subtree to apply it to, at each step.

One might think that an equation could be solved by randomly applying object-level rules to it until a solution is reached. However, the number of trees that an equation can be transformed into is so large that the probability of arriving at a tree that represents a solution to the equation is very low. In the following example, the identity rules that were introduced earlier are randomly applied five times to an equation to show the strange equations that usually result from using object-level rules without guidance:

%mathpiper
Show(StepsView(
        SolveSteps( '(-20 == _x*-4-_x*6), _x, MaximumSteps:Infinity, ControlRandom:True, StepsToTake:5),
        ShowTree:True, ShowPositions:False, ShowRuleName:False, ShowRule:True, Scale:1.2),
    returnContent:True);
%/mathpiper

%output,preserve="false"
Result:
. %/output

THE META-LEVEL OF ELEMENTARY ALGEBRA

In elementary algebra, object-level expressions have the real numbers as their domain of discourse.
However, when humans solve elementary algebra equations, the domain of discourse they use consists of object-level elementary algebra expression trees, not real numbers. Any language that refers to object-level elementary algebra expression trees is part of what is called the META-LEVEL of elementary algebra.

The AI researchers discussed earlier discovered that when humans solve equations, much of the reasoning they do is done unconsciously at the meta-level of elementary algebra. Features of an equation's expression tree that they unconsciously notice include:

- The number of copies of the unknown in the tree.
- The length of the path between two copies of the unknown.
- Which object-level rules match a given subtree in the tree.
- The structural effect each object-level rule produces in a tree.

THE STRUCTURAL EFFECTS OBJECT-LEVEL RULES PRODUCE

Each object-level rule produces a structural effect on a tree that is useful for solving an equation. These effects fall into three categories: 1) OTHER-SIDE rules, 2) ELIMINATE-UNKNOWN rules, and 3) CLOSER-UNKNOWNS rules.

UNBURYING THE UNKNOWN WITH OTHER-SIDE RULES

OTHER-SIDE rules remove the dominant operator node that is on the left side of an equation by placing its inverse operation on the right side of the equation. The AI researchers called these rules "isolation" rules. The following are some of the other-side rules for elementary algebra:

In> RulesByType("StructuralEffect", "OTHERSIDE")

%output,preserve="false"
Result: True

Side Effects:
q_/r_ = rhs_ :: UnknownIn?(q) <- q = rhs*r
q_/r_ = rhs_ :: UnknownIn?(r) <- rhs*r = q followed by r = q/rhs
q_^r_ = rhs_ :: UnknownIn?(q) <- q = rhs^(1/r)
q_ - r_ = rhs_ :: UnknownIn?(q) <- q = rhs + r
q_ - r_ = rhs_ :: UnknownIn?(r) <- -r = rhs - q
q_ + r_ = rhs_ :: UnknownIn?(q) <- q = rhs - r
q_ + r_ = rhs_ :: UnknownIn?(r) <- r = rhs - q
q_*r_ = rhs_ :: UnknownIn?(q) <- q = rhs/r
q_*r_ = rhs_ :: UnknownIn?(r) <- r = rhs/q
-lhs_ = rhs_ :: The unknown is in lhs. <- lhs = -rhs
. %/output

The other-side rules work by: 1) locating the dominant operator on the left side of the equation, 2) determining which of its operand subtrees contains the unknown, and then 3) applying the operator's inverse operation to the operand subtree that does not contain the unknown (on both sides of the equation). During the application of an other-side rule, the operand subtree that does not contain the unknown is moved to the other side of the equation. The following example shows an other-side rule being used to move "+ b" to the other side of an equation:

%mathpiper
Show(StepsView(
        SolveSteps( _a + _b == _c, _a, MaximumSteps:Infinity),
        ShowTree:True, ShowPositions:False, ShowRuleName:False, ShowRule:True, Scale:1.2),
    returnContent:True);
%/mathpiper

%output,preserve="false"
Result:
. %/output

Each time an other-side rule is applied to the left side of an equation, it brings the unknown closer to the top of the subtree it is in. Repeatedly applying other-side rules to this subtree will eventually bring the unknown to the top of the tree, and this is called UNBURYING the unknown. The following example shows other-side rules being used to unbury the unknown "b":

%mathpiper
Show(StepsView(
        SolveSteps( _a + _b * _c / _d == 0, _b, MaximumSteps:Infinity, UnburySimple:True),
        ShowTree:True, ShowPositions:False, ShowRuleName:False, ShowRule:True, Scale:1.2),
    returnContent:True);
%/mathpiper

%output,preserve="false"
Result:
. %/output

ELIMINATING A COPY OF THE UNKNOWN USING ELIMINATE-UNKNOWN RULES

The other-side rules only work if an equation contains a SINGLE COPY of the unknown. If an equation contains two or more copies of the unknown, then copies of the unknown need to be eliminated from the tree until only a single copy remains.

Some of the rules of algebra have the structural effect of eliminating one copy of the unknown from the subtree they match. The AI researchers called these rules "collection" rules, but here they are called ELIMINATE-UNKNOWN rules. Eliminate-unknown rules are applied to the left side of an equation until only a single copy of the unknown remains. Then the unknown can be unburied. The following are some of the eliminate-unknown rules for elementary algebra:

In> RulesByType("StructuralEffect", "ELIMINATE?UNKNOWN")

%output,preserve="false"
Result: True

Side Effects:
q_*s_ + r_*s_ :: Unknown?(s) <- (q + r)*s
r_*q_ + q_ :: Unknown?(q) <- (r + 1)*q
q_ + q_ :: Unknown?(q) <- 2*q
q_*q_ :: Unknown?(q) <- q^2
. %/output

The following example shows an eliminate-unknown rule being used to eliminate one copy of the unknown "c":

%mathpiper
Show(StepsView(
        SolveSteps( _a * _c + _b * _c == _d, _c, MaximumSteps:1),
        ShowTree:True, ShowPositions:False, ShowRuleName:False, ShowRule:True, Scale:1.2),
    returnContent:True);
%/mathpiper

%output,preserve="false"
Result:
. %/output

UNKNOWNS THAT ARE TOO FAR APART TO ELIMINATE

Sometimes the path between two copies of the unknown is too long for any elimination rule to be applied. For example, the length of the path between the two copies of the unknown in the expression a*x + b*(c*x) = 0 is 5, and the longest path any of the above elimination rules will match is 4:

%mathpiper
ViewTreeParts(_a*_x + _b*(_c*_x) == _0, Path:["1,1,2", "1,2,2,2"], PathNumbers:False, Scale:1.5, FontSize:20);
%/mathpiper

%output,preserve="false"
Result:
. %/output

BRINGING COPIES OF THE UNKNOWN CLOSER TOGETHER

Some of the rules of algebra have the structural effect of bringing two copies of the unknown closer together along a path. The AI researchers called these rules "attraction" rules, but here they are called CLOSER-UNKNOWNS rules.

If an equation has more than one copy of the unknown, but they are too far apart for any of the elimination rules to be used, the closer-unknowns rules are tried. If one of the closer-unknowns rules matches a subtree, it is applied and then the elimination rules are tried again. This process is continued until all copies of the unknown are close enough for the elimination rules to match. The following are some of the closer-unknowns rules for elementary algebra:

In> RulesByType("StructuralEffect", "CLOSER?UNKNOWNS")

%output,preserve="false"
Result: True

Side Effects:
q_*(r_*s_) :: TwoCopiesOfTheUnknownWillBecomeCloserAlongAPath?() <- (q*r)*s
q_ + (r_ + s_) :: TwoCopiesOfTheUnknownWillBecomeCloserAlongAPath?() <- (q + r) + s
(q_*r_)*s_ :: TwoCopiesOfTheUnknownWillBecomeCloserAlongAPath?() <- q*(r*s)
(q_ + r_) + s_ :: TwoCopiesOfTheUnknownWillBecomeCloserAlongAPath?() <- q + (r + s)
q_ + (r_ + s_) :: UnknownIn?(q) &? (UnknownNotIn?(r) &? UnknownIn?(s)) <- q + (s + r)
(q_ + r_) + s_ :: UnknownIn?(q) &? (UnknownNotIn?(r) &? UnknownIn?(s)) <- (r + q) + s
. %/output

In the following example, a closer-unknowns rule is used on the previous equation to bring the two copies of the unknown closer together so that an elimination rule can be applied to them:

%mathpiper
Show(StepsView(
        SolveSteps( _a * _x + _b * (_c * _x) == _0, _x, MaximumSteps:2),
        ShowTree:True, ShowPositions:False, ShowRuleName:False, ShowRule:True, Scale:1.2),
    returnContent:True);
%/mathpiper

%output,preserve="false"
Result:
. %/output

META-LEVEL RULES GUIDE THE EQUATION SOLVING PROCESS

Rules that refer to the expression trees of object-level expressions are called META-LEVEL rules. The meta-level rules of elementary algebra guide the equation solving process by inspecting the expression tree of the equation that is being solved at each step of the solving process, and then determining which object-level rule to apply. The following is a flowchart of the sequence that the meta-level rules are applied in during the equation solving process:

START
    |
    v
RemoveAllNegationOperatorsThatHaveNumberOperands
    |
    v
MoveUnknownsToLHS
    |
    v
    +<-------------------------------------+
    |                                      |
    v                                      |
UndefineNegationOperators -------(match)-->|
    |                                      |
    v                                      |
UndefineSubtractionOperators ----(match)-->|
    |                                      |
    v                                      |
Evaluate ------------------------(match)-->|
    |                                      |
    v                                      |
EliminateUnknown ----------------(match)-->|
    |                                      |
    v                                      |
MoveUnknownsToRightmostPositons -(match)-->|
    |                                      |
    v                                      |
HigherUnknowns ------------------(match)-->|
    |                                      |
    v                                      |
CloserUnknownHorizontal ---------(match)-->|
    |                                      |
    v                                      |
CloserUnknownPath ---------------(match)-->|
    |                                      |
    v                                      |
MakeCoefficientsExplicit --------(match)-->+
    |
    v
UnburyKnown
    |
    v
SOLUTION
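
The overall shape of this control flow can be sketched in Python. This is a heavily simplified illustration, not the PRESS or MathPiper implementation. It assumes a nested-tuple tree representation, keeps only one eliminate-unknown rule (q*s + r*s <- (q + r)*s) and a few other-side rules, and omits every other stage of the flowchart. The names count_unknown, eliminate_unknown, other_side and solve are hypothetical.

```python
# Illustrative sketch only (not PRESS or MathPiper code).
# Assumed representation: a tree is (operator, left, right); leaves are strings.

def count_unknown(tree, x):
    """Meta-level inspection: number of copies of the unknown in the tree."""
    if tree == x:
        return 1
    if isinstance(tree, tuple):
        return sum(count_unknown(t, x) for t in tree[1:])
    return 0


def eliminate_unknown(tree, x):
    """Try q*s + r*s <- (q + r)*s, with s the unknown, at every subtree."""
    if not isinstance(tree, tuple):
        return None
    op, l, r = tree
    if (op == "+" and isinstance(l, tuple) and isinstance(r, tuple)
            and l[0] == "*" and r[0] == "*" and l[2] == x and r[2] == x):
        return ("*", ("+", l[1], r[1]), x)
    new_l = eliminate_unknown(l, x)
    if new_l is not None:
        return (op, new_l, r)
    new_r = eliminate_unknown(r, x)
    if new_r is not None:
        return (op, l, new_r)
    return None


INVERSE = {"+": "-", "-": "+", "*": "/", "/": "*"}


def other_side(lhs, rhs, x):
    """Move the operand that lacks the unknown to the right side."""
    op, l, r = lhs
    if count_unknown(l, x):            # unknown is in the left operand
        return l, (INVERSE[op], rhs, r)
    if op in ("+", "*"):               # unknown is in the right operand
        return r, (INVERSE[op], rhs, l)
    return None                        # cases this sketch does not cover


def solve(lhs, rhs, x):
    if count_unknown(lhs, x) == 0:
        raise ValueError("the unknown does not occur on the left side")
    steps = [(lhs, rhs)]
    # Loop stage: apply the eliminate-unknown rule while more than one copy remains.
    while count_unknown(lhs, x) > 1:
        lhs = eliminate_unknown(lhs, x)
        if lhs is None:
            raise ValueError("no eliminate-unknown rule matches")
        steps.append((lhs, rhs))
    # Final stage: unbury the single remaining copy with other-side rules.
    while lhs != x:
        moved = other_side(lhs, rhs, x)
        if moved is None:
            raise ValueError("no other-side rule in this sketch matches")
        lhs, rhs = moved
        steps.append((lhs, rhs))
    return steps


# a*x + b*x == d, solved for x
steps = solve(("+", ("*", "a", "x"), ("*", "b", "x")), "d", "x")
for left, right in steps:
    print(left, "==", right)
```

The last step printed has the unknown alone on the left side and the tree for d/(a + b) on the right. The point of the sketch is that solve never evaluates anything over the real numbers: it only inspects and rewrites trees, which is what makes it a meta-level procedure.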