with: Erik Oosterwal
Infix to RPN Conversion
Infix notation is the typical way we use numbers and mathematical formulas.
Each mathematical operator bisects the two operands it is working on,
such as 23 + 87, 123 / 76, or 76 ^ 3. Lines are evaluated with ^ having
the highest priority, * and / having the next highest priority, and + and
- having the lowest priority. Pairs to the left are evaluated before
pairs to the right, which means that 5 - 3 - 1 is evaluated as 5 - 3 = 2
and 2 - 1 = 1. It would be wrong to evaluate the second pair first
as is 3 - 1 = 2 and 5 - 2 = 3. Using parentheses alleviates most of
the confusion. ( 5 - 3 ) - 1 is evaluated as our former example, and
5 - ( 3 - 1 ) is evaluated as our latter example.
In Reverse Polish Notation (RPN) the operators follow the pairs of operands
and the operands are typically separated with some special character, such
as a comma. Our earlier examples would be written as 23,87,+
123,76,/ and 76,3,^. Parentheses are no longer needed because
RPN removes all ambiguity of how the line is to be evaluated. Our last
two examples in the previous paragraph would be written as 5,3,-,1,- and
5,3,1,-,- In the first example 5,3,-,1,- the 5 is placed in the
first position, the 3 is read into the second position, the minus operator
is used and the result left in the first position. The 1 is then read
into the second position and the final minus operator is used leaving the
result in the first position. In the second example, the 5 and 3 are
read into the first and second positions, then the 1 is read into the second
position pushing the 3 into the first position which pushes the 5 into a
first-in-last-out (FILO) memory slot. The first minus operator works
its magic on the 3 and 1 in the first and second position leaving the result
in position 1, then the 5 is brought back in from memory into the first position
pushing the previous result into the second position. The final minus
operator then does its stuff on the 5 and the 2 and the result is left in
the first position.
You may ask, "Why on earth would we ever want to do a thing like that?"
The answer for most of us is that you probably wouldn't. However
there are practical applications to using RPN besides figuring out how to
use that Hewlitt-Packard calculator. Computers have a terrible time
trying to figure out how to handle infix notation, so any time you want to
instruct a computer how to perform some mathematical computations you have
to tell it how to do it in RPN. This method was first described to
me in my second year of college when we were writing a simple language
interpreter to convert this language called SLIP (Simplified Language for
Instructional Purposes) into assembly language, which was then compiled and
run. SLIP was not a real language in that it did not have looping
constructs built in, but it was useful in teaching us about reading input
files, parsing the lines for tokens then directing further actions based
on those tokens. For our purposes tokens will be considered any numeric
value, any alphanumeric string that could be interpreted as a variable, or
any of the following mathematical operations: +, -, *, /, ^.
Ok, now on to the algorithm...
Parsing from left to right, any numeric value we encounter or any alphanumeric
string that could be interpreted as a variable is sent directly to the output,
including decimal points.
When we encounter anything other than numbers or letters (or a decimal point),
we do the following:
Append a token separator to the output string (in our case a comma.)
If we encounter an open parenthesis, that parenthesis is pushed to a FILO
stack.
If we encounter a closing parenthesis, we discard it and pop operators from
the stack to the end of the output string until we come to an open parenthesis.
The open parenthesis is then also discarded.
If we encounter a low priority operator (+ or -) then we must first pop all
operators (+, -, *, /, ^) from the stack to the end of the output string
until the stack is empty or we encounter an open parenthesis and then push
the low priority operator on to the stack.
If we encounter a medium priority operator (* or /) then we must first pop
any medium or high priority operators (*, /, ^) from the stack to the end
of the output string until the stack is empty or we encounter a low
priority operator (+ or -) or we encounter an open parenthesis, then push
the medium priority operator on to the stack.
If we encounter a high priority operator (^) we pop all other high priority
operators from the stack to the end of the output string then push the new
high priority operator to the stack.
Once all the input character have been processed, we pop operators off the
stack until it's empty.
That's basically it, but it leaves somewhat to be desired. For instance,
this algorithm has no way of handling unary minuses (indicating negative
numbers) nor does it deal with functions such as SIN(), COS(), SQR(), etc.
I describe one way of handling unary minus in the comments of the program
listing below and will alter the code to handle those as well as sines, cosines,
square roots, etc. at a later date. In the mean time if you need an
infix to RPN converter, this should handle most of your needs.
'
' Infix to RPN - program to convert infix notation to
' RPN notation. Operands are separated by commas, to
' use a different separation character change the
' line where txtSeparator is initialized. The operator
' stack has been set to an upper limit of 100 operators,
' if you expect to have input strings with more than 100
' operators (heaven forbid...) increase the size of
' txtStack in the Dim statement.
'
' This routine does not handle unary minus signs. To allow
' unary minus signs two additional changes must be made. In the
' Low Operator case (+, -) we must check to see if the previous
' input character was an alphanumeric. If it IS alphanumeric
' we treat the minus as a normal binary operator. If it is NOT
' alphanumeric, then we append a zero and token separator to the
' output string, and push a special unary minus character to the
' stack, which has an operator value equal to ^. You can use the
' tilde "~" as the unary minus character since it's not being used
' for anything else. I'll make this changes to the routine at a
' later date.
'
' Written by - Erik Oosterwal
' Started on - Nov. 14, 2005
' Completed on - Nov. 15, 2005
Dim txtSeparator, txtStack(100), txtCharacter As String ' Declare all variables
Dim intPointer, intStackPointer As Integer ' used in the routine
txtSeparator = "," ' Set the token separator to a comma
intPointer = 1 ' Start by looking at the first character in the input string
intStackPointer = 1 ' Start the operator stack at 1
txtOutput.Text = "" ' Clear the output sting
While intPointer <= Len(txtInput.Text) ' As long as there's characters in the string
txtCharacter = Mid$(txtInput.Text, intPointer, 1) ' get the next character.
Select Case txtCharacter
Case "0" To "9", ".", "A" To "Z", "a" To "z" ' Numbers and variables get appended
txtOutput.Text = txtOutput.Text + txtCharacter ' directly to the output string.
Case "("
txtStack(intStackPointer) = txtCharacter ' Open parentheses get pushed on the stack
intStackPointer = intStackPointer + 1
Case ")" ' If we find a closing parenthesis
While intStackPointer > 1 And _
txtStack(intStackPointer - 1) <> "(" ' Clear the stack until we
txtOutput.Text = txtOutput.Text + txtSeparator + _
txtStack(intStackPointer - 1) ' get to the opening parenthesis.
intStackPointer = intStackPointer - 1
Wend
intStackPointer = intStackPointer - 1 ' Decrease the stack pointer to overwrite the opening parenthesis.
Case "+", "-" ' Lowest operators
txtOutput.Text = txtOutput.Text + txtSeparator ' Append a token separator to the output string.
' If there are any operators on the stack AND they're higher or equal valued than + and -, then...
' pop them off the stack and append them to the output string.
While intStackPointer > 1 And _
txtStack(intStackPointer - 1) <> "("
txtOutput.Text = txtOutput.Text + _
txtStack(intStackPointer - 1) + txtSeparator
intStackPointer = intStackPointer - 1
Wend
txtStack(intStackPointer) = txtCharacter ' Push the low operator on the stack.
intStackPointer = intStackPointer + 1
Case "*", "/" ' Medium operators
txtOutput.Text = txtOutput.Text + txtSeparator ' Append a token separator to the output string.
' If there are any operators on the stack and they're higher or equal valued than * and /, then
' pop them off the stack and append them to the output string.
While intStackPointer > 1 And _
(txtStack(intStackPointer - 1) = "*" Or _
txtStack(intStackPointer - 1) = "/" Or _
txtStack(intStackPointer - 1) = "^")
txtOutput.Text = txtOutput.Text + _
txtStack(intStackPointer - 1) + txtSeparator
intStackPointer = intStackPointer - 1
Wend
txtStack(intStackPointer) = txtCharacter ' Push the medium operator on the stack.
intStackPointer = intStackPointer + 1
Case "^" ' High operators
txtOutput.Text = txtOutput.Text + txtSeparator ' Append a token separator to the output string.
While intStackPointer > 1 And _
txtStack(intStackPointer - 1) = "^"
txtOutput.Text = txtOutput.Text + _
txtStack(intStackPointer - 1) + txtSeparator
intStackPointer = intStackPointer - 1
Wend
txtStack(intStackPointer) = txtCharacter ' Push the high operator on the stack.
intStackPointer = intStackPointer + 1
' There is no default case included in this routine. Any characters not explicitly taken care
' of by the cases will be ignored as whitespace. Some characters, such as x, #, and others
' that are used to denote binary, hex, octal, etc. numbers could be taken care of in the first
' case where numeric and alpha tokens are handled. It's ok to ignore spaces, tabs and other
' white space.
End Select
intPointer = intPointer + 1 ' Set the pointer to look at the next character
Wend
' All the input characters have been taken care of, now it's time to clear the stack.
'
While intStackPointer > 1 ' As long as there's still operators on the stack
txtOutput.Text = txtOutput.Text + txtSeparator + _
txtStack(intStackPointer - 1) ' Take one off and append it to the output string
intStackPointer = intStackPointer - 1 ' look at the previous operator
Wend
End Sub
You can also download a zip file that contains
an executable file that demonstrates this routine along with the Visual Basic
project.
Discuss computer algorithms and other computer science topics at the Computer Algorithms blog page.
All code and original algorithms are © Erik Oosterwal - 1987-2008
Computer Science 101
Dressing for Success