Computer Science 101

with:  Erik Oosterwal

Search for specific programs or algorithms:
Custom Search





Infix to RPN Conversion


Infix notation is the typical way we use numbers and mathematical formulas.  Each mathematical operator bisects the two operands it is working on, such as 23 + 87, 123 / 76, or 76 ^ 3.  Lines are evaluated with ^ having the highest priority, * and / having the next highest priority, and + and - having the lowest priority.  Pairs to the left are evaluated before pairs to the right, which means that 5 - 3 - 1 is evaluated as 5 - 3 = 2 and 2 - 1 = 1.  It would be wrong to evaluate the second pair first as is 3 - 1 = 2 and 5 - 2 = 3.  Using parentheses alleviates most of the confusion.  ( 5 - 3 ) - 1 is evaluated as our former example, and 5 - ( 3 - 1 ) is evaluated as our latter example.

In Reverse Polish Notation (RPN) the operators follow the pairs of operands and the operands are typically separated with some special character, such as a comma.  Our earlier examples would be written as 23,87,+  123,76,/  and 76,3,^.  Parentheses are no longer needed because RPN removes all ambiguity of how the line is to be evaluated.  Our last two examples in the previous paragraph would be written as 5,3,-,1,- and 5,3,1,-,-  In the first example  5,3,-,1,- the 5 is placed in the first position, the 3 is read into the second position, the minus operator is used and the result left in the first position.  The 1 is then read into the second position and the final minus operator is used leaving the result in the first position.  In the second example, the 5 and 3 are read into the first and second positions, then the 1 is read into the second position pushing the 3 into the first position which pushes the 5 into a first-in-last-out (FILO) memory slot.  The first minus operator works its magic on the 3 and 1 in the first and second position leaving the result in position 1, then the 5 is brought back in from memory into the first position pushing the previous result into the second position.  The final minus operator then does its stuff on the 5 and the 2 and the result is left in the first position.

You may ask, "Why on earth would we ever want to do a thing like that?"  The answer for most of us is that you probably wouldn't.  However there are practical applications to using RPN besides figuring out how to use that Hewlitt-Packard calculator.  Computers have a terrible time trying to figure out how to handle infix notation, so any time you want to instruct a computer how to perform some mathematical computations you have to tell it how to do it in RPN.  This method was first described to me in my second year of college when we were writing a simple language interpreter to convert this language called SLIP (Simplified Language for Instructional Purposes) into assembly language, which was then compiled and run.  SLIP was not a real language in that it did not have looping constructs built in, but it was useful in teaching us about reading input files, parsing the lines for tokens then directing further actions based on those tokens.  For our purposes tokens will be considered any numeric value, any alphanumeric string that could be interpreted as a variable, or any of the following mathematical operations:  +, -, *, /, ^.



Ok, now on to the algorithm...

Parsing from left to right, any numeric value we encounter or any alphanumeric string that could be interpreted as a variable is sent directly to the output, including decimal points.

When we encounter anything other than numbers or letters (or a decimal point), we do the following:

Append a token separator to the output string (in our case a comma.)

If we encounter an open parenthesis, that parenthesis is pushed to a FILO stack.

If we encounter a closing parenthesis, we discard it and pop operators from the stack to the end of the output string until we come to an open parenthesis.  The open parenthesis is then also discarded.

If we encounter a low priority operator (+ or -) then we must first pop all operators (+, -, *, /, ^) from the stack to the end of the output string until the stack is empty or we encounter an open parenthesis and then push the low priority operator on to the stack.

If we encounter a medium priority operator (* or /) then we must first pop any medium or high priority operators (*, /, ^) from the stack to the end of the output string  until the stack is empty or we encounter a low priority operator (+ or -) or we encounter an open parenthesis, then push the medium priority operator on to the stack.

If we encounter a high priority operator (^) we pop all other high priority operators from the stack to the end of the output string then push the new high priority operator to the stack.

Once all the input character have been processed, we pop operators off the stack until it's empty.




That's basically it, but it leaves somewhat to be desired.  For instance, this algorithm has no way of handling unary minuses (indicating negative numbers) nor does it deal with functions such as SIN(), COS(), SQR(), etc.  I describe one way of handling unary minus in the comments of the program listing below and will alter the code to handle those as well as sines, cosines, square roots, etc. at a later date.  In the mean time if you need an infix to RPN converter, this should handle most of your needs.



'
'   Infix to RPN - program to convert infix notation to
'       RPN notation.  Operands are separated by commas, to
'       use a different separation character change the
'       line where txtSeparator is initialized.  The operator
'       stack has been set to an upper limit of 100 operators,
'       if you expect to have input strings with more than 100
'       operators (heaven forbid...) increase the size of
'       txtStack in the Dim statement.
'
'       This routine does not handle unary minus signs.  To allow
'       unary minus signs two additional changes must be made.  In the
'       Low Operator case (+, -) we must check to see if the previous
'       input character was an alphanumeric.  If it IS alphanumeric
'       we treat the minus as a normal binary operator.  If it is NOT
'       alphanumeric, then we append a zero and token separator to the
'       output string, and push a special unary minus character to the
'       stack, which has an operator value equal to ^.  You can use the
'       tilde "~" as the unary minus character since it's not being used
'       for anything else.  I'll make this changes to the routine at a
'       later date.
'
'   Written by -    Erik Oosterwal
'   Started on -    Nov. 14, 2005
'   Completed on -  Nov. 15, 2005

    Dim txtSeparator, txtStack(100), txtCharacter As String     ' Declare all variables
    Dim intPointer, intStackPointer As Integer                  ' used in the routine

    txtSeparator = ","      ' Set the token separator to a comma
    intPointer = 1          ' Start by looking at the first character in the input string
    intStackPointer = 1     ' Start the operator stack at 1
    txtOutput.Text = ""     ' Clear the output sting

    While intPointer <= Len(txtInput.Text)                      ' As long as there's characters in the string
    
        txtCharacter = Mid$(txtInput.Text, intPointer, 1)       ' get the next character.
        
        Select Case txtCharacter
        
            Case "0" To "9", ".", "A" To "Z", "a" To "z"        ' Numbers and variables get appended
                txtOutput.Text = txtOutput.Text + txtCharacter  '   directly to the output string.
                
            Case "("
                txtStack(intStackPointer) = txtCharacter        ' Open parentheses get pushed on the stack
                intStackPointer = intStackPointer + 1
                
            Case ")"                                                    ' If we find a closing parenthesis
                While intStackPointer > 1 And _
                        txtStack(intStackPointer - 1) <> "("            ' Clear the stack until we
                    txtOutput.Text = txtOutput.Text + txtSeparator + _
                        txtStack(intStackPointer - 1)                   ' get to the opening parenthesis.
                    intStackPointer = intStackPointer - 1
                Wend
                intStackPointer = intStackPointer - 1       ' Decrease the stack pointer to overwrite the opening parenthesis.
                
            Case "+", "-"                                               ' Lowest operators
                txtOutput.Text = txtOutput.Text + txtSeparator          ' Append a token separator to the output string.
                
' If there are any operators on the stack AND they're higher or equal valued than + and -, then...
'   pop them off the stack and append them to the output string.
                While intStackPointer > 1 And _
                        txtStack(intStackPointer - 1) <> "("
                    txtOutput.Text = txtOutput.Text + _
                        txtStack(intStackPointer - 1) + txtSeparator
                    intStackPointer = intStackPointer - 1
                Wend
                txtStack(intStackPointer) = txtCharacter            ' Push the low operator on the stack.
                intStackPointer = intStackPointer + 1
                
            Case "*", "/"                                           ' Medium operators
                txtOutput.Text = txtOutput.Text + txtSeparator      ' Append a token separator to the output string.
                
' If there are any operators on the stack and they're higher or equal valued than * and /, then
'   pop them off the stack and append them to the output string.
                While intStackPointer > 1 And _
                        (txtStack(intStackPointer - 1) = "*" Or _
                        txtStack(intStackPointer - 1) = "/" Or _
                        txtStack(intStackPointer - 1) = "^")
                    txtOutput.Text = txtOutput.Text + _
                        txtStack(intStackPointer - 1) + txtSeparator
                    intStackPointer = intStackPointer - 1
                Wend
                txtStack(intStackPointer) = txtCharacter        ' Push the medium operator on the stack.
                intStackPointer = intStackPointer + 1
                
            Case "^"                                            ' High operators
                txtOutput.Text = txtOutput.Text + txtSeparator  ' Append a token separator to the output string.
                While intStackPointer > 1 And _
                        txtStack(intStackPointer - 1) = "^"
                    txtOutput.Text = txtOutput.Text + _
                        txtStack(intStackPointer - 1) + txtSeparator
                    intStackPointer = intStackPointer - 1
                Wend
                txtStack(intStackPointer) = txtCharacter        ' Push the high operator on the stack.
                intStackPointer = intStackPointer + 1
                
' There is no default case included in this routine.  Any characters not explicitly taken care
'   of by the cases will be ignored as whitespace.  Some characters, such as x, #, and others
'   that are used to denote binary, hex, octal, etc. numbers could be taken care of in the first
'   case where numeric and alpha tokens are handled.  It's ok to ignore spaces, tabs and other
'   white space.
        
        End Select
    
        intPointer = intPointer + 1             ' Set the pointer to look at the next character
        
    Wend
    
'   All the input characters have been taken care of, now it's time to clear the stack.
'
    While intStackPointer > 1                               ' As long as there's still operators on the stack
        txtOutput.Text = txtOutput.Text + txtSeparator + _
            txtStack(intStackPointer - 1)                   ' Take one off and append it to the output string
        intStackPointer = intStackPointer - 1               ' look at the previous operator
    Wend

End Sub

You can also download a zip file that contains an executable file that demonstrates this routine along with the Visual Basic project.





Discuss computer algorithms and other computer science topics at the Computer Algorithms blog page.

All code and original algorithms are © Erik Oosterwal - 1987-2008
Computer Science 101

Dressing for Success