Matcher for real strings

PROBLEM
The text is an array T.1,T.2,...,T.N and the pattern is an array P.1,P.2,...,P.M. The elements of P. and T. are characters drawn from a finite alphabet Sigma., Sigma. is an array of R elements. We say that pattern P. occurs with shift S in text T. if 0<=S<=N-M and if T.SpJ=P.J, where SpJ indicates S+J, for J=1,...,M. String-matching problem is the problem of finding all valid shifts with which a given pattern occurs in a given text.

ALGORITHM
When a text and a pattern are real strings (symbols of alphabet are real characters such as 'a' or X'0E'), there is a natural solution by the POS built-in function of the Rexx language.

PRACTICE
In the first 320KB of Text (REGINA.DOC) finds REAL_STRING_MATCHER all occurence of the pattren Regina after 0.1 seconds. KMP_MATCHER makes the same after 406 seconds and BOYER_MOORE_MATCHER after 89 seconds.

IMPLEMENTATION
Unit: internal subroutine
 
Parameters: a string Text, a string Pattern
 
Output: displays on the screen Pattern occurs with shift S for all valid shift S
 


REAL_STRING_MATCHER: parse arg Text, Pattern
do F = 0 until F = 0
  F = POS(Pattern, Text, F + 1)
  if F > 0 then
    say "Pattern occurs with shift" F - 1
end
return

 

CONNECTIONS
String-Matching Problem
     Naive string matcher
     Boyer-Moore algorithm
     Knuth-Morris-Pratt matcher


Cover Contents Index Main page Rexx page   Mail

last modified 28st September 2001
Copyright © 2000-2001 Vladimir Zabrodsky
Czech Republic