Naive string matcher

PROBLEM
The text is an array T.1,T.2,...,T.N and the pattern is an array P.1,P.2,...,P.M. The elements of P. and T. are characters drawn from a finite alphabet Sigma., Sigma. is an array of R elements. We say that pattern P. occurs with shift S in text T. if 0<=S<=N-M and if T.SpJ=P.J, where SpJ indicates S+J, for J=1,...,M. String-matching problem is the problem of finding all valid shifts with which a given pattern occurs in a given text.

ALGORITHM
The worst-case running time of naive string-matcher is O(N**2).

PRACTICE
When a text and a pattern are real strings then the fastest solution is by REAL_STRING_MATCHER. In general case there is not an unambiguous winner. You have to test subroutines:

NAIVE_STRING_MATCHER
KMP_MATCHER
BOYER_MOOR_MATCHER

on your data. Example is following, where M respectively R is number of elements in P. respectively in Sigma. and S is number of valid shifts:

Running time in seconds for N=100000
Algorithm M=10,R=1999,S=50 M=10,R=4,S=10000 M=100,R=4,S=10000
Naive 16  25  150 
KMP 21  22  23 
Boyer-Moore 15  138 

IMPLEMENTATION
Unit: internal subroutine
 
Global variables: array T.1,...,T.N of strings, array P.1,...,P.M of strings
 
Parameters: a positive integer N - number of elements in T., a positive integer M - number of elements in P.
 
Output: displays on the screen Pattern occurs with shift S for all valid shift S
 


NAIVE_STRING_MATCHER:
procedure expose T. P.
parse arg M, N
do S = 0 to N - M + 1
  do J = 1 to M
    SpJ = S + J
    if P.J <> T.SpJ then iterate S
  end
  say "Pattern occurs with shift" S
end
return

 

CONNECTIONS
String-Matching Problem
     Matcher for real string
     Boyer-Moore algorithm
     Knuth-Morris-Pratt matcher

Literature
Cormen T. H., Leiserson Ch. E., Rivest R. L. Introduction to Algorithms
The MIT Press, Cambridge, 1990


Cover Contents Index Main page Rexx page   Mail

last modified 1st August 2001
Copyright © 2000-2001 Vladimir Zabrodsky
Czech Republic