Next: Algorithms in m0, Up: Top [Contents]
This document provides insight in the algorithms used to detect macros
and patterns in the m0 macro processor.
Useful for the curious and persons who want to use these for similar purposes.
Bitap algorithms are already known in different implementations. It is also known as the shift-or, shift-and or Baeza-Yates–Gonnet algorithm (see e.g. wikipedia Bitap algorithm).
In Algorithms in m0 the bitap algorithms used in m0 in general and for detecting
macros are described. In Bitap for patterns in m0 the implementation of the bitap
algorithm for the patterns is described.
In many search algorithms the characters in a text are compared with the characters to be found using a compare instruction. A compare instruction is not used in the bitap algorithm, instead:
The bitap algorithm uses a look-up table instead of a compare.
The advantage of a look-up table is that multiple characters can be checked in a single position with a single look-up. The table has an index the size of the possible values of a character. When using all possible values in a byte the index size is thus 256.
The result of a look-up is either true or false, so that a single bit value is sufficient for the result. To check a word it is only necessary to use a bit AND to combine all the result of the positions of the word and if the result is true, then the word is found.
The following figure illustrates this:
search word: wor text: | | | |H|e|l|l|o| |w|o|r|l|d | | | | search w o r bits 0 0 0 = 0 search w o r bits 0 0 0 = 0 search w o r bits 0 0 0 = 0 search w o r bits 0 1 0 = 0 search w o r bits 0 0 0 = 0 search w o r bits 0 0 0 = 0 search w o r bits 1 1 1 = 1 found: wor
To check if a word has been found:
A bit AND is used on the look-up results to test if the word is found.
This AND is the "and" in the name of the shift-and. The shift is used to put the bits over each other. How this works is explained in the following parts.
Next: Algorithms in m0, Up: Top [Contents]