RegExp Reusing

Sometimes we need to reuse frequently used regular expressions, both to keep the code simple and to speed up the program. In kiot-lexer, we can do it this way:

import org.kiot.automaton.RegExp

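// Parse each building block once
val capitalizedWord = "[A-Z]\\w+".regexp()
val word = "\\w+".regexp()
val number = "\\d+".regexp()
// Combine the parsed blocks with plain string fragments, then build the result
val sentence = (RegExp + capitalizedWord + "( (" + word + "|" + number + "))+").build()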

sentence.match("We can deal with numbers like 1926") // true
sentence.match("not capitalized") // false

Note that RegExp marks the beginning of a regexp sequence, and calling build on that sequence returns its parsed result. In the example above, capitalizedWord, word and number are already-parsed NFAs rather than strings, so combining them does not require parsing their patterns again.
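For contrast, here is a minimal sketch of the same sentence matcher written against Kotlin's standard kotlin.text.Regex instead of kiot-lexer. It is only meant to show what reusing strings looks like: the fragments are plain pattern strings, so the combined pattern is parsed as a whole when the Regex is constructed. The constant names and the matchesSentence helper are made up for this illustration and are not part of kiot-lexer.

```kotlin
// Reusable fragments as plain pattern strings (illustrative names).
const val CAPITALIZED_WORD = "[A-Z]\\w+"
const val WORD = "\\w+"
const val NUMBER = "\\d+"

// The fragments are spliced into one string, which is parsed in full here.
val sentence = Regex("$CAPITALIZED_WORD( ($WORD|$NUMBER))+")

// Hypothetical helper: true only if the whole input matches the pattern.
fun matchesSentence(input: String): Boolean = sentence.matches(input)

fun main() {
    println(matchesSentence("We can deal with numbers like 1926")) // true
    println(matchesSentence("not capitalized")) // false
}
```

Every additional pattern built from these strings would parse the fragment text again, which is the repeated work that composing already-parsed NFAs is meant to avoid.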
