Lexing with Multiple States

A lexer often needs more than one state, for example one for the inside of a string literal and another for a method body. Each state has its own set of rules, and an action can switch the lexer from one state to another:

import org.kiot.lexer.Lexer
import org.kiot.lexer.LexerData // assumed to live in the same package as Lexer

val data = LexerData.build {
	// default state is always 0
	state(default) {
		": " action 1
		"\\w+" action 2
	}
	state(1) {
		".+" action 3
	}
}

class SimpleLexer(chars: CharSequence) : Lexer<Nothing>(data, chars) {
	override fun onAction(action: Int) {
		when (action) {
			1 -> switchState(1) // ": " was matched: continue with the rules of state 1
			2 -> println("word: ${string()}")
			3 -> println("definition: ${string()}")
		}
	}
}

// The input is passed to the constructor, so create the lexer and then run it
SimpleLexer("KiotLand: A land where Kotlin lovers gather.").lex()
/*
	word: KiotLand
	definition: A land where Kotlin lovers gather.
*/
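
To make the mechanics of state switching more concrete, here is a minimal stand-alone sketch that does not use kiot-lexer at all. It keeps one rule list per state and only tries the rules of the current state at each position. The names Rule, TinyStateLexer and runDemo are hypothetical, and the matching strategy (first rule that matches at the current position, using kotlin.text.Regex) is a simplification chosen for the sketch, not a description of how kiot-lexer works internally.

```kotlin
// Stand-alone sketch: Rule, TinyStateLexer and runDemo are hypothetical names,
// not part of kiot-lexer.
data class Rule(val pattern: Regex, val action: Int)

class TinyStateLexer(
	private val rules: Map<Int, List<Rule>>,
	private val chars: CharSequence
) {
	private var state = 0 // 0 plays the role of the default state
	private var pos = 0
	val output = mutableListOf<String>()

	fun lex() {
		while (pos < chars.length) {
			var matched = false
			// Only the rules of the current state are candidates.
			for (rule in rules.getValue(state)) {
				val m = rule.pattern.find(chars, pos) ?: continue
				// Accept only non-empty matches that start exactly at pos.
				if (m.range.first != pos || m.value.isEmpty()) continue
				pos = m.range.last + 1
				onAction(rule.action, m.value)
				matched = true
				break
			}
			check(matched) { "no rule matches at index $pos in state $state" }
		}
	}

	private fun onAction(action: Int, text: String) {
		when (action) {
			1 -> state = 1 // plays the role of switchState(1)
			2 -> output.add("word: $text")
			3 -> output.add("definition: $text")
		}
	}
}

fun runDemo(): List<String> {
	val rules = mapOf(
		0 to listOf(Rule(Regex(": "), 1), Rule(Regex("\\w+"), 2)),
		1 to listOf(Rule(Regex(".+"), 3))
	)
	val lexer = TinyStateLexer(rules, "KiotLand: A land where Kotlin lovers gather.")
	lexer.lex()
	return lexer.output
}

fun main() {
	for (line in runDemo()) println(line)
}
```

Because `.+` is only a rule of state 1, it cannot swallow the whole input while the sketch is still in state 0; it becomes a candidate only after the `": "` rule has switched states.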

Bare numbers make states hard to tell apart. kiot-lexer provides the LexerState interface so that states can be given names instead, for example as enum constants:

import org.kiot.lexer.LexerState
import org.kiot.lexer.Lexer

enum class MyState : LexerState {
	DEFINITION
}

// ...

class SimpleLexer(chars: CharSequence) : Lexer<Nothing>(data, chars) {
	override fun onAction(action: Int) {
		when (action) {
			1 -> switchState(MyState.DEFINITION)
			2 -> println("word: ${string()}")
			3 -> println("definition: ${string()}")
		}
	}
}
