(pattern, action)

where pattern is a regular expression built using the Plex pattern constructors, and action is the action to be performed when this pattern is recognised. Pattern constructors and actions are defined below.

State(name, tokens)

where name is a character string naming the state, and tokens is a list of token definitions as above. The meaning and usage of states is described below.
Plex patterns are built using the following constructors.
- Str(s): Matches the literal string s.
- Str(s1, s2, ...): Matches either the string s1 or s2 or ... Equivalent to Alt(Str(s1), Str(s2), ...).
- Any(s): Matches any single character in the string s.
- AnyBut(s): Matches any single character (including newline) which is not in the string s.
- AnyChar: Matches any single character (including newline). Equivalent to AnyBut('').
- Empty: Matches the empty string.
- p1 + p2: Matches the pattern p1 followed by p2. Equivalent to Seq(p1, p2).
- p1 | p2: Matches either the pattern p1 or p2. Equivalent to Alt(p1, p2).
- Seq(p1, p2, ...): Matches the pattern p1 followed by p2 followed by ...
- Alt(p1, p2, ...): Matches either the pattern p1 or p2 or ...
- Opt(p): Matches either the pattern p or the empty string. Equivalent to p | Empty.
- Rep(p): Matches zero or more repetitions of the pattern p.
- Rep1(p): Matches one or more repetitions of the pattern p.
- NoCase(p): Matches the same strings as the pattern p, except that, in any part of p not enclosed by a Case(), upper and lower case letters are treated as equivalent.
- Case(p): Matches the same strings as the pattern p, except that, in any part of p not enclosed by a NoCase(), upper and lower case letters are treated as distinct.
- Bol: Matches an imaginary character at the beginning of a line (i.e. at the start of the file or just after a newline).
- Eol: Matches an imaginary character at the end of a line (i.e. just before a newline or at the end of the file).
- Eof: Matches an imaginary character at the end of the file.

Note: The patterns Bol, Eol and Eof will only match once at any given position.
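To make the way these constructors compose more concrete, here is a minimal sketch that models a few of them on top of Python's standard re module. It is not Plex itself: the functions below are hypothetical stand-ins that merely borrow the constructor names and return regular-expression strings, and they assume that fullmatch semantics are close enough for illustration (Plex's own matching rules may differ).

```python
import re

# Hypothetical stand-ins named after the Plex constructors. Each returns a
# regular-expression string for Python's re module, not a Plex pattern object.

def Str(*strings):
    # Str(s1, s2, ...) behaves like an alternation of literal strings.
    return "(?:" + "|".join(re.escape(s) for s in strings) + ")"

def Any(chars):
    # Any single character from a non-empty string of candidates.
    return "[" + "".join(re.escape(c) for c in chars) + "]"

def Seq(*patterns):
    return "(?:" + "".join(patterns) + ")"

def Alt(*patterns):
    return "(?:" + "|".join(patterns) + ")"

def Opt(pattern):
    return "(?:" + pattern + ")?"

def Rep(pattern):
    return "(?:" + pattern + ")*"

def Rep1(pattern):
    return "(?:" + pattern + ")+"

# Composite patterns built the same way the constructors above are combined.
letter = Any("abcdefghijklmnopqrstuvwxyz_")
digit = Any("0123456789")
identifier = Seq(letter, Rep(Alt(letter, digit)))
number = Seq(Rep1(digit), Opt(Seq(Str("."), Rep1(digit))))
keyword = Str("if", "else", "while")
```

With these definitions, re.fullmatch(identifier, "foo_42") succeeds while re.fullmatch(identifier, "42foo") does not, which mirrors the "letter followed by letters or digits" reading of Seq, Rep and Alt.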
The action in a token specification may be one of three things:
- A function, which is called as follows: function(scanner, text), where scanner is the relevant Scanner instance and text is the matched text. If the function returns anything other than None, that value is returned as the value of the token. If it returns None, scanning continues as if the IGNORE action were specified (see below).
- One of the following special actions:
  - IGNORE: The recognised characters will be treated as white space and ignored. Scanning will continue until the next non-ignored token is recognised before returning.
  - TEXT: Causes the scanned text itself to be returned as the value of the token.
  - Begin(state): Causes the Scanner to enter the state named state (see below).
- Any other value, which is returned as the value of the token.
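The three kinds of action can be summarised as a small dispatch routine. The sketch below is only a model of the rules just described, written without importing Plex: IGNORE and TEXT are stand-in sentinel objects, token_value is a hypothetical helper name, and Begin(state) is left out because it changes scanner state rather than producing a value.

```python
# Stand-in sentinels for illustration; they are not the Plex objects.
IGNORE = object()
TEXT = object()

def token_value(action, scanner, text):
    """Return the token value for a match, or IGNORE if the match is skipped."""
    if action is IGNORE:
        return IGNORE
    if action is TEXT:
        # The scanned text itself becomes the token value.
        return text
    if callable(action):
        result = action(scanner, text)
        # A function returning None is treated like the IGNORE action.
        return IGNORE if result is None else result
    # Any other value is returned as the token value unchanged.
    return action

def to_int(scanner, text):
    # Example action function: convert the matched digits to an integer.
    return int(text)

def skip(scanner, text):
    # Example action function that returns None, so the match is ignored.
    return None
```

For instance, token_value(to_int, None, "42") yields the integer 42, whereas token_value(skip, None, " ") yields the IGNORE sentinel, so a scanner built on this helper would carry on to the next token.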
At any given time, the scanner is in one of a number of states. Associated with each state is a set of possible tokens. When scanning, only tokens associated with the current state are recognised.
There is a default state, whose name is the empty string. Token definitions which are not inside any State definition belong to the default state.
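The initial state of the scanner is the default state. The state can be changed by using the Begin(state) action in a token definition or by calling the begin() method of the Scanner.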
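As a rough illustration of how states restrict which tokens are recognised, the following sketch implements a tiny state-driven tokenizer with the standard re module. It is a stand-in rather than Plex code: LEXICON, Begin, IGNORE and tokens are hypothetical names chosen for this example, and the loop takes the first pattern that matches, which may differ from how Plex chooses between competing patterns.

```python
import re

IGNORE = object()  # stand-in sentinel, not the Plex IGNORE action

class Begin:
    """Stand-in action that switches the tokenizer to another state."""
    def __init__(self, state):
        self.state = state

# Token definitions grouped by state name; '' is the default state.
LEXICON = {
    "": [
        (r"[A-Za-z]+", "word"),
        (r"\(\*", Begin("comment")),   # "(*" opens a comment
        (r"\s+", IGNORE),
    ],
    "comment": [
        (r"\*\)", Begin("")),          # "*)" returns to the default state
        (r"(?s).", IGNORE),            # everything else in a comment is skipped
    ],
}

def tokens(source):
    state = ""          # scanning starts in the default state
    pos = 0
    out = []
    while pos < len(source):
        # Only the definitions of the current state are tried.
        for pattern, action in LEXICON[state]:
            m = re.compile(pattern).match(source, pos)
            if m:
                break
        else:
            raise ValueError(f"unrecognised input at offset {pos}")
        pos = m.end()
        if isinstance(action, Begin):
            state = action.state
        elif action is not IGNORE:
            out.append((action, m.group()))
    return out
```

Calling tokens("foo (* bar *) baz") returns only the two words outside the comment, because while the tokenizer is in the comment state the "word" definition is not among the candidates.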
A Scanner instance associates a Lexicon with a stream of characters and provides a means of reading tokens from the stream.
Scanner(lexicon, stream[, name = ''])
- read() --> (value, text): Reads the next lexical token from the stream and returns a tuple (value, text), where value is the value associated with the token as specified by the Lexicon, and text is the actual string read from the stream. Returns (None, '') on end of file.
- position() --> (name, line, col): Returns a tuple (name, line, col) representing the location of the last token read using the read() method. name is the name that was provided to the Scanner constructor; line is the line number in the stream (1-based); col is the position within the line of the first character of the token (0-based).
- begin(state_name): Sets the current state of the Scanner to the state named state_name.
- yield(value [, text]): Called from an action procedure, causes value to be returned as the token value from the current call to read(). If text is supplied, it is returned in place of the scanned text. yield() can be called more than once during a single call to an action procedure. In this case, scanning is suspended and tokens are queued and returned one at a time by subsequent calls to read(). When the queue is empty, scanning resumes.
- eof(): This method can be overridden to perform an action when the end of the input stream is encountered. The default implementation does nothing.
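The read() and position() contracts described above can be exercised with a small stand-in class. The sketch below does not use Plex: MiniScanner and read_all are hypothetical names, the token set (integers and identifiers) is hard-coded for the example, and the class takes a string rather than a stream. It only mirrors the documented return shapes, namely (value, text) from read(), (None, '') at end of input, and a 1-based line with a 0-based column from position().

```python
import re

class MiniScanner:
    """Stand-in for illustration only; not the Plex Scanner."""

    WS = re.compile(r"\s*")
    TOKEN = re.compile(r"(\d+)|([A-Za-z_]\w*)")

    def __init__(self, text, name=""):
        self.text = text
        self.name = name
        self.pos = 0
        self.last = (name, 1, 0)

    def read(self):
        # Skip white space, much as an IGNORE action would.
        self.pos = self.WS.match(self.text, self.pos).end()
        if self.pos == len(self.text):
            return (None, "")            # end of input
        m = self.TOKEN.match(self.text, self.pos)
        if m is None:
            raise ValueError(f"unrecognised input at offset {self.pos}")
        start = m.start()
        line = self.text.count("\n", 0, start) + 1           # 1-based line
        col = start - (self.text.rfind("\n", 0, start) + 1)  # 0-based column
        self.last = (self.name, line, col)
        self.pos = m.end()
        value = "int" if m.group(1) is not None else "ident"
        return (value, m.group())

    def position(self):
        # Location of the token most recently returned by read().
        return self.last

def read_all(scanner):
    """Typical read loop: stop when read() reports end of input."""
    out = []
    while True:
        value, text = scanner.read()
        if value is None:
            break
        out.append((value, text, scanner.position()))
    return out
```

The loop in read_all shows the usual pattern for consuming tokens: call read() repeatedly, query position() right after each call if a location is needed for error messages, and stop on the (None, '') result.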