public class RegExp extends Object
Regular expression extension to Automaton.
Regular expressions are built from the following abstract syntax:
| Nonterminal |  | Production | Meaning | Flag |
|---|---|---|---|---|
| regexp | ::= | unionexp |  |  |
|  | \| |  |  |  |
| unionexp | ::= | interexp \| unionexp | (union) |  |
|  | \| | interexp |  |  |
| interexp | ::= | concatexp & interexp | (intersection) | [OPTIONAL] |
|  | \| | concatexp |  |  |
| concatexp | ::= | repeatexp concatexp | (concatenation) |  |
|  | \| | repeatexp |  |  |
| repeatexp | ::= | repeatexp ? | (zero or one occurrence) |  |
|  | \| | repeatexp * | (zero or more occurrences) |  |
|  | \| | repeatexp + | (one or more occurrences) |  |
|  | \| | repeatexp {n} | (n occurrences) |  |
|  | \| | repeatexp {n,} | (n or more occurrences) |  |
|  | \| | repeatexp {n,m} | (n to m occurrences, including both) |  |
|  | \| | complexp |  |  |
| complexp | ::= | ~ complexp | (complement) | [OPTIONAL] |
|  | \| | charclassexp |  |  |
| charclassexp | ::= | [ charclasses ] | (character class) |  |
|  | \| | [^ charclasses ] | (negated character class) |  |
|  | \| | simpleexp |  |  |
| charclasses | ::= | charclass charclasses |  |  |
|  | \| | charclass |  |  |
| charclass | ::= | charexp - charexp | (character range, including end-points) |  |
|  | \| | charexp |  |  |
| simpleexp | ::= | charexp |  |  |
|  | \| | . | (any single character) |  |
|  | \| | # | (the empty language) | [OPTIONAL] |
|  | \| | @ | (any string) | [OPTIONAL] |
|  | \| | " <Unicode string without double-quotes> " | (a string) |  |
|  | \| | ( ) | (the empty string) |  |
|  | \| | ( unionexp ) | (precedence override) |  |
|  | \| | < <identifier> > | (named automaton) | [OPTIONAL] |
|  | \| | <n-m> | (numerical interval) | [OPTIONAL] |
| charexp | ::= | <Unicode character> | (a single non-reserved character) |  |
|  | \| | \ <Unicode character> | (a single character) |  |
The productions marked [OPTIONAL] are only allowed if specified by the syntax flags passed to the RegExp constructor. The reserved characters used in the (enabled) syntax must be escaped with a backslash (\) or enclosed in double-quotes ("..."). (In contrast to other regexp syntaxes, this is also required in character classes.) Be aware that the dash (-) has a special meaning in charclass expressions.
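As an illustration of the escaping rule, the following is a minimal sketch of a helper that makes an arbitrary string safe to embed as a literal. The class `RegExpEscapeSketch` and its method `escape` are hypothetical and not part of the documented API. The sketch deliberately over-escapes: instead of tracking which reserved characters are enabled, it relies on the grammar rule that a backslash followed by any character denotes that single character.

```java
// Hypothetical helper for illustration; not part of the RegExp class.
final class RegExpEscapeSketch {
    private RegExpEscapeSketch() {}

    /**
     * Returns s with every character other than an ASCII letter or digit
     * prefixed by a backslash, so that it is read as a literal character.
     */
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            boolean plain = (cp >= 'a' && cp <= 'z')
                    || (cp >= 'A' && cp <= 'Z')
                    || (cp >= '0' && cp <= '9');
            if (!plain) {
                out.append('\\'); // escape anything that might be reserved
            }
            out.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

The escaped result can then be concatenated into a larger regexp string before it is passed to the RegExp constructor. When the literal contains no double-quote, wrapping it in double-quotes is the other option the text describes.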
An identifier is a string that contains neither a right angle bracket (>) nor a dash (-). Numerical intervals are specified by non-negative decimal integers and include both end points. If n and m have the same number of digits, the conforming strings must also have that length (i.e. they are padded with leading zeros).
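The fixed-width rule for numerical intervals can be made concrete with a small stand-alone check. This is a sketch only: `IntervalSketch` and `matchesFixedWidth` are hypothetical names, the code does not use the library, and it covers only the case spelled out above, where n and m have the same number of digits. Bounds of different lengths are rejected rather than guessed at.

```java
import java.math.BigInteger;

// Hypothetical stand-alone model of <n-m> for bounds of equal length.
final class IntervalSketch {
    private IntervalSketch() {}

    /** Returns true if candidate has the same width as n and m and lies in [n, m]. */
    static boolean matchesFixedWidth(String candidate, String n, String m) {
        if (!isDigits(n) || !isDigits(m)) {
            throw new IllegalArgumentException("bounds must be non-negative decimal integers");
        }
        if (n.length() != m.length()) {
            // Assumption: this sketch does not model bounds of different lengths.
            throw new IllegalArgumentException("sketch covers only bounds of equal length");
        }
        if (candidate.length() != n.length() || !isDigits(candidate)) {
            return false; // wrong width or not a decimal number
        }
        BigInteger value = new BigInteger(candidate);
        return value.compareTo(new BigInteger(n)) >= 0
                && value.compareTo(new BigInteger(m)) <= 0;
    }

    private static boolean isDigits(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }
}
```

Under this model, the interval written as <001-100> accepts "007" and "100" but rejects "7" (wrong length) and "101" (out of range).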
Modifier and Type | Field and Description |
---|---|
static int | ALL: Syntax flag, enables all optional regexp syntax. |
static int | ANYSTRING: Syntax flag, enables anystring (@). |
static int | AUTOMATON: Syntax flag, enables named automata (<identifier>). |
static int | COMPLEMENT: Syntax flag, enables complement (~). |
static int | EMPTY: Syntax flag, enables empty language (#). |
static int | INTERSECTION: Syntax flag, enables intersection (&). |
static int | INTERVAL: Syntax flag, enables numerical intervals (<n-m>). |
static int | NONE: Syntax flag, enables no optional regexp syntax. |
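Because the syntax flags are plain int values that are combined with a bitwise 'or', a short stand-alone sketch may help show the pattern. The class `SyntaxFlagsSketch` is hypothetical, and the bit values in it are placeholders chosen for illustration; real code should refer to the fields of RegExp listed in the table rather than to these numbers.

```java
// Hypothetical stand-in for the flag fields; the bit values are placeholders.
final class SyntaxFlagsSketch {
    private SyntaxFlagsSketch() {}

    static final int NONE = 0;
    static final int INTERSECTION = 1 << 0;
    static final int COMPLEMENT = 1 << 1;
    static final int INTERVAL = 1 << 2;

    /** Returns true if the given flag bit is present in the combined flags. */
    static boolean enabled(int flags, int flag) {
        return (flags & flag) != 0;
    }

    /** Combines the flags for intersection and numerical intervals only. */
    static int intersectionAndInterval() {
        return INTERSECTION | INTERVAL;
    }
}
```

With the actual library on the classpath, the analogous combined value would be passed as the syntax_flags argument of the two-argument RegExp constructor, so that only the selected [OPTIONAL] productions are accepted.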
Constructor and Description |
---|
RegExp(String s): Constructs new RegExp from a string. |
RegExp(String s, int syntax_flags): Constructs new RegExp from a string. |
Modifier and Type | Method and Description |
---|---|
Set<String> | getIdentifiers(): Returns set of automaton identifiers that occur in this regular expression. |
boolean | setAllowMutate(boolean flag): Sets or resets allow mutate flag. |
Automaton | toAutomaton(): Constructs new Automaton from this RegExp. |
Automaton | toAutomaton(AutomatonProvider automaton_provider): Constructs new Automaton from this RegExp. |
Automaton | toAutomaton(AutomatonProvider automaton_provider, boolean minimize): Constructs new Automaton from this RegExp. |
Automaton | toAutomaton(boolean minimize): Constructs new Automaton from this RegExp. |
Automaton | toAutomaton(Map<String,Automaton> automata): Constructs new Automaton from this RegExp. |
Automaton | toAutomaton(Map<String,Automaton> automata, boolean minimize): Constructs new Automaton from this RegExp. |
String | toString(): Constructs string from parsed regular expression. |
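public static final int INTERSECTION
Syntax flag, enables intersection (&).

public static final int COMPLEMENT
Syntax flag, enables complement (~).

public static final int EMPTY
Syntax flag, enables empty language (#).

public static final int ANYSTRING
Syntax flag, enables anystring (@).

public static final int AUTOMATON
Syntax flag, enables named automata (<identifier>).

public static final int INTERVAL
Syntax flag, enables numerical intervals (<n-m>).

public static final int ALL
Syntax flag, enables all optional regexp syntax.

public static final int NONE
Syntax flag, enables no optional regexp syntax.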

public RegExp(String s) throws IllegalArgumentException
Constructs new RegExp from a string. Same as RegExp(s, ALL).
Parameters:
s - regexp string
Throws:
IllegalArgumentException - if an error occurred while parsing the regular expression

public RegExp(String s, int syntax_flags) throws IllegalArgumentException
Constructs new RegExp from a string.
Parameters:
s - regexp string
syntax_flags - boolean 'or' of optional syntax constructs to be enabled
Throws:
IllegalArgumentException - if an error occurred while parsing the regular expression

public Automaton toAutomaton()
Constructs new Automaton from this RegExp. Same as toAutomaton(null) (empty automaton map).

public Automaton toAutomaton(boolean minimize)
Constructs new Automaton from this RegExp. Same as toAutomaton(null,minimize) (empty automaton map).

public Automaton toAutomaton(AutomatonProvider automaton_provider) throws IllegalArgumentException
Constructs new Automaton from this RegExp. The constructed automaton is minimal and deterministic and has no transitions to dead states.
Parameters:
automaton_provider - provider of automata for named identifiers
Throws:
IllegalArgumentException - if this regular expression uses a named identifier that is not available from the automaton provider

public Automaton toAutomaton(AutomatonProvider automaton_provider, boolean minimize) throws IllegalArgumentException
Constructs new Automaton from this RegExp. The constructed automaton has no transitions to dead states.
Parameters:
automaton_provider - provider of automata for named identifiers
minimize - if set, the automaton is minimized and determinized
Throws:
IllegalArgumentException - if this regular expression uses a named identifier that is not available from the automaton provider

public Automaton toAutomaton(Map<String,Automaton> automata) throws IllegalArgumentException
Constructs new Automaton from this RegExp. The constructed automaton is minimal and deterministic and has no transitions to dead states.
Parameters:
automata - a map from automaton identifiers to automata (of type Automaton)
Throws:
IllegalArgumentException - if this regular expression uses a named identifier that does not occur in the automaton map

public Automaton toAutomaton(Map<String,Automaton> automata, boolean minimize) throws IllegalArgumentException
Constructs new Automaton from this RegExp. The constructed automaton has no transitions to dead states.
Parameters:
automata - a map from automaton identifiers to automata (of type Automaton)
minimize - if set, the automaton is minimized and determinized
Throws:
IllegalArgumentException - if this regular expression uses a named identifier that does not occur in the automaton map

public boolean setAllowMutate(boolean flag)
Sets or resets allow mutate flag.
Parameters:
flag - if true, the flag is set

public String toString()
Constructs string from parsed regular expression.

Copyright © 2020. All rights reserved.