author Mattias Andrée <m@maandree.se> 2026-01-05 14:35:26 +0100
committer Mattias Andrée <m@maandree.se> 2026-02-23 07:53:08 +0100
commit 208594ca9f95a87f60ff052490a4d5824dc23801
tree 915a17b51f62b289ba4115214a057610416d755e /README
parent Add committed-operator
Make deterministic the default
Signed-off-by: Mattias Andrée <m@maandree.se>
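For illustration, here is a minimal sketch of the operators this commit touches, written in the grammar notation that the README diff below describes. The rule names (greedy-run, lazy-run, keyword) are hypothetical and do not come from the README. Only the "?" prefix and the "+" operator are taken from the changed grammar rules. The behaviour described after the block is an assumption based on the commit title and the new README text, not a statement of how libparser is implemented.

```
greedy-run = {"a"}, "b";
lazy-run = ?{"a"}, "ab";
keyword = +("if" | "else"), " ";
```

Under that reading, greedy-run takes as many "a" as possible and keeps them, lazy-run marks its repetition as non-deterministic so that the parser may presumably give back an "a" when the following "ab" fails to match, and keyword commits to whichever alternative matched, so later backtracking does not retry the other choice inside the group.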
Diffstat (limited to 'README')
-rw-r--r-- README 37
1 files changed, 27 insertions, 10 deletions
diff --git a/README b/README
index 37b2701..23d231b 100644
--- a/README
+++ b/README
@@ -76,11 +76,14 @@ EXTENDED DESCRIPTION
_low = character | integer;
_high = character | integer;
+ nondeterministic = "?";
+
+ committed = "+", _, _operand;
rejection = "!", _, _operand;
concatenation = _operand, {_, ",", _, _operand};
- alternation = concatenation, {_, "|", _, concatenation};
- optional = "[", _, _expression, _, "]";
- repeated = "{", _, _expression, _, "}";
+ alternation = concatenation, {_, [nondeterministic], "|", _, concatenation};
+ optional = [nondeterministic], "[", _, _expression, _, "]";
+ repeated = [nondeterministic], "{", _, _expression, _, "}";
group = "(", _, _expression, _, ")";
char-range = "<", _, _low, _, ",", _, _high, "_", ">";
exception = "-";
@@ -88,7 +91,7 @@ EXTENDED DESCRIPTION
_literal = char-range | exception | string;
_group = optional | repeated | group | embedded-rule;
- _operand = _group | _literal | rejection;
+ _operand = _group | _literal | rejection | committed;
_expression = alternation;
@@ -109,12 +112,24 @@ EXTENDED DESCRIPTION
reached, the parser will terminate there.
Repeated symbols may occur any number of times, including
- zero. The compiler is able to backtrack if it takes too much.
+ zero. The parser will try to take as much as possible.
+
+ Optional symbols are taken whenever possible.
+
+ Concatenation has higher precedence than alternation. Groups
+ ("(", ..., ")") have no semantic meaning and are useful only
+ for including alternations inside concatenations or putting
+ alternations or concatenations inside a commitment or
+ rejection.
- Concatenation has higher precedence than alternation,
- groups ("(", ..., ")") have no semantic meaning and are useful
- only to put a alternation inside a concatenation without
- creating a new rule for that.
+ The parser has the ability to be non-deterministic, which
+ can make it really slow. To speed up parsing, you can add
+ commits. Once the committed sentence has been matched, the
+ branching-points inside it are unrecorded, so when the parser
+ backtracks, it will not try different choices inside the
+ committed sentence. Commits are undone by the parser when it
+ backtracks to a branching-point outside the committed
+ sentence.
In character ranges, the _high and _low values must be at
least 0 and at most 255, and _high must be greater than _low.
@@ -125,7 +140,9 @@ EXTENDED DESCRIPTION
Left recursion is illegal (it will cause stack overflow at
runtime as the empty condition before the recursion is always
- met).
+ met). Likewise, repeated optional sentences are illegal; a
+ repeated sentence must always consume input, otherwise it
+ gets stuck.
Right-context-sensitive grammar
libparser originally used context-free grammar, but with