
Commit 3dab649e authored by eckhart

- TODO.md changed: New objective: early tree optimization

parent fab4160e
@@ -1097,7 +1097,7 @@ class PreprocessorToken(Parser):
 class PlainText(Parser):
     """
-    Parses plain text strings.
+    Parses plain text strings. (Could be done by RegExp as well, but is faster.)
     Example:
     >>> while_token = PlainText("while")
@@ -1108,14 +1108,14 @@ class PlainText(Parser):
     def __init__(self, text: str, name: str = '') -> None:
         super().__init__(name)
         self.text = text
-        self.textlen = len(text)
+        self.len = len(text)
     def __deepcopy__(self, memo):
         return self.__class__(self.text, self.name)
     def __call__(self, text: StringView) -> Tuple[Optional[Node], StringView]:
         if text.startswith(self.text):
-            return Node(self, self.text, True), text[self.textlen:]
+            return Node(self, self.text, True), text[self.len:]
         return None, text
@@ -2247,10 +2247,12 @@ def compile_source(source: str,
         2. A list of error or warning messages
         3. The root-node of the abstract syntax tree
     """
-    source_text = load_if_file(source)
+    original_text = load_if_file(source)
     log_file_name = logfile_basename(source, compiler)
-    if preprocessor is not None:
-        source_text, source_mapping = with_source_mapping(preprocessor(source_text))
+    if preprocessor is None:
+        source_text = original_text
+    else:
+        source_text, source_mapping = with_source_mapping(preprocessor(original_text))
     syntax_tree = parser(source_text)
     if is_logging():
         syntax_tree.log(log_file_name + '.cst')
@@ -2273,4 +2275,5 @@ def compile_source(source: str,
# print(syntax_tree.as_sxpr())
messages.extend(syntax_tree.collect_errors(source_text))
syntax_tree.error_flag = max(syntax_tree.error_flag, efl)
return result, messages, syntax_tree
@@ -3,6 +3,7 @@
#cython: c_string_type=unicode
#cython: c_string_encoding=utf-8
import cython
# type hints for Cython python -> C compiler to speed up the most
......
General TODO-List
------------------
+=================
Optimizations
-------------
**Early discarding of nodes**:
Reason: `traverse_recursive` and `Node.result-setter` are top time consumers!
Allow specifying parsers/nodes whose results will be dropped right away,
so that the nodes they produce do not need to be removed during the
AST-Transformations. Typical candidates would be:
1. Tokens ":Token"
2. Whitespace ":Whitespace" (in some cases)
3. empty Nodes
and basically anything that would be removed globally ("+" entry in the
AST-Transformation dictionary) later anyway.
A directive ("@discardable = ...") could be introduced to specify the discardables.
Challenges:
1. Discardable Nodes should not even be created in the first place to avoid
   costly object creation and assignment of result to the Node object on
   creation.
2. ...but discarded or discardable nodes are not the same as a non-matching parser.
   A possible solution would be to introduce a dummy/zombie-Node that will be
   discarded by the calling Parser, e.g. Zero or More, Series etc.
3. Two kinds of conditions for dis
4. Capture/Retrieve/Pop need the parsed data even if the node would otherwise
   be discardable (example: variable delimiters). So, either:
   a. temporarily suspend discarding via a grammar-object flag that is set and
      cleared by Capture/Retrieve/Pop. This means yet another flag has to be
      checked every time the decision to discard or not needs to be taken...
   b. statically check (i.e. check at compile time) that Capture/Retrieve/Pop
      neither directly nor indirectly call a discardable parser. Downside:
      Some parsers cannot profit from the optimization. For example, variable
      delimiters, which like all delimiters would otherwise be good candidates
      for discarding, can no longer be discarded.
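
The dummy/zombie-Node idea from challenge 2 can be illustrated with a minimal, self-contained sketch. It is not the project's implementation: the `Node`, `PlainText` and `Series` classes below are simplified stand-ins that work on plain `str` instead of `StringView`, and the `DROPPED` sentinel and the `discard` parameter are hypothetical names chosen for illustration. The point is only that a shared sentinel lets a discardable parser report "matched, nothing to keep" without allocating a node, while `None` keeps its meaning of "did not match".

```python
from typing import List, Optional, Tuple


class Node:
    """Simplified stand-in for a syntax tree node (assumption, not the real class)."""

    def __init__(self, name: str, result) -> None:
        self.name = name
        self.result = result

    def __repr__(self) -> str:
        return f"Node({self.name!r}, {self.result!r})"


# Hypothetical sentinel: "the parser matched, but its result is discarded".
# It is created once and shared, so a discardable match allocates nothing.
DROPPED = Node(":Dropped", "")


class PlainText:
    """Matches a fixed string; optionally marks its result as discardable."""

    def __init__(self, text: str, name: str = "", discard: bool = False) -> None:
        self.text = text
        self.len = len(text)
        self.name = name or ":Token"
        self.discard = discard  # hypothetical flag, e.g. set from a directive

    def __call__(self, text: str) -> Tuple[Optional[Node], str]:
        if text.startswith(self.text):
            node = DROPPED if self.discard else Node(self.name, self.text)
            return node, text[self.len:]
        return None, text  # None still means "no match"


class Series:
    """Matches all sub-parsers in order and drops sentinel results."""

    def __init__(self, *parsers, name: str = "series") -> None:
        self.parsers = parsers
        self.name = name

    def __call__(self, text: str) -> Tuple[Optional[Node], str]:
        children: List[Node] = []
        rest = text
        for parser in self.parsers:
            node, rest = parser(rest)
            if node is None:
                return None, text  # genuine failure: give back the input
            if node is not DROPPED:  # identity check, no extra allocation
                children.append(node)
        return Node(self.name, tuple(children)), rest


assignment = Series(
    PlainText("x", name="variable"),
    PlainText("=", discard=True),  # delimiter: matched, but never stored
    PlainText("1", name="number"),
    name="assignment",
)
tree, rest = assignment("x=1")
print(tree)
```

Because the calling parser filters the sentinel by identity, the check costs one comparison per child. Challenge 4 would still apply: a Capture/Retrieve/Pop-style parser wrapped around a discardable parser would receive the sentinel instead of the matched text unless discarding is suspended or ruled out statically.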