Commit 5a431a7f authored by Eckhart Arnold

documentation extended

parent 26eac43a
......@@ -16,8 +16,8 @@
# permissions and limitations under the License.
Module ``dsl`` contains various functions to support the
compilation of domain specific languages based on an EBNF-grammar.
Module ``dsl`` contains high-level functions for the compilation
of domain specific languages based on an EBNF-grammar.
import functools
......@@ -1680,14 +1680,14 @@ def dsl_error_msg(parser: Parser, error_str: str) -> str:
return " ".join(msg)
_GRAMMAR_PLACEHOLDER = None # type: Grammar
_GRAMMAR_PLACEHOLDER = None # type: Optional[Grammar]
def get_grammar_placeholder() -> Grammar:
_GRAMMAR_PLACEHOLDER = Grammar.__new__(Grammar)
return cast(Parser, _GRAMMAR_PLACEHOLDER)
return cast(Grammar, _GRAMMAR_PLACEHOLDER)
def is_grammar_placeholder(grammar: Optional[Grammar]) -> bool:
......@@ -17,18 +17,18 @@ Step Guide`_.) The usage and API of DHParser is (or will be) described
with many examples in the docstrings of its various modules. The following
reading-order is recommended to understand DHParser:
1. `` - Although DHParser also offers a Python-interface for specifying
1. `ebnf` - Although DHParser also offers a Python-interface for specifying
grammars (similar to pyparsing_), the recommended way of using DHParser
is by specifying the grammar in EBNF_. Here it is described how grammars
are specified in EBNF_ and how parsers can be auto-generated from these
grammars and how they are used to parse text.
2. `` - Syntax-trees are the central data-structure of any
2. `syntaxtree` - Syntax-trees are the central data-structure of any
parsing system. The description of this module explains how syntax-trees
are represented within DHParser, how they can be manipulated, queried
and serialized or deserialized as XML, S-expressions or JSON.
3. `` - It is not untypical for digital humanities applications
3. `transform` - It is not untypical for digital humanities applications
that document trees are transformed again and again to produce different
representations of research data or various output forms. DHParser
supplies the scaffolding for two different types of tree transformations,
......@@ -42,7 +42,7 @@ reading-order is recommended to understand DHParser:
(An example for this kind of declaratively specified transformation is
the ``EBNF_AST_transformation_table`` within the DHParser's ebnf-module.)
4. `` - The compile-module offers an object-oriented scaffolding
4. `compile` - The compile-module offers an object-oriented scaffolding
for the `visitor pattern`_ that is more suitable for complex
transformations that make heavy use of algorithms as well as
transformations from trees to non-tree objects like program code.
......@@ -51,91 +51,102 @@ reading-order is recommended to understand DHParser:
With the documentation of these four modules you should have enough
knowledge to realize projects that follow the workflow described
in the `Step by Step Guide`_. There will seldom be need to interact
with the other modules directly. However, reading their documentation
may help deepen the understanding of how DHParser works under
the hood and be useful for more specialized use cases.
in the `Step by Step Guide`_. In most cases there will be no need to
interact with the other modules directly.
5. `parse` - contains the parsing algorithms and the
Python-Interface for defining parsers. DHParser features a packrat-parser
for parsing-expression-grammars with full left-recursion support as well as
configurable error catching and continuation after errors. The
Python-Interface allows grammars to be defined directly as Python-code
without the need to compile an EBNF-grammar first. This is an alternative
approach to defining grammars similar to that of pyparsing_.
6. `dsl` - contains high-level functions for compiling
ebnf-grammars and domain specific languages "on the fly".
7. `preprocess` - provides support for DSL-preprocessors as well as source
mapping of (error-)locations from the preprocessed document to the original
document(s). Preprocessors are a practical means for adding features to
a DSL which are difficult or impossible to define with context-free-grammars
in EBNF-notation, like for example scoping based on indentation (as used
by Python) or chaining of source-texts via an "include"-directive.
8. `error` - defines the ``Error``-class, the objects of which describe
errors in the source document. Errors are defined by - at least - an
error code (indicating at the same time the level of severity), a human
readable error message and a position in the source text.
9. `testing` - provides functions for unit-testing of grammars. Usually,
developers will not need to interact with this module directly, but rely on
the unit-testing script generated by the "" command-line tool.
10. `trace` - Apart from unit-testing DHParser offers "post-mortem"
debugging of the parsing process itself - as described in the
`Step by Step Guide`_. This is helpful to figure out why a parser went
wrong. Again, there is little need to interact with this module directly,
as its functionality is turned on by setting the configuration variables
``history_tracking`` and, for tracing continuation after errors,
``resume_notices``, which in turn can be triggered by calling the
auto-generated with the parameter ``--debug``.
11. `log` - logging facilities for DHParser as well as tracking of the
parsing-history in connection with module `trace`.
12. `configuration` - the central place for all configuration settings of
DHParser. Be sure to use the ``access``, ``set`` and ``get`` functions
to change presets and configuration values in order to make sure that
changes to the configuration work when used in combination with
multithreading or multiprocessing.
13. `server` - In order to avoid startup times or to provide a language
server for a domain-specific-language (DSL), DSL-parsers generated by
DHParser can be run as a server. Module `server` provides the scaffolding
for an asynchronous language server. The"-script generated
by DHParser provides a minimal language server (sufficient) for
compiling a DSL. Especially if used with the just-in-time compiler
`pypy`_ using the script allows for a significant speed-up.
14. `lsp` - (as of now, this is just a stub!) provides data classes that
resemble the typescript-interfaces of the `language server protocol specification`_.
15. `stringview` - defines a low level class that provides views on slices
of strings. It is used by the `parse`-module to avoid excessive copying
of data when slicing strings. (Python always creates a copy of the
data when slicing strings as a design decision.) If any module, this one
can be sped up significantly by compiling it with cython_. (Use the
``cythonize_stringview``-script in DHParser's main directory or,
even better, compile (almost) all modules with the
``build_cython-modules``-script. This yields a 2-3x speed increase.
The fastest way to run DHParser, however, is pypy_, which yields
a 4-5x speed increase, albeit only in the long run.)
16. `toolkit` - various little helper functions for DHParser. Usually,
there is no need to call any of these directly.
17. `` - this module merely contains the version-number
of the DHParser-package.
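The advice in item 12 (change configuration values only through accessor functions) can be illustrated with a minimal sketch. It assumes that per-thread settings are copied from shared presets on first access. All names below (``set_preset``, ``get_value``, ``set_value``, ``read_in_worker``) are hypothetical and chosen for illustration. This is not the API or the implementation of DHParser's `configuration` module.

```python
import threading
from typing import Any, Dict, List

# Hypothetical names throughout; this is NOT DHParser's configuration module.
_PRESETS: Dict[str, Any] = {"history_tracking": False, "resume_notices": False}
_PRESETS_LOCK = threading.Lock()
_THREAD_LOCAL = threading.local()


def set_preset(key: str, value: Any) -> None:
    """Change a shared default; affects threads that have not yet read their config."""
    with _PRESETS_LOCK:
        _PRESETS[key] = value


def _local_config() -> Dict[str, Any]:
    """Return this thread's private settings, copying the presets on first access."""
    config = getattr(_THREAD_LOCAL, "config", None)
    if config is None:
        with _PRESETS_LOCK:
            config = dict(_PRESETS)
        _THREAD_LOCAL.config = config
    return config


def get_value(key: str) -> Any:
    """Read a setting as seen by the calling thread."""
    return _local_config()[key]


def set_value(key: str, value: Any) -> None:
    """Change a setting for the calling thread only."""
    _local_config()[key] = value


def read_in_worker(key: str) -> Any:
    """Read a setting from inside a freshly started thread."""
    seen: List[Any] = []
    worker = threading.Thread(target=lambda: seen.append(get_value(key)))
    worker.start()
    worker.join()
    return seen[0]
```

In this sketch a value changed with ``set_value`` stays local to the calling thread, while a new thread starts from the shared presets. Going through the accessors keeps that separation explicit, which mutating a module-level dictionary directly would not.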
.. _pyparsing:
.. _lark:
.. _`language server protocol`:
.. _cython:
.. _`language server`:
.. _`language sever protocol`:
.. _`language server protocol specification`:
.. _EBNF:
.. _`visitor pattern`:
.. _pypy:
.. _`Step by Step Guide`: StepByStepGuide.rst
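The idea behind item 15 (views on slices of strings instead of copies) can be sketched as follows. The class ``StringViewSketch`` is a hypothetical, heavily simplified illustration that assumes a view only needs a reference to the full text plus two offsets. It is not the class defined in DHParser's `stringview` module.

```python
from typing import Optional, Union


class StringViewSketch:
    """Hypothetical, simplified string view; not DHParser's actual class."""

    __slots__ = ("_text", "_begin", "_end")

    def __init__(self, text: str, begin: int = 0, end: Optional[int] = None) -> None:
        self._text = text  # shared reference, never copied
        self._begin = begin
        self._end = len(text) if end is None else end

    def __len__(self) -> int:
        return self._end - self._begin

    def __getitem__(self, index: Union[int, slice]) -> Union[str, "StringViewSketch"]:
        if isinstance(index, slice):
            # slice.indices() clamps start and stop to the length of this view
            start, stop, step = index.indices(len(self))
            if step != 1:
                raise ValueError("this sketch supports only slices with step 1")
            # Slicing adjusts the offsets only; max() guards against start > stop
            return StringViewSketch(self._text, self._begin + start,
                                    self._begin + max(start, stop))
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("index out of range")
        return self._text[self._begin + index]

    def __str__(self) -> str:
        # The only place where characters are actually copied
        return self._text[self._begin:self._end]

    def startswith(self, prefix: str) -> bool:
        # str.startswith accepts offsets, so no intermediate string is built
        return self._text.startswith(prefix, self._begin, self._end)


document = "alpha beta gamma"
rest = StringViewSketch(document)[6:]  # view on "beta gamma"
word = rest[:4]                        # view on "beta"
```

A parser that repeatedly advances through a long document can thus pass along the remaining text as a view and only materialize a real string when the matched content is needed.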
DHParser is split into a number of modules plus one command line utility
(````, which will not be described here.)
Usually, the user or "importer" of DHParser does not need to worry
about its internal module structure, because DHParser provides a flat
namespace from which all of its symbols can be imported, e.g.::
from DHParser import *
from DHParser import recompile_grammar, grammar_suite, compile_source
However, in order to add or change the source code of DHParser, its module
structure must be understood. DHParser's modules can roughly be sorted into
three different categories:
1. Modules that contain the basic functionality for packrat-parsing,
AST-transformation and the skeleton for DSL-compilers.
2. Modules for EBNF-Grammars and DSL compilation.
3. Support or "toolkit"-modules that contain different helpful functions
The import-order of DHParser's modules runs across these categories. In the
following list the modules further below in the list may import one or
more of the modules further above in the list, but not the other way round:
- -- contains the version number of DHParser
- -- utility functions for DHParser
- -- a string class where slices are views not copies as
with the standard Python strings.
- -- preprocessing of source files for DHParser
- -- error handling for DHParser
- -- syntax tree classes for DHParser
- -- transformation functions for converting the concrete
into the abstract syntax tree
- -- logging and debugging for DHParser
- -- parser combinators for DHParser
- -- abstract base class for compilers that transform an AST
into something useful
- -- EBNF -> Python-Parser compilation for DHParser
- -- Support for domain specific notations for DHParser
- -- test support for DHParser based grammars and compilers
Main Modules Reference
The core of DHParser are the modules containing the functionality
for the parsing and compiling process. The modules ``preprocess``,
``parse``, ``transform`` and ``compile`` represent particular stages of the
parsing/compiling process, while ``syntaxtree`` and ``error`` define
classes for syntax trees and parser/compiler errors, respectively.
Module ``preprocess``
Module ``ebnf``
.. automodule:: preprocess
.. automodule:: ebnf
Module ``syntaxtree``
......@@ -144,12 +155,6 @@ Module ``syntaxtree``
.. automodule:: syntaxtree
Module ``parse``
.. automodule:: parse
Module ``transform``
......@@ -162,41 +167,32 @@ Module ``compile``
.. automodule:: compile
Module ``error``
Module ``parse``
.. automodule:: error
.. automodule:: parse
Domain Specific Language Modules Reference
DHParser contains additional support for domain specific languages.
Module ``ebnf`` provides a self-hosting parser for EBNF-Grammars as
well as an EBNF-compiler that compiles an EBNF-Grammar into a
DHParser based Grammar class that can be executed to parse source text
conforming to this grammar into concrete syntax trees.
Module ``dsl`` contains additional functions to support the compilation
of arbitrary domain specific languages (DSL).
One indispensable part of the systematic construction of domain
specific languages is testing. DHParser supports unit testing of
smaller as well as larger components of the Grammar of a DSL.
Module ``dsl``
.. automodule:: dsl
Module ``ebnf``
Module ``preprocess``
.. automodule:: ebnf
.. automodule:: preprocess
Module ``dsl``
.. automodule:: dsl
Module ``error``
.. automodule:: error
Module ``testing``
......@@ -206,36 +202,34 @@ Module ``testing``
Module ``trace``
Supporting Modules Reference
Finally, DHParser comprises a number of "toolkit"-modules which
define helpful functions and classes that are used at different
places throughout the other DHParser-modules.
.. automodule:: trace
Module ``server``
Module ``log``
.. automodule:: server
.. automodule:: log
Module ``toolkit``
Module ``configuration``
.. automodule:: toolkit
.. automodule:: configuration
Module ``trace``
Module ``server``
.. automodule:: trace
.. automodule:: server
Module ``log``
Module ``lsp``
.. automodule:: log
.. automodule:: lsp
Module ``stringview``
......@@ -244,6 +238,12 @@ Module ``stringview``
.. automodule:: stringview
Module ``toolkit``
.. automodule:: toolkit
Module ``versionnumber``
......@@ -900,7 +900,7 @@ if __name__ == "__main__":
log_dir = 'LOGS'
set_config_value('history_tracking', True)
set_config_value('resume_notices', True)
set_config_value('log_syntax_trees', set(['cst', 'ast'])) # don't use a set literal, here
set_config_value('log_syntax_trees', set(['cst', 'ast'])) # don't use a set literal, here!
if args.xml: