badw-it / DHParser — Commits

Commit 5aa3acd2, authored Dec 17, 2017 by eckhart

    some documentation added (still a stub)

Parent: c90818b7
Changes: 15 files
DHParser/ebnf.py

```diff
@@ -145,7 +145,8 @@ class EBNFGrammar(Grammar):
 def grammar_changed(grammar_class, grammar_source: str) -> bool:
-    """Returns ``True`` if ``grammar_class`` does not reflect the latest
+    """
+    Returns ``True`` if ``grammar_class`` does not reflect the latest
     changes of ``grammar_source``

     Parameters:
```
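Only the signature and docstring of `grammar_changed` appear in this hunk. The underlying idea — compare a digest of the current EBNF source against a digest stored on the generated parser class, like the `source_hash__` fields changed elsewhere in this commit — can be sketched as follows. This is a toy model, not DHParser's actual implementation; `StubGrammar` and the grammar string are hypothetical stand-ins:

```python
import hashlib

def source_hash(grammar_source: str) -> str:
    """Digest of the EBNF source, comparable to a stored source_hash__."""
    return hashlib.md5(grammar_source.encode('utf-8')).hexdigest()

class StubGrammar:
    # hash recorded when the parser class was generated from this source
    source_hash__ = source_hash("document = //~ { WORD } EOF")

def grammar_changed(grammar_class, grammar_source: str) -> bool:
    """True if grammar_source no longer matches the hash stored on the class."""
    return grammar_class.source_hash__ != source_hash(grammar_source)

print(grammar_changed(StubGrammar, "document = //~ { WORD } EOF"))  # False
print(grammar_changed(StubGrammar, "document = { WORD } §EOF"))     # True
```

Recompiling only when the hash differs is what lets the generated parser module be skipped when the grammar file is unchanged.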
DHParser/syntaxtree.py

```diff
@@ -583,7 +583,7 @@ class Node(collections.abc.Sized):
     def log(self, log_file_name):
         """
-        Writes ab S-expressions of the tree with root `self` to a file.
+        Writes an S-expression-representation of the tree with root `self` to a file.
         """
         if is_logging():
             path = os.path.join(log_dir(), log_file_name)
```
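For readers unfamiliar with the notation the corrected docstring refers to: an S-expression renders a tree as nested parentheses. A toy serializer over `(tag, children-or-text)` tuples (not DHParser's `Node` API) illustrates the format:

```python
def as_sxpr(node) -> str:
    """Render a (tag, children-or-text) tuple tree as an S-expression."""
    tag, content = node
    if isinstance(content, str):
        # leaf node: tag with its matched text
        return '(%s "%s")' % (tag, content)
    # branch node: tag followed by the S-expressions of its children
    return '(%s %s)' % (tag, ' '.join(as_sxpr(child) for child in content))

tree = ('sentence', [('word', 'Hello'), ('word', 'world')])
print(as_sxpr(tree))  # (sentence (word "Hello") (word "world"))
```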
DHParser/versionnumber.py

```diff
@@ -17,4 +17,4 @@ permissions and limitations under the License.
 """

 __all__ = ('__version__',)

-__version__ = '0.7.8'  # + '_dev' + str(os.stat(__file__).st_mtime)
+__version__ = '0.7.9'  # + '_dev' + str(os.stat(__file__).st_mtime)
```
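The commented-out expression on the changed line hints at a development-version scheme: append the module file's modification time to the release number. A minimal sketch of what it would produce (using a temporary file as a stand-in for `__file__`):

```python
import os
import tempfile

__version__ = '0.7.9'

def dev_version(file_path: str) -> str:
    """Version tag with the file's mtime appended, as the commented-out
    '_dev' expression in versionnumber.py would produce."""
    return __version__ + '_dev' + str(os.stat(file_path).st_mtime)

# any readable file works as a stand-in for the module's __file__
with tempfile.NamedTemporaryFile(delete=False) as f:
    tmp = f.name
v = dev_version(tmp)
os.unlink(tmp)
print(v.startswith('0.7.9_dev'))  # True
```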
Introduction.md

```diff
@@ -5,9 +5,8 @@ Introduction to [DHParser](https://gitlab.lrz.de/badw-it/DHParser)
 Motto: **Computers enjoy XML, humans don't.**

-Why use domain specific languages in the humanities?
-----------------------------------------------------
+Why use domain specific languages in the humanities
+---------------------------------------------------

 Suppose you are a literary scientist and you would like to edit a poem
 like Heinrich Heine's "Lyrisches Intermezzo". Usually, the technology of
@@ -101,7 +100,7 @@ can be simply encoded like this:
         so muß ich weinen bitterlich.

 Yes, that's right. It is as simple as that. Observe, how much more
-effacious a verse like "Wenn ich mich lehn' an deine Brust, / kommt's
+efficacious a verse like "Wenn ich mich lehn' an deine Brust, / kommt's
 über mich wie Himmelslust," can be if it is not cluttered with XML tags
 ;-)
@@ -203,7 +202,7 @@ definition of `text` in the 6th line: `{ strophe {LEERZEILE} }+`. This reads as
 follows: The text of the poem consists of a sequence of stanzas, each of which
 is followed by a sequence of empty lines (German: "Leerzeilen"). If you now look
 at the structural definition of a stanza, you find that it consists of a sequence
-of verses, each of which starts, i.e. is preceeded by a new line.
+of verses, each of which starts, i.e. is preceded by a new line.
 Can you figure out the rest? Hint: The angular brackets `[` and `]` mean that and
 item is optional and the `§` sign means that it is obligatory. (Strictly speaking,
@@ -215,7 +214,7 @@ This should be enough for an introduction to the purpose of DSLs in the
 humanities. It has shown the probably most important use case of
 DHParser, i.e. as a frontend-technology form XML-encodings. Of course,
 it can just as well be used as a frontend for any other kind of
-structured data, like SQL or graph-strcutured data. The latter is by the
+structured data, like SQL or graph-structured data. The latter is by the
 way is a very reasonable alternative to XML for edition projects with a
 complex transmission history. See Andreas Kuczera's Blog-entry on
 ["Graphdatenbanken für Historiker"](http://mittelalter.hypotheses.org/5995).
@@ -362,7 +361,7 @@ yields the fairly clean Pseudo-XML-representation of the DSL-encoded
 poem that we have seen above. Just as a teaser, you might want to look
 up, how the AST-transformation is specified with DHParser. For this
 purpose, you can have a look in file `LyrikCompiler_example.py`. If you
-scrool down to the AST section, you'll see something like this:
+scroll down to the AST section, you'll see something like this:

     Lyrik_AST_transformation_table = {
         # AST Transformations for the Lyrik-grammar
```
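The teaser in the last hunk refers to DHParser's table-driven AST transformation: tag names mapped to lists of transformation functions. The mechanism can be sketched with a toy bottom-up walker over `(tag, children)` tuples; this is a hypothetical model, far simpler than DHParser's `traverse()`, and the stanza example is invented for illustration:

```python
def transform(node, table):
    """Walk the tree bottom-up and apply the functions registered
    under each node's tag name."""
    tag, children = node
    if isinstance(children, list):
        node = (tag, [transform(child, table) for child in children])
    for fn in table.get(tag, []):
        node = fn(node)
    return node

# drop empty-line nodes from a stanza, as an AST transformation might
strip_leerzeilen = lambda n: (n[0], [c for c in n[1] if c[0] != 'LEERZEILE'])
table = {'strophe': [strip_leerzeilen]}

tree = ('strophe', [('vers', 'Im wunderschönen Monat Mai'), ('LEERZEILE', '')])
print(transform(tree, table))
# ('strophe', [('vers', 'Im wunderschönen Monat Mai')])
```

The appeal of the table style is that the grammar author only registers per-tag clean-up functions and never writes the traversal itself.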
README.md

```diff
@@ -7,7 +7,6 @@ specific languages (DSL) in Digital Humanities projects.
 Author: Eckhart Arnold, Bavarian Academy of Sciences
 Email: arnold@badw.de
-
 License
 -------
@@ -32,7 +31,6 @@ Python 3.5 source code in order for DHParser to be backwards compatible
 with Python 3.4. The module ``DHParser/foreign_typing.py`` is licensed under the
 [Python Software Foundation License Version 2](https://docs.python.org/3.5/license.html)
-
 Sources
 -------
@@ -44,7 +42,6 @@ Get them with:
 Please contact me, if you are intested in contributing to the
 development or just using DHParser.
-
 Disclaimer
 ----------
@@ -55,7 +52,6 @@ function names changed in future versions. The API is NOT YET STABLE!
 Use it for testing an evaluation, but not in an production environment
 or contact me first, if you intend to do so.
-
 Purpose
 -------
@@ -122,13 +118,11 @@ Python-based parser class representing that grammar. The concrete and
 abstract syntax tree as well as a full and abbreviated log of the
 parsing process will be stored in a sub-directory named "LOG".
-
 Introduction
 ------------
 see [Introduction.md](https://gitlab.lrz.de/badw-it/DHParser/blob/master/Introduction.md)
-
 References
 ----------
@@ -146,7 +140,6 @@ München 2016. Short-URL: [tiny.badw.de/2JVT][Arnold_2016]
 [Arnold_2016]: https://f.hypotheses.org/wp-content/blogs.dir/1856/files/2016/12/EA_Pr%C3%A4sentation_Auszeichnungssprachen.pdf
-
 Brian Ford: Parsing Expression Grammars: A Recognition-Based Syntactic
 Foundation, Cambridge
 Massachusetts, 2004. Short-URL: [http://t1p.de/jihs][Ford_2004]
@@ -155,13 +148,11 @@ Massachusetts, 2004. Short-URL: [http://t1p.de/jihs][Ford_2004]
 [Ford_20XX]: http://bford.info/packrat/
-
 Richard A. Frost, Rahmatullah Hafiz and Paul Callaghan: Parser
 Combinators for Ambiguous Left-Recursive Grammars, in: P. Hudak and
 D.S. Warren (Eds.): PADL 2008, LNCS 4902, pp. 167–181, Springer-Verlag
 Berlin Heidelberg 2008.
-
 Dominikus Herzberg: Objekt-orientierte Parser-Kombinatoren in Python,
 Blog-Post, September, 18th 2008 on denkspuren. gedanken, ideen,
 anregungen und links rund um informatik-themen, short-URL:
@@ -169,7 +160,6 @@ anregungen und links rund um informatik-themen, short-URL:
 [Herzberg_2008a]: http://denkspuren.blogspot.de/2008/09/objekt-orientierte-parser-kombinatoren.html
-
 Dominikus Herzberg: Eine einfache Grammatik für LaTeX, Blog-Post,
 September, 18th 2008 on denkspuren. gedanken, ideen, anregungen und
 links rund um informatik-themen, short-URL:
@@ -177,17 +167,14 @@ links rund um informatik-themen, short-URL:
 [Herzberg_2008b]: http://denkspuren.blogspot.de/2008/09/eine-einfache-grammatik-fr-latex.html
-
 Dominikus Herzberg: Uniform Syntax, Blog-Post, February, 27th 2007 on
 denkspuren. gedanken, ideen, anregungen und links rund um
 informatik-themen, short-URL: [http://t1p.de/s0zk][Herzberg_2007]

 [Herzberg_2007]: http://denkspuren.blogspot.de/2007/02/uniform-syntax.html
-
 [ISO_IEC_14977]: http://www.cl.cam.ac.uk/~mgk25/iso-14977.pdf
-
 John MacFarlane, David Greenspan, Vicent Marti, Neil Williams,
 Benjamin Dumke-von der Ehe, Jeff Atwood: CommonMark. A strongly
 defined, highly compatible specification of
@@ -195,7 +182,6 @@ Markdown, 2017. [commonmark.org][MacFarlane_et_al_2017]
 [MacFarlane_et_al_2017]: http://commonmark.org/
-
 Stefan Müller: DSLs in den digitalen Geisteswissenschaften,
 Präsentation auf dem
 [dhmuc-Workshop: Digitale Editionen und Auszeichnungssprachen](https://dhmuc.hypotheses.org/workshop-digitale-editionen-und-auszeichnungssprachen),
```
dhparser.py

```diff
@@ -36,7 +36,6 @@ EBNF_TEMPLATE = r"""-grammar
 #
 #######################################################################
-@ testing    = True      # testing supresses error messages for unconnected symbols
 @ whitespace = vertical  # implicit whitespace, includes any number of line feeds
 @ literalws  = right     # literals have implicit whitespace on the right hand side
 @ comment    = /#.*/     # comments range from a '#'-character to the end of the line
@@ -49,7 +48,7 @@ EBNF_TEMPLATE = r"""-grammar
 #
 #######################################################################
-document = //~ { WORD } §EOF    # root parser: optional whitespace followed by a sequence of words
+document = //~ { WORD } §EOF    # root parser: a sequence of words preceded by whitespace
                                 # until the end of file
 #######################################################################
@@ -58,24 +57,24 @@ document = //~ { WORD } §EOF    # root parser: optional whitespace followed by
 #
 #######################################################################
-WORD     =  /\w+/~      # a sequence of letters, possibly followed by implicit whitespace
+WORD     =  /\w+/~      # a sequence of letters, optional trailing whitespace
 EOF      =  !/./        # no more characters ahead, end of file reached
 """

 TEST_WORD_TEMPLATE = r'''[match:WORD]
-1 : word
-2 : one_word_with_underscores
+M1: word
+M2: one_word_with_underscores

 [fail:WORD]
-1 : two words
+F1: two words
 '''

 TEST_DOCUMENT_TEMPLATE = r'''[match:document]
-1 : """This is a sequence of words
+M1: """This is a sequence of words
     extending over several lines"""

 [fail:document]
-1 : """This test should fail, because neither
+F1: """This test should fail, because neither
     comma nor full have been defined anywhere."""
 '''
@@ -117,14 +116,16 @@ import DHParser.dsl
 from DHParser import testing
 from DHParser import toolkit

-if not DHParser.dsl.recompile_grammar('{name}.ebnf', force=False):  # recompiles Grammar only if it has changed
+# recompiles Grammar only if it has changed
+if not DHParser.dsl.recompile_grammar('{name}.ebnf', force=False):
    print('\nErrors while recompiling "{name}.ebnf":\n--------------------------------------\n\n')
    with open('{name}_ebnf_ERRORS.txt') as f:
        print(f.read())
    sys.exit(1)

 sys.path.append('./')
-# must be appended after module creation, because otherwise an ImportError is raised under Windows
+# must be appended after module creation, because
+# otherwise an ImportError is raised under Windows
 from {name}Compiler import get_grammar, get_transformer

 with toolkit.logging(True):
@@ -135,7 +136,7 @@ if error_report:
    print(error_report)
    sys.exit(1)
 else:
-    print('\nSUCCESS! All tests passed :-)')
+    print('ready.')
 '''
@@ -152,7 +153,7 @@ def create_project(path: str):
            print('"%s" already exists! Not overwritten.' % name)
    if os.path.exists(path) and not os.path.isdir(path):
-        print('Cannot create new project, because a file named "%s" alread exists!' % path)
+        print('Cannot create new project, because a file named "%s" already exists!' % path)
        sys.exit(1)
    name = os.path.basename(path)
    print('Creating new DHParser-project "%s".' % name)
@@ -172,6 +173,7 @@ def create_project(path: str):
    create_file(name + '.ebnf', '# ' + name + EBNF_TEMPLATE)
    create_file('README.md', README_TEMPLATE.format(name=name))
    create_file('tst_%s_grammar.py' % name, GRAMMAR_TEST_TEMPLATE.format(name=name))
+    os.chmod('tst_%s_grammar.py' % name, 0o755)
    os.chdir(curr_dir)
    print('ready.')
@@ -257,5 +259,6 @@ def main():
    if not cpu_profile(selftest, 1):
        sys.exit(1)
+
 if __name__ == "__main__":
    main()
```
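The template changes above rename the grammar-test keys from bare numbers to `M…`/`F…` prefixes inside `[match:…]`/`[fail:…]` sections. How such a file maps onto a dictionary can be sketched with a simplified reader; DHParser's `testing` module has its own (which also handles triple-quoted multi-line samples — this sketch ignores them):

```python
def read_test_suite(text: str) -> dict:
    """Parse a DHParser-style grammar test file into
    {parser_name: {'match': {...}, 'fail': {...}}} (simplified sketch)."""
    suite, section = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('[') and line.endswith(']'):
            # section header like [match:WORD] or [fail:WORD]
            kind, name = line[1:-1].split(':')
            section = suite.setdefault(name, {}).setdefault(kind, {})
        elif section is not None and ':' in line:
            # test case like "M1: word"
            key, sample = line.split(':', 1)
            section[key.strip()] = sample.strip()
    return suite

suite = read_test_suite('''[match:WORD]
M1: word
M2: one_word_with_underscores

[fail:WORD]
F1: two words
''')
print(suite['WORD']['match']['M1'])  # word
```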
examples/EBNF/EBNFCompiler.py

```diff
@@ -66,8 +66,9 @@ class EBNFGrammar(Grammar):
     factor     = [flowmarker] [retrieveop] symbol !"="  # negative lookahead to be sure it's not a definition
                | [flowmarker] literal
                | [flowmarker] regexp
-               | [flowmarker] group
                | [flowmarker] oneormore
+               | [flowmarker] group
+               | [flowmarker] unordered
                | repetition
                | option
@@ -76,6 +77,7 @@ class EBNFGrammar(Grammar):
     retrieveop = "::" | ":"            # '::' pop, ':' retrieve
     group      = "(" §expression ")"
+    unordered  = "<" §expression ">"   # elements of expression in arbitrary order
     oneormore  = "{" expression "}+"
     repetition = "{" §expression "}"
     option     = "[" §expression "]"
@@ -91,7 +93,7 @@ class EBNFGrammar(Grammar):
     EOF = !/./
     """
     expression = Forward()
-    source_hash__ = "3c472b3a5d1039680c751fd2dd3f3e24"
+    source_hash__ = "084a572ffab147ee44ac8f2268793f63"
     parser_initialization__ = "upon instantiation"
     COMMENT__ = r'#.*(?:\n|$)'
     WHITESPACE__ = r'\s*'
@@ -106,10 +108,11 @@ class EBNFGrammar(Grammar):
     option = Series(Token("["), expression, Token("]"), mandatory=1)
     repetition = Series(Token("{"), expression, Token("}"), mandatory=1)
     oneormore = Series(Token("{"), expression, Token("}+"))
+    unordered = Series(Token("<"), expression, Token(">"), mandatory=1)
     group = Series(Token("("), expression, Token(")"), mandatory=1)
     retrieveop = Alternative(Token("::"), Token(":"))
     flowmarker = Alternative(Token("!"), Token("&"), Token("-!"), Token("-&"))
-    factor = Alternative(Series(Option(flowmarker), Option(retrieveop), symbol, NegativeLookahead(Token("="))), Series(Option(flowmarker), literal), Series(Option(flowmarker), regexp), Series(Option(flowmarker), group), Series(Option(flowmarker), oneormore), repetition, option)
+    factor = Alternative(Series(Option(flowmarker), Option(retrieveop), symbol, NegativeLookahead(Token("="))), Series(Option(flowmarker), literal), Series(Option(flowmarker), regexp), Series(Option(flowmarker), oneormore), Series(Option(flowmarker), group), Series(Option(flowmarker), unordered), repetition, option)
     term = OneOrMore(Series(Option(Token("§")), factor))
     expression.set(Series(term, ZeroOrMore(Series(Token("|"), term))))
     directive = Series(Token("@"), symbol, Token("="), Alternative(regexp, literal, list_), mandatory=1)
```
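This commit builds `group`, `repetition`, and the new `unordered` parser with `Series(..., mandatory=n)`, which mirrors the `§` marker in the EBNF. The semantics — parsers before the mandatory index may fail softly (the alternative backtracks), failure from that index on raises an error — can be sketched with a toy model; this is not DHParser's API, just an illustration of the idea:

```python
def series(parsers, mandatory):
    """Match parsers in sequence; before index `mandatory` a mismatch is a
    soft failure, from `mandatory` on it is a syntax error."""
    def parse(text):
        consumed, results = text, []
        for i, p in enumerate(parsers):
            node, consumed = p(consumed)
            if node is None:
                if i < mandatory:
                    return None, text   # soft failure: caller may backtrack
                raise SyntaxError("expected parser %d near %r" % (i, consumed[:10]))
            results.append(node)
        return results, consumed
    return parse

def token(t):
    """Match a literal string, returning (match, rest) or (None, text)."""
    def parse(text):
        return (t, text[len(t):]) if text.startswith(t) else (None, text)
    return parse

# like group = Series(Token("("), expression, Token(")"), mandatory=1)
group = series([token("("), token("x"), token(")")], mandatory=1)
print(group("(x)"))   # (['(', 'x', ')'], '')
print(group("[x]"))   # (None, '[x]')  -- '(' missing: soft failure
```

Once the opening `"("` has matched, a missing `"x"` or `")"` raises a `SyntaxError` instead of silently backtracking, which is exactly what `§` buys in error reporting.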
examples/EBNF/EBNF_variant.ebnf — deleted (100644 → 0)

```
# EBNF-Grammar in EBNF

@ comment    =  /#.*(?:\n|$)/    # comments start with '#' and eat all chars up to and including '\n'
@ whitespace =  /\s*/            # whitespace includes linefeed
@ literalws  =  right            # trailing whitespace of literals will be ignored tacitly

syntax     =  [~//] { definition | directive } §EOF
definition =  symbol §"=" §expression
directive  =  "@" §symbol §"=" §( regexp | literal | list_ )

expression =  term { "|" term }
term       =  { factor }+
factor     =  [flowmarker] [retrieveop] symbol !"="   # negative lookahead to be sure it's not a definition
           |  [flowmarker] literal
           |  [flowmarker] regexp
           |  [flowmarker] group
           |  [flowmarker] oneormore
           |  repetition
           |  option

flowmarker =  "!" | "&" | "§"    # '!' negative lookahead, '&' positive lookahead, '§' required
           |  "-!" | "-&"        # '-' negative lookbehind, '-&' positive lookbehind
retrieveop =  "::" | ":"         # '::' pop, ':' retrieve

group      =  "(" expression §")"
oneormore  =  "{" expression "}+"
repetition =  "{" expression §"}"
option     =  "[" expression §"]"

symbol     =  /(?!\d)\w+/~                  # e.g. expression, factor, parameter_list
literal    =  /"(?:[^"]|\\")*?"/~           # e.g. "(", '+', 'while'
           |  /'(?:[^']|\\')*?'/~           # whitespace following literals will be ignored tacitly.
regexp     =  /~?\/(?:\\\/|[^\/])*?\/~?/~   # e.g. /\w+/, ~/#.*(?:\n|$)/~
                                            # '~' is a whitespace-marker, if present leading or trailing
                                            # whitespace of a regular expression will be ignored tacitly.
list_      =  /\w+/~ { "," /\w+/~ }         # comma separated list of symbols, e.g. BEGIN_LIST, END_LIST,
                                            # BEGIN_QUOTE, END_QUOTE ; see CommonMark/markdown.py for an exmaple
EOF        =  !/./
```
examples/EBNF/EBNF_variantCompiler.py — deleted (100644 → 0)

```python
#!/usr/bin/python

#######################################################################
#
# SYMBOLS SECTION - Can be edited. Changes will be preserved.
#
#######################################################################

from functools import partial
import os
import sys
try:
    import regex as re
except ImportError:
    import re
from DHParser import logging, is_filename, load_if_file, \
    Grammar, Compiler, nil_preprocessor, \
    Lookbehind, Lookahead, Alternative, Pop, Required, Token, Synonym, \
    Option, NegativeLookbehind, OneOrMore, RegExp, Retrieve, Series, RE, Capture, \
    ZeroOrMore, Forward, NegativeLookahead, mixin_comment, compile_source, \
    last_value, counterpart, accumulate, PreprocessorFunc, \
    Node, TransformationFunc, TransformationDict, TRUE_CONDITION, \
    traverse, remove_children_if, merge_children, is_anonymous, \
    reduce_single_child, replace_by_single_child, replace_or_reduce, remove_whitespace, \
    remove_expendables, remove_empty, remove_tokens, flatten, is_whitespace, \
    is_empty, is_expendable, collapse, replace_content, WHITESPACE_PTYPE, TOKEN_PTYPE, \
    remove_parser, remove_content, remove_brackets, replace_parser, \
    keep_children, is_one_of, has_content, apply_if, remove_first, remove_last


#######################################################################
#
# PREPROCESSOR SECTION - Can be edited. Changes will be preserved.
#
#######################################################################

def EBNF_variantPreprocessor(text):
    return text

def get_preprocessor() -> PreprocessorFunc:
    return EBNF_variantPreprocessor


#######################################################################
#
# PARSER SECTION - Don't edit! CHANGES WILL BE OVERWRITTEN!
#
#######################################################################

class EBNF_variantGrammar(Grammar):
    r"""Parser for an EBNF_variant source file, with this grammar:

    # EBNF-Grammar in EBNF
    @ comment    = /#.*(?:\n|$)/   # comments start with '#' and eat all chars up to and including '\n'
    @ whitespace = /\s*/           # whitespace includes linefeed
    @ literalws  = right           # trailing whitespace of literals will be ignored tacitly
    syntax     = [~//] { definition | directive } §EOF
    definition = symbol §"=" §expression
    directive  = "@" §symbol §"=" §( regexp | literal | list_ )
    expression = term { "|" term }
    term       = { factor }+
    factor     = [flowmarker] [retrieveop] symbol !"="  # negative lookahead to be sure it's not a definition
               | [flowmarker] literal
               | [flowmarker] regexp
               | [flowmarker] group
               | [flowmarker] oneormore
               | repetition
               | option
    flowmarker = "!" | "&" | "§"   # '!' negative lookahead, '&' positive lookahead, '§' required
               | "-!" | "-&"       # '-' negative lookbehind, '-&' positive lookbehind
    retrieveop = "::" | ":"        # '::' pop, ':' retrieve
    group      = "(" expression §")"
    oneormore  = "{" expression "}+"
    repetition = "{" expression §"}"
    option     = "[" expression §"]"
    symbol     = /(?!\d)\w+/~                  # e.g. expression, factor, parameter_list
    literal    = /"(?:[^"]|\\")*?"/~           # e.g. "(", '+', 'while'
               | /'(?:[^']|\\')*?'/~           # whitespace following literals will be ignored tacitly.
    regexp     = /~?\/(?:\\\/|[^\/])*?\/~?/~   # e.g. /\w+/, ~/#.*(?:\n|$)/~
                                   # '~' is a whitespace-marker, if present leading or trailing
                                   # whitespace of a regular expression will be ignored tacitly.
    list_      = /\w+/~ { "," /\w+/~ }  # comma separated list of symbols, e.g. BEGIN_LIST, END_LIST,
                                   # BEGIN_QUOTE, END_QUOTE ; see CommonMark/markdown.py for an exmaple
    EOF        = !/./
    """
    expression = Forward()
    source_hash__ = "4735db10f0b79d44209d1de0184b2ca0"
    parser_initialization__ = "upon instantiation"
    COMMENT__ = r'#.*(?:\n|$)'
    WHITESPACE__ = r'\s*'
    WSP__ = mixin_comment(whitespace=WHITESPACE__, comment=COMMENT__)
    wspL__ = ''
    wspR__ = WSP__
    EOF = NegativeLookahead(RegExp('.'))
    list_ = Series(RE('\\w+'), ZeroOrMore(Series(Token(","), RE('\\w+'))))
    regexp = RE('~?/(?:\\\\/|[^/])*?/~?')
    literal = Alternative(RE('"(?:[^"]|\\\\")*?"'), RE("'(?:[^']|\\\\')*?'"))
    symbol = RE('(?!\\d)\\w+')
    option = Series(Token("["), expression, Token("]"), mandatory=2)
    repetition = Series(Token("{"), expression, Token("}"), mandatory=2)
    oneormore = Series(Token("{"), expression, Token("}+"))
    group = Series(Token("("), expression, Token(")"), mandatory=2)
    retrieveop = Alternative(Token("::"), Token(":"))
    flowmarker = Alternative(Token("!"), Token("&"), Token("§"), Token("-!"), Token("-&"))
    factor = Alternative(Series(Option(flowmarker), Option(retrieveop), symbol, NegativeLookahead(Token("="))), Series(Option(flowmarker), literal), Series(Option(flowmarker), regexp), Series(Option(flowmarker), group), Series(Option(flowmarker), oneormore), repetition, option)
    term = OneOrMore(factor)
    expression.set(Series(term, ZeroOrMore(Series(Token("|"), term))))
    directive = Series(Token("@"), symbol, Token("="), Alternative(regexp, literal, list_), mandatory=1)
    definition = Series(symbol, Token("="), expression, mandatory=1)
    syntax = Series(Option(RE('', wR='', wL=WSP__)), ZeroOrMore(Alternative(definition, directive)), EOF, mandatory=2)
    root__ = syntax

def get_grammar() -> EBNF_variantGrammar:
    global thread_local_EBNF_variant_grammar_singleton
    try:
        grammar = thread_local_EBNF_variant_grammar_singleton
    except NameError:
        thread_local_EBNF_variant_grammar_singleton = EBNF_variantGrammar()
        grammar = thread_local_EBNF_variant_grammar_singleton
    return grammar


#######################################################################
#
# AST SECTION - Can be edited. Changes will be preserved.
#
#######################################################################

EBNF_variant_AST_transformation_table = {
    # AST Transformations for the EBNF_variant-grammar
    "+": remove_empty,
    "syntax": [],
    "definition": [],
    "directive": [],
    "expression": [],
    "term": [],
    "factor": [replace_or_reduce],
    "flowmarker": [replace_or_reduce],
    "retrieveop": [replace_or_reduce],
    "group": [],
    "oneormore": [],
    "repetition": [],
    "option": [],
    "symbol": [],
    "literal": [replace_or_reduce],
    "regexp": [],
    "list_": [],
    "EOF": [],
    ":Token, :RE": reduce_single_child,
    "*": replace_by_single_child
}

def EBNF_variantTransform() -> TransformationDict:
    return partial(traverse, processing_table=EBNF_variant_AST_transformation_table.copy())

def get_transformer() -> TransformationFunc:
    global thread_local_EBNF_variant_transformer_singleton
    try:
        transformer = thread_local_EBNF_variant_transformer_singleton
    except NameError:
        thread_local_EBNF_variant_transformer_singleton = EBNF_variantTransform()
        transformer = thread_local_EBNF_variant_transformer_singleton
    return transformer


#######################################################################
#
# COMPILER SECTION - Can be edited. Changes will be preserved.
#
#######################################################################

class EBNF_variantCompiler(Compiler):
    """Compiler for the abstract-syntax-tree of a EBNF_variant source file.
    """

    def __init__(self, grammar_name="EBNF_variant", grammar_source=""):
        super(EBNF_variantCompiler, self).__init__(grammar_name, grammar_source)
        assert re.match('\w+\Z', grammar_name)

    def on_syntax(self, node):
        return node

    def on_definition(self, node):
        pass

    def on_directive(self, node):
        pass

    def on_expression(self, node):
        pass

    def on_term(self, node):
        pass

    def on_factor(self, node):
        pass

    def on_flowmarker(self, node):
        pass

    def on_retrieveop(self, node):
        pass

    def on_group(self, node):
        pass

    def on_oneormore(self, node):
        pass

    def on_repetition(self, node):
        pass

    def on_option(self, node):
        pass

    def on_symbol(self, node):
        pass

    def on_literal(self, node):
        pass

    def on_regexp(self, node):
        pass

    def on_list_(self, node):
        pass

    def on_EOF(self, node):
        pass

def get_compiler(grammar_name="EBNF_variant", grammar_source="") -> EBNF_variantCompiler:
    global thread_local_EBNF_variant_compiler_singleton
    try:
        compiler = thread_local_EBNF_variant_compiler_singleton
        compiler.set_grammar_name(grammar_name, grammar_source)
    except NameError:
        thread_local_EBNF_variant_compiler_singleton = \
            EBNF_variantCompiler(grammar_name, grammar_source)
        compiler = thread_local_EBNF_variant_compiler_singleton
    return compiler


#######################################################################
#
# END OF DHPARSER-SECTIONS
#
#######################################################################

def compile_src(source):
    """Compiles ``source`` and returns (result, errors, ast).
    """
    with logging("LOGS"):
        compiler = get_compiler()
        cname = compiler.__class__.__name__
        log_file_name = os.path.basename(os.path.splitext(source)[0]) \
            if is_filename(source) < 0 else cname[:cname.find('.')] + '_out'
        result = compile_source(source, get_preprocessor(), get_grammar(),
                                get_transformer(), compiler)
    return result

if __name__ == "__main__":
    if len(sys.argv) > 1:
        result, errors, ast = compile_src(sys.argv[1])
        if errors:
            for error in errors:
                print(error)
            sys.exit(1)
        else:
            print(result.as_x
```