badw-it / DHParser
Commit e5258e45, authored Oct 09, 2018 by Eckhart Arnold
- cleanup experimental folder
Parent: d2dd2613
Changes: 23
documentation/StepByStepGuide.rst
@@ -27,7 +27,7 @@ Setting up a new DHParser project
=================================
Since DHParser, while quite mature in terms of implemented features, is still
-in a pre-first-release state, it is for the time being more recommendable to
+in a pre-first-release state, it is, for the time being, more recommendable to
clone the most current version of DHParser from the git-repository rather than
installing the packages from the Python Package Index (PyPI).
examples/XML/XML.ebnf
@@ -6,7 +6,7 @@
#
#######################################################################
-@ whitespace = /\s*/ # implicit whitespace, signified by ~
+@ whitespace = /\s*/ # insignificant whitespace, signified by ~
@ literalws = none # literals have no implicit whitespace
@ comment = // # no implicit comments
@ ignorecase = False # literals and regular expressions are case-sensitive
examples/XMLSnippet/XMLSnippet.ebnf
# XMLSnippet-grammar
#######################################################################
#
# EBNF-Directives
#
#######################################################################
-@ whitespace = vertical # implicit whitespace, includes any number of line feeds
-@ literalws = right # literals have implicit whitespace on the right hand side
-@ comment = /#.*/ # comments range from a '#'-character to the end of the line
+@ whitespace = /\s*/ # insignificant whitespace, signified by ~
+@ literalws = none # literals have no implicit whitespace
+@ comment = // # no implicit comments
@ ignorecase = False # literals and regular expressions are case-sensitive
#######################################################################
#
-# Structure and Components
+# Document Frame and Prolog
#
#######################################################################
+document = prolog element [Misc] EOF
+prolog = [ ~ XMLDecl ] [Misc] [doctypedecl [Misc]]
+XMLDecl = '<?xml' VersionInfo [EncodingDecl] [SDDecl] ~ '?>'
+VersionInfo = ~ 'version' ~ '=' ~ ("'" VersionNum "'" | '"' VersionNum '"')
+VersionNum = /[0-9]+\.[0-9]+/
+EncodingDecl = ~ 'encoding' ~ '=' ~ ("'" EncName "'" | '"' EncName '"')
+EncName = /[A-Za-z][A-Za-z0-9._\-]*/
+SDDecl = ~ 'standalone' ~ '=' ~ (("'" Yes | No "'") | ('"' Yes | No '"'))
+Yes = 'yes'
+No = 'no'
+#######################################################################
+#
+# Logical Structures
+#
+#######################################################################
+element = emptyElement | STag §content ETag
+STag = '<' TagName { ~ Attribute } ~ '>'
+ETag = '</' §::TagName ~ '>'
-document = prolog element EOF
-prolog = ""
-xml = { element | text | comment }
-element = single_tag | tag_pair
-single_tag = "<" name attributes "/>"
-tag_pair = opening_tag xml closing_tag
-opening_tag = "<" tag_name attributes ">"
-closing_tag = "</" ::tag_name ">"
-attributes = { attribute }
-attribute = name "=" '"' content '"'
+emptyElement = '<' Name { ~ Attribute } ~ '/>'
+TagName = Name
+Attribute = Name ~ §'=' ~ AttValue
+content = [ CharData ]
+          { (element | Reference | CDSect | PI | Comment)
+            [CharData] }
-name = IDENTIFIER
-tag_name = IDENTIFIER
#######################################################################
#
-# Regular Expressions
+# Literals
#
#######################################################################
-WORD = /\w+/~ # a sequence of letters, optional trailing whitespace
-EOF = !/./ # no more characters ahead, end of file reached
+EntityValue = '"' { /[^%&"]+/ | PEReference | Reference } '"'
+            | "'" { /[^%&']+/ | PEReference | Reference } "'"
+AttValue = '"' { /[^<&"]+/ | Reference } '"'
+         | "'" { /[^<&']+/ | Reference } "'"
+SystemLiteral = '"' /[^"]*/ '"' | "'" /[^']*/ "'"
+PubidLiteral = '"' [PubidChars] '"'
+             | "'" [PubidCharsSingleQuoted] "'"
+#######################################################################
+#
+# References
+#
+#######################################################################
+Reference = EntityRef | CharRef
+EntityRef = '&' Name ';'
+PEReference = '%' Name ';'
+#######################################################################
+#
+# Names and Tokens
+#
+#######################################################################
+Nmtokens = Nmtoken { / / Nmtoken }
+Nmtoken = NameChars
+Names = Name { / / Name }
+Name = NameStartChar [NameChars]
+NameStartChar = /_|:|[A-Z]|[a-z]
+                |[\u00C0-\u00D6]|[\u00D8-\u00F6]|[\u00F8-\u02FF]
+                |[\u0370-\u037D]|[\u037F-\u1FFF]|[\u200C-\u200D]
+                |[\u2070-\u218F]|[\u2C00-\u2FEF]|[\u3001-\uD7FF]
+                |[\uF900-\uFDCF]|[\uFDF0-\uFFFD]
+                |[\U00010000-\U000EFFFF]/
+NameChars = /(?:_|:|-|\.|[A-Z]|[a-z]|[0-9]
+            |\u00B7|[\u0300-\u036F]|[\u203F-\u2040]
+            |[\u00C0-\u00D6]|[\u00D8-\u00F6]|[\u00F8-\u02FF]
+            |[\u0370-\u037D]|[\u037F-\u1FFF]|[\u200C-\u200D]
+            |[\u2070-\u218F]|[\u2C00-\u2FEF]|[\u3001-\uD7FF]
+            |[\uF900-\uFDCF]|[\uFDF0-\uFFFD]
+            |[\U00010000-\U000EFFFF])+/
+#######################################################################
+#
+# Comments, Processing Instructions and CDATA sections
+#
+#######################################################################
+Misc = { Comment | PI | S }+
+Comment = '<!--' { CommentChars | /-(?!-)/ } '-->'
+PI = '<?' PITarget [~ PIChars] '?>'
+PITarget = !/X|xM|mL|l/ Name
+CDSect = '<![CDATA[' CData ']]>'
+#######################################################################
+#
+# Characters, Explicit Whitespace and End of File
+#
+#######################################################################
+PubidCharsSingleQuoted = /(?:\x20|\x0D|\x0A|[a-zA-Z0-9]|[-()+,.\/:=?;!*#@$_%])+/
+PubidChars = /(?:\x20|\x0D|\x0A|[a-zA-Z0-9]|[-'()+,.\/:=?;!*#@$_%])+/
+CharData = /(?:(?!\]\]>)[^<&])+/
+CData = /(?:(?!\]\]>)(?:\x09|\x0A|\x0D|[\u0020-\uD7FF]|[\uE000-\uFFFD]|[\U00010000-\U0010FFFF]))+/
+IgnoreChars = /(?:(?!(?:<!\[)|(?:\]\]>))(?:\x09|\x0A|\x0D|[\u0020-\uD7FF]|[\uE000-\uFFFD]|[\U00010000-\U0010FFFF]))+/
+PIChars = /(?:(?!\?>)(?:\x09|\x0A|\x0D|[\u0020-\uD7FF]|[\uE000-\uFFFD]|[\U00010000-\U0010FFFF]))+/
+CommentChars = /(?:(?!-)(?:\x09|\x0A|\x0D|[\u0020-\uD7FF]|[\uE000-\uFFFD]|[\U00010000-\U0010FFFF]))+/
+CharRef = ('&#' /[0-9]+/ ';') | ('&#x' /[0-9a-fA-F]+/ ';')
+Chars = /(?:\x09|\x0A|\x0D|[\u0020-\uD7FF]|[\uE000-\uFFFD]|[\U00010000-\U0010FFFF])+/
+Char = /\x09|\x0A|\x0D|[\u0020-\uD7FF]|[\uE000-\uFFFD]|[\U00010000-\U0010FFFF]/
+S = /\s+/ # whitespace
+EOF = !/./ # no more characters ahead, end of file reached
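The `EOF = !/./` rule that closes the grammar is a negative lookahead over the regular expression `.`. The same check can be sketched with Python's standard `re` module. The helper name `at_eof` is hypothetical, and this diff does not show whether DHParser compiles its regular expressions with additional flags, so the snippet illustrates plain `re` behaviour only.

```python
import re

# Negative lookahead: succeeds only where `.` cannot match; consumes nothing.
NOT_ANY_CHAR = re.compile(r"(?!.)")


def at_eof(text: str, pos: int) -> bool:
    """Return True if no character matched by `.` follows at `pos`."""
    return NOT_ANY_CHAR.match(text, pos) is not None


print(at_eof("<a/>", 4))    # True: end of input
print(at_eof("<a/>", 0))    # False: '<' is still ahead
print(at_eof("<a/>\n", 4))  # True: without re.DOTALL, `.` does not match '\n'
```

The last case is worth keeping in mind: in plain `re`, a lookahead on `.` also succeeds directly before a line feed.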
examples/XMLSnippet/example.dsl
deleted
100644 → 0
Life is but a walking shadow
examples/XMLSnippet/example.xml
0 → 100644
<?xml version="1.0" encoding="UTF-8"?>
<note date="2018-06-14">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
  <priority level="high" />
  <remark></remark>
</note>
\ No newline at end of file
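As a cross-check that is independent of DHParser, the example document added here can be loaded with Python's standard `xml.etree.ElementTree`. This is only a sketch: it embeds a copy of the file's content as a bytes literal (the indentation is an assumption) instead of reading `examples/XMLSnippet/example.xml` from disk.

```python
import xml.etree.ElementTree as ET

# Copy of the example document; whitespace between elements is assumed.
EXAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<note date="2018-06-14">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
  <priority level="high" />
  <remark></remark>
</note>"""

root = ET.fromstring(EXAMPLE_XML)
child_tags = [child.tag for child in root]

print(root.tag, root.attrib)               # note {'date': '2018-06-14'}
print(child_tags)
print(root.find("priority").get("level"))  # high
```

The document exercises the three element shapes the new grammar distinguishes: a tag pair with content, an empty-element tag (`<priority ... />`), and a tag pair without content (`<remark></remark>`).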
experimental/.gitkeep
deleted
100644 → 0
experimental/fascitergula_alternative.xml
deleted
100644 → 0
This diff is collapsed.
experimental/new2/README.md
deleted
100644 → 0
# new2
PLACE A SHORT DESCRIPTION HERE
Author: AUTHOR'S NAME <EMAIL>, AFFILIATION

## License

new2 is open source software under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)

Copyright YEAR AUTHOR'S NAME <EMAIL>, AFFILIATION

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
experimental/new2/example.dsl
deleted
100644 → 0
Life is but a walking shadow.
experimental/new2/grammar_tests/01_test_word.ini
deleted
100644 → 0
[match:WORD]
M1: word
M2: one_word_with_underscores
[match:WORD]
M3: Life’s
[fail:WORD]
F1: two words
F2: ""
experimental/new2/grammar_tests/02_test_document.ini
deleted
100644 → 0
[match:document]
M1: """This is a sequence of words extending over several lines."""
M2: """ This sequence contains leading whitespace."""
[fail:document]
F1: """This test should fail, because neither comma nor full have been defined anywhere"""
experimental/new2/grammar_tests/03_test_sentence.ini
deleted
100644 → 0
[match:part]
M1: """a poor player that struts and frets his hour upon the stage"""
[fail:part]
F1: """It is a tale told by an idiot,"""
[match:sentence]
M1: """It is a tale told by an idiot, full of sound and fury, signifying nothing."""
M2: """Plain old sentence."""
[fail:sentence]
F1: """Ups, a full stop is missing"""
F2: """No commas at the end,."""
experimental/new2/macbeth.dsl
deleted
100644 → 0
Life’s but a walking shadow, a poor player that struts and frets his hour
upon the stage and then is heard no more. It is a tale told by an idiot,
full of sound and fury, signifying nothing.
experimental/new2/new2.ebnf
deleted
100644 → 0
document = ~ { sentence } §EOF
sentence = part {"," part } "."
part = { WORD }+
WORD = /[\w’]+/~
EOF = !/./
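To make the five rules of this small grammar concrete, here is a minimal recognizer sketch in plain Python. It is not DHParser code. It assumes that the literals `","` and `"."` tolerate surrounding whitespace, which matches the way the generated parser in this commit sets `wspR__`, and the names `TOKEN` and `is_document` are hypothetical. It is checked against the inputs from `03_test_sentence.ini`.

```python
import re

# One token: optional leading whitespace, then a WORD or one of "," / ".".
TOKEN = re.compile(r"\s*(?:(?P<word>[\w’]+)|(?P<punct>[,.]))")


def is_document(text: str) -> bool:
    """Recognizer sketch for: document = ~ { sentence } EOF."""
    # States: "start" (between sentences), "part" (inside a part),
    # "comma" (just after a comma, so a WORD must follow).
    state = "start"
    pos = 0
    while True:
        m = TOKEN.match(text, pos)
        if m is None:
            break
        pos = m.end()
        if m.group("word"):
            state = "part"
        elif state != "part":
            return False  # "," or "." without a preceding WORD
        else:
            state = "comma" if m.group("punct") == "," else "start"
    # Only whitespace may remain, and the last sentence must end with ".".
    return text[pos:].strip() == "" and state == "start"


print(is_document("Plain old sentence."))          # True
print(is_document("Ups, a full stop is missing"))  # False
print(is_document("No commas at the end,."))       # False
```

A tokenizer plus a three-state machine is enough here because the grammar has no nesting. It also avoids the backtracking that a single nested regular expression such as `(?:[\w’]+\s*)+` can incur on non-matching input.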
experimental/new2/new2Compiler.py
deleted
100755 → 0
#!/usr/bin/python

#######################################################################
#
# SYMBOLS SECTION - Can be edited. Changes will be preserved.
#
#######################################################################


from functools import partial
import os
import sys

sys.path.append(r'/home/eckhart/Entwicklung/DHParser')

try:
    import regex as re
except ImportError:
    import re
from DHParser import logging, is_filename, load_if_file, \
    Grammar, Compiler, nil_preprocessor, PreprocessorToken, Whitespace, \
    Lookbehind, Lookahead, Alternative, Pop, _Token, Synonym, AllOf, SomeOf, Unordered, \
    Option, NegativeLookbehind, OneOrMore, RegExp, Retrieve, Series, _RE, Capture, \
    ZeroOrMore, Forward, NegativeLookahead, mixin_comment, compile_source, \
    last_value, counterpart, accumulate, PreprocessorFunc, \
    Node, TransformationFunc, TransformationDict, \
    traverse, remove_children_if, merge_children, is_anonymous, \
    reduce_single_child, replace_by_single_child, replace_or_reduce, remove_whitespace, \
    remove_expendables, remove_empty, remove_tokens, flatten, is_whitespace, \
    is_empty, is_expendable, collapse, replace_content, WHITESPACE_PTYPE, TOKEN_PTYPE, \
    remove_nodes, remove_content, remove_brackets, replace_parser, \
    keep_children, is_one_of, has_content, apply_if, remove_first, remove_last, \
    remove_anonymous_empty, keep_nodes, traverse_locally, strip, lstrip, rstrip, \
    grammar_changed


#######################################################################
#
# PREPROCESSOR SECTION - Can be edited. Changes will be preserved.
#
#######################################################################

def new2Preprocessor(text):
    return text, lambda i: i

def get_preprocessor() -> PreprocessorFunc:
    return new2Preprocessor


#######################################################################
#
# PARSER SECTION - Don't edit! CHANGES WILL BE OVERWRITTEN!
#
#######################################################################

class new2Grammar(Grammar):
    r"""Parser for a new2 source file, with this grammar:

    document = ~ { sentence } §EOF
    sentence = part {"," part } "."
    part = { WORD }+
    WORD = /[\w’]+/~
    EOF = !/./
    """
    source_hash__ = "7a9984368b1c959222099d389d18c54f"
    parser_initialization__ = "upon instantiation"
    COMMENT__ = r''
    WHITESPACE__ = r'\s*'
    WSP_RE__ = mixin_comment(whitespace=WHITESPACE__, comment=COMMENT__)
    wspL__ = ''
    wspR__ = WSP__
    whitespace__ = Whitespace(WSP__)
    EOF = NegativeLookahead(RegExp('.'))
    WORD = _RE('[\\w’]+')
    part = OneOrMore(WORD)
    sentence = Series(part, ZeroOrMore(Series(_Token(","), part)), _Token("."))
    document = Series(whitespace__, ZeroOrMore(sentence), EOF, mandatory=2)
    root__ = document

def get_grammar() -> new2Grammar:
    global thread_local_new2_grammar_singleton
    try:
        grammar = thread_local_new2_grammar_singleton
    except NameError:
        thread_local_new2_grammar_singleton = new2Grammar()
        grammar = thread_local_new2_grammar_singleton
    return grammar


#######################################################################
#
# AST SECTION - Can be edited. Changes will be preserved.
#
#######################################################################

new2_AST_transformation_table = {
    # AST Transformations for the new2-grammar
    "+": remove_empty,
    "document": [remove_whitespace, reduce_single_child],
    "sentence": [flatten],
    "part": [],
    "WORD": [remove_whitespace, reduce_single_child],
    "EOF": [],
    ":_Token": [remove_whitespace, reduce_single_child],
    ":_RE": reduce_single_child,
    "*": replace_by_single_child
}


def new2Transform() -> TransformationDict:
    return partial(traverse, processing_table=new2_AST_transformation_table.copy())

def get_transformer() -> TransformationFunc:
    global thread_local_new2_transformer_singleton
    try:
        transformer = thread_local_new2_transformer_singleton
    except NameError:
        thread_local_new2_transformer_singleton = new2Transform()
        transformer = thread_local_new2_transformer_singleton
    return transformer


#######################################################################
#
# COMPILER SECTION - Can be edited. Changes will be preserved.
#
#######################################################################

class new2Compiler(Compiler):
    """Compiler for the abstract-syntax-tree of a new2 source file.
    """

    def __init__(self, grammar_name="new2", grammar_source=""):
        super(new2Compiler, self).__init__(grammar_name, grammar_source)
        assert re.match('\w+\Z', grammar_name)

    def on_document(self, node):
        return self.fallback_compiler(node)

    # def on_WORD(self, node):
    #     return node

    # def on_EOF(self, node):
    #     return node


def get_compiler(grammar_name="new2", grammar_source="") -> new2Compiler:
    global thread_local_new2_compiler_singleton
    try:
        compiler = thread_local_new2_compiler_singleton
        compiler.set_grammar_name(grammar_name, grammar_source)
    except NameError:
        thread_local_new2_compiler_singleton = \
            new2Compiler(grammar_name, grammar_source)
        compiler = thread_local_new2_compiler_singleton
    return compiler


#######################################################################
#
# END OF DHPARSER-SECTIONS
#
#######################################################################

def compile_src(source, log_dir=''):
    """Compiles ``source`` and returns (result, errors, ast).
    """
    with logging(log_dir):
        compiler = get_compiler()
        cname = compiler.__class__.__name__
        log_file_name = os.path.basename(os.path.splitext(source)[0]) \
            if is_filename(source) < 0 else cname[:cname.find('.')] + '_out'
        result = compile_source(source, get_preprocessor(),
                                get_grammar(),
                                get_transformer(), compiler)
    return result


if __name__ == "__main__":
    if len(sys.argv) > 1:
        try:
            grammar_file_name = os.path.basename(__file__).replace('Compiler.py', '.ebnf')
            if grammar_changed(new2Grammar, grammar_file_name):
                print("Grammar has changed. Please recompile Grammar first.")
                sys.exit(1)
        except FileNotFoundError:
            print('Could not check for changed grammar, because grammar file "%s" was not found!'
                  % grammar_file_name)
        file_name, log_dir = sys.argv[1], ''
        if file_name in ['-d', '--debug'] and len(sys.argv) > 2:
            file_name, log_dir = sys.argv[2], 'LOGS'
        result, errors, ast = compile_src(file_name, log_dir)
        if errors:
            cwd = os.getcwd()
            rel_path = file_name[len(cwd):] if file_name.startswith(cwd) else file_name
            for error in errors:
                print(rel_path + ':' + str(error))
            sys.exit(1)
        else:
            print(result.as_xml() if isinstance(result, Node) else result)
    else:
        print("Usage: new2Compiler.py [FILENAME]")
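The `get_grammar`, `get_transformer` and `get_compiler` factories in this deleted file cache their object in a module-level `global` and detect the first call through `NameError`. Assuming the `thread_local_` prefix in the variable names signals that one instance per thread is intended, the sketch below shows how such a cache could be kept with the standard library's `threading.local`. `DummyGrammar` and `_per_thread` are hypothetical names, and this is not DHParser code.

```python
import threading


class DummyGrammar:
    """Hypothetical stand-in for a generated class such as new2Grammar."""


# Attributes set on this object are visible only to the thread that set them.
_per_thread = threading.local()


def get_grammar() -> DummyGrammar:
    """Return one DummyGrammar instance per thread, created on first use."""
    try:
        return _per_thread.grammar
    except AttributeError:
        _per_thread.grammar = DummyGrammar()
        return _per_thread.grammar


main_instance = get_grammar()
same_thread_again = get_grammar()

seen_in_worker = []
worker = threading.Thread(target=lambda: seen_in_worker.append(get_grammar()))
worker.start()
worker.join()

print(same_thread_again is main_instance)  # True: cached within one thread
print(seen_in_worker[0] is main_instance)  # False: the worker got its own
```

The shape mirrors the try/except idiom of the generated code. The only difference is that the first access in a thread raises `AttributeError` on the `threading.local` object rather than `NameError` on a global.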
experimental/new2/tst_new2_grammar.py
deleted
100755 → 0
#!/usr/bin/python3

"""tst_new2_grammar.py - runs the unit tests for the new2-grammar
"""

import os
import sys

sys.path.append(r'/home/eckhart/Entwicklung/DHParser')

scriptpath = os.path.dirname(__file__)

try:
    from DHParser import dsl
    import DHParser.log
    from DHParser import testing
except ModuleNotFoundError:
    print('Could not import DHParser. Please adjust sys.path in file '
          '"%s" manually' % __file__)
    sys.exit(1)


def recompile_grammar(grammar_src, force):
    with DHParser.log.logging(False):
        # recompiles Grammar only if it has changed
        if not dsl.recompile_grammar(grammar_src, force=force):
            print('\nErrors while recompiling "%s":' % grammar_src +
                  '\n--------------------------------------\n\n')
            with open('new2_ebnf_ERRORS.txt') as f:
                print(f.read())
            sys.exit(1)


def run_grammar_tests(glob_pattern):
    with DHParser.log.logging(False):
        error_report = testing.grammar_suite(
            os.path.join(scriptpath, 'grammar_tests'),
            get_grammar, get_transformer,
            fn_patterns=[glob_pattern], report=True, verbose=True)
    return error_report


if __name__ == '__main__':
    arg = sys.argv[1] if len(sys.argv) > 1 else '*_test_*.ini'
    if arg.endswith('.ebnf'):
        recompile_grammar(arg, force=True)
    else:
        recompile_grammar(os.path.join(scriptpath, 'new2.ebnf'),
                          force=False)
        sys.path.append('.')
        from new2Compiler import get_grammar, get_transformer
        error_report = run_grammar_tests(glob_pattern=arg)
        if error_report:
            print('\n')
            print(error_report)
            sys.exit(1)
        print('ready.\n')
experimental/ws/README.md
deleted
100644 → 0
# ws
PLACE A SHORT DESCRIPTION HERE
Author: AUTHOR'S NAME <EMAIL>, AFFILIATION

## License

ws is open source software under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)

Copyright YEAR AUTHOR'S NAME <EMAIL>, AFFILIATION

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
experimental/ws/example.dsl
deleted
100644 → 0
Life is but a walking shadow
experimental/ws/grammar_tests/01_test_word.ini
deleted
100644 → 0
[match:WORD]
M1: word
M2: one_word_with_underscores
[fail:WORD]
F1: two words
experimental/ws/grammar_tests/02_test_document.ini
deleted
100644 → 0
[match:document]
M1: """This is a sequence of words extending over several lines"""
M2: """ This sequence contains leading whitespace"""
[fail:document]
F1: """This test should fail, because neither comma nor full have been defined anywhere."""