Commit 5aa3acd2 authored by eckhart

some documentation added (still a stub)

parent c90818b7
...@@ -145,7 +145,8 @@ class EBNFGrammar(Grammar): ...@@ -145,7 +145,8 @@ class EBNFGrammar(Grammar):
def grammar_changed(grammar_class, grammar_source: str) -> bool: def grammar_changed(grammar_class, grammar_source: str) -> bool:
"""Returns ``True`` if ``grammar_class`` does not reflect the latest """
Returns ``True`` if ``grammar_class`` does not reflect the latest
changes of ``grammar_source`` changes of ``grammar_source``
Parameters: Parameters:
......
...@@ -583,7 +583,7 @@ class Node(collections.abc.Sized): ...@@ -583,7 +583,7 @@ class Node(collections.abc.Sized):
def log(self, log_file_name): def log(self, log_file_name):
""" """
Writes ab S-expressions of the tree with root `self` to a file. Writes an S-expression-representation of the tree with root `self` to a file.
""" """
if is_logging(): if is_logging():
path = os.path.join(log_dir(), log_file_name) path = os.path.join(log_dir(), log_file_name)
......
...@@ -17,4 +17,4 @@ permissions and limitations under the License. ...@@ -17,4 +17,4 @@ permissions and limitations under the License.
""" """
__all__ = ('__version__',) __all__ = ('__version__',)
__version__ = '0.7.8' # + '_dev' + str(os.stat(__file__).st_mtime) __version__ = '0.7.9' # + '_dev' + str(os.stat(__file__).st_mtime)
...@@ -5,9 +5,8 @@ Introduction to [DHParser](https://gitlab.lrz.de/badw-it/DHParser) ...@@ -5,9 +5,8 @@ Introduction to [DHParser](https://gitlab.lrz.de/badw-it/DHParser)
Motto: **Computers enjoy XML, humans don't.** Motto: **Computers enjoy XML, humans don't.**
Why use domain specific languages in the humanities
Why use domain specific languages in the humanities? ---------------------------------------------------
----------------------------------------------------
Suppose you are a literary scientist and you would like to edit a poem Suppose you are a literary scientist and you would like to edit a poem
like Heinrich Heine's "Lyrisches Intermezzo". Usually, the technology of like Heinrich Heine's "Lyrisches Intermezzo". Usually, the technology of
...@@ -47,9 +46,9 @@ Now, while you might think that this all works well enough, there are ...@@ -47,9 +46,9 @@ Now, while you might think that this all works well enough, there are
a few drawbacks to this approach: a few drawbacks to this approach:
- The syntax is cumbersome and the encoding not very legible to humans - The syntax is cumbersome and the encoding not very legible to humans
working with it. (And I did not even use working with it. (And I did not even use
[TEI-XML](http://www.tei-c.org/index.xml), yet...) [TEI-XML](http://www.tei-c.org/index.xml), yet...)
Editing and revising XML-encoded text is a pain. Just ask the Editing and revising XML-encoded text is a pain. Just ask the
literary scientists who have to work with it. literary scientists who have to work with it.
- The XML encoding, especially TEI-XML, is often not intuitive. Only - The XML encoding, especially TEI-XML, is often not intuitive. Only
...@@ -57,15 +56,15 @@ a few drawbacks to this approach: ...@@ -57,15 +56,15 @@ a few drawbacks to this approach:
friend, who is not into digital technologies, might help you with friend, who is not into digital technologies, might help you with
proof-reading, you better think about it again. proof-reading, you better think about it again.
- There is an awful lot of typing to do: All those lengthy opening - There is an awful lot of typing to do: All those lengthy opening
and closing tags. This takes time... and closing tags. This takes time...
- While looking for a good XML-Editor, you find that there hardly exist - While looking for a good XML-Editor, you find that there hardly exist
any XML-Editors any more. (And for a reason, actually...) In any XML-Editors any more. (And for a reason, actually...) In
particular, there are no good open source XML-Editors. particular, there are no good open source XML-Editors.
On the other hand, there are good reasons why XML is used in the On the other hand, there are good reasons why XML is used in the
humanities: Important encoding standards like humanities: Important encoding standards like
[TEI-XML](http://www.tei-c.org/index.xml) are defined in [TEI-XML](http://www.tei-c.org/index.xml) are defined in
XML. Its strict syntax and the possibility to check data against XML. Its strict syntax and the possibility to check data against
schema help to detect and avoid encoding errors. If the schema is schema help to detect and avoid encoding errors. If the schema is
...@@ -81,29 +80,29 @@ provides an infrastructure that - if you know a little ...@@ -81,29 +80,29 @@ provides an infrastructure that - if you know a little
Python-programming - makes it very easy to convert your annotated text Python-programming - makes it very easy to convert your annotated text
into an XML-encoding of your choice. With DHParser, the same poem above into an XML-encoding of your choice. With DHParser, the same poem above
can be simply encoded like this: can be simply encoded like this:
Heinrich Heine <gnd:118548018>, Heinrich Heine <gnd:118548018>,
Buch der Lieder <urn:nbn:de:kobv:b4-200905192211>, Buch der Lieder <urn:nbn:de:kobv:b4-200905192211>,
Hamburg <gnd:4023118-5>, 1927. Hamburg <gnd:4023118-5>, 1927.
Lyrisches Intermezzo Lyrisches Intermezzo
IV. IV.
Wenn ich in deine Augen seh', Wenn ich in deine Augen seh',
so schwindet all' mein Leid und Weh! so schwindet all' mein Leid und Weh!
Doch wenn ich küsse deinen Mund, Doch wenn ich küsse deinen Mund,
so werd' ich ganz und gar gesund. so werd' ich ganz und gar gesund.
Wenn ich mich lehn' an deine Brust, Wenn ich mich lehn' an deine Brust,
kommt's über mich wie Himmelslust, kommt's über mich wie Himmelslust,
doch wenn du sprichst: Ich liebe dich! doch wenn du sprichst: Ich liebe dich!
so muß ich weinen bitterlich. so muß ich weinen bitterlich.
Yes, that's right. It is as simple as that. Observe, how much more Yes, that's right. It is as simple as that. Observe, how much more
effacious a verse like "Wenn ich mich lehn' an deine Brust, / kommt's efficacious a verse like "Wenn ich mich lehn' an deine Brust, / kommt's
über mich wie Himmelslust," can be if it is not cluttered with XML tags über mich wie Himmelslust," can be if it is not cluttered with XML tags
;-) ;-)
You might now wonder whether the second version really does encode the You might now wonder whether the second version really does encode the
same information as the XML version. How, for example, would the same information as the XML version. How, for example, would the
...@@ -114,13 +113,13 @@ example, a verse always starts and ends on the same line. There ...@@ -114,13 +113,13 @@ example, a verse always starts and ends on the same line. There
is always a gap between stanzas. And the title is always written above is always a gap between stanzas. And the title is always written above
the poem and not in the middle of it. So, if there is a title at all, we the poem and not in the middle of it. So, if there is a title at all, we
can be sure that what is written in the first line is the title and not can be sure that what is written in the first line is the title and not
a stanza. a stanza.
DHParser is able to exploit all those hints in order to gather much the DHParser is able to exploit all those hints in order to gather much the
same information as was encoded in the XML-Version. Don't believe it? same information as was encoded in the XML-Version. Don't believe it?
You can try: Download DHParser from the You can try: Download DHParser from the
[gitlab-repository](https://gitlab.lrz.de/badw-it/DHParser) and enter [gitlab-repository](https://gitlab.lrz.de/badw-it/DHParser) and enter
the directory `examples/Tutorial` on the command line interface (shell). the directory `examples/Tutorial` on the command line interface (shell).
Just run `python LyrikCompiler_example.py` (you need to have installed Just run `python LyrikCompiler_example.py` (you need to have installed
[Python](https://www.python.org/) Version 3.4 or higher on your computer). [Python](https://www.python.org/) Version 3.4 or higher on your computer).
The output will be something like this: The output will be something like this:
...@@ -165,36 +164,36 @@ without further proof that it can easily be converted into the other ...@@ -165,36 +164,36 @@ without further proof that it can easily be converted into the other
version and contains all the information that the other version contains. version and contains all the information that the other version contains.
How does DHParser achieve this? Well, there is the rub. In order to convert How does DHParser achieve this? Well, there is the rub. In order to convert
the poem in the domain specific version into the XML-version, DHParser the poem in the domain specific version into the XML-version, DHParser
requires a structural description of the domain specific encoding. This requires a structural description of the domain specific encoding. This
is a bit similar to a document type definition (DTD) in XML. This is a bit similar to a document type definition (DTD) in XML. This
structural description uses a slightly enhanced version of the structural description uses a slightly enhanced version of the
[Extended-Backus-Naur-Form (EBNF)](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form), [Extended-Backus-Naur-Form (EBNF)](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form),
which is a well-established formalism for the structural description of which is a well-established formalism for the structural description of
formal languages in computer sciences. An excerpt of the EBNF-definition formal languages in computer sciences. An excerpt of the EBNF-definition
of our domain-specific encoding for the poem looks like this. (We leave out of our domain-specific encoding for the poem looks like this. (We leave out
the meta-data here. See the meta-data here. See
[`examples/Tutorial/Lyrik.ebnf`](https://gitlab.lrz.de/badw-it/DHParser/blob/master/examples/Tutorial/Lyrik.ebnf) [`examples/Tutorial/Lyrik.ebnf`](https://gitlab.lrz.de/badw-it/DHParser/blob/master/examples/Tutorial/Lyrik.ebnf)
for the full EBNF): for the full EBNF):
gedicht = { LEERZEILE }+ [serie] §titel §text /\s*/ §ENDE gedicht = { LEERZEILE }+ [serie] §titel §text /\s*/ §ENDE
serie = !(titel vers NZ vers) { NZ zeile }+ { LEERZEILE }+ serie = !(titel vers NZ vers) { NZ zeile }+ { LEERZEILE }+
titel = { NZ zeile}+ { LEERZEILE }+ titel = { NZ zeile}+ { LEERZEILE }+
zeile = { ZEICHENFOLGE }+ zeile = { ZEICHENFOLGE }+
text = { strophe {LEERZEILE} }+ text = { strophe {LEERZEILE} }+
strophe = { NZ vers }+ strophe = { NZ vers }+
vers = { ZEICHENFOLGE }+ vers = { ZEICHENFOLGE }+
ZEICHENFOLGE = /[^ \n<>]+/~ ZEICHENFOLGE = /[^ \n<>]+/~
NZ = /\n/~ NZ = /\n/~
LEERZEILE = /\n[ \t]*(?=\n)/~ LEERZEILE = /\n[ \t]*(?=\n)/~
ENDE = !/./ ENDE = !/./
Without going into too much detail here, let me just explain a few basics of Without going into too much detail here, let me just explain a few basics of
this formal description: The slashes `/` enclose ordinary regular expressions. this formal description: The slashes `/` enclose ordinary regular expressions.
Thus, `NZ` (for "Neue Zeile", German for "new line") is defined as `/\n/~`, which Thus, `NZ` (for "Neue Zeile", German for "new line") is defined as `/\n/~`, which
is the newline-token `\n` in a regular expression, plus further horizontal is the newline-token `\n` in a regular expression, plus further horizontal
whitespace (signified by the tilde `~`), if there is any. whitespace (signified by the tilde `~`), if there is any.
The braces `{` `}` enclose items that can be repeated zero or more times; with a The braces `{` `}` enclose items that can be repeated zero or more times; with a
...@@ -203,10 +202,10 @@ definition of `text` in the 6th line: `{ strophe {LEERZEILE} }+`. This reads as ...@@ -203,10 +202,10 @@ definition of `text` in the 6th line: `{ strophe {LEERZEILE} }+`. This reads as
follows: The text of the poem consists of a sequence of stanzas, each of which follows: The text of the poem consists of a sequence of stanzas, each of which
is followed by a sequence of empty lines (German: "Leerzeilen"). If you now look is followed by a sequence of empty lines (German: "Leerzeilen"). If you now look
at the structural definition of a stanza, you find that it consists of a sequence at the structural definition of a stanza, you find that it consists of a sequence
of verses, each of which starts, i.e. is preceeded by a new line. of verses, each of which starts, i.e. is preceded by a new line.
Can you figure out the rest? Hint: The square brackets `[` and `]` mean that an Can you figure out the rest? Hint: The square brackets `[` and `]` mean that an
item is optional and the `§` sign means that it is obligatory. (Strictly speaking, item is optional and the `§` sign means that it is obligatory. (Strictly speaking,
the §-signs are not necessary, because an item that is not optional is always the §-signs are not necessary, because an item that is not optional is always
obligatory, but the §-signs help the converter to produce more useful error obligatory, but the §-signs help the converter to produce more useful error
messages.) messages.)
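To make the notation more tangible, here is a minimal sketch that re-implements the terminal rules and the `vers` and `strophe` rules by hand with Python's `re` module. It is not DHParser's generated parser. The helper names `match_vers` and `match_strophe` are made up for this illustration, and modelling the tilde `~` as `[ \t]*` is an assumption about its meaning based on the explanation above.

```python
import re

# Assumption: the trailing tilde `~` is modelled as optional horizontal
# whitespace; DHParser's actual whitespace handling may differ.
WSP = r'[ \t]*'

# Terminal rules, copied from the grammar excerpt above.
ZEICHENFOLGE = re.compile(r'[^ \n<>]+' + WSP)
NZ = re.compile(r'\n' + WSP)
LEERZEILE = re.compile(r'\n[ \t]*(?=\n)' + WSP)


def match_vers(text, pos):
    """vers = { ZEICHENFOLGE }+  (one or more character sequences)."""
    words = []
    m = ZEICHENFOLGE.match(text, pos)
    while m:
        words.append(m.group().strip())
        pos = m.end()
        m = ZEICHENFOLGE.match(text, pos)
    return (words, pos) if words else None


def match_strophe(text, pos):
    """strophe = { NZ vers }+  (verses, each preceded by a new line)."""
    verses = []
    while True:
        nz = NZ.match(text, pos)
        if not nz:
            break
        result = match_vers(text, nz.end())
        if result is None:
            break  # a new line without a verse does not belong to the stanza
        words, pos = result
        verses.append(words)
    return (verses, pos) if verses else None


sample = "\nWenn ich in deine Augen seh',\nso schwindet all' mein Leid und Weh!\n\nDoch"
strophe, end = match_strophe(sample, 0)
print(strophe)
```

The stanza stops at the empty line, because after the second verse the next new line is not followed by a verse. At that position `LEERZEILE` matches instead, which is how the `text` rule separates one stanza from the next.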
...@@ -215,7 +214,7 @@ This should be enough for an introduction to the purpose of DSLs in the ...@@ -215,7 +214,7 @@ This should be enough for an introduction to the purpose of DSLs in the
humanities. It has shown what is probably the most important use case of humanities. It has shown what is probably the most important use case of
DHParser, i.e. as a frontend-technology for XML-encodings. Of course, DHParser, i.e. as a frontend-technology for XML-encodings. Of course,
it can just as well be used as a frontend for any other kind of it can just as well be used as a frontend for any other kind of
structured data, like SQL or graph-strcutured data. The latter is by the structured data, like SQL or graph-structured data. The latter is by the
way a very reasonable alternative to XML for edition projects with a way a very reasonable alternative to XML for edition projects with a
complex transmission history. See Andreas Kuczera's Blog-entry on complex transmission history. See Andreas Kuczera's Blog-entry on
["Graphdatenbanken für Historiker"](http://mittelalter.hypotheses.org/5995). ["Graphdatenbanken für Historiker"](http://mittelalter.hypotheses.org/5995).
...@@ -237,19 +236,19 @@ Now, if you enter the repo, you'll find three subdirectories: ...@@ -237,19 +236,19 @@ Now, if you enter the repo, you'll find three subdirectories:
DHParser DHParser
examples examples
test test
The directory `DHParser` contains the Python modules of the The directory `DHParser` contains the Python modules of the
DHParser-package, `test` - as you can guess - contains the unit-tests DHParser-package, `test` - as you can guess - contains the unit-tests
for DHParser. Now, enter the `examples/Tutorial`-directory. Presently, for DHParser. Now, enter the `examples/Tutorial`-directory. Presently,
most other examples are pretty rudimentary. So, don't worry about them. most other examples are pretty rudimentary. So, don't worry about them.
In this directory, you'll find a simple EBNF Grammar for poetry in the In this directory, you'll find a simple EBNF Grammar for poetry in the
file `Lyrik.ebnf`. Have a look at it. You'll find that it is the same file `Lyrik.ebnf`. Have a look at it. You'll find that it is the same
grammar (plus a few additions) that has been mentioned just before. grammar (plus a few additions) that has been mentioned just before.
You'll also find a little script `recompile_grammar.py` that is used to You'll also find a little script `recompile_grammar.py` that is used to
compile an EBNF-Grammar into an executable Python-module that can be compile an EBNF-Grammar into an executable Python-module that can be
used to parse any piece of text that this grammar is meant for; in this used to parse any piece of text that this grammar is meant for; in this
case poetry. case poetry.
Any DHParser-Project needs such a script. The content of the script is Any DHParser-Project needs such a script. The content of the script is
pretty self-explanatory: pretty self-explanatory:
...@@ -259,7 +258,7 @@ pretty self-explanatory: ...@@ -259,7 +258,7 @@ pretty self-explanatory:
with open('Lyrik_ebnf_ERRORS.txt') as f: with open('Lyrik_ebnf_ERRORS.txt') as f:
print(f.read()) print(f.read())
sys.exit(1) sys.exit(1)
The script simply (re-)compiles any EBNF grammar that it finds in the The script simply (re-)compiles any EBNF grammar that it finds in the
current directory. "Recompiling" means that DHParser notices if a current directory. "Recompiling" means that DHParser notices if a
grammar has already been compiled and overwrites only that part of the grammar has already been compiled and overwrites only that part of the
...@@ -268,7 +267,7 @@ will come to that later what these are - can safely be edited by you. ...@@ -268,7 +267,7 @@ will come to that later what these are - can safely be edited by you.
Now just run `recompile_grammar.py` from the command line: Now just run `recompile_grammar.py` from the command line:
$ python3 recompile_grammar.py $ python3 recompile_grammar.py
You'll find that `recompile_grammar.py` has generated a new script with You'll find that `recompile_grammar.py` has generated a new script with
the name `LyrikCompiler.py`. This script contains the Parser for the the name `LyrikCompiler.py`. This script contains the Parser for the
`Lyrik.ebnf`-grammar and some skeleton-code for a DSL->XML-Compiler (or `Lyrik.ebnf`-grammar and some skeleton-code for a DSL->XML-Compiler (or
...@@ -276,7 +275,7 @@ rather, a DSL-whatever compiler), which you can later fill in. Now let's ...@@ -276,7 +275,7 @@ rather, a DSL-whatever compiler), which you can later fill in. Now let's
see how this script works: see how this script works:
$ python3 LyrikCompiler.py Lyrisches_Intermezzo_IV.txt >result.xml $ python3 LyrikCompiler.py Lyrisches_Intermezzo_IV.txt >result.xml
The file `Lyrisches_Intermezzo_IV.txt` contains the fourth part of The file `Lyrisches_Intermezzo_IV.txt` contains the fourth part of
Heinrich Heine's Lyrisches Intermezzo encoded in our own human-readable Heinrich Heine's Lyrisches Intermezzo encoded in our own human-readable
poetry-DSL that has been shown above. Since we have redirected the poetry-DSL that has been shown above. Since we have redirected the
...@@ -317,7 +316,7 @@ recognizable!) first verse of the poem: ...@@ -317,7 +316,7 @@ recognizable!) first verse of the poem:
</vers> </vers>
... ...
How come it is so obfuscated, and where do all those pseudo-tags like How come it is so obfuscated, and where do all those pseudo-tags like
`<:RegExp>` and `<:Whitespace>` come from? Well, this is probably the `<:RegExp>` and `<:Whitespace>` come from? Well, this is probably the
right time to explain a bit about parsing and compilation in general. right time to explain a bit about parsing and compilation in general.
...@@ -354,7 +353,7 @@ as the grammar has to be specified for each application domain. ...@@ -354,7 +353,7 @@ as the grammar has to be specified for each application domain.
Before I'll explain how to specify an AST-transformation for DHParser, Before I'll explain how to specify an AST-transformation for DHParser,
you may want to know what difference it makes. There is a script you may want to know what difference it makes. There is a script
`LyrikCompiler_example.py` in the directory where the `LyrikCompiler_example.py` in the directory where the
AST-transformations are already included. Running the script AST-transformations are already included. Running the script
$ python LyrikCompiler_example.py Lyrisches_Intermezzo_IV.txt $ python LyrikCompiler_example.py Lyrisches_Intermezzo_IV.txt
...@@ -362,7 +361,7 @@ yields the fairly clean Pseudo-XML-representation of the DSL-encoded ...@@ -362,7 +361,7 @@ yields the fairly clean Pseudo-XML-representation of the DSL-encoded
poem that we have seen above. Just as a teaser, you might want to look poem that we have seen above. Just as a teaser, you might want to look
up, how the AST-transformation is specified with DHParser. For this up, how the AST-transformation is specified with DHParser. For this
purpose, you can have a look in file `LyrikCompiler_example.py`. If you purpose, you can have a look in file `LyrikCompiler_example.py`. If you
scrool down to the AST section, you'll see something like this: scroll down to the AST section, you'll see something like this:
Lyrik_AST_transformation_table = { Lyrik_AST_transformation_table = {
# AST Transformations for the Lyrik-grammar # AST Transformations for the Lyrik-grammar
...@@ -389,8 +388,8 @@ As you can see, AST-transformations a specified declaratively (with the ...@@ -389,8 +388,8 @@ As you can see, AST-transformations a specified declaratively (with the
option to add your own Python-programmed transformation rules). This option to add your own Python-programmed transformation rules). This
keeps the specification of the AST-transformation simple and concise. At keeps the specification of the AST-transformation simple and concise. At
the same time, we avoid adding hints for the AST-transformation in the the same time, we avoid adding hints for the AST-transformation in the
grammar specification, which would render the grammar less readable. grammar specification, which would render the grammar less readable.
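The idea of a declarative transformation table can be sketched in a few lines of plain Python. This is only an illustration of the principle, not DHParser's API: the tuple-based tree, the rule functions `remove_whitespace` and `flatten_regexp`, and the driver `transform` are all hypothetical names invented for this example. Only the pseudo-tags `:RegExp` and `:Whitespace` are taken from the output shown earlier.

```python
# Hypothetical tree representation: (name, content), where content is either
# a string (leaf node) or a list of child nodes.

def remove_whitespace(node):
    """Drop all `:Whitespace` children of a node."""
    name, content = node
    if isinstance(content, list):
        content = [child for child in content if child[0] != ':Whitespace']
    return (name, content)


def flatten_regexp(node):
    """Replace a single remaining `:RegExp` child by its text."""
    name, content = node
    if (isinstance(content, list) and len(content) == 1
            and content[0][0] == ':RegExp'):
        return (name, content[0][1])
    return node


# The table maps node names to the list of rules applied to such nodes.
# (Illustrative table, not the one from LyrikCompiler_example.py.)
TABLE = {
    'vers': [],
    'ZEICHENFOLGE': [remove_whitespace, flatten_regexp],
}


def transform(node, table):
    """Transform the children first, then apply the rules for this node."""
    name, content = node
    if isinstance(content, list):
        node = (name, [transform(child, table) for child in content])
    for rule in table.get(name, []):
        node = rule(node)
    return node


cst = ('vers', [
    ('ZEICHENFOLGE', [(':RegExp', 'Wenn'), (':Whitespace', ' ')]),
    ('ZEICHENFOLGE', [(':RegExp', 'ich'), (':Whitespace', ' ')]),
])
ast = transform(cst, TABLE)
print(ast)
```

Because the rules live in a table keyed by node name, the grammar itself stays free of transformation hints, and a new rule is added by editing one table entry.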
Next, I am going to explain step by step, how a domain specific language Next, I am going to explain step by step, how a domain specific language
for poems like Heine's Lyrisches Intermezzo can be designed, specified, for poems like Heine's Lyrisches Intermezzo can be designed, specified,
compiled and tested. compiled and tested.
......
...@@ -7,7 +7,6 @@ specific languages (DSL) in Digital Humanities projects. ...@@ -7,7 +7,6 @@ specific languages (DSL) in Digital Humanities projects.
Author: Eckhart Arnold, Bavarian Academy of Sciences Author: Eckhart Arnold, Bavarian Academy of Sciences
Email: arnold@badw.de Email: arnold@badw.de
License License
------- -------
...@@ -32,18 +31,16 @@ Python 3.5 source code in order for DHParser to be backwards compatible ...@@ -32,18 +31,16 @@ Python 3.5 source code in order for DHParser to be backwards compatible
with Python 3.4. The module ``DHParser/foreign_typing.py`` is licensed under the with Python 3.4. The module ``DHParser/foreign_typing.py`` is licensed under the
[Python Software Foundation License Version 2](https://docs.python.org/3.5/license.html) [Python Software Foundation License Version 2](https://docs.python.org/3.5/license.html)
Sources Sources
------- -------
Find the sources on [gitlab.lrz.de/badw-it/DHParser](https://gitlab.lrz.de/badw-it/DHParser) . Find the sources on [gitlab.lrz.de/badw-it/DHParser](https://gitlab.lrz.de/badw-it/DHParser) .
Get them with: Get them with:
git clone https://gitlab.lrz.de/badw-it/DHParser git clone https://gitlab.lrz.de/badw-it/DHParser
Please contact me if you are interested in contributing to the Please contact me if you are interested in contributing to the
development or just using DHParser. development or just using DHParser.
Disclaimer Disclaimer
---------- ----------
...@@ -55,14 +52,13 @@ function names changed in future versions. The API is NOT YET STABLE! ...@@ -55,14 +52,13 @@ function names changed in future versions. The API is NOT YET STABLE!
Use it for testing and evaluation, but not in a production environment Use it for testing and evaluation, but not in a production environment
or contact me first, if you intend to do so. or contact me first, if you intend to do so.
Purpose Purpose
------- -------
DHParser leverages the power of Domain specific languages for the DHParser leverages the power of Domain specific languages for the
Digital Humanities. Digital Humanities.
Domain specific languages are widespread in Domain specific languages are widespread in
computer sciences, but seem to be underused in the Digital Humanities. computer sciences, but seem to be underused in the Digital Humanities.
While DSLs are sometimes introduced to Digital-Humanities-projects as While DSLs are sometimes introduced to Digital-Humanities-projects as
[practical adhoc-solution][Müller_2016], these solutions are often [practical adhoc-solution][Müller_2016], these solutions are often
...@@ -76,17 +72,17 @@ parser generators, but employs the more modern form called ...@@ -76,17 +72,17 @@ parser generators, but employs the more modern form called
recursive descent parser. recursive descent parser.
Why another parser generator? There are plenty of good parser Why another parser generator? There are plenty of good parser
generators out there, e.g. [Añez's grako parser generator][Añez_2017], generators out there, e.g. [Añez's grako parser generator][Añez_2017],
[Eclipse XText][XText_Website]. However, DHParser is [Eclipse XText][XText_Website]. However, DHParser is
intended as a tool that is specifically geared towards digital intended as a tool that is specifically geared towards digital
humanities applications, while most existing parser generators come humanities applications, while most existing parser generators come
from compiler construction toolkits for programming languages. from compiler construction toolkits for programming languages.
While I expect DSLs in computer science and DSLs in the Digital While I expect DSLs in computer science and DSLs in the Digital
Humanities to be quite similar as far as the technological realization Humanities to be quite similar as far as the technological realization
is concerned, the use cases, requirements and challenges are somewhat is concerned, the use cases, requirements and challenges are somewhat
different. For example, in the humanities annotating text is a central different. For example, in the humanities annotating text is a central
use case, which is mostly absent in computer science treatments. use case, which is mostly absent in computer science treatments.
These differences might sooner or later require developing the These differences might sooner or later require developing the
DSL-construction toolkits in a different direction. Also, DSL-construction toolkits in a different direction. Also,
DHParser shall (in the future) serve as a teaching tool, which DHParser shall (in the future) serve as a teaching tool, which
influences some of its design decisions such as, for example, clearly influences some of its design decisions such as, for example, clearly
...@@ -113,7 +109,7 @@ Further (intended) use cases are: ...@@ -113,7 +109,7 @@ Further (intended) use cases are:
Mark and Markdown also go beyond what is feasible with pure Mark and Markdown also go beyond what is feasible with pure
EBNF-based-parsers.) EBNF-based-parsers.)
* EBNF itself. DHParser is already self-hosting ;-) * EBNF itself. DHParser is already self-hosting ;-)
* Digital and cross-media editions * Digital and cross-media editions
* Digital dictionaries * Digital dictionaries
For a simple self-test run `dhparser.py` from the command line. This For a simple self-test run `dhparser.py` from the command line. This
...@@ -122,13 +118,11 @@ Python-based parser class representing that grammar. The concrete and ...@@ -122,13 +118,11 @@ Python-based parser class representing that grammar. The concrete and
abstract syntax tree as well as a full and abbreviated log of the abstract syntax tree as well as a full and abbreviated log of the
parsing process will be stored in a sub-directory named "LOG". parsing process will be stored in a sub-directory named "LOG".
Introduction Introduction
------------ ------------
see [Introduction.md](https://gitlab.lrz.de/badw-it/DHParser/blob/master/Introduction.md) see [Introduction.md](https://gitlab.lrz.de/badw-it/DHParser/blob/master/Introduction.md)
References References
---------- ----------
...@@ -146,22 +140,19 @@ München 2016. Short-URL: [tiny.badw.de/2JVT][Arnold_2016] ...@@ -146,22 +140,19 @@ München 2016. Short-URL: [tiny.badw.de/2JVT][Arnold_2016]
[Arnold_2016]: https://f.hypotheses.org/wp-content/blogs.dir/1856/files/2016/12/EA_Pr%C3%A4sentation_Auszeichnungssprachen.pdf [Arnold_2016]: https://f.hypotheses.org/wp-content/blogs.dir/1856/files/2016/12/EA_Pr%C3%A4sentation_Auszeichnungssprachen.pdf
Brian Ford: Parsing Expression Grammars: A Recognition-Based Syntactic Brian Ford: Parsing Expression Grammars: A Recognition-Based Syntactic
Foundation, Cambridge Foundation, Cambridge
Massachusetts, 2004. Short-URL: [http://t1p.de/jihs][Ford_2004] Massachusetts, 2004. Short-URL: [http://t1p.de/jihs][Ford_2004]
[Ford_2004]: https://pdos.csail.mit.edu/~baford/packrat/popl04/peg-popl04.pdf [Ford_2004]: https://pdos.csail.mit.edu/~baford/packrat/popl04/peg-popl04.pdf
[Ford_20XX]: http://bford.info/packrat/
[Ford_20XX]: http://bford.info/packrat/
Richard A. Frost, Rahmatullah Hafiz and Paul Callaghan: Parser Richard A. Frost, Rahmatullah Hafiz and Paul Callaghan: Parser
Combinators for Ambiguous Left-Recursive Grammars, in: P. Hudak and Combinators for Ambiguous Left-Recursive Grammars, in: P. Hudak and
D.S. Warren (Eds.): PADL 2008, LNCS 4902, pp. 167–181, Springer-Verlag D.S. Warren (Eds.): PADL 2008, LNCS 4902, pp. 167–181, Springer-Verlag
Berlin Heidelberg 2008. Berlin Heidelberg 2008.
Dominikus Herzberg: Objekt-orientierte Parser-Kombinatoren in Python, Dominikus Herzberg: Objekt-orientierte Parser-Kombinatoren in Python,
Blog-Post, September, 18th 2008 on denkspuren. gedanken, ideen, Blog-Post, September, 18th 2008 on denkspuren. gedanken, ideen,
anregungen und links rund um informatik-themen, short-URL: anregungen und links rund um informatik-themen, short-URL:
...@@ -169,7 +160,6 @@ anregungen und links rund um informatik-themen, short-URL: ...@@ -169,7 +160,6 @@ anregungen und links rund um informatik-themen, short-URL:
[Herzberg_2008a]: http://denkspuren.blogspot.de/2008/09/objekt-orientierte-parser-kombinatoren.html [Herzberg_2008a]: http://denkspuren.blogspot.de/2008/09/objekt-orientierte-parser-kombinatoren.html
Dominikus Herzberg: Eine einfache Grammatik für LaTeX, Blog-Post, Dominikus Herzberg: Eine einfache Grammatik für LaTeX, Blog-Post,
September, 18th 2008 on denkspuren. gedanken, ideen, anregungen und September, 18th 2008 on denkspuren. gedanken, ideen, anregungen und
links rund um informatik-themen, short-URL: links rund um informatik-themen, short-URL:
...@@ -177,17 +167,14 @@ links rund um informatik-themen, short-URL: ...@@ -177,17 +167,14 @@ links rund um informatik-themen, short-URL:
[Herzberg_2008b]: http://denkspuren.blogspot.de/2008/09/eine-einfache-grammatik-fr-latex.html [Herzberg_2008b]: http://denkspuren.blogspot.de/2008/09/eine-einfache-grammatik-fr-latex.html
Dominikus Herzberg: Uniform Syntax, Blog-Post, February, 27th 2007 on Dominikus Herzberg: Uniform Syntax, Blog-Post, February, 27th 2007 on
denkspuren. gedanken, ideen, anregungen und links rund um denkspuren. gedanken, ideen, anregungen und links rund um
informatik-themen, short-URL: [http://t1p.de/s0zk][Herzberg_2007] informatik-themen, short-URL: [http://t1p.de/s0zk][Herzberg_2007]
[Herzberg_2007]: http://denkspuren.blogspot.de/2007/02/uniform-syntax.html [Herzberg_2007]: http://denkspuren.blogspot.de/2007/02/uniform-syntax.html
[ISO_IEC_14977]: http://www.cl.cam.ac.uk/~mgk25/iso-14977.pdf [ISO_IEC_14977]: http://www.cl.cam.ac.uk/~mgk25/iso-14977.pdf
John MacFarlane, David Greenspan, Vicent Marti, Neil Williams, John MacFarlane, David Greenspan, Vicent Marti, Neil Williams,
Benjamin Dumke-von der Ehe, Jeff Atwood: CommonMark. A strongly Benjamin Dumke-von der Ehe, Jeff Atwood: CommonMark. A strongly
defined, highly compatible specification of defined, highly compatible specification of
...@@ -195,7 +182,6 @@ Markdown, 2017. [commonmark.org][MacFarlane_et_al_2017] ...@@ -195,7 +182,6 @@ Markdown, 2017. [commonmark.org][MacFarlane_et_al_2017]
[MacFarlane_et_al_2017]: http://commonmark.org/ [MacFarlane_et_al_2017]: http://commonmark.org/
Stefan Müller: DSLs in den digitalen Geisteswissenschaften, Stefan Müller: DSLs in den digitalen Geisteswissenschaften,
Präsentation auf dem Präsentation auf dem
[dhmuc-Workshop: Digitale Editionen und Auszeichnungssprachen](https://dhmuc.hypotheses.org/workshop-digitale-editionen-und-auszeichnungssprachen), [dhmuc-Workshop: Digitale Editionen und Auszeichnungssprachen](https://dhmuc.hypotheses.org/workshop-digitale-editionen-und-auszeichnungssprachen),
...@@ -203,15 +189,15 @@ München 2016. Short-URL: [tiny.badw.de/2JVy][Müller_2016] ...@@ -203,15 +189,15 @@ München 2016. Short-URL: [tiny.badw.de/2JVy][Müller_2016]
[Müller_2016]: https://f.hypotheses.org/wp-content/blogs.dir/1856/files/2016/12/Mueller_Anzeichnung_10_Vortrag_M%C3%BCnchen.pdf [Müller_2016]: https://f.hypotheses.org/wp-content/blogs.dir/1856/files/2016/12/Mueller_Anzeichnung_10_Vortrag_M%C3%BCnchen.pdf
Markus Voelter, Sebastian Benz, Christian Dietrich, Birgit Engelmann, Markus Voelter, Sebastian Benz, Christian Dietrich, Birgit Engelmann,
Mats Helander, Lennart Kats, Eelco Visser, Guido Wachsmuth: Mats Helander, Lennart Kats, Eelco Visser, Guido Wachsmuth:
DSL Engineering. Designing, Implementing and Using Domain-Specific Languages, 2013. DSL Engineering. Designing, Implementing and Using Domain-Specific Languages, 2013.
[http://dslbook.org/][Voelter_2013] [http://dslbook.org/][Voelter_2013]
[voelter_2013]: http://dslbook.org/ [voelter_2013]: http://dslbook.org/
[tex_stackexchange_no_bnf]: http://tex.stackexchange.com/questions/4201/is-there-a-bnf-grammar-of-the-tex-language [tex_stackexchange_no_bnf]: http://tex.stackexchange.com/questions/4201/is-there-a-bnf-grammar-of-the-tex-language
[tex_stackexchange_latex_parsers]: http://tex.stackexchange.com/questions/4223/what-parsers-for-latex-mathematics-exist-outside-of-the-tex-engines [tex_stackexchange_latex_parsers]: http://tex.stackexchange.com/questions/4223/what-parsers-for-latex-mathematics-exist-outside-of-the-tex-engines