Esrapy - Parsing in Python

esrapy is yparse backwards

Latest version: 0.7 posted March 20, 2023 (changes)

Esrapy is a parsing library written entirely in python. (If you're thinking "what the heck is 'parsing'?", click here.)

Esrapy can parse pretty much any context-free grammar, including left-recursive ones; can compile patterns (grammars) from textual or procedural descriptions; unifies parsing and tokenizing so tokenization can be contextual; has support for precedence (ambiguity resolution) and attributes (limited context-sensitivity); can return "first match" or a forrest of all possible parsings (for an ambiguous grammar); is very easy to use and results in very readable applications; and is only about 600 lines of code (not counting comments)--or only 450 if you don't need the textual compiler. Esrapy works as a simple translator from a raw source text to a friendly(!) parse tree. Much of the emphasis in esrapy is in making the returned parse tree very easy to traverse.

Esrapy is not terribly fast, and may be a bit of a memory hog while it's running (this may be an understatement). It's probably most useful for language prototyping or small interactive applications where generality and ease of use are more important than speed. (I could be wrong about the speed--I haven't run comparisons.) The parsing method it uses is somewhere between chart parsing and a packrat recursive descent.

Here is an entire program using esrapy which implements a simple infix calculator with operator precedence (see here for a trivial natural language example):

import esrapy

# Here's a Pattern that parses expressions like "a = b*5+(2^c)":
pattern = esrapy.compile(r"""
SPACE = <[ \t]*>
expr  = name:<[a-zA-Z_]+> op:'=' b:expr
		|1; a:expr@1 op:<[+-]> b:expr@2
		|2; a:expr@2 op:<[*/]> b:expr@3
		|3; a:expr@3 op:'^'    b:expr@4
		|4; name:<[a-zA-Z_]+> op:\var
		|    val:<[0-9]+>     op:\int
		| '(' _:expr ')'
""")

# And here's code that evaluates expressions like that...
vars = {}
def eval_pt(pt):
	if pt.op == 'int': return int(pt.val)
	if pt.op ==   '+': return eval_pt(pt.a)  + eval_pt(pt.b)
	if pt.op ==   '-': return eval_pt(pt.a)  - eval_pt(pt.b)
	if pt.op ==   '*': return eval_pt(pt.a)  * eval_pt(pt.b)
	if pt.op ==   '/': return eval_pt(pt.a)  / eval_pt(pt.b)
	if pt.op ==   '^': return eval_pt(pt.a) ** eval_pt(pt.b)
	if pt.op == 'var': return vars[pt.name]
	if pt.op ==   '=':
		val = eval_pt(pt.b)
		vars[pt.name] = val
		return val
	raise "Unhandled op: %s"%pt.op

while True:
	try:
		s = raw_input("Expr: ")
	except:
		break
	try:
		pt = pattern.match(s, debug=1)
	except:
		print "Sorry, couldn't parse that."
		continue
	print "Value:", eval_pt(pt)

More info and docs available in the tarball: esrapy.0.7.tgz

It is released under the GPL license version 2.

It is currently fairly green, so expect possible hiccups. Please let me know of any bugs you find, or enhancements you make.