In my previous post I showed the expression parser that I have been using in some of my hobby projects. I always thought of this as fairly small and neat in its operation. It made sense to me and I was happy with it. However… Since writing that post, I have been thinking about expression parsing again and I have made some discoveries.
Last night I tried to eliminate the stack from my expression parse and use simply recursion and the main stack itself. I came up with the following algorithm, which works great, and I was amazed at how tiny it is:
int prec(char c) {
switch (c) {
case ('=') : return 1;
case ('&') : return 2;
case ('+') : return 3;
case ('*') : return 4;
default: return 0;
}
}
void level(int l) {
while (char t = peek()) {
if (t == ')') return;
if (prec(t) > l) {
level(prec(t));
}
else if (prec(t) == l) {
t = next();
ident();
level(l);
putc(t);
}
else {
return;
}
}
}
void expr() {
ident();
level(1);
}
Since nobody can come up with anything original these days, I wanted to know more about the algorithm I have just found, since it surely has been discovered by those who came before me. It seems this is a crappy weird version of a precedence climbing parser. This link here does a very good job of explaining how it works, and how to extend it to handle things like differing association, which I had not considered.
After seeing that article I wanted to try out the jist of the algorithm as they present it. Here is the full source for my little test program I made to play around with it:
#include <stdio.h>
#include <assert.h>
// the source expession
static const char * g = "z=a+b*c+d=e";
//static const char * g = "a+b&c";
//static const char * g = "a+(b&c)";
// lexer duties
char next() { if (*g) return *g++; else return '\0'; }
char peek() { return *g; }
void putc(char c) { putchar(c); }
void expect(char c) { char t = next(); assert(t == c); }
void expr(int v);
// munch an itentifier or parenthesized expression
void ident() {
char n = next();
if (n == '(') { expr(1); expect(')'); return; }
assert(n >= 'a' && n <= 'z');
putc(n);
}
// the precedence table
int prec(char c) {
switch (c) {
case ('=') : return 1;
case ('&') : return 2;
case ('+') : return 3;
case ('*') : return 4;
default: return 0;
}
}
// precedence climbing expression parser
void expr(int min_prec) {
ident();
while (prec(peek()) >= min_prec) {
char c = next();
expr(prec(c));
putc(c);
}
}
int main() {
expr(1);
return 0;
}
This version is much much smaller and neater then either of the versions I came up with. I will definitely be ditching my strange methods for this one. I am always happy to learn something new, and especially when its better then what I was previously doing. I also boldly said expressions were the hardest part of a recursive decent compiler in my previous post… well I certainly don’t think that’s true anymore!