In my previous post I showed the expression parser that I have been using in some of my hobby projects. I always thought of this as fairly small and neat in its operation. It made sense to me and I was happy with it. However… Since writing that post, I have been thinking about expression parsing again and I have made some discoveries.
Last night I tried to eliminate the stack from my expression parse and use simply recursion and the main stack itself. I came up with the following algorithm, which works great, and I was amazed at how tiny it is:
int prec(char c) { switch (c) { case ('=') : return 1; case ('&') : return 2; case ('+') : return 3; case ('*') : return 4; default: return 0; } } void level(int l) { while (char t = peek()) { if (t == ')') return; if (prec(t) > l) { level(prec(t)); } else if (prec(t) == l) { t = next(); ident(); level(l); putc(t); } else { return; } } } void expr() { ident(); level(1); }
Since nobody can come up with anything original these days, I wanted to know more about the algorithm I have just found, since it surely has been discovered by those who came before me. It seems this is a crappy weird version of a precedence climbing parser. This link here does a very good job of explaining how it works, and how to extend it to handle things like differing association, which I had not considered.
After seeing that article I wanted to try out the jist of the algorithm as they present it. Here is the full source for my little test program I made to play around with it:
#include <stdio.h> #include <assert.h> // the source expession static const char * g = "z=a+b*c+d=e"; //static const char * g = "a+b&c"; //static const char * g = "a+(b&c)"; // lexer duties char next() { if (*g) return *g++; else return '\0'; } char peek() { return *g; } void putc(char c) { putchar(c); } void expect(char c) { char t = next(); assert(t == c); } void expr(int v); // munch an itentifier or parenthesized expression void ident() { char n = next(); if (n == '(') { expr(1); expect(')'); return; } assert(n >= 'a' && n <= 'z'); putc(n); } // the precedence table int prec(char c) { switch (c) { case ('=') : return 1; case ('&') : return 2; case ('+') : return 3; case ('*') : return 4; default: return 0; } } // precedence climbing expression parser void expr(int min_prec) { ident(); while (prec(peek()) >= min_prec) { char c = next(); expr(prec(c)); putc(c); } } int main() { expr(1); return 0; }
This version is much much smaller and neater then either of the versions I came up with. I will definitely be ditching my strange methods for this one. I am always happy to learn something new, and especially when its better then what I was previously doing. I also boldly said expressions were the hardest part of a recursive decent compiler in my previous post… well I certainly don’t think that’s true anymore!