A simpler tinier expression parser

In my previous post I showed the expression parser that I have been using in some of my hobby projects.  I always thought of this as fairly small and neat in its operation.  It made sense to me and I was happy with it.  However… Since writing that post, I have been thinking about expression parsing again and I have made some discoveries.

Last night I tried to eliminate the stack from my expression parse and use simply recursion and the main stack itself.  I came up with the following algorithm, which works great, and I was amazed at how tiny it is:

int prec(char c) {
    switch (c) {
    case ('=') : return 1;
    case ('&') : return 2;
    case ('+') : return 3;
    case ('*') : return 4;
    default:     return 0;
    }
}

void level(int l) {
    while (char t = peek()) {
        if (t == ')') return;
        if (prec(t) > l) {
            level(prec(t));
        }
        else if (prec(t) == l) {
            t = next();
            ident();
            level(l);
            putc(t);
        }
        else {
            return;
        }
    }
}

void expr() {
    ident();
    level(1);
}

Since nobody can come up with anything original these days, I wanted to know more about the algorithm I have just found, since it surely has been discovered by those who came before me. It seems this is a crappy weird version of a precedence climbing parser. This link here does a very good job of explaining how it works, and how to extend it to handle things like differing association, which I had not considered.

After seeing that article I wanted to try out the jist of the algorithm as they present it. Here is the full source for my little test program I made to play around with it:

#include <stdio.h>
#include <assert.h>

// the source expession
static const char * g = "z=a+b*c+d=e";
//static const char * g = "a+b&c";
//static const char * g = "a+(b&c)";

// lexer duties
char next()         { if (*g) return *g++; else return '\0'; }
char peek()         { return *g; }
void putc(char c)   { putchar(c); }
void expect(char c) { char t = next(); assert(t == c); }
void expr(int v);

// munch an itentifier or parenthesized expression
void ident() {
    char n = next();
    if (n == '(') { expr(1); expect(')'); return; }
    assert(n >= 'a' && n <= 'z');
    putc(n);
}

// the precedence table
int prec(char c) {
    switch (c) {
    case ('=') : return 1;
    case ('&') : return 2;
    case ('+') : return 3;
    case ('*') : return 4;
    default:     return 0;
    }
}

// precedence climbing expression parser
void expr(int min_prec) {
    ident();
    while (prec(peek()) >= min_prec) {
        char c = next();
        expr(prec(c));
        putc(c);
    }
}

int main() {
    expr(1);
    return 0;
}

This version is much much smaller and neater then either of the versions I came up with. I will definitely be ditching my strange methods for this one. I am always happy to learn something new, and especially when its better then what I was previously doing. I also boldly said expressions were the hardest part of a recursive decent compiler in my previous post… well I certainly don’t think that’s true anymore!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s