For the TL;DR group amongst you, here's the link: http://www.byond.com/members/Jp/files/dreamcatcher.zip . That's a zipfile containing the source and a makefile for building it. The makefile uses GNU flex and bison, both of which are freely available. Your local equivalent of lex or yacc may work in their place - I don't know.
This is not likely to be easy to build on a Windows system, I'm afraid. For that matter, it's probably only going to be a bit easier for someone on a UNIX-like system. Rest assured that as of yet, Dreamcatcher does very little of use - you're not missing out.
Anyway, what was my Get Something Done project? Well, I described my original goals here. How well did I do?
Not so well, I'm afraid. I'd like to say that real life intervened, but while I was busy, I wasn't so busy that I couldn't code. I was just lazy for most of this month. Sorry.
But I have done some work, and I've gone a bit beyond where I was last time I fiddled with this stuff.
Dreamcatcher as currently built can extract the object tree, including variables, from a single DM source file that contains no preprocessor statements, procedures, or verbs. It only works on 'usual-ish' DM files - no braces for block structure, no spaces for indents. parent_type isn't understood, nor are some technically legal but kind of odd constructs, like starting a derived type with a leading / character or using ? as an identifier. There are probably more restrictions I can't see because I'm right up against the code. Some of them are a matter of writing a few lines in the flex code; some are a little trickier.
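To make the idea concrete, here is a minimal sketch in Python of pulling a tree out of tab-indented, /-separated names. It is not Dreamcatcher's code (that is flex and bison). The names Node, parse and flatten are made up for illustration, and the sketch ignores comments, multi-line strings and everything else a real DM lexer has to deal with.

```python
class Node:
    """One entry in the object tree. Children keep insertion order."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def add(self, name):
        # Always creates a new child, even if one with this name exists.
        child = Node(name)
        self.children.append(child)
        return child


def parse(source):
    """Build a tree from tab-indented lines holding '/'-separated paths."""
    root = Node("root")
    # stack[d] is the node that a line at indent depth d attaches to.
    stack = [root]
    for line in source.splitlines():
        if not line.strip():
            continue
        depth = len(line) - len(line.lstrip("\t"))
        if depth >= len(stack):
            raise ValueError("line indented too deeply: %r" % line)
        del stack[depth + 1:]
        # Drop any initializer; this sketch only records names.
        path = line.strip().split("=")[0].strip()
        node = stack[depth]
        for part in path.split("/"):
            node = node.add(part)
        stack.append(node)
    return root


def flatten(node):
    """Pre-order list of node names."""
    names = [node.name]
    for child in node.children:
        names.extend(flatten(child))
    return names


print(" ".join(flatten(parse("a\n\tvar/c\n\tvar/d = 5\n"))))
# prints: root a var c var d
```

Like the current Dreamcatcher output, this keeps a separate 'var' node for every place the name appears.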
So what can it do?
Well, this:
a
b
var/c
d/e/var/f
var/a/b
g = "Test variable" // This variable is a test
var
d
e
h = 5 + /* This is a constant expression */ 6 // Yes, this is kind of odd code.
i/var/j = {"Oh no!
This giant string will devour us all!
And it's FULL of funny characters!
" \"} "\} ""}
k/var/l = "This one is also kind of odd \
and also spans two lines \""
parses out to:
root a b var c d e var f var a b g var d e h i var j k var l
Notice that names that appear more than once aren't collapsed into a single instance (all those 'var' nodes), but that's mostly a matter of data representation. The current code is very proof-of-concept.
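Collapsing repeated names is mostly a question of keying children by name instead of appending them to a list. A hedged sketch in Python, with a hypothetical insert_path helper and nested dicts standing in for whatever representation Dreamcatcher ends up using:

```python
def insert_path(tree, path):
    """Walk or create one nested dict per path component.

    setdefault reuses an existing child with the same name, so repeated
    names under the same parent end up as a single node.
    """
    node = tree
    for part in path:
        node = node.setdefault(part, {})
    return node


tree = {}
insert_path(tree, ["a", "var", "c"])
insert_path(tree, ["a", "var", "d"])
print(tree)
# prints: {'a': {'var': {'c': {}, 'd': {}}}}
```

Both variables now hang off one 'var' node under 'a', rather than two.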
Is this at all useful? Probably not. But hopefully it'll lead to something more interesting.