BASIC programming language interpreter based on Boost.Spirit X3
The project was born as a result of a time-limited engineering challenge of implementing a BASIC interpreter that is good enough for running 2 text-based games:
- WUMPUS 2 by Gregory Yob from "More BASIC Computer Games" by David H. Ahl, published 1979 (listing)
- THE ANCIENT CHATEAU from "Creating Adventure Games On Your Computer" by Tim Hartnell, published 1983 (listing)
- Fully functional, but limited to only essential (used in samples) commands.
- Parsing based on
Boost.Spirit X3
. The whole language grammar takes less than 250 lines. - Comprehensive unit test suite:
unit_tests.bas
(based on the test suite of Applesoft BASIC in Javascript by Joshua Bell) + separatetests.cpp
- Input automation: User input is being logged and can be reused through
.input
files - Walkthrough
.input
files for bothwumpus2.bas
andchateau.bas
- Interactive mode
- Additional programs included
Download Windows executable archive, unarchive, and execute run.bat
to run most programs at once using the input automation.
Maze generation, Prime numbers, etc. See a separate page for more info
The goal was to write an interpreter with a minimum amount of code yet capable of running the sample programs as is and without cheating. It means that many classical concepts of compiler construction theory were omitted for the sake of simplicity. Here are some key decisions and related consequences:
- Preparse step (
bool Preparse()
), which copies the whole program into memory. Doing so, it indexes lines for fastGOTO
and separately storesDATA
section. Also, it combines multiline statement sequences together. - No tokenization / lexical analysis step. The parser works with characters directly. It increases parser complexity and likely slows it down. Additionally, some "nospace inputs" aren't supported, e.g., in
IFK9>T9THENT9=K9
the substringT9THENT9
will be recognized as an identifier instead of 2 identifiers and thethen
keyword. (It could be supported using lookahead syntax inidentifier_def
rule). The "right" approach could leverageBoost.Spirit.Lex
or old trusty Flex to generate the lexical analyzer. - Absence of Abstract Syntax Tree (AST) and bytecode generations as well as no separate execution step. I.e. parsing and execution happen simultaneously without intermediate representation. That's the biggest hack. The obvious issue is that each iteration of the loop involves parsing which isn't optimal. Also, there is some increased complexity of the grammar to support proper backtracking. The hardest thing though was the
ESLE
part of the condition statement. To solve thatclass SkipStatementRuntime
was created andbool ParseSequence()
complexity came from that. The "right" approach could involve the generation of AST, bytecode with following separate execution, or even generation of LLVM IR and making the real compiler.
See a separate page for more info