Rainbow OS Update: Vim Editor, BASIC Interpreter, Persistent Storage & C Compiler


Five days, four milestones. Since the last update, Rainbow OS has gained a full-screen text editor, a BASIC interpreter, persistent disk storage, and a subset-C compiler that generates native i386 machine code — all running on a custom 32-bit kernel targeting the Intel 486.

M10: Vim-like Text Editor

Rainbow OS now ships with a built-in text editor modeled after Vim. It supports four modes:

  • Normal mode — navigation with h/j/k/l (or arrow keys), 0/$ for line start/end, gg/G for first/last line
  • Insert mode — entered via i, a, o, or O; Tab inserts 4 spaces; Escape returns to Normal
  • Visual mode (v) — character-wise selection
  • Visual-line mode (V) — line-wise selection

Editing Commands

The editor implements core Vim operations:

  • dd — delete current line
  • yy — yank (copy) current line
  • p — paste after cursor (supports both charwise and linewise paste)
  • x — delete character at cursor
  • u — undo
  • Ctrl+R — redo

Undo/redo uses transaction boundaries — each insert session or normal-mode command is a single undoable unit. The undo history is maintained as a linked structure in the buffer.
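To make the idea of transaction boundaries concrete, here is a minimal sketch. It is not the editor's actual code: it uses a flat array instead of the linked structure mentioned above, edits a single small text buffer, and every name in it (Buffer, EditOp, buf_begin_txn and so on) is made up for illustration.

```c
#include <string.h>

#define MAX_TEXT 256
#define MAX_OPS  128

/* One primitive edit: a single character inserted or deleted at `pos`. */
typedef struct {
    int  txn;       /* transaction this edit belongs to */
    int  is_insert; /* 1 = insert, 0 = delete */
    int  pos;
    char ch;
} EditOp;

/* Hypothetical buffer: text plus a log of applied and redoable edits. */
typedef struct {
    char   text[MAX_TEXT];
    int    len;
    EditOp ops[MAX_OPS];
    int    op_count;   /* ops[0..op_count) are currently applied */
    int    redo_limit; /* ops[op_count..redo_limit) can be redone */
    int    cur_txn;
} Buffer;

static void raw_insert(Buffer *b, int pos, char ch) {
    memmove(b->text + pos + 1, b->text + pos, (size_t)(b->len - pos));
    b->text[pos] = ch;
    b->text[++b->len] = '\0';
}

static void raw_delete(Buffer *b, int pos) {
    memmove(b->text + pos, b->text + pos + 1, (size_t)(b->len - pos - 1));
    b->text[--b->len] = '\0';
}

void buf_init(Buffer *b) { memset(b, 0, sizeof *b); }

/* Start a new undoable unit, e.g. on entering Insert mode or running `x`. */
void buf_begin_txn(Buffer *b) { b->cur_txn++; }

int buf_insert(Buffer *b, int pos, char ch) {
    if (b->len >= MAX_TEXT - 1 || b->op_count >= MAX_OPS) return 0;
    if (pos < 0 || pos > b->len) return 0;
    raw_insert(b, pos, ch);
    b->ops[b->op_count++] = (EditOp){ b->cur_txn, 1, pos, ch };
    b->redo_limit = b->op_count; /* a fresh edit discards the redo tail */
    return 1;
}

int buf_delete(Buffer *b, int pos) {
    if (b->op_count >= MAX_OPS || pos < 0 || pos >= b->len) return 0;
    char ch = b->text[pos];
    raw_delete(b, pos);
    b->ops[b->op_count++] = (EditOp){ b->cur_txn, 0, pos, ch };
    b->redo_limit = b->op_count;
    return 1;
}

/* Revert every edit of the most recent applied transaction, newest first. */
void buf_undo(Buffer *b) {
    if (b->op_count == 0) return;
    int txn = b->ops[b->op_count - 1].txn;
    while (b->op_count > 0 && b->ops[b->op_count - 1].txn == txn) {
        EditOp *op = &b->ops[--b->op_count];
        if (op->is_insert) raw_delete(b, op->pos);
        else               raw_insert(b, op->pos, op->ch);
    }
}

/* Re-apply every edit of the next undone transaction, oldest first. */
void buf_redo(Buffer *b) {
    if (b->op_count == b->redo_limit) return;
    int txn = b->ops[b->op_count].txn;
    while (b->op_count < b->redo_limit && b->ops[b->op_count].txn == txn) {
        EditOp *op = &b->ops[b->op_count++];
        if (op->is_insert) raw_insert(b, op->pos, op->ch);
        else               raw_delete(b, op->pos);
    }
}
```

In this sketch, typing "hi" in one insert session and then pressing u removes both characters at once, because both edits carry the same transaction number.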

File Persistence

Files are saved directly to the FAT12 filesystem via :w. The editor calls fat12_write_file() to allocate clusters, write data, and update directory entries. Combined with M12’s disk backing, edited files survive reboots.

M11: BASIC Interpreter

A complete BASIC interpreter brings interactive programming to Rainbow OS. Programs use classic numbered lines:

10 PRINT "Hello from Rainbow OS!"
20 FOR I = 1 TO 10
30 PRINT I * I
40 NEXT I
50 END
RUN

Tokenization

The tokenizer converts keywords to single-byte tokens (0x80–0xFF range), reducing memory usage and simplifying the parser. All standard BASIC keywords are recognized: IF/THEN/ELSE, FOR/TO/STEP/NEXT, GOTO, GOSUB/RETURN, PRINT, INPUT, LET, DIM, REM, and more.
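As a rough illustration of keyword crunching, the sketch below replaces keywords with single bytes starting at 0x80 and leaves everything else (including string literals) untouched. The keyword list, its ordering, and the resulting token values are assumptions made for this example; the real token table is not shown in this post.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical keyword table: token value = TOKEN_BASE + index. */
static const char *const KEYWORDS[] = {
    "PRINT", "FOR", "TO", "NEXT", "GOTO", "IF", "THEN", "END"
};
#define KEYWORD_COUNT (sizeof KEYWORDS / sizeof KEYWORDS[0])
#define TOKEN_BASE 0x80

/* Case-insensitive prefix match; returns the keyword length or 0. */
static size_t match_keyword(const char *src, const char *kw) {
    size_t i = 0;
    while (kw[i]) {
        if (toupper((unsigned char)src[i]) != kw[i]) return 0;
        i++;
    }
    return i;
}

/* Tokenize one line into `out`; returns the number of bytes written,
 * not counting the terminating 0. Text inside double quotes is copied
 * verbatim so that PRINT "TO" keeps its string intact. */
size_t basic_tokenize(const char *src, unsigned char *out, size_t cap) {
    size_t n = 0;
    int in_string = 0;
    if (cap == 0) return 0;
    while (*src && n + 1 < cap) {
        if (*src == '"') in_string = !in_string;
        if (!in_string) {
            size_t k, len = 0;
            for (k = 0; k < KEYWORD_COUNT; k++) {
                len = match_keyword(src, KEYWORDS[k]);
                if (len) break;
            }
            if (len) {
                out[n++] = (unsigned char)(TOKEN_BASE + k);
                src += len;
                continue;
            }
        }
        out[n++] = (unsigned char)*src++;
    }
    out[n] = 0;
    return n;
}
```

Like many classic BASICs, this sketch matches keywords anywhere outside a string, so a variable named TOTAL would be split into the TO token followed by "TAL". Whether Rainbow OS behaves the same way is not stated here.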

Expression Evaluator

The expression engine handles:

  • Arithmetic: +, -, *, /, MOD
  • Comparison: =, <, >, <=, >=, <>
  • Logical: AND, OR, NOT
  • Built-in functions: ABS(), INT(), RND(), LEN(), VAL(), CHR$(), STR$(), LEFT$(), RIGHT$(), MID$()
  • Direct memory access: PEEK() and POKE
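For readers curious how operator precedence can be handled, here is a small recursive-descent sketch covering only integers, + - * /, unary minus, and parentheses. It works on plain text rather than tokenized lines, and it is only an assumption that the real evaluator is structured along these lines; the names basic_eval and Parser are invented for the example.

```c
#include <ctype.h>

typedef struct {
    const char *p; /* current position in the source text */
    int error;     /* set to 1 on syntax error or division by zero */
} Parser;

static int parse_expr(Parser *ps);

static void skip_ws(Parser *ps) { while (*ps->p == ' ') ps->p++; }

/* factor := number | '(' expr ')' | '-' factor */
static int parse_factor(Parser *ps) {
    skip_ws(ps);
    if (*ps->p == '-') { ps->p++; return -parse_factor(ps); }
    if (*ps->p == '(') {
        ps->p++;
        int v = parse_expr(ps);
        skip_ws(ps);
        if (*ps->p == ')') ps->p++; else ps->error = 1;
        return v;
    }
    if (!isdigit((unsigned char)*ps->p)) { ps->error = 1; return 0; }
    int v = 0;
    while (isdigit((unsigned char)*ps->p)) v = v * 10 + (*ps->p++ - '0');
    return v;
}

/* term := factor (('*' | '/') factor)*   binds tighter than + and - */
static int parse_term(Parser *ps) {
    int v = parse_factor(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        if (op != '*' && op != '/') return v;
        ps->p++;
        int rhs = parse_factor(ps);
        if (op == '*') v *= rhs;
        else if (rhs != 0) v /= rhs;
        else { ps->error = 1; return 0; }
    }
}

/* expr := term (('+' | '-') term)* */
static int parse_expr(Parser *ps) {
    int v = parse_term(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        if (op != '+' && op != '-') return v;
        ps->p++;
        int rhs = parse_term(ps);
        v = (op == '+') ? v + rhs : v - rhs;
    }
}

/* Evaluate a whole expression; *ok is 0 if anything went wrong. */
int basic_eval(const char *src, int *ok) {
    Parser ps = { src, 0 };
    int v = parse_expr(&ps);
    skip_ws(&ps);
    if (*ps.p != '\0') ps.error = 1; /* trailing garbage */
    *ok = !ps.error;
    return v;
}
```

Each precedence level gets its own function, so adding comparison or logical operators would mean adding further levels above parse_expr.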

Execution Runtime

The runtime maintains a GOSUB stack (32 levels deep) and a FOR/NEXT stack (8 levels). Programs can be saved to and loaded from disk with SAVE "filename" and LOAD "filename". Immediate-mode commands (RUN, LIST, NEW) execute outside the program context.
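A fixed-depth GOSUB stack can be as simple as the following sketch. Only the depth of 32 comes from the description above; the struct layout and function names are hypothetical, and storing a line number (rather than, say, a pointer into the program text) is an assumption.

```c
#define GOSUB_DEPTH 32

/* Return addresses for GOSUB, stored as BASIC line numbers. */
typedef struct {
    int return_line[GOSUB_DEPTH];
    int top; /* number of entries in use */
} GosubStack;

/* GOSUB: remember where to continue. Returns 0 when the stack is full,
 * which the interpreter would report as a runtime error. */
int gosub_push(GosubStack *s, int line) {
    if (s->top >= GOSUB_DEPTH) return 0;
    s->return_line[s->top++] = line;
    return 1;
}

/* RETURN: fetch the most recent return line. Returns 0 when there is
 * no matching GOSUB. */
int gosub_pop(GosubStack *s, int *line) {
    if (s->top == 0) return 0;
    *line = s->return_line[--s->top];
    return 1;
}
```

A FOR/NEXT stack would follow the same pattern with a smaller depth and a larger frame (loop variable, limit, step, and the line to jump back to).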

M12: Persistent File Storage

Before M12, the FAT12 filesystem lived entirely in a 64 KB ramdisk — everything was lost on reboot. Now, an ATA PIO driver provides actual disk persistence.

ATA PIO Driver

The driver communicates with the primary ATA controller via I/O ports 0x1F0–0x1F7:

  • 28-bit LBA addressing (up to 2^28 sectors, i.e. 128 GiB with 512-byte sectors)
  • ata_read_sectors(lba, count, buf) and ata_write_sectors(lba, count, buf)
  • Busy-wait polling with a timeout of 1M iterations, plus the customary ~400 ns delay implemented as ALT_STATUS reads
  • Cache flush after writes (ATA FLUSH CACHE command)
  • Drive detection via the IDENTIFY command at boot
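The sketch below shows how a 28-bit LBA is split across the task-file registers and how a bounded polling loop might look. Port I/O is left out so the code can run anywhere: the register values are collected in a struct, and the status register is read through a callback. The struct, function names, and the fake_status test double are all hypothetical; the register layout, command codes, and status bits are the standard ATA ones.

```c
#include <stdint.h>

enum { ATA_CMD_READ_SECTORS = 0x20, ATA_CMD_WRITE_SECTORS = 0x30,
       ATA_CMD_FLUSH_CACHE = 0xE7, ATA_CMD_IDENTIFY = 0xEC };
enum { ATA_SR_ERR = 0x01, ATA_SR_DRQ = 0x08, ATA_SR_BSY = 0x80 };

/* Values destined for the primary channel's registers. */
typedef struct {
    uint8_t sector_count; /* port 0x1F2 (0 means 256 sectors) */
    uint8_t lba_low;      /* port 0x1F3, LBA bits 0-7   */
    uint8_t lba_mid;      /* port 0x1F4, LBA bits 8-15  */
    uint8_t lba_high;     /* port 0x1F5, LBA bits 16-23 */
    uint8_t drive_head;   /* port 0x1F6, 0xE0 | LBA bits 24-27 (master) */
    uint8_t command;      /* port 0x1F7 */
} AtaTaskFile;

/* Fill in the task file for an LBA28 command on the master drive.
 * Returns 0 if the LBA does not fit in 28 bits. */
int ata_build_lba28(uint32_t lba, uint8_t count, uint8_t cmd, AtaTaskFile *tf) {
    if (lba > 0x0FFFFFFFu) return 0;
    tf->sector_count = count;
    tf->lba_low    = (uint8_t)(lba & 0xFF);
    tf->lba_mid    = (uint8_t)((lba >> 8) & 0xFF);
    tf->lba_high   = (uint8_t)((lba >> 16) & 0xFF);
    tf->drive_head = (uint8_t)(0xE0 | ((lba >> 24) & 0x0F));
    tf->command    = cmd;
    return 1;
}

/* Callback that returns the current status byte (in a kernel this
 * would be an inb() from the status or ALT_STATUS port). */
typedef uint8_t (*StatusReader)(void *ctx);

/* Poll until BSY clears and DRQ is set.
 * Returns 0 when ready, -1 on timeout, -2 if the device reports ERR. */
int ata_wait_ready(StatusReader read_status, void *ctx, uint32_t max_iters) {
    for (uint32_t i = 0; i < max_iters; i++) {
        uint8_t st = read_status(ctx);
        if (st & ATA_SR_BSY) continue;
        if (st & ATA_SR_ERR) return -2;
        if (st & ATA_SR_DRQ) return 0;
    }
    return -1;
}

/* Test double: reports BSY for *ctx more reads, then DRQ. */
uint8_t fake_status(void *ctx) {
    int *remaining = (int *)ctx;
    if (*remaining > 0) { (*remaining)--; return ATA_SR_BSY; }
    return ATA_SR_DRQ;
}
```

Bounding the loop matters on real hardware: without the iteration limit, a missing or hung drive would freeze the kernel at boot.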

Disk Layout

The HDD image uses a fixed layout:

Sectors     Content
0           MBR (Stage 1 bootloader)
1–4         Stage 2 bootloader
5–260       Kernel binary
261         Reserved
262–2309    FAT12 filesystem (1 MB)
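The arithmetic behind this layout can be checked with a few constants. The macro and function names below are invented for illustration; only the sector numbers come from the table, and a 512-byte sector size is assumed.

```c
#include <stdint.h>

#define SECTOR_SIZE      512u
#define KERNEL_START_LBA 5u
#define KERNEL_SECTORS   256u   /* sectors 5..260 */
#define FS_START_LBA     262u
#define FS_SECTORS       2048u  /* sectors 262..2309 */

/* Compile-time checks that the ranges in the table add up. */
_Static_assert(KERNEL_START_LBA + KERNEL_SECTORS - 1 == 260u,
               "kernel ends at sector 260");
_Static_assert(FS_START_LBA + FS_SECTORS - 1 == 2309u,
               "filesystem ends at sector 2309");
_Static_assert(FS_SECTORS * SECTOR_SIZE == 1024u * 1024u,
               "filesystem region is exactly 1 MB");

/* Translate a sector index inside the FAT12 region to an absolute LBA.
 * Returns 0 if the index lies outside the region. */
int fs_sector_to_lba(uint32_t fs_sector, uint32_t *lba) {
    if (fs_sector >= FS_SECTORS) return 0;
    *lba = FS_START_LBA + fs_sector;
    return 1;
}
```

The same numbers imply that the kernel binary can be at most 256 sectors, or 128 KB, before the layout has to change.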

New Shell Commands

  • sync — flush the in-memory filesystem to disk via diskfs_sync()
  • rm <filename> — delete a file and free its FAT clusters
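Freeing a file's clusters means walking its chain in the FAT and zeroing each entry. FAT12 packs two 12-bit entries into three bytes, which makes the accessors slightly fiddly. The following is a sketch of that standard packing; the helper names (fat12_get, fat12_set, fat12_free_chain) are hypothetical and are not claimed to match the functions in the Rainbow OS source.

```c
#include <stdint.h>

/* Read the 12-bit FAT entry for `cluster`. */
uint16_t fat12_get(const uint8_t *fat, uint16_t cluster) {
    uint32_t off = cluster + cluster / 2u; /* cluster * 1.5 */
    uint16_t pair = (uint16_t)(fat[off] | (fat[off + 1] << 8));
    return (cluster & 1u) ? (uint16_t)(pair >> 4) : (uint16_t)(pair & 0x0FFF);
}

/* Write the 12-bit FAT entry for `cluster` without disturbing the
 * neighbouring entry that shares the middle byte. */
void fat12_set(uint8_t *fat, uint16_t cluster, uint16_t value) {
    uint32_t off = cluster + cluster / 2u;
    if (cluster & 1u) {
        fat[off]     = (uint8_t)((fat[off] & 0x0F) | ((value << 4) & 0xF0));
        fat[off + 1] = (uint8_t)((value >> 4) & 0xFF);
    } else {
        fat[off]     = (uint8_t)(value & 0xFF);
        fat[off + 1] = (uint8_t)((fat[off + 1] & 0xF0) | ((value >> 8) & 0x0F));
    }
}

/* Free the chain starting at `first`; returns the number of clusters
 * released. Valid data clusters are 2..max_cluster. The walk stops at
 * an end-of-chain marker (>= 0xFF8), a free entry, or anything out of
 * range, so a corrupted chain cannot loop forever. */
int fat12_free_chain(uint8_t *fat, uint16_t first, uint16_t max_cluster) {
    int freed = 0;
    uint16_t c = first;
    while (c >= 2 && c <= max_cluster && c < 0xFF8) {
        uint16_t next = fat12_get(fat, c);
        fat12_set(fat, c, 0); /* 0 marks the cluster as free */
        freed++;
        c = next;
    }
    return freed;
}
```

Deleting a file additionally requires marking its directory entry as unused, which this sketch does not cover.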

M13: Subset-C Compiler

The most ambitious milestone yet: a C compiler that runs inside the OS, compiling C source files to native i386 machine code that executes directly on the CPU.

Compilation Pipeline

  1. Preprocessor (preproc.c) — basic macro expansion
  2. Lexer (lexer.c) — tokenizes source into keywords, identifiers (max 65 chars), literals, operators
  3. Parser (parser.c) — recursive-descent parser that drives code generation
  4. Code Generator (codegen.c) — emits raw i386 machine code, not assembly text

Supported C Subset

  • Types: int, char, void, pointers
  • Control flow: if/else, while, for, do/while, break, continue
  • Operators: full arithmetic, bitwise (&, |, ^, ~, <<, >>), comparison, logical (&&, ||, !), compound assignment (+=, -=, etc.), increment/decrement
  • Pointers: dereference (*), address-of (&), type casts like (char *)0xB8000
  • Functions: declarations, calls, return values, local variables via stack frames

Code Generation Details

The code generator emits position-dependent i386 machine code that is loaded at 0x200000 (2 MB). Its fixed limits are:

  • 8 KB code segment, 4 KB data segment, 2 KB string pool
  • 256 labels and 256 fixups for forward references
  • Standard x86 calling convention with proper stack frame prologues/epilogues

Usage from the shell:

cc hello.c        # compile to hello.bin
cc hello.c -r     # compile and run immediately

Current Limitations

  • No structs, enums, or unions
  • No dynamic memory allocation
  • No arrays (pointer arithmetic works as a substitute)
  • No floating point
  • Single-pass compilation with limited error recovery

Other Improvements

German QWERTZ Keyboard Layout

The keyboard driver was switched from US QWERTY to German QWERTZ, including proper Z/Y swap, umlauts (ü, ö, ä), and ß mapping. Two 128-byte scancode tables handle normal and Shift-modified keys.
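A two-table lookup of this kind might look like the sketch below. Only a handful of keys are filled in, the scancodes are from PS/2 scancode set 1, and the character codes for the umlauts and ß assume the CP437 code page used by VGA text mode. Both the encoding and the names are assumptions for illustration.

```c
#include <stdint.h>

/* CP437 codes for German letters (assumed output encoding). */
enum { CH_UE = 0x81, CH_AE = 0x84, CH_OE = 0x94,
       CH_UE_UP = 0x9A, CH_AE_UP = 0x8E, CH_OE_UP = 0x99, CH_SZ = 0xE1 };

/* Unshifted keys, indexed by make code. Unlisted entries are 0. */
static const uint8_t keymap_normal[128] = {
    [0x02] = '1', [0x03] = '2',
    [0x0C] = CH_SZ,             /* US '-' position */
    [0x10] = 'q', [0x11] = 'w', [0x12] = 'e', [0x13] = 'r', [0x14] = 't',
    [0x15] = 'z',               /* US 'y' position */
    [0x1A] = CH_UE,             /* US '[' position */
    [0x27] = CH_OE, [0x28] = CH_AE,
    [0x2C] = 'y',               /* US 'z' position */
    [0x39] = ' ',
};

/* The same keys with Shift held. */
static const uint8_t keymap_shift[128] = {
    [0x02] = '!', [0x03] = '"',
    [0x0C] = '?',
    [0x10] = 'Q', [0x11] = 'W', [0x12] = 'E', [0x13] = 'R', [0x14] = 'T',
    [0x15] = 'Z',
    [0x1A] = CH_UE_UP,
    [0x27] = CH_OE_UP, [0x28] = CH_AE_UP,
    [0x2C] = 'Y',
    [0x39] = ' ',
};

/* Translate a scancode to a character; returns 0 for key releases
 * (bit 7 set) and for keys without a printable mapping. */
uint8_t scancode_to_char(uint8_t scancode, int shift) {
    if (scancode & 0x80) return 0;
    return shift ? keymap_shift[scancode] : keymap_normal[scancode];
}
```

Keeping the two layers in separate 128-entry arrays means the lookup is a single index operation, at the cost of 256 bytes of table space.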

ACPI Shutdown

A new shutdown command performs a clean ACPI power-off by programming the PIIX4 power management controller:

  1. Set the PM base address via PCI configuration space (bus 0, device 1, function 3)
  2. Enable ACPI I/O space via PMREGMISC
  3. Write SLP_EN together with the SLP_TYP value for S5 to PM1a_CNT to enter sleep state S5 (soft off)

This works on QEMU’s i440FX/PIIX4 chipset emulation.
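The two values that do the real work in this sequence are the PCI configuration address and the word written to PM1a_CNT. The sketch below computes both without touching any ports. The function names are made up; the bit layouts follow the PCI configuration mechanism #1 and the ACPI PM1 control register, and the SLP_TYP encoding is taken as a parameter because it is platform-specific.

```c
#include <stdint.h>

/* Value to write to port 0xCF8 before accessing 0xCFC
 * (PCI configuration mechanism #1). */
uint32_t pci_config_address(uint8_t bus, uint8_t dev, uint8_t func, uint8_t reg) {
    return 0x80000000u                        /* enable bit */
         | ((uint32_t)bus << 16)
         | ((uint32_t)(dev & 0x1F) << 11)
         | ((uint32_t)(func & 0x07) << 8)
         | (uint32_t)(reg & 0xFC);            /* dword-aligned register */
}

/* Word to write to PM1a_CNT: SLP_TYP in bits 10-12, SLP_EN in bit 13.
 * The caller supplies the SLP_TYP encoding that means S5 on the
 * target chipset. */
uint16_t pm1_cnt_sleep(uint8_t slp_typ) {
    return (uint16_t)(((uint32_t)(slp_typ & 0x07) << 10) | (1u << 13));
}
```

For the device named above (bus 0, device 1, function 3), pci_config_address(0, 1, 3, 0x40) yields 0x80000B40, which selects the register holding the PM base address on the PIIX4.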

Editor Polish

Several smaller fixes improved the editor experience. The status bar no longer lets the filename bleed through behind command input, and selection highlighting in the visual modes was refined.

What’s Next

Rainbow OS has grown from a bare-metal bootloader into a self-contained development environment: you can write C code in the built-in editor, compile it with the built-in compiler, and run the resulting binary, all without leaving the OS. It is not yet self-hosting in the strict sense, since the subset-C compiler cannot build the OS itself. The next steps will focus on expanding the C compiler’s capabilities and adding more system calls for compiled programs to use.

The full source code is available on GitHub. Build it yourself with i686-elf-gcc and QEMU:

cmake -B build -DCMAKE_TOOLCHAIN_FILE=cmake/i686-elf-toolchain.cmake
cmake --build build
./scripts/run.sh