The Code Builder
A spec-driven code generation sandbox for a high-school CS course. Click any question below.
-
You write a specification in markdown; a small local language model synthesizes a single-file HTML page from it; you read the generated code, comment what you understand, and iterate.
-
There are three primary ways AI assists in the writing of code: vibe coding ("just make me a thing"), AI-assisted coding ("complete what I'm typing"), and spec-driven development ("here's a precise specification — build to it"). This tool trains the third.
Spec-driven development is the oldest of the three by decades and the most transferable to real engineering. Working software has been designed from specifications since the 1960s; what changed in 2022 is that the spec→code translation got cheap enough that the spec can be a living document instead of an archived one.
The skill you're building isn't “use AI to write code.” It's express your intent precisely enough that anyone — a person, a model, or future-you — can produce the right thing from it. That skill transfers to any AI tool, any year, any model size.
-
1. Markdown viewer (the spec)
Four panels: Project Overview, Structure (HTML), Style (CSS), and Behavior (JS). Each panel is its own auto-growing textarea. The four sections force you to decompose your intent into the structural categories a webpage actually has.
A START WITH dropdown at the top loads a starter template for one of five project types (landing page, navigation portal, long-form article, slide deck, form/signup) or a "Something else" open-ended start. Templates contain {{slots}} you fill in and {{a | b | c}} choice lists for design decisions.
2. Code viewer (the artifact)
Where the synthesized HTML/CSS/JS lands after you click SYNTHESIZE →. Rendered with the same VS Code Dark+ palette students will see if they ever open the real editor: blue tags, light-cyan attributes, orange strings, green italic comments. The colors come from Prism.js syntax highlighting sitting under a transparent textarea, so the code is fully editable while staying syntactically navigable.
3. Page viewer (the rendered result)
Lives below the fold. You have to scroll past the code viewer to reach it. That's intentional — if the page loaded automatically, you'd glance at it and skip reading the code. Reading the code is half the curriculum, so the rendered page gets put one scroll away.
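The editable-but-highlighted code viewer is a standard overlay pattern: a Prism-highlighted pre/code element sits underneath, and a transparent textarea with identical font metrics sits on top. A minimal sketch of the sync logic (element names are illustrative, and the DOM wiring only runs in a browser):

```javascript
// A trailing newline in a textarea renders as an extra visible line, but the
// <pre> underneath collapses it, so the overlay drifts by one line. Padding
// with a space keeps the two layers aligned.
function padForOverlay(text) {
  return text.endsWith("\n") ? text + " " : text;
}

// Browser-only wiring (textarea, codeEl, and pre are illustrative names):
// textarea.addEventListener("input", () => {
//   codeEl.textContent = padForOverlay(textarea.value);
//   Prism.highlightElement(codeEl);           // re-highlight the updated code
// });
// textarea.addEventListener("scroll", () => { // keep the layers scrolled together
//   pre.scrollTop = textarea.scrollTop;
//   pre.scrollLeft = textarea.scrollLeft;
// });
```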
-
Students who do their best work on this tool do something before they ever open it: they draw the page on graph paper. Top to bottom. Header here, hero there, three cards in a row, a form, a footer. The drawing doesn't have to be neat. The point is that the hardest decisions — what goes on the page, in what order, with what hierarchy — get made physically, with a pencil, before any spec gets typed.
When you then sit at the tool, the Structure (HTML) panel is mostly transcription: read what you drew, top to bottom, into a numbered list. The Project Overview is what the drawing is FOR. Style annotates the visual choices you made on paper (colors, tone, whether things feel airy or dense). Behavior notes the things you implicitly drew as clickable or moving.
Page mapping turns spec-writing from intimidating abstraction into reading-aloud-from-paper. It also gives you something to hold up and point at when you defend your spec in class.
-
- Write a spec into the four markdown panels (or load a template and fill the slots).
- Click SYNTHESIZE →. The spec goes to the LLM, which produces HTML/CSS/JS.
- Read the generated code in viewer 2.
- Click RENDER → on viewer 2. The page appears in viewer 3 below.
- Compare what you got to what you intended. Spot gaps.
- If the change is small, tweak the code directly. If the change is significant, revise the spec and synthesize again. The tool teaches the judgment of which to do when.
- Once you have something you're proud of, add your own comments to the code explaining what each section does. // CHECK COMMENTS // verifies your understanding without writing comments for you.
- The annotated code is your submission.
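Under the hood, the SYNTHESIZE step reduces to one chat-style request built from the four panels, assuming an OpenAI-compatible chat API like the one llama.cpp's server provides. A sketch, in which the system prompt, proxy path, and field values are illustrative rather than the tool's actual code:

```javascript
// Assemble the four spec panels into one chat-completion request body.
// The system prompt and temperature here are illustrative assumptions.
function buildSynthesizeRequest(spec) {
  const specText = [
    "# Project Overview\n" + spec.overview,
    "# Structure (HTML)\n" + spec.structure,
    "# Style (CSS)\n" + spec.style,
    "# Behavior (JS)\n" + spec.behavior,
  ].join("\n\n");
  return {
    messages: [
      { role: "system", content: "Produce one complete single-file HTML page from the spec. Output only code." },
      { role: "user", content: specText },
    ],
    temperature: 0.3, // low temperature: follow the spec, don't improvise
  };
}

// In the browser, the Worker proxy would forward this (path illustrative):
// fetch("/api/synthesize", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(buildSynthesizeRequest(spec)),
// });
```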
-
Per-section // ASK AI //
Each spec section has its own ASK AI button. Returns 3–6 Socratic questions about that section only. Aware of which template you picked (asks landing-page-specific questions if you picked a landing page). Aware of unreplaced {{slots}} in template content (walks you through each placeholder).
Whole-spec // REVIEW //
Critiques the full four-section spec. Flags vague language, missing decisions, contradictions between sections, and scope mismatches with your chosen project type.
// CHECK COMMENTS // on viewer 2
After you add your own comments to the generated code, this button asks the AI to check (a) whether each comment is written with valid syntax for its language context, and (b) whether each comment accurately describes the surrounding code. It will NOT write or rewrite any of your comments and will NOT fix any code. It's a second pair of eyes on your reading.
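The slot awareness is simple string work. A sketch of how unreplaced placeholders might be detected, with the function name and return shape being hypothetical rather than the tool's actual internals:

```javascript
// Find unreplaced {{slot}} placeholders and {{a | b | c}} choice lists in a
// spec panel. Hypothetical helper; the real tool may differ.
function findPlaceholders(specText) {
  const matches = [...specText.matchAll(/\{\{([^{}]+)\}\}/g)];
  return matches.map((m) => {
    const inner = m[1].trim();
    return inner.includes("|")
      ? { kind: "choice", options: inner.split("|").map((s) => s.trim()) }
      : { kind: "slot", name: inner };
  });
}
```

For example, findPlaceholders("Use a {{warm | cool}} palette for {{site name}}.") yields one choice list and one named slot, which is enough for ASK AI to walk the student through each placeholder in turn.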
What the AI is forbidden from doing
- It will not write spec content for you. (Templates and slots are the only scaffolding it provides.)
- It will not pick design choices for you (no “use #2d4a2b for primary”).
- It will not write or rewrite your comments.
- It will not fix your code.
- It will not explain what well-named code already says.
These are not technical limits — they are curriculum guardrails enforced in the system prompts. The point of the tool is for you to practice the work; the AI is an interviewer, critic, and verifier, never a co-author.
-
This tool talks to a local LLM running on a single GPU in a closet, not to a cloud API. Two reasons that matter: cost (a class hitting an API would burn through budget fast) and pedagogy (a small local model has less slack, so vague specifications produce visibly weaker output — the gap teaches precision).
The current rig is an RTX 4090 (24 GB VRAM) running llama-server from the llama.cpp project. The model is Qwen3-8B as the target with Qwen3-1.7B as a speculative draft. The 4090 serves student traffic; a future RTX 5070 Ti will handle teacher-side work (assessment, focused writing feedback) on its own.
-
Modern LLMs generate text one token at a time. Each token requires running the model's full forward pass. That makes large models slow.
Speculative decoding cheats that step. A small “draft” model (Qwen3-1.7B here) proposes the next several tokens cheaply. Then the target model (Qwen3-8B) checks all of them in a single forward pass: it accepts as many as match its own prediction and rejects the rest. If the draft was right about most of them, you collected several tokens for the cost of one. Typical speedup on code generation with this pairing is roughly 1.8x-3x.
The trick only works when the two models share a tokenizer and produce similar token distributions — which is why we pair Qwen3-1.7B with Qwen3-8B rather than mixing families.
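The accept/reject loop can be sketched with toy greedy models. This is a simplification: real speculative sampling verifies full probability distributions, not just top tokens, but the control flow is the same.

```javascript
// Toy speculative decoding. `draft` and `target` are greedy next-token
// functions (prefix array in, one token out). The draft proposes k tokens;
// the target verifies them in one (simulated) pass, keeping the longest
// agreeing prefix and substituting its own token at the first mismatch.
function speculativeStep(draft, target, prefix, k) {
  const proposals = [];
  let ctx = prefix.slice();
  for (let i = 0; i < k; i++) {      // cheap draft pass: propose k tokens
    const t = draft(ctx);
    proposals.push(t);
    ctx.push(t);
  }
  const emitted = [];
  ctx = prefix.slice();
  for (const t of proposals) {       // single target verification pass
    const want = target(ctx);
    emitted.push(want);
    ctx.push(want);
    if (want !== t) return emitted;  // first mismatch: keep target's token, stop
  }
  emitted.push(target(ctx));         // all accepted: target adds one bonus token
  return emitted;
}
```

With a target that continues ["the", "quick", "brown", "fox"] and a draft that guesses "red" instead of "brown", one step emits three tokens for a single verification pass, which is where the speedup comes from.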
-
This is the first usable version of the tool. The architecture works: students can write specs, get code, read code, write comments, and have those comments checked. The infrastructure runs (Cloudflare Pages for the front-end, a Cloudflare Worker proxy talking to the local llama-server, KV-backed per-student credentials, HMAC-signed session cookies).
What v0.1 does NOT yet do, but v0.2 will look at:
- Publish-to-GitHub — clicking a button puts a student's page at a real public URL under their handle, on a class repo.
- Per-student rate limiting — making sure no single student saturates the GPU.
- Spec history / diff view — seeing what you changed between revisions.
- Automated load-checks — the teacher sees which student pages load without errors before reading any of them.
- Inline syntax-error markers in the code viewer — VS Code-style red squigglies under unclosed tags, mismatched braces, typos, etc. Right now syntax errors only show up when you click RENDER and the page misbehaves. v0.2 will catch them at the editor.
- Possible model upgrades as Qwen and other open-weight families release newer instruction-tuned models.
-
Built by Sean Muggivan as a teaching tool for high-school computer science. If you want access to try it, email sean@muggivanlcsw.me.