Hackathon Playbook

The Two-Hour Hackathon

Ship a real working thing
in 120 minutes.

A working playbook for using Claude Code, a knowledge base, and a small army of agents to go from a blank repo to a deployed, demoable product inside one hackathon block. Built from doing it — not from describing it.

120 minutes total
7 phases
3–5 agents in parallel
1 live deploy

The shape of two hours

Don't wing it — ride this curve.

Hackathons that ship follow the same arc every time. Setup is fast and unglamorous. The middle is where you spend the most time and the most discipline. The last 15 minutes are deploy + smoke-test, not building.

0:00 · Setup · 10 min
0:10 · Brief the AI · 10 min
0:20 · Spawn agents · 5 min
0:25 · Prototype · 40 min
1:05 · QA loop · 25 min
1:30 · Polish · 15 min
1:45 · Deploy · 15 min
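
The start times follow from the durations alone, so the schedule is easy to sanity-check before the clock starts. A minimal Python sketch, using the phase names and minutes from the timeline above (the `schedule` helper is an illustrative name, not part of any tool):

```python
# Phase names and durations (minutes) as listed in the timeline above.
PHASES = [
    ("Setup", 10),
    ("Brief the AI", 10),
    ("Spawn agents", 5),
    ("Prototype", 40),
    ("QA loop", 25),
    ("Polish", 15),
    ("Deploy", 15),
]


def schedule(phases):
    """Return (rows, total): rows are (start, name, minutes), start as H:MM."""
    rows, elapsed = [], 0
    for name, minutes in phases:
        rows.append((f"{elapsed // 60}:{elapsed % 60:02d}", name, minutes))
        elapsed += minutes
    return rows, elapsed


rows, total = schedule(PHASES)
for start, name, minutes in rows:
    print(f"{start}  {name:<13} {minutes} min")
print(f"Total: {total} min")  # the seven durations add up to 120
```

If you stretch one phase, rerun it and see which later phase has to shrink to keep the total at 120.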

Phase by phase

Seven moves, two hours.

Each phase has a single job. Each phase has a clear "done" signal. If you find yourself blowing past a phase's time, you've slipped scope — cut, don't extend.

1. Set up your environment

0:00 · 10 minutes

Goal

A fresh repo, Claude Code running in it, your knowledge-base MCPs connected, deploy target wired up. Empty but live.

What to do

  • gh repo create & clone
  • Run claude, then /init to seed CLAUDE.md
  • Add MCP for your KB (Notion / ClickUp / Outline)
  • Add MCP for deploy target (Vercel / Fly)
  • Push an empty index.html and deploy — prove the pipe

Why this matters

The single biggest hackathon killer is "we couldn't deploy at the end". Deploy something boring by minute 10 so the path is paved.

Done signal

A live URL returning your placeholder. Git is clean. Claude has a project memory.

2. Brief the AI from your knowledge base

0:10 · 10 minutes

Goal

Claude understands your domain, your constraints, and what "good" looks like — without you typing 2,000 words of context.

What to do

  • Point Claude at the brief in your KB (URL or page)
  • Ask it to summarise what it understood
  • Correct any misunderstandings — this is the highest-leverage minute of the day
  • Have it write the refined brief to CLAUDE.md

Why this matters

The brief becomes the agent's compass. Every later agent inherits this context. Fix it once, never re-explain.

Done signal

Claude can answer "what are we building, for whom, by when, and what does done look like?" in three sentences.

3. Spawn parallel agents

0:20 · 5 minutes

Goal

Stop being a bottleneck. Run a scoping agent, a design agent, and a research agent in parallel while you drive the build.

What to do

  • Scoping agent: "Cut this idea down to what's buildable in 90 minutes."
  • Design agent: "Generate the minimum visual identity — palette, fonts, one logo mark."
  • Research agent: "Find 3 existing tools that solve adjacent problems; report what they do well."

Why this matters

Agents are cheap. Your attention is not. Anything that doesn't need you can run while you build.

Done signal

Three background agents kicked off, returning useful artifacts by the time you need them.

4. Build the prototype

0:25 · 40 minutes

Goal

Something demoable. Not pretty. Not handling every edge. The shortest path to a working flow you can click through end-to-end.

What to do

  • Drive Claude to build the happy path first — one screen, one API call, one output
  • Validate the deploy still works after every meaningful change (keep the pipe paved)
  • Resist new ideas; queue them in your KB as "v2"

Why this matters

The prototype's job is to prove the idea, not to be the product. Edge cases come later. Tests come later.

Done signal

You can demo the happy path on the live URL without flinching.

5. Automated QA loop

1:05 · 25 minutes

Goal

Bugs found and fixed before the demo. A QA agent that tests, logs findings to your KB, and keeps running while you're polishing.

What to do

  • Spin up a QA agent: feed it the brief, the deployed URL, and an empty Notion/ClickUp database to log rows into
  • It should test edge cases, write findings to the DB, mark pass/fail
  • You triage: fix the real ones, defer the rest

Why this matters

Manual QA is the silent time-sink that destroys hackathon timelines. Delegate it and triage.

Done signal

A QA log with rows. Critical ones green. Known-deferred ones flagged.
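
The done signal can be written down as a small check. This is a sketch under assumptions: the `QARow` fields and the `done_signal` name are hypothetical, and the severity labels are borrowed from the QA agent prompt later in this playbook. Your KB schema will differ.

```python
from dataclasses import dataclass


@dataclass
class QARow:
    test: str               # one-line description of what was tested
    passed: bool
    severity: str = ""      # "critical", "important", or "cosmetic" when failed
    deferred: bool = False  # True once you have consciously parked the issue


def done_signal(rows):
    """True when the log has rows, no critical failure is open,
    and every remaining failure has been explicitly deferred."""
    if not rows:
        return False
    failed = [r for r in rows if not r.passed]
    if any(r.severity == "critical" for r in failed):
        return False
    return all(r.deferred for r in failed)


log = [
    QARow("Home route returns 200", True),
    QARow("Oversized input shows an error", False, "cosmetic", deferred=True),
]
print(done_signal(log))  # True
```

Note the choice made here: a critical failure blocks the signal even if someone marks it deferred. Loosen that only on purpose.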

6. Production polish

1:30 · 15 minutes

Goal

The 15 minutes that turn "works on my machine" into "I'd be proud to show a stranger this".

What to do

  • Empty states, error messages, loading spinners
  • Copy pass — verbs over nouns, no placeholder text
  • One favicon, one OG image, one decent title
  • Mobile check — resize the window

Why this matters

These small things are what people remember. The work feels like a product when you do them, like a demo when you don't.

Done signal

You can scroll the page on your phone and not wince once.

7. Deploy + smoke-test

1:45 · 15 minutes

Goal

The final deploy is boring because you've deployed 8 times today. Smoke-test, screenshot, log the run.

What to do

  • git push → auto-deploy or vercel --prod
  • Curl every route, expect 200s
  • Click through the happy path one more time
  • Write the final QA log row: "Demo-ready, deployed, URL: ..."

Why this matters

The boring deploy at minute 115 is what made the deploy at minute 10 worth doing. You've earned the calm.

Done signal

Live URL. All routes 200. The story for the demo is in your head.
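
If you would rather script the "every route, expect 200" step than curl by hand, here is one possible shape using only the Python standard library. The function names and the example URL and routes are placeholders, not anything this playbook prescribes. The fetcher is injectable so the demonstration at the bottom runs without touching the network.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for a GET request to url.

    Redirects are followed, so this reports the final status. Connection
    failures (urllib.error.URLError) are left to propagate.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx arrive as exceptions; keep the code


def smoke_test(base_url, routes, fetch=http_status):
    """Map each route to its status code; fetch is injectable for testing."""
    base = base_url.rstrip("/")
    return {route: fetch(base + route) for route in routes}


def failures(results):
    """Routes that did not return 200."""
    return {route: code for route, code in results.items() if code != 200}


# Offline demonstration with a stub fetcher (hypothetical URL and routes).
# Drop the fetch argument to hit a real deployment.
fake = {"https://example.test/": 200, "https://example.test/api/missing": 404}
results = smoke_test("https://example.test/", ["/", "/api/missing"], fetch=fake.get)
print(failures(results))  # {'/api/missing': 404}
```

An empty dict from `failures` is the "all routes 200" half of the done signal. The click-through of the happy path is still on you.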

Multi-agent workflow

Stop being the bottleneck.

Hackathons feel impossible when you're the only one moving. They feel inevitable when four agents are working in parallel and you're orchestrating. Here are the four roles to spin up.

Scoping agent

Decides what's in for the 2-hour window. The first agent you spawn. Saves you from yourself.

What it does

  • Reads the brief from your KB
  • Lists every feature anyone could want
  • Cuts ruthlessly to the demo-critical core
  • Writes the cut list back to the KB as "v2 candidates"

Starter prompt

# Scoping agent
You are scoping a 2-hour hackathon build of: "<idea>".
The team is 1 person + Claude Code.
The brief lives at <notion-url> — read it first.

List every feature anyone might want.
Then cut to the smallest demoable thing — one screen,
one input, one output. Defer the rest.

Output two markdown sections: "In for v1" and
"Deferred to v2". Write both back to the KB.

Build agent (you + Claude Code)

The main loop. You drive, Claude writes. Tight feedback cycles — never let the diff get larger than you can hold in your head.

Pattern

  • Ask for one screen / one endpoint at a time
  • Run it, click it, see it
  • Commit when it works (not when it's perfect)
  • Re-deploy every ~15 minutes — catch infra drift early

Starter prompt

# Build kickoff
Read CLAUDE.md and the "In for v1" section
the scoping agent wrote.

Build the happy path only. One screen, one
backend call, one output. Static HTML + a single
serverless function is the right ceiling for v1.

Match the visual identity in tokens.css if it exists,
otherwise be conservative: clean type, generous whitespace,
one accent color, no decoration.

When the happy path runs locally, commit and deploy.
Tell me what to click to demo it.

QA agent

Runs in the background from minute 65 onwards. Tests against the deployed URL, logs findings to your KB, doesn't bother you with cosmetic stuff.

What it tests

  • Every route returns 200
  • The happy path completes end-to-end
  • Empty input, oversized input, hostile input
  • Mobile viewport doesn't break the layout
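
For the empty, oversized, and hostile cases, it helps to hand the agent a concrete starting set. A hedged sketch: the payload names, the specific strings, and the `max_len` limit are all illustrative assumptions. This is a hackathon-grade poke, not a security test suite.

```python
def edge_case_payloads(max_len=10_000):
    """Named inputs covering the empty, oversized, and hostile cases.

    max_len is an assumed input limit; set it to whatever your form accepts.
    """
    return {
        "empty": "",
        "whitespace_only": "   \n\t",
        "oversized": "A" * (max_len + 1),          # one character past the limit
        "html_injection": "<script>alert(1)</script>",
        "sql_like": "' OR '1'='1",
        "unicode_mix": "na\u00efve \u2603 \u202e reversed",  # accents, symbol, RTL override
    }


for name, payload in edge_case_payloads(max_len=50).items():
    print(f"{name}: {len(payload)} chars")
```

Each entry becomes one row in the QA log: send it, record what the app did, mark pass or fail.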

Starter prompt

# QA agent
Test the deployed app at <url>. The brief is in CLAUDE.md.

For each test:
  1. Describe what you tested in one line
  2. Run it (curl, headless browser, or visual check)
  3. Pass or fail
  4. If fail, severity: critical / important / cosmetic

Log every test as a row in the QA database at <notion-url>.

When done, return a 5-line summary:
  - Total tests
  - Critical failures (list them)
  - Important failures (list them)
  - Cosmetic (count only)
  - Overall verdict
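
The five-line summary the prompt asks for is simple enough to compute from the logged rows if you export them. A minimal sketch, assuming each row is a dict with "test", "passed", and, for failures, "severity". Those keys, the `summarise` name, and the verdict rule (any critical failure means not demo-ready) are assumptions, not something the prompt defines.

```python
def summarise(rows):
    """Build the five-line summary from a list of QA row dicts."""
    failed = [r for r in rows if not r["passed"]]
    by_sev = {
        sev: [r["test"] for r in failed if r.get("severity") == sev]
        for sev in ("critical", "important", "cosmetic")
    }
    # Assumed rule: a single open critical failure blocks the demo.
    verdict = "not demo-ready" if by_sev["critical"] else "demo-ready"
    return "\n".join([
        f"Total tests: {len(rows)}",
        f"Critical failures: {', '.join(by_sev['critical']) or 'none'}",
        f"Important failures: {', '.join(by_sev['important']) or 'none'}",
        f"Cosmetic: {len(by_sev['cosmetic'])}",
        f"Overall verdict: {verdict}",
    ])


rows = [
    {"test": "Home route returns 200", "passed": True},
    {"test": "Empty input is rejected", "passed": False, "severity": "important"},
    {"test": "Footer wraps on mobile", "passed": False, "severity": "cosmetic"},
]
print(summarise(rows))
```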

Deploy agent

The unsung hero. Owns the deploy pipeline, sanity-checks env vars, runs the final smoke test, writes the launch note.

What it does

  • Verifies env vars before deploy
  • Runs the deploy command (Vercel / Fly / Render)
  • Curls every route post-deploy
  • Writes the launch note to the KB

Starter prompt

# Deploy + smoke-test
You are deploying the current branch to production.

  1. Run vercel env ls and confirm every var in
     .env.example has a production value.
  2. vercel --prod. Wait for READY.
  3. Curl every route in routes.md; expect 200.
  4. Hit the happy path with a real payload; expect 200.
  5. Write a launch note to the KB at <notion-url>:
       URL, deploy time, routes tested, payload tested,
       known issues (link to QA-DB rows).

If anything fails, halt and report — do not roll forward.
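
Step 1 is a set difference: names declared in .env.example minus names that exist in production. A sketch under assumptions: the variable names in the demo are made up, and the list of production names is something you copy from the vercel env ls output yourself, since this sketch does not try to parse that output.

```python
def required_vars(env_example_text):
    """Variable names declared in a .env.example file (comments and blanks skipped)."""
    names = set()
    for line in env_example_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        names.add(name)
    return names


def missing_in_production(env_example_text, production_names):
    """Names required by .env.example that production does not define."""
    return sorted(required_vars(env_example_text) - set(production_names))


# Hypothetical file contents and production variable list.
example = "# Notion\nNOTION_TOKEN=\nSYNC_SECRET=changeme\n"
print(missing_in_production(example, ["NOTION_TOKEN"]))  # ['SYNC_SECRET']
```

A non-empty result is a reason to halt before step 2. Note that this checks presence by name only, not that the production value is correct.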

Worked example

This site is the case study.

The page you're reading right now — and the retreat hub it links from — was built using the exact playbook above. Here's what it actually took.

Case study · Helix Robotics Leadership Retreat

From Notion brief to live, Notion-driven intake form — in a working day.

The brief lived in a Notion page (URLs, repo layout, env vars, brand). One human + Claude Code built nine HTML pages, four serverless API endpoints, three Notion databases (Questions, Responses, QA), and a bidirectional sync command that lets a non-coder edit the form prompts in Notion and have the live form update on reload — no redeploy needed.

9 HTML pages
4 API endpoints
3 Notion DBs
23 synced questions
100% QA pass rate

How it unfolded

Brief: Pointed Claude at the Notion pick-up guide. It produced a 3-sentence summary; corrected one assumption.
v0 deploy: Cloned the existing repo, ran vercel link, deployed an empty change. Pipe was paved.
Build: Built 6 retreat-hub pages, then the dynamic Notion-driven intake form, in tight cycles. Deployed after each one.
QA: Every meaningful change logged to a Notion QA database. 5 rows, all Pass. Followups captured for non-blockers.
Sync: Built a secret-protected /api/sync-questions endpoint so the team can edit prompts in Notion and run a one-line curl to align everything.
Polish: Added back-link from intake to retreat hub when a user reported the nav disappeared. Five-minute fix, redeployed.
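
The case study does not say how the sync endpoint checks its secret, so treat the following as one common pattern rather than a description of the real code: compare a header value against a configured secret in constant time. The header name `x-sync-secret` and the `is_authorised` function are hypothetical.

```python
import hmac


def is_authorised(headers, expected_secret, header_name="x-sync-secret"):
    """Constant-time check of a shared secret sent in a request header.

    headers is assumed to be a dict with lower-cased keys.
    """
    if not expected_secret:
        return False  # refuse everything if the secret was never configured
    supplied = headers.get(header_name, "")
    return hmac.compare_digest(supplied.encode(), expected_secret.encode())


print(is_authorised({"x-sync-secret": "s3cret"}, "s3cret"))  # True
print(is_authorised({}, "s3cret"))                           # False
```

Using `hmac.compare_digest` instead of `==` avoids leaking how many leading characters matched through response timing.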

Read the full retreat hub at /retreat2026 or jump straight to the Hackathon exercise card.

Ideas to steal

Eight things you can ship in two hours.

Every one of these is genuinely buildable in 120 minutes with Claude Code, the right MCPs, and a working deploy target.

Internal

Field Tech Pocket Assistant

Mobile-first chat over your install & spec PDFs. A technician on a roof asks "what's the airflow rating for the X-200?" and gets the answer with the page reference.

Stack: Claude + RAG + Vercel
Demo win: 10 real questions, right answers

Data

Customer Health Dashboard

Pulls Stripe + last contact + open tickets per customer → green / yellow / red. CEO sees "who needs attention this week" in one screen.

Stack: Stripe + CRM API + Next
Demo win: Real customer list, ranked

Customer-facing

Site Survey → Quote PDF

Upload site photos and a few notes → Claude drafts a quote with recommended cooler specs and outputs a branded PDF the sales engineer can send.

Stack: Claude vision + HTML→PDF
Demo win: One real quote from one real survey

Automation

Daily Pipeline Digest (Slack)

Every morning at 8am, posts to #leadership: new deals, slipping deals, customers waiting on us, this week's revenue. Cron + a short prompt.

Stack: Slack + CRM + Vercel cron
Demo win: Digest lands Monday morning

Internal

Ask Helix Robotics

RAG chatbot over all internal docs — policies, specs, onboarding, finance basics. Anyone in the company can ask without bothering a human.

Stack: Claude + embeddings + Vercel KV
Demo win: An HR question answered correctly

Automation

PR Review Buddy

Every GitHub PR gets an auto-comment with a one-paragraph summary, three suggestions, and a risk flag if the diff touches anything spicy.

Stack: GitHub webhook + Claude
Demo win: First real PR gets a useful comment

Automation

Meeting → Action Items

Fireflies or Zoom transcript → Claude extracts action items, assigns owners, files them as tasks in ClickUp / Notion with due dates.

Stack: Fireflies + Claude + ClickUp
Demo win: A real meeting becomes real tasks

Customer-facing

Voice Memo → Spec Doc

Walk-and-talk a feature idea into your phone → Claude turns the recording into a clean PRD with sections, drops it in Notion ready for review.

Stack: Whisper + Claude + Notion API
Demo win: 3-min voice memo → usable spec

Copy & paste

Five prompts that do most of the work.

Open a fresh Claude Code session, paste, replace the angle-bracketed parts, run. These are the starting points — you'll iterate from there.

1. Kickoff

Minute 0–10. After git init and claude /init.

# Hackathon kickoff
Read the brief at <notion-or-clickup-url>.

Summarise in 3 sentences: what we're building, for whom,
and what success looks like for the demo.

Then update CLAUDE.md with:
  - one-paragraph project brief
  - the deploy target and live URL convention
  - which MCPs we'll use (KB, deploy, anything else)
  - a "Scope is sacred" reminder for both of us

2. Spawn the scoping agent

Minute 10–20. While you set up the dev loop.

# Scoping agent
Spawn a sub-agent. Brief:
  - Read CLAUDE.md.
  - List every feature the brief implies.
  - Cut to the smallest demoable thing — one input,
    one output, no auth, no settings.
  - Write two markdown sections back to the KB:
      "In for v1" and "Deferred to v2"
  - Return a 5-line summary.

Do not start building. Plan only.

3. Build the prototype

Minute 25–65. The main event.

# Prototype loop
Build only the "In for v1" features from the scoping doc.

Constraints:
  - Static HTML + one serverless function is the ceiling
  - One screen, one happy path, no edge cases yet
  - Match the visual identity in tokens.css (if it exists)
  - Commit when it works, not when it's perfect

After every meaningful change:
  - run it locally
  - if it works, vercel --prod
  - tell me what to click to demo it

4. Run the QA agent

Minute 65–90. Background, in parallel with polish.

# QA agent (run in background)
Test <production-url> against the brief in CLAUDE.md.

For each test:
  1. One-line description
  2. Run (curl / headless browser / visual)
  3. Pass or Fail
  4. If fail: critical / important / cosmetic

Log each row to the QA database at <notion-db-url>.

Return summary:
  total / critical failures / important failures /
  cosmetic count / overall verdict.

5. Ship + write the launch note

Minute 105–120. The final move.

# Ship it
Final deploy + smoke test.

  1. vercel env ls — confirm prod vars are set
  2. vercel --prod
  3. Curl every route, expect 200
  4. Walk the happy path with a real payload
  5. Write the launch note to the KB:
       - URL
       - What it does in one sentence
       - Three things it does well
       - Three known limitations
       - Link to the QA DB rows

Halt and report if anything fails. Do not roll forward.

Ready for your two hours?

Open Claude Code in a fresh folder, paste prompt #1, and start the clock.
