Claude AI Workflow · Real Case Study

How to Use Claude AI Effectively

Q: How do you write good prompts for Claude?

The best Claude prompts are specific, not clever. State the outcome, paste real artifacts (not summaries), list hard constraints, and include any prior failures so the model doesn't repeat them. Then iterate with precise corrections, not vague 'try again' prompts.

Most people are using Claude wrong. They treat it like a chatbot. I use it like a production system — and it ships hours of real work every week.

By Scott Willis · April 10, 2026 · ~14 min read

This guide is for people who already opened a Claude tab, got an answer that almost worked, and walked away thinking AI is overhyped. It's not overhyped. You're using it like a search engine when you should be using it like a junior engineer who never sleeps.

Below is the exact workflow I use to ship production code, file pro se legal documents, run a serverless honeypot, and generate organic content with zero ad spend. The same loop, every time, across every domain.

What is Claude AI? Claude is a large language model built by Anthropic for reasoning, coding, long-form writing, and tool use. The most effective way to use Claude is not "better prompting" — it's wrapping the model in a structured workflow with real artifacts, real tests, and tight feedback loops. Prompting is Stage 1. System design is Stage 2. This guide is about Stage 2.

What Most People Get Wrong About Claude

Three failure modes account for almost every "Claude isn't that useful" complaint I see online:

They one-shot it. They paste a vague request, take the first answer, and use it. No iteration, no verification, no second pass. Then they're surprised when the output is generic.
They feed it summaries instead of artifacts. They describe their problem in English when they should paste the actual error, the actual query, the actual log line. Summaries strip out the exact signal Claude needs.
They treat the chat as the system. They run 40-message threads until compaction breaks them. The chat is a tool. The system is your workflow around the chat — your scoped tasks, your context files, your verification loop.

If you fix only those three things, your Claude experience improves more than switching models ever will.

My Actual Claude Workflow (Step-by-Step)

People keep asking me for my "prompt template." There isn't one. The thing that matters is the loop, not the wording. Here it is, in full:

Define the outcome → Claude drafts → Test against reality → Specific feedback → Ship or repeat.

The critical insight: most people stop at step 2. They get a draft and try to use it. The real value is in steps 3 and 4 — testing against your actual environment and giving feedback precise enough that the next iteration converges instead of wandering. That's it. That's the engine. Below is what each step actually means in production.

1. Define the outcome (before you type a single word)

This is the single most important step and the one people skip. You must know exactly what "done" looks like in production reality. Not "a good query." Not "a legal-looking brief." Done means: copy-paste into the live environment with zero errors, or filed with zero procedural defects, or a function that logs real probes and blocks nothing legitimate.

Write the definition in plain English first: "After this loop, I will have a single script that replaces the legacy ones, runs in under two seconds on a 10k-row dataset, and requires zero user training." A model is a world-class pattern-matcher. If the target pattern is fuzzy, the output is fuzzy. Clarity here is the entire game.

2. Claude drafts

Feed the crystal-clear outcome plus all the real data — never summaries. Paste the actual query text, the actual docket entries, the actual log lines, the actual analytics exports. State your hard constraints up front. One-shot draft. No hand-holding yet. Let it go full creative. You'll fix it in the next steps.

3. Test against reality (this is where most AI users fail)

You become the merciless QA department. Run the code in a real dev environment. File the draft in test mode. Deploy the function and hit it with real traffic. Check the indexing tools for the new content.

Document every failure with surgical precision. Do not say "it's wrong." Say: "Line 47 throws an invalid handle on the temp buffer because the query uses an alias that only exists in the older code path."

4. Specific feedback (the convergence engine)

This is the art. Your feedback must be so precise that the next draft cannot possibly make the same mistake.

Bad: "Make it better."
Good: "The script fails on records where the invoice number contains a hyphen because you assumed all invoice numbers are numeric. Add a string check and handle that case explicitly."
Even better: "Reproduce the exact error I just saw, then fix it, then show me the before/after diff so I can verify."

You are not retraining the model. You are loading your exact domain reality into its context in real time, and that only works if your descriptions are reproducible.
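One way to keep a failure description reproducible is to capture it mechanically instead of retyping it from memory. The sketch below is illustrative only: `parse_invoice_number` is a hypothetical buggy draft that mirrors the hyphenated-invoice example, and `failure_report` is an assumed helper name, not part of any library.

```python
import traceback


def parse_invoice_number(raw: str) -> int:
    """Hypothetical buggy draft: assumes every invoice number is numeric."""
    return int(raw)


def failure_report(func, arg) -> str:
    """Run func(arg) and describe any failure in reproducible terms."""
    try:
        func(arg)
    except Exception as exc:
        # Keep only the innermost frame: that is where the draft actually broke.
        frame = traceback.extract_tb(exc.__traceback__)[-1]
        return (
            f"Input: {arg!r}\n"
            f"Failed in: {frame.name}(), line {frame.lineno}\n"
            f"Error: {type(exc).__name__}: {exc}\n"
            "Request: reproduce this error, fix it, and show the before/after diff."
        )
    return f"Input: {arg!r}\nNo error raised."


report = failure_report(parse_invoice_number, "INV-1042")
print(report)
```

The report carries the exact input, the failing function, and the raw error text, which is the level of detail the "Good" and "Even better" examples ask for.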

5. Ship or repeat

Two choices only. Ship = it passes your production test with zero caveats. Repeat = go straight back to step 3 with the new failure data. No "maybe one more prompt." The loop is sacred.
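The control flow of the loop is small enough to sketch in a few lines. This is a minimal illustration under stated assumptions, not a real integration: `fake_draft` and `fake_test` are hypothetical stand-ins for the model call and the production test, and `max_rounds` is an assumed cap that plays the role of a time box.

```python
from typing import Callable, Optional


def run_loop(
    draft: Callable[[Optional[str]], str],
    test: Callable[[str], Optional[str]],
    max_rounds: int = 5,
) -> tuple[str, int]:
    """Draft, test, feed back, and repeat until the test passes.

    draft(feedback) returns a new attempt; test(attempt) returns None on a
    pass or a precise failure description on a fail.
    """
    feedback = None
    for round_number in range(1, max_rounds + 1):
        attempt = draft(feedback)
        feedback = test(attempt)
        if feedback is None:
            return attempt, round_number  # ship
    raise RuntimeError("Not converging: go back and redefine the outcome.")


# Hypothetical stand-ins for the model and the production test.
def fake_draft(feedback):
    return "handles hyphens" if feedback else "numeric only"


def fake_test(attempt):
    if attempt == "handles hyphens":
        return None
    return "Fails on 'INV-1042': hyphen not handled."


result, rounds = run_loop(fake_draft, fake_test)
print(result, rounds)
```

The only exits are a passing test or an explicit failure to converge, which matches the two-choice rule: there is no branch for "looks close enough."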

You keep repeating until the output is production-ready every single time. That's why the audit work delivered real measurable gains, why the filings forced substantive responses from experienced opposing counsel, and why the honeypot logs hostile probes while serving clean content to humans.

Why This Loop Crushes Every Other AI Workflow

It forces you to stay the domain expert. You define done. You test reality. The model can't hide behind vagueness because you removed it.
It turns hallucinations into fuel. Every failure becomes input for the next iteration.
It compounds across domains. The more loops you run on legal filings, the faster honeypot analysis converges, because your feedback style is now elite.
It's model-agnostic. Swap Claude for Grok, Gemini, GPT — the loop still works because the human is the constant.

Pro Tips to Make Your Loop Tighter

Keep a "war room" context file. Paste it at the start of every new session: definition of done, previous failures, system constraints. Externalized memory beats hoping the model remembers.
Build a personal pattern library. After every successful ship, save the final output plus the exact feedback that got it there. That library is more valuable than any prompt template you'll find online.
Time-box every loop to 15–20 minutes. If it isn't converging, the outcome wasn't defined clearly enough. Go back to step 1. Don't keep prompting.

That's the entire methodology. It's boring, unsexy, and brutally effective — exactly like real engineering. Everything else in this article is what happens when you run this loop, hard, against real problems.

Professional: A 27% Performance Gain Across Hundreds of Database Views

I work in enterprise distribution software that runs on a relational database. Custom views accumulate over years. Reports get slow. The interesting question is never "is this view slow" — it's "is the join order wrong, is there a non-sargable predicate, is the index even being used."

Working with Claude on the audit, I batched the changes into discrete change sets and treated each one as a unit of work with a baseline, a hypothesis, and a rollback. Two patterns came up over and over:

Date math wrapped around an indexed column in a WHERE clause kills sargability. Rewriting it so the column appears bare on one side makes the index usable again.
Scalar functions in SELECT lists silently force row-by-row execution. Inlining them is ugly but fast.
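The first pattern can be reproduced on a laptop. The sketch below uses SQLite through Python's built-in `sqlite3` module purely as a stand-in; the production database is not named here, and the `orders` table, its columns, and the index name are all assumptions made for illustration. It asks the planner how it would run two equivalent filters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, total REAL)")
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")


def plan(sql: str) -> str:
    """Return the planner's description of how it will run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Date math wrapped around the indexed column: the index cannot be used to seek.
non_sargable = plan(
    "SELECT * FROM orders WHERE date(order_date, '+1 day') = '2024-01-16'"
)

# Same filter with the column bare on one side: the index supports a range seek.
sargable = plan(
    "SELECT * FROM orders WHERE order_date >= '2024-01-15' AND order_date < '2024-01-16'"
)

print(non_sargable)
print(sargable)
```

The exact plan wording varies by SQLite version, but the first query should report a scan of the table and the second a search using `idx_orders_date`. Other engines expose the same distinction through their own plan viewers.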

300+ views audited · ~27% aggregate gain · change sets · 99%+ match rate

None of those patterns are clever. The leverage was running the loop fast enough that we could touch dozens of change sets in the time a single audit normally takes. Claude wasn't doing the optimization — I was, with Claude as the rubber duck that could also write the rewrite. If you want the reverse-direction story — pulling data out of complex systems for real reporting — that's exactly what we build at the SiegeStack ETL Showcase.

The other professional win was a scan-to-ship workflow for a high-volume warehouse: a fragile suite of nine spreadsheet macros, replaced by a single script that watches a folder, queries the inventory database, and matches scans at 99%+. Co-authored end-to-end with Claude, deployed in production, and running with no manual steps.

Legal: Pro Se Litigation

This is where the "Claude as smart junior" framing matters most.

An opposing attorney filed a false address for me with the court, one I hadn't lived at in over a decade, 103 days after their own team personally served me at my actual location. A default judgment based on that filing stripped me of custody for over 490 days without my knowledge. When I found out, I filed pro se.

The defense moved to dismiss under attorney immunity. Their lawyer had been practicing for thirty years. I had Claude.

Claude helped me do three things that mattered:

Surfaced controlling precedent: the line of cases that defines when conduct falls outside attorney immunity. I read every one of them in full before citing any of them.
Framed the dispositive fact. The 103-day timeline between personal service at my actual address and the false filing is the entire case. Claude didn't find that — I did — but Claude helped me articulate why that fact alone defeats the immunity defense as a matter of law.
Drafted the structural skeleton. Causes of action, statutory citations, prayer for relief. I rewrote substantially every paragraph, but starting from a reasonable structure saved me weeks.

The motion to dismiss was denied. The case is set for trial. There's an active criminal investigation and complaints have been filed with the relevant professional licensing bodies.

The verification rule: Claude fabricates citations. Not often, but enough that you must independently verify every case, every statute, every quote against the actual source before it goes in a filing. If you can't be bothered to do that, don't use AI for legal work. The cost of a hallucinated citation in front of a judge is your credibility — and credibility is the only currency a pro se litigant has.

Personal Project: A Honeypot at safesapcrtx.org

I wanted to understand how attackers actually behave by watching them — not by reading about it. So I built safesapcrtx.org: a static site that looks like a vulnerable WordPress install, deployed on Netlify's free tier, with edge functions acting as a serverless firewall and Supabase logging every probe.

Total monthly cost: $0. Daily intake: 80+ probes.

The probe paths are exactly what you'd expect: /wp-login.php, /xmlrpc.php, /admin/.env, /shell.php, /backup.sql, /phpmyadmin/. Each one tells you something about the attacker's playbook — credential stuffing, environment variable hunting, webshell discovery, database dump fishing. The interesting work isn't catching probes — it's clustering them into actor groups by behavior.
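As a rough illustration of that clustering step, the sketch below labels each probe path with a playbook and then groups source IPs by the set of playbooks they ran. It is an offline analysis sketch in Python, not the site's edge function. The path-to-playbook mapping is an assumption (including where `/phpmyadmin/` lands), and the sample rows are made up, using documentation-range IP addresses.

```python
from collections import defaultdict

# Assumed mapping from path fragments to the playbooks named above.
PLAYBOOKS = {
    "wp-login.php": "credential stuffing",
    "xmlrpc.php": "credential stuffing",
    ".env": "environment variable hunting",
    "shell.php": "webshell discovery",
    "backup.sql": "database dump fishing",
    "phpmyadmin": "database dump fishing",
}


def classify(path: str) -> str:
    """Label one probe path with the first matching playbook."""
    lowered = path.lower()
    for fragment, playbook in PLAYBOOKS.items():
        if fragment in lowered:
            return playbook
    return "unclassified"


def group_by_behavior(probes):
    """Group source IPs by the set of playbooks they ran."""
    seen = defaultdict(set)
    for ip, path in probes:
        seen[ip].add(classify(path))
    groups = defaultdict(list)
    for ip, playbooks in seen.items():
        groups[frozenset(playbooks)].append(ip)
    return dict(groups)


# Made-up log rows for the demo.
sample = [
    ("203.0.113.5", "/wp-login.php"),
    ("203.0.113.5", "/xmlrpc.php"),
    ("198.51.100.7", "/admin/.env"),
    ("198.51.100.7", "/backup.sql"),
    ("192.0.2.9", "/wp-login.php"),
]
groups = group_by_behavior(sample)
for behavior, ips in groups.items():
    print(sorted(behavior), ips)
```

Grouping on the full set of playbooks, not on single paths, is what separates a login-only bot from a scanner that sweeps for secrets and dumps. A real classifier would likely also weigh timing and user agents.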

Claude helped me build the edge function classifier, the Supabase schema, and the dashboard. Two weekends of work, end to end. The honeypot is live and writing data right now. The repo is github.com/iamnotcheckingit-cyber/-safe-sapcr-texas.

Zero Ad Spend: The Strangest Validation

I have never paid for an ad. Not for SiegeStack, not for the honeypot site, not for anything. But people I have never met — strangers, completely unknown to me — independently paid for advertising campaigns promoting safesapcrtx.org. They thought the content mattered enough to spend their own money on it.

That is the strangest validation I've ever received as a builder. It also taught me something concrete: the content has to be good enough to make someone else want to amplify it. That bar is much higher than "good enough to ship," and it's the only bar that matters for organic growth.

The Honest Limitations

Everything above is real. So is everything below. Anyone selling Claude as a magic co-founder is lying.

Compaction is not lossless

When the context window fills, Claude summarizes the older parts of the conversation. That summary drops nuance and edge cases. You will lose things. Plan for it: checkpoint your decisions in external files, restart sessions deliberately, don't rely on Claude to remember what happened ten thousand tokens ago.
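Checkpointing can be as simple as an append-only file that lives outside the chat. The following is a minimal sketch under assumptions: the file name `decisions.jsonl`, the two field names, and the example decisions are all hypothetical, and the file is written to a temporary directory only to keep the demo self-contained.

```python
import json
import tempfile
from pathlib import Path


def checkpoint(path: Path, decision: str, reason: str) -> None:
    """Append one decision to a JSON Lines file that outlives the chat."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps({"decision": decision, "reason": reason}) + "\n")


def restore(path: Path) -> str:
    """Render saved decisions as text to paste at the top of a fresh session."""
    lines = path.read_text(encoding="utf-8").splitlines()
    entries = [json.loads(line) for line in lines if line.strip()]
    return "\n".join(f"- {e['decision']} (because {e['reason']})" for e in entries)


# Hypothetical file name and decisions, stored in a temp directory for the demo.
log = Path(tempfile.mkdtemp()) / "decisions.jsonl"
checkpoint(log, "Keep the legacy view names", "downstream reports reference them")
checkpoint(log, "Inline the scalar function", "it forced row-by-row execution")
summary = restore(log)
print(summary)
```

Because each decision is stored with its reason, a restarted session gets the constraint and the rationale together, which is the nuance a compaction summary tends to drop.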

Confidently wrong, roughly 20% of the time

Claude will produce answers that sound right and aren't. Citations get fabricated. Working code gets rewritten unnecessarily. The human is the verification layer — always. If you don't have the domain expertise to catch the wrong answers, you cannot use Claude safely for that domain.

Instructions get ignored in long sessions

"Don't change X" gets violated. "Use the function we wrote earlier" gets ignored. Re-state critical constraints in every meaningful turn. Treat the assistant as memoryless even when it isn't.

Not actually a stateful partner

Default chat has no cross-session memory. What feels like continuity is you re-establishing context every time. That's a workflow you build, not a feature the model gives you.

The honest framing: Claude is a smart but unreliable junior, not a co-founder. The leverage comes from how you wrap it, not from the model alone.

From Prompting to System Design

This is the actual leap, and it's the part most articles miss. Stage 1 thinking is "how do I write a better prompt." Stage 2 thinking is "how do I build a system around the model." Power users live in Stage 2.

Don't:

Write giant kitchen-sink prompts
Run long messy chat threads until compaction breaks them
Trust the model to remember what happened last week
Skip verification because the answer "sounds right"

Do:

Break work into discrete phases with checkpoints
Externalize memory: docs, files, system prompts, MCP tools
Restart sessions with clean, curated context
Use real tools (file I/O, APIs, databases) — not just chat
Verify every claim against reality before shipping

The real formula: Claude + external memory + structured workflows = leverage. The honeypot, the database audit, the legal filings — none of those were "I asked Claude a question." Each one was a system: scoped tasks, real artifacts, tight feedback loops, verification at every step.

Claude vs ChatGPT: When to Use Each

People keep asking me "Claude vs ChatGPT — which one should I use?" The honest answer is both, for different things. They're not interchangeable. After running serious work through both daily, here's the split I actually use:

Use Claude when:

You need long, structured reasoning. Long-form writing, multi-file code refactors, legal drafting, anything that has to hold a coherent thread over thousands of words. Claude's long-context behavior is its strongest feature.
You're co-authoring code or text. Claude tends to follow precise editing instructions more reliably than ChatGPT in my experience — it rewrites less aggressively when you tell it not to.
You need a careful, conservative voice. For legal filings and technical writing, Claude is less likely to over-claim or sprinkle in marketing fluff.
You're using tool-calling / agents. Claude's tool-use API and Anthropic's agent SDK are well-designed for production agent workflows.

Use ChatGPT when:

You need fast, breadth-first ideation. Brainstorming, naming, quick research summaries — ChatGPT is faster to first draft and willing to free-associate.
You need image generation or voice inline. The breadth of multi-modal features favors OpenAI for casual creative work.
You need niche integrations. The ChatGPT custom GPT ecosystem is broader for end-user automations.

My personal split is roughly 80/20 Claude/ChatGPT for production work and roughly 50/50 for exploration. Both have their place. Anyone telling you "X is better than Y, period" is selling you something. Anthropic's documentation is also worth reading directly — most people skip it, and most "Claude tips" articles you see are downstream of it.

Common Mistakes That Break Claude

If your Claude session feels like it's going nowhere, check this list. Nine times out of ten one of these is the cause:

Prompting in summary form. "Help me fix my login bug." Wrong. Paste the actual error stack, the actual code, the actual reproduction steps. Real artifacts only.
Defining "done" in the middle of a session. If you don't know what success looks like before you start, the model can't help you converge on it.
Letting compaction happen silently. Long sessions hit the context limit, Claude summarizes the older history, and that summary loses your earlier constraints. Restart deliberately. Don't drift.
Trusting citations without verifying. In my experience, Claude fabricates case law, library names, function signatures, or config flags in roughly 1 of every 5 turns. Verify everything against the source.
Skipping the test-against-reality step. "It looked right" is not a test. Run the code. File the draft. Deploy the function. The model lives in word-space. Reality lives in error-space. Bridge it yourself.
Re-prompting instead of correcting. When Claude gets it wrong, don't say "try again." Say "you got X wrong because Y; the fix is Z." Specific corrections converge in one round; vague ones loop forever.
Asking it to remember things it can't. Default chat has no cross-session memory. Re-establish context every time. Or use a project file / system prompt to externalize it.

Claude Workflow Template (Copy/Paste)

Here's the literal template I paste at the top of any new serious Claude session. Steal it. Modify it. The point isn't the wording — it's that you've forced yourself to answer every question in it before you start prompting.

# Session brief

## Outcome (definition of done)
- [What artifact will exist when this session is over?]
- [What test will it pass?]
- [Where will it be deployed/filed/shipped?]

## Constraints (hard rules)
- [Language / framework / version]
- [Things you must NOT change]
- [Style / tone / format requirements]

## Context (real artifacts only)
- [Paste the actual code / log / document / data — never summaries]

## Prior failures (so we don't repeat them)
- [What did the last attempt get wrong?]
- [What was the precise reason?]

## Verification plan
- [How will I test this output before shipping?]
- [What command / step proves it works?]

That's it. Five sections. If you can't fill all five in before you start prompting, you're not ready to prompt — go figure out the missing piece first. I can't overstate how much faster work moves once you internalize this.
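The "fill in all five or don't start" rule is easy to enforce mechanically. The sketch below renders the brief from a dictionary and refuses to produce one if any section is empty. The function name, the dictionary keys, and the placeholder answers are assumptions for illustration; a real brief would hold pasted artifacts.

```python
SECTIONS = [
    ("Outcome (definition of done)", "outcome"),
    ("Constraints (hard rules)", "constraints"),
    ("Context (real artifacts only)", "context"),
    ("Prior failures (so we don't repeat them)", "prior_failures"),
    ("Verification plan", "verification"),
]


def render_brief(answers: dict) -> str:
    """Build the session brief, refusing to proceed if any section is empty."""
    missing = [title for title, key in SECTIONS if not answers.get(key)]
    if missing:
        raise ValueError("Not ready to prompt. Missing: " + ", ".join(missing))
    parts = ["# Session brief"]
    for title, key in SECTIONS:
        bullets = "\n".join(f"- {item}" for item in answers[key])
        parts.append(f"## {title}\n{bullets}")
    return "\n\n".join(parts)


# Placeholder answers for the demo.
brief = render_brief({
    "outcome": ["One script replaces the legacy macros"],
    "constraints": ["Do not change the output file format"],
    "context": ["<paste the actual code and logs here>"],
    "prior_failures": ["Assumed invoice numbers were numeric"],
    "verification": ["Run against the 10k-row dataset in dev"],
})
print(brief)
```

Calling `render_brief` with a section left out raises a `ValueError` that names the missing piece, which is the cue to go figure it out before prompting.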

Frequently Asked Questions

What is Claude AI used for?

Claude is used for long-form writing, code generation and refactoring, structured reasoning over long documents, legal and technical drafting, data analysis, and as the LLM behind agent / tool-use workflows. Its strength is holding coherent context over many thousands of tokens, which makes it well-suited for production engineering work and complex documents.

Is Claude better than ChatGPT?

Neither is universally better. Claude tends to win on long-context work, careful editing, and conservative tone. ChatGPT tends to win on fast breadth-first ideation and multi-modal features. Most serious users keep both open and route work to whichever fits the task. See the comparison section above for the split I use.

How do you write good prompts for Claude?

The best Claude prompts aren't clever — they're specific. State the outcome, paste real artifacts (not summaries), list hard constraints, and include any prior failures so the model doesn't repeat them. Then iterate with precise corrections, not vague "try again" prompts. The Claude workflow template above is exactly this, formalized.

Can Claude replace developers?

No. Claude amplifies developers — it doesn't replace them. Without domain expertise to verify output, catch hallucinations, and define what "done" actually means, you get confident-sounding garbage. The human is the quality gate. The 27% performance gain in this case study only happened because a human knew which patterns mattered and could test the rewrites against a real database.

What's the biggest mistake people make with Claude?

Treating it like a search engine instead of a junior collaborator. They paste a vague question, take the first answer, and walk away. The leverage is in the loop — defining done, drafting, testing against reality, giving precise feedback, and repeating until production-ready. Skip the loop and you're just chatting with an autocomplete.

Does Claude have memory between sessions?

Default chat has no persistent cross-session memory. What feels like continuity is you re-establishing context every time. Serious workflows externalize memory in project files, system prompts, or the Anthropic Memory Tool. Treat the chat as stateless and your reliability goes way up.

What I'd Tell Someone Starting Today

You're the expert. Claude amplifies domain knowledge. If you don't have the knowledge, you get confident-sounding garbage. The human is the quality gate.

Specificity compounds. One precise correction saves five rounds of guessing. Invest the time to describe exactly what's wrong.

Ship real things. The gap between "playing with AI" and "using AI" is whether something goes into production. File the motion. Deploy the script. Push the site live.

Side projects matter. My honeypot made me better at log analysis at work. My legal filings made me a better writer. Everything cross-pollinates.

AI levels the playing field. A pro se litigant produced filings that survived a motion to dismiss from a 30-year attorney. That wasn't possible five years ago. It is now. Use it.
