Vibe Coding with AI: Best Practices for Every Project
Learn how to use evaluation-first development, GitHub Copilot, Claude, and local LLMs to build faster, smarter AI-powered apps.

Introduction
This guide captures a modern, evaluation-first approach to building AI-powered projects using tools like GitHub Copilot, Claude, ChatGPT, and local LLMs like LLaMA and DeepSeek.
We’ll use AsyncPR, a real-world mobile app project I built and shipped, as the public example. Check out a short YouTube video I made about the app at https://go.fabswill.com/asyncpr-shortintro and feel free to test it out! This guide reflects how I work today: VS Code as my IDE, Copilot for code assist, and private models running on my MacBook when needed for security or speed.
Grab the app in the App Store for iOS here https://apps.apple.com/app/id6744700840
If you’re serious about modern software building, you’ll love this framework.
Purpose and Setting Expectations
This is not a theory doc written after the fact; it’s a living guide captured while actively working.
Key details:
- IDE: Visual Studio Code (VS Code)
- Primary AI Tools: GitHub Copilot, ChatGPT, Claude Desktop
- Local Models: LLaMA 3.3:70B, DeepSeek 70B (via Ollama)
AsyncPR is featured because my private projects at work can’t be shared publicly, but the principles, practices, and rigor are identical.
Mindset Shift: From Testing to Evaluations (Evals)
| Old Way | New Way |
|---|---|
| Test after building | Eval before building |
| Check if code “works” | Check if code “works the right way” |
| Hope AI outputs are fine | Define “good” first, then build |

Plan → Eval → Build → Test
Start with Evaluations (Evals), Not Just Tests
Before writing any production code, define Evals:
- What does a good output look like?
- What would success/failure look like?
- How can a real user journey be simulated?
Examples from AsyncPR:
- Receipt image → extracted business name? ✅
- Business name → valid business email? ✅
- Customer narrative → clean, structured feedback JSON? ✅
Think of Evals like mini contracts between you and your AI tools.
Save these in an /evals/ folder for every project.
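One lightweight way to make an Eval executable is a small script that runs each case against your extraction step and reports pass/fail. A minimal sketch, where `extract_business_name` is a hypothetical stand-in for the real AI-backed pipeline and the cases are illustrative:

```python
# Minimal Eval harness sketch: each case pairs an input with the output
# we consider "good". extract_business_name is a hypothetical placeholder
# for the real model-backed extraction step.

def extract_business_name(ocr_text: str) -> str:
    # Placeholder logic: in practice this would call your model/pipeline.
    return ocr_text.splitlines()[0].strip()

EVAL_CASES = [
    {"input": "Contoso Coffee\n123 Main St\nTotal: $4.50", "expected": "Contoso Coffee"},
    {"input": "Fabrikam Books\nReceipt #991", "expected": "Fabrikam Books"},
]

def run_evals() -> int:
    """Run every case; return the number of failures."""
    failures = 0
    for case in EVAL_CASES:
        actual = extract_business_name(case["input"])
        ok = actual == case["expected"]
        print(f"{'PASS' if ok else 'FAIL'}: expected {case['expected']!r}, got {actual!r}")
        failures += 0 if ok else 1
    return failures

if __name__ == "__main__":
    raise SystemExit(run_evals())
```

Because the script exits nonzero on any failure, it can later run in CI (for example, from a GitHub Action) without changes.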
Project Planning Before Coding
This is the step that saves you the most pain:
Work with the LLM to write a PLAN.md:
- Must-Have Features
- Nice-to-Haves
- Out-of-Scope (for now)
The AI cannot read your mind; writing this plan forces clarity upfront.
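A PLAN.md following the three sections above might start as simply as this (the bullet contents here are illustrative placeholders, not AsyncPR's actual plan):

```markdown
# PLAN.md

## Must-Have Features
- Capture a receipt photo and extract the business name
- Generate structured feedback JSON from a customer narrative

## Nice-to-Haves
- Offline queueing of submissions

## Out-of-Scope (for now)
- Android support
- Multi-language receipts
```

Keep it short at first; the point is to give the LLM (and yourself) an explicit boundary to build against.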
Tool Setup and AI Strategy
Today, my stack looks like:
- VS Code + GitHub Copilot for core coding
- Windsurf or Cursor for AI pair-programming
- Claude Desktop/Code for architectural planning and debugging
- Custom MCP Servers for API documentation lookup and internal data sources
- Local Ollama Models (LLaMA, DeepSeek) for private projects
My main IDE of choice is VS Code with GitHub Copilot. Given the choice, I code in .NET C#; when I have less of a choice, it’s still VS Code, but in Python or TypeScript.
I’ve also started to dabble in Cursor and Windsurf (both forks of VS Code). I do like Windsurf’s native and, dare I say, easier approach to built-in MCP support in the tooling configuration compared to Claude and VS Code.
I’ll admit I may just be fanboying over Claude Desktop because MCP comes from Anthropic, as does Claude, which also made the docs easy to follow while I was getting up to speed. I use it to test my MCP Servers as well.
Mindset: Treat AI tools like a team of interns. Powerful, but needing precise guidance.
Iterative, Section-by-Section Development
Build one section at a time:
1. Implement small pieces
2. Validate against Evals
3. Only then commit to Git
Never let bad AI outputs pile up. Reset early and often.
Version Control is Sacred
- Work from clean Git branches.
- git reset --hard if AI drifts too far off course.
- Use GitHub Actions to validate key rules.
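The branch-and-reset discipline above looks something like this in practice. This sketch runs against a throwaway repo so it is safe to try anywhere; the branch and file names are illustrative:

```shell
set -e
# Demo in a throwaway repo so these commands are safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
git commit -q --allow-empty -m "initial"

# Start each AI-assisted session from a clean, dedicated branch.
git checkout -q -b ai/receipt-parser
echo "draft" > receipt_parser.py
git add receipt_parser.py
git commit -q -m "Receipt parser: passes business-name Evals"

# If the AI drifts, discard everything since the last good commit.
echo "bad AI output" >> receipt_parser.py
git reset --hard -q HEAD

# Or abandon the branch entirely and start over.
git checkout -q master 2>/dev/null || git checkout -q main
git branch -q -D ai/receipt-parser
echo "clean again on $(git branch --show-current)"
```

The key habit: commit only Eval-passing checkpoints, so `git reset --hard` always has a known-good state to fall back to.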
Trust me: you’ll thank yourself later.
High-Level Testing Always
Once Evals pass, simulate real-world behavior:
- I often use Insomnia or a basic Blazor SPA to hit real endpoints.
- Validate the entire user journey, not just isolated function outputs.
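A journey-level test walks the same sequence of calls a real user triggers, rather than unit-testing one function. A self-contained sketch using only the standard library, where a tiny stub server stands in for the real backend and the endpoint paths are illustrative:

```python
# Journey-level test sketch: receipt upload -> business extraction ->
# feedback submission, exercised end to end against a local stub API.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class StubApi(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        if self.path == "/receipts":
            payload = {"business": "Contoso Coffee"}   # canned extraction result
        elif self.path == "/feedback":
            payload = {"business": body["business"], "status": "queued"}
        else:
            self.send_error(404)
            return
        data = json.dumps(payload).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):  # keep test output quiet
        pass

def post(url: str, body: dict) -> dict:
    req = urllib.request.Request(url, data=json.dumps(body).encode(),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def run_journey() -> dict:
    server = HTTPServer(("127.0.0.1", 0), StubApi)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        # Step 1: upload a receipt; Step 2: submit feedback for the business.
        receipt = post(f"{base}/receipts", {"image": "base64..."})
        return post(f"{base}/feedback", {"business": receipt["business"]})
    finally:
        server.shutdown()

if __name__ == "__main__":
    print(run_journey())
```

In practice you would point `base` at your real endpoint (the same one Insomnia or a Blazor SPA would hit) and keep the stub only for offline runs.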
Deep Documentation and Accessibility
Docs aren’t just for humans anymore; they’re for AIs too.
- Save API specs, database schemas, and business rules under /docs/
- Build MCP Servers that live-ingest updated docs
- Even scrape and save static Markdown from sites if needed for local models
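When a local model can’t browse, even a crude HTML-to-text pass over saved doc pages is useful. A minimal standard-library sketch; the tag-stripping here is deliberately simple (no JavaScript rendering, no table handling), which is often good enough for mostly-textual doc pages:

```python
# Strip HTML down to plain text so scraped API docs can be dropped
# into /docs/ for a local model to read.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    SKIP = {"script", "style"}  # elements whose text we never want

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)
```

In practice you would fetch pages with `urllib.request` (or via an MCP server) and write the result as Markdown-ish text under /docs/.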
Refactor Relentlessly
When the tests pass, refactor:
- Break apart monolith files
- Create small, focused modules
- Ask your LLMs to suggest refactors too
Small files = happier humans and happier AIs.
Choose the Right Model for the Job
Some models are better at:
- Planning: Claude and DeepSeek
- Autocompleting: GitHub Copilot
- Domain Reasoning: Your own MCP Servers
Experiment, experiment, experiment.
Keep Iterating
Every few weeks:
- Try new models on old logic.
- Update your Evals.
- Share learnings with yourself (and your future teammates).
Vibe coding is about building both smarter and faster.
Example Project Structure
```
/my-project-name
│
├── .gitignore
├── README.md
├── PLAN.md               # Detailed project plan with must-have, nice-to-have, out-of-scope
├── LLM_INSTRUCTIONS.md   # Special instructions to AI agents
│
├── /src                  # Application code
│   ├── /backend
│   ├── /frontend
│   └── /shared
│
├── /tests                # High-level user journey tests (post-Eval validation)
│
├── /evals                # Manual or automated Evals
│   ├── image-processing-eval.md
│   ├── business-name-detection-eval.md
│   ├── feedback-generation-eval.md
│   └── README.md         # Explains your Evals philosophy
│
├── /docs                 # API docs, specs, architecture references
│
├── /scripts              # Utility scripts (e.g., image resizing, data migration)
│
├── /mcp-servers          # (Optional) Local MCP server configurations
│
└── LICENSE
```
Final Thoughts
AI-assisted development demands a new way of thinking.
Evaluation-first frameworks, thoughtful planning, and tight Git discipline make the difference between chaos and clarity.
AsyncPR is just the beginning; this approach scales to any project, any team, any goal.
What’s one Eval you’ll try first? Comment or message me; let’s level up together!
Chat with me
| Engage with me | Click |
|---|---|
| BlueSky | @fabianwilliams |
|  | Fabian G. Williams |