How I Refactored a 1,348-Line Lambda Using Context Engineering
I had a problem. A Lambda function that started as a quick prototype had grown to 1,348 lines. It handled AI character chat, vector memory, moderation, credits, and creator payouts. It had zero tests. It was untouchable.
Most teams would assign this to a senior developer and hope for the best. I used Context Engineering to systematically dismantle it.
One hour later: 60 lines of routing code. 83 passing tests. 100% backward compatibility.
The Starting Point
The chat_bot Lambda was technical debt incarnate:
- 1,348 lines in a single file
- 27 functions (5 async, 22 sync)
- 5 API endpoints
- 0% test coverage
- Global state mutations
- Tight AWS coupling
Every change risked breaking something. New developers avoided it. Production bugs were inevitable.
But this wasn’t just messy code. This Lambda powered the core feature of our platform: AI character conversations with long-term memory. Users pay credits. Creators earn revenue. Vector embeddings store context. Moderation blocks violations. One wrong move and we’d break the entire business.
The Traditional Approach Doesn’t Scale
Here’s what most teams would do:
- Assign it to a senior developer
- They spend 2-3 weeks refactoring
- Hope they understand the conventions
- Pray nothing breaks
- Tests? Maybe if there’s time
- No clear plan, just “make it better”
The problem: it’s ad-hoc. Every developer refactors differently. No guarantee the result matches your standards. And when you’re done, you have cleaner code—but did it follow your patterns? Who knows.
The Context Engineering Approach
I took a different path. Before touching any code, I created a document: chat-bot-tech-debt-clean-up.md.
This wasn’t documentation. It was executable architecture.
Step 1: Define the Outcome
Goal: Reduce handler.py from 1,348 lines to under 200 lines. Create 76+ unit tests. Achieve over 80% coverage. Zero regressions.
Not “refactor the code.” Specific, measurable outcomes.
Step 2: Map the Current State
I documented everything:
- 27 functions by type (async vs sync)
- 5 API endpoints with exact behavior
- External dependencies (OpenAI, Venice AI, svectorDB, DynamoDB, S3)
- Business logic flows (credit deduction, creator payouts, memory summarization)
- Global state (banned word cache, refresh intervals)
This wasn’t busy work. Understanding what exists is how you know what to move where.
Step 3: Design the Target Architecture
I split the monolith into 6 focused modules:
- ai/ – Client wrappers and prompt building
- moderation/ – Text filtering and violation tracking
- memory/ – Vector storage and retrieval (svectorDB)
- storage/ – Message persistence and character data
- business/ – Credits, payouts, chat history
- routes/ – API endpoint handlers
Each module: clear responsibilities, no circular dependencies, clean boundaries.
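To make “no circular dependencies” concrete: dependencies only flow one way, in the same order the milestones below follow. Here is a minimal, self-contained sketch of that rule; the checker itself is my illustration, not code from the project.

```python
# Illustrative only: the one-way dependency rule between the six modules.
# A module may depend on itself or on modules earlier in this order, never later.
LAYERS = ["ai", "moderation", "memory", "storage", "business", "routes"]

def may_import(importer: str, imported: str) -> bool:
    """Return True if `importer` is allowed to depend on `imported`."""
    return LAYERS.index(imported) <= LAYERS.index(importer)

assert may_import("routes", "business")   # routes may call business logic
assert may_import("business", "storage")  # business may persist via storage
assert not may_import("ai", "routes")     # lower layers never reach back up
```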
Step 4: Create Executable Milestones
This is where Context Engineering diverges from traditional refactoring.
I broke the work into 7 milestones, ordered by dependency:
ai → moderation → memory → storage → business → routes → handler
Each milestone specified:
- Exactly which functions to extract
- Exactly which lines to move
- Exactly which tests to create
- Exactly which files to update
- Exactly how to validate success
- Exactly what commit message to use
Here’s Milestone 1:
Milestone 1: AI Module
- Create ai/openai_client.py
- Create ai/venice_client.py
- Create ai/prompt_builder.py
- Move get_openai_client() from handler.py
- Move build_character_prompt() from handler.py (lines 370-427)
- Move generate_ai_reply() from handler.py (lines 429-486)
- Update handler.py imports
- Create test_chat_bot_ai_clients.py (3 tests)
- Create test_chat_bot_ai_prompt_builder.py (5 tests)
- Run: pytest lambda/tests/unit/test_chat_bot_ai_*.py -v
- Validate: All 8 tests passing
- Commit: refactor(chat_bot): extract AI module with client wrappers and prompt builders
Every step: actionable, testable, verifiable.
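To make that concrete, here is roughly what the extracted prompt builder could look like. Only the function name comes from the milestone; the parameters and prompt format are my guesses, not the project’s actual code.

```python
# ai/prompt_builder.py -- hypothetical sketch; only the function name is from the milestone.
from typing import Dict, List

def build_character_prompt(character: Dict[str, str], memories: List[str]) -> str:
    """Build the system prompt for an AI character from its profile and retrieved memories."""
    lines = [
        f"You are {character['name']}. Stay in character at all times.",
        f"Persona: {character['persona']}",
    ]
    if memories:
        lines.append("Relevant memories from earlier conversations:")
        lines.extend(f"- {memory}" for memory in memories)
    return "\n".join(lines)
```

Because a function like this is pure (no AWS calls, no global state), the tests the milestone asks for can exercise it directly.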
Step 5: Execute With Claude Code
Here’s where it gets interesting.
I told Claude Code: “Execute Milestone 1 from chat-bot-tech-debt-clean-up.md”
Claude Code:
- Read the milestone
- Read ADR-003 (my testing standards)
- Read ADR-002 (my commit conventions)
- Created the ai/ directory structure
- Extracted the specified functions from handler.py
- Created the new module files
- Updated handler.py imports
- Generated 12 tests following my standards
- Ran the tests
- Committed with a conventional message
Result: Milestone complete in minutes. All code matches my organizational standards. All tests pass. All commits follow conventions.
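For example, a generated test might look something like this. It assumes the Milestone 1 layout (ai/prompt_builder.py) and the hypothetical signature sketched earlier; ADR-003’s actual conventions will differ in the details.

```python
# lambda/tests/unit/test_chat_bot_ai_prompt_builder.py -- hypothetical sketch.
from ai.prompt_builder import build_character_prompt

def test_prompt_includes_character_name_and_persona():
    character = {"name": "Ada", "persona": "a patient mentor"}
    prompt = build_character_prompt(character, memories=[])
    assert "Ada" in prompt
    assert "a patient mentor" in prompt

def test_prompt_appends_retrieved_memories_in_order():
    character = {"name": "Ada", "persona": "a patient mentor"}
    prompt = build_character_prompt(character, memories=["likes chess", "prefers brevity"])
    assert prompt.index("likes chess") < prompt.index("prefers brevity")
```

Running pytest lambda/tests/unit/test_chat_bot_ai_*.py -v, the validation command the milestone specifies, picks these up.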
The Results
I executed all 7 milestones in about an hour.
- Milestone 1 (AI Module): handler 1,348 → 1,190 lines (-158, -11.7%), 12 tests passing
- Milestone 2 (Moderation Module): handler 1,190 → 979 lines (-211, -17.7%), 20 tests passing
- Milestone 3 (Memory Module): handler 979 → 835 lines (-144, -14.7%), 21 tests passing (all async)
- Milestone 4 (Storage Module): handler 835 → 731 lines (-104, -12.4%), 14 tests passing
- Milestone 5 (Business Module): handler 731 → 633 lines (-98, -13.4%), 16 tests passing
- Milestone 6 (Routes Module): handler 633 → 110 lines (-523, -82.6%), 0 tests (routes reuse tested modules)
- Milestone 7 (Handler Cleanup): handler 110 → 60 lines (-50, -45.5%), 0 tests (pure routing)
Final: 1,348 → 60 lines (-1,288, -95.5%). Total tests: 83 passing.
The Final Handler
Here’s what 60 lines of routing logic looks like:
```python
import asyncio

# The handle_* route handlers come from the routes module, and _response is a
# small JSON response helper; their imports sit above this in the real file.

def lambda_handler(event, context):
    path = event.get("rawPath") or event.get("path")
    method = event.get("requestContext", {}).get("http", {}).get("method")

    if path == "/api/chatbot/send-message" and method == "POST":
        return asyncio.run(handle_send_message(event))
    elif path == "/api/chatbot/generate-reply" and method == "POST":
        return asyncio.run(handle_generate_reply(event))
    elif path == "/api/chatbot/get-messages" and method == "GET":
        return handle_get_messages(event)
    elif path == "/api/chatbot/rate-message" and method == "POST":
        return handle_rate_message(event)
    elif path == "/api/chatbot/agreed-to-chat-terms" and method == "POST":
        return handle_agreed_to_chat_terms(event)

    return _response(404, {"error": "Not Found"})
```

That’s it. Pure routing. No business logic. No AWS calls. No global state. Just clean delegation.
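A quick way to sanity-check the routing locally is to feed it a hand-built API Gateway HTTP API (v2) event, since the router only reads rawPath and requestContext.http.method. A minimal sketch; the import path assumes the handler module name used above, and the unknown-path case is the only one that touches no downstream service:

```python
from handler import lambda_handler  # assumes the module name used above

# Unknown path: falls through to the 404 branch without calling any route handler.
event = {"rawPath": "/api/does-not-exist", "requestContext": {"http": {"method": "GET"}}}
print(lambda_handler(event, context=None))  # -> _response(404, {"error": "Not Found"})
```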
The Key Insight
The AI didn’t need to understand my entire system. It needed to understand ONE milestone at a time, with clear instructions and queryable standards.
That’s Context Engineering.
Traditional refactoring: “Claude, clean up this file” and hope for the best.
Context Engineering: “Claude, execute Milestone 1 following ADR-003 and ADR-002” and get aligned output.
What Made This Possible
Three things enabled this velocity:
1. ADRs as Guardrails
ADR-003 defines my testing standards. ADR-002 defines my commit format. These aren’t documentation—they’re queryable rules that Claude Code enforces automatically.
When Claude generates tests, they follow my patterns. When Claude commits, the messages match my conventions. Not because I told it each time, but because it queries the standards.
2. Executable Milestones
The chat-bot-tech-debt-clean-up.md document wasn’t a plan. It was a script.
Each milestone: specific files, specific functions, specific tests, specific validations. Claude Code didn’t improvise. It executed.
3. Systematic Validation
After each milestone:
- All tests must pass
- Handler must import correctly
- No circular dependencies
- Git commit with conventional message
Validation caught issues immediately. No big-bang failures. No “hope it works” deployment.
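Those checks are cheap to automate. A rough sketch of a per-milestone gate; the test path and module name are assumptions based on the milestone above, not the project’s actual script:

```python
# Hypothetical post-milestone validation gate.
import importlib
import subprocess

# 1. All tests must pass (path taken from the milestone's pytest command).
subprocess.run(["pytest", "lambda/tests/unit", "-q"], check=True)

# 2. The handler must still import cleanly; a circular import between the new
#    modules would typically surface right here as an ImportError.
importlib.import_module("handler")

# 3. Then commit with a conventional message (done by Claude Code in this workflow).
print("Milestone validated: tests green, handler imports cleanly.")
```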
The Difference From Gene Kim’s Vibe Coding
Gene Kim’s “Vibe Coding” book describes AI productivity gains. He’s right about 10-100x speed. But he also documents the nightmares:
- AI deleting 80% of his tests
- 3,000-line functions that became unmaintainable
- Code violating team conventions
- Git branches named cryptically
That’s speed without alignment.
Context Engineering solves this. When your standards are queryable, AI doesn’t guess at conventions—it queries your ADRs and generates code that matches YOUR patterns.
I didn’t tell Claude “write tests.” I told Claude “write tests following ADR-003.” The difference is everything.
The Impact
Before refactoring:
- Bug fix time: 2-4 hours
- New feature time: 1-2 days
- Onboarding time: 2-3 weeks
- Production incidents: 2-3 per month
After refactoring:
- Bug fix time: 30-60 minutes
- New feature time: 4-8 hours
- Onboarding time: 2-3 days
- Production incidents: fewer than 1 per month
Estimated savings: 60-80 hours per month in development and bug fixes.
The Broader Lesson
Every enterprise has 1,348-line functions. Every team has technical debt. Every organization struggles with consistency when using AI tools.
They need a systematic way to fix it.
Context Engineering provides that:
- Document your standards as ADRs
- Structure work as executable milestones
- Use AI tools that query your standards
- Validate outcomes, not implementation
- Maintain velocity AND alignment
Technical debt remediation becomes systematic instead of heroic. AI assistance becomes aligned instead of chaotic. Teams go fast AND stay consistent.
My Evolution
I built my platform using GPT, which made me 10x faster through copy/paste workflows.
I switched to Claude Code CLI, which made me 10x faster again through direct file manipulation.
I added Context Engineering, which made the process systematic instead of ad-hoc.
The result: 100x velocity with alignment. Not just fast—fast in the right direction.
What This Proves
If AI can systematically remediate a 1,348-line Lambda while maintaining organizational standards, it can handle any technical debt.
The challenge isn’t AI capability. It’s making your systems AI-understandable.
That’s Context Engineering. That’s the future.
Not AI that codes. But AI that codes the way YOUR organization codes.