Tutorials · June 11, 2025 · 9 min read

My AI Coding Workflow in 2025: How I Ship 2x Faster Using Free LLM APIs

A real developer workflow using free LLM APIs across every stage of the development process — planning, coding, review, testing, and debugging.

The Honest Truth About AI-Assisted Development

AI has not replaced software engineers. But it has fundamentally changed what a single developer can ship in a day. I am not faster because AI writes all my code — it does not. I am faster because AI eliminates the friction from every stage of the development process: the blank-page problem, the "what was the syntax for this" moment, the tedious review checklist, the debugging loop.

This guide is my actual workflow. Everything here uses free LLM API keys from FreeLLMKeys — zero monthly AI subscription cost.

My Tool Setup

  • Cursor IDE with a FreeLLMKeys key (GPT-4o + Claude via custom API base URL)
  • Continue.dev in VS Code for Python work
  • A custom Python script for batch tasks (code review, test generation)

Total AI cost: $0/month. Key refresh time: ~30 seconds every 48 hours.
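
Since the key changes every 48 hours, one option is to keep it out of the source files and read it from the environment, so a refresh means updating one variable instead of editing every script. The following is a minimal sketch under that assumption. The variable names FREELLM_API_KEY and FREELLM_BASE_URL and the helper load_llm_settings are hypothetical, not something FreeLLMKeys defines.

```python
import os

# Base URL used in the examples throughout this post.
DEFAULT_BASE_URL = "https://aiapiv2.pekpik.com/v1"


def load_llm_settings(env=None) -> dict:
    """Read the API key and base URL from environment variables.

    FREELLM_API_KEY and FREELLM_BASE_URL are hypothetical names chosen
    for this sketch; use whatever naming fits your setup.
    """
    env = os.environ if env is None else env
    api_key = env.get("FREELLM_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("FREELLM_API_KEY is not set; refresh the key and export it.")
    return {
        "base_url": env.get("FREELLM_BASE_URL", DEFAULT_BASE_URL),
        "api_key": api_key,
    }
```

The returned dict can be passed straight to the client used in the rest of this post, for example OpenAI(**load_llm_settings()).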

Stage 1: Planning (5–15 minutes instead of 30–60)

Before writing any code, I describe what I am building to Claude Opus 4 and ask for architectural feedback:

from openai import OpenAI

client = OpenAI(
    base_url="https://aiapiv2.pekpik.com/v1",
    api_key="sk-your-key"
)

planning_prompt = """
I need to build a REST API endpoint that:
- Accepts a URL and extracts article text
- Runs sentiment analysis using an LLM
- Caches results in Redis for 1 hour
- Returns JSON with sentiment score, summary, and key phrases

Stack: FastAPI, Redis, Python 3.12

What are the edge cases I should plan for? What could go wrong at scale?
What is the simplest architecture that does not over-engineer this?
"""

response = client.chat.completions.create(
    model="claude-opus-4-7",
    messages=[{"role": "user", "content": planning_prompt}]
)
print(response.choices[0].message.content)

Claude typically surfaces 3–5 edge cases I had not thought of. This 10-minute conversation saves hours of debugging later.
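
Because the planning prompt has the same shape for every feature (requirements, stack, then the same closing questions), it can be turned into a small template. This is only a sketch of that idea; build_planning_prompt is a hypothetical helper, and the wording of the questions is copied from the prompt above.

```python
def build_planning_prompt(requirements: list[str], stack: list[str]) -> str:
    """Assemble a planning prompt from a list of requirements and a stack."""
    # One "- item" bullet per requirement, matching the hand-written prompt.
    bullet_lines = "\n".join(f"- {item}" for item in requirements)
    return (
        "I need to build a feature that:\n"
        f"{bullet_lines}\n\n"
        f"Stack: {', '.join(stack)}\n\n"
        "What are the edge cases I should plan for? What could go wrong at scale?\n"
        "What is the simplest architecture that does not over-engineer this?\n"
    )
```

The returned string goes into the content field of the user message, exactly like planning_prompt in the example above.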

Stage 2: First Implementation (AI writes the boilerplate, I write the logic)

I use Cursor's Cmd+K to write boilerplate, error handling, and patterns I have written dozens of times before. I write the actual business logic myself — this is where domain knowledge matters and where AI is least reliable.

The split that works for me:

  • AI writes: HTTP handlers, database queries, serialization, test fixtures, error handling, logging setup
  • I write: Business rules, state machines, algorithm selection, security decisions, data model design

Stage 3: Code Review (the highest ROI use of AI)

Before I open a pull request, I run every changed file through this review script:

import subprocess
from openai import OpenAI

client = OpenAI(
    base_url="https://aiapiv2.pekpik.com/v1",
    api_key="sk-your-key"
)

def ai_review(filepath: str) -> str:
    with open(filepath) as f:
        code = f.read()

    response = client.chat.completions.create(
        model="claude-opus-4-7",
        messages=[{
            "role": "user",
            "content": f"""Review this code as a senior engineer preparing it for production.
Focus on:
1. Security vulnerabilities
2. Edge cases that will cause bugs
3. Performance issues at scale
4. Missing error handling
5. Code clarity issues

File: {filepath}
```python
{code}
```"""
        }]
    )
    return response.choices[0].message.content

# Get all files changed on the current branch.
# --diff-filter=d skips deleted files, which could not be opened for review.
changed = subprocess.check_output(
    ['git', 'diff', '--name-only', '--diff-filter=d', 'main...HEAD'],
    text=True
).splitlines()

for f in changed:
    if f.endswith('.py'):
        print(f"\n{'='*60}")
        print(f"REVIEWING: {f}")
        print('='*60)
        print(ai_review(f))

This script catches real bugs. Last month it caught a SQL injection vulnerability I had missed in a review and a missing authentication check in an API endpoint.

Stage 4: Test Generation

Writing tests is the most tedious part of development. AI does not make tests better, but it makes them faster to write. This function reuses the client object defined in the review script above:

def generate_tests(filepath: str, test_framework: str = "pytest") -> str:
    with open(filepath) as f:
        code = f.read()

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"""Generate comprehensive {test_framework} tests for this code.
Include:
- Happy path tests
- Edge cases (empty input, None, maximum values)
- Error condition tests
- At least one parametrized test

{code}"""
        }]
    )
    return response.choices[0].message.content
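
Models often wrap generated code in a Markdown fence with some commentary around it, so the raw reply usually cannot be saved as a test file as-is. The sketch below pulls out the first fenced block and falls back to the whole reply when no fence is found. It assumes the reply contains at most one block worth keeping; extract_code is a hypothetical helper, not part of any library.

```python
import re

# Three backticks, built programmatically so this snippet stays easy to embed.
FENCE = "`" * 3

# Match an opening fence with an optional "python"/"py" tag, capture the body
# lazily up to the next closing fence.
FENCE_RE = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(reply: str) -> str:
    """Return the first fenced code block in reply, or the reply itself."""
    match = FENCE_RE.search(reply)
    body = match.group(1) if match else reply
    return body.strip() + "\n"
```

The result could then be written to a file such as tests/test_module.py (a placeholder path) and read through before running it, since generated tests still need a human check.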

Stage 5: Debugging

When I hit a bug I cannot immediately understand, I paste the traceback + relevant code into Claude with this prompt:

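def build_debug_prompt(traceback_text: str, code: str) -> str:
    # traceback_text: the full traceback; code: the source that raised it
    return f"""
Here is a Python traceback:
{traceback_text}

Here is the code that produced it:
{code}

1. What is the root cause?
2. What is the minimal fix?
3. Is there a deeper design issue that made this bug possible?
"""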

Claude's debugging explanations are excellent because it explains the root cause, not just the symptom. This is the fastest way I have found to understand an unfamiliar error.
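
Copying the traceback and code by hand works, but both can also be collected from a caught exception. The following is a rough sketch using only the standard library; collect_debug_context is a hypothetical helper, and the five lines of surrounding context is an arbitrary choice.

```python
import linecache
import traceback


def collect_debug_context(exc: BaseException, context_lines: int = 5) -> dict:
    """Return the formatted traceback and the source around the failing line."""
    tb_text = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    frames = traceback.extract_tb(exc.__traceback__)
    function = ""
    snippet = ""
    if frames:
        last = frames[-1]  # innermost frame, where the exception was raised
        function = last.name
        # linecache returns an empty list if the source file is unavailable,
        # in which case the snippet simply stays empty.
        lines = linecache.getlines(last.filename)
        start = max(last.lineno - 1 - context_lines, 0)
        snippet = "".join(lines[start:last.lineno + context_lines])
    return {"traceback": tb_text, "function": function, "code": snippet}


def divide(a, b):
    # Deliberately simple function used to trigger an error in the demo.
    return a / b


try:
    divide(1, 0)
except ZeroDivisionError as exc:
    context = collect_debug_context(exc)
```

The traceback and code values in the returned dict can then be dropped into the debugging prompt above in place of the pasted text.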

The Result

Before integrating AI into my workflow, a typical feature took 2–3 days: planning, implementation, review, tests. Now the same feature takes 1–1.5 days. The time savings come almost entirely from stages 1, 3, and 5 — planning, review, and debugging — not from AI-generated code replacing my own.

All of this runs on free FreeLLMKeys API keys. The AI infrastructure cost is $0. The productivity gain is real.

FreeLLMKeys Team
Building tools for the AI developer community