METHOD 2

⚡ GPT-5 Mini Direct (V2)

Single-call generation using GPT-5 Mini's training knowledge. 10× cheaper than V1, 90% of the quality.

Overview

V2 makes a single API call to gpt-5-mini-2025-08-07 with the full consumer prompt. No web search — the model uses its training knowledge for pricing, competitors, and company facts. Surprisingly thorough for most popular services.

| Attribute | Value |
|---|---|
| Model | gpt-5-mini-2025-08-07 |
| Web searches | None |
| Avg article length | 1,700–2,100 words |
| Avg quality score | 9/10 |
| Cost per article | €0.009–0.013 |
| Avg generation time | 60–100 seconds |
| Parallelisable | Yes — 5 workers simultaneously |
| Cost × 1,000 articles | ≈ €9–13 |

Architecture

┌─────────────────────────────────────────────────────┐
│  SERVICE DATA (name, category, notes, keywords…)    │
└────────────────────────┬────────────────────────────┘
                         │
          ┌──────────────▼──────────────────┐
          │         SINGLE API CALL         │
          │                                 │
          │  Consumer Prompt Template       │  gpt-5-mini
          │  + Service context block        │  max_completion_tokens=8000
          │  + 14-section structure         │  streaming=true
          │  Model uses training knowledge  │
          │  for pricing + competitors      │
          └──────────────┬──────────────────┘
                         │
                         ▼
                  FINAL ARTICLE
     (HTML, ~1,900 words, 14 H2, 4 tables)
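The single-call step in the diagram can be sketched roughly as below, assuming the OpenAI Python SDK's chat-completions interface. The `build_request` helper and the `client` parameter (an `openai.OpenAI()` instance) are illustrative names, not part of the project; the model name, token limit, and streaming flag come from the settings above.

```python
MODEL = "gpt-5-mini-2025-08-07"

def build_request(prompt: str) -> dict:
    """Request parameters matching the V2 settings (single call, streamed)."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_completion_tokens": 8000,
        "stream": True,
    }

def generate_article(client, prompt: str) -> str:
    """One streaming call; concatenate the deltas into the final HTML article.

    `client` is assumed to be an openai.OpenAI() instance.
    """
    chunks = []
    for event in client.chat.completions.create(**build_request(prompt)):
        delta = event.choices[0].delta.content
        if delta:
            chunks.append(delta)
    return "".join(chunks)
```

Because V2 makes exactly one call per article with no tool use, the whole pipeline is just prompt assembly plus this request.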

Prompt Context Injected

V2 injects a "Service Context" block instead of live research:

```
══ SERVICE CONTEXT ══
Cancellation methods: phone, email, account-online
Notes: {service notes from JSON, first 200 chars}
Keywords: cancel netflix, how to cancel netflix australia, ...
```

The model draws on its training data for all other facts (company HQ, CEO, pricing history, competitors).
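Assembling that context block is simple string formatting. A minimal sketch, assuming the service JSON uses field names like `cancellation_methods`, `notes`, and `keywords` (the schema is an assumption; the 200-character notes truncation is from the template above):

```python
def build_service_context(service: dict) -> str:
    """Assemble the Service Context block injected into the V2 prompt.

    Field names (cancellation_methods, notes, keywords) are assumed;
    adapt to the actual service JSON schema.
    """
    notes = (service.get("notes") or "")[:200]  # first 200 chars only
    lines = [
        "══ SERVICE CONTEXT ══",
        "Cancellation methods: " + ", ".join(service.get("cancellation_methods", [])),
        "Notes: " + notes,
        "Keywords: " + ", ".join(service.get("keywords", [])),
    ]
    return "\n".join(lines)
```

The block is then appended to the consumer prompt template before the single API call.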

When to use V2

- Popular, well-known services where the model's training data reliably covers pricing and competitors
- Large batches where cost dominates: best cost/quality ratio at scale (≈ €9–13 per 1,000 articles)
- When ~90% of V1's quality is acceptable for 10× lower cost

When NOT to use V2

- Niche or local services the model may not know, where it risks hallucinating pricing
- Articles that need current, real-time pricing or verified source citations
Cost Breakdown

| Component | Tokens (avg) | Cost (USD) | Cost (EUR) |
|---|---|---|---|
| Prompt tokens | ~1,900 | $0.000475 | €0.00044 |
| Completion tokens | ~5,700 | $0.01140 | €0.01049 |
| **TOTAL per article** | ~7,600 | ~$0.012 | ~€0.011 |
| × 20 articles | ~152,000 | $0.24 | €0.22 |
| × 1,000 articles | ~7.6M | $11.87 | €10.92 |

Model pricing: Input $0.25/1M tokens, Output $2.00/1M tokens.
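The table's per-article figure follows directly from those token prices. A quick check, using the average token counts above:

```python
PRICE_IN = 0.25 / 1_000_000   # USD per input token (gpt-5-mini list price)
PRICE_OUT = 2.00 / 1_000_000  # USD per output token

def article_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one article at the stated gpt-5-mini token prices."""
    return prompt_tokens * PRICE_IN + completion_tokens * PRICE_OUT

per_article = article_cost_usd(1_900, 5_700)   # ≈ $0.011875, i.e. ~$0.012
per_thousand = per_article * 1_000             # ≈ $11.88 for 1,000 articles
```

Note that output tokens dominate: the ~5,700 completion tokens cost roughly 24× more than the ~1,900 prompt tokens.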

Python Script Reference

```shell
# Run V2 batch (all 20 sample services, 5 parallel workers)
python scripts/02_generate_v2.py
# Output: Billoff/data/results_v2.json

# Run 3-method comparison on 1 service
python scripts/test_compare_3methods.py --service "Netflix"
```

Parallelisation

```python
# V2 runs with 5 concurrent workers (no rate-limit concern for direct completions)
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 5
results = []

with ThreadPoolExecutor(max_workers=MAX_WORKERS) as ex:
    # Submit one generation job per service, keeping a future→service map
    futures = {ex.submit(process_service, svc): svc for svc in services}
    for future in as_completed(futures):
        results.append(future.result())
```

Quality Results (Ocado test)

| Metric | Result | vs V1 |
|---|---|---|
| Word count | 1,927 | −192 (−9%) |
| Tables | 4 | Same |
| H2 sections | 14 | Same |
| H3 sub-sections | 40 | +1 (better!) |
| Company fact box | Present | Same |
| FAQ | Present | Same |
| Quality score | 9/10 | −1 |
| Cost | €0.011 | 10× cheaper |
| Time | 95s | −31% faster |

Pros & Cons

| ✅ Pros | ❌ Cons |
|---|---|
| 10× cheaper than V1 | No real-time pricing data |
| Parallelisable (5+ workers) | Competitors may not have current prices |
| Consistently good structure (14 H2) | Company info may be outdated (training cutoff) |
| 9/10 quality score in tests | Less reliable for niche/local services |
| Best cost/quality ratio for scale | No verified source citations |
| Works offline (no web dependency) | May hallucinate pricing for obscure services |