Single-call generation using GPT-5 Mini's training knowledge. 10× cheaper than V1, 90% of the quality.
V2 makes a single API call to gpt-5-mini-2025-08-07 with the full consumer prompt. No web search — the model uses its training knowledge for pricing, competitors, and company facts. Surprisingly thorough for most popular services.
| Attribute | Value |
|---|---|
| Model | gpt-5-mini-2025-08-07 |
| Web searches | None |
| Avg article length | 1,700–2,100 words |
| Avg quality score | 9/10 |
| Cost per article | €0.009–0.013 |
| Avg generation time | 60–100 seconds |
| Parallelisable | Yes — 5 workers simultaneously |
| Cost per 1,000 articles | ≈ €9–13 |
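A back-of-the-envelope throughput estimate follows from the table's own figures (5 parallel workers, 60–100 s per article). The sketch below is illustrative arithmetic, not part of the pipeline code:

```python
# Wall-clock estimate for a 1,000-article batch, assuming the workers
# stay saturated (5 workers, ~60-100 s per article, per the table above).
def batch_hours(n_articles: int, workers: int, secs_per_article: float) -> float:
    return n_articles / workers * secs_per_article / 3600

low = batch_hours(1000, 5, 60)    # fast end: ~3.3 hours
high = batch_hours(1000, 5, 100)  # slow end: ~5.6 hours
print(f"1,000 articles: {low:.1f}-{high:.1f} hours")
```

So a full 1,000-article run fits comfortably in an overnight batch.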
```
┌─────────────────────────────────────────────────────┐
│   SERVICE DATA (name, category, notes, keywords…)   │
└────────────────────────┬────────────────────────────┘
                         │
          ┌──────────────▼──────────────────┐
          │         SINGLE API CALL         │
          │                                 │
          │  Consumer Prompt Template       │
          │  + Service context block        │
          │  + 14-section structure         │
          │                                 │
          │  gpt-5-mini                     │
          │  max_completion_tokens=8000     │
          │  streaming=true                 │
          │                                 │
          │  Model uses training knowledge  │
          │  for pricing + competitors      │
          └──────────────┬──────────────────┘
                         │
                         ▼
   FINAL ARTICLE (HTML, ~1,900 words, 14 H2, 4 tables)
```
V2 injects a "Service Context" block instead of live research:
```
══ SERVICE CONTEXT ══
Cancellation methods: phone, email, account-online
Notes: {service notes from JSON, first 200 chars}
Keywords: cancel netflix, how to cancel netflix australia, ...
```
The model draws on its training data for all other facts (company HQ, CEO, pricing history, competitors).
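Assembling that context block is plain string formatting. A minimal sketch, assuming the service record is a dict with `cancellation_methods`, `notes`, and `keywords` fields (the field names are illustrative; the actual JSON schema may differ):

```python
# Build the "Service Context" block from a service's JSON record.
# Field names here are assumptions, not the confirmed schema.
def build_service_context(service: dict) -> str:
    methods = ", ".join(service.get("cancellation_methods", []))
    notes = service.get("notes", "")[:200]   # first 200 chars only
    keywords = ", ".join(service.get("keywords", []))
    return (
        "══ SERVICE CONTEXT ══\n"
        f"Cancellation methods: {methods}\n"
        f"Notes: {notes}\n"
        f"Keywords: {keywords}"
    )

ctx = build_service_context({
    "cancellation_methods": ["phone", "email", "account-online"],
    "notes": "Streaming service; plans and prices change frequently.",
    "keywords": ["cancel netflix", "how to cancel netflix australia"],
})
```

The block is then concatenated into the consumer prompt in place of V1's live research output.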
| Component | Tokens (avg) | Cost (USD) | Cost (EUR) |
|---|---|---|---|
| Prompt tokens | ~1,900 | $0.000475 | €0.00044 |
| Completion tokens | ~5,700 | $0.01140 | €0.01049 |
| TOTAL per article | ~7,600 | ~$0.012 | ~€0.011 |
| × 20 articles | ~152,000 | $0.24 | €0.22 |
| × 1,000 articles | ~7.6M | $11.87 | €10.92 |
Model pricing: Input $0.25/1M tokens, Output $2.00/1M tokens.
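The per-article figures in the table fall out directly from that pricing. A quick check in Python (token counts are the table's averages):

```python
# Cost per article from the stated pricing:
# input $0.25 / 1M tokens, output $2.00 / 1M tokens.
IN_PER_TOKEN = 0.25 / 1_000_000
OUT_PER_TOKEN = 2.00 / 1_000_000

def article_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    return prompt_tokens * IN_PER_TOKEN + completion_tokens * OUT_PER_TOKEN

cost = article_cost_usd(1_900, 5_700)   # ~$0.0119 per article
print(f"per article: ${cost:.4f}, per 1,000 articles: ${cost * 1000:.2f}")
```

Note that output tokens dominate: completion tokens account for roughly 96% of the per-article cost.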
```bash
# Run V2 batch (all 20 sample services, 5 parallel workers)
python scripts/02_generate_v2.py
# Output: Billoff/data/results_v2.json

# Run 3-method comparison on 1 service
python scripts/test_compare_3methods.py --service "Netflix"
```
```python
# V2 runs with 5 concurrent workers (no rate-limit concern for direct completions)
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 5

results = []
with ThreadPoolExecutor(max_workers=MAX_WORKERS) as ex:
    futures = {ex.submit(process_service, svc): svc for svc in services}
    for future in as_completed(futures):
        results.append(future.result())
```
| Metric | Result | vs V1 |
|---|---|---|
| Word count | 1,927 | −192 (−9%) |
| Tables | 4 | Same |
| H2 sections | 14 | Same |
| H3 sub-sections | 40 | +1 (better!) |
| Company fact box | ✅ | Same |
| FAQ | ✅ | Same |
| Quality score | 9/10 | −1 |
| Cost | €0.011 | 10× cheaper |
| Time | 95s | −31% faster |
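The structural metrics above (word count, H2/H3 sections, tables) can be counted mechanically from the generated HTML. A hypothetical sketch using regex tag counting, not the project's actual scoring code:

```python
# Count structural features of a generated article's HTML.
# Illustrative only; the real quality-scoring logic may differ.
import re

def article_metrics(html: str) -> dict:
    text = re.sub(r"<[^>]+>", " ", html)  # strip tags for the word count
    return {
        "words": len(text.split()),
        "h2": len(re.findall(r"<h2\b", html, re.I)),
        "h3": len(re.findall(r"<h3\b", html, re.I)),
        "tables": len(re.findall(r"<table\b", html, re.I)),
    }

m = article_metrics("<h2>How to cancel</h2><p>Call support.</p><table></table>")
# m["h2"] == 1, m["tables"] == 1, m["words"] == 5
```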
| ✅ Pros | ❌ Cons |
|---|---|
| 10× cheaper than V1 | No real-time pricing data |
| Parallelisable (5+ workers) | Competitor pricing may be out of date |
| Consistently good structure (14 H2) | Company info may be outdated (training cutoff) |
| 9/10 quality score in tests | Less reliable for niche/local services |
| Best cost/quality ratio for scale | No verified source citations |
| Works offline (no web dependency) | May hallucinate pricing for obscure services |