Google's fast multimodal model rewrites existing seo_content into a structured Billoff article — any language, any country. Mandatory 3-table schema, country-law injection, and config-tuned for factual stability (temp 0.4).
V3 rewrites the existing seo_content from your service JSON using Google's Gemini 2.5 Flash. No web search is involved; instead, V3 offers 4 article plan choices (A–D) and 3 built-in improvements for structural integrity and legal accuracy across multiple countries.
| Attribute | Value |
|---|---|
| Model | gemini-2.5-flash (Google AI / Gemini API) |
| API endpoint | generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent |
| Web search | None |
| Input source | Existing seo_content from service JSON (up to 4,000 chars) |
| Article structure | 13 sections (Plans A/B/C/D) — includes mandatory FAQ §12 + 3 mandatory tables |
| Streaming | Yes — SSE via ?alt=sse parameter |
| Max output tokens | 24 576 (with thinkingBudget: 3072) |
| Temperature | 0.4 (reduced for factual stability) |
| Top-P | 0.85 (tighter sampling — reduces hallucinated facts) |
| Avg article length | 2 800–3 400 words (highest of all rewrite methods) |
| Avg quality score | 9/10 |
| Cost per article | ≈ €0.012 (~8× cheaper than V1) |
| Avg generation time | 30–40 seconds |
| Cost × 1 000 articles | ≈ €12 |
| Cost × 50 000 articles | ≈ €600 |
```
┌─────────────────────────────────────────────────────────┐
│ SERVICE DATA (name, category, website, keywords, etc.)  │
└────────────────────────┬────────────────────────────────┘
                         │
          ┌──────────────▼──────────────────┐
          │ REWRITE INPUT BLOCK             │
          │ • Existing seo_content (~4K ch) │
          │ • Service metadata              │
          │ • Cancellation methods          │
          │ • Keywords                      │
          └──────────────┬──────────────────┘
                         │
          ┌──────────────▼──────────────────┐
          │ REWRITE PHASE                   │  gemini-2.5-flash
          │ Rewrite Prompt (Plan A/B/C/D)   │  maxOutputTokens=24576
          │ + system_instruction (brand     │  thinkingBudget=3072
          │   + registry + 3 tables         │  temperature=0.4, topP=0.85
          │   + country-law + persona)      │
          └──────────────┬──────────────────┘
                         │
                         ▼
  REWRITTEN ARTICLE (<h1> creative title + 13 H2,
  ~3 000 words, 3 mandatory tables, FAQ)
```
Gemini uses a different API format from OpenAI. The prompt is wrapped in a contents array with parts objects. The system instruction goes in a separate system_instruction field and is handled inside the Cloudflare proxy.
- `Billoff/functions/api/gemini.js` — Cloudflare Pages Function proxy (injects GEMINI_API_KEY server-side)
- `Billoff/web/assets/openai.js` → `streamGemini()` + `generateV3()` — browser-side generator
- `Billoff/scripts/03_generate_v3.py` — Python batch generator

The Gemini API key is never exposed to the browser. All calls go through `/api/gemini`:
```js
// functions/api/gemini.js — key logic
export async function onRequestPost(context) {
  const { request, env } = context;
  const apiKey = env.GEMINI_API_KEY; // Cloudflare secret
  const { model, ...geminiBody } = await request.json();
  const url =
    `https://generativelanguage.googleapis.com/v1beta/models/` +
    `${model}:streamGenerateContent?key=${apiKey}&alt=sse`;
  const upstream = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(geminiBody),
  });
  return new Response(upstream.body, {
    headers: { 'Content-Type': 'text/event-stream' },
  });
}
```
```js
// openai.js — streamGemini()
await fetch('/api/gemini', {
  method: 'POST',
  body: JSON.stringify({
    model: 'gemini-2.5-flash',
    contents: [{ role: 'user', parts: [{ text: prompt }] }],
    generationConfig: { maxOutputTokens: 8192, temperature: 0.7 },
  }),
});
```
Gemini with ?alt=sse returns standard SSE. Each data: line contains a JSON object with incremental text and — in the last chunk — usage metadata:
```js
// Each SSE chunk
data: {
  "candidates": [{ "content": { "parts": [{ "text": "Hello" }] } }],
  "usageMetadata": {              // only in last chunk
    "promptTokenCount": 2847,
    "candidatesTokenCount": 512,
    "totalTokenCount": 3359
  }
}
```
The client-side parser in streamGemini() normalises these to {type:'token', content} and {type:'usage', usage:{prompt_tokens, completion_tokens}} — the same format used by V1/V2.
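As a sketch of that normalisation step, the following Python helper (a hypothetical stand-in for the JavaScript logic inside `streamGemini()`) maps one parsed `data:` payload to the V1/V2 event shapes described above:

```python
import json

def normalise_gemini_chunk(data_line: str) -> list[dict]:
    """Map one Gemini SSE `data:` payload to V1/V2-style events.

    Illustrative only — mirrors what streamGemini() does client-side.
    """
    chunk = json.loads(data_line)
    events = []
    # Incremental text lives under candidates[].content.parts[].text
    for cand in chunk.get("candidates", []):
        for part in cand.get("content", {}).get("parts", []):
            if "text" in part:
                events.append({"type": "token", "content": part["text"]})
    # usageMetadata only appears in the final chunk
    meta = chunk.get("usageMetadata")
    if meta:
        events.append({"type": "usage", "usage": {
            "prompt_tokens": meta.get("promptTokenCount", 0),
            "completion_tokens": meta.get("candidatesTokenCount", 0),
        }})
    return events
```

Feeding the sample chunk above through this helper yields one `token` event followed by one `usage` event, matching the V1/V2 stream contract.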
V3 uses the shared 14-section Consumer Prompt Template (same as V2) with withResearch: false:
```js
// Base brand instruction
"You are a senior SEO content strategist at Billoff (billoff.com). "
"'Billoff' MUST appear at least 4 times. NEVER write 'Postclic'. "
"Output pure HTML starting with <h1> (creative title) then <h2> sections."

// + SERVICE REGISTRY block (improvement 1)
"══ VERIFIED SERVICE REGISTRY ══\nnotice_period: 30\nearly_exit_fee: null\n..."
"Rule: null → write 'check current terms' — NEVER estimate."

// + MANDATORY 3-TABLE SCHEMA (improvement 2)
"TABLE_1 — Cancellation methods: columns Method | Steps | Cost | Notes"
"TABLE_2 — Cost scenarios: columns Scenario | Cost | Legal basis"
"TABLE_3 — Country-specific rights: Situation | Your right | Law | Action"

// + COUNTRY-LAW BLOCK (improvement 3)
"══ COUNTRY REGULATORY CONTEXT — France (FR) ══"
"Law: Loi Hamon / Code de la Consommation"
"Cooling-off: 14 days (EU distance contracts)"
"Notice: 30 days monthly / 60 days annual"
"Key rights: Reconduction tacite must be notified..."
"Cite in TABLE_3: 'Under Loi Hamon, you have the right to...'"

// + WRITING PERSONA block (e.g. cancellation_specialist)
```
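To make the assembly order concrete, here is a minimal Python sketch of how these blocks could be concatenated into one system instruction. The function name and signature are hypothetical; only the block contents come from the excerpt above:

```python
def build_system_instruction(registry: dict, country_block: str, persona: str) -> str:
    """Illustrative assembly of the V3 system instruction (names are
    assumptions, not the actual generator's API)."""
    base = (
        "You are a senior SEO content strategist at Billoff (billoff.com). "
        "'Billoff' MUST appear at least 4 times. NEVER write 'Postclic'. "
        "Output pure HTML starting with <h1> (creative title) then <h2> sections."
    )
    # null registry values must be rendered literally so the model
    # falls back to "check current terms" instead of estimating
    reg_lines = "\n".join(
        f"{k}: {'null' if v is None else v}" for k, v in registry.items()
    )
    registry_block = (
        "══ VERIFIED SERVICE REGISTRY ══\n" + reg_lines +
        "\nRule: null → write 'check current terms' — NEVER estimate."
    )
    return "\n\n".join([base, registry_block, country_block, persona])
```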
Heading capitalisation — enforced in system instruction + QUALITY CHECK self-scan: sentence case on every H1/H2/H3/H4. First word capitalised, proper nouns capitalised, everything else lowercase. Post-generation sanitiser applies as final safety net.
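A naive version of that sanitiser can be sketched as follows. This is an assumption about its behaviour, not the production code — the real sanitiser would need a fuller proper-noun list and HTML-aware parsing:

```python
def sentence_case(heading: str, proper_nouns: tuple = ("Billoff",)) -> str:
    """Naive sentence-case pass: first word capitalised, listed proper
    nouns preserved, everything else lowercased. Sketch only."""
    out = []
    for i, word in enumerate(heading.split()):
        if word in proper_nouns:
            out.append(word)            # keep proper nouns as-is
        elif i == 0:
            out.append(word[:1].upper() + word[1:].lower())
        else:
            out.append(word.lower())
    return " ".join(out)
```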
Why a system instruction for Gemini? Gemini uses a dedicated system_instruction field at the top level of the request, handled by the Cloudflare proxy. The instruction ensures consistent brand naming, mandatory tables, legal accuracy per country, and writing persona.
V3 is fully locale-aware. Add locale fields to each service row — get_locale(svc) resolves them and injects country, language, currency and consumer-law references directly into the rewrite prompt.
| Field | Example (AU) | Example (ES) | Effect in prompt |
|---|---|---|---|
| country | Australia | Spain | Market name in mission & rights section |
| language | English | Spanish | "LANGUAGE: Spanish only — all headings, tables, text" |
| currency / currency_symbol | AUD / A$ | EUR / € | Pricing table headers and inline references |
| cancel_word | Cancel | Cancelación | How-to-cancel section headings |
| consumer_law | ACL/ACCC | Ley General para la Defensa de los Consumidores | Consumer rights section (§7) |
Output JSON includes cost_usd, cost_eur (thinking tokens billed at output rate), and elapsed per article. Thinking budget is configurable: set THINKING_BUDGET = 0 to disable (faster, slightly cheaper).
Fallback chain: svc.country → editor config → script defaults. Any missing field is silently defaulted.
Pricing source: ai.google.dev/gemini-api/docs/pricing (verified Feb 2026)
| Component | Tokens | Rate | Cost (USD) | Cost (EUR) |
|---|---|---|---|---|
| Input (prompt) | ~2 800 | $0.30 / 1M | $0.00084 | €0.00077 |
| Output (article) | ~1 600 | $2.50 / 1M | $0.00400 | €0.00368 |
| TOTAL / article | ~4 400 | — | ~$0.0048 | ~€0.0044 |
| × 20 articles | — | — | $0.097 | €0.089 |
| × 1 000 articles | — | — | $4.84 | €4.45 |
| × 50 000 articles | — | — | $242 | €222 |
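The per-article arithmetic behind this table is simple to reproduce. Token counts and per-million rates are taken from the rows above; the EUR conversion factor is an approximation inferred from the table, not an official rate:

```python
# Rates from the pricing table above (USD per 1M tokens)
INPUT_RATE = 0.30    # gemini-2.5-flash input
OUTPUT_RATE = 2.50   # gemini-2.5-flash output
EUR_PER_USD = 0.92   # approximate conversion implied by the table

def article_cost_usd(prompt_tokens: int = 2800, output_tokens: int = 1600) -> float:
    """Cost of one article at the average token counts in the table."""
    return prompt_tokens / 1e6 * INPUT_RATE + output_tokens / 1e6 * OUTPUT_RATE

cost = article_cost_usd()        # ≈ $0.0048 per article
batch_1000 = cost * 1000         # ≈ $4.84 per 1 000 articles
```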
No web search cost — all content is generated from model training knowledge.
Note on thinking tokens: Gemini 2.5 Flash may use internal reasoning tokens (not billed separately for Flash). Usage metadata reflects only billed tokens.
```bash
# From Billoff/ directory
python scripts/03_generate_v3.py

# Compare all 5 methods on 1 service
python scripts/test_compare_3methods.py
```
```python
# generate_phase(service_dict)
#   Uses google-generativeai SDK or direct REST
#   Model: gemini-2.5-flash
#   Returns (html_content, usage_dict)
#   usage_dict: {prompt_tokens: N, completion_tokens: N}
```
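For the direct-REST path, the request body `generate_phase()` would POST to `models/gemini-2.5-flash:streamGenerateContent` can be sketched as below. The helper name is hypothetical; the field names follow the Gemini v1beta REST schema and the generation settings quoted earlier in this page:

```python
def build_v3_request(prompt: str, system_instruction: str) -> dict:
    """Sketch of the V3 REST request body (Gemini v1beta field names)."""
    return {
        # system_instruction is a top-level field, separate from contents
        "system_instruction": {"parts": [{"text": system_instruction}]},
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "maxOutputTokens": 24576,
            "temperature": 0.4,
            "topP": 0.85,
            "thinkingConfig": {"thinkingBudget": 3072},
        },
    }
```

Set `thinkingBudget` to 0 here to mirror the `THINKING_BUDGET = 0` option described above.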
```bash
GEMINI_API_KEY=AIzaSy...   # Or set in config/scraper_config.py
# Cloudflare: set as GEMINI_API_KEY secret in Pages project settings
```
| ✅ Pros | ❌ Cons |
|---|---|
| Ultra-cheap: €0.003/article (30× cheaper than V1) | No real-time data (pricing, competitors) |
| Very fast: 15–25s/article | No company fact box (requires web research) |
| Native multimodal model — excellent HTML output | May hallucinate prices for obscure services |
| No web dependency — works fully offline | Training cutoff limits knowledge recency |
| Massive parallelisation possible (200+ workers) | Brand enforcement requires explicit system instruction |
| 1M token context window (very long services OK) | Score 7–8/10 vs 9–10/10 for V1 |
| Best for long-tail / niche services at scale | Different API format (requires Gemini-specific proxy) |
| Scenario | Recommended method |
|---|---|
| Top 100 highest-volume services | V1 (web research) |
| Mid-tier 500 services | V2 or V4 |
| Long-tail 2 000+ services | V3 ← ideal |
| Rapid prototyping / testing | V3 ← ideal |
A writing persona is appended to the system instruction for every article. Set PERSONA_ID at the top of the downloaded Python script to switch voice. Available: cancellation_specialist (default), consumer_rights_expert, contract_lawyer, financial_advisor. See V1 Docs for the full persona reference.
Once all 4 methods complete, the Lab runs a 2-phase parallel analysis and displays a full comparison report. The analysis is automatically saved to history alongside the article results.
| Phase | Model | Role |
|---|---|---|
| Phase 1 (parallel) | claude-haiku-4-5 × 4 | One eval per method → structured JSON (scores, E-E-A-T, improvements) |
| Phase 2 (streaming) | claude-sonnet-4-6 + Extended Thinking | Comparative synthesis → full HTML report (scorecard · E-E-A-T · winners · recommendation · improvement plan) |
Total cost per analysis: ≈€0.04–0.06. See V1 Docs for the full specification.