4-pass live web search via GPT-4.1 Responses API + 1 structured-data extraction pass (Pass 5), then 2,000+ word article generation via GPT-5 Mini with pre-verified data injected.
V1 is the most comprehensive and expensive method. It performs 4 real-time web searches before generating the article, ensuring the content reflects current pricing, real competitors, and actual customer reviews.
| Attribute | Value |
|---|---|
| Research model | gpt-4.1 (via Responses API with web_search_preview) |
| Writing model | gpt-5-mini-2025-08-07 |
| Web searches | 4 passes (cancellation/reviews, pricing, competitors, company info) |
| Pass 5 – data extraction | Non-streaming JSON call: extracts 6 structured fields from raw research before generation |
| Avg article length | 1,900–2,400 words |
| Avg quality score | 9–10/10 |
| Cost per article | €0.09–0.12 (mainly from 4× web search at $0.025 each) |
| Avg generation time | 90–150 seconds |
| Cost × 1,000 articles | ≈ €100–120 |

    ┌────────────────────────────────────────────────────────┐
    │ SERVICE DATA (from sample_services.json)               │
    │ + locale fields: country, language, currency, etc.     │
    └───────────────────────────┬────────────────────────────┘
                                │ get_locale(svc) → resolves
                                │ country / currency / consumer_law
                                ▼
    ┌────────────────────────────────────────────────────────┐
    │ RESEARCH PHASE                                         │
    │   Pass 1: Cancellation + Reviews    gpt-4.1            │
    │   Pass 2: {currency} Pricing        + web_search_preview
    │   Pass 3: 3–5 Competitors           + user_location (geo)
    │   Pass 4: Company Structured Info   Responses API      │
    └───────────────────────────┬────────────────────────────┘
                                │ raw research text (all 4 passes)
                                ▼
    ┌────────────────────────────────────────────────────────┐
    │ PASS 5 – STRUCTURED EXTRACTION   gpt-4.1 (non-streaming)
    │ Extract from raw text:                                 │
    │   notice_period_days (int)                             │
    │   cancellation_channels (list)                         │
    │   refund_eligibility_days (int)                        │
    │   statutory_cooling_off_days (int)                     │
    │   country_specific_law (str)                           │
    │   effective_date (str)                                 │
    └───────────────────────────┬────────────────────────────┘
                                │ research._structured (pre-verified JSON)
                                ▼
    ┌────────────────────────────────────────────────────────┐
    │ GENERATION PHASE                                       │
    │   Consumer Prompt Template       gpt-5-mini            │
    │   + Company info block           max_completion_tokens=8000
    │   + Research JSON (5,500 chars)  streaming=true        │
    │   + Pre-extracted schema (Pass 5)                      │
    │   + 14-section structure                               │
    └───────────────────────────┬────────────────────────────┘
                                ▼
    FINAL ARTICLE: <h1> creative title + 14 H2, ~2,000+ words, 3+ tables
- `Billoff/scripts/01_generate_v1.py` – Python batch generator
- `Billoff/web/assets/openai.js` → `generateV1()` – browser-side generator with streaming
- `Billoff/scripts/config.py` – prompts, models, cost tables

Queries are now jurisdiction-specific: they embed the country ISO code, the current year, and market-specific review sites (ProductReview.com.au for AU, Avis Vérifiés for FR, Trusted Shops for DE, etc.).

    # Template – variables resolved at runtime via get_locale()
    # search_context_size is tier-based: "high" services → one tier below, "low" services → "low"

    {cancel_word} "{name}" {country_name} ({country_code}) {year}.
    Find: ALL cancellation methods (web, app, email, phone, post), exact notice
    period in DAYS (integer), early termination fee amount in {currency},
    cooling-off rights under {consumer_law}, refund window in days, regulatory
    body name. Also find: overall rating and key complaint themes from TrustPilot,
    local review sites (e.g. ProductReview.com.au / Avis Vérifiés / Trusted Shops)
    or app stores.

    Return JSON:
    {"cancellation":{"methods":[...],"steps":"...","notice_period":"...",
      "notice_period_days":null,"early_termination_fee":null,"cooling_off_days":null,
      "refund_window_days":null},"refund":{"policy":"...","window":"..."},
     "reviews":{"rating":"X/5","review_count":"~X","positive":[...],"negative":[...]}}

    # geo-targeted: user_location {type:"approximate",city,region,country:country_code,timezone}
    # search_context_size: "high" tier → "medium" | "medium/low" tier → "low"
After all 4 web searches, a dedicated non-streaming API call extracts structured fields from the raw research text. This pre-verified data is then injected into the generation prompt so the writing model never has to guess numeric facts.

    # System prompt for extraction (gpt-4.1, non-streaming, ~$0.001):
    Extract a JSON object from the research text with keys:
      notice_period_days (int | null)
      cancellation_channels (list of strings | null)
      refund_eligibility_days (int | null)
      statutory_cooling_off_days (int | null)
      country_specific_law (str | null)
      effective_date (str | null)
    Return null for unknown fields. Do NOT generate – only extract.

    # Result injected into generation prompt as:
    ── PRE-EXTRACTED STRUCTURED DATA (verified from research) ──
    notice_period_days: 30 | refund_eligibility_days: 14 | ...
    Embed these values as-is – do not contradict them.
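The handling on the receiving side is straightforward; here is a minimal sketch of how the Pass 5 output might be parsed and rendered into the injection block. The function names and the plain-ASCII block delimiters are illustrative, not the script's actual code:

```python
import json

# Fields Pass 5 is asked to return; missing or unparsable output falls back to None.
EXPECTED_FIELDS = (
    "notice_period_days",
    "cancellation_channels",
    "refund_eligibility_days",
    "statutory_cooling_off_days",
    "country_specific_law",
    "effective_date",
)

def parse_extraction(raw: str) -> dict:
    """Parse the Pass 5 model output, tolerating bad JSON and missing keys."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {field: data.get(field) for field in EXPECTED_FIELDS}

def injection_block(structured: dict) -> str:
    """Render only the known (non-null) values as the prompt block shown above."""
    known = [f"{k}: {v}" for k, v in structured.items() if v is not None]
    return (
        "-- PRE-EXTRACTED STRUCTURED DATA (verified from research) --\n"
        + " | ".join(known)
        + "\nEmbed these values as-is - do not contradict them."
    )
```

Because unknown fields stay `None`, the writing model only ever sees values that were actually found in the research text.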

    Current {currency} pricing plans for "{name}" in {country_name} as of {year}.
    List ALL plans (basic, standard, premium, business) with exact monthly AND
    annual prices in {currency}. Include any free trial or free tier. Flag
    usage-based services with typical monthly total.

    Return JSON:
    {"pricing":{"items":[{"plan":"...","price_monthly":"{symbol}XX/mo",
      "price_annual":"{symbol}XX/yr","savings_annual":"save {symbol}XX","features":"..."}],
     "free_trial":"yes/no, X days","typical_monthly":"{symbol}XX"}}

    # search_context_size: "medium" – needs to visit actual pricing pages

    Find 3 to 5 real competitors to "{name}" in {country_name} ({category} category).
    For each: name, official {tld} website, REAL {currency} price/month.
    PRICING RULE: subscription → cheapest paid plan | utility → typical monthly bill |
    unknown → estimate "from ~{symbol}XX/mo" – NEVER "Varies by plan".

    Return JSON:
    {"competitors":[{"name":"...","website":"https://...","price_monthly":"{symbol}XX",
      "price_annual":"...","free_trial":"yes/no","cancel_difficulty":"Low|Medium|High",
      "key_difference":"...","pros":[...],"cons":[...],"best_for":"..."}]}

    Structured company information for "{name}" operating in {country_name}:
    full legal name, parent company, HQ, CEO, founders, founded year, employees,
    support email, active users/month, App Store rating, Google Play rating,
    annual revenue, stock ticker.

    Return JSON:
    {"company_info":{"full_legal_name":"...","ceo":"...","headquarters":"...", ...}}
V1 is fully locale-aware. Every web search is geo-targeted and every generated article uses the correct language, currency, and consumer law for the target country. No code change is needed – just add locale fields to each service row in your JSON.
| Field | Example (AU) | Example (FR) | Notes |
|---|---|---|---|
| `country` | Australia | France | Full country name injected into the article |
| `country_code` | AU | FR | ISO 3166-1 alpha-2 – used for geo-targeting |
| `language` | English | French | Article language instruction |
| `currency` | AUD | EUR | Currency code for pricing queries |
| `currency_symbol` | A$ | € | Symbol used inline in search queries |
| `country_tld` | .com.au | .fr | Guides competitor website lookup |
| `cancel_word` | Cancel | Résiliation | Native-language cancellation term for search queries |
| `consumer_law` | Australian Consumer Law (ACL/ACCC) | French Consumer Code (DGCCRF) | Referenced in rights section of article |
| `city` / `region` / `timezone` | Sydney / New South Wales / Australia/Sydney | Paris / Île-de-France / Europe/Paris | Used for `user_location` geo-targeting in web searches |
Fields are optional: if missing, the script falls back to your editor config (set once in the Prompt Editor). Priority chain: svc.country → editor config → script defaults.
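That fallback chain can be sketched in a few lines. The names `EDITOR_CONFIG` and `SCRIPT_DEFAULTS` are illustrative stand-ins, and the field set is abbreviated:

```python
# Illustrative stand-ins for the Prompt Editor config and the script defaults.
SCRIPT_DEFAULTS = {"country": "United States", "currency": "USD", "currency_symbol": "$"}
EDITOR_CONFIG = {"currency_symbol": "US$"}  # set once in the Prompt Editor

def resolve_locale(svc: dict) -> dict:
    """Per field: the service row wins, then editor config, then script defaults."""
    return {
        field: svc.get(field) or EDITOR_CONFIG.get(field) or default
        for field, default in SCRIPT_DEFAULTS.items()
    }
```

A row that sets only `country` and `currency` still resolves a complete locale, taking `currency_symbol` from the editor config.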
All search query templates (customisable in the Editor) support these placeholders, resolved per service at runtime:
{name} → service name
{category} → service category
{country_name} → e.g. "France"
{currency} → e.g. "EUR"
{symbol} → e.g. "€"
{cancel_word} → e.g. "Résiliation"
{tld} → e.g. ".fr"
{language} → e.g. "French"
{consumer_law} → e.g. "French Consumer Code (DGCCRF)"
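Resolution is plain string substitution; a minimal sketch using `str.format` follows. The template and the helper name are abbreviated examples, not the script's real ones (the actual templates live in `config.py` / the Editor):

```python
import datetime

# Abbreviated example template with a few of the placeholders listed above.
TEMPLATE = '{cancel_word} "{name}" {country_name} ({country_code}) {year}, prices in {currency} ({symbol})'

def resolve_query(template: str, svc: dict) -> str:
    """Substitute per-service placeholders; {year} is always the current year."""
    return template.format(
        name=svc["name"],
        category=svc.get("category", ""),
        country_name=svc["country"],
        country_code=svc["country_code"],
        currency=svc["currency"],
        symbol=svc["currency_symbol"],
        cancel_word=svc["cancel_word"],
        tld=svc.get("country_tld", ""),
        language=svc.get("language", "English"),
        consumer_law=svc.get("consumer_law", ""),
        year=datetime.date.today().year,
    )
```

Unused keyword arguments are ignored by `str.format`, so one resolver can serve all four query templates regardless of which placeholders each one uses.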
Each of the 4 web search passes includes a user_location object passed to the web_search_preview tool, steering OpenAI's search toward local sources:

    // Automatically built from locale fields:
    user_location: {
      type: "approximate",
      city: "Paris",
      region: "Île-de-France",
      country: "FR",          // ISO country code
      timezone: "Europe/Paris",
    }
# sample_services.json – mix countries freely in one file
{
"services": [
{ "name": "Netflix", "category": "tv-streaming",
"country": "France", "country_code": "FR",
      "language": "French", "currency": "EUR", "currency_symbol": "€",
      "country_tld": ".fr", "cancel_word": "Résiliation",
      "consumer_law": "French Consumer Code (DGCCRF)",
      "city": "Paris", "region": "Île-de-France", "timezone": "Europe/Paris" },
{ "name": "Netflix", "category": "tv-streaming",
"country": "Australia", "country_code": "AU",
"language": "English", "currency": "AUD", "currency_symbol": "A$",
"country_tld": ".com.au", "cancel_word": "Cancel",
"consumer_law": "Australian Consumer Law (ACL/ACCC)",
"city": "Sydney", "region": "New South Wales", "timezone": "Australia/Sydney" }
]
}
Each service gets an article in the correct language, currency and legal context – fully automated, no prompt editing required per country.
The full prompt injected into GPT-5 Mini after research. Key characteristics:

    (Company fact box rendered as <table class="company-facts">)

    ── VOICE & STYLE ──
    Direct address ("you"), contractions, varied rhythm, genuine opinions.
    Banned: "Furthermore", "Moreover", "In conclusion", "Navigating", "Delve into"…

    ── TITLE ──
    8 possible angles: Problem-first | Speed | Trap alert | Savings | Rights-first | …
    FORBIDDEN pattern: "How to Cancel [name]: [subtitle]"

    ── SERVICE DATA ──
    Name, Category, Website, Main Keyword, Notes, Cancellation Address, Currency

    ── PRE-EXTRACTED STRUCTURED DATA (Pass 5) ──
    notice_period_days, cancellation_channels, refund_eligibility_days, …

    ── STRUCTURED COMPANY DATA ──
    (From Pass 4 research: CEO, HQ, Founded, Employees, Ratings, Revenue…)

    ── LIVE RESEARCH DATA ──
    (From Passes 1–3: cancellation steps, refund policy, reviews, pricing, competitors)

    ── STRUCTURE – 14 SECTIONS ──
    (Detailed instructions for each H2 with required H3 sub-sections)

    ── HTML RULES ──
    (Tables, links, brand, language, target length, tone, persona voice)

    ── HTML RULES – enforced ──
    Sentence case on ALL headings (first word + proper nouns only – no Title Case).
    Title bank: 8 angles in sentence case – model picks the best fit.

    ── QUALITY CHECK (self-verify before output) ──
    Creative H1? 14 H2? ≥2 H3 per H2? Company fact table? Tables in 2/3/12? FAQ?
    1,600+ words? No banned phrases? Writing persona applied?
    Sentence case on every heading (H1/H2/H3)?
A writing persona is injected into the system message for every article. It shapes tone, vocabulary, logical connectors, and structural approach. The active persona is set in the Python script via PERSONA_ID.
| PERSONA_ID | Tone | Best for |
|---|---|---|
| `cancellation_specialist` (default) | Friendly expert, practical, step-by-step | How-to guides, trap-avoidance content |
| `consumer_rights_expert` | Reassuring, empowering, rights-focused | Legal/consumer protection angle |
| `contract_lawyer` | Precise, methodical, authoritative | High-value / complex contract services |
| `financial_advisor` | Data-driven, comparative, savings-focused | Cost analysis, subscription audits |
Personas are baked into the downloaded Python script. Change PERSONA_ID = "..." at the top of the script to switch voice for a batch. The persona block is appended to the system message – it does not override brand, HTML or locale rules.
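The assembly order is the important detail: brand, HTML and locale rules come first, the persona block is appended after them. A minimal sketch (the persona texts are abbreviated paraphrases of the table above, and the block delimiter is illustrative):

```python
# Abbreviated persona texts; the real blocks ship inside the downloaded script.
PERSONAS = {
    "cancellation_specialist": "Friendly expert. Practical, step-by-step voice.",
    "consumer_rights_expert": "Reassuring, empowering, rights-focused voice.",
    "contract_lawyer": "Precise, methodical, authoritative voice.",
    "financial_advisor": "Data-driven, comparative, savings-focused voice.",
}
PERSONA_ID = "cancellation_specialist"  # edit at the top of the script to switch voice

def build_system_message(base_rules: str, persona_id: str = PERSONA_ID) -> str:
    """Brand / HTML / locale rules come first; the persona block is appended last."""
    return base_rules + "\n\n-- WRITING PERSONA --\n" + PERSONAS[persona_id]
```

Appending (rather than prepending) keeps the enforced rules at the top of the system message, so the persona can shape tone without overriding them.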
| Component | Model | Cost (USD) | Cost (EUR) |
|---|---|---|---|
| Web search × 4 | gpt-4.1 Responses API | $0.100 | €0.092 |
| Pass 5 – data extraction (~500 in + 200 out tokens) | gpt-4.1 | $0.003 | €0.003 |
| Writing (~3,000 prompt + 5,500 completion tokens) | gpt-5-mini | $0.011 | €0.010 |
| TOTAL per article | – | ~$0.114 | ~€0.105 |
| × 20 articles | – | $2.28 | €2.10 |
| × 1,000 articles | – | $114 | €105 |
Note: ~88% of the cost comes from the 4 web searches; Pass 5 and generation are comparatively cheap.
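The totals above can be reproduced directly from the line items (the constant names are illustrative; the figures are the table's own):

```python
# Line items from the cost table above, in USD.
SEARCH_COST = 0.025      # per web_search_preview call
EXTRACTION_COST = 0.003  # Pass 5 structured extraction
WRITING_COST = 0.011     # gpt-5-mini generation

def cost_per_article(searches: int = 4) -> float:
    """Per-article cost: N web searches + extraction + generation."""
    return searches * SEARCH_COST + EXTRACTION_COST + WRITING_COST

total = cost_per_article()              # ≈ 0.114 USD per article
search_share = 4 * SEARCH_COST / total  # ≈ 0.88 -> the "~88%" note above
```

Scaling is linear: 1,000 articles ≈ $114, matching the table.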

    # From Billoff/ directory
    python scripts/01_generate_v1.py

    # Test 2 random services with full debug logs
    python scripts/test_v1_dry_run.py

    # Research-only (no generation)
    python scripts/test_v1_dry_run.py --research-only

    # Test specific service
    python scripts/test_v1_dry_run.py --service "Spotify"

    # research_phase(service_name, category, country, currency, symbol, verbose)
    #   → Returns (research_dict, web_cost_usd)
    #   research_dict keys: cancellation, refund, reviews, pricing,
    #   competitors, company_info, _verified_sources

    # generate_phase(service_dict, research_dict)
    #   → Returns (html_content, tokens_dict)
    #   tokens_dict: {prompt: N, completion: N}
OPENAI_API_KEY=sk-proj-...
# Or set in config/scraper_config.py as OPENAI_API_KEY
| Metric | Result | Target |
|---|---|---|
| Word count | 2,119 | 1,600+ |
| Tables | 4 | 3+ |
| H2 sections | 14 | 14 |
| H3 sub-sections | 39 | 28+ |
| FAQ section | ✓ Present | Required |
| Company fact box | ✓ Present | Required |
| Billoff mentions | 2 | 2+ |
| Quality score | 10/10 | 9+ |
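The structural targets in that table can be checked mechanically. A minimal sketch using regex counts (the function names are illustrative; thresholds are the table's own):

```python
import re

def article_metrics(html: str) -> dict:
    """Count the structural elements the targets above refer to."""
    text = re.sub(r"<[^>]+>", " ", html)  # strip tags before counting words
    return {
        "words": len(text.split()),
        "h2": len(re.findall(r"<h2\b", html, re.I)),
        "h3": len(re.findall(r"<h3\b", html, re.I)),
        "tables": len(re.findall(r"<table\b", html, re.I)),
        "fact_box": 'class="company-facts"' in html,
    }

def meets_targets(m: dict) -> bool:
    """1,600+ words, exactly 14 H2, 28+ H3, 3+ tables, company fact box present."""
    return (m["words"] >= 1600 and m["h2"] == 14 and m["h3"] >= 28
            and m["tables"] >= 3 and m["fact_box"])
```

A regex word count over stripped HTML is approximate, but it is enough to flag articles that miss a section or fall short of the length target.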
| ✓ Pros | ✗ Cons |
|---|---|
| Real-time pricing data (current AUD) | 10× more expensive than V2 |
| Actual competitors with verified websites | 2–3× slower than V2/V3 |
| Real customer review themes | Requires working internet access |
| Structured company data (CEO, HQ, etc.) | Web search can fail for niche services |
| Highest quality score (10/10 in tests) | Sequential by nature (rate limit aware) |
| Best for important pages (top 50 services) | Not cost-effective at massive scale |
Once V1, V2, V3 and V4 have all generated their articles, the Lab automatically triggers a 2-phase comparative analysis:
| Phase | Model | What it does | Speed |
|---|---|---|---|
| Phase 1 – Parallel evals | claude-haiku-4-5-20251001 × 4 | One call per article (V1–V4, up to 7,000 chars each). Returns a structured JSON assessment: scores /10 per dimension (structure, E-E-A-T, SEO, tone, brand, economics), strengths, weaknesses, improvement ideas. | ~10–15 s (parallel) |
| Phase 2 – Synthesis | claude-sonnet-4-6 (Extended Thinking) | Receives the 4 compact JSON assessments (not the full articles – a 3–4× smaller input). Generates the full HTML report: scorecard, E-E-A-T deep-dive, tone analysis, category winners, production recommendation, improvement plan. | ~20–30 s streaming |
| Component | Model | Approx. cost |
|---|---|---|
| Phase 1 – 4× eval (parallel, one per method) | claude-haiku-4-5 × 4 | ≈$0.003 total |
| Phase 2 – synthesis + thinking | claude-sonnet-4-6 | ≈$0.04–0.06 |
| Total per analysis run | – | ≈$0.044–0.064 ≈ €0.04–0.06 |
The analysis HTML is automatically saved alongside the generation results in Cloudflare KV (BILLOFF_TESTS namespace) – shared across all browsers, devices, and private sessions. When you re-open a saved test, the analysis is restored instantly (no API call needed). For older tests without a saved analysis, a "Run Claude Analysis now" button appears.