4-pass live web search via GPT-4.1 Responses API, then 2,000+ word article generation via GPT-5 Mini.
V1 is the most comprehensive and the most expensive method. It runs 4 real-time web searches before generating the article, so the content can reflect current pricing, real competitors, and actual customer reviews.
| Attribute | Value |
|---|---|
| Research model | gpt-4.1 (via Responses API with web_search_preview) |
| Writing model | gpt-5-mini-2025-08-07 |
| Web searches | 4 passes (general, pricing, competitors, company info) |
| Avg article length | 1,900–2,200 words |
| Avg quality score | 9–10/10 |
| Cost per article | €0.09–0.12 (mainly from 4× web search at $0.025 each) |
| Avg generation time | 90–150 seconds |
| Cost × 1,000 articles | ≈ €100–120 |

1. SERVICE DATA (from sample_services.json / EN-AU JSON)
2. RESEARCH PHASE (gpt-4.1 + web_search_preview, Responses API)
   - Pass 1: Cancellation + Reviews
   - Pass 2: Current AUD Pricing
   - Pass 3: 3–5 Competitors
   - Pass 4: Company Structured Info
   - Output: research JSON (cancellation, reviews, pricing, competitors, company_info)
3. GENERATION PHASE (gpt-5-mini, max_completion_tokens=8000, streaming=true)
   - Consumer Prompt Template
   - Company info block
   - Research JSON (5,500 chars)
   - 14-section structure
4. FINAL ARTICLE (HTML, ~2,000 words, 14 H2, 3+ tables)

- Billoff/scripts/01_generate_v1.py: Python batch generator
- Billoff/web/assets/openai.js → generateV1(): browser-side generator with streaming
- Billoff/scripts/config.py: prompts, models, cost tables

Pass 1 prompt (cancellation + reviews):

Cancellation policy, refund policy, user reviews and consumer rights for "{name}" in Australia. Find: how to cancel (web, app, email), refund terms, TrustPilot or ProductReview.com.au rating and themes. Return JSON: {"cancellation":{"items":[...]},"refund":{"policy":"..."},"reviews":{"rating":"X/5","positive":[...],"negative":[...]}}

Pass 2 prompt (current AUD pricing):

Exact AUD pricing for "{name}" in Australia (all plans, monthly and annual). Return JSON: {"pricing":{"items":[{"plan":"...","price_monthly":"$XX","price_annual":"$XX","features":"..."}]}}

Pass 3 prompt (competitors):

Find 3 to 5 real competitors to "{name}" in {category} available in Australia. For each: name, official website, AUD price/month, key difference, pros, cons. Return JSON: {"competitors":[{"name":"...","website":"https://...","price_monthly":"$XX","key_difference":"...","pros":[...],"cons":[...]}]}

This pass is retried once if fewer than 3 competitors are found.

Pass 4 prompt (company structured info):

Structured company information: full legal name, parent company, HQ, CEO, founders, founded year, number of employees, support email, active users/month, App Store rating, Google Play rating, annual revenue, stock ticker. Return JSON: {"company_info":{"full_legal_name":"...","ceo":"...","headquarters":"...","founded_year":"...", ...}}
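
The retry rule on the competitor pass can be sketched as follows. This is a minimal illustration, not the project's code: `competitor_pass`, `extract_json`, and the injected `search` callable are hypothetical names, and the template is an abbreviated stand-in for the Pass 3 prompt. The `search` callable is assumed to send a prompt to the web-search model and return its raw text reply.

```python
import json
from typing import Callable

# Abbreviated stand-in for the Pass 3 prompt quoted above (assumption).
PASS3_TEMPLATE = (
    'Find 3 to 5 real competitors to "{name}" in {category} '
    "available in Australia. Return JSON."
)


def extract_json(text: str) -> dict:
    """Parse the outermost {...} object found in a model reply, or return {}."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return {}
    try:
        parsed = json.loads(text[start : end + 1])
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}


def competitor_pass(search: Callable[[str], str], name: str, category: str,
                    min_competitors: int = 3) -> list:
    """Run the competitor pass, retrying once if too few competitors come back."""
    prompt = PASS3_TEMPLATE.format(name=name, category=category)
    best: list = []
    for _attempt in range(2):  # first try plus a single retry
        found = extract_json(search(prompt)).get("competitors", [])
        if not isinstance(found, list):
            found = []
        if len(found) > len(best):
            best = found  # keep the larger of the two results
        if len(best) >= min_competitors:
            break
    return best
```

Keeping the larger of the two results means a failed retry never discards a partial first answer.
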
The full prompt sent to GPT-5 Mini after research. Key characteristics:

- Company fact table (`<table class="company-facts">`)

The prompt is organised into the following blocks:

- SERVICE DATA: Name, Category, Website, Main Keyword, Notes, Cancellation Address, Currency
- STRUCTURED COMPANY DATA: from Pass 4 research (CEO, HQ, Founded, Employees, Ratings, Revenue…)
- LIVE RESEARCH DATA: from Passes 1–3 (cancellation steps, refund policy, reviews, pricing, competitors)
- STRUCTURE (14 SECTIONS): detailed instructions for each H2 with required H3 sub-sections
- HTML RULES: tables, links, brand, language, target length, tone
- QUALITY CHECK (self-verify before output): 14 H2? ≥2 H3 per H2? Company fact table? Tables in 2/3/12? FAQ? 1600+ words?

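
How the blocks might be assembled into one prompt string is sketched below. The function name `build_generation_prompt`, the block formatting, and the decision to enforce the 5,500-character research budget by simple truncation are all assumptions for illustration; the actual template lives in the project's config.

```python
import json

MAX_RESEARCH_CHARS = 5500  # research JSON budget from the pipeline overview


def build_generation_prompt(service: dict, research: dict, template: str) -> str:
    """Join the template, service data, company data and research into one prompt."""
    company_info = research.get("company_info", {})
    # Passes 1-3 only: drop company_info and private keys such as _verified_sources.
    live = {
        key: value
        for key, value in research.items()
        if key != "company_info" and not key.startswith("_")
    }
    # Assumption: the budget is applied by cutting the serialized JSON.
    research_json = json.dumps(live, ensure_ascii=False)[:MAX_RESEARCH_CHARS]
    blocks = [
        template,
        "SERVICE DATA\n" + "\n".join(f"{k}: {v}" for k, v in service.items()),
        "STRUCTURED COMPANY DATA\n" + json.dumps(company_info, ensure_ascii=False),
        "LIVE RESEARCH DATA\n" + research_json,
    ]
    return "\n\n".join(blocks)
```

Truncating serialized JSON can leave it syntactically incomplete, which is acceptable only because the model reads it as context rather than parsing it.
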
| Component | Model | Cost (USD) | Cost (EUR) |
|---|---|---|---|
| Web search × 4 | gpt-4.1 Responses API | $0.100 | €0.092 |
| Writing (~3,000 prompt + 5,500 completion tokens) | gpt-5-mini | $0.011 | €0.010 |
| TOTAL per article | – | ~$0.111 | ~€0.102 |
| × 20 articles | – | $2.22 | €2.04 |
| × 1,000 articles | – | $111 | €102 |
Note: about 90% of the per-article cost ($0.100 of ~$0.111) comes from the 4 web searches. The generation step itself is very cheap.
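
The arithmetic behind the cost table can be reproduced with a few lines. The per-search and writing costs are taken from the tables above; the USD to EUR rate of 0.92 is an assumption inferred from the $0.100 / €0.092 row, and the function names are illustrative only.

```python
from typing import Tuple

WEB_SEARCH_USD = 0.025      # per web search, from the attribute table
SEARCHES_PER_ARTICLE = 4
WRITING_USD = 0.011         # gpt-5-mini writing cost per article, from the cost table
USD_TO_EUR = 0.92           # assumed rate, inferred from $0.100 -> €0.092


def article_cost_usd() -> float:
    """Cost of one article: 4 web searches plus the writing step."""
    return SEARCHES_PER_ARTICLE * WEB_SEARCH_USD + WRITING_USD


def search_share() -> float:
    """Fraction of the per-article cost that comes from web search."""
    return SEARCHES_PER_ARTICLE * WEB_SEARCH_USD / article_cost_usd()


def batch_cost(n_articles: int) -> Tuple[float, float]:
    """Return (USD, EUR) cost for a batch, rounded to cents."""
    usd = article_cost_usd() * n_articles
    return round(usd, 2), round(usd * USD_TO_EUR, 2)
```

With these inputs, `batch_cost(20)` gives `(2.22, 2.04)`, matching the 20-article row, and `search_share()` comes out at roughly 0.90.
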

    # From Billoff/ directory
    python scripts/01_generate_v1.py

    # Test 2 random services with full debug logs
    python scripts/test_v1_dry_run.py

    # Research-only (no generation)
    python scripts/test_v1_dry_run.py --research-only

    # Test specific service
    python scripts/test_v1_dry_run.py --service "Spotify"

- `research_phase(service_name, category, country, currency, symbol, verbose)` returns `(research_dict, web_cost_usd)`. Keys of `research_dict`: cancellation, refund, reviews, pricing, competitors, company_info, _verified_sources.
- `generate_phase(service_dict, research_dict)` returns `(html_content, tokens_dict)`, where `tokens_dict` is `{prompt: N, completion: N}`.
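
A sketch of how the two phases might be chained, based only on the signatures above. The wrapper `generate_article`, the service dict keys `name` and `category`, and the argument values passed for country, currency and symbol are assumptions; the two phase functions are passed in as parameters so the example stays self-contained.

```python
from typing import Callable, Dict, Tuple

ResearchFn = Callable[..., Tuple[dict, float]]
GenerateFn = Callable[[dict, dict], Tuple[str, Dict[str, int]]]


def generate_article(service: dict, research_phase: ResearchFn,
                     generate_phase: GenerateFn) -> dict:
    """Run research, then generation, and collect the outputs in one record."""
    research, web_cost_usd = research_phase(
        service["name"],       # hypothetical key
        service["category"],   # hypothetical key
        "Australia", "AUD", "$",
        False,                 # verbose
    )
    html, tokens = generate_phase(service, research)
    return {
        "html": html,
        "web_cost_usd": web_cost_usd,
        "prompt_tokens": tokens["prompt"],
        "completion_tokens": tokens["completion"],
        "sources": research.get("_verified_sources", []),
    }
```

Passing the phases as callables also makes it straightforward to substitute fakes in a dry run that should not spend API credit.
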

Set `OPENAI_API_KEY=sk-proj-...` in the environment, or set it in config/scraper_config.py as `OPENAI_API_KEY`.

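
A minimal sketch of reading the key from the environment and failing early when it is missing. The helper name `load_api_key` is hypothetical, and the sketch covers only the environment-variable route, not the config-file alternative.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return OPENAI_API_KEY from the given mapping (default: os.environ)."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```

Checking for the key before starting a batch avoids discovering the problem only after the first research pass fails.
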
| Metric | Result | Target |
|---|---|---|
| Word count | 2,119 | 1,600+ |
| Tables | 4 | 3+ |
| H2 sections | 14 | 14 |
| H3 sub-sections | 39 | 28+ |
| FAQ section | ✓ Present | Required |
| Company fact box | ✓ Present | Required |
| Billoff mentions | 2 | 2+ |
| Quality score | 10/10 | 9+ |
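
The structural targets in this table (word count, tables, H2 and H3 counts) can be checked mechanically. The sketch below uses only the standard library; `ArticleStats` and `check_article` are hypothetical names, the word count is a rough regex-based approximation, and the FAQ, fact box, brand-mention and quality-score rows are not covered.

```python
import re
from html.parser import HTMLParser


class ArticleStats(HTMLParser):
    """Count h2, h3 and table start tags and collect visible text."""

    def __init__(self) -> None:
        super().__init__()
        self.counts = {"h2": 0, "h3": 0, "table": 0}
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.counts:
            self.counts[tag] += 1

    def handle_data(self, data):
        self.text_parts.append(data)


def check_article(html: str) -> dict:
    """Compare an article against the targets: 1,600+ words, 3+ tables, 14 H2, 28+ H3."""
    parser = ArticleStats()
    parser.feed(html)
    parser.close()
    words = len(re.findall(r"\w+", " ".join(parser.text_parts)))
    counts = parser.counts
    passes = (
        words >= 1600
        and counts["table"] >= 3
        and counts["h2"] == 14
        and counts["h3"] >= 28
    )
    return {"words": words, **counts, "passes": passes}
```

A check like this could run after generation to flag articles that need a retry before any manual review.
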
| ✓ Pros | ✗ Cons |
|---|---|
| Real-time pricing data (current AUD) | 10× more expensive than V2 |
| Actual competitors with verified websites | 2–3× slower than V2/V3 |
| Real customer review themes | Requires working internet access |
| Structured company data (CEO, HQ, etc.) | Web search can fail for niche services |
| Highest quality score (10/10 in tests) | Sequential by nature (rate limit aware) |
| Best for important pages (top 50 services) | Not cost-effective at massive scale |