Run this after V1/V2/V3/V4 to complete each page with keyword-optimised metadata: seo_title, h1, seo_description, slug, and faq (5 Q&A pairs). Powered by GPT-4o-mini – ultra-fast, extremely low cost, runs at 200 parallel workers.
V5 is a post-processing step – it takes the enriched JSON produced by V1/V2/V3/V4 and generates the SEO shell that wraps each article: the page title, H1 tag, meta description, slug, and 5 contextual FAQ questions grounded in seo_content.
| Attribute | Value |
|---|---|
| Model | gpt-4o-mini (OpenAI) |
| API endpoint | api.openai.com/v1/chat/completions |
| Streaming | No – structured JSON output |
| Temperature | 0.3 (factual, consistent) |
| Max output tokens | 900 |
| Input required | name, main_keyword, keywords[], seo_content |
| Fields generated | seo_title · h1 · seo_description · slug · faq (5 Q&A) |
| Parallel workers | 200 (no web search – safe to max out) |
| Retry logic | Up to 2 retries if char-length constraints fail |
| Avg cost per service | ≈ $0.00039 USD (≈ €0.00036) |
| Cost × 1 000 services | ≈ $0.39 USD |
| Cost × 50 000 services | ≈ $19.50 USD (≈ €17.90) |
| Speed (200 workers) | ~1 000 services/min |
| Python dependency | stdlib only – urllib, json, re. Optional: beautifulsoup4 for richer FAQ context extraction |
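As a sketch of what the single non-streaming call per service looks like, here is a hypothetical request builder using only the stdlib. The model parameters come from the table above; `build_request()`, its prompt wording, and the field selection are illustrative, not the script's actual code:

```python
# Hypothetical sketch; model, temperature, max_tokens and the JSON
# response format are taken from the config table above.
API_URL = "https://api.openai.com/v1/chat/completions"

def build_request(service: dict) -> dict:
    """Assemble the JSON body for one gpt-4o-mini metadata call."""
    # top-15 keywords by search volume, highest first
    keywords = sorted(service.get("keywords", []),
                      key=lambda k: k.get("volume", 0), reverse=True)[:15]
    prompt = (
        f"Service: {service['name']}\n"
        f"Main keyword: {service.get('main_keyword', '')}\n"
        f"Related: {', '.join(k['keyword'] for k in keywords)}\n"
        f"Context: {service.get('seo_content', '')[:4500]}"  # 4 500-char preview
    )
    return {
        "model": "gpt-4o-mini",
        "temperature": 0.3,                           # factual, consistent
        "max_tokens": 900,
        "response_format": {"type": "json_object"},   # non-streaming JSON output
        "messages": [{"role": "user", "content": prompt}],
    }
```

Each of the 200 worker threads can POST such a body to `API_URL` with `urllib.request`.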
```
┌────────────────────────────────────────────────────────────────┐
│ V1/V2/V3/V4 OUTPUT – service JSON with seo_content             │
└───────────────────────────────┬────────────────────────────────┘
                                │
                ┌───────────────▼──────────────────────┐
                │ CONTEXT EXTRACTION (per service)     │
                │  • top-15 keywords sorted by volume  │
                │  • 4 500-char preview of seo_content │
                │  • h2/h3 headings extracted          │
                │  • semantic flags (fee, refund, …)   │
                └───────────────┬──────────────────────┘
                                │
              ┌─────────────────▼──────────────────────────┐
              │ GPT-4o-mini (200 parallel workers)         │
              │  Single-call, non-streaming, JSON output   │
              │  Retry × 2 if char-length validation fails │
              └─────────────────┬──────────────────────────┘
                                │
                ┌───────────────▼────────────────────┐
                │ OUTPUT (per service)               │
                │  seo_title        30–60 chars      │
                │  h1               unique angle     │
                │  seo_description  120–160 chars    │
                │  slug             cancel-[name]    │
                │  faq[]            5 Q&A pairs      │
                └───────────────┬────────────────────┘
                                │
                 SAVED BACK INTO SAME JSON FILE
```
If beautifulsoup4 is installed, the script also extracts structured table data and heading maps from seo_content HTML and feeds them to the model as structured "EXTRACTED FACTS". This significantly improves FAQ specificity – highly recommended for production.
```bash
pip install beautifulsoup4
```
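A minimal sketch of the kind of extraction this enables. The function name and returned field names are illustrative, not the script's actual "EXTRACTED FACTS" format:

```python
from bs4 import BeautifulSoup  # optional dependency

def extract_facts(seo_content: str) -> dict:
    """Pull a heading map and table rows out of the article HTML so the
    model can ground FAQ answers in concrete facts."""
    soup = BeautifulSoup(seo_content, "html.parser")
    headings = [h.get_text(strip=True) for h in soup.find_all(["h2", "h3"])]
    tables = [
        [[cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
         for row in table.find_all("tr")]
        for table in soup.find_all("table")
    ]
    return {"headings": headings, "tables": tables}
```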
| Field | Format / Constraint | Purpose |
|---|---|---|
| seo_title | 30–60 chars · ends with "\| Billoff" | HTML `<title>` tag – keyword-optimised from the top-volume keyword |
| h1 | Unique angle – different from title | Page H1 – action-oriented, complementary to title |
| seo_description | 120–160 chars · includes rating + CTA | HTML `<meta name="description">` |
| slug | cancel-[name-ascii-lowercase] | Clean URL slug – always derived from service name, never hallucinated |
| faq | Array of 5 {question, answer} · plain text, no HTML | Structured FAQ data – injected into page as JSON-LD schema or visible FAQ section |
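Because the slug is derived rather than generated, it can be computed with a few stdlib calls. A sketch, assuming NFKD normalisation for the ASCII fold (the real script's edge-case handling may differ):

```python
import re
import unicodedata

def make_slug(name: str, prefix: str = "cancel") -> str:
    """Derive the URL slug deterministically from the service name,
    so it can never be hallucinated by the model."""
    # Fold accented characters to ASCII, then keep only [a-z0-9] runs
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    ascii_name = re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")
    return f"{prefix}-{ascii_name}"
```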
```
{
  "name": "Netflix",
  "seo_content": "...",          // ← from V1/V2/V3/V4

  // ↓ added by V5
  "seo_title": "Cancel Netflix Subscription: Complete Guide | Billoff",
  "h1": "Cancel Netflix: Step-by-Step Without Losing Your Watch History",
  "seo_description": "Learn how to cancel Netflix in 2 minutes – no hold music, no tricks. Rated 4.8/5 by Billoff users. Keep your data or delete everything. Cancel now.",
  "slug": "cancel-netflix",
  "faq": [
    {
      "question": "Can I get a refund after cancelling Netflix?",
      "answer": "Netflix does not offer refunds for partial billing periods. Your access continues until the end of the current billing cycle, after which you won't be charged again."
    },
    // ... 4 more Q&A pairs
  ]
}
```
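The faq array maps directly onto schema.org's FAQPage type. A sketch of that injection step (the real page template may differ):

```python
import json

def faq_to_jsonld(faq: list) -> str:
    """Render the faq array as schema.org FAQPage JSON-LD, ready to be
    injected into the page head as a <script type="application/ld+json">."""
    schema = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": item["question"],
                "acceptedAnswer": {"@type": "Answer", "text": item["answer"]},
            }
            for item in faq
        ],
    }
    return json.dumps(schema, ensure_ascii=False)
```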
Each service gets a single non-streaming call to gpt-4o-mini. The prompt includes:
seo_content (head + cancellation section), assembled into a prompt like:

```
── KEYWORDS (sorted by volume) ─────────────────────────────
Main keyword: cancel netflix subscription
Related keywords: how to cancel netflix, ...

── SERVICE CONTEXT (4 500 chars) ───────────────────────────
Headings: ["How to Cancel Netflix", "Refund Policy", ...]
Flags: ["auto_renewal", "refunds", "pause_option"]
[seo_content preview...]

── REQUIREMENTS ────────────────────────────────────────────
1. SEO TITLE  – EXACTLY 30–60 chars · ends "| Billoff"
2. H1         – different angle from title
3. SEO DESC   – EXACTLY 120–160 chars · rating 4.8/5 · CTA
4. SLUG       – cancel-[service-name]
5. FAQ (5)    – grounded in context · plain text · varied angles

── VALIDATION CHECKLIST ────────────────────────────────────
✓ seo_title: 30–60 chars
✓ seo_description: 120–160 chars
✓ FAQ: no invented contact details, plain text only
✓ Language: English throughout
```
After each API response, the script validates len(seo_title) and len(seo_description). If either fails the character-length constraint, the prompt is augmented with specific correction instructions and re-sent – up to 2 retries before accepting the last result.
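That validate-and-retry loop might look like this (function names are hypothetical; `call` stands in for the actual API request):

```python
def validate(meta: dict) -> list:
    """Return correction instructions for any field that breaks its
    char-length constraint (a sketch of the script's validation step)."""
    problems = []
    if not 30 <= len(meta.get("seo_title", "")) <= 60:
        problems.append("Rewrite seo_title to be 30-60 characters.")
    if not 120 <= len(meta.get("seo_description", "")) <= 160:
        problems.append("Rewrite seo_description to be 120-160 characters.")
    return problems

def generate_with_retry(call, prompt: str, max_retries: int = 2) -> dict:
    """call(prompt) -> metadata dict; re-send with specific corrections
    up to 2 times, then accept the last result."""
    meta = call(prompt)
    for _ in range(max_retries):
        problems = validate(meta)
        if not problems:
            break
        meta = call(prompt + "\n\nCORRECTIONS:\n" + "\n".join(problems))
    return meta
```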
The script (downloadable from the Python Script panel above) is fully self-contained: no dependencies beyond the standard library, plus optional beautifulsoup4.
```bash
# Option A – edit the script directly
OPENAI_API_KEY = "sk-..."

# Option B – environment variable (recommended for production)
export OPENAI_API_KEY="sk-..."
```
```bash
python3 billoff_v5_metadata.py data/services.json --test
```
Test mode runs on 5 random services and prints a detailed output table with field lengths, FAQ quality, and validation status – no file is saved.
```bash
# Full run (200 workers by default)
python3 billoff_v5_metadata.py data/services.json

# Custom worker count (e.g. for rate-limit-sensitive environments)
python3 billoff_v5_metadata.py data/services.json --workers 50
```
The script saves back to the same JSON file in-place, enriching each service object with the 5 new fields.
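When a single file holds tens of thousands of services, an atomic write is a sensible way to implement "save back in-place". A defensive sketch (the actual script may simply overwrite the file):

```python
import json
import os
import tempfile

def save_in_place(path: str, data) -> None:
    """Write the enriched JSON back to the same file atomically, so a
    crash mid-write cannot leave a truncated file behind."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".", suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)
    os.replace(tmp, path)  # atomic rename over the original
```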
The script accepts both the pipeline format and a plain array:
```
// Format A – pipeline output ({"services": [...]})
{
  "country": "FR",          // ISO code of your target country
  "services": [
    { "name": "Netflix", "seo_content": "...", ... }
  ]
}

// Format B – plain array
[
  { "name": "Netflix", "seo_content": "...", ... }
]
```
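Detecting the two formats is a simple shape check. A sketch (`load_services` is a hypothetical name; returning the wrapper lets the script save back in the original shape):

```python
def load_services(data):
    """Accept both input shapes: {"services": [...]} (Format A) or a
    plain array (Format B). Returns (services, wrapper); wrapper is
    None for Format B."""
    if isinstance(data, dict) and "services" in data:
        return data["services"], data
    if isinstance(data, list):
        return data, None
    raise ValueError("Unrecognised input format")
```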
Edit the locale constants at the top of the script for each country run:
```python
DEFAULT_LANGUAGE    = "French"
DEFAULT_CANCEL_WORD = "Résiliation"
DEFAULT_CANCEL_VERB = "résilier"
DEFAULT_SLUG_PREFIX = "annuler"
```
You can also override per-service by adding language, cancel_word, or slug_prefix fields directly to each service object in the JSON.
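Resolution order is per-service field first, then the module-level default. A sketch with a hypothetical helper name:

```python
DEFAULT_LANGUAGE    = "French"
DEFAULT_CANCEL_WORD = "Résiliation"
DEFAULT_SLUG_PREFIX = "annuler"

def locale_for(service: dict) -> dict:
    """Per-service override fields win over the module-level defaults."""
    return {
        "language": service.get("language", DEFAULT_LANGUAGE),
        "cancel_word": service.get("cancel_word", DEFAULT_CANCEL_WORD),
        "slug_prefix": service.get("slug_prefix", DEFAULT_SLUG_PREFIX),
    }
```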
| Field | Required | Notes |
|---|---|---|
| name | ✓ Yes | Service name (e.g. "Netflix") |
| main_keyword | Recommended | Primary target keyword – used as default title base |
| keywords | Recommended | Array of {keyword, volume} – sorted by volume for title selection |
| seo_content | Recommended | HTML article from V1/V2/V3/V4 – powers FAQ context extraction |
| language | Optional | Per-service language override (defaults to DEFAULT_LANGUAGE) |
| cancel_word | Optional | Per-service cancel verb override |
| slug_prefix | Optional | Per-service slug prefix override |
GPT-4o-mini pricing: $0.15 / 1M input tokens · $0.60 / 1M output tokens.
| Scale | Est. tokens / service | Cost (USD) | Cost (EUR) |
|---|---|---|---|
| 1 service | ~1 000 in + 400 out | ≈ $0.00039 | ≈ €0.00036 |
| 1 000 services | – | ≈ $0.39 | ≈ €0.36 |
| 10 000 services | – | ≈ $3.90 | ≈ €3.59 |
| 50 000 services | – | ≈ $19.50 | ≈ €17.90 |
| Full pipeline (V1+V5) | – | ≈ $0.05–0.10 / article | – |
Estimates based on average prompt size. Retries add ≈5% overhead. Actual costs may vary.
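The table's figures follow directly from the published per-token prices. A quick check, using the assumed average token counts from the table:

```python
PRICE_IN = 0.15 / 1_000_000   # USD per input token (gpt-4o-mini)
PRICE_OUT = 0.60 / 1_000_000  # USD per output token

def estimate_cost(services: int, tokens_in: int = 1_000, tokens_out: int = 400,
                  retry_overhead: float = 0.05) -> float:
    """Estimated USD cost: average prompt + output per service, plus
    ~5% for retries. Token counts are assumed averages, not measured."""
    per_service = tokens_in * PRICE_IN + tokens_out * PRICE_OUT
    return services * per_service * (1 + retry_overhead)
```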
```bash
# Step 1 – Generate articles (choose one method)
python3 billoff_v1_generate.py data/services.json   # GPT-4.1 + web search
python3 billoff_v2_generate.py data/services.json   # GPT-5 Mini rewrite
python3 billoff_v3_generate.py data/services.json   # Gemini 2.5 Flash rewrite
python3 billoff_v4_generate.py data/services.json   # Claude Haiku 4.5 rewrite

# Step 2 – Generate SEO metadata (always the same script)
python3 billoff_v5_metadata.py data/services.json

# Each service in the JSON now has all fields needed to publish:
# seo_content (article HTML), seo_title, h1, seo_description, slug, faq
```
- `billoff_v5_metadata.py` – the standalone script you download from this page
- `data/services.json` (or any name) – enriched in-place by the script

No Cloudflare proxy needed – V5 runs directly from your machine or CI/CD pipeline against the OpenAI API.