Optimization Plan — YouTube Shorts Generator

Generated: 2026-04-11
Last updated: 2026-04-11 (P0 + P1 implemented)


Priority 0 — Critical (broken functionality)

# Issue File Status Description
0.1 G4F ignores temperature/max_tokens app/providers/llm/g4f_provider.py DONE _call_with_timeout() does not pass temperature and max_tokens to the API call. All temperature tuning across the pipeline (0.3 for reflection, 0.8 for dynamic trends, etc.) is silently ignored.
0.2 Hardcoded "American English" in script prompt app/pipeline/step_02_script.py DONE System prompt hardcodes "natural American English", "US audiences aged 18-35", "US spelling" regardless of cfg.channel.language and cfg.channel.region. Non-English channels generate wrong content.
0.3 Comments module ignores configured LLM provider app/analytics/youtube_comments.py DONE _build_llm() hardcodes G4FProvider() instead of using build_llm_provider(cfg.llm). If the user configures OpenAI, comments still use the free provider.
0.4 LLM factory has no runtime fallback app/providers/llm/factory.py DONE Fallback only triggers at instantiation. If the primary provider instantiates successfully but fails at call time, no fallback occurs.

P0 Implementation Details

0.1 — G4F temperature/max_tokens fix
- _call_with_timeout() signature extended: temperature: float = 0.7, max_tokens: int = 2000
- Parameters forwarded via kwargs dict to client.chat.completions.create(**kwargs)
- All 3 call sites (PollinationsAI, DeepInfra, generic) updated
- Specifics: Parameters are passed conditionally (if temperature is not None) to avoid breaking providers that don't support them
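The conditional-forwarding pattern can be sketched as follows (helper name and default values are illustrative; the real g4f call site may differ):

```python
def build_completion_kwargs(model, messages, temperature=0.7, max_tokens=2000):
    """Build the kwargs dict forwarded to client.chat.completions.create(**kwargs).

    Tuning parameters are only included when set, so providers that
    reject temperature/max_tokens are not broken by their presence.
    """
    kwargs = {"model": model, "messages": messages}
    if temperature is not None:
        kwargs["temperature"] = temperature
    if max_tokens is not None:
        kwargs["max_tokens"] = max_tokens
    return kwargs
```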

0.2 — Dynamic language/region in script prompt
- Added _LANG_MAP dict (en-US, uk-UA, es-ES, de-DE, fr-FR, pt-BR, ja-JP, ko-KR, zh-CN, hi-IN, ar-SA) + _lang_display_name() helper
- System prompt now uses cfg.channel.language → display name and cfg.channel.region dynamically
- Segment count is now dynamic: seg_min = max(2, word_target // 50), seg_max = min(7, word_target // 20)
- Added comprehensive few-shot JSON example (4-segment AI medical script) — also covers P1.2
- Hook instruction: "shocking fact, bold claim, or curiosity-gap question"; CTA: "specific action — avoid generic 'like and subscribe'"

0.3 — Comments LLM provider fix
- _build_llm() rewritten: loads config via reload_config(), uses build_llm_provider(cfg.llm)
- Falls back to None (with error log) if config loading fails, preserving graceful degradation

0.4 — Runtime LLM fallback
- New _FallbackLLMProvider(LLMProvider) wrapper class in factory.py
- Wraps primary + fallback providers; complete() and complete_json() catch primary exceptions and transparently retry on fallback
- build_llm_provider() returns wrapped provider when both primary and fallback are available
- Specifics: Logs llm_runtime_fallback warning with primary name, error, and fallback name for observability


Priority 1 — Important (significant quality/performance impact)

# Issue File Status Description
1.1 Steps 3+4 run sequentially app/pipeline/orchestrator.py DONE Visuals and Audio are independent (both depend only on script). Parallel execution saves 30-90s per run.
1.2 No few-shot examples in prompts app/pipeline/step_02_script.py DONE Script generation prompt has no concrete example. LLM frequently returns malformed JSON or wrong structure.
1.3 Memory overwrites instead of accumulating app/pipeline/step_09_reflect.py DONE Each reflection completely replaces previous memory. Insights from older videos are permanently lost.
1.4 Comments not analyzed in reflection app/pipeline/step_09_reflect.py DONE _build_video_data() ignores CommentReply records entirely. Comment themes are the richest audience signal.
1.5 Memory not used in trends selection app/pipeline/step_01_trends.py DONE Trend selection doesn't know which topics performed well/poorly.
1.6 Memory not used in SEO generation app/pipeline/step_06_seo.py DONE SEO prompt doesn't receive channel memory. Tags/titles can't optimize based on past performance.
1.7 Pollinations always generates same image (seed=42) app/providers/visuals/pollinations_provider.py DONE Fixed seed means identical images for same keyword across runs.
1.8 MoviePy resource leaks app/pipeline/step_05_edit.py DONE VideoFileClip, AudioFileClip, TextClip are never closed. File handles and FFmpeg subprocesses leak.
1.9 _extract_json() duplicated in 3 files app/utils/json_utils.py (new) DONE Identical function copy-pasted in 3 step files.
1.10 LLM provider rebuilt on every step app/pipeline/orchestrator.py + step files DONE build_llm_provider(cfg.llm) called in each step independently.

P1 Implementation Details

1.1 — Parallel step execution (Visuals + Audio)
- orchestrator.py: added StepEntry = Union[PipelineStep, list[PipelineStep], tuple[PipelineStep, ...]] type
- New _run_parallel() method uses ThreadPoolExecutor(max_workers=len(active))
- Each parallel step gets a shallow dict(ctx) copy; results merged after all complete
- tasks.py: steps list now uses (VisualsStep(), AudioStep()) tuple for parallel group
- Specifics: Only steps 3+4 are parallelized (not 5+6) because SEO actually benefits from having the final video path, and Edit must complete before Upload. This is the safe, high-value parallel pair.
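The parallel-group mechanics above can be sketched as (step objects reduced to callables for illustration; names assumed):

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(steps, ctx: dict) -> dict:
    """Run a group of independent steps concurrently.

    Each step receives a shallow copy of ctx; returned dicts are merged
    back into ctx after all steps complete.
    """
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        futures = [pool.submit(step, dict(ctx)) for step in steps]
        for future in futures:
            # .result() re-raises any step exception, failing the pipeline run.
            ctx.update(future.result())
    return ctx
```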

1.2 — Few-shot example in script prompt
- Added a complete 4-segment example JSON in _build_user() showing hook → segments → CTA structure
- Example uses a realistic AI-medical topic with proper word counts per segment
- Specifics: Combined with P0.2 implementation in the same step_02_script.py rewrite

1.3 — Cumulative memory (reflection carries forward)
- step_09_reflect.py now calls get_channel_memory(topic) before reflection
- Previous memory passed to _reflect() as prev_memory parameter
- LLM prompt includes PREVIOUS CHANNEL INSIGHTS block with instruction: "UPDATE these insights — keep valid, revise contradicted, add new patterns"
- Specifics: Memory is still upserted (one record per topic), but the LLM sees prior insights and evolves them rather than starting fresh each time

1.4 — Comment data in reflection
- _build_video_data() now queries CommentReply records (up to 5 per video, latest first)
- entry["viewer_comments"] = list of comment texts (truncated to 150 chars)
- Added audience_requests: list[str] field to ChannelMemorySchema
- Reflection prompt updated: "If viewer_comments are included, extract recurring themes, questions, or requests"
- memory_store.py updated: get_memory_context() now outputs audience requests

1.5 — Memory-informed trend selection
- Both _run_static and _run_dynamic in step_01_trends.py now query channel memory
- Injects best_topic_keywords as "prefer similar angles" and avoid_topic_keywords as "avoid these"
- Also fixed hardcoded "60-second short" → uses cfg.channel.target_duration_sec
- All LLM select methods accept optional llm parameter with fallback

1.6 — Memory-informed SEO
- step_06_seo.py imports get_memory_context and injects into system prompt
- Memory block: "Channel performance insights (use to optimize tags/title): ..."
- Limited to 150 tokens to avoid overwhelming the SEO prompt
- Fixed hardcoded "American" → uses {region} from config

1.7 — Random seed for Pollinations
- Changed seed=42 → seed = random.randint(1, 999999)
- Cache key updated to include seed: f"pollinations_{keyword}_{w}x{h}_{seed}"
- Specifics: Old cache lookup by keyword-only key was removed because it would always return stale results. Each generation now gets a unique image. Trade-off: no deduplication for same keyword within a run, but this is acceptable since segments should have different keywords anyway.
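The seed and cache-key change can be sketched as (key format from the notes above; function name assumed):

```python
import random

def pollinations_cache_key(keyword: str, w: int, h: int) -> tuple[str, int]:
    """Generate a per-call seed and a seed-aware cache key, so repeated
    requests for the same keyword no longer hit a stale cached image."""
    seed = random.randint(1, 999999)
    return f"pollinations_{keyword}_{w}x{h}_{seed}", seed
```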

1.8 — MoviePy resource cleanup
- run() method wrapped in try/finally block
- All clips tracked in _open_clips: list throughout the method
- Finally block closes all clips in reverse order (composite first, then sources)
- Specifics: MoviePy clips don't support Python context managers (with), so explicit tracking + finally is the correct pattern. Reverse order ensures composite clips are closed before their source clips.
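The finally-block cleanup might be sketched as (helper name assumed):

```python
def close_clips(open_clips: list) -> None:
    """Close tracked clips in reverse order: composites first, then sources.

    Per-clip errors are swallowed so one failed close does not leak
    the remaining file handles and FFmpeg subprocesses.
    """
    for clip in reversed(open_clips):
        try:
            clip.close()
        except Exception:
            pass
```

In step_05_edit.py this runs inside the finally block of run(), over the _open_clips list accumulated through the method.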

1.9 — Shared extract_json() utility
- Created app/utils/json_utils.py with single extract_json(text) -> str function
- Handles: markdown code blocks (```json ... ```), raw JSON objects, passthrough
- Removed duplicate copies from step_02_script.py, step_06_seo.py, step_09_reflect.py
- All three files now import: from app.utils.json_utils import extract_json as _extract_json
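The shared utility might look like this minimal sketch (regexes are illustrative; the real implementation may differ):

```python
import re

def extract_json(text: str) -> str:
    """Extract a JSON payload from an LLM response.

    Order of attempts: strip a markdown code fence if present, otherwise
    return the first {...} span, otherwise pass the text through as-is.
    """
    fence = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    if fence:
        return fence.group(1)
    brace = re.search(r"\{.*\}", text, re.DOTALL)
    if brace:
        return brace.group(0)
    return text.strip()
```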

1.10 — Single LLM provider instance per pipeline run
- orchestrator.py creates LLM once: ctx["llm"] = build_llm_provider(self.cfg.llm)
- All step files use pattern: llm = ctx.get("llm") or build_llm_provider(cfg.llm)
- Specifics: The or build_llm_provider() fallback preserves backward compatibility — steps can still be used standalone (e.g., test_pipeline.py --step seo) without the orchestrator. The _FallbackLLMProvider wrapper (P0.4) means the shared instance already has fallback baked in.


Priority 2 — Moderate (reliability/correctness)

# Issue File Line Description
2.1 No per-step retry logic app/pipeline/step_base.py Steps have no retry mechanism. Celery retries the entire pipeline (max_retries=2), wasting completed work.
2.2 No Celery task timeout app/scheduler/tasks.py 86 If any step hangs, worker is blocked indefinitely. Need time_limit=3600.
2.3 Visual fetching is sequential app/pipeline/step_03_visuals.py 29-46 For 5 segments, each API call takes 3-15s. ThreadPoolExecutor can parallelize these.
2.4 N+1 query problem in reflection app/pipeline/step_09_reflect.py 84-141 For each of N jobs, makes 3 separate queries (JobStep, Upload, Analytics). Should use JOINs.
2.5 String-based topic filter in memory API app/api/routes_memory.py 49-51 MemoryEvent.content.contains('"topic": "AI"') matches substrings. topic="AI" can delete records with topic="PAIR".
2.6 Memory history endpoint not filtered by topic app/api/routes_memory.py 23-28 Returns last 10 events globally, not for the requested topic.
2.7 OpenAI model name hardcoded app/providers/llm/openai_provider.py ~8 gpt-4o-mini is not configurable. Should be in LLMConfig.
2.8 shares field stores comments count app/analytics/youtube_fetcher.py 199 Misleading field name. Analytics.shares actually holds YouTube commentCount.
2.9 Reflection runs on same data repeatedly app/pipeline/step_09_reflect.py 56-68 No check against source_job_ids. Same videos analyzed every run after threshold.
2.10 Over-fetching in reflection query app/pipeline/step_09_reflect.py 60 limit(n * 2) but only recent[:n] used.
2.11 Step 09 overwrites LLM analytics fields app/pipeline/step_09_reflect.py 224-228 Manually replaces avg_views, performance_trend after LLM generation, making LLM's version pointless.
2.12 No MemoryEvent index app/db/memory_models.py 36-42 Queries on event_type scan full table. Needs index.

Priority 3 — Low (nice-to-have improvements)

# Issue File Line Description
3.1 Token estimation inaccurate app/memory/memory_store.py 75-77 1 token ≈ 4 chars is ±15% off. Can truncate mid-sentence. Should cut at sentence boundary.
3.2 No JSON mode for OpenAI provider app/providers/llm/base.py 11-15 complete_json() just appends text instruction. OpenAI supports response_format={"type": "json_object"}.
3.3 Comment reply prompt ignores channel.style app/analytics/youtube_comments.py 253-264 channel_style is passed to task but never used in the prompt.
3.4 Shorten/expand prompts lack strategy app/pipeline/step_02_script.py 131-155 "Shorten it" / "Expand it" without guidance on what to cut/add. Should specify preserve hook+CTA.
3.5 Keyword deduplication in memory app/memory/memory_store.py 44-46 Same keyword can appear in both best_topic_keywords and avoid_topic_keywords.
3.6 No memory freshness check app/memory/memory_store.py 30-40 Injects 30+ day old memory without warning.
3.7 Unbounded MemoryEvent growth app/memory/memory_store.py 103-107 No pruning. Table grows indefinitely.
3.8 Multiple db.commit() per pipeline step app/pipeline/orchestrator.py 48-95 4-5 commits per step. Can batch into 1-2.
3.9 Prompt injection via trend_keyword app/pipeline/step_02_script.py 94 User/LLM-supplied keyword inserted into prompt without sanitization.
3.10 FFmpeg subprocess no timeout app/utils/ffmpeg_utils.py ~71 subprocess.run() can hang on corrupted media.
3.11 Upload chunked streaming no timeout app/pipeline/step_07_upload.py 155 request.next_chunk() can stall indefinitely.
3.12 Analytics fetch not triggered on upload app/scheduler/tasks.py Videos wait 6h before first analytics fetch. Should trigger immediately after upload.

Priority 4 — Cosmetic / future enhancements

# Issue File Line Description
4.1 No A/B testing framework Can't compare two strategies (e.g., dramatic vs calm hooks).
4.2 No YouTube Analytics API integration app/analytics/youtube_fetcher.py Missing watch time, retention, avg_view_duration (requires different OAuth scope).
4.3 Trend detection too simplistic app/pipeline/step_09_reflect.py 166-181 Binary recent/older comparison. Sensitive to outliers.
4.4 Comment reply doesn't detect language app/analytics/youtube_comments.py Ukrainian comment gets English reply.
4.5 No per-keyword performance tracking in memory app/memory/memory_store.py Memory doesn't track which specific keywords drove views.
4.6 MemoryEvent table never read app/db/memory_models.py 36-42 Written to but never queried in pipeline logic. Dead audit trail.
4.7 SEO chapters field always empty app/pipeline/step_06_seo.py 73 Shorts don't have chapters. Remove from schema.
4.8 bottom_bar subtitle style not implemented app/pipeline/step_05_edit.py Config allows it but only karaoke positioning exists.
4.9 Causality markers for memory insights Track which insights were used per video to measure their impact.
4.10 LLM response caching via Redis Same/similar prompts re-sent without cache. Redis already available.

Implementation Summary

Implemented: 14 items (P0: 4/4, P1: 10/10)
Remaining: 34 items (P2: 12, P3: 12, P4: 10)

Files Modified

File Changes
app/providers/llm/g4f_provider.py P0.1 — temperature/max_tokens forwarding
app/providers/llm/factory.py P0.4 — _FallbackLLMProvider wrapper class
app/analytics/youtube_comments.py P0.3 — config-aware _build_llm()
app/pipeline/step_01_trends.py P1.5 — memory-informed trend selection, dynamic duration
app/pipeline/step_02_script.py P0.2 + P1.2 — i18n prompts, few-shot example, dynamic segments
app/pipeline/step_05_edit.py P1.8 — clip tracking + finally cleanup
app/pipeline/step_06_seo.py P1.6 — memory-informed SEO, region-aware prompt
app/pipeline/step_09_reflect.py P1.3 + P1.4 — cumulative memory, comment analysis, audience_requests
app/pipeline/orchestrator.py P1.1 + P1.10 — parallel steps, shared LLM instance
app/scheduler/tasks.py P1.1 — parallel step group (VisualsStep(), AudioStep())
app/memory/memory_store.py P1.4 — audience_requests in memory context
app/providers/visuals/pollinations_provider.py P1.7 — random seed per generation

Files Created

File Purpose
app/utils/json_utils.py P1.9 — shared extract_json() utility
docs/OPTIMIZATION_PLAN.md This document

Key Design Decisions

  1. Parallel steps limited to Visuals+Audio only — SEO could theoretically run parallel with Edit, but it benefits from having the final output path, and the time savings would be marginal compared to the Visuals+Audio pair (which involves network I/O).

  2. ctx.get("llm") or build_llm_provider() pattern — Preserves backward compatibility for standalone step usage (test_pipeline.py --step X) while eliminating redundant instantiation in normal pipeline runs.

  3. Pollinations cache key includes seed — Breaking change from the old pollinations_{keyword}_{size} key. Old cached images won't be hit, but this is the correct behavior since the whole point is generating unique images.

  4. Cumulative memory via LLM, not merge — Rather than programmatically merging old+new memory dicts (brittle, loses nuance), the previous memory is passed to the LLM with explicit instructions to update/revise/keep. This lets the LLM make qualitative judgments about what's still valid.

  5. MoviePy cleanup via tracked list + finally — MoviePy clips don't implement __enter__/__exit__, so context managers aren't an option. Reverse-order closing (composite → source) prevents errors from closing source clips that are still referenced.