Venture Capitalist at Theory Ventures

Blog Post Generator with Vector Search

An AI-powered blog post generator that uses LanceDB for semantic similarity search to find the most relevant existing posts as context for generating new, high-quality blog content.

Features

Installation

# Using uv (recommended)
uv sync

# Or using pip
pip install -r requirements.txt

Usage

Basic Usage

python blog_post_generator.py \
  --source_file content.txt \
  --prompt "Your blog post idea" \
  --output_file my-post.md

With Vector Search (Default)

python blog_post_generator.py \
  --source_file transcript.txt \
  --prompt "AI trends in venture capital" \
  --categories ai startups funding

Disable Vector Search (Category-only)

python blog_post_generator.py \
  --source_file content.txt \
  --prompt "Blog post idea" \
  --no_vector_search

Using Ollama (Local AI)

python blog_post_generator.py \
  --source_file content.txt \
  --prompt "Blog post idea" \
  --use_ollama \
  --ollama_model gemma2
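
The flags used in the commands above could be parsed with argparse roughly as follows. This is a minimal sketch, not the script's actual parser: the flag names come from the examples, but the types, defaults, `required` settings, and the `build_parser` helper are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(description="Blog post generator (sketch)")
    parser.add_argument("--source_file", required=True, help="Input text or transcript")
    parser.add_argument("--prompt", required=True, help="Blog post idea")
    parser.add_argument("--output_file", default=None, help="Where to write the post")
    # Assumed: categories are passed as space-separated values after one flag.
    parser.add_argument("--categories", nargs="*", default=[], help="Categories to match")
    # Assumed: vector search is on unless this flag is present.
    parser.add_argument("--no_vector_search", action="store_true")
    parser.add_argument("--use_ollama", action="store_true")
    parser.add_argument("--ollama_model", default=None, help="Local model name")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args([
    "--source_file", "transcript.txt",
    "--prompt", "AI trends in venture capital",
    "--categories", "ai", "startups", "funding",
])
print(args.categories, args.no_vector_search)
```

Using `store_true` for `--no_vector_search` keeps vector search as the default, which matches the "With Vector Search (Default)" heading above.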

How It Works

  1. Vector Database: On first run, creates embeddings for all blog posts using sentence-transformers
  2. Semantic Search: Finds the 10 posts most semantically similar to your prompt
  3. Category Matching: Also finds posts matching the specified categories
  4. Style Analysis: Analyzes writing patterns from both sets of posts
  5. Content Generation: Creates the blog post using the combined context
  6. Iterative Improvement: Grades and refines the content until it reaches the target grade
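
The semantic search step can be illustrated with a small ranking sketch. The tool itself uses LanceDB and sentence-transformers. The version below swaps both for hand-written toy vectors and plain cosine similarity from the standard library, so the file names, vectors, and the `top_k_similar` helper are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, post_vecs, k=10):
    """Return the k (name, score) pairs with the highest cosine similarity."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in post_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings"; real sentence embeddings have hundreds of dimensions.
posts = {
    "ai-post.md": [0.9, 0.1, 0.0],
    "funding-post.md": [0.1, 0.9, 0.0],
    "hiring-post.md": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]  # stands in for the embedded prompt

results = top_k_similar(query, posts, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

A vector database does the same ranking, but with an index so that it does not have to score every document on each query.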

Vector Search Output

When running with vector search enabled, you’ll see:

🔄 Building vector database from blog posts...
📊 Creating embeddings for 1683 documents...
✅ Vector database created with 1683 documents
🔍 Analyzing existing posts for style...
Found 5 category-relevant posts for style analysis
🧠 Finding semantically similar posts...
Found 10 semantically similar posts
  1. ai-investment-2024.md
  2. vc-market-ai-2024.md
  3. fundraising-compendium-guide.md
  4. the-venture-fund-of-the-future.md
  5. startup-ecosystem-trends.md
  ...and 5 more
📊 Using 15 total posts for style analysis
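
In the sample output, 5 category-relevant posts plus 10 semantically similar posts give 15 posts for style analysis. One way to combine the two sets is an order-preserving merge that drops duplicates, sketched below. Whether the tool deduplicates is an assumption, and `merge_posts` and the file names are hypothetical.

```python
def merge_posts(category_posts, similar_posts):
    """Combine both lists, keeping first-seen order and dropping repeats."""
    seen = set()
    merged = []
    for path in [*category_posts, *similar_posts]:
        if path not in seen:
            seen.add(path)
            merged.append(path)
    return merged


# Hypothetical file names; "b.md" appears in both sets.
category_posts = ["a.md", "b.md"]
similar_posts = ["b.md", "c.md", "d.md"]

combined = merge_posts(category_posts, similar_posts)
print(f"Using {len(combined)} total posts for style analysis")
```

When the two sets do not overlap, as in the sample output, the merged count is simply the sum of the two.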

Arguments

Dependencies

Output Quality

The system aims for blog posts graded A- or better (a score of 90 or higher).

Example output grades:

📝 Grading attempt 1...
Grade: A (93/100)
✅ Target grade achieved: A
Final Grade: A- (91/100)
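
The grade-and-refine loop can be sketched as follows. This is an illustration under assumptions, not the script's implementation: `refine_until_target`, the attempt cap, and the stub `grade` and `revise` functions are hypothetical stand-ins for calls to a language model. Only the 90-point target comes from the text above.

```python
def refine_until_target(draft, grade, revise, target=90, max_attempts=5):
    """Grade the draft and revise it until the score reaches the target."""
    history = []
    for _ in range(max_attempts):
        score = grade(draft)
        history.append(score)
        if score >= target:
            break  # target grade achieved, stop refining
        draft = revise(draft, score)
    return draft, history


def stub_grade(text):
    # Stub grader: starts at 70 and adds 10 points per revision marker.
    return min(100, 70 + 10 * text.count("[revised]"))


def stub_revise(text, score):
    # Stub reviser: appends a marker instead of rewriting the text.
    return text + " [revised]"


final_draft, scores = refine_until_target("draft", stub_grade, stub_revise)
for attempt, score in enumerate(scores, start=1):
    print(f"Grading attempt {attempt}... {score}/100")
```

The attempt cap keeps the loop from running indefinitely if the grader never awards the target score.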