Implement Playwright-based X scraper with AI-powered newsletter generation

Major changes:
- Replace Nitter RSS with Playwright browser automation for direct X scraping
- Scrape all 37 configured tech accounts in parallel
- Add OpenRouter AI integration for topic-based summaries (xiaomi/mimo-v2-flash:free model)
- Update prompts for factual, emotion-free analysis with post links
- Add console output for newsletter preview in dry-run mode
- Update Dockerfile to Playwright v1.57.0 with necessary browser dependencies
- Implement WRAP workflow method for AI-assisted development guidance

Technical improvements:
- Fix TypeScript compilation error (unused parameter in XScraper)
- Newsletter pipeline successfully processes 37 accounts -> AI summaries -> HTML email
- Full end-to-end test validated: scraping, processing, AI generation, email template

Pipeline flow:
1. Scrape X profiles with Playwright (parallel, configurable timeout)
2. Filter tweets by time window and content type
3. Categorize into AI/ML, Software Engineering, Tech & Startups
4. Generate AI summaries for each topic
5. Create cross-topic daily insights
6. Render HTML newsletter with highlights and trending topics
7. Send via email (or print to console in dry-run mode)
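
Steps 2 and 3 above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the Tweet shape, the function names (filterTweets, categorize, groupByTopic) and the keyword lists are all assumptions; only the three topic labels come from the list above.

```typescript
// Hypothetical tweet shape; the project's real type may differ.
interface Tweet {
  id: string;
  username: string;
  text: string;
  createdAt: Date;
  isRetweet: boolean;
}

type Topic = 'AI/ML' | 'Software Engineering' | 'Tech & Startups';

// Step 2: keep original posts no older than `windowHours` before `now`.
function filterTweets(tweets: Tweet[], now: Date, windowHours: number): Tweet[] {
  const cutoff = now.getTime() - windowHours * 60 * 60 * 1000;
  return tweets.filter((t) => !t.isRetweet && t.createdAt.getTime() >= cutoff);
}

// Step 3: illustrative keyword patterns (assumed, not taken from the commit).
const TOPIC_KEYWORDS: Array<[Topic, RegExp]> = [
  ['AI/ML', /\b(ai|llm|model|training|inference)\b/i],
  ['Software Engineering', /\b(typescript|rust|compiler|refactor|api)\b/i],
];

// The first matching pattern wins; anything else falls into the general bucket.
function categorize(tweet: Tweet): Topic {
  for (const [topic, pattern] of TOPIC_KEYWORDS) {
    if (pattern.test(tweet.text)) return topic;
  }
  return 'Tech & Startups';
}

// Group the filtered tweets so each topic can be summarized separately.
function groupByTopic(tweets: Tweet[]): Map<Topic, Tweet[]> {
  const groups = new Map<Topic, Tweet[]>();
  for (const tweet of tweets) {
    const topic = categorize(tweet);
    groups.set(topic, [...(groups.get(topic) ?? []), tweet]);
  }
  return groups;
}
```

Passing `now` in explicitly, rather than calling `new Date()` inside the filter, keeps the time window deterministic in tests and dry runs.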

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-12 09:54:50 +00:00
parent b3643fd5b0
commit fabfc2b520
9 changed files with 3365 additions and 41 deletions


@@ -1,6 +1,6 @@
 import type { TechAccount } from '../types/index.js';
 
-export const TECH_ACCOUNTS: TechAccount[] = [
+const ALL_TECH_ACCOUNTS: TechAccount[] = [
   // ===========================================
   // AI / Machine Learning
   // ===========================================
@@ -51,6 +51,8 @@ export const TECH_ACCOUNTS: TechAccount[] = [
   { username: 'jason_f', displayName: 'Jason Fried', category: 'general_tech', priority: 'medium' },
 ];
 
+export const TECH_ACCOUNTS: TechAccount[] = ALL_TECH_ACCOUNTS;
+
 export function getAccountsByCategory(category: TechAccount['category']): TechAccount[] {
   return TECH_ACCOUNTS.filter((account) => account.category === category);
 }
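
For reference, the change above leaves callers untouched: TECH_ACCOUNTS is still the exported name, now aliasing the private ALL_TECH_ACCOUNTS list. A self-contained sketch of how getAccountsByCategory behaves follows. It assumes a TechAccount shape inferred from the one entry visible in the diff; the 'ai_ml' category, the 'high' and 'low' priorities, and the example_ai account are hypothetical placeholders, and `export` is omitted so the snippet runs as a standalone script.

```typescript
// Assumed shape: only the field names and the values 'general_tech' and
// 'medium' are visible in the diff; the other union members are hypothetical.
interface TechAccount {
  username: string;
  displayName: string;
  category: 'ai_ml' | 'general_tech';
  priority: 'high' | 'medium' | 'low';
}

const ALL_TECH_ACCOUNTS: TechAccount[] = [
  // Placeholder entry, not a real configured account.
  { username: 'example_ai', displayName: 'Example AI', category: 'ai_ml', priority: 'high' },
  { username: 'jason_f', displayName: 'Jason Fried', category: 'general_tech', priority: 'medium' },
];

// Public name stays the same, so existing imports keep working.
const TECH_ACCOUNTS: TechAccount[] = ALL_TECH_ACCOUNTS;

function getAccountsByCategory(category: TechAccount['category']): TechAccount[] {
  return TECH_ACCOUNTS.filter((account) => account.category === category);
}

const general = getAccountsByCategory('general_tech');
console.log(general.map((account) => account.username));
```

With this indirection, TECH_ACCOUNTS could later point at a subset of ALL_TECH_ACCOUNTS (for example during a dry run) without touching the full list, although the commit itself does not state that intent.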