Python Developer for AI/LLM Audit Engine – API Integration & Queue Processing System

Upwork

Remote

About

Project Overview

We're seeking an experienced Python developer to build a prompt-tracking audit engine that interfaces with multiple LLM APIs (ChatGPT, Claude, Gemini, etc.). This engine will serve as the core processing system for our Laravel-based generative engine optimization platform, handling both real-time content generation and large-scale asynchronous audit processing.

What You'll Build

A standalone Python service that operates in three primary modes:

Generation Mode (Real-time):
- REST API endpoints for generating content (website context, topics, prompts)
- Synchronous LLM API calls with timeout handling
- Fast response times for interactive user workflows

Audit Execution Mode (Asynchronous):
- Queue-based processing of hundreds or thousands of prompts across multiple LLMs
- Distributed task processing for parallel execution
- Progress tracking and status reporting
- Comprehensive error handling and retry logic

Insight Generation Mode (Post-Processing):
- Multi-stage analysis pipeline triggered after audit completion
- Stage 1: Compare all LLM responses per prompt (100 prompts → 100 analyses)
- Stage 2: Topic-level aggregation and pattern detection (~10-15 summaries)
- Stage 3: Funnel stage analysis (Awareness/Evaluation/Purchase)
- Stage 4: Executive summary and priority recommendations
- Each stage uses LLM calls for intelligent analysis

Technical Requirements

Core System:
- Build a REST API using FastAPI or a similar framework
- Implement queue processing using Celery/Redis or a comparable solution
- Integrate with multiple LLM APIs (OpenAI, Anthropic, Google, etc.)
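As a rough illustration of what the real-time mode implies (concurrent provider calls, per-call timeouts, retries with backoff), here is a minimal asyncio sketch. The function names and the stubbed `call_llm` are hypothetical stand-ins for real provider SDK calls, not part of the actual system.

```python
import asyncio
import random

async def call_llm(provider: str, prompt: str) -> str:
    """Stand-in for a real provider SDK call (OpenAI, Anthropic, etc.)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"{provider} response to: {prompt}"

async def call_with_timeout_and_retry(provider: str, prompt: str,
                                      timeout: float = 30.0,
                                      max_attempts: int = 3) -> str:
    """Wrap one LLM call with a timeout and exponential-backoff retries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(call_llm(provider, prompt), timeout)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == max_attempts:
                raise
            # back off 1s, 2s, 4s, ... plus jitter to avoid thundering herds
            await asyncio.sleep(2 ** (attempt - 1) + random.random())

async def fan_out(prompt: str, providers: list[str]) -> dict[str, str]:
    """Query every provider concurrently for a single prompt."""
    results = await asyncio.gather(
        *(call_with_timeout_and_retry(p, prompt) for p in providers)
    )
    return dict(zip(providers, results))

responses = asyncio.run(fan_out("best CRM for startups?",
                                ["openai", "anthropic", "google"]))
```

In the audit execution mode, the same wrapper would run inside Celery workers rather than a request handler, with rate limiting layered per provider.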
- Design for horizontal scaling and high availability
- Share a PostgreSQL database with the existing Laravel application

Data Flow:
- Accept audit specifications from Laravel via queue messages
- Process prompts through multiple LLM APIs concurrently
- Store responses and metadata in PostgreSQL
- Generate insights through multi-stage prompt analysis
- Provide real-time progress updates via Redis

Insight Generation Pipeline:
- Orchestrate multi-stage LLM analysis workflows
- Manage dependencies between analysis stages
- Batch similar insights for efficient LLM processing
- Handle hierarchical data aggregation:

  100 prompts × 3 LLMs = 300 responses
    ↓ Stage 1: 100 analyses (compare all LLM responses per prompt)
    ↓ Stage 2: ~10-15 topic summaries (grouped insights)
    ↓ Stage 3: 3 funnel stage analyses
    ↓ Stage 4: 1 executive summary

- Store insights at each level for drill-down reporting
- Implement intelligent prompt templates for each analysis stage

Manual Verification Support:
- API endpoints for Chrome extension integration
- Prompt queue management for manual operators
- Session handling and prompt assignment logic
- Result collection from manual verification

Key Deliverables
- Fully functional Python engine with both API and worker components
- Database schema for audit results and progress tracking
- Comprehensive error handling and retry mechanisms
- Rate limiting and cost tracking per LLM
- Multi-stage insight generation pipeline
- Prompt templates for each analysis stage
- Hierarchical insight storage schema
- Deployment configuration (Docker preferred)
- API documentation and integration guides
- Monitoring and logging setup

Technical Skills Required

Essential:
- Python: 5+ years experience with async programming
- API Development: FastAPI, Flask, or Django REST
- Queue Systems: Celery, RQ, or similar distributed task queues
- Database: PostgreSQL, schema design, query optimization
- LLM APIs: Experience with OpenAI, Anthropic, or similar APIs
- Message Brokers: Redis, RabbitMQ, or AWS SQS
- Pipeline Orchestration: Experience with
  multi-stage data processing workflows
- Error Handling: Retry logic, circuit breakers, timeout management

Highly Desired:
- Async Python: asyncio, aiohttp, concurrent.futures
- DevOps: Docker, container orchestration, CI/CD
- Monitoring: Logging, APM tools, observability
- Rate Limiting: Experience with API rate-limit management
- Prompt Engineering: Designing prompts for analysis and summarization
- Data Aggregation: Hierarchical data processing and storage
- Workflow Engines: Experience with Airflow, Prefect, or Temporal
- Laravel/PHP: Understanding for integration purposes

Nice to Have:
- Experience with LangChain or similar LLM frameworks
- Web scraping experience (BeautifulSoup, Scrapy)
- Previous work on audit or analytics systems
- Chrome Extensions: Experience with extension APIs
- NLP/Text Analysis: Understanding of text summarization techniques
- Experience with multi-tenant SaaS architectures

Start Date: Immediate

System Complexity Note

The system involves significant LLM orchestration:
- Initial generation phase (context, topics, prompts)
- Parallel audit execution across multiple LLMs (100+ prompts × 3-5 LLMs)
- Complex post-processing with 4 stages of LLM analysis
- Each audit may require 400+ total LLM calls (execution + analysis)

Please factor this complexity into your estimates, particularly the insight generation pipeline, which requires careful prompt design and efficient batching strategies.
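The hierarchical aggregation described in the insight pipeline (300 responses → 100 prompt analyses → ~10-15 topic summaries → 3 funnel analyses → 1 executive summary) can be sketched in plain Python. This is an illustrative skeleton only: `run_insight_pipeline`, the record fields (`prompt_id`, `topic`, `funnel`), and the `summarize` placeholder are assumptions; in the real system each `summarize` call would be an LLM analysis request, ideally batched.

```python
from collections import defaultdict

def run_insight_pipeline(responses: list[dict]) -> dict:
    """Roll raw LLM responses up through four analysis stages."""
    def summarize(label: str, items: list) -> dict:
        # Placeholder for an LLM analysis call at each stage.
        return {"label": label, "source_count": len(items)}

    # Stage 1: one analysis per prompt, comparing every LLM's response to it
    by_prompt = defaultdict(list)
    for r in responses:
        by_prompt[r["prompt_id"]].append(r)
    stage1 = [summarize(pid, rs) for pid, rs in by_prompt.items()]

    # Stage 2: group prompt-level analyses by topic
    topic_of = {r["prompt_id"]: r["topic"] for r in responses}
    by_topic = defaultdict(list)
    for a in stage1:
        by_topic[topic_of[a["label"]]].append(a)
    stage2 = [summarize(t, xs) for t, xs in by_topic.items()]

    # Stage 3: group topic summaries by funnel stage
    funnel_of = {r["topic"]: r["funnel"] for r in responses}
    by_funnel = defaultdict(list)
    for s in stage2:
        by_funnel[funnel_of[s["label"]]].append(s)
    stage3 = [summarize(f, xs) for f, xs in by_funnel.items()]

    # Stage 4: a single executive summary over the funnel analyses
    stage4 = summarize("executive_summary", stage3)
    return {"stage1": stage1, "stage2": stage2,
            "stage3": stage3, "stage4": stage4}
```

Storing each stage's output (rather than only stage 4) is what enables the drill-down reporting mentioned above, and the stage boundaries are natural points for batching similar analyses into fewer LLM calls.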
To Apply

Please include:
- Relevant examples of Python API/queue-processing systems you've built
- Any experience with LLM API integrations
- Your approach to handling distributed task processing at scale
- Experience with multi-stage data pipelines or workflow orchestration
- Availability and estimated timeline for MVP delivery
- Your preferred tech stack from the requirements above

Budget Note

Please provide quotes for:
- An MVP with core functionality (all three modes operational)
- An hourly rate for ongoing development and maintenance
- A separate estimate for the insight generation pipeline, if needed

We're looking for someone who can own this critical component of our platform and potentially stay on for future enhancements, including headless browser automation.

Important: We need someone who understands real-time API requirements, large-scale batch processing, AND complex data aggregation pipelines. The ideal candidate will have built similar orchestration systems and can demonstrate experience with production Python applications handling high-throughput workloads with multi-stage processing.