Economics Research: Literature Review Pipeline

Build a comprehensive literature review system with parallel searchers - process 30 papers in 10 minutes

Economics Research Application

Academic literature review is one of the most time-intensive tasks in economics research. A comprehensive review of 30 papers typically requires 3+ hours of manual work: searching databases, downloading PDFs, extracting key findings, and synthesizing results.

Task decomposition transforms this process through intelligent orchestration and parallelization.

Parallelization delivers roughly an 18× speedup: from 3+ hours to about 10 minutes for 30 papers.

Use Case: 30-Paper Literature Review

The system processes a broad research topic (e.g., "AI impact on labor markets") by decomposing it into specialized sub-queries, conducting parallel searches across multiple academic databases, and synthesizing findings into a comprehensive review.

System Architecture

The literature review pipeline uses a hierarchical multi-agent architecture with five specialized components:

Orchestrator Agent: Coordinates the entire workflow. Decomposes the research topic into focused sub-queries, manages task assignment to searchers, monitors progress, and triggers the synthesis phase when all papers are processed.

Searcher Agents (×3): Execute parallel searches across different domains. Each searcher specializes in a specific aspect of the topic: theoretical frameworks, empirical studies, and policy implications. This domain-based parallelization ensures comprehensive coverage.

Downloader Agent: Retrieves PDF files for all discovered papers. Operates sequentially with retry logic and rate limiting to respect publisher constraints while maintaining reliability.

Extractor Agents (×10): Process papers in parallel batches. Each extractor reads one paper, identifies key contributions, extracts methodology details, and summarizes findings. This is where maximum parallelization occurs.

Synthesizer Agent: Combines all extracted findings into a coherent literature review. Identifies common themes, highlights contradictions, and organizes results by topic area.
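The five roles above can be wired together as a minimal async pipeline skeleton. This is an illustrative sketch, not the system's actual implementation: every function here is a stub standing in for an agent that would call an LLM or a database API.

```python
import asyncio

# Stub stages; each stands in for one agent role in the architecture.
def decompose(topic: str) -> list[str]:                  # Orchestrator
    return [f"{topic}: {d}" for d in ("theory", "empirical", "policy")]

async def search(query: str) -> list[str]:               # Searcher (x3)
    return [f"paper about {query}"]

async def download(paper: str) -> str:                   # Downloader
    return f"pdf({paper})"

async def extract(pdf: str) -> str:                      # Extractor (x10)
    return f"findings from {pdf}"

def synthesize(findings: list[str]) -> str:              # Synthesizer
    return "\n".join(findings)

async def pipeline(topic: str) -> str:
    queries = decompose(topic)
    # Searchers run concurrently, one per sub-query.
    batches = await asyncio.gather(*(search(q) for q in queries))
    papers = [p for batch in batches for p in batch]
    # Downloads stay sequential to respect publisher rate limits.
    pdfs = [await download(p) for p in papers]
    # Extraction is the maximally parallel phase.
    findings = await asyncio.gather(*(extract(f) for f in pdfs))
    return synthesize(list(findings))

review = asyncio.run(pipeline("AI impact on labor markets"))
```

Note the shape of the concurrency: `gather` at the search and extraction stages, a plain loop at the download stage.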

Implementation Workflow

Query Decomposition

The orchestrator analyzes the research topic and generates three focused sub-queries targeting different aspects:

queries = orchestrator.decompose_topic(
    "AI impact on labor markets",
    domains=["theory", "empirical", "policy"]
)

This produces specialized queries like "theoretical models of AI-driven automation," "empirical studies on job displacement," and "policy responses to technological unemployment."
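A template-based sketch shows the shape of `decompose_topic` (a production orchestrator would typically prompt an LLM instead; the templates and function body here are assumptions, not the source's implementation):

```python
# Hypothetical domain-to-query templates, mirroring the example queries above.
DOMAIN_TEMPLATES = {
    "theory": "theoretical models of {topic}",
    "empirical": "empirical studies on {topic}",
    "policy": "policy responses to {topic}",
}

def decompose_topic(topic: str, domains: list[str]) -> list[str]:
    # One focused sub-query per requested domain.
    return [DOMAIN_TEMPLATES[d].format(topic=topic) for d in domains]

queries = decompose_topic("AI-driven automation",
                          ["theory", "empirical", "policy"])
```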

Parallel Search Execution

Three searcher agents execute queries simultaneously across academic databases. Each searcher returns a list of relevant papers with metadata:

results = await asyncio.gather(
    search_theory.run(queries[0]),
    search_empirical.run(queries[1]),
    search_policy.run(queries[2])
)

This parallel execution completes in about 2 minutes, versus roughly 6 minutes if the three searches ran sequentially.

PDF Download

The downloader agent retrieves all papers sequentially with rate limiting and error handling. This prevents overwhelming publisher servers while ensuring complete paper collection.

Papers are saved to a local cache for processing by extractor agents.
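A minimal sketch of the downloader's retry-with-backoff and rate-limiting pattern, assuming a placeholder `fetch_pdf` in place of a real HTTP client (the failure rate and delays are invented for the demo):

```python
import asyncio
import random

async def fetch_pdf(url: str) -> bytes:
    # Placeholder for an HTTP GET; fails randomly to exercise the retry path.
    if random.random() < 0.3:
        raise ConnectionError(url)
    return b"%PDF-" + url.encode()

async def download_all(urls: list[str], delay: float = 0.01,
                       retries: int = 5) -> dict[str, bytes]:
    cache: dict[str, bytes] = {}
    for url in urls:                 # sequential: one publisher request at a time
        for attempt in range(retries):
            try:
                cache[url] = await fetch_pdf(url)
                break
            except ConnectionError:
                await asyncio.sleep(delay * (attempt + 1))   # linear backoff
        await asyncio.sleep(delay)                           # rate limit between papers
    return cache

random.seed(0)  # deterministic failures for the demo
pdfs = asyncio.run(download_all(
    [f"https://example.org/p{i}.pdf" for i in range(5)]))
```

The sequential loop is deliberate: parallelizing here would trade reliability for speed against publisher rate limits.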

Parallel Paper Extraction

Ten extractor agents process papers in parallel batches. Each agent reads the full text, identifies research contributions, extracts methodological approaches, and summarizes key findings.

extracts = await process_parallel(
    papers, extractors=10,
    extract_fn=agent.extract_findings
)

Processing 30 papers takes 5 minutes with 10 parallel extractors versus 50 minutes sequentially.
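One plausible implementation of `process_parallel` uses an `asyncio.Semaphore` to cap the number of papers in flight at the extractor count; the stub `extract_findings` below replaces the real LLM-backed agent:

```python
import asyncio

async def process_parallel(papers, extractors: int, extract_fn):
    # Bounded concurrency: at most `extractors` papers processed at once.
    sem = asyncio.Semaphore(extractors)

    async def bounded(paper):
        async with sem:
            return await extract_fn(paper)

    return await asyncio.gather(*(bounded(p) for p in papers))

async def extract_findings(paper: str) -> dict:
    await asyncio.sleep(0.01)        # stands in for reading the full PDF
    return {"paper": paper, "summary": f"key findings of {paper}"}

papers = [f"paper_{i}" for i in range(30)]
extracts = asyncio.run(
    process_parallel(papers, extractors=10, extract_fn=extract_findings))
```

With 30 papers and 10 slots, the work proceeds in effectively three waves, which is where the 50-minutes-to-5-minutes compression comes from.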

Synthesis and Review Generation

The synthesizer agent combines all extracted findings into a structured literature review. It organizes results by theme, identifies consensus and contradictions, and highlights research gaps.

The final output is a comprehensive review document with citations, methodology comparison tables, and a synthesis of key findings across all 30 papers.
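As a toy sketch of the synthesis step, findings tagged with a theme can be grouped and rendered into a sectioned review; a real synthesizer would use an LLM pass, and the field names and sample findings here are invented:

```python
from collections import defaultdict

def synthesize(extracts: list[dict]) -> str:
    # Group findings by theme, then render one section per theme.
    themes: dict[str, list[str]] = defaultdict(list)
    for e in extracts:
        themes[e["theme"]].append(f"- {e['summary']} ({e['citation']})")
    sections = [f"## {theme}\n" + "\n".join(bullets)
                for theme, bullets in sorted(themes.items())]
    return "\n\n".join(sections)

review = synthesize([
    {"theme": "displacement", "summary": "automation reduces routine jobs",
     "citation": "Doe 2021"},
    {"theme": "displacement", "summary": "effects concentrated in manufacturing",
     "citation": "Roe 2022"},
    {"theme": "policy", "summary": "retraining programs offset losses",
     "citation": "Poe 2023"},
])
```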

Performance Results

18× Faster

10 minutes for 30-paper review vs 3+ hours manually

100% Coverage

Systematic parallel search across all three domains reduces the risk of missed papers

Consistent Quality

Standardized extraction ensures uniform analysis depth

The key insight is that decomposition enables massive parallelization at the extraction phase while maintaining coordination through the orchestrator. The architecture scales nearly linearly in that phase: doubling the number of extractors roughly halves processing time, up to the point where rate limits and coordination overhead dominate.
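The scaling claim can be checked with a back-of-envelope latency model (a batch model that ignores coordination overhead; the per-paper time is derived from the 50-minute sequential figure above):

```python
import math

def extraction_minutes(papers: int, extractors: int,
                       minutes_per_paper: float) -> float:
    # Papers are processed in waves of `extractors`; each wave takes
    # one per-paper time. Ignores orchestration overhead.
    return math.ceil(papers / extractors) * minutes_per_paper

PER_PAPER = 50 / 30  # ~1.67 min/paper, from the sequential baseline
sequential = extraction_minutes(30, 1, PER_PAPER)   # 30 waves -> 50 min
parallel = extraction_minutes(30, 10, PER_PAPER)    # 3 waves  -> 5 min
```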

Next Steps