Customization Patterns: Making It Yours

Domain-specific tools, custom MCP servers, citation styles, and knowledge management integrations

The power of this research system lies in customization. Every domain has unique databases, citation styles, and workflows. Here's how to adapt the system to your specific needs.

Custom MCP Tools for Your Domain

Build custom MCP servers to integrate domain-specific databases and workflows. The pattern is always the same: define tools, implement handlers, register with Claude Desktop.

Example: Biology Research - PubMed MCP Server

// mcp-servers/pubmed-server/server.ts

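import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import { chromium } from 'playwright';

const server = new Server({
  name: 'pubmed-tools',
  version: '1.0.0'
}, {
  capabilities: { tools: {} }
});

// setRequestHandler takes the request schema, not the 'tools/list' method string
server.setRequestHandler(ListToolsRequestSchema, async () => {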
  return {
    tools: [
      {
        name: 'pubmed_search',
        description: 'Search PubMed with MeSH terms and filters',
        inputSchema: {
          type: 'object',
          properties: {
            mesh_terms: {
              type: 'array',
              items: { type: 'string' },
              description: 'Medical Subject Headings (MeSH) terms'
            },
            text_query: {
              type: 'string',
              description: 'Free-text search query'
            },
            filters: {
              type: 'object',
              properties: {
                article_type: { type: 'array', items: { type: 'string' } },
                publication_date: { type: 'string' },
                species: { type: 'string' }
              }
            }
          },
          required: ['mesh_terms']
        }
      },
      {
        name: 'fetch_pubmed_metadata',
        description: 'Fetch complete metadata for PMIDs',
        inputSchema: {
          type: 'object',
          properties: {
            pmids: { type: 'array', items: { type: 'string' } }
          },
          required: ['pmids']
        }
      },
      {
        name: 'download_pubmed_pdfs',
        description: 'Download full-text PDFs via institutional access',
        inputSchema: {
          type: 'object',
          properties: {
            pmids: { type: 'array', items: { type: 'string' } },
            auth_session: { type: 'string' }
          },
          required: ['pmids', 'auth_session']
        }
      }
    ]
  };
});

// Implementation details: search execution, metadata parsing, PDF downloads

Key design principles:

  • MeSH term support: Medical researchers use standardized vocabulary
  • Filter-specific fields: Article type, publication date, species filters
  • Institutional access: Authenticate once, reuse session for batch downloads
  • Metadata completeness: PMID, authors, affiliations, funding sources

Adapt this pattern to any domain database: arXiv (physics/CS), JSTOR (humanities), LexisNexis (law), or internal corporate knowledge bases. The MCP server abstracts authentication, rate limiting, and parsing complexity away from your research workflow.
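
One of the implementation details elided above is turning the pubmed_search arguments into an actual PubMed query. The following is a minimal sketch of that step, assuming the publication_date filter is a "YYYY-YYYY" range and that the server queries the NCBI E-utilities esearch endpoint. The names buildPubmedQuery and buildEsearchUrl are hypothetical helpers, not part of the MCP SDK.

```typescript
// Hypothetical argument shape, mirroring the pubmed_search inputSchema.
interface PubmedSearchArgs {
  mesh_terms: string[];
  text_query?: string;
  filters?: {
    article_type?: string[];
    publication_date?: string; // assumed "YYYY-YYYY", e.g. "2020-2025"
    species?: string;
  };
}

// Quote a term and attach a PubMed field tag, e.g. "Neoplasms"[MeSH Terms].
function tagged(term: string, field: string): string {
  return `"${term.replace(/"/g, '')}"[${field}]`;
}

// Combine MeSH terms, free text, and filters into one AND-joined query.
function buildPubmedQuery(args: PubmedSearchArgs): string {
  const clauses: string[] = args.mesh_terms.map((t) => tagged(t, 'MeSH Terms'));

  if (args.text_query) {
    clauses.push(`(${args.text_query})`);
  }

  const f = args.filters ?? {};
  if (f.article_type && f.article_type.length > 0) {
    const types = f.article_type.map((t) => tagged(t, 'Publication Type'));
    clauses.push(`(${types.join(' OR ')})`);
  }
  if (f.publication_date) {
    // PubMed date ranges use a colon: 2020:2025[dp]
    const parts = f.publication_date.split('-');
    const from = parts[0] ?? '';
    const to = parts[1] ?? from;
    clauses.push(`${from}:${to}[dp]`);
  }
  if (f.species) {
    clauses.push(tagged(f.species, 'MeSH Terms'));
  }
  return clauses.join(' AND ');
}

// Build the esearch request URL for the query.
function buildEsearchUrl(args: PubmedSearchArgs, retmax = 100): string {
  const base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi';
  const term = encodeURIComponent(buildPubmedQuery(args));
  return `${base}?db=pubmed&retmode=json&retmax=${retmax}&term=${term}`;
}
```

Keeping query construction in a pure function makes it easy to unit test separately from authentication and rate limiting, which stay inside the server's tool handler.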

Domain-Specific Automation Examples

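Different academic disciplines require different workflows. Here are three examples. Each assumes an mcp client whose invoke method calls the named tool on one of your MCP servers; helpers such as rankByQuality and analyzeSourceCitations are left for you to implement.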

arXiv + GitHub Integration

Computer science research increasingly includes code artifacts. This workflow discovers papers, finds associated repositories, and analyzes code quality.

// scripts/cs-research-workflow.ts

async function csResearchWorkflow(topic: string) {
  // Step 1: Search arXiv
  const arxivPapers = await mcp.invoke('arxiv_search', {
    query: topic,
    category: 'cs.LG', // Machine Learning
    max_results: 100
  });

  // Step 2: For each paper, find associated GitHub repos
  const papersWithCode = await Promise.all(
    arxivPapers.map(async (paper) => {
      const github = await mcp.invoke('github_search', {
        query: `${paper.title} ${paper.authors[0]}`,
        language: 'python'
      });
      return { ...paper, github_repos: github.repos };
    })
  );

  // Step 3: Clone and analyze code quality
  for (const paper of papersWithCode) {
    if (paper.github_repos.length > 0) {
      const repo = paper.github_repos[0];
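      // Assumes clone_repo returns the checkout location as local_path;
      // a search result alone carries no local path
      const clone = await mcp.invoke('clone_repo', { url: repo.url });

      // Analyze code quality
      const analysis = await mcp.invoke('analyze_code', {
        repo_path: clone.local_path,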
        metrics: ['test_coverage', 'documentation', 'activity']
      });

      paper.code_quality = analysis;
    }
  }

  // Step 4: Rank by paper quality + code availability
  const ranked = rankByQuality(papersWithCode, {
    weights: {
      citations: 0.3,
      code_availability: 0.3,
      code_quality: 0.2,
      recency: 0.2
    }
  });

  return ranked;
}

Why this workflow:

  • Code reproducibility: Papers without code are harder to validate
  • Implementation quality: Well-documented, tested code indicates rigor
  • Community adoption: Active repos signal practical impact
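
The workflow calls rankByQuality without defining it. One possible sketch is a weighted sum of normalized signals. The field names citation_count and code_quality.score, the log damping of citation counts, and the ten-year recency window are all assumptions made for illustration.

```typescript
// Hypothetical paper shape; field names are assumptions for illustration.
interface RankablePaper {
  title: string;
  year: number;
  citation_count: number;
  github_repos: unknown[];
  code_quality?: { score: number }; // assumed 0..1 summary from analyze_code
}

interface RankWeights {
  citations: number;
  code_availability: number;
  code_quality: number;
  recency: number;
}

function rankByQuality<T extends RankablePaper>(
  papers: T[],
  opts: { weights: RankWeights; currentYear?: number }
): Array<T & { score: number }> {
  const { weights } = opts;
  const currentYear = opts.currentYear ?? new Date().getFullYear();

  // Log-damp citation counts, then scale by the largest value in the batch.
  const maxLog = Math.max(0, ...papers.map((p) => Math.log1p(p.citation_count)));

  return papers
    .map((p) => {
      const citations = maxLog > 0 ? Math.log1p(p.citation_count) / maxLog : 0;
      const codeAvailability = p.github_repos.length > 0 ? 1 : 0;
      const codeQuality = p.code_quality?.score ?? 0;
      // Linear decay: 1.0 for the current year, 0 for papers 10+ years old.
      const recency = Math.max(0, 1 - (currentYear - p.year) / 10);
      const score =
        weights.citations * citations +
        weights.code_availability * codeAvailability +
        weights.code_quality * codeQuality +
        weights.recency * recency;
      return { ...p, score };
    })
    .sort((a, b) => b.score - a.score); // highest score first
}
```

Because every signal is scaled to the 0..1 range, the weights can be read directly as each signal's share of the final score when they sum to 1.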

JSTOR + Archive.org Integration

Humanities research combines scholarly articles with primary sources. This workflow cross-references modern scholarship with historical documents.

// scripts/humanities-research-workflow.ts

async function humanitiesResearchWorkflow(topic: string, era: string) {
  // Step 1: Search JSTOR for scholarly articles
  const jstorArticles = await mcp.invoke('jstor_search', {
    query: topic,
    date_range: era,
    disciplines: ['History', 'Literature', 'Philosophy']
  });

  // Step 2: Search Archive.org for primary sources
  const primarySources = await mcp.invoke('archive_org_search', {
    query: topic,
    mediatype: 'texts',
    year_range: era
  });

  // Step 3: Download and OCR primary sources
  for (const source of primarySources) {
    // Assumes download_archive_item returns the saved file location as local_path
    const item = await mcp.invoke('download_archive_item', {
      identifier: source.identifier
    });

    // OCR if needed
    if (source.requires_ocr) {
      await mcp.invoke('ocr_document', {
        input: item.local_path,
        language: 'eng'
      });
    }
  }

  // Step 4: Cross-reference: which scholars cite which primary sources?
  const crossRef = await analyzeSourceCitations(jstorArticles, primarySources);

  // Step 5: Generate scholarly report
  const report = await mcp.invoke('ask-gemini', {
    prompt: `Analyze the relationship between these ${jstorArticles.length} scholarly articles
             and ${primarySources.length} primary sources on "${topic}" in the ${era} era.

             Cross-reference data: ${JSON.stringify(crossRef)}

             Identify:
             1. Most-cited primary sources
             2. Scholarly consensus and debates
             3. Under-utilized primary sources
             4. Historiographical trends`,
    model: 'gemini-2.5-pro'
  });

  return report;
}

Why this workflow:

  • Primary source access: Archive.org digitizes rare historical documents
  • OCR automation: Machine-readable text enables computational analysis
  • Citation network: Map which primary sources influence modern scholarship
  • Gap identification: Find overlooked primary sources
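
The cross-referencing step relies on analyzeSourceCitations, which the workflow leaves undefined. A minimal sketch follows, assuming each article already carries a cited_source_ids array of Archive.org identifiers extracted from its references. That field, and the result shape, are hypothetical; resolving free-text references to identifiers is the hard part and is not shown.

```typescript
// Hypothetical shapes; cited_source_ids is an assumed, pre-extracted field.
interface ScholarlyArticle {
  id: string;
  title: string;
  cited_source_ids: string[];
}

interface PrimarySource {
  identifier: string;
  title: string;
}

interface CrossReference {
  citedBy: Array<{ identifier: string; article_ids: string[] }>;
  mostCited: Array<{ identifier: string; count: number }>;
  uncited: string[];
}

function analyzeSourceCitations(
  articles: ScholarlyArticle[],
  sources: PrimarySource[]
): CrossReference {
  // Map each primary source to the articles that cite it.
  const index = new Map<string, string[]>();
  sources.forEach((s) => index.set(s.identifier, []));

  articles.forEach((a) => {
    // De-duplicate so one article counts once per source.
    Array.from(new Set(a.cited_source_ids)).forEach((id) => {
      const list = index.get(id);
      if (list) list.push(a.id); // ignore sources outside the collection
    });
  });

  const citedBy = sources.map((s) => ({
    identifier: s.identifier,
    article_ids: index.get(s.identifier) ?? []
  }));

  const mostCited = citedBy
    .filter((e) => e.article_ids.length > 0)
    .map((e) => ({ identifier: e.identifier, count: e.article_ids.length }))
    .sort((a, b) => b.count - a.count);

  const uncited = citedBy
    .filter((e) => e.article_ids.length === 0)
    .map((e) => e.identifier);

  return { citedBy, mostCited, uncited };
}
```

The uncited list is what feeds the "under-utilized primary sources" question in the report prompt, so computing it locally keeps that part of the analysis deterministic.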

PubMed + Institutional Access

Biomedical research requires precise vocabulary (MeSH terms) and institutional PDF access.

// scripts/biomedical-research-workflow.ts

async function biomedicalResearchWorkflow(meshTerms: string[]) {
  // Step 1: Search PubMed with MeSH terms
  const pubmedResults = await mcp.invoke('pubmed_search', {
    mesh_terms: meshTerms,
    filters: {
      article_type: ['Clinical Trial', 'Meta-Analysis', 'Systematic Review'],
      publication_date: '2020-2025',
      species: 'Humans'
    }
  });

  // Step 2: Fetch full metadata
  const pmids = pubmedResults.map(r => r.pmid);
  const metadata = await mcp.invoke('fetch_pubmed_metadata', { pmids });

  // Step 3: Download PDFs via institutional access
  const authSession = await mcp.invoke('authenticate_institution', {
    institution: 'harvard',
    credentials: process.env.INSTITUTION_CREDS
  });

  await mcp.invoke('download_pubmed_pdfs', {
    pmids,
    auth_session: authSession
  });

  // Step 4: Extract outcome measures from clinical trials
  const outcomes = await Promise.all(
    metadata.map(async (paper) => {
      const analysis = await mcp.invoke('ask-gemini', {
        prompt: `Extract primary and secondary outcome measures from this clinical trial abstract:

                 ${paper.abstract}

                 Return structured data: intervention, control, primary_outcome, secondary_outcomes, results.`,
        model: 'gemini-2.5-pro'
      });
      return { ...paper, outcomes: analysis };
    })
  );

  return outcomes;
}

Why this workflow:

  • MeSH precision: Standardized vocabulary reduces false positives
  • Article type filtering: Focus on high-evidence studies
  • Institutional access: Many medical journals require subscriptions
  • Structured extraction: AI extracts outcome measures for meta-analysis
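
Structured extraction only helps a meta-analysis if the model's reply is validated before use. The sketch below shows one way to parse and check a reply against the fields requested in the prompt. The function name parseTrialOutcomes and the exact field types are assumptions; the original workflow stores the raw reply without checking it.

```typescript
// Field names follow the prompt; the types are assumptions.
interface TrialOutcomes {
  intervention: string;
  control: string;
  primary_outcome: string;
  secondary_outcomes: string[];
  results: string;
}

// Returns the parsed outcomes, or null if the reply is not usable.
function parseTrialOutcomes(modelText: string): TrialOutcomes | null {
  // Models often wrap JSON in prose; take the outermost {...} span.
  const start = modelText.indexOf('{');
  const end = modelText.lastIndexOf('}');
  if (start === -1 || end <= start) return null;

  let raw: unknown;
  try {
    raw = JSON.parse(modelText.slice(start, end + 1));
  } catch {
    return null; // not valid JSON
  }
  if (typeof raw !== 'object' || raw === null) return null;

  const { intervention, control, primary_outcome, secondary_outcomes, results } =
    raw as Record<string, unknown>;

  // Reject replies with missing or mistyped fields instead of guessing.
  if (
    typeof intervention !== 'string' ||
    typeof control !== 'string' ||
    typeof primary_outcome !== 'string' ||
    typeof results !== 'string' ||
    !Array.isArray(secondary_outcomes) ||
    !secondary_outcomes.every((s) => typeof s === 'string')
  ) {
    return null;
  }

  return {
    intervention,
    control,
    primary_outcome,
    secondary_outcomes: secondary_outcomes as string[],
    results
  };
}
```

Papers whose reply comes back as null can be queued for a retry or manual review rather than silently entering the dataset.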

These examples are starting points. Every research domain has unique databases, citation conventions, and workflows. Use the MCP server pattern to build custom integrations matching your specific needs. Start with one automation, validate the approach, then expand systematically.

Citation Style Handling

Different disciplines require different citation formats. Build a universal formatter supporting all major styles.

Universal Citation Formatter

// utils/citation-formatter.ts - APA Style

const apaStyle = {
  // No extra period after the authors: the final initial already ends with one
  book: (c) =>
    `${formatAuthors(c.authors, 'apa')} (${c.year}). *${c.title}*. ${c.publisher}.`,

  article: (c) =>
    `${formatAuthors(c.authors, 'apa')} (${c.year}). ${c.title}. *${c.journal}*, *${c.volume}*(${c.issue}), ${c.pages}.`,

  web: (c) =>
    `${formatAuthors(c.authors, 'apa')} (${c.year}). ${c.title}. ${c.website}. ${c.url}`
};

// formatAuthors(authors, 'apa') → "Smith, J., & Jones, A."

APA author format: Last name, then initials. Use "&" before the final author.

// utils/citation-formatter.ts - MLA Style

const mlaStyle = {
  book: (c) =>
    `${formatAuthors(c.authors, 'mla')}. *${c.title}*. ${c.publisher}, ${c.year}.`,

  article: (c) =>
    `${formatAuthors(c.authors, 'mla')}. "${c.title}." *${c.journal}*, vol. ${c.volume}, no. ${c.issue}, ${c.year}, pp. ${c.pages}.`,

  web: (c) =>
    `${formatAuthors(c.authors, 'mla')}. "${c.title}." *${c.website}*, ${c.year}, ${c.url}.`
};

// formatAuthors(authors, 'mla') → "Smith, John, and Anne Jones"

MLA author format: Full names, with the first author inverted. Use "and" before the final author.

// utils/citation-formatter.ts - Chicago Style

const chicagoStyle = {
  book: (c) =>
    `${formatAuthors(c.authors, 'chicago')}. *${c.title}*. ${c.place}: ${c.publisher}, ${c.year}.`,

  article: (c) =>
    `${formatAuthors(c.authors, 'chicago')}. "${c.title}." *${c.journal}* ${c.volume}, no. ${c.issue} (${c.year}): ${c.pages}.`,

  web: (c) =>
    `${formatAuthors(c.authors, 'chicago')}. "${c.title}." ${c.website}. Accessed ${c.access_date}. ${c.url}.`
};

// formatAuthors(authors, 'chicago') → "Smith, John, and Anne Jones"

Chicago author format: Full names, with the first author inverted. Includes access date for web sources.

// utils/citation-formatter.ts - Nature Style

const natureStyle = {
  article: (c) =>
    `${formatAuthors(c.authors, 'nature')} ${c.title}. *${c.journal}* **${c.volume}**, ${c.pages} (${c.year}).`
};

// formatAuthors(authors, 'nature') → "Smith, J. & Jones, A."

Nature author format: Initials after last name. Ampersand separator. Bold volume number.

// utils/citation-formatter.ts - IEEE Style

const ieeeStyle = {
  article: (c) =>
    `${formatAuthors(c.authors, 'ieee')}, "${c.title}," *${c.journal}*, vol. ${c.volume}, no. ${c.issue}, pp. ${c.pages}, ${c.year}.`
};

// formatAuthors(authors, 'ieee') → "J. Smith and A. Jones"

IEEE author format: Initials first. Use "and" as the separator. Comma-heavy format.
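
Every style above calls formatAuthors, which is never shown. The following is a minimal sketch covering the five styles, assuming authors arrive as objects with given and family fields (the source does not define the author shape). Truncation rules such as "et al." for long author lists are deliberately left out, so the output for three or more authors is a simplification.

```typescript
// Assumed author shape; the source passes c.authors without defining it.
interface Author {
  given: string;
  family: string;
}

type CitationStyle = 'apa' | 'mla' | 'chicago' | 'nature' | 'ieee';

// "John Paul" -> "J. P."
function initials(given: string): string {
  return given
    .split(/\s+/)
    .filter((n) => n.length > 0)
    .map((n) => n.charAt(0).toUpperCase() + '.')
    .join(' ');
}

// Join as "A", "A<pairSep>B", or "A, B<lastSep>C".
function joinNames(names: string[], pairSep: string, lastSep: string): string {
  if (names.length <= 1) return names.join('');
  if (names.length === 2) return `${names[0]}${pairSep}${names[1]}`;
  return `${names.slice(0, -1).join(', ')}${lastSep}${names[names.length - 1]}`;
}

function formatAuthors(authors: Author[], style: CitationStyle): string {
  const invertedInitials = (a: Author) => `${a.family}, ${initials(a.given)}`;

  switch (style) {
    case 'apa':
      // Smith, J., & Jones, A.
      return joinNames(authors.map(invertedInitials), ', & ', ', & ');
    case 'nature':
      // Smith, J. & Jones, A.
      return joinNames(authors.map(invertedInitials), ' & ', ' & ');
    case 'ieee':
      // J. Smith and A. Jones
      return joinNames(
        authors.map((a) => `${initials(a.given)} ${a.family}`),
        ' and ',
        ', and '
      );
    case 'mla':
    case 'chicago':
      // First author inverted, the rest in natural order
      return joinNames(
        authors.map((a, i) =>
          i === 0 ? `${a.family}, ${a.given}` : `${a.given} ${a.family}`
        ),
        ', and ',
        ', and '
      );
  }
}
```

Note that the APA and Nature results end with the period of the last initial, while the MLA, Chicago, and IEEE results carry no trailing punctuation, so the surrounding template decides what follows the author list.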

Complete Implementation

// utils/citation-formatter.ts

const citationStyles = {
  apa: { /* ... */ },
  mla: { /* ... */ },
  chicago: { /* ... */ },
  nature: { /* ... */ },
  ieee: { /* ... */ }
};

async function formatCitation(paper: Paper, style: string): Promise<string> {
  const type = detectCitationType(paper);

  if (citationStyles[style] && citationStyles[style][type]) {
    return citationStyles[style][type](paper);
  }

  // Fallback: Use CSL processor for obscure styles
  return await mcp.invoke('format_csl', {
    citation: paper,
    style: style
  });
}

For less common citation styles (e.g., Bluebook for law, discipline-specific journals), use a CSL (Citation Style Language) processor as a fallback. The CSL style repository offers over 10,000 citation styles, each defined as a standardized XML style file; the citation data itself is passed to the processor as CSL-JSON.
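
A CSL processor expects its input items in CSL-JSON, so the fallback needs a conversion step from the Paper record. The sketch below shows one possible mapping for journal articles. The CslPaper shape (authors as "First Last" strings) and the naive name splitting are assumptions; the CSL-JSON field names (container-title, issued, date-parts, DOI) follow the CSL data schema.

```typescript
// Assumed paper shape: authors as "First Last" strings.
interface CslPaper {
  title: string;
  authors: string[];
  year: number;
  journal?: string;
  volume?: string;
  issue?: string;
  pages?: string;
  doi?: string;
}

interface CslName {
  family: string;
  given: string;
}

interface CslItem {
  id: string;
  type: 'article-journal';
  title: string;
  author: CslName[];
  issued: { 'date-parts': number[][] };
  'container-title'?: string;
  volume?: string;
  issue?: string;
  page?: string;
  DOI?: string;
}

// Naive split: last word is the family name. Breaks on names like "van der Berg".
function splitName(name: string): CslName {
  const parts = name.trim().split(/\s+/);
  return {
    family: parts[parts.length - 1] ?? '',
    given: parts.slice(0, -1).join(' ')
  };
}

function toCslJson(paper: CslPaper, id: string): CslItem {
  return {
    id,
    type: 'article-journal',
    title: paper.title,
    author: paper.authors.map(splitName),
    issued: { 'date-parts': [[paper.year]] },
    'container-title': paper.journal,
    volume: paper.volume,
    issue: paper.issue,
    page: paper.pages,
    DOI: paper.doi
  };
}
```

Fields left undefined are dropped by JSON.stringify, so the serialized item only contains what the paper record actually provides.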

Personal Knowledge Management Integration

Connect research automation to your existing note-taking and knowledge management systems.

Obsidian Vault Sync

Generate markdown files with bidirectional links and tags for Obsidian's graph view.

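// integrations/obsidian-sync.ts

import { promises as fs } from 'fs';

async function syncToObsidian(papers: Paper[], vaultPath: string) {
  for (const paper of papers) {
    // Frontmatter must start on the first line of the file, so no leading newline.
    // JSON.stringify escapes quotes in the title and yields a valid YAML string.
    const note = `---
title: ${JSON.stringify(paper.title)}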
authors: ${paper.authors.join(', ')}
year: ${paper.year}
tags: [${paper.keywords.map(k => `research/${k}`).join(', ')}]
doi: ${paper.doi}
---

# ${paper.title}

**Authors**: ${paper.authors.join(', ')}
**Year**: ${paper.year}
**DOI**: [${paper.doi}](https://doi.org/${paper.doi})

## Abstract

${paper.abstract}

## Key Findings

${paper.key_findings.map(f => `- ${f}`).join('\n')}

## Methodology

${paper.methodology_summary}

## My Notes

<!-- Add your notes here -->

## Related Papers

${paper.cited_by.map(c => `- [[${sanitizeFilename(c.title)}]]`).join('\n')}

## PDF

[Local PDF](file://${paper.pdf_path})
`;

    await fs.writeFile(
      `${vaultPath}/Research/${sanitizeFilename(paper.title)}.md`,
      note
    );
  }

  console.log(`Synced ${papers.length} papers to Obsidian vault`);
}

Key features:

  • YAML frontmatter: Metadata for Obsidian's search and filtering
  • Wikilinks: [[Paper Title]] links to the note with that filename; Obsidian shows the reverse direction as a backlink, and both appear in graph view
  • Hierarchical tags: research/machine-learning, research/nlp
  • File links: Direct links to local PDFs for annotation
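
The sync code depends on sanitizeFilename, and wikilinks only resolve if they target the same sanitized name as the note file. A minimal sketch of both pieces follows. The exact character set and the 120-character cap are assumptions chosen to avoid characters that are invalid in common file systems or that have special meaning inside Obsidian links.

```typescript
// Turn a paper title into a safe note filename (without the .md extension).
function sanitizeFilename(title: string, maxLength = 120): string {
  const cleaned = title
    // Characters invalid in filenames or meaningful inside [[wikilinks]]
    .replace(/[\\\/:*?"<>|#^\[\]]/g, ' ')
    .replace(/\s+/g, ' ') // collapse runs of whitespace
    .trim()
    .replace(/^\.+/, '') // avoid creating hidden dotfiles
    .slice(0, maxLength)
    .trim();
  return cleaned.length > 0 ? cleaned : 'untitled';
}

// Build a wikilink that points at the sanitized note name.
function wikilink(title: string): string {
  return `[[${sanitizeFilename(title)}]]`;
}
```

Two titles that differ only in stripped characters will map to the same filename, so a production version would also need a collision check, for example appending the publication year or DOI suffix.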

Notion Database Sync

Create structured Notion database entries with rich text blocks.

// integrations/notion-sync.ts

import { Client } from '@notionhq/client';

async function syncToNotion(papers: Paper[], notionToken: string, databaseId: string) {
  const notion = new Client({ auth: notionToken });

  for (const paper of papers) {
    await notion.pages.create({
      parent: { database_id: databaseId },
      properties: {
        Title: { title: [{ text: { content: paper.title } }] },
        Authors: { rich_text: [{ text: { content: paper.authors.join(', ') } }] },
        Year: { number: paper.year },
        DOI: { url: `https://doi.org/${paper.doi}` },
        Keywords: { multi_select: paper.keywords.map(k => ({ name: k })) },
        Status: { select: { name: 'To Read' } }
      },
      children: [
        {
          object: 'block',
          type: 'heading_2',
          heading_2: { rich_text: [{ text: { content: 'Abstract' } }] }
        },
        {
          object: 'block',
          type: 'paragraph',
          paragraph: { rich_text: [{ text: { content: paper.abstract } }] }
        },
        // Additional blocks for methodology, findings, etc.
      ]
    });
  }
}

Key features:

  • Database properties: Searchable, filterable metadata fields
  • Multi-select tags: Keywords for filtering and grouping
  • Status tracking: To Read, Reading, Completed, Archived
  • Rich text blocks: Formatted content with headings, lists, quotes
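
One practical detail for the rich text blocks: the Notion API limits the text content of a single rich text object to 2,000 characters, so a long abstract has to be split across several objects. The sketch below shows one way to do that. The helper name toRichText is hypothetical, and splitting at a fixed offset can cut a word (or a surrogate pair) in half, so treat it as a starting point.

```typescript
// Notion rejects rich text objects whose text.content exceeds this length.
const NOTION_TEXT_LIMIT = 2000;

interface NotionRichText {
  type: 'text';
  text: { content: string };
}

// Split a long string into consecutive rich text objects under the limit.
function toRichText(content: string, limit = NOTION_TEXT_LIMIT): NotionRichText[] {
  const chunks: NotionRichText[] = [];
  for (let i = 0; i < content.length; i += limit) {
    chunks.push({
      type: 'text',
      text: { content: content.slice(i, i + limit) }
    });
  }
  return chunks;
}
```

In the sync function this would replace the single-element array, for example paragraph: { rich_text: toRichText(paper.abstract) }.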

Zotero Library Import

Generate Zotero-compatible BibTeX or RIS files for reference management.

// integrations/zotero-sync.ts

import { promises as fs } from 'fs';

async function syncToZotero(papers: Paper[], zoteroLibrary: string) {
  const bibtexEntries = papers.map(paper => `
@article{${generateCiteKey(paper)},
  title = {${paper.title}},
  author = {${paper.authors.join(' and ')}},
  journal = {${paper.journal}},
  year = {${paper.year}},
  volume = {${paper.volume}},
  number = {${paper.issue}},
  pages = {${paper.pages}},
  doi = {${paper.doi}},
  file = {${paper.pdf_path}}
}
  `).join('\n\n');

  await fs.writeFile(
    `${zoteroLibrary}/auto-imported.bib`,
    bibtexEntries
  );

  console.log(`Generated BibTeX file with ${papers.length} entries`);
}

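function generateCiteKey(paper: Paper): string {
  // Last word of the first author's name, e.g. "Jane Smith" -> "smith"
  const firstAuthor = (paper.authors[0] ?? 'anon').trim().split(/\s+/).pop()!.toLowerCase();
  // Strip punctuation so a title like "Attention, Memory..." cannot break the BibTeX key
  const firstWord = paper.title.trim().split(/\s+/)[0].toLowerCase().replace(/[^a-z0-9]/g, '');
  return `${firstAuthor}${paper.year}${firstWord}`;
}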

Key features:

  • BibTeX format: Universal citation exchange format
  • Auto-generated cite keys: smith2023transformer
  • PDF attachments: Links to local PDFs for Zotero's PDF viewer
  • Library import: Import the generated .bib file through Zotero's File → Import menu
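
The section mentions RIS as an alternative to BibTeX but only shows the latter. Below is a minimal sketch of a RIS serializer for journal articles. The RisPaper shape and the naive "First Last" name handling are assumptions; the two-letter tags (TY, TI, AU, JO, PY, VL, IS, SP, EP, DO, ER) are standard RIS fields.

```typescript
// Assumed paper shape: authors as "First Last" strings.
interface RisPaper {
  title: string;
  authors: string[];
  year: number;
  journal?: string;
  volume?: string;
  issue?: string;
  pages?: string; // e.g. "10-20"
  doi?: string;
}

// RIS expects "Last, First"; the last word is treated as the family name.
function toRisName(name: string): string {
  const parts = name.trim().split(/\s+/);
  const family = parts[parts.length - 1] ?? '';
  const given = parts.slice(0, -1).join(' ');
  return given ? `${family}, ${given}` : family;
}

function toRis(paper: RisPaper): string {
  const lines: string[] = ['TY  - JOUR'];
  // Each RIS line is: tag, two spaces, hyphen, space, value.
  const add = (tag: string, value: string | number | undefined) => {
    if (value !== undefined && value !== '') lines.push(`${tag}  - ${value}`);
  };

  add('TI', paper.title);
  paper.authors.forEach((a) => add('AU', toRisName(a))); // one AU line per author
  add('JO', paper.journal);
  add('PY', paper.year);
  add('VL', paper.volume);
  add('IS', paper.issue);
  if (paper.pages) {
    const parts = paper.pages.split(/[-\u2013]/); // hyphen or en dash
    add('SP', (parts[0] ?? '').trim());
    add('EP', (parts[1] ?? '').trim());
  }
  add('DO', paper.doi);
  lines.push('ER  - '); // end-of-record marker
  return lines.join('\n');
}
```

Multiple records can be concatenated with a blank line between them and saved with a .ris extension for import.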

Choose your knowledge management system based on workflow preferences:

  • Obsidian: Local-first, plain text, graph view for concept exploration
  • Notion: Cloud-based, structured databases, team collaboration
  • Zotero: Academic-focused, citation management, Word/LaTeX integration

All three can coexist. Export to all formats and use each for different purposes.

Advanced Orchestration Patterns

Combine automation primitives into sophisticated research workflows.

Pattern 1: Incremental Research Updates

Keep research projects current with daily automated updates.

// scripts/incremental-update.ts

async function dailyIncrementalUpdate(projectName: string) {
  // Load project state
  const project = await loadProject(projectName);
  const lastUpdate = project.last_update;

  // Search for new papers since last update
  const newPapers = await mcp.invoke('search_since', {
    queries: project.search_queries,
    date_from: lastUpdate,
    databases: project.databases
  });

  console.log(`Found ${newPapers.length} new papers since ${lastUpdate}`);

  // AI-powered relevance check against existing collection
  const relevant = await mcp.invoke('ask-gemini', {
    prompt: `Which of these new papers are relevant to the existing collection?

             Existing themes: ${JSON.stringify(project.themes)}
             Existing paper titles: ${JSON.stringify(project.papers.map(p => p.title))}

             New papers: ${JSON.stringify(newPapers)}

             Return only highly relevant papers (score >= 8/10).`,
    model: 'gemini-2.5-pro'
  });

  // Auto-download and integrate
  for (const paper of relevant.papers) {
    await downloadAndIntegrate(paper, projectName);
  }

  // Update project state
  project.last_update = new Date();
  project.papers.push(...relevant.papers);
  await saveProject(project);

  // Generate update notification
  return {
    papers_added: relevant.papers.length,
    summary: relevant.summary,
    action_required: relevant.papers.filter(p => p.score >= 9.5).length > 0
  };
}

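// Run daily via cron. A TypeScript runner such as tsx is assumed here,
// since only recent versions of node can execute .ts files directly
// 0 9 * * * npx tsx scripts/incremental-update.ts "transformer-nlp-review"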

Automation workflow:

  1. Daily cron job: Runs every morning at 9 AM
  2. Incremental search: Only papers published since last update
  3. AI relevance filtering: Gemini scores new papers against existing themes
  4. Auto-integration: High-scoring papers downloaded and added automatically
  5. Notification: The returned action_required flag is set when critical papers (9.5/10+) are found; wire it to email or another alert channel

Benefits:

  • Always current: Never miss recent publications
  • Low noise: AI filters out irrelevant papers
  • Minimal manual work: Only review high-priority papers
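
Because the relevance scores come from a model, it is worth re-checking them locally before anything is downloaded. The sketch below applies the two thresholds used above (8 for integration, 9.5 for critical) and drops papers already in the collection. The ScoredPaper shape, the function names, and DOI-based de-duplication are assumptions for illustration.

```typescript
// Hypothetical shape of a paper returned by the relevance check.
interface ScoredPaper {
  doi: string;
  title: string;
  score: number; // 0..10 relevance score assigned by the model
}

interface TriageResult {
  toIntegrate: ScoredPaper[];
  critical: ScoredPaper[];
  actionRequired: boolean;
}

// Compare DOIs case-insensitively and without a resolver prefix.
function normalizeDoi(doi: string): string {
  return doi.trim().toLowerCase().replace(/^https?:\/\/(dx\.)?doi\.org\//, '');
}

function triageNewPapers(
  candidates: ScoredPaper[],
  existingDois: string[],
  thresholds = { integrate: 8, critical: 9.5 }
): TriageResult {
  const seen = new Set(existingDois.map(normalizeDoi));
  const toIntegrate: ScoredPaper[] = [];

  for (const paper of candidates) {
    const key = normalizeDoi(paper.doi);
    // Skip duplicates and anything below the integration threshold.
    if (seen.has(key) || paper.score < thresholds.integrate) continue;
    seen.add(key);
    toIntegrate.push(paper);
  }

  const critical = toIntegrate.filter((p) => p.score >= thresholds.critical);
  return { toIntegrate, critical, actionRequired: critical.length > 0 };
}
```

Enforcing the thresholds in code means a model reply that ignores the "score >= 8/10" instruction cannot add low-relevance papers to the project.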

Pattern 2: Multi-Project Citation Network Analysis

Analyze citation relationships across multiple research projects.

// scripts/citation-network.ts

async function buildCitationNetwork(projects: string[]) {
  // Load all papers from multiple projects
  const allPapers = [];
  for (const proj of projects) {
    const papers = await loadProjectPapers(proj);
    allPapers.push(...papers.map(p => ({ ...p, project: proj })));
  }

  // Build citation graph
  const graph = {
    nodes: allPapers.map(p => ({
      id: p.doi,
      title: p.title,
      project: p.project,
      year: p.year
    })),
    edges: []
  };

  // Extract citation relationships
  for (const paper of allPapers) {
    for (const cited of paper.references) {
      const citedPaper = allPapers.find(p => p.doi === cited.doi);
      if (citedPaper) {
        graph.edges.push({
          source: paper.doi,
          target: cited.doi,
          cross_project: paper.project !== citedPaper.project
        });
      }
    }
  }

  // Analyze network
  const analysis = await mcp.invoke('ask-gemini', {
    prompt: `Analyze this citation network across ${projects.length} research projects.

             Identify:
             1. Hub papers (highly cited across projects)
             2. Bridge papers (connecting different projects)
             3. Isolated clusters (papers that don't cite each other)
             4. Cross-project citation patterns

             Graph data: ${JSON.stringify(graph)}`,
    model: 'gemini-2.5-pro'
  });

  // Visualize (export to Gephi format)
  await exportToGephi(graph, 'citation-network.gexf');

  return analysis;
}

Use cases:

  • Identify foundational papers: Which papers are cited across all projects?
  • Find research gaps: Isolated clusters suggest under-connected areas
  • Discover interdisciplinary bridges: Papers connecting disparate fields
  • Visualize knowledge structure: Gephi/Cytoscape for network visualization
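
Hub papers, bridge papers, and isolated nodes can also be computed directly from the graph, which gives a deterministic baseline to compare against the model's analysis. The sketch below works on the nodes and edges built above. The function name summarizeNetwork and the definitions used (hub = highest in-degree, bridge = cited from a project other than its own, isolated = no edges at all) are assumptions, one reasonable reading among several.

```typescript
interface GraphNode {
  id: string;
  project: string;
}

interface GraphEdge {
  source: string; // citing paper
  target: string; // cited paper
}

interface NetworkSummary {
  hubs: Array<{ id: string; citations: number }>;
  bridges: string[];
  isolated: string[];
}

function summarizeNetwork(
  nodes: GraphNode[],
  edges: GraphEdge[],
  topN = 5
): NetworkSummary {
  const projectOf = new Map<string, string>();
  const inDegree = new Map<string, number>();
  const citingProjects = new Map<string, Set<string>>();
  const touched = new Set<string>();

  nodes.forEach((n) => {
    projectOf.set(n.id, n.project);
    inDegree.set(n.id, 0);
    citingProjects.set(n.id, new Set<string>());
  });

  edges.forEach((e) => {
    const sourceProject = projectOf.get(e.source);
    // Ignore edges that reference papers outside the node set.
    if (sourceProject === undefined || !projectOf.has(e.target)) return;
    inDegree.set(e.target, (inDegree.get(e.target) ?? 0) + 1);
    citingProjects.get(e.target)?.add(sourceProject);
    touched.add(e.source);
    touched.add(e.target);
  });

  // Hubs: most-cited papers within the collection.
  const hubs = nodes
    .map((n) => ({ id: n.id, citations: inDegree.get(n.id) ?? 0 }))
    .filter((h) => h.citations > 0)
    .sort((a, b) => b.citations - a.citations)
    .slice(0, topN);

  // Bridges: papers cited from at least one project other than their own.
  const bridges = nodes
    .filter((n) =>
      Array.from(citingProjects.get(n.id) ?? []).some((p) => p !== n.project)
    )
    .map((n) => n.id);

  // Isolated: papers with no incoming or outgoing edges.
  const isolated = nodes.filter((n) => !touched.has(n.id)).map((n) => n.id);

  return { hubs, bridges, isolated };
}
```

Sending this summary to the model instead of the full graph also keeps the prompt small when the collection grows to thousands of papers.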

Pattern 3: Collaborative Research with Shared Citation Database

Enable team research with shared citation management.

// scripts/collaborative-research.ts

async function setupSharedResearchSpace(teamMembers: string[], projectName: string) {
  // Create shared Supabase/PostgreSQL database
  const db = await initializeSharedDatabase(projectName);

  // Configure MCP server with shared database
  const mcpConfig = {
    mcpServers: {
      'shared-citations': {
        command: 'node',
        args: ['./mcp-servers/shared-citation-server.js'],
        env: {
          DATABASE_URL: db.connectionString,
          PROJECT_ID: projectName,
          TEAM_MEMBERS: teamMembers.join(',')
        }
      }
    }
  };

  // Each team member can now:
  // 1. Add papers to shared collection
  // 2. See who added what (provenance tracking)
  // 3. Annotate papers (personal and shared notes)
  // 4. Generate bibliographies from shared pool

  // Generate invitation links
  const invitations = teamMembers.map(member => ({
    email: member,
    invitation_link: generateInvitationLink(db, member, projectName),
    mcp_config: mcpConfig
  }));

  return invitations;
}

Features:

  • Shared citation pool: All team members access same database
  • Provenance tracking: Who added which papers?
  • Personal + shared notes: Private annotations + team discussions
  • Conflict resolution: Last-write-wins or merge strategies
  • Bibliography generation: Export team's collective references

Implementation:

  • Database: Supabase (PostgreSQL + real-time subscriptions)
  • MCP server: Custom server connecting to shared database
  • Authentication: Each team member has unique API key
  • Sync: Real-time updates via Supabase subscriptions
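
Of the conflict resolution strategies listed, last-write-wins is the simplest to implement. The sketch below merges a local and a remote set of citation records by DOI and keeps whichever copy has the newer timestamp. The CitationRecord shape and the updated_at field are assumptions; a real shared database would more likely enforce this on the server side.

```typescript
// Hypothetical record shape for the shared citation pool.
interface CitationRecord {
  doi: string;
  title: string;
  notes: string;
  added_by: string;
  updated_at: string; // ISO 8601 timestamp
}

// Merge two record sets by DOI, keeping the most recently updated copy.
// On equal timestamps the local copy wins because it is seen first.
function mergeLastWriteWins(
  local: CitationRecord[],
  remote: CitationRecord[]
): CitationRecord[] {
  const merged = new Map<string, CitationRecord>();

  local.concat(remote).forEach((rec) => {
    const current = merged.get(rec.doi);
    if (
      current === undefined ||
      Date.parse(rec.updated_at) > Date.parse(current.updated_at)
    ) {
      merged.set(rec.doi, rec);
    }
  });

  return Array.from(merged.values());
}
```

Last-write-wins silently discards the older edit, which is acceptable for metadata fixes but risky for shared notes; a field-level merge is the usual next step when teammates annotate the same paper.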

Summary

Customization transforms a generic research system into a domain-specific research assistant.

Key principles:

  1. Build custom MCP servers for domain databases (PubMed, arXiv, JSTOR)
  2. Adapt workflows to discipline-specific practices (code artifacts, primary sources, clinical trials)
  3. Support multiple citation styles with universal formatters
  4. Integrate knowledge management tools (Obsidian, Notion, Zotero)
  5. Orchestrate advanced patterns (incremental updates, citation networks, team collaboration)

Next steps:

  • Identify your domain's unique databases and conventions
  • Build one custom MCP server as a proof-of-concept
  • Validate the workflow with a small research project
  • Expand systematically as needs arise

The system is built to be extended one tool at a time. Start simple, then iterate based on real research needs.