Customization Patterns: Making It Yours

Domain-specific tools, custom MCP servers, citation styles, and knowledge management integrations

The power of this research system lies in customization. Every domain has unique databases, citation styles, and workflows. Here's how to adapt the system to your specific needs.

Custom MCP Tools for Your Domain

Build custom MCP servers to integrate domain-specific databases and workflows. The pattern is always the same: define tools, implement handlers, register with Claude Desktop.

Example: Biology Research - PubMed MCP Server

// mcp-servers/pubmed-server/server.ts

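import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import { chromium } from 'playwright';

const server = new Server({
  name: 'pubmed-tools',
  version: '1.0.0'
}, {
  capabilities: { tools: {} }
});

// The SDK registers handlers by request schema, not by method-name string
server.setRequestHandler(ListToolsRequestSchema, async () => {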
  return {
    tools: [
      {
        name: 'pubmed_search',
        description: 'Search PubMed with MeSH terms and filters',
        inputSchema: {
          type: 'object',
          properties: {
            mesh_terms: {
              type: 'array',
              items: { type: 'string' },
              description: 'Medical Subject Headings (MeSH) terms'
            },
            text_query: {
              type: 'string',
              description: 'Free-text search query'
            },
            filters: {
              type: 'object',
              properties: {
                article_type: { type: 'array', items: { type: 'string' } },
                publication_date: { type: 'string' },
                species: { type: 'string' }
              }
            }
          },
          required: ['mesh_terms']
        }
      },
      {
        name: 'fetch_pubmed_metadata',
        description: 'Fetch complete metadata for PMIDs',
        inputSchema: {
          type: 'object',
          properties: {
            pmids: { type: 'array', items: { type: 'string' } }
          },
          required: ['pmids']
        }
      },
      {
        name: 'download_pubmed_pdfs',
        description: 'Download full-text PDFs via institutional access',
        inputSchema: {
          type: 'object',
          properties: {
            pmids: { type: 'array', items: { type: 'string' } },
            auth_session: { type: 'string' }
          },
          required: ['pmids', 'auth_session']
        }
      }
    ]
  };
});

// Implementation details: search execution, metadata parsing, PDF downloads

Key design principles:

MeSH term support: Medical researchers use standardized vocabulary
Filter-specific fields: Article type, publication date, species filters
Institutional access: Authenticate once, reuse session for batch downloads
Metadata completeness: PMID, authors, affiliations, funding sources

Adapt this pattern to any domain database: arXiv (physics/CS), JSTOR (humanities), LexisNexis (law), or internal corporate knowledge bases. The MCP server abstracts authentication, rate limiting, and parsing complexity away from your research workflow.
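
As one illustration of the parsing work such a server hides, here is a minimal sketch of how a pubmed_search handler might turn its arguments into a PubMed query. The type names and the functions buildPubmedQuery and buildEsearchUrl are hypothetical, and the "YYYY-YYYY" date format is an assumption taken from the workflow example in this chapter. The field tags ([MeSH Terms], [Publication Type], [dp]) and the ESearch endpoint are standard NCBI E-utilities features.

```typescript
// Hypothetical argument types mirroring the pubmed_search input schema.
interface PubmedFilters {
  article_type?: string[];
  publication_date?: string; // assumed format "YYYY-YYYY", e.g. "2020-2025"
  species?: string;
}

interface PubmedSearchArgs {
  mesh_terms: string[];
  text_query?: string;
  filters?: PubmedFilters;
}

// Combine the tool arguments into one PubMed query string using field tags.
function buildPubmedQuery(args: PubmedSearchArgs): string {
  const clauses: string[] = args.mesh_terms.map((t) => `"${t}"[MeSH Terms]`);
  if (args.text_query) {
    clauses.push(`(${args.text_query})`);
  }
  const filters: PubmedFilters = args.filters ?? {};
  if (filters.article_type && filters.article_type.length > 0) {
    const types = filters.article_type.map((t) => `"${t}"[Publication Type]`);
    clauses.push(`(${types.join(' OR ')})`);
  }
  if (filters.publication_date) {
    // Only a plain year range is handled in this sketch.
    const range = /^(\d{4})-(\d{4})$/.exec(filters.publication_date);
    if (range) {
      clauses.push(`${range[1]}:${range[2]}[dp]`);
    }
  }
  if (filters.species) {
    clauses.push(`"${filters.species}"[MeSH Terms]`);
  }
  return clauses.join(' AND ');
}

// Build the ESearch request URL. A handler would fetch this URL and read
// esearchresult.idlist from the JSON response to get the matching PMIDs.
function buildEsearchUrl(query: string, retmax: number = 100): string {
  const base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi';
  return `${base}?db=pubmed&retmode=json&retmax=${retmax}&term=${encodeURIComponent(query)}`;
}
```

Keeping query construction in a pure function like this makes it easy to unit-test separately from network access and authentication.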

Domain-Specific Automation Examples

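Different academic disciplines require different workflows. Here are three example workflows. Each assumes an mcp client object whose invoke method calls a registered tool by name; tools such as arxiv_search or jstor_search come from your own MCP servers.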

arXiv + GitHub Integration

Computer science research increasingly includes code artifacts. This workflow discovers papers, finds associated repositories, and analyzes code quality.

// scripts/cs-research-workflow.ts

async function csResearchWorkflow(topic: string) {
  // Step 1: Search arXiv
  const arxivPapers = await mcp.invoke('arxiv_search', {
    query: topic,
    category: 'cs.LG', // Machine Learning
    max_results: 100
  });

  // Step 2: For each paper, find associated GitHub repos
  const papersWithCode = await Promise.all(
    arxivPapers.map(async (paper) => {
      const github = await mcp.invoke('github_search', {
        query: `${paper.title} ${paper.authors[0]}`,
        language: 'python'
      });
      return { ...paper, github_repos: github.repos };
    })
  );

  // Step 3: Clone and analyze code quality
  for (const paper of papersWithCode) {
    if (paper.github_repos.length > 0) {
      const repo = paper.github_repos[0];
      await mcp.invoke('clone_repo', { url: repo.url });

      // Analyze code quality
      const analysis = await mcp.invoke('analyze_code', {
        repo_path: repo.local_path,
        metrics: ['test_coverage', 'documentation', 'activity']
      });

      paper.code_quality = analysis;
    }
  }

  // Step 4: Rank by paper quality + code availability
  const ranked = rankByQuality(papersWithCode, {
    weights: {
      citations: 0.3,
      code_availability: 0.3,
      code_quality: 0.2,
      recency: 0.2
    }
  });

  return ranked;
}

Why this workflow:

Code reproducibility: Papers without code are harder to validate
Implementation quality: Well-documented, tested code indicates rigor
Community adoption: Active repos signal practical impact
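
The workflow calls rankByQuality without showing it. The following is a minimal sketch of one way a weighted ranking could work, not the author's implementation. The PaperSignals shape, the 0 to 1 codeQuality score, the ten-year recency window, and the function name rankByWeightedScore are all assumptions made for illustration.

```typescript
interface RankingWeights {
  citations: number;
  code_availability: number;
  code_quality: number;
  recency: number;
}

// Hypothetical, already-extracted signals for one paper.
interface PaperSignals {
  title: string;
  citations: number;
  year: number;
  hasCode: boolean;
  codeQuality: number; // assumed to be normalized to the range 0..1
}

// Score each paper as a weighted sum of normalized signals, highest first.
function rankByWeightedScore(
  papers: PaperSignals[],
  weights: RankingWeights,
  currentYear: number
): Array<PaperSignals & { score: number }> {
  // Normalize citations against the most-cited paper in this batch.
  const maxCitations = Math.max(1, ...papers.map((p) => p.citations));
  return papers
    .map((p) => {
      // Recency decays linearly to zero over ten years (an arbitrary choice).
      const recency = Math.max(0, 1 - (currentYear - p.year) / 10);
      const score =
        weights.citations * (p.citations / maxCitations) +
        weights.code_availability * (p.hasCode ? 1 : 0) +
        weights.code_quality * p.codeQuality +
        weights.recency * recency;
      return { ...p, score };
    })
    .sort((a, b) => b.score - a.score);
}
```

With the weights from the workflow (0.3, 0.3, 0.2, 0.2), a recent paper with tested code can outrank an older, more heavily cited paper that has no repository.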

JSTOR + Archive.org Integration

Humanities research combines scholarly articles with primary sources. This workflow cross-references modern scholarship with historical documents.

// scripts/humanities-research-workflow.ts

async function humanitiesResearchWorkflow(topic: string, era: string) {
  // Step 1: Search JSTOR for scholarly articles
  const jstorArticles = await mcp.invoke('jstor_search', {
    query: topic,
    date_range: era,
    disciplines: ['History', 'Literature', 'Philosophy']
  });

  // Step 2: Search Archive.org for primary sources
  const primarySources = await mcp.invoke('archive_org_search', {
    query: topic,
    mediatype: 'texts',
    year_range: era
  });

  // Step 3: Download and OCR primary sources
  for (const source of primarySources) {
    await mcp.invoke('download_archive_item', {
      identifier: source.identifier
    });

    // OCR if needed
    if (source.requires_ocr) {
      await mcp.invoke('ocr_document', {
        input: source.local_path,
        language: 'eng'
      });
    }
  }

  // Step 4: Cross-reference: which scholars cite which primary sources?
  const crossRef = await analyzeSourceCitations(jstorArticles, primarySources);

  // Step 5: Generate scholarly report
  const report = await mcp.invoke('ask-gemini', {
    prompt: `Analyze the relationship between these ${jstorArticles.length} scholarly articles
             and ${primarySources.length} primary sources on "${topic}" in the ${era} era.

             Cross-reference data: ${JSON.stringify(crossRef)}

             Identify:
             1. Most-cited primary sources
             2. Scholarly consensus and debates
             3. Under-utilized primary sources
             4. Historiographical trends`,
    model: 'gemini-2.5-pro'
  });

  return report;
}

Why this workflow:

Primary source access: Archive.org digitizes rare historical documents
OCR automation: Machine-readable text enables computational analysis
Citation network: Map which primary sources influence modern scholarship
Gap identification: Find overlooked primary sources
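
The cross-referencing step (analyzeSourceCitations in the workflow) is not shown. A minimal sketch of the idea follows, assuming each article already carries a list of the Archive.org identifiers it cites. That cited_identifiers field, the type names, and the function name crossReferenceSources are hypothetical; extracting citations from article text is a separate problem not covered here.

```typescript
interface ScholarlyArticle {
  id: string;
  cited_identifiers: string[]; // assumed: identifiers of primary sources cited
}

interface PrimarySource {
  identifier: string;
  title: string;
}

interface CrossReference {
  citedBy: { [identifier: string]: string[] };
  mostCited: string[]; // source identifiers, most-cited first
  uncited: string[];   // candidates for "under-utilized" sources
}

function crossReferenceSources(
  articles: ScholarlyArticle[],
  sources: PrimarySource[]
): CrossReference {
  const index = new Map<string, string[]>();
  for (const source of sources) {
    index.set(source.identifier, []);
  }
  for (const article of articles) {
    for (const id of article.cited_identifiers) {
      const citing = index.get(id);
      // Ignore unknown sources and count each article at most once per source.
      if (citing && citing.indexOf(article.id) === -1) {
        citing.push(article.id);
      }
    }
  }
  const count = (id: string): number => (index.get(id) ?? []).length;
  const ids = sources.map((s) => s.identifier);
  const citedBy: { [identifier: string]: string[] } = {};
  for (const id of ids) {
    citedBy[id] = index.get(id) ?? [];
  }
  return {
    citedBy,
    mostCited: ids.filter((id) => count(id) > 0).sort((x, y) => count(y) - count(x)),
    uncited: ids.filter((id) => count(id) === 0)
  };
}
```

The result is a plain object, so it can be passed through JSON.stringify into a prompt as the workflow does.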

PubMed + Institutional Access

Biomedical research requires precise vocabulary (MeSH terms) and institutional PDF access.

// scripts/biomedical-research-workflow.ts

async function biomedicalResearchWorkflow(meshTerms: string[]) {
  // Step 1: Search PubMed with MeSH terms
  const pubmedResults = await mcp.invoke('pubmed_search', {
    mesh_terms: meshTerms,
    filters: {
      article_type: ['Clinical Trial', 'Meta-Analysis', 'Systematic Review'],
      publication_date: '2020-2025',
      species: 'Humans'
    }
  });

  // Step 2: Fetch full metadata
  const pmids = pubmedResults.map(r => r.pmid);
  const metadata = await mcp.invoke('fetch_pubmed_metadata', { pmids });

  // Step 3: Download PDFs via institutional access
  const authSession = await mcp.invoke('authenticate_institution', {
    institution: 'harvard',
    credentials: process.env.INSTITUTION_CREDS
  });

  await mcp.invoke('download_pubmed_pdfs', {
    pmids,
    auth_session: authSession
  });

  // Step 4: Extract outcome measures from clinical trials
  const outcomes = await Promise.all(
    metadata.map(async (paper) => {
      const analysis = await mcp.invoke('ask-gemini', {
        prompt: `Extract primary and secondary outcome measures from this clinical trial abstract:

                 ${paper.abstract}

                 Return structured data: intervention, control, primary_outcome, secondary_outcomes, results.`,
        model: 'gemini-2.5-pro'
      });
      return { ...paper, outcomes: analysis };
    })
  );

  return outcomes;
}

Why this workflow:

MeSH precision: Standardized vocabulary reduces false positives
Article type filtering: Focus on high-evidence studies
Institutional access: Many medical journals require subscriptions
Structured extraction: AI extracts outcome measures for meta-analysis
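
Model output used for meta-analysis should be validated before it is stored. The sketch below assumes the prompt is adjusted to ask for a JSON object with the five fields named in the workflow; the TrialOutcomes type and parseTrialOutcomes function are hypothetical names. It returns null rather than guessing when the response is malformed.

```typescript
interface TrialOutcomes {
  intervention: string;
  control: string;
  primary_outcome: string;
  secondary_outcomes: string[];
  results: string;
}

// Extract and validate the first JSON object found in a model response.
function parseTrialOutcomes(raw: string): TrialOutcomes | null {
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) {
    return null;
  }
  let data: unknown;
  try {
    data = JSON.parse(raw.slice(start, end + 1));
  } catch (err) {
    return null; // not valid JSON
  }
  if (typeof data !== 'object' || data === null) {
    return null;
  }
  const record = data as { [key: string]: unknown };
  const text = (key: string): string | null => {
    const value = record[key];
    return typeof value === 'string' ? value : null;
  };
  const intervention = text('intervention');
  const control = text('control');
  const primary = text('primary_outcome');
  const results = text('results');
  const secondary = record['secondary_outcomes'];
  if (intervention === null || control === null || primary === null || results === null) {
    return null;
  }
  if (!Array.isArray(secondary) || !secondary.every((s) => typeof s === 'string')) {
    return null;
  }
  return {
    intervention,
    control,
    primary_outcome: primary,
    secondary_outcomes: secondary as string[],
    results
  };
}
```

Papers whose extraction returns null can be queued for manual review instead of entering the dataset with missing fields.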

These examples are starting points. Every research domain has unique databases, citation conventions, and workflows. Use the MCP server pattern to build custom integrations matching your specific needs. Start with one automation, validate the approach, then expand systematically.

Citation Style Handling

Different disciplines require different citation formats. Build one formatter that covers the styles you use most and falls back to a CSL processor for the rest.

Universal Citation Formatter

// utils/citation-formatter.ts - APA Style

const apaStyle = {
  // APA author strings already end with an initial's period, so no extra "." follows them
  book: (c) =>
    `${formatAuthors(c.authors, 'apa')} (${c.year}). *${c.title}*. ${c.publisher}.`,

  article: (c) =>
    `${formatAuthors(c.authors, 'apa')} (${c.year}). ${c.title}. *${c.journal}*, *${c.volume}*(${c.issue}), ${c.pages}.`,

  web: (c) =>
    `${formatAuthors(c.authors, 'apa')} (${c.year}). ${c.title}. ${c.website}. ${c.url}`
};

// formatAuthors('apa') → "Smith, J., & Jones, A."

APA author format: Last name, Initials. Use & before final author.

// utils/citation-formatter.ts - MLA Style

const mlaStyle = {
  book: (c) =>
    `${formatAuthors(c.authors, 'mla')}. *${c.title}*. ${c.publisher}, ${c.year}.`,

  article: (c) =>
    `${formatAuthors(c.authors, 'mla')}. "${c.title}." *${c.journal}*, vol. ${c.volume}, no. ${c.issue}, ${c.year}, pp. ${c.pages}.`,

  web: (c) =>
    `${formatAuthors(c.authors, 'mla')}. "${c.title}." *${c.website}*, ${c.year}, ${c.url}.`
};

// formatAuthors('mla') → "Smith, John, and Anne Jones" (the template adds the closing period)

MLA author format: Full names. Use and before final author.

// utils/citation-formatter.ts - Chicago Style

const chicagoStyle = {
  book: (c) =>
    `${formatAuthors(c.authors, 'chicago')}. *${c.title}*. ${c.place}: ${c.publisher}, ${c.year}.`,

  article: (c) =>
    `${formatAuthors(c.authors, 'chicago')}. "${c.title}." *${c.journal}* ${c.volume}, no. ${c.issue} (${c.year}): ${c.pages}.`,

  web: (c) =>
    `${formatAuthors(c.authors, 'chicago')}. "${c.title}." ${c.website}. Accessed ${c.access_date}. ${c.url}.`
};

// formatAuthors('chicago') → "Smith, John, and Anne Jones" (the template adds the closing period)

Chicago author format: Full names. Includes access date for web sources.

// utils/citation-formatter.ts - Nature Style

const natureStyle = {
  article: (c) =>
    `${formatAuthors(c.authors, 'nature')} ${c.title}. *${c.journal}* **${c.volume}**, ${c.pages} (${c.year}).`
};

// formatAuthors('nature') → "Smith, J. & Jones, A."

Nature author format: Initials after last name. Ampersand separator. Bold volume number.

// utils/citation-formatter.ts - IEEE Style

const ieeeStyle = {
  article: (c) =>
    `${formatAuthors(c.authors, 'ieee')}, "${c.title}," *${c.journal}*, vol. ${c.volume}, no. ${c.issue}, pp. ${c.pages}, ${c.year}.`
};

// formatAuthors('ieee') → "J. Smith and A. Jones"

IEEE author format: Initials first. Use and separator. Comma-heavy format.
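
The style templates all call formatAuthors, which is never defined. Here is a minimal sketch that reproduces the author formats described for each style. It assumes authors arrive as "Given Family" strings and treats the last token as the family name, which is wrong for compound surnames, particles, and suffixes. The helper names are hypothetical, and real style guides have further rules that this sketch ignores.

```typescript
type CitationStyle = 'apa' | 'mla' | 'chicago' | 'nature' | 'ieee';

interface ParsedName {
  given: string[];
  family: string;
}

// Assumes "Given [Middle] Family" order; the last token is the family name.
function parseName(full: string): ParsedName {
  const parts = full.trim().split(/\s+/);
  return { given: parts.slice(0, -1), family: parts[parts.length - 1] };
}

function toInitials(given: string[]): string {
  return given.map((g) => `${g.charAt(0).toUpperCase()}.`).join(' ');
}

// Join names: pairSep between exactly two, commas plus lastSep for three or more.
function joinNames(items: string[], pairSep: string, lastSep: string): string {
  if (items.length <= 2) {
    return items.join(pairSep);
  }
  return items.slice(0, -1).join(', ') + lastSep + items[items.length - 1];
}

function formatAuthors(authors: string[], style: CitationStyle): string {
  const names = authors.map(parseName);
  const familyFirst = (n: ParsedName, short: boolean): string => {
    const rest = short ? toInitials(n.given) : n.given.join(' ');
    return rest ? `${n.family}, ${rest}` : n.family;
  };
  const givenFirst = (n: ParsedName, short: boolean): string => {
    const rest = short ? toInitials(n.given) : n.given.join(' ');
    return rest ? `${rest} ${n.family}` : n.family;
  };

  if (style === 'apa') {
    return joinNames(names.map((n) => familyFirst(n, true)), ', & ', ', & ');
  }
  if (style === 'nature') {
    return joinNames(names.map((n) => familyFirst(n, true)), ' & ', ' & ');
  }
  if (style === 'ieee') {
    return joinNames(names.map((n) => givenFirst(n, true)), ' and ', ', and ');
  }
  // MLA and Chicago invert only the first author. MLA shortens three or more
  // authors to "et al"; the citation template supplies the final period.
  if (style === 'mla' && names.length >= 3) {
    return `${familyFirst(names[0], false)}, et al`;
  }
  const items = names.map((n, i) => (i === 0 ? familyFirst(n, false) : givenFirst(n, false)));
  return joinNames(items, ', and ', ', and ');
}
```

For example, formatAuthors(['John Smith', 'Anne Jones'], 'ieee') yields "J. Smith and A. Jones". The MLA and Chicago strings are returned without a trailing period because their templates append one.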

Complete Implementation

// utils/citation-formatter.ts

const citationStyles = {
  apa: { /* ... */ },
  mla: { /* ... */ },
  chicago: { /* ... */ },
  nature: { /* ... */ },
  ieee: { /* ... */ }
};

async function formatCitation(paper: Paper, style: string): Promise<string> {
  const type = detectCitationType(paper);

  if (citationStyles[style] && citationStyles[style][type]) {
    return citationStyles[style][type](paper);
  }

  // Fallback: Use CSL processor for obscure styles
  return await mcp.invoke('format_csl', {
    citation: paper,
    style: style
  });
}

For less common citation styles (e.g., Bluebook for law, or journal-specific formats), use a CSL (Citation Style Language) processor as a fallback. CSL styles are standardized XML files, and the public style repository contains over 10,000 of them; the citation data passed to the processor is typically expressed as CSL-JSON.
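
A CSL processor expects citation data in CSL-JSON form. The sketch below shows how a paper record might be converted before being handed to a tool like format_csl. The PaperRecord shape and the toCslJson name are assumptions; the output field names (container-title, issued with date-parts, page, DOI) follow the CSL-JSON schema. Names are split on the last space, which is a simplification.

```typescript
// Hypothetical input shape, based on the fields used elsewhere in this chapter.
interface PaperRecord {
  title: string;
  authors: string[]; // assumed "Given Family" strings
  journal: string;
  year: number;
  volume?: string;
  issue?: string;
  pages?: string;
  doi?: string;
}

interface CslName {
  family: string;
  given: string;
}

interface CslItem {
  id: string;
  type: 'article-journal';
  title: string;
  author: CslName[];
  'container-title': string;
  issued: { 'date-parts': number[][] };
  volume?: string;
  issue?: string;
  page?: string;
  DOI?: string;
}

function toCslJson(paper: PaperRecord, id: string): CslItem {
  const author = paper.authors.map((full) => {
    const parts = full.trim().split(/\s+/);
    return {
      family: parts[parts.length - 1],
      given: parts.slice(0, -1).join(' ')
    };
  });
  const item: CslItem = {
    id,
    type: 'article-journal',
    title: paper.title,
    author,
    'container-title': paper.journal,
    issued: { 'date-parts': [[paper.year]] }
  };
  // Copy optional fields only when present so the output stays minimal.
  if (paper.volume) item.volume = paper.volume;
  if (paper.issue) item.issue = paper.issue;
  if (paper.pages) item.page = paper.pages;
  if (paper.doi) item.DOI = paper.doi;
  return item;
}
```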

Personal Knowledge Management Integration

Connect research automation to your existing note-taking and knowledge management systems.

Obsidian Vault Sync

Generate markdown files with bidirectional links and tags for Obsidian's graph view.

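// integrations/obsidian-sync.ts

import { promises as fs } from 'fs';

async function syncToObsidian(papers: Paper[], vaultPath: string) {
  for (const paper of papers) {
    // Frontmatter must begin on the first line of the file, so the template starts with ---.
    // JSON.stringify produces valid YAML for quoted strings and flow-style lists.
    const note = `---
title: ${JSON.stringify(paper.title)}
authors: ${JSON.stringify(paper.authors)}
year: ${paper.year}
tags: [${paper.keywords.map(k => `research/${k.trim().toLowerCase().replace(/\s+/g, '-')}`).join(', ')}]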
doi: ${paper.doi}
---

# ${paper.title}

**Authors**: ${paper.authors.join(', ')}
**Year**: ${paper.year}
**DOI**: [${paper.doi}](https://doi.org/${paper.doi})

## Abstract

${paper.abstract}

## Key Findings

${paper.key_findings.map(f => `- ${f}`).join('\n')}

## Methodology

${paper.methodology_summary}

## My Notes

<!-- Add your notes here -->

## Related Papers

${paper.cited_by.map(c => `- [[${sanitizeFilename(c.title)}|${c.title}]]`).join('\n')}

## PDF

[Local PDF](file://${paper.pdf_path})
`;

    await fs.writeFile(
      `${vaultPath}/Research/${sanitizeFilename(paper.title)}.md`,
      note
    );
  }

  console.log(`Synced ${papers.length} papers to Obsidian vault`);
}

Key features:

YAML frontmatter: Metadata for Obsidian's search and filtering
Wikilinks: [[Paper Title]] creates bidirectional links in graph view
Hierarchical tags: research/machine-learning, research/nlp
File links: Direct links to local PDFs for annotation
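
The sync code relies on sanitizeFilename, which is not shown. Below is a minimal sketch under the assumption that the goal is a file name valid on common filesystems that also survives inside a wikilink. The exact character set and the 120-character cap are choices made for this sketch, not requirements documented in this chapter.

```typescript
// Replace characters that are invalid in file names on common platforms
// (\ / : * ? " < > |) or that interfere with Obsidian links (# ^ [ ]).
function sanitizeFilename(title: string, maxLength: number = 120): string {
  const cleaned = title
    .replace(/[\\\/:*?"<>|#^[\]]/g, ' ')
    .replace(/\s+/g, ' ')
    .trim()
    .slice(0, maxLength)
    .trim();
  // Fall back to a placeholder so the note never ends up with an empty name.
  return cleaned.length > 0 ? cleaned : 'untitled';
}
```

Paper titles often contain colons and question marks, so a wikilink only resolves if it uses the same sanitized name as the file it points to.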

Notion Database Sync

Create structured Notion database entries with rich text blocks.

// integrations/notion-sync.ts

import { Client } from '@notionhq/client';

async function syncToNotion(papers: Paper[], notionToken: string, databaseId: string) {
  const notion = new Client({ auth: notionToken });

  for (const paper of papers) {
    await notion.pages.create({
      parent: { database_id: databaseId },
      properties: {
        Title: { title: [{ text: { content: paper.title } }] },
        Authors: { rich_text: [{ text: { content: paper.authors.join(', ') } }] },
        Year: { number: paper.year },
        DOI: { url: `https://doi.org/${paper.doi}` },
        Keywords: { multi_select: paper.keywords.map(k => ({ name: k })) },
        Status: { select: { name: 'To Read' } }
      },
      children: [
        {
          object: 'block',
          type: 'heading_2',
          heading_2: { rich_text: [{ text: { content: 'Abstract' } }] }
        },
        {
          object: 'block',
          type: 'paragraph',
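          // Notion rejects text objects longer than 2,000 characters, so long abstracts are truncated
          paragraph: { rich_text: [{ text: { content: paper.abstract.slice(0, 2000) } }] }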
        },
        // Additional blocks for methodology, findings, etc.
      ]
    });
  }
}

Key features:

Database properties: Searchable, filterable metadata fields
Multi-select tags: Keywords for filtering and grouping
Status tracking: To Read, Reading, Completed, Archived
Rich text blocks: Formatted content with headings, lists, quotes
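
Notion limits each rich text object to 2,000 characters of content, so long abstracts or methodology summaries need to be split across several objects. A minimal sketch follows; the toRichText name and NotionRichText type are hypothetical, and the split is by character count with no attempt to break on word boundaries.

```typescript
interface NotionRichText {
  type: 'text';
  text: { content: string };
}

const NOTION_TEXT_LIMIT = 2000;

// Split a long string into consecutive rich text objects within the limit.
function toRichText(text: string, limit: number = NOTION_TEXT_LIMIT): NotionRichText[] {
  const chunks: NotionRichText[] = [];
  for (let i = 0; i < text.length; i += limit) {
    chunks.push({ type: 'text', text: { content: text.slice(i, i + limit) } });
  }
  return chunks;
}
```

The returned array can be used directly as the rich_text value of a paragraph block, for example paragraph: { rich_text: toRichText(paper.abstract) }.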

Zotero Library Import

Generate Zotero-compatible BibTeX or RIS files for reference management.

// integrations/zotero-sync.ts

import { promises as fs } from 'fs';

async function syncToZotero(papers: Paper[], zoteroLibrary: string) {
  const bibtexEntries = papers.map(paper => `
@article{${generateCiteKey(paper)},
  title = {${paper.title}},
  author = {${paper.authors.join(' and ')}},
  journal = {${paper.journal}},
  year = {${paper.year}},
  volume = {${paper.volume}},
  number = {${paper.issue}},
  pages = {${paper.pages}},
  doi = {${paper.doi}},
  file = {${paper.pdf_path}}
}
  `).join('\n\n');

  await fs.writeFile(
    `${zoteroLibrary}/auto-imported.bib`,
    bibtexEntries
  );

  console.log(`Generated BibTeX file with ${papers.length} entries`);
}

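function generateCiteKey(paper: Paper): string {
  // Keep only letters and digits so the key stays valid in BibTeX
  const clean = (s: string) => s.toLowerCase().replace(/[^a-z0-9]/g, '');
  // Family name of the first author, assuming "Given Family" order
  const firstAuthor = (paper.authors[0] ?? '').trim().split(/\s+/).pop() ?? '';
  const firstWord = paper.title.trim().split(/\s+/)[0] ?? '';
  return `${clean(firstAuthor)}${paper.year}${clean(firstWord)}`;
}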

Key features:

BibTeX format: Universal citation exchange format
Auto-generated cite keys: smith2023transformer
PDF attachments: Links to local PDFs for Zotero's PDF viewer
Library import: Load the generated .bib file through Zotero's File → Import; Zotero does not pick up new .bib files from a folder on its own
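
The BibTeX template inserts titles and journal names verbatim, so characters such as & or % in the metadata can produce entries that fail to import or to compile in LaTeX. A minimal escaping sketch follows. The escapeBibtex name is hypothetical, and the sketch deliberately escapes braces too, which is only appropriate for raw metadata that contains no intentional LaTeX markup.

```typescript
// LaTeX special characters and their escaped forms.
const BIBTEX_ESCAPES: { [ch: string]: string } = {
  '\\': '\\textbackslash{}',
  '~': '\\textasciitilde{}',
  '^': '\\textasciicircum{}',
  '&': '\\&',
  '%': '\\%',
  '$': '\\$',
  '#': '\\#',
  '_': '\\_',
  '{': '\\{',
  '}': '\\}'
};

// Escape every special character in a single pass so that the braces
// introduced by replacements are not escaped again.
function escapeBibtex(value: string): string {
  return value.replace(/[\\~^&%$#_{}]/g, (ch) => BIBTEX_ESCAPES[ch]);
}
```

In the template this would wrap each field value, for example title = {${escapeBibtex(paper.title)}}.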

Choose your knowledge management system based on workflow preferences:

Obsidian: Local-first, plain text, graph view for concept exploration
Notion: Cloud-based, structured databases, team collaboration
Zotero: Academic-focused, citation management, Word/LaTeX integration

All three can coexist. Export to all formats and use each for different purposes.

Advanced Orchestration Patterns

Combine automation primitives into sophisticated research workflows.

Pattern 1: Incremental Research Updates

Keep research projects current with daily automated updates.

// scripts/incremental-update.ts

async function dailyIncrementalUpdate(projectName: string) {
  // Load project state
  const project = await loadProject(projectName);
  const lastUpdate = project.last_update;

  // Search for new papers since last update
  const newPapers = await mcp.invoke('search_since', {
    queries: project.search_queries,
    date_from: lastUpdate,
    databases: project.databases
  });

  console.log(`Found ${newPapers.length} new papers since ${lastUpdate}`);

  // AI-powered relevance check against existing collection
  const relevant = await mcp.invoke('ask-gemini', {
    prompt: `Which of these new papers are relevant to the existing collection?

             Existing themes: ${JSON.stringify(project.themes)}
             Existing paper titles: ${JSON.stringify(project.papers.map(p => p.title))}

             New papers: ${JSON.stringify(newPapers)}

             Return only highly relevant papers (score >= 8/10).`,
    model: 'gemini-2.5-pro'
  });

  // Auto-download and integrate
  for (const paper of relevant.papers) {
    await downloadAndIntegrate(paper, projectName);
  }

  // Update project state
  project.last_update = new Date();
  project.papers.push(...relevant.papers);
  await saveProject(project);

  // Generate update notification
  return {
    papers_added: relevant.papers.length,
    summary: relevant.summary,
    action_required: relevant.papers.filter(p => p.score >= 9.5).length > 0
  };
}

// Run daily via cron, using a TypeScript runner such as tsx (or compile to JavaScript first)
// 0 9 * * * npx tsx scripts/incremental-update.ts "transformer-nlp-review"

Automation workflow:

Daily cron job: Runs every morning at 9 AM
Incremental search: Only papers published since last update
AI relevance filtering: Claude/Gemini scores new papers against existing themes
Auto-integration: High-scoring papers downloaded and added automatically
Notification: The returned action_required flag is set when critical papers (9.5/10+) are found, so a wrapper script can send an email

Benefits:

Always current: Never miss recent publications
Low noise: AI filters out irrelevant papers
Minimal manual work: Only review high-priority papers
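
Two steps in this pattern are worth doing in plain code rather than leaving to the model: removing papers already in the collection and applying the score thresholds. The sketch below assumes each candidate has a DOI and a numeric relevance score; the ScoredPaper type and the function names are hypothetical, while the 8 and 9.5 thresholds come from the example above.

```typescript
interface ScoredPaper {
  doi: string;
  title: string;
  score: number; // relevance on a 0..10 scale
}

interface Triage {
  toIntegrate: ScoredPaper[]; // new and relevant enough to add
  critical: ScoredPaper[];    // subset that should trigger a notification
  skipped: ScoredPaper[];     // duplicates and low scores
}

// Normalize DOIs so "https://doi.org/10.1/X" and "10.1/x" compare equal.
function normalizeDoi(doi: string): string {
  return doi.trim().toLowerCase().replace(/^https?:\/\/(dx\.)?doi\.org\//, '');
}

function triageNewPapers(
  existingDois: string[],
  candidates: ScoredPaper[],
  minScore: number = 8,
  criticalScore: number = 9.5
): Triage {
  const seen = new Set<string>(existingDois.map(normalizeDoi));
  const result: Triage = { toIntegrate: [], critical: [], skipped: [] };
  for (const paper of candidates) {
    const key = normalizeDoi(paper.doi);
    if (seen.has(key) || paper.score < minScore) {
      result.skipped.push(paper);
      continue;
    }
    seen.add(key); // also guards against duplicates within the same batch
    result.toIntegrate.push(paper);
    if (paper.score >= criticalScore) {
      result.critical.push(paper);
    }
  }
  return result;
}
```

Deduplicating by DOI keeps repeated runs idempotent even when the search window overlaps with the previous update.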

Pattern 2: Multi-Project Citation Network Analysis

Analyze citation relationships across multiple research projects.

// scripts/citation-network.ts

async function buildCitationNetwork(projects: string[]) {
  // Load all papers from multiple projects
  const allPapers = [];
  for (const proj of projects) {
    const papers = await loadProjectPapers(proj);
    allPapers.push(...papers.map(p => ({ ...p, project: proj })));
  }

  // Build citation graph
  const graph = {
    nodes: allPapers.map(p => ({
      id: p.doi,
      title: p.title,
      project: p.project,
      year: p.year
    })),
    edges: []
  };

  // Extract citation relationships
  for (const paper of allPapers) {
    for (const cited of paper.references) {
      const citedPaper = allPapers.find(p => p.doi === cited.doi);
      if (citedPaper) {
        graph.edges.push({
          source: paper.doi,
          target: cited.doi,
          cross_project: paper.project !== citedPaper.project
        });
      }
    }
  }

  // Analyze network
  const analysis = await mcp.invoke('ask-gemini', {
    prompt: `Analyze this citation network across ${projects.length} research projects.

             Identify:
             1. Hub papers (highly cited across projects)
             2. Bridge papers (connecting different projects)
             3. Isolated clusters (papers that don't cite each other)
             4. Cross-project citation patterns

             Graph data: ${JSON.stringify(graph)}`,
    model: 'gemini-2.5-pro'
  });

  // Visualize (export to Gephi format)
  await exportToGephi(graph, 'citation-network.gexf');

  return analysis;
}

Use cases:

Identify foundational papers: Which papers are cited across all projects?
Find research gaps: Isolated clusters suggest under-connected areas
Discover interdisciplinary bridges: Papers connecting disparate fields
Visualize knowledge structure: Gephi/Cytoscape for network visualization
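
Hub and bridge papers can also be computed directly from the graph, which gives exact counts to check the model's analysis against. This is a minimal sketch using the node and edge shape from buildCitationNetwork. It treats a "bridge" simply as a paper cited from another project, which is a simplification of graph-theoretic bridge measures, and the function and type names are hypothetical.

```typescript
interface GraphNode { id: string; title: string; project: string; }
interface GraphEdge { source: string; target: string; cross_project: boolean; }
interface CitationGraph { nodes: GraphNode[]; edges: GraphEdge[]; }

interface NodeStats {
  id: string;
  inDegree: number;              // times cited within the collection
  outDegree: number;             // papers in the collection it cites
  crossProjectCitations: number; // citations arriving from other projects
}

interface GraphSummary {
  hubs: NodeStats[];
  bridges: NodeStats[];
  isolated: string[];
}

function summarizeCitationGraph(graph: CitationGraph): GraphSummary {
  const stats = new Map<string, NodeStats>();
  for (const node of graph.nodes) {
    // A paper that appears in several projects is counted once.
    if (!stats.has(node.id)) {
      stats.set(node.id, { id: node.id, inDegree: 0, outDegree: 0, crossProjectCitations: 0 });
    }
  }
  for (const edge of graph.edges) {
    const source = stats.get(edge.source);
    const target = stats.get(edge.target);
    if (!source || !target) continue; // skip edges to unknown papers
    source.outDegree += 1;
    target.inDegree += 1;
    if (edge.cross_project) target.crossProjectCitations += 1;
  }
  const all: NodeStats[] = [];
  stats.forEach((s) => all.push(s));
  return {
    hubs: all.filter((s) => s.inDegree > 0).sort((a, b) => b.inDegree - a.inDegree),
    bridges: all
      .filter((s) => s.crossProjectCitations > 0)
      .sort((a, b) => b.crossProjectCitations - a.crossProjectCitations),
    isolated: all.filter((s) => s.inDegree === 0 && s.outDegree === 0).map((s) => s.id)
  };
}
```

Sending this summary to the model instead of the full graph also keeps the prompt small when the collection grows.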

Pattern 3: Collaborative Research with Shared Citation Database

Enable team research with shared citation management.

// scripts/collaborative-research.ts

async function setupSharedResearchSpace(teamMembers: string[], projectName: string) {
  // Create shared Supabase/PostgreSQL database
  const db = await initializeSharedDatabase(projectName);

  // Configure MCP server with shared database
  const mcpConfig = {
    mcpServers: {
      'shared-citations': {
        command: 'node',
        args: ['./mcp-servers/shared-citation-server.js'],
        env: {
          DATABASE_URL: db.connectionString,
          PROJECT_ID: projectName,
          TEAM_MEMBERS: teamMembers.join(',')
        }
      }
    }
  };

  // Each team member can now:
  // 1. Add papers to shared collection
  // 2. See who added what (provenance tracking)
  // 3. Annotate papers (personal and shared notes)
  // 4. Generate bibliographies from shared pool

  // Generate invitation links
  const invitations = teamMembers.map(member => ({
    email: member,
    invitation_link: generateInvitationLink(db, member, projectName),
    mcp_config: mcpConfig
  }));

  return invitations;
}

Features:

Shared citation pool: All team members access same database
Provenance tracking: Who added which papers?
Personal + shared notes: Private annotations + team discussions
Conflict resolution: Last-write-wins or merge strategies
Bibliography generation: Export team's collective references

Implementation:

Database: Supabase (PostgreSQL + real-time subscriptions)
MCP server: Custom server connecting to shared database
Authentication: Each team member has unique API key
Sync: Real-time updates via Supabase subscriptions
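
To make the last-write-wins strategy listed under Features concrete, here is a minimal sketch of a per-field merge. It assumes every field of a shared annotation stores its own timestamp and author; the FieldVersion and AnnotationRecord types and the mergeLastWriteWins name are hypothetical, and a real deployment would need to decide how to handle clock skew between team members.

```typescript
interface FieldVersion {
  value: string;
  updated_at: number; // epoch milliseconds
  updated_by: string;
}

type AnnotationRecord = { [field: string]: FieldVersion };

// Merge two versions of a record field by field, keeping the newer write.
// Ties keep the local version so that merging is stable across repeats.
function mergeLastWriteWins(local: AnnotationRecord, remote: AnnotationRecord): AnnotationRecord {
  const merged: AnnotationRecord = {};
  const fields = Object.keys(local).concat(Object.keys(remote));
  for (const field of fields) {
    const hasLocal = Object.prototype.hasOwnProperty.call(local, field);
    const hasRemote = Object.prototype.hasOwnProperty.call(remote, field);
    if (hasLocal && hasRemote) {
      merged[field] =
        remote[field].updated_at > local[field].updated_at ? remote[field] : local[field];
    } else {
      merged[field] = hasLocal ? local[field] : remote[field];
    }
  }
  return merged;
}
```

Merging per field rather than per record means two people editing different fields of the same paper do not overwrite each other.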

Summary

Customization transforms a generic research system into a domain-specific research assistant.

Key principles:

Build custom MCP servers for domain databases (PubMed, arXiv, JSTOR)
Adapt workflows to discipline-specific practices (code artifacts, primary sources, clinical trials)
Support multiple citation styles with universal formatters
Integrate knowledge management tools (Obsidian, Notion, Zotero)
Orchestrate advanced patterns (incremental updates, citation networks, team collaboration)

Next steps:

Identify your domain's unique databases and conventions
Build one custom MCP server as a proof-of-concept
Validate the workflow with a small research project
Expand systematically as needs arise

The system is designed to be extended. Start simple, then iterate based on real research needs.
