
Scholar MCP Server

pvliesdonk/scholar-mcp
21 tools · STDIO, HTTP · registry active
Summary

A FastMCP server that unifies scholarly search across papers, patents, books, and standards through Semantic Scholar, EPO OPS, Open Library, and standards bodies like NIST and IETF. You get full citation graphs with BFS traversal and shortest-path discovery, patent family lookups with NPL resolution, ISBN-based book enrichment, and RFC/ISO identifier resolution. It automatically enriches results with OpenAlex venue data and converts open-access PDFs to Markdown via docling-serve. Reach for this when you need to trace citation paths across domains, generate bibliographies with proper venue metadata, or cross-reference prior art that spans academic papers, patent claims, technical standards, and published books in a single workflow.


Tools

Public tool metadata for what this MCP can expose to an agent.

1 tool

search_papers (8 params)

Search Google Scholar for academic papers, books, theses, and citations. Returns organic results with title, authors, snippet, citation count, and PDF links.

Parameters (* required)

  • q (string): Search query (paper title, author, topic, keywords).
  • hl (string): Language code (default 'en').
  • num (integer): Results per page (1-20, default 10).
  • start (integer): Result offset for pagination (multiples of num).
  • as_sdt (string): Search filter. '0,5' = articles excluding patents (default). '0,33' = case law. '4' = patents only.
  • as_yhi (integer): Latest publication year (inclusive).
  • as_ylo (integer): Earliest publication year (inclusive). E.g. 2020.
  • scisbd (integer, one of 1 · 2): Sort. 1 = include only abstracts; 2 = sort by date (most recent).

scholar-mcp


A FastMCP server for the scholarly citation landscape — papers, patents, books, and standards — giving LLMs a unified way to search, cross-reference, and retrieve prior art across all four source types via Semantic Scholar, EPO Open Patent Services, Open Library, and standards bodies (NIST, IETF, W3C, ETSI), with OpenAlex enrichment and optional docling-serve PDF/full-text conversion.

Documentation | PyPI | Docker

Features

Source domains

  • Papers — full-text search with year/venue/field/citation filters; single-paper lookup by DOI, S2 ID, arXiv ID, ACM ID, or PubMed ID; author profile and name search; forward citations, backward references, BFS graph traversal, shortest-path bridge discovery; recommendations from positive/negative examples; BibTeX/CSL-JSON/RIS citation generation with OpenAlex venue enrichment.
  • Patents — search across 100+ patent offices via EPO OPS with CPC/applicant/inventor/jurisdiction filters; bibliographic, claims, description, family, legal, and citations sections; NPL-to-paper resolution via Semantic Scholar and paper-to-patent citation discovery. EPO credentials are optional — other domains work without them.
  • Books — search by title/author/keywords via Open Library (no API key required); lookup by ISBN-10/13, Open Library work ID, or edition ID; subject-based recommendations sorted by popularity; Google Books excerpts and preview links; WorldCat permalinks for library discovery; cover image caching. Papers with an ISBN in externalIds are automatically enriched with publisher, edition, cover URL, and subject data from Open Library.
  • Standards — identifier resolution, search, and metadata retrieval for NIST, IETF, W3C, and ETSI standards, with optional full-text fetch and Markdown conversion via docling. Tier 2 ISO, IEC, IEEE, Common Criteria (CC), and CEN/CENELEC metadata (including ISO/IEC/IEEE joint standards and the CC ↔ ISO/IEC 15408 cross-link) is synced locally via sync-standards. ISO, IEC, IEEE have a live-fetch fallback for unsynced identifiers; CC and CEN have no live API and require a sync first. Citations matching standards patterns (RFC, ISO, NIST SP, IEEE, EN, CC) are automatically enriched with structured standard_metadata including identifier, title, body, status, and full-text URL when available (see docs/guides/standards.md).

Cross-cutting

  • Enrichment pipeline — phased enrichment from multiple sources: OpenAlex (OA status, affiliations, funders, concepts), CrossRef (publisher, page ranges, container titles), Google Books (preview links, excerpts), and Open Library (book metadata). Runs automatically on paper and book results.
  • PDF conversion — download open-access PDFs and convert to Markdown via docling-serve, with optional VLM enrichment for formulas and figures; automatic fallback to ArXiv, PubMed Central, and Unpaywall when Semantic Scholar has no OA link; direct URL download for PDFs found elsewhere.
  • Intelligent caching — SQLite-backed cache with per-table TTLs (30 days for papers/authors, 7 days for citations/references) and identifier aliasing.
  • Authentication — bearer token, OIDC (OAuth 2.1), or both simultaneously (multi-auth).
  • Multi-transport — stdio (Claude Desktop), HTTP (streamable-http), and SSE transports.
  • Linux packages — .deb and .rpm packages with systemd service and security hardening.
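The caching bullet above mentions per-table TTLs. As a rough illustration only, a TTL check on a SQLite-backed cache could look like the sketch below. The table layout, column names, and helper functions are hypothetical and are not taken from scholar-mcp's actual schema; only the 30-day and 7-day figures come from the text.

```python
import sqlite3
import time

DAY = 86400
# TTLs from the README; the table names are illustrative assumptions.
TTL_SECONDS = {"papers": 30 * DAY, "citations": 7 * DAY}


def open_cache(path=":memory:"):
    """Create one key/value table per TTL class (hypothetical schema)."""
    conn = sqlite3.connect(path)
    for table in TTL_SECONDS:
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} "
            "(key TEXT PRIMARY KEY, value TEXT, stored_at REAL)"
        )
    return conn


def put(conn, table, key, value, now=None):
    """Store a value together with the time it was written."""
    if table not in TTL_SECONDS:
        raise KeyError(table)
    now = time.time() if now is None else now
    conn.execute(f"INSERT OR REPLACE INTO {table} VALUES (?, ?, ?)", (key, value, now))


def get(conn, table, key, now=None):
    """Return the cached value, or None if it is missing or older than the table's TTL."""
    if table not in TTL_SECONDS:
        raise KeyError(table)
    now = time.time() if now is None else now
    row = conn.execute(
        f"SELECT value, stored_at FROM {table} WHERE key = ?", (key,)
    ).fetchone()
    if row is None:
        return None
    value, stored_at = row
    return None if now - stored_at > TTL_SECONDS[table] else value
```

Passing `now` explicitly keeps the expiry logic easy to test without waiting for real time to pass.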

Coverage by domain

Per-domain depth is uneven. Papers currently have the richest tool surface (citation graph, recommendations, cross-referencing to all three other domains); standards are the leanest. That reflects public data availability, not a value hierarchy — writing a paper typically needs all four source types for citations and prior art. Parity work is tracked in GitHub issues and milestones; the roadmap shows intent, not a completeness commitment.

What you can do with it

With this server mounted in an MCP client (Claude, etc.), you can:

  • Survey a field — "Find the 20 most-cited papers on graph neural networks from 2020–2024 and draft a literature review outline." Composes search_papers + get_citations + enrich_paper.
  • Trace a citation path — "What's the shortest citation path from 'Attention is All You Need' to 'RLHF for dialogue agents'?" Uses find_bridge_papers + get_citation_graph.
  • Cross-reference prior art — "For this patent family, list academic papers it cites and any books or standards that show up in the description." Composes get_patent + batch_resolve + standards/book enrichment.
  • Generate a bibliography — "Emit BibTeX for these 30 DOIs with OpenAlex venue data." Uses generate_citations.
  • Look up a standard — "What's the latest status of RFC 9000, and fetch the Markdown full text." Uses resolve_standard_identifier + get_standard.
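To make the "trace a citation path" use case more concrete, here is a minimal sketch of shortest-path discovery with breadth-first search over an in-memory adjacency map. It is not the code behind find_bridge_papers; the function name and the toy graph are assumptions for illustration, and a real run would fetch neighbours from the citation API instead of a dict.

```python
from collections import deque


def shortest_citation_path(graph, start, goal):
    """Return the shortest list of paper IDs linking start to goal, or None.

    graph maps a paper ID to the IDs it is linked to (hypothetical input shape).
    """
    if start == goal:
        return [start]
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt in parents:
                continue
            parents[nxt] = node
            if nxt == goal:
                # Walk the parent links back to the start, then reverse.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None
```

Because BFS explores papers in order of distance from the start, the first time the goal is reached the path is guaranteed to use the fewest hops.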

Installation

From PyPI

pip install pvliesdonk-scholar-mcp

If you add optional extras via the PROJECT-EXTRAS-START / PROJECT-EXTRAS-END sentinels in pyproject.toml, document them below:

Scholar-mcp ships two optional-dependency groups:

  • [mcp] — installs FastMCP; required to run scholar-mcp serve and expose tools over stdio/HTTP.
  • [all] — currently identical to [mcp]; reserved for future optional backends.

For MCP-server usage:

pip install 'pvliesdonk-scholar-mcp[mcp]'
# or, without installing into the environment:
uvx --from pvliesdonk-scholar-mcp scholar-mcp serve

Installing the bare pvliesdonk-scholar-mcp package is enough for library use (from scholar_mcp import ...) but the scholar-mcp serve CLI requires [mcp].

From source

git clone https://github.com/pvliesdonk/scholar-mcp.git
cd scholar-mcp
uv sync --all-extras --all-groups

Docker

docker pull ghcr.io/pvliesdonk/scholar-mcp:latest

A compose.yml ships at the repo root as a starting point — copy .env.example to .env, edit, and docker compose up -d.

To attach a remote Python debugger (development only — the protocol is unauthenticated), see Remote debugging.

Linux packages (.deb / .rpm)

Download .deb or .rpm packages from the GitHub Releases page. Both install a hardened systemd unit; env configuration is sourced from /etc/scholar-mcp/env (copy from the shipped /etc/scholar-mcp/env.example).

Claude Desktop (.mcpb bundle)

Download the .mcpb bundle from the GitHub Releases page and double-click to install, or run:

mcpb install scholar-mcp-<version>.mcpb

Claude Desktop prompts for required env vars via a GUI wizard — no manual JSON editing needed.

For manual Claude Desktop configuration and setup options, see Claude Desktop deployment.

Quick start

scholar-mcp serve                                # stdio transport
scholar-mcp serve --transport http --port 8000   # streamable HTTP

For library usage (embedding the domain logic without the MCP transport), import from the scholar_mcp package directly — backend clients live under src/scholar_mcp/_s2_client.py, _epo_client.py, _openlibrary_client.py, and _standards_client.py.

Server info

The server registers a built-in get_server_info tool (via fastmcp_pvl_core.register_server_info_tool) so operators can confirm the deployed version with a single MCP call. The default response carries server_name, server_version, and core_version. Servers that talk to a remote upstream wire upstream version reporting inside the DOMAIN-UPSTREAM-START / DOMAIN-UPSTREAM-END sentinel in src/scholar_mcp/server.py — see CLAUDE.md for the wiring pattern.

Configuration

Core environment variables shared across all fastmcp-pvl-core-based services:

Variable | Default | Description
FASTMCP_LOG_LEVEL | INFO | Log level for FastMCP internals and app loggers (DEBUG / INFO / WARNING / ERROR). The -v CLI flag overrides to DEBUG.
FASTMCP_ENABLE_RICH_LOGGING | true | Set to false for plain / structured JSON log output.
SCHOLAR_MCP_KV_STORE_URL | file:///data/state | Persistent-state backend URL for pvl-core subsystems: file:///path (survives restarts) or memory:// (dev/ephemeral).

Domain-specific variables go below under Domain configuration.

Authorization (opt-in)

This server inherits opt-in per-subject authorization from fastmcp-pvl-core. The default posture is off — every authenticated caller can use every tool, resource, and prompt. Turn it on by pointing SCHOLAR_MCP_ACL_PATH at a TOML ACL file; the middleware is installed only when the path is set, and individual tools opt in by declaring meta={"required_scope": "<scope>"} at registration. A tool without required_scope is unrestricted regardless of caller.

Wire it in by uncommenting the acl_path field in src/scholar_mcp/config.py and the AuthorizationMiddleware stanza in src/scholar_mcp/server.py — both ship as commented stubs in the scaffold.

ACL TOML schema

[subjects]
"user:alice@example.com" = ["read", "write"]
"user:admin@example.com" = ["*"]              # wildcard — any required scope passes
"service:ci-bot"         = ["read"]
"local"                  = ["*"]              # auth-disabled subject (no bearer / OIDC vars set)
  • Subject strings are opaque. The <kind>:<id> convention is documentation only; the library treats each subject as a literal string.
  • * is the only library-treated special scope — it grants every required scope. Subject-side wildcards (* as an ACL key) are rejected at load time.
  • Scope vocabulary is domain-defined. Per-project or per-folder gating is encoded into the scope string itself (e.g. read:project-foo, write:vault/personal); fastmcp-pvl-core treats every scope except * as opaque.

Subject ↔ bearer-token alignment

The subject string used as a value in the bearer-tokens TOML (SCHOLAR_MCP_BEARER_TOKENS_FILE) is the same string used as a key in the ACL TOML. Same string, opposite roles — keep the two files consistent when adding or removing a principal. See Mapped bearer tokens in the authentication guide for the bearer-tokens TOML schema.

In single-token mode (SCHOLAR_MCP_BEARER_TOKEN), every authenticated caller shares one subject: the library's default (currently "bearer-anon"), which you can override with SCHOLAR_MCP_BEARER_DEFAULT_SUBJECT. Reference that string as the ACL key. When no auth is configured (no SCHOLAR_MCP_BEARER_TOKEN, SCHOLAR_MCP_BEARER_TOKENS_FILE, or OIDC env vars set, which is common in stdio dev rigs but also possible on HTTP), every request resolves to the literal subject "local". Reference that string as the ACL key for unauthenticated local sessions.

Load semantics

The ACL file is loaded once at server startup. Restart the server to pick up changes; live reload is not part of the initial implementation. load_acl fails fast with ConfigurationError on every malformed condition, so a typo in the ACL file aborts startup rather than silently denying requests.

Privacy default

Denied requests are logged at WARNING with the subject string for audit attribution. The wire-side error payload omits the subject by default to limit cross-user information disclosure. For internal-only servers where the subject is safe to surface to clients, construct the middleware with AuthorizationMiddleware(..., expose_subject_in_error=True).

See also

  • fastmcp-pvl-core README — Authorization — full design, the check_authorization per-call helper, and per-token subject mapping.
  • Authorization submodule spec — design rationale and deviations table.

GitHub secrets

CI workflows reference three repository secrets. Configure them via Settings → Secrets and variables → Actions or with gh secret set:

Secret | Used by | How to generate
RELEASE_TOKEN | release.yml, copier-update.yml | Fine-grained PAT at https://github.com/settings/personal-access-tokens/new with contents: write and pull_requests: write (the copier-update cron opens PRs). Scoped to this repo.
CODECOV_TOKEN | ci.yml | https://codecov.io: sign in with GitHub, add the repo, copy the upload token from the repo settings page.
CLAUDE_CODE_OAUTH_TOKEN | claude.yml, claude-code-review.yml | Run claude setup-token locally and paste the result.

gh secret set RELEASE_TOKEN
gh secret set CODECOV_TOKEN
gh secret set CLAUDE_CODE_OAUTH_TOKEN

GITHUB_TOKEN is auto-provided — no action needed.

Local development

The PR gate (matches CI):

uv run pytest -x -q                                  # tests
uv run ruff check --fix . && uv run ruff format .    # lint + format
uv run mypy src/ tests/                              # type-check

Pre-commit runs a subset of the gate on each commit; see .pre-commit-config.yaml for details, or CLAUDE.md for the full Hard PR Acceptance Gates.

Troubleshooting

Moving a scaffolded project

uv sync creates .venv/bin/* scripts with absolute shebangs pointing at the venv Python. If you move the repo after scaffolding (mv /old/path /new/path), uv run pytest fails with ModuleNotFoundError: No module named 'fastmcp' because the stale shebang resolves to an interpreter that does not use the venv's site-packages.

Fix:

rm -rf .venv
uv sync --all-extras --all-groups

uv run python -m pytest also works as a one-shot workaround (bypasses the stale entry-script shim).

uv.lock refresh after copier update

When copier update introduces new dependencies (e.g. a new extra added to pyproject.toml.jinja), CI runs uv sync --frozen, which fails against a stale lockfile. Run uv lock locally and commit the refreshed uv.lock when you accept the copier-update PR.

License

MIT — see LICENSE.

Domain configuration

All settings are controlled via environment variables with the SCHOLAR_MCP_ prefix.

Core

Variable | Default | Description
SCHOLAR_MCP_S2_API_KEY | (none) | Semantic Scholar API key (request one); optional but recommended for higher rate limits
SCHOLAR_MCP_READ_ONLY | true | If true, write-tagged tools (fetch_paper_pdf, convert_pdf_to_markdown, fetch_and_convert, fetch_pdf_by_url, fetch_patent_pdf) are hidden
SCHOLAR_MCP_CACHE_DIR | /data/scholar-mcp | Directory for the SQLite cache database and downloaded PDFs
SCHOLAR_MCP_CONTACT_EMAIL | (none) | Included in the OpenAlex User-Agent for polite pool access (faster rate limits); also enables Unpaywall PDF lookups

PDF Conversion (optional)

Variable | Default | Description
SCHOLAR_MCP_DOCLING_URL | (none) | Base URL of a running docling-serve instance (e.g. http://localhost:5001)
SCHOLAR_MCP_VLM_API_URL | (none) | OpenAI-compatible VLM endpoint for formula/figure-enriched PDF conversion
SCHOLAR_MCP_VLM_API_KEY | (none) | API key for the VLM endpoint
SCHOLAR_MCP_VLM_MODEL | gpt-4o | Model name for VLM-enriched conversion

Patent Search (optional)

Variable | Default | Description
SCHOLAR_MCP_EPO_CONSUMER_KEY | (none) | EPO OPS consumer key (register at developers.epo.org); both key and secret must be set for patent tools to appear
SCHOLAR_MCP_EPO_CONSUMER_SECRET | (none) | EPO OPS consumer secret

Google Books (optional)

Variable | Default | Description
SCHOLAR_MCP_GOOGLE_BOOKS_API_KEY | (none) | Google Books API key for higher rate limits (1000 req/day without key)

Tier 2 standards sync (optional)

Variable | Default | Description
SCHOLAR_GITHUB_TOKEN | (none) | GitHub personal access token for Relaton sync; lifts unauthenticated GitHub rate limit from 60/hr to 5,000/hr (no scopes required for public-repo reads). Useful for repeated --force testing; daily cron is fine unauthenticated.

Authentication (optional)

Variable | Default | Description
SCHOLAR_MCP_BEARER_TOKEN | (none) | Static bearer token for HTTP transport authentication
SCHOLAR_MCP_BASE_URL | (none) | Public base URL, required for OIDC (e.g. https://mcp.example.com)
SCHOLAR_MCP_OIDC_CONFIG_URL | (none) | OIDC discovery endpoint URL
SCHOLAR_MCP_OIDC_CLIENT_ID | (none) | OIDC client ID
SCHOLAR_MCP_OIDC_CLIENT_SECRET | (none) | OIDC client secret
SCHOLAR_MCP_OIDC_JWT_SIGNING_KEY | (none) | JWT signing key; required on Linux/Docker to survive restarts (openssl rand -hex 32)

Key design decisions

  • Library-first, MCP-optional. The core domain logic (S2/EPO/Open Library/standards clients, enrichment pipeline, cache) is importable without FastMCP; the MCP server is a thin async wrapper. Enables reuse in scripts, notebooks, and other servers.
  • Sync domain code, async MCP layer. Backend clients are synchronous; MCP tools call them via asyncio.to_thread(). Simpler client code, explicit offloading at the transport boundary.
  • SQLite cache with per-table TTLs and identifier aliases. Papers / authors last 30 days, citations / references 7 days. DOI ↔ S2 ID ↔ arXiv ID aliasing survives across cache clears so repeated enrichment hits the same row.
  • Read-only by default. Write-tagged tools (PDF download/convert, patent PDF) are hidden unless SCHOLAR_MCP_READ_ONLY=false. Safer default for first-run.
  • Rate-limited try-once with background queueing. S2/OpenAlex calls try once; on 429 they queue a retry-enabled background task and return {"queued": true, "task_id": ...}. PDF tools always queue unless the cache already has the conversion.
  • Tier 2 standards sync out-of-band. ISO/IEC/IEEE/CC/CEN catalogues come from community Relaton dumps via scholar-mcp sync-standards, not live at runtime — avoids paywalled-HTML scraping and keeps tool calls fast.
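The "sync domain code, async MCP layer" decision can be illustrated with the standard-library `asyncio.to_thread()` call named above. The sketch below is a stand-in, not scholar-mcp's code: `fetch_paper_sync` and `get_paper_tool` are hypothetical names, and the blocking client is simulated with `time.sleep`.

```python
import asyncio
import time


def fetch_paper_sync(paper_id):
    """Hypothetical blocking backend call (simulated with a short sleep)."""
    time.sleep(0.05)
    return {"paperId": paper_id, "title": "placeholder"}


async def get_paper_tool(paper_id):
    """Async wrapper: run the blocking call in a worker thread."""
    return await asyncio.to_thread(fetch_paper_sync, paper_id)


async def main():
    # Both lookups overlap because neither blocks the event loop.
    return await asyncio.gather(get_paper_tool("a"), get_paper_tool("b"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Keeping the client synchronous means it can be reused unchanged in scripts and notebooks, while the event loop stays responsive for other tool calls.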

Quick Start details

stdio transport (Claude Desktop / MCP clients)

uvx --from pvliesdonk-scholar-mcp scholar-mcp serve

API key optional but recommended: The server works without a Semantic Scholar API key, but unauthenticated requests are limited to ~1 req/s and will hit 429 throttles quickly during multi-step operations like citation graph traversal. Request a free key to get ~10 req/s.
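If you call the backend clients yourself in library mode, a small client-side throttle is one way to stay under the quoted limits. This is a generic sketch, not part of scholar-mcp (which handles 429 responses with its own queueing); the `Throttle` class is hypothetical, and the rate you pass in is whatever your key allows.

```python
import time


class Throttle:
    """Space calls at least 1/rate_per_sec seconds apart (illustrative helper)."""

    def __init__(self, rate_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_sec
        self.clock = clock
        self.sleep = sleep
        self._next_ok = 0.0  # earliest time the next call may proceed

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if now < self._next_ok:
            self.sleep(self._next_ok - now)
            now = self._next_ok
        self._next_ok = now + self.min_interval
```

Call `throttle.wait()` immediately before each request; with `Throttle(1.0)` requests are spaced roughly one second apart.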

Claude Desktop configuration (claude_desktop_config.json):

{
  "mcpServers": {
    "scholar": {
      "command": "uvx",
      "args": ["--from", "pvliesdonk-scholar-mcp", "scholar-mcp", "serve"],
      "env": {
        "SCHOLAR_MCP_S2_API_KEY": "your-key"
      }
    }
  }
}

HTTP transport

uvx --from pvliesdonk-scholar-mcp scholar-mcp serve --transport http --port 8000

Claude Code plugin

/plugin marketplace add pvliesdonk/claude-plugins
/plugin install scholar-mcp@pvliesdonk

Syncing Tier 2 standards catalogues

Tier 2 bodies (ISO, IEC, IEEE, CC, CEN) are populated from community-curated bulk dumps rather than live-scraped at MCP-server runtime. Run the sync on first install and periodically thereafter:

scholar-mcp sync-standards            # all registered bodies
scholar-mcp sync-standards --body ISO # only ISO
scholar-mcp sync-standards --body IEEE # only IEEE
scholar-mcp sync-standards --body CC   # only Common Criteria
scholar-mcp sync-standards --body CEN # only CEN/CENELEC
scholar-mcp sync-standards --force    # re-sync even if upstream SHA is unchanged

Schedule via cron / launchd / systemd timer — weekly is sufficient; standards change slowly. First sync can take several minutes; subsequent runs that find no upstream changes exit within seconds.

MCP Tools

28 tools, organised by scholarly source type.

Papers

Search & retrieval

Tool | Description
search_papers | Full-text search with year, venue, field-of-study, and citation-count filters. Returns up to 100 results with pagination.
get_paper | Fetch full metadata for a single paper by DOI, S2 ID, arXiv ID, ACM ID, or PubMed ID.
get_author | Fetch author profile with publications, or search by name.

Citation graph

Tool | Description
get_citations | Forward citations (papers that cite a given paper) with optional filters.
get_references | Backward references (papers cited by a given paper).
get_citation_graph | BFS traversal from seed papers, returning nodes + edges up to configurable depth.
find_bridge_papers | Shortest citation path between two papers.
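As a sketch of what a depth-bounded BFS that returns nodes and edges might look like, consider the function below. It is an assumption-laden illustration, not get_citation_graph itself: the `neighbors` callback stands in for whatever call fetches a paper's citations or references, and the return shape is hypothetical.

```python
from collections import deque


def citation_graph(neighbors, seeds, max_depth):
    """Breadth-first expansion from seed papers, stopping at max_depth hops.

    neighbors(paper_id) must return an iterable of linked paper IDs.
    Returns (nodes in discovery order, list of (source, target) edges).
    """
    depth = {seed: 0 for seed in seeds}
    edges = []
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if depth[node] >= max_depth:
            continue  # keep the node, but do not expand beyond the depth limit
        for nxt in neighbors(node):
            edges.append((node, nxt))
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return list(depth), edges
```

The depth limit matters in practice because each expanded node costs at least one backend request, and citation fan-out grows quickly.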

Recommendations & citation generation

Tool | Description
recommend_papers | Paper recommendations from 1–5 positive examples and optional negative examples.
generate_citations | Generate BibTeX, CSL-JSON, or RIS citations for up to 100 papers, with automatic entry type inference and optional OpenAlex venue enrichment.
enrich_paper | Augment Semantic Scholar metadata with OpenAlex fields (affiliations, funders, OA status, concepts).

Patents

Tool | Description
search_patents | Search patents across 100+ patent offices via EPO OPS with CPC / applicant / inventor / jurisdiction / date filters.
get_patent | Fetch bibliographic / claims / description / family / legal / citations sections for a single patent by publication number. Citations include NPL-to-paper resolution via Semantic Scholar.
get_citing_patents | Find patents that cite a given academic paper (best-effort; EPO OPS citation search coverage is incomplete).
fetch_patent_pdf | Download a patent PDF via authenticated EPO OPS and optionally convert to Markdown.

Patent tools are hidden when SCHOLAR_MCP_EPO_CONSUMER_KEY and SCHOLAR_MCP_EPO_CONSUMER_SECRET are not set. fetch_patent_pdf is also write-tagged and hidden when SCHOLAR_MCP_READ_ONLY=true.

Books

Tool | Description
search_books | Search for books by title, author, ISBN, or keywords via Open Library. Returns up to 50 results.
get_book | Fetch book metadata by ISBN-10, ISBN-13, Open Library work ID, or edition ID. Optionally download and cache the cover image locally.
get_book_excerpt | Fetch a book excerpt and description from Google Books by ISBN. Shows preview availability and link.
recommend_books | Recommend books for a subject via Open Library, sorted by popularity.

Papers with an ISBN in their externalIds are automatically enriched with book_metadata (publisher, edition, cover URL, subjects, and more) from Open Library when fetched via get_paper, get_citations, get_references, or get_citation_graph. Book records also include worldcat_url (when ISBN-13 is present), google_books_url, and snippet from Google Books enrichment. Cover images can be downloaded and cached locally via get_book.

Standards

Tool | Description
resolve_standard_identifier | Normalise a messy citation string (e.g. "rfc9000", "nist 800-53") to canonical form and body.
search_standards | Search standards by identifier, title, or free text, optionally filtered to one body (NIST, IETF, W3C, ETSI).
get_standard | Retrieve a standard by canonical or fuzzy identifier, optionally fetching and converting the full text via docling.

Tier-1 bodies (NIST, IETF, W3C, ETSI) are supported with full metadata and optional full-text conversion. Tier-2 bodies (ISO, IEC, IEEE, CC, CEN/CENELEC) are populated locally via scholar-mcp sync-standards.

Cross-source Utility

Tool | Description
batch_resolve | Resolve up to 100 mixed identifiers (paper DOIs, patent numbers, ISBNs) to full metadata in one call, routing each to the right backend with OpenAlex fallback.
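Routing mixed identifiers implies some form of classification step. The sketch below shows one plausible approach using simple pattern matching; the regular expressions and the `classify` function are illustrative assumptions, not the rules batch_resolve actually applies, and they will misclassify edge cases that a real resolver would handle.

```python
import re

# Heuristic patterns (assumptions for illustration only).
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ISBN_RE = re.compile(r"^(?:97[89])?\d{9}[\dXx]$")    # ISBN-10 or ISBN-13, hyphens removed
PATENT_RE = re.compile(r"^[A-Z]{2}\d+[A-Z]?\d?$")    # e.g. country code + number + kind code


def classify(identifier):
    """Guess which backend an identifier belongs to: paper, book, patent, or unknown."""
    text = identifier.strip()
    if DOI_RE.match(text):
        return "paper"
    compact = text.replace("-", "").replace(" ", "")
    if ISBN_RE.match(compact):
        return "book"
    if PATENT_RE.match(compact.upper()):
        return "patent"
    return "unknown"
```

Checking the DOI pattern first avoids stripping hyphens from DOIs, which can legitimately contain them.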

PDF Conversion (requires docling-serve)

Tool | Description
fetch_paper_pdf | Download PDF for a paper (S2 open-access, then ArXiv/PMC/Unpaywall fallback).
convert_pdf_to_markdown | Convert a local PDF to Markdown via docling-serve.
fetch_and_convert | Full pipeline: fetch PDF (with fallback), convert to Markdown, return both.
fetch_pdf_by_url | Download a PDF from any URL and optionally convert to Markdown.

PDF tools are write-tagged and hidden when SCHOLAR_MCP_READ_ONLY=true (the default). fetch_patent_pdf (above) and the get_standard full-text mode cover the patent and standards equivalents.
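The read-only gate can be pictured as a filter over the registered tool names. The following is a simplified sketch rather than the server's implementation: the `visible_tools` helper is hypothetical, and treating any value other than "false" as read-only is an assumption about how the variable is parsed.

```python
import os

# Write-tagged tools as listed in the configuration table.
WRITE_TOOLS = {
    "fetch_paper_pdf",
    "convert_pdf_to_markdown",
    "fetch_and_convert",
    "fetch_pdf_by_url",
    "fetch_patent_pdf",
}


def visible_tools(all_tools, env=None):
    """Hide write-tagged tools unless SCHOLAR_MCP_READ_ONLY is explicitly 'false'."""
    env = os.environ if env is None else env
    read_only = env.get("SCHOLAR_MCP_READ_ONLY", "true").strip().lower() != "false"
    if not read_only:
        return list(all_tools)
    return [name for name in all_tools if name not in WRITE_TOOLS]
```

Defaulting to read-only when the variable is unset matches the documented default of true.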

Task Polling

Tool | Description
get_task_result | Poll for the result of a background task by ID.
list_tasks | List all active background tasks.

Long-running operations (PDF download/conversion) and rate-limited backend requests return {"queued": true, "task_id": "..."} immediately. Use get_task_result to poll for the result.
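On the client side, handling a queued response usually means polling until a real result arrives. The helper below sketches that loop under stated assumptions: `call_tool` is a hypothetical callable standing in for your MCP client, the `task_id` argument name for get_task_result is assumed, and the sketch assumes a still-pending task keeps returning a payload with `"queued": true`.

```python
import time


def wait_for_result(call_tool, first_response, interval=1.0, max_polls=30, sleep=time.sleep):
    """Poll get_task_result until the response is no longer marked as queued."""
    if not (isinstance(first_response, dict) and first_response.get("queued")):
        return first_response  # the tool answered directly, nothing to poll
    task_id = first_response["task_id"]
    for _ in range(max_polls):
        sleep(interval)
        response = call_tool("get_task_result", {"task_id": task_id})
        if not (isinstance(response, dict) and response.get("queued")):
            return response
    raise TimeoutError(f"task {task_id} still queued after {max_polls} polls")
```

Bounding the number of polls keeps a stuck background task from hanging the caller indefinitely.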

Docker Compose

services:
  scholar-mcp:
    image: ghcr.io/pvliesdonk/scholar-mcp:latest
    restart: unless-stopped
    environment:
      SCHOLAR_MCP_S2_API_KEY: "${SCHOLAR_MCP_S2_API_KEY}"
      SCHOLAR_MCP_DOCLING_URL: "http://docling-serve:5001"
      SCHOLAR_MCP_VLM_API_URL: "${VLM_API_URL:-}"
      SCHOLAR_MCP_VLM_API_KEY: "${VLM_API_KEY:-}"
      SCHOLAR_MCP_CACHE_DIR: "/data/scholar-mcp"
      SCHOLAR_MCP_READ_ONLY: "false"
    volumes:
      - scholar-mcp-data:/data/scholar-mcp
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.scholar-mcp.rule=Host(`scholar-mcp.yourdomain.com`)"

  docling-serve:
    image: ghcr.io/ds4sd/docling-serve:latest
    restart: unless-stopped

volumes:
  scholar-mcp-data:

Cache Management

# Show cache statistics (row counts, database size)
scholar-mcp cache stats

# Clear all cached data (preserves identifier aliases)
scholar-mcp cache clear

# Remove entries older than 30 days
scholar-mcp cache clear --older-than 30

# Override cache directory
scholar-mcp cache stats --cache-dir /path/to/cache

Configuration

SCHOLAR_MCP_READ_ONLY (default: true)

Set to 'false' to enable write tools (default: true)

SCHOLAR_MCP_BEARER_TOKEN

Static bearer token for authentication (optional)

SCHOLAR_MCP_BASE_URL

Public base URL of this server, required for OIDC (e.g. https://mcp.example.com)

SCHOLAR_MCP_OIDC_CONFIG_URL

OIDC discovery endpoint URL (e.g. https://auth.example.com/.well-known/openid-configuration)

SCHOLAR_MCP_OIDC_CLIENT_ID

OIDC client ID

SCHOLAR_MCP_OIDC_CLIENT_SECRET

OIDC client secret

SCHOLAR_MCP_OIDC_JWT_SIGNING_KEY

JWT signing key — required on Linux/Docker to survive restarts (generate with: openssl rand -hex 32)

Categories: Search & Web Crawling
Registry: active
Package: pvliesdonk-scholar-mcp
Transport: STDIO, HTTP
Updated: Apr 23, 2026
