methods-mcp

flynnlachendro/methods-mcp
Auth required · STDIO transport · Registry active
Summary

Structured methods extraction for academic papers, designed for the Worldwide AI Science Fellowship build challenge. Exposes eight tools that resolve paper URLs (arXiv, DOI) into canonical metadata, full text with section splits, Pydantic-validated methods objects (steps, reagents, equipment, analyses), and associated code repositories discovered from the paper text, the abstract, or Papers With Code. The assess_repo_reproducibility tool scores GitHub repos without cloning by analyzing README presence, dependency files, notebooks, figure scripts, and maintenance signals through the GitHub REST API. The composite methods_repro_review tool runs the full pipeline in one call and returns a plain-English narrative alongside structured data. Extraction uses Claude (claude-sonnet-4-6 by default) via tool-use validation, with one repair attempt on schema failures. Reach for this when you need on-demand reproducibility verdicts that return in seconds rather than the 30-minute-plus runtimes of execution-based pipelines.


methods-mcp


Lightweight, on-demand MCP server for structured methods extraction + reproducibility heuristics on academic papers. Built for the Worldwide AI Science Fellowship build challenge.

⚠️ Status: alpha (0.1.x). The tool surface and output shapes may shift between minor versions. Pin to an exact version in production. Bug reports very welcome via GitHub Issues.

Quick demo

$ uvx --from methods-mcp methods-mcp --version
methods-mcp 0.1.6

# In a Claude Code session:
> /mcp add methods-mcp methods-mcp
> Run methods_repro_review on https://arxiv.org/abs/2509.06917

  → tool: methods_repro_review({"input_str":"https://arxiv.org/abs/2509.06917"})

# Returns a MethodsReproReview object. Read `narrative` first — it explains
# everything else in plain English, so no tool-learning is required:

{
  "status": "ok",
  "narrative":
    "Resolved the paper: 'Paper2Agent' by Miao et al. (arxiv 2509.06917, "
    "2025-09-08). Extracted 11 methods steps at moderate self-reported "
    "confidence (0.72) — the procedure is clearly described but hyperparameters "
    "and software versions are absent. Detected the associated code repository "
    "https://github.com/jmiao24/Paper2Agent from an inline link in the paper "
    "text (detection confidence 0.94). The repo scored 0.90/1.00 on the "
    "reproducibility heuristic — verdict: likely reproducible. Present signals: "
    "substantive README, dependencies file, notebooks, figure-plotting script, "
    "recent activity, permissive license. Missing: data/fixtures directory. "
    "Suggested entrypoint: `python make_figures.py`.",
  "metadata":          { ... },   # PaperMetadata
  "methods":           { ... },   # MethodsStructured (null if extraction failed)
  "code_repo":         { ... },   # CodeRepo           (null only if input unresolvable)
  "repro_assessment":  { ... },   # ReproAssessment   (null if no repo detected)
  "errors":            []         # [{step, error_type, message, hint}] on partial
}
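A minimal sketch of how a caller might consume the response shown above, reading `status` and `narrative` first and only then touching the sub-objects. The function name `summarize_review` and the sample payload are illustrative assumptions, not part of methods-mcp; only the field names (`status`, `narrative`, `repro_assessment`, `errors`, `step`, `message`, `hint`, `verdict`, `overall_score`) come from this README.

```python
def summarize_review(review: dict) -> str:
    """Condense a methods_repro_review payload into a few triage lines.

    Hypothetical helper: it assumes the payload has already been parsed
    from JSON into a plain dict with the fields documented in this README.
    """
    status = review.get("status", "empty")
    lines = [f"[{status}] {review.get('narrative', '')}"]

    # repro_assessment may be null when no repo was detected.
    repro = review.get("repro_assessment")
    if repro:
        lines.append(f"repro: {repro.get('verdict')} ({repro.get('overall_score')})")
    else:
        lines.append("repro: no assessment available")

    # Each failed sub-step contributes one structured error entry.
    for err in review.get("errors") or []:
        hint = err.get("hint") or "no hint available"
        lines.append(f"error in {err.get('step')}: {err.get('message')} ({hint})")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up payload for illustration only; values are not real tool output.
    sample = {
        "status": "partial",
        "narrative": "Resolved the paper; repo assessment failed.",
        "repro_assessment": None,
        "errors": [
            {
                "step": "assess_repo_reproducibility",
                "error_type": "rate-limit",
                "message": "GitHub rate limit reached",
                "hint": "Set GITHUB_TOKEN",
            }
        ],
    }
    print(summarize_review(sample))
```

Because `narrative` already restates every score in context, a helper like this is mostly useful for logging or for routing on `status`.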

methods-mcp is a small, sharply-scoped Model Context Protocol server. It gives any AI agent (Claude Code, Claude Desktop, your Agent SDK script, etc.) eight tools that turn an academic paper URL into:

  • canonical metadata,
  • best-effort full text + section split,
  • a Pydantic-validated structured methods object (steps / reagents / equipment / analyses),
  • the paper's associated code repository (best-effort discovery),
  • a no-execution-required reproducibility verdict for that repo, and
  • a multi-mode summary.

The wedge: heavyweight pipelines like Paper2Agent (Stanford) take 30 minutes to hours to digest a paper into agent-ready tools. methods-mcp is the agent-callable, on-demand complement — every tool returns in seconds, no clone, no execution.


Install

uv add methods-mcp
# or, install globally:
uv tool install methods-mcp
# or, classic pip:
pip install methods-mcp

API keys

For best performance, set both:

| Variable | Required? | What you get without it |
| --- | --- | --- |
| ANTHROPIC_API_KEY | Required for extract_methods, summarize_paper, methods_repro_review | Those tools raise RuntimeError: ANTHROPIC_API_KEY not set. Non-LLM tools (fetch_paper_text, find_code_repo, assess_repo_reproducibility) still work fine. |
| GITHUB_TOKEN | Optional but recommended for assess_repo_reproducibility / methods_repro_review | You're capped at the GitHub unauthenticated rate limit (60 req/hr per IP). Each repo assessment is ~3 calls, so you'll hit the ceiling after ~15–20 repos/hr. With a token: 5,000 req/hr (effectively unlimited). |

export ANTHROPIC_API_KEY=sk-ant-...
export GITHUB_TOKEN=ghp_...          # optional but recommended

Neither key is logged or persisted — they're sent only to api.anthropic.com and api.github.com respectively. See SECURITY.md.
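As a rough illustration of the table above, a caller could run a preflight check before starting the server to see which tools will be limited. This is a sketch under stated assumptions: `preflight` is a hypothetical helper, not something methods-mcp ships, and the per-hour estimate simply divides the 60 req/hr unauthenticated limit by the ~3 calls per assessment quoted above.

```python
import os

# Tools this README lists as requiring ANTHROPIC_API_KEY.
LLM_TOOLS = ("extract_methods", "summarize_paper", "methods_repro_review")

UNAUTHENTICATED_LIMIT = 60   # GitHub requests per hour without a token
CALLS_PER_ASSESSMENT = 3     # approximate figure quoted in the table above


def preflight(env=None):
    """Return a list of human-readable warnings about missing keys.

    Hypothetical helper; pass a dict to check something other than os.environ.
    """
    env = os.environ if env is None else env
    warnings = []
    if not env.get("ANTHROPIC_API_KEY"):
        warnings.append(
            "ANTHROPIC_API_KEY not set: "
            + ", ".join(LLM_TOOLS)
            + " will raise RuntimeError"
        )
    if not env.get("GITHUB_TOKEN"):
        ceiling = UNAUTHENTICATED_LIMIT // CALLS_PER_ASSESSMENT
        warnings.append(
            f"GITHUB_TOKEN not set: at most about {ceiling} repo assessments per hour"
        )
    return warnings


if __name__ == "__main__":
    for warning in preflight():
        print("warning:", warning)
```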

Use it from Claude Code

/mcp add methods-mcp methods-mcp

Then in any Claude Code chat:

Take https://arxiv.org/abs/2509.06917 and run methods_repro_review. Summarise what the paper does, the methods steps, and how reproducible the repo looks.

Use it from the Claude Agent SDK

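import asyncio

from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient

# Launch methods-mcp as a stdio MCP server and allow only the composite tool.
options = ClaudeAgentOptions(
    mcp_servers={
        "methods-mcp": {
            "type": "stdio",
            "command": "methods-mcp",
            "args": [],
        }
    },
    allowed_tools=["mcp__methods-mcp__methods_repro_review"],
)


async def main() -> None:
    # `async with` is only valid inside a coroutine, so the session lives in main().
    async with ClaudeSDKClient(options=options) as client:
        await client.query(
            "Run methods_repro_review on https://arxiv.org/abs/2509.06917 "
            "and tell me whether the repo looks reproducible."
        )
        # Stream messages until the agent finishes its response.
        async for msg in client.receive_response():
            print(msg)


asyncio.run(main())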

Tools

| Tool | What it does |
| --- | --- |
| health | Server liveness + config check. |
| get_paper_metadata(input_str) | Resolve URL / arXiv ID / DOI to canonical metadata. arXiv inputs hit the arXiv export API for title/authors/abstract. |
| fetch_paper_text(input_str, prefer="auto"\|"html"\|"pdf") | Full text + section split. Defaults to ar5iv HTML for arXiv papers (cheap, structured), PDF fallback otherwise. |
| extract_methods(input_str, model=None) | LLM-driven, Pydantic-validated structured methods extraction. Returns {steps, reagents, equipment, analyses, confidence}. |
| find_code_repo(input_str) | Discover the paper's code repo via paper text → abstract → Papers With Code. |
| assess_repo_reproducibility(repo_url, paper_id=None) | Heuristic, no-clone reproducibility assessment via the GitHub REST API. Weighted signals (README, deps, fixtures, notebooks, figure scripts, recent maintenance, license) → {verdict, score, recommended_entrypoint}. |
| summarize_paper(input_str, mode="tldr"\|"abstract"\|"exec") | LLM summary in three depths. |
| methods_repro_review(input_str) | Composite: metadata + methods + repo + repro in one call. |

All tools return Pydantic v2 models (validated, JSON-serialisable). See src/methods_mcp/schemas.py for the full type surface.

Design notes

  • extract_methods uses Anthropic tool-use to coerce the model into emitting an instance of the MethodsStructured Pydantic schema. On validation failure we send one repair message with the validation error and try again before raising.
  • assess_repo_reproducibility does not clone or execute anything. It scores the repo from publicly-readable GitHub metadata + the recursive tree listing. This is the deliberate wedge against batch tools that try to actually rerun the paper.
  • fetch_paper_text prefers ar5iv HTML over PDF parsing for arXiv papers. Falls back to pypdf for non-arXiv inputs.
  • The default model is claude-sonnet-4-6. Override via METHODS_MCP_MODEL env var or per-call model= arg.
  • methods_repro_review returns a self-describing response. Every call sets a top-level status ("ok" / "partial" / "empty") and a narrative string that summarises everything retrieved in plain English — including every numeric score in context. A reader who reads only narrative + status gets the full picture without needing to learn the sub-object shapes. Sub-objects can be null when unavailable (e.g. repro_assessment: null on a paper with no detected repo — status stays "ok" because "no repo" isn't a failure). Failed sub-steps contribute a structured entry to errors with {step, error_type, message, hint}, where hint is an actionable plain-English suggestion for recognised patterns (missing API keys, rate-limits, 404s, timeouts, etc.) and null otherwise.
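The single-repair behaviour described for extract_methods in the notes above can be sketched as a small control-flow pattern. This is not the project's code: `extract_with_one_repair`, `call_model`, and `validate` are hypothetical names, and a plain `ValueError` stands in for whatever validation error the real schema check raises.

```python
from typing import Any, Callable, Optional


def extract_with_one_repair(
    call_model: Callable[[Optional[str]], Any],
    validate: Callable[[Any], Any],
) -> Any:
    """Validate model output, retrying once with the error as feedback.

    call_model(None) requests a fresh extraction; call_model(error_text)
    requests a repaired one. validate() returns the parsed object or
    raises ValueError. A second failure is allowed to propagate.
    """
    raw = call_model(None)
    try:
        return validate(raw)
    except ValueError as first_error:
        # One repair message carrying the validation error, then give up.
        repaired = call_model(str(first_error))
        return validate(repaired)


if __name__ == "__main__":
    # Toy stand-ins: the first reply lacks "steps", the repaired one has it.
    replies = iter([{"reagents": []}, {"steps": ["mix"], "reagents": []}])

    def fake_model(repair_hint: Optional[str]) -> dict:
        return next(replies)

    def require_steps(payload: dict) -> dict:
        if "steps" not in payload:
            raise ValueError("missing required field: steps")
        return payload

    print(extract_with_one_repair(fake_model, require_steps))
```

Capping the loop at one repair keeps latency and cost bounded, at the price of surfacing an error when the model fails twice.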

Scores & verdicts explained

Tool outputs contain three numeric fields that look similar but mean very different things. They are triage signals for an agent deciding whether a paper is worth digging into, not calibrated claims about correctness.

| Field | Range | How it's computed | How to read it |
| --- | --- | --- | --- |
| methods.confidence | 0–1 | LLM self-report. The extractor model sets it per instructions in the system prompt: ≥0.8 only if the paper gives explicit reagents/volumes/equipment, ~0.3 if the methods section is sparse. Uncalibrated. | Soft signal for "is this a wet-lab paper with concrete procedure, or a sparse systems paper?" Useful as a flag; don't treat as a trust percentage. |
| code_repo.confidence | 0–1 | Varies by detection_method. papers-with-code: fixed 0.95 (authoritative paper→repo API). paper-text: computed as 0.6 + 0.2·(strong-phrase-present) + 0.015·score_margin, capped at 0.95. abstract-link: fixed 0.85. none: 0.0. | Tells you how the repo was found and how decisively. High score + paper-text means a strong phrase like "code is available at …" sat next to the URL. |
| repro_assessment.overall_score | 0–1 | Weighted sum of 8 binary signals, all computed from the GitHub REST API (no clone, no execution): has_readme (0.10), readme_substantial (0.15), has_dependencies_file (0.20), has_data_or_fixtures (0.10), has_notebook (0.10), has_figure_script (0.20), actively_maintained (0.10), permissive_license (0.05). Each present signal contributes its weight. | The only fully-deterministic score of the three. Still a heuristic, not a proof: a high score means the repo looks well-structured for reproduction. For actual validation see Paper2Agent. |
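The two computed scores above can be reproduced with a few lines of arithmetic. The sketch below is an illustration of the formulas as stated in the table, not the server's source: the function names are hypothetical, rounding to two decimals is an assumption made to keep floating-point sums tidy, and `score_margin` is passed through as an opaque number because this README does not define its units.

```python
# Signal weights exactly as listed in the table above (they sum to 1.00).
WEIGHTS = {
    "has_readme": 0.10,
    "readme_substantial": 0.15,
    "has_dependencies_file": 0.20,
    "has_data_or_fixtures": 0.10,
    "has_notebook": 0.10,
    "has_figure_script": 0.20,
    "actively_maintained": 0.10,
    "permissive_license": 0.05,
}


def overall_score(signals: dict) -> float:
    """Sum the weights of every signal that is present (truthy)."""
    total = sum(weight for name, weight in WEIGHTS.items() if signals.get(name))
    return round(total, 2)  # rounding is an assumption, not documented behaviour


def paper_text_confidence(strong_phrase_present: bool, score_margin: float) -> float:
    """0.6 + 0.2*(strong phrase) + 0.015*score_margin, capped at 0.95."""
    bonus = 0.2 if strong_phrase_present else 0.0
    return min(0.95, 0.6 + bonus + 0.015 * score_margin)


if __name__ == "__main__":
    # Every signal present except a data/fixtures directory, as in the
    # Quick demo narrative, which reports a score of 0.90.
    signals = {name: True for name in WEIGHTS}
    signals["has_data_or_fixtures"] = False
    print(overall_score(signals))
```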

Verdict buckets (repro_assessment.verdict) are thresholds on overall_score:

| Verdict | Score | Meaning |
| --- | --- | --- |
| likely-reproducible | ≥ 0.70 | Most repro-friendly signals present. Worth trying to run. |
| partial | ≥ 0.45 | Some infrastructure, likely gaps. Expect to fill in missing pieces. |
| unlikely | ≥ 0.20 | Minimal signal. Possible code dump without the scaffolding to rerun it. |
| insufficient-info | < 0.20 or repo unreachable | Not enough to tell. Don't draw conclusions either way. |
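Read top to bottom, the verdict table is a threshold cascade: the first row whose bound the score meets wins. A minimal sketch of that mapping follows; `verdict_for` and its `reachable` flag are hypothetical names introduced for illustration, while the thresholds and verdict strings are taken from the table.

```python
def verdict_for(score: float, reachable: bool = True) -> str:
    """Map an overall_score in [0, 1] to a verdict bucket.

    `reachable` models the "repo unreachable" case from the table.
    """
    if not reachable or score < 0.20:
        return "insufficient-info"
    if score >= 0.70:
        return "likely-reproducible"
    if score >= 0.45:
        return "partial"
    return "unlikely"


if __name__ == "__main__":
    for score in (0.90, 0.50, 0.25, 0.10):
        print(score, verdict_for(score))
```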

Enum values you'll see in outputs:

  • code_repo.detection_method: paper-text | abstract-link | papers-with-code | metadata | none
  • metadata.source: arxiv | biorxiv | doi | url | unknown

Security & limitations

What this server actually does when you install and run it:

  • Network calls only to: export.arxiv.org, ar5iv.labs.arxiv.org, arxiv.org (PDFs), api.github.com, paperswithcode.com, api.anthropic.com. No telemetry, no analytics, no phone-home.
  • Reads ANTHROPIC_API_KEY (required for LLM tools) and optionally GITHUB_TOKEN from environment variables. These are sent only to Anthropic / GitHub respectively. Never logged, never persisted to disk.
  • Writes nothing to your filesystem. No cache directories, no downloaded PDFs, no temp files.
  • Executes no user-supplied code. No eval, exec, subprocess, pickle.loads, or shell-outs. The reproducibility tool deliberately does not clone or run repositories — it scores from the GitHub REST API only.

Limitations to be aware of:

  • Adversarial papers may produce misleading structured output. The extract_methods tool sends paper text to Claude. A paper containing prompt-injection content could yield wrong (but schema-valid) structured methods. Treat the output as a research aid, not ground truth.
  • The reproducibility verdict is a heuristic, not a proof. A high score means the repo looks well-structured for reproduction; it does not guarantee that running the code reproduces the paper. For full validation see Paper2Agent.
  • Intended for local stdio use. The HTTP/SSE transports are provided for development convenience but should only be exposed on trusted networks (no SSRF protection beyond what httpx provides).

Reporting issues:

Security issues: please email flynnlachendro@hotmail.co.uk (also see SECURITY.md). Functional bugs: open a GitHub issue.

Pair with paper-mcp

For broader paper search / citation graph tooling, run paper-mcp (Bhvaik) alongside in the same Claude Code session. paper-mcp does title-keyed search, full-text fetch, citations, and references; methods-mcp adds the structured-methods + reproducibility layer on top. The two were intentionally designed to compose.

Develop locally

git clone https://github.com/FlynnLachendro/methods-mcp
cd methods-mcp
uv sync --extra dev --extra agent

uv run pytest                      # 49 tests, offline (respx-mocked httpx + unittest.mock for Anthropic)
uv run ruff format .
uv run ruff check . --fix
uv run mypy src

uv run methods-mcp --help

License

MIT — see LICENSE.

Acknowledgements

Built for the Worldwide AI Science Fellowship inaugural cohort. Thanks to Michael Raspuzzi for the open-ended brief.

Built on:

  • FastMCP 3.x — the MCP server scaffold.
  • Claude Agent SDK — the agent loop in the demo.
  • ar5iv.labs.arxiv.org — clean HTML for arXiv papers.
  • Anthropic Claude — the LLM behind structured extraction.

Configuration

ANTHROPIC_API_KEY (required, secret)

Required for LLM-driven extraction tools (extract_methods, summarize_paper, methods_repro_review).

GITHUB_TOKEN (secret)

Optional. Raises the GitHub REST API rate limit used by reproducibility assessment.

METHODS_MCP_MODEL

Override the default Claude model (default: claude-sonnet-4-6).

Registry: active
Package: methods-mcp
Transport: STDIO
Auth: Required
Updated: Apr 15, 2026