MedSci Skills v2.2: Zero-Touch Manuscript Pipeline with /orchestrate --e2e
MedSci Skills started as a set of Claude Code skills for medical research — individual tools for literature search, statistical analysis, manuscript writing, and reporting compliance. Version 2.2 connects them into a single autonomous pipeline. Feed it your dataset, walk away, come back to a submission-ready manuscript.
Here is what changed and why it matters.
The Core Feature: --e2e and --autonomous
Two new flags turn the existing multi-skill workflow into an unattended pipeline:
# Full pipeline: data → analysis → figures → manuscript → QC → DOCX
/orchestrate --e2e --data ./study_data.csv --type diagnostic-accuracy
# Or start from the manuscript phase with autonomous QC
/write-paper --autonomous --project ./my-project/
Without these flags, everything works exactly as before — interactive, with user approval gates between phases. The flags are opt-in. Your existing workflows do not change.
With --e2e, the orchestrator chains every skill in sequence: analyze-stats for your statistics, make-figures for publication-ready plots, write-paper for the IMRAD manuscript, check-reporting for guideline compliance, self-review for quality audit, and finally a DOCX build with embedded figures and formatted tables.
No prompts. No approval gates. One command, one output.
How the Pipeline Flows
Here is the full sequence when you run /orchestrate --e2e:
Raw Data (.csv / .xlsx)
|
v
analyze-stats
- Table 1, statistical tests, effect sizes
- Publication-ready tables (.md + .csv)
|
v
make-figures
- ROC, forest plot, flow diagram, etc.
- 300 DPI, colorblind-safe
- _figure_manifest.md (NEW — see below)
|
v
write-paper (Phases 1-6)
- Outline → Methods → Results → Full IMRAD
- Consumes figure manifest for placement
|
v
Phase 7: Strict QC Chain (NEW)
- AI pattern scan + removal
- check-reporting with auto-fix
- search-lit --verify-only (citation check)
- self-review --json --fix (up to 2 iterations)
|
v
DOCX Build
- Embedded figures + formatted tables
- Ready for journal submission portal
Each arrow represents an automated handoff. Each skill reads the previous skill's output files directly — no copy-pasting between sessions, no manual file management.
Phase 7: The QC Chain That Catches What You Miss
Phase 7 is the most significant addition. Previously, quality checking was a separate manual step. Now it runs automatically as the final phase of write-paper, and it enforces a strict sequence:
Step 1 — AI Pattern Scan. The humanize skill scans the manuscript for 18 known AI writing patterns (hedging phrases, filler transitions, over-qualification). Flagged passages get rewritten in place.
Step 2 — Reporting Compliance. check-reporting audits the manuscript against the appropriate guideline (STARD for diagnostic accuracy, PRISMA for systematic reviews, STROBE for observational studies). Items marked MISSING get auto-fixed — the skill inserts the required text at the correct manuscript location.
Step 3 — Citation Verification. search-lit --verify-only checks every reference against PubMed and Semantic Scholar. If a DOI does not resolve, the citation gets flagged. No hallucinated references survive this step.
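The actual verification step resolves each reference online against PubMed and Semantic Scholar. As a rough illustration of the cheap pre-flight part of such a check, here is a sketch that flags DOIs that are not even syntactically plausible before any network lookup; the `flag_malformed_dois` helper and the reference dict shape are hypothetical, not part of the skill's API.

```python
import re

# Syntactic pre-check only: the real step resolves DOIs against PubMed /
# Semantic Scholar. Pattern: "10.", a 4-9 digit registrant, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def flag_malformed_dois(references: dict[str, str]) -> list[str]:
    """Return reference keys whose DOI is not even syntactically valid."""
    return [key for key, doi in references.items()
            if not DOI_PATTERN.match(doi)]
```

A reference with DOI `10.1000/xyz123` passes the pre-check and proceeds to online resolution; one recorded as `not-a-doi` is flagged immediately.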
Step 4 — Self-Review with Auto-Fix. self-review --json --fix runs a 10-category quality audit and attempts to fix identified issues. If the first pass finds problems, it runs a second iteration. Two passes maximum — if issues persist after two rounds, they get flagged for human review rather than running indefinitely.
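The bounded two-pass behavior of Step 4 can be sketched as a small loop. This is an illustrative model, not the skill's implementation: `audit` and `apply_fixes` stand in for the self-review audit and its fix engine.

```python
# Sketch of the bounded fix loop: audit, attempt fixes, re-audit,
# and stop after two fix passes no matter what remains.
MAX_PASSES = 2

def qc_loop(audit, apply_fixes, manuscript):
    """Return (manuscript, unresolved_issues) after at most MAX_PASSES rounds."""
    issues = audit(manuscript)
    for _ in range(MAX_PASSES):
        if not issues:
            return manuscript, []
        manuscript = apply_fixes(manuscript, issues)
        issues = audit(manuscript)
    # Anything still open after two rounds goes to a human, not a third pass.
    return manuscript, issues
```

The hard cap is the point: a fix loop without a termination bound can oscillate between two "fixes" forever, so anything unresolved after two rounds is escalated instead of retried.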
Structured JSON Outputs
Skills now talk to each other through machine-readable JSON blocks. When check-reporting finishes an audit, it emits:
{
"guideline": "STARD 2015",
"compliance_pct": 83,
"items_present": 25,
"items_missing": 5,
"fixable_by_ai": 4,
"action_items": [
{"item": "7", "label": "Sampling", "status": "MISSING", "fix": "..."}
]
}
The orchestrate skill reads fixable_by_ai to decide whether to attempt auto-repair or halt for human input. Same pattern for self-review: its JSON output includes severity scores and fix suggestions that feed directly into the next iteration.
This matters because it turns quality checking from a report you read into a signal that drives automated decisions.
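A minimal sketch of that decision logic, assuming the JSON block above is the input (the `next_action` policy here is illustrative, not the orchestrator's actual code):

```python
import json

# The compliance report emitted by check-reporting (abbreviated).
report = json.loads("""{
  "guideline": "STARD 2015",
  "compliance_pct": 83,
  "items_present": 25,
  "items_missing": 5,
  "fixable_by_ai": 4
}""")

def next_action(report: dict) -> str:
    """Branch on the machine-readable report (hypothetical policy)."""
    if report["items_missing"] == 0:
        return "proceed"
    if report["fixable_by_ai"] == report["items_missing"]:
        return "auto-fix-all"
    return "auto-fix-then-halt"  # some items need a human
```

With 5 items missing but only 4 fixable by AI, this policy auto-repairs what it can and then halts for human input on the remainder.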
Figure Manifest: _figure_manifest.md
When make-figures generates plots, it now writes a structured manifest file listing every figure with its path, type, tool used, and description:
# Figure Manifest
Generated: 2026-04-14
Study type: diagnostic-accuracy
| Figure | Path | Type | Tool | Description |
|--------|------|------|------|-------------|
| Figure 1 | figures/fig1_stard_flow.svg | flow-diagram | D2 | STARD flow |
| Figure 2 | figures/fig2_roc.pdf | roc-curve | matplotlib | ROC curves |
| Figure 3 | figures/fig3_calibration.pdf | calibration | matplotlib | Calibration plot |
write-paper reads this manifest to place figures correctly in the manuscript body. The DOCX builder uses it to embed images with proper captions. No more manually dragging figures into a Word document and re-typing legends.
33 Reporting Guidelines
check-reporting now covers 33 guidelines, up from 22 in v2.0. The additions fill gaps in systematic review methodology and risk-of-bias assessment:
| Category | New Additions |
|---|---|
| Systematic Reviews | PRISMA-P, SWiM, AMSTAR 2 |
| Risk of Bias | QUADAS-C, ROBINS-E, ROBIS, ROB-ME, RoB NMA |
| Prediction Models | PROBAST+AI |
| Measurement | COSMIN |
Each guideline includes the full item checklist with specific fix suggestions when items are missing. The AI-specific extensions (STARD-AI, TRIPOD+AI, PROBAST+AI, MI-CLEAR-LLM) make this particularly relevant for radiology and medical AI research.
Post-Skill Validation
In --e2e mode, the orchestrator verifies expected outputs after every skill completes. If analyze-stats does not produce a Table 1 file, the pipeline halts immediately rather than letting write-paper proceed with placeholder data.
This is a direct response to a real failure mode: in earlier versions, a skill could fail silently (e.g., a statistical model that did not converge), and downstream skills would fill in the gap with hallucinated content. Post-skill validation eliminates that class of errors.
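The check itself is simple to picture: after each skill finishes, look for the files it was supposed to produce, and raise before any downstream skill runs if they are absent. A sketch under assumed file names (the `EXPECTED_OUTPUTS` map and `validate_outputs` helper are illustrative, not the orchestrator's real tables):

```python
from pathlib import Path

# Hypothetical expected-output map: file patterns each skill must leave behind.
EXPECTED_OUTPUTS = {
    "analyze-stats": ["table1*.md"],
    "make-figures": ["_figure_manifest.md"],
}

def validate_outputs(skill: str, workdir: Path) -> None:
    """Halt the pipeline if a skill finished without its expected files."""
    for pattern in EXPECTED_OUTPUTS.get(skill, []):
        if not list(workdir.glob(pattern)):
            raise RuntimeError(
                f"{skill} produced no file matching {pattern!r}; "
                "halting before downstream skills run on missing data")
```

Failing loudly here is what prevents the silent-failure cascade: a skill that crashed or converged to nothing never gets papered over by the next stage.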
Try It Now
Install or update:
# Install all 22 skills
git clone https://github.com/Aperivue/medsci-skills.git
cp -r medsci-skills/skills/* ~/.claude/skills/
# Run the full pipeline on your data
/orchestrate --e2e --data ./study_data.csv --type diagnostic-accuracy
# Or run autonomous manuscript writing on an existing project
/write-paper --autonomous
If you already have MedSci Skills installed, pull the latest version and re-copy. All new flags default to off — your existing skill calls work exactly as before.
The full skill list, documentation, and three end-to-end demos are on GitHub: github.com/Aperivue/medsci-skills
What This Does Not Do
A few things worth stating explicitly:
- It does not replace clinical judgment. The pipeline generates manuscripts from data you provide. Study design, patient selection, clinical interpretation — those remain your responsibility.
- It does not guarantee acceptance. The QC chain catches reporting gaps and AI writing patterns, but peer reviewers evaluate scientific merit, and no automation substitutes for that.
- It does not upload or submit. The pipeline stops at DOCX. Journal submission portals, cover letters, and author agreements are manual steps by design.
Background
MedSci Skills is an open-source bundle of 22 Claude Code skills for medical research, built by a radiology resident running multiple concurrent studies. The project started with a single PDF download script and grew into a full research pipeline. Previous posts cover the origin story and three end-to-end demos.
v2.2 is available now on GitHub. MIT licensed.
MedSci Skills is developed by Aperivue. For questions or feature requests, open an issue on the GitHub repository.