MedSci Skills v2.2: Zero-Touch Manuscript Pipeline with /orchestrate --e2e
MedSci Skills started as a set of Claude Code skills for medical research — individual tools for literature search, statistical analysis, manuscript writing, and reporting compliance. Version 2.2 connects them into a single autonomous pipeline. Feed it your dataset, walk away, come back to a submission-ready manuscript.
Here is what changed and why it matters.
The Core Feature: --e2e and --autonomous
Two new flags turn the existing multi-skill workflow into an unattended pipeline:
# Full pipeline: data → analysis → figures → manuscript → QC → DOCX
/orchestrate --e2e --data ./study_data.csv --type diagnostic-accuracy
# Or start from the manuscript phase with autonomous QC
/write-paper --autonomous --project ./my-project/
Without these flags, everything works exactly as before — interactive, with user approval gates between phases. The flags are opt-in. Your existing workflows do not change.
With --e2e, the orchestrator chains every skill in sequence: analyze-stats for your statistics, make-figures for publication-ready plots, write-paper for the IMRAD manuscript, check-reporting for guideline compliance, self-review for quality audit, and finally a DOCX build with embedded figures and formatted tables.
No prompts. No approval gates. One command, one output.
How the Pipeline Flows
Here is the full sequence when you run /orchestrate --e2e:
Raw Data (.csv / .xlsx)
|
v
analyze-stats
- Table 1, statistical tests, effect sizes
- Publication-ready tables (.md + .csv)
|
v
make-figures
- ROC, forest plot, flow diagram, etc.
- 300 DPI, colorblind-safe
- _figure_manifest.md (NEW — see below)
|
v
write-paper (Phases 1-6)
- Outline → Methods → Results → Full IMRAD
- Consumes figure manifest for placement
|
v
Phase 7: Strict QC Chain (NEW)
- AI pattern scan + removal
- check-reporting with auto-fix
- search-lit --verify-only (citation check)
- self-review --json --fix (up to 2 iterations)
|
v
DOCX Build
- Embedded figures + formatted tables
- Ready for journal submission portal
Each arrow represents an automated handoff. Each skill reads the previous skill's output files directly — no copy-pasting between sessions, no manual file management.
Phase 7: The QC Chain That Catches What You Miss
Phase 7 is the most significant addition. Previously, quality checking was a separate manual step. Now it runs automatically as the final phase of write-paper, and it enforces a strict sequence:
Step 1 — AI Pattern Scan. The humanize skill scans the manuscript for 18 known AI writing patterns (hedging phrases, filler transitions, over-qualification). Flagged passages get rewritten in place.
Step 2 — Reporting Compliance. check-reporting audits the manuscript against the appropriate guideline (STARD for diagnostic accuracy, PRISMA for systematic reviews, STROBE for observational studies). Items marked MISSING get auto-fixed — the skill inserts the required text at the correct manuscript location.
Step 3 — Citation Verification. search-lit --verify-only checks every reference against PubMed and Semantic Scholar. If a DOI does not resolve, the citation gets flagged. No hallucinated references survive this step.
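The actual verification step resolves each reference online against PubMed and Semantic Scholar. As a rough illustration of the cheap pre-flight part of such a check, here is a sketch that flags DOIs that are not even syntactically plausible before any network lookup; the `flag_malformed_dois` helper and the reference dict shape are hypothetical, not part of the skill's API.

```python
import re

# Syntactic pre-check only: the real step resolves DOIs against PubMed /
# Semantic Scholar. Pattern: "10.", a 4-9 digit registrant, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def flag_malformed_dois(references: dict[str, str]) -> list[str]:
    """Return reference keys whose DOI is not even syntactically valid."""
    return [key for key, doi in references.items()
            if not DOI_PATTERN.match(doi)]
```

A reference with DOI `10.1000/xyz123` passes the pre-check and proceeds to online resolution; one recorded as `not-a-doi` is flagged immediately.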
Step 4 — Self-Review with Auto-Fix. self-review --json --fix runs a 10-category quality audit and attempts to fix identified issues. If the first pass finds problems, it runs a second iteration. Two passes maximum — if issues persist after two rounds, they get flagged for human review rather than running indefinitely.
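The bounded two-pass behavior of Step 4 can be sketched as a small loop. This is an illustrative model, not the skill's implementation: `audit` and `apply_fixes` stand in for the self-review audit and its fix engine.

```python
# Sketch of the bounded fix loop: audit, attempt fixes, re-audit,
# and stop after two fix passes no matter what remains.
MAX_PASSES = 2

def qc_loop(audit, apply_fixes, manuscript):
    """Return (manuscript, unresolved_issues) after at most MAX_PASSES rounds."""
    issues = audit(manuscript)
    for _ in range(MAX_PASSES):
        if not issues:
            return manuscript, []
        manuscript = apply_fixes(manuscript, issues)
        issues = audit(manuscript)
    # Anything still open after two rounds goes to a human, not a third pass.
    return manuscript, issues
```

The hard cap is the point: a fix loop without a termination bound can oscillate between two "fixes" forever, so anything unresolved after two rounds is escalated instead of retried.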
Structured JSON Outputs
Skills now talk to each other through machine-readable JSON blocks. When check-reporting finishes an audit, it emits:
{
"guideline": "STARD 2015",
"compliance_pct": 83,
"items_present": 25,
"items_missing": 5,
"fixable_by_ai": 4,
"action_items": [
{"item": "7", "label": "Sampling", "status": "MISSING", "fix": "..."}
]
}
The orchestrate skill reads fixable_by_ai to decide whether to attempt auto-repair or halt for human input. Same pattern for self-review: its JSON output includes severity scores and fix suggestions that feed directly into the next iteration.
This matters because it turns quality checking from a report you read into a signal that drives automated decisions.
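A minimal sketch of that decision logic, assuming the JSON block above is the input (the `next_action` policy here is illustrative, not the orchestrator's actual code):

```python
import json

# The compliance report emitted by check-reporting (abbreviated).
report = json.loads("""{
  "guideline": "STARD 2015",
  "compliance_pct": 83,
  "items_present": 25,
  "items_missing": 5,
  "fixable_by_ai": 4
}""")

def next_action(report: dict) -> str:
    """Branch on the machine-readable report (hypothetical policy)."""
    if report["items_missing"] == 0:
        return "proceed"
    if report["fixable_by_ai"] == report["items_missing"]:
        return "auto-fix-all"
    return "auto-fix-then-halt"  # some items need a human
```

With 5 items missing but only 4 fixable by AI, this policy auto-repairs what it can and then halts for human input on the remainder.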
Figure Manifest: _figure_manifest.md
When make-figures generates plots, it now writes a structured manifest file listing every figure with its path, type, tool used, and description:
# Figure Manifest
Generated: 2026-04-14
Study type: diagnostic-accuracy
| Figure | Path | Type | Tool | Description |
|--------|------|------|------|-------------|
| Figure 1 | figures/fig1_stard_flow.svg | flow-diagram | D2 | STARD flow |
| Figure 2 | figures/fig2_roc.pdf | roc-curve | matplotlib | ROC curves |
| Figure 3 | figures/fig3_calibration.pdf | calibration | matplotlib | Calibration plot |
write-paper reads this manifest to place figures correctly in the manuscript body. The DOCX builder uses it to embed images with proper captions. No more manually dragging figures into a Word document and re-typing legends.
33 Reporting Guidelines
check-reporting now covers 33 guidelines, up from 22 in v2.0. The additions fill gaps in systematic review methodology and risk-of-bias assessment:
| Category | New Additions |
|---|---|
| Systematic Reviews | PRISMA-P, SWiM, AMSTAR 2 |
| Risk of Bias | QUADAS-C, ROBINS-E, ROBIS, ROB-ME, RoB NMA |
| Prediction Models | PROBAST+AI |
| Measurement | COSMIN |
Each guideline includes the full item checklist with specific fix suggestions when items are missing. The AI-specific extensions (STARD-AI, TRIPOD+AI, PROBAST+AI, MI-CLEAR-LLM) make this particularly relevant for radiology and medical AI research.
Post-Skill Validation
In --e2e mode, the orchestrator verifies expected outputs after every skill completes. If analyze-stats does not produce a Table 1 file, the pipeline halts immediately rather than letting write-paper proceed with placeholder data.
This is a direct response to a real failure mode: in earlier versions, a skill could fail silently (e.g., a statistical model that did not converge), and downstream skills would fill in the gap with hallucinated content. Post-skill validation eliminates that class of errors.
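The check itself is simple to picture: after each skill finishes, look for the files it was supposed to produce, and raise before any downstream skill runs if they are absent. A sketch under assumed file names (the `EXPECTED_OUTPUTS` map and `validate_outputs` helper are illustrative, not the orchestrator's real tables):

```python
from pathlib import Path

# Hypothetical expected-output map: file patterns each skill must leave behind.
EXPECTED_OUTPUTS = {
    "analyze-stats": ["table1*.md"],
    "make-figures": ["_figure_manifest.md"],
}

def validate_outputs(skill: str, workdir: Path) -> None:
    """Halt the pipeline if a skill finished without its expected files."""
    for pattern in EXPECTED_OUTPUTS.get(skill, []):
        if not list(workdir.glob(pattern)):
            raise RuntimeError(
                f"{skill} produced no file matching {pattern!r}; "
                "halting before downstream skills run on missing data")
```

Failing loudly here is what prevents the silent-failure cascade: a skill that crashed or converged to nothing never gets papered over by the next stage.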
Try It Now
Install or update:
# Install all 22 skills
git clone https://github.com/Aperivue/medsci-skills.git
cp -r medsci-skills/skills/* ~/.claude/skills/
# Run the full pipeline on your data
/orchestrate --e2e --data ./study_data.csv --type diagnostic-accuracy
# Or run autonomous manuscript writing on an existing project
/write-paper --autonomous
If you already have MedSci Skills installed, pull the latest version and re-copy. All new flags default to off — your existing skill calls work exactly as before.
The full skill list, documentation, and three end-to-end demos are on GitHub: github.com/Aperivue/medsci-skills
What This Does Not Do
A few things worth stating explicitly:
- It does not replace clinical judgment. The pipeline generates manuscripts from data you provide. Study design, patient selection, clinical interpretation — those remain your responsibility.
- It does not guarantee acceptance. The QC chain catches reporting gaps and AI writing patterns, but peer reviewers evaluate scientific merit, and no automation substitutes for that.
- It does not upload or submit. The pipeline stops at DOCX. Journal submission portals, cover letters, and author agreements are manual steps by design.
Background
MedSci Skills is an open-source bundle of 22 Claude Code skills for medical research, built by a radiology resident running multiple concurrent studies. The project started with a single PDF download script and grew into a full research pipeline. Previous posts cover the origin story and three end-to-end demos.
v2.2 is available now on GitHub. MIT licensed.
MedSci Skills is developed by Aperivue. For questions or feature requests, open an issue on the GitHub repository.