Data Cleaning
/clean-data (NEW)
What it does
Standardizes, validates, and transforms raw research datasets. Handles missing data, outlier detection, and variable recoding.
Highlights
- ✓ Missing data summary & imputation
- ✓ Outlier detection (IQR, Z-score)
- ✓ Codebook generation
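As a rough illustration of the first two checks above, here is a minimal sketch using only Python's standard library. It is not the skill's own implementation, which this page does not show. The function names `missing_summary`, `iqr_outliers`, and `zscore_outliers`, the 1.5 × IQR fence, the |z| > 3 default cutoff, and the sample values are all assumptions made for the example.

```python
import statistics


def missing_summary(rows):
    """Count missing (None) entries per column in a list of dict records."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (value is None)
    return counts


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumed default)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute sample z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        return []  # all values identical, nothing to flag
    return [v for v in values if abs(v - mean) / sd > threshold]


# Hypothetical example data, not taken from any real dataset.
records = [
    {"age": 34, "sbp": None},
    {"age": None, "sbp": 120},
    {"age": 51, "sbp": None},
]
readings = [10, 12, 11, 13, 12, 11, 95]

print(missing_summary(records))          # {'age': 1, 'sbp': 2}
print(iqr_outliers(readings))            # [95]
print(zscore_outliers(readings, 2.0))    # [95]
```

The two rules can disagree on small samples: with seven values the largest possible sample z-score is (n - 1)/√n ≈ 2.27, so a cutoff of 3 would never flag 95 here, while the IQR fence does.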
Install this skill
git clone https://github.com/aperivue/medsci-skills.git
cp -r medsci-skills/skills/clean-data ~/.claude/skills/
Related skills
Identifies analysis unit, cohort logic, data leakage risks, and validation strategy.
Sample Size Calculator
/calc-sample-size
Interactive sample size calculator with decision-tree guided test selection. Covers 11 designs including Cox regression EPV.
De-identification
/deidentify
De-identify clinical research data before LLM-assisted analysis. Standalone Python CLI with 10 country locale packs. No LLM involved.
Variable Operationalization
/define-variables
Literature-grounded variable operationalization for observational research. Turns a data dictionary plus research question into a citation-backed table of exposure / outcome / covariate definitions, cutoffs, and DB-variable mappings. Tier 0 dictionary-first rule prevents ad-hoc phenotype definitions that invite reviewer rejection.