Codebook Generator
/generate-codebook (NEW)
What it does
Generate a citable data dictionary / codebook from a tabular dataset (CSV, Excel, Parquet, Stata, SAS). Profiles every variable — role, type, level frequencies, range, missingness — and flags coded variables with unknown meanings as [NEEDS DICTIONARY] rather than guessing.
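The per-variable profiling described above can be sketched roughly as follows. This is a minimal illustration and not the skill's actual implementation: the function name `profile_column`, the `MISSING` token set, the `max_levels` threshold, and the shape of the returned dict are all assumptions made for the example. Role inference is left out. The point it shows is the flagging rule: a variable whose levels look like numeric codes and has no supplied labels is marked [NEEDS DICTIONARY] instead of being given a guessed meaning.

```python
from collections import Counter

# Assumed missing-value tokens; a real dataset may use others.
MISSING = {"", "NA", "NaN", "null", None}


def profile_column(name, values, dictionary=None, max_levels=10):
    """Profile one variable: missingness, then range or level frequencies.

    `dictionary` is a hypothetical mapping {variable: {code: label}}.
    """
    dictionary = dictionary or {}
    present = [v for v in values if v not in MISSING]
    profile = {
        "name": name,
        "n": len(values),
        "missing": len(values) - len(present),
    }

    # Try to read every non-missing value as a number.
    try:
        numbers = [float(v) for v in present]
    except (TypeError, ValueError):
        numbers = None

    levels = Counter(str(v) for v in present)

    # Many distinct numeric values: treat as a continuous measure.
    if numbers is not None and len(levels) > max_levels:
        profile.update(type="numeric", min=min(numbers), max=max(numbers))
        return profile

    profile["type"] = "categorical"
    profile["levels"] = dict(levels)

    # Few distinct numeric-looking values: probably codes (1/2, 0/1/9, ...).
    coded = numbers is not None and bool(levels)
    labels = dictionary.get(name)
    if labels:
        profile["labels"] = labels
    elif coded:
        # Flag the variable rather than guessing what the codes mean.
        profile["flag"] = "[NEEDS DICTIONARY]"
    return profile
```

For instance, `profile_column("sex", ["1", "2", "1", ""])` reports one missing value, the level counts for "1" and "2", and the [NEEDS DICTIONARY] flag; passing `{"sex": {"1": "male", "2": "female"}}` as the dictionary replaces the flag with the labels.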
Highlights
- ✓ Per-variable profiling → codebook.md + .json
- ✓ Flags unknown codes, never guesses
- ✓ Feeds /define-variables (dictionary-first)
Install this skill
git clone https://github.com/aperivue/medsci-skills.git
cp -r medsci-skills/skills/generate-codebook ~/.claude/skills/
Related skills
Study Design /design-study
Identifies analysis unit, cohort logic, data leakage risks, and validation strategy.
Sample Size Calculator /calc-sample-size
Interactive sample size calculator with decision-tree guided test selection. Covers 11 designs including Cox regression EPV.
Data Cleaning /clean-data
Standardize, validate, and transform raw research datasets. Handles missing data, outlier detection, and variable recoding.
De-identification /deidentify
De-identify clinical research data before LLM-assisted analysis. Standalone Python CLI with 10 country locale packs. No LLM involved.