PLCOm2012 original
Estimates person-level 6-year lung cancer risk in ever-smokers. The model first builds a linear predictor from centered clinical and smoking variables, then converts that value to a probability with the logistic function. The original race/ethnicity and education fields require explicit governance opt-in.
Step 1: code variables
education:
1 = less than high school
2 = high school graduate
3 = post-high-school training
4 = some college
5 = college graduate
6 = postgraduate/professional
COPD, personal_cancer_history, family_history_lung_cancer,
current_smoker: yes = 1, no = 0
quit_years = 0 for current smokers
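The coding rules in Step 1 can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the label strings in EDUCATION_CODES and the helper names code_binary and code_quit_years are hypothetical, and only the numeric codes and the quit_years rule come from the text above.

```python
from typing import Optional

# Education codes 1-6 as listed in Step 1.
# The label strings are hypothetical; real data may use different wording.
EDUCATION_CODES = {
    "less than high school": 1,
    "high school graduate": 2,
    "post-high-school training": 3,
    "some college": 4,
    "college graduate": 5,
    "postgraduate/professional": 6,
}


def code_binary(value: bool) -> int:
    """Code a yes/no field as yes = 1, no = 0."""
    return 1 if value else 0


def code_quit_years(current_smoker: bool, years_since_quit: Optional[float]) -> float:
    """Return quit_years, which is 0 for current smokers."""
    if current_smoker:
        return 0.0
    # Assumption: former smokers must supply a non-negative value.
    if years_since_quit is None or years_since_quit < 0:
        raise ValueError("former smokers need a non-negative years_since_quit")
    return float(years_since_quit)
```

For example, code_quit_years(True, 5) returns 0.0 because the quit time reported for a current smoker is ignored, while code_quit_years(False, 12) returns 12.0.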
Step 2: transform variables
age_centered = age - 62
education_centered = education - 4
BMI_centered = BMI - 27
smoking_intensity_transform = (cigarettes_per_day/10)^-1 - 0.4021541613 (requires cigarettes_per_day > 0)
years_smoked_centered = years_smoked - 27
quit_years_centered = quit_years - 10
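The Step 2 transforms can be written as one small function. The sketch below assumes the inputs are already coded as in Step 1; the function name transform_inputs and the dictionary return type are illustrative choices, not part of the model definition. The guard against non-positive cigarettes_per_day is also an assumption, added because the reciprocal is undefined at zero.

```python
def transform_inputs(age, education, bmi, cigarettes_per_day, years_smoked, quit_years):
    """Apply the Step 2 centering and smoking-intensity transform."""
    if cigarettes_per_day <= 0:
        # The reciprocal transform is undefined at 0 cigarettes per day.
        raise ValueError("cigarettes_per_day must be positive")
    return {
        "age_centered": age - 62,
        "education_centered": education - 4,
        "BMI_centered": bmi - 27,
        "smoking_intensity_transform": (cigarettes_per_day / 10) ** -1 - 0.4021541613,
        "years_smoked_centered": years_smoked - 27,
        "quit_years_centered": quit_years - 10,
    }


# Worked example: 20 cigarettes per day gives (20/10)^-1 = 0.5,
# so the transform is 0.5 - 0.4021541613 = 0.0978458387.
example = transform_inputs(age=62, education=4, bmi=27,
                           cigarettes_per_day=20, years_smoked=27, quit_years=10)
```

Because the intensity transform is a reciprocal, it is not linear in cigarettes per day: 10 per day gives 1 - 0.4021541613, 20 per day gives 0.5 - 0.4021541613, and 40 per day gives 0.25 - 0.4021541613.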
Step 3: calculate linear predictor
lp = -4.532506
+ 0.0778868*(age - 62)
- 0.0812744*(education - 4)
- 0.0274194*(BMI - 27)
+ 0.3553063*COPD
+ 0.4589971*personal_cancer_history
+ 0.5871850*family_history_lung_cancer
+ 0.2597431*current_smoker
- 1.8226060*((cigarettes_per_day/10)^-1 - 0.4021541613)
+ 0.0317321*(years_smoked - 27)
- 0.0308572*(quit_years - 10)
+ race_ethnicity_term
race_ethnicity_term:
White / American Indian / Alaskan Native = 0
Black = 0.3944778
Hispanic = -0.7434744
Asian = -0.4665850
Native Hawaiian / Pacific Islander = 1.0271520
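A minimal sketch of the Step 3 linear predictor, including the race/ethnicity lookup, might look like the following. The coefficients are copied from the formula above; the function name linear_predictor, the parameter names, and the lowercase dictionary keys are hypothetical. The sketch does not model the governance opt-in and simply evaluates the original formula.

```python
# Race/ethnicity terms from Step 3. The keys are hypothetical labels.
RACE_ETHNICITY_TERM = {
    "white": 0.0,
    "american_indian_alaskan_native": 0.0,
    "black": 0.3944778,
    "hispanic": -0.7434744,
    "asian": -0.4665850,
    "native_hawaiian_pacific_islander": 1.0271520,
}


def linear_predictor(age, education, bmi, copd, personal_cancer_history,
                     family_history_lung_cancer, current_smoker,
                     cigarettes_per_day, years_smoked, quit_years,
                     race_ethnicity):
    """Evaluate lp from Step 3. Binary inputs are coded yes = 1, no = 0."""
    if cigarettes_per_day <= 0:
        # Assumption: reject values where the reciprocal is undefined.
        raise ValueError("cigarettes_per_day must be positive")
    return (
        -4.532506
        + 0.0778868 * (age - 62)
        - 0.0812744 * (education - 4)
        - 0.0274194 * (bmi - 27)
        + 0.3553063 * copd
        + 0.4589971 * personal_cancer_history
        + 0.5871850 * family_history_lung_cancer
        + 0.2597431 * current_smoker
        - 1.8226060 * ((cigarettes_per_day / 10) ** -1 - 0.4021541613)
        + 0.0317321 * (years_smoked - 27)
        - 0.0308572 * (quit_years - 10)
        + RACE_ETHNICITY_TERM[race_ethnicity]
    )


# Example: every centered term is zero except smoking intensity (20 per day).
lp_example = linear_predictor(
    age=62, education=4, bmi=27, copd=0, personal_cancer_history=0,
    family_history_lung_cancer=0, current_smoker=0, cigarettes_per_day=20,
    years_smoked=27, quit_years=10, race_ethnicity="white",
)
```

In this example only the intercept and the smoking-intensity term contribute, so lp_example equals -4.532506 - 1.8226060 * (0.5 - 0.4021541613). An unrecognized race_ethnicity key raises KeyError rather than silently defaulting to 0.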
Step 4: convert to 6-year probability
risk = exp(lp)/(1+exp(lp))
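The conversion in Step 4 is the standard logistic function. The sketch below is one way to compute it in Python; the function name logistic is illustrative, and the two-branch form is a common numerical-stability choice (an assumption of this sketch, not something the model requires) that avoids overflow in exp for large positive lp.

```python
import math


def logistic(lp: float) -> float:
    """Return exp(lp) / (1 + exp(lp)) without overflowing for large |lp|."""
    if lp >= 0:
        # Algebraically identical form that only exponentiates a non-positive value.
        return 1.0 / (1.0 + math.exp(-lp))
    e = math.exp(lp)
    return e / (1.0 + e)


# With every centered term at zero and a race/ethnicity term of 0,
# lp equals the intercept, -4.532506.
baseline_risk = logistic(-4.532506)
```

At the intercept alone the result is a 6-year risk of roughly 1.1%. Each additional unit of lp multiplies the odds, risk/(1 - risk), by e.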