Interactive Charts

Part 1 — Visual Results Explorer

Fidelity note: Every number displayed below is copied from Table 1 (simulation wins, §4.1), Table 3 (Gamma/log, §4.2.3), or Table 4 (Tweedie/log, §4.2.4) of the paper. Metric ratios are relative to the GLM2 baseline. Gini values and A/E deviations are absolute (not ratios).

A — Headline Results

52.8%
of simulated scenarios won by savvyGLM estimators (SR, GSR, St, DSh) — without cross-validation.
Table 1, §4.1 — 74/140 first-place wins across 140 independent scenarios
−0.12%
SR Calibration A/E
Table 3, Panel A — Gamma/log
0.958
St RMSEWin,95% ratio
Table 3, Panel C — lowest across all models
42%
Faster convergence (LW)
Table 3, Panel A — NoIt = 0.580
First-place wins by link function — Table 1, §4.1
Blue = logit link  |  Amber = sqrt link  |  Green = log link  |  Number = total wins

B — Model Comparison Studio

Select any two models, family, and panel. All four panels update simultaneously.

Left model
Right model
Family / LF
Panel
Performance Profile RMSE & Deviance ratios, Gininorm
Ratios relative to GLM2 baseline — left model in blue, right in green. Bar extends left of 1.00 = improvement; right = regression.
Convergence Analysis (NoIt ratio) Across Panels A, B, C — Tables 3–4
NoIt ratio relative to GLM2 baseline. Values below 1.000 indicate faster convergence.
Key Metrics
Metric Left Right Δ
Green Δ = right model improves on left. All values from Tables 3–4.
Comparison Summary Generated from Tables 3–4

C — savvyGLM Estimator Deep-Dive

Four Stein-type shrinkage estimators introduced in §2.2. Sparklines show NoIt ratio across Panels A, B, C (Gamma/log, Table 3). Click to expand.

SR Slab Regression +
Defining result
A/E −1 = −0.12% — best portfolio calibration across all estimators.
Table 3, Panel A, Gamma/log, §4.2.3
Convergence profile (NoIt ratio, Panels A–C)
Gamma/log: 0.914  |  0.543  |  0.946 — Table 3
Mechanism (§2.2)
SR imposes a slab constraint on the sum of regression coefficients, introducing a penalty in a single direction. The tuning parameter is computed in closed form from the data — no cross-validation is required.
When to prefer
SR is the practitioner-optimal choice when portfolio calibration is the primary objective. It consistently produces A/E deviations close to zero without the computational overhead of cross-validation-tuned penalties (RR, EN). Particularly effective under the Gamma/log and Tweedie/log specifications standard in P&C severity modelling.
Tweedie result
A/E −1 = +0.07% — near-perfect calibration under Tweedie/log.
Table 4, Panel C, §4.2.4
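The slab mechanism can be pictured with a toy base-R sketch: penalising only the single direction 1 = (1, …, 1) shrinks the coefficient sum while leaving other directions nearly untouched. The fixed `eta` below is a hypothetical placeholder (savvyGLM derives its tuning parameter in closed form from the data), so this illustrates the idea rather than reproducing the paper's estimator.

```r
## Slab-style penalty along the single direction 1 = (1, ..., 1):
##   minimise ||y - X b||^2 + eta * (sum(b))^2
##   => b_sr = (X'X + eta * 11')^{-1} X'y
## 'eta' is a placeholder here; the paper derives it in closed form (§2.2).
set.seed(1)
n <- 200; p <- 4
X <- matrix(rnorm(n * p), n, p)
y <- X %*% c(1, 1, -1, 0.5) + rnorm(n)

ones  <- rep(1, p)
eta   <- 5                                    # hypothetical fixed value
b_ols <- solve(crossprod(X), crossprod(X, y))
b_sr  <- solve(crossprod(X) + eta * tcrossprod(ones), crossprod(X, y))
## By Sherman-Morrison, sum(b_sr) = sum(b_ols) / (1 + eta * 1'(X'X)^{-1} 1),
## so the coefficient sum is always pulled toward zero.
```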
GSR Generalised Slab Regression +
Defining result
25 first-place wins under the logit link (out of 40 scenarios).
These 25 logit-link wins alone account for 33.8% of savvyGLM's 74 total first-place wins in the simulation study.
Table 1, §4.1 — strongest logit-link estimator
Convergence profile (NoIt ratio, Panels A–C)
Gamma/log: 0.869  |  0.620  |  0.781 — Table 3
Mechanism (§2.2)
GSR generalises SR by imposing slab constraints along all principal directions of the design matrix (eigenvectors of Σ = XᵀX). Each direction receives its own closed-form tuning parameter, enabling differential eigenvalue adjustment.
When to prefer
GSR is the recommended choice for logistic regression (logit link) and high-dimensional settings where the concentration ratio p/n is non-negligible. Simulation evidence (§4.1) shows approximately 11.5% and 22% MSE improvement over OLS at p/n = 30% and 90% respectively. For log-link severity models, SR typically provides better calibration.
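The all-directions mechanism can be sketched by rotating an OLS fit into the eigenbasis of Σ = XᵀX and shrinking each coordinate by its own factor. The plug-in factors `c_k` below are a generic MSE-motivated illustration (signal² over signal² plus noise variance), not savvyGLM's closed-form tuning rule from §2.2.

```r
## Per-direction shrinkage along the eigenvectors of Sigma = X'X (illustration).
set.seed(1)
n <- 300; p <- 4
X <- matrix(rnorm(n * p), n, p)
y <- X %*% c(1, -2, 0.5, 0) + rnorm(n)

S      <- crossprod(X)                     # Sigma = X'X
eg     <- eigen(S, symmetric = TRUE)       # columns of eg$vectors = directions
b_ols  <- solve(S, crossprod(X, y))
sigma2 <- sum((y - X %*% b_ols)^2) / (n - p)

z   <- drop(crossprod(eg$vectors, b_ols))  # OLS coordinates in the eigenbasis
v   <- sigma2 / eg$values                  # variance of each coordinate
c_k <- z^2 / (z^2 + v)                     # one shrinkage factor per direction
b_gsr <- drop(eg$vectors %*% (c_k * z))    # rotate back to original basis
```

Directions with small eigenvalues carry the most estimation noise and receive the hardest shrinkage, which is the intuition behind GSR's advantage when p/n is non-negligible.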
St Stein Estimator +
Defining result
RMSEWin,95% = 0.958 — lowest prediction error across all estimators and panels.
MAEWin,95% = 0.938 — also lowest.
Table 3, Panel C (n = 24,944), Gamma/log, §4.2.3
Convergence profile (NoIt ratio, Panels A–C)
Gamma/log: 0.892  |  0.496  |  1.030 — Table 3
Mechanism (§2.2)
St applies a uniform scalar shrinkage to the OLS coefficient vector: β̂St = a · β̂OLS, where a ∈ (0,1] is chosen to minimise the theoretical MSE in closed form. This uniformly inflates all eigenvalues of Σ by a⁻¹.
Tradeoff
St achieves the highest predictive precision (lowest RMSE/MAE) but at the cost of portfolio miscalibration: A/E = 5.66% in Panel A and 7.47% in Panel C (Table 3). Prefer St when minimising individual prediction error is more important than aggregate calibration — for example, in risk-stratification rather than pricing.
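The uniform-shrinkage mechanism can be sketched in base R. The plug-in factor below, ||β̂||² / (||β̂||² + σ̂² tr((XᵀX)⁻¹)), is the textbook MSE-minimising scalar for shrinking an OLS vector toward zero with the unknowns replaced by estimates; savvyGLM's exact closed form is derived inside IRLS (§2.2), so treat this as an illustration of the principle only.

```r
## Uniform scalar shrinkage of an OLS fit: b_st = a * b_ols, a in (0, 1].
set.seed(1)
n <- 200; p <- 5
X <- matrix(rnorm(n * p), n, p)
y <- X %*% c(2, -1, 0.5, 0, 0) + rnorm(n)

XtX_inv <- solve(crossprod(X))
b_ols   <- XtX_inv %*% crossprod(X, y)
sigma2  <- sum((y - X %*% b_ols)^2) / (n - p)     # residual variance

## Plug-in a = ||b||^2 / (||b||^2 + sigma^2 * tr((X'X)^{-1})), always in (0, 1)
a_hat <- sum(b_ols^2) / (sum(b_ols^2) + sigma2 * sum(diag(XtX_inv)))
b_st  <- a_hat * b_ols                            # every coefficient shrunk equally
```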
DSh Diagonal Shrinkage +
Defining result
Deviance ratio = 0.997 — best deviance across all estimators.
A/E = +0.77% — strong calibration with improved deviance.
Table 4, Panel C, Tweedie/log, §4.2.4
Convergence profile (NoIt ratio, Panels A–C)
Gamma/log: 0.876  |  0.584  |  0.831 — Table 3
Mechanism (§2.2)
DSh applies component-wise (diagonal) shrinkage: each coefficient is independently scaled by its own factor bk ∈ (0,1]. Unlike St, DSh does not preserve eigenvectors of Σ, allowing differentiated treatment of each predictor dimension.
When to prefer
DSh offers a balanced profile: competitive calibration, improved deviance, and faster convergence. Simulation evidence (Table 1) shows limited first-place wins (4/140), suggesting niche advantage. Most effective under the Tweedie/log specification (Table 4), where it achieves the best deviance in Panel C.
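The component-wise mechanism can be sketched the same way: each coefficient gets its own plug-in factor built from its estimated variance, so noisier coefficients are shrunk harder. As before, the factors below illustrate the principle and are not savvyGLM's closed-form rule.

```r
## Diagonal (component-wise) shrinkage: b_dsh[k] = b_fac[k] * b_ols[k].
set.seed(1)
n <- 200; p <- 4
X <- matrix(rnorm(n * p), n, p)
y <- X %*% c(1.5, -0.5, 0, 0.2) + rnorm(n)

XtX_inv <- solve(crossprod(X))
b_ols   <- drop(XtX_inv %*% crossprod(X, y))
sigma2  <- sum((y - X %*% b_ols)^2) / (n - p)
v_k     <- sigma2 * diag(XtX_inv)        # Var(b_ols[k]) estimate per coefficient

b_fac <- b_ols^2 / (b_ols^2 + v_k)       # per-coefficient factor in (0, 1)
b_dsh <- b_fac * b_ols                   # imprecise coefficients shrink hardest
```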

D — Results Navigator

Ten key findings in order of presentation. Click any item to jump to the corresponding replication code block.

1
§2.2 — Methodology
Stein-type estimators (SR, GSR, St, DSh) derive tuning parameters in closed form — no cross-validation required. See setup ↓
2
Table 1, §4.1 — Simulation Summary
savvyGLM achieves first-place accuracy in 74/140 scenarios (52.8%) across all distributions and link functions. See results ↓
3
Table 1, §4.1 — Logit link
GSR dominates the logit link with 25/40 first-place wins. Performance is link-function dependent: RR leads under log link (29/40). See results ↓
4
Table 1, §4.1 — Sqrt link
St achieves 15/20 first-place wins under the Poisson/sqrt link — the strongest single-estimator performance in any link-function subset. See code ↓
5
Table 3, Panel A, §4.2.3 — Calibration
SR achieves the best portfolio calibration: A/E −1 = −0.12% (Panel A, Gamma/log). GLM2 baseline: 1.35%. See results ↓
6
Table 3, Panel A, §4.2.3 — Convergence
LW converges fastest: NoIt ratio = 0.580 (42% fewer iterations than GLM2, Panel A, Gamma/log). See results ↓
7
Table 3, Panels A & C, §4.2.3 — Risk discrimination
RR achieves the highest Gini indices: Giniraw = 0.119, Gininorm = 0.178 (Panel A). Highest in Panel C too (0.115 / 0.171). See results ↓
8
Table 3, Panel C, §4.2.3 — Predictive precision
St achieves the lowest prediction error across all panels: RMSEWin,95% = 0.958, MAEWin,95% = 0.938 (Panel C, Gamma/log). See results ↓
9
Table 4, Panel C, §4.2.4 — Tweedie calibration
Under Tweedie/log, SR achieves near-perfect calibration: A/E −1 = +0.07% (Panel C), improving on its 1.15% result under Gamma. See results ↓
10
Table 4, §4.2.4 — Tweedie power
Optimal Tweedie variance power: ρ ≈ 1.700 consistently across all models and panels, suggesting data strongly favour the Gamma distribution structure. See results ↓
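Items 5 and 9 above quote the calibration statistic A/E − 1 in percent: total actual claims over total model-expected claims, minus one. The replication code additionally weights both sums by ClaimNb; a minimal unweighted sketch with made-up numbers:

```r
## A/E - 1 (%) on a toy portfolio (simplified, unweighted illustration;
## the replication code weights both sums by ClaimNb).
actual   <- c(1200, 800, 950)   # observed claim amounts
expected <- c(1150, 830, 990)   # model predictions
ae_dev <- (sum(actual) / sum(expected) - 1) * 100
round(ae_dev, 2)                # negative => model over-predicts in aggregate
```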

Part 2 — Replication Code

Academic supplement to §4.2 of the paper. All replication code is derived from the analysis scripts in the savvyGLM GitHub repository. Data are sourced from the CASdatasets R package. The Replication Guide provides a documented walkthrough of the pipeline; the Original Paper Code section reproduces the verbatim scripts from the repository.
BLOCK 0 Environment setup — §4, freMTPL2 dataset ↑ Back to Part 1
## ── Block 0: Environment Setup ─────────────────────────────────────────────
## Paper: "GLM Solutions via Shrinkage" — Asimit, Avramescu, Chen, Rivas, Senatore
## Analysis: §4.2, Real Data Application — freMTPL2 Average Severity per Policy
## ─────────────────────────────────────────────────────────────────────────────

## CRAN dependencies
install.packages(c("glm2", "glmnet", "splines", "MASS", "parallel"))

## CASdatasets — source of the freMTPL2 benchmark dataset
## Dutang, C. and Charpentier, A. (2024). CASdatasets R package.
## URL: http://cas.uqam.ca/pub/R/
install.packages("CASdatasets", repos = "http://cas.uqam.ca/pub/R/")
library(CASdatasets)

## savvyGLM — Stein-type shrinkage IRLS framework
## Repository: https://github.com/Ziwei-ChenChen/savvyGLM
if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
remotes::install_github("Ziwei-ChenChen/savvyGLM")
library(savvyGLM)

## Load source data (§4.2.1)
data(freMTPL2freq)   # policy characteristics and claim counts
data(freMTPL2sev)    # individual claim amounts

## Record package versions for reproducibility
packageVersion("savvyGLM")
packageVersion("glm2")
packageVersion("glmnet")
BLOCK 1 Data preparation — §4.2.1, Table 2
## ── Block 1: Data Preparation ───────────────────────────────────────────────
## Paper: §4.2.1 Dataset and Preprocessing; Table 2 (Variable Dictionary)
## Analysis: Gamma GLM / log LF on average severity per policy (Table 3)
## ─────────────────────────────────────────────────────────────────────────────
library(glm2); library(glmnet); library(splines)
library(MASS); library(parallel); library(savvyGLM)
set.seed(123)

## Join frequency and severity tables via policy identifier IDpol (§4.2.1)
sev_agg  <- aggregate(ClaimAmount ~ IDpol, data = freMTPL2sev, FUN = sum)
data_all <- merge(freMTPL2freq, sev_agg, by = "IDpol")

## Panel definition (§4.2.1, Table 3 caption):
##   Panel A — Exactly 1 claim  (n = 23,571)  [THIS BLOCK]
##   Panel B — Exactly 2 claims (n = 1,298)   [change ClaimNb == 2]
##   Panel C — All claims >= 1  (n = 24,944)  [change ClaimNb >= 1]
data_sev <- data_all[data_all$ClaimNb == 1 & data_all$ClaimAmount > 0, ]
data_sev$Severity <- data_sev$ClaimAmount / data_sev$ClaimNb
## Expected output (paper, §4.2.1): nrow(data_sev) == 23,571

## Feature engineering (§4.2.1, Table 2) ─────────────────────────────────────
## Continuous transformations (Table 2):
data_sev$VehPower_Log   <- log(pmax(pmin(data_sev$VehPower, 9), 1))   # bounded [1,9]
data_sev$Density_Log    <- log(pmax(data_sev$Density, 1))             # lower bound 1
data_sev$BonusMalus_Log <- log(pmin(data_sev$BonusMalus, 150))        # capped at 150
BLOCK 2 Cross-validation framework — §4.2.2
## ── Block 2: Cross-Validation Framework ─────────────────────────────────────
## Paper: §4.2.2 Performance Measures and Evaluation Procedure
## Replication target: Table 3 (100 repeats x 5-fold CV = 500 evaluations)
## ─────────────────────────────────────────────────────────────────────────────
n_repeats <- 100                                                  # §4.2.2
n_folds   <- 5
n_iter    <- n_repeats * n_folds                                  # 500 replications
control_list <- list(maxit = 250, epsilon = 1e-6, trace = FALSE)  # §4.2.2
family_type  <- Gamma(link = "log")       # Table 3: Gamma distribution / log LF

## Winsorised error metrics (§4.2.2, equations for RMSE_W and MAE_W)
rmse_w <- function(y, yhat, cap) sqrt(mean((pmin(y, cap) - pmin(yhat, cap))^2))
mae_w  <- function(y, yhat, cap) mean(abs(pmin(y, cap) - pmin(yhat, cap)))

## Exposure-weighted Gini index (§4.2.2)
w_gini <- function(y, w, yhat) {
  d  <- data.frame(y = y, w = w, yhat = yhat)[order(-yhat), ]
  cw <- cumsum(d$w) / sum(d$w)
  cl <- cumsum(d$y * d$w) / sum(d$y * d$w)
  n  <- nrow(d)
  sum(cl[-n] * cw[-1] - cl[-1] * cw[-n])
}
norm_gini <- function(y, w, yhat) w_gini(y, w, yhat) / w_gini(y, w, y)

## Single CV iteration function
run_iter <- function(i) {
  repeat_idx <- floor((i - 1) / n_folds) + 1
  fold_idx   <- ((i - 1) %% n_folds) + 1
  set.seed(123 + repeat_idx)
  folds    <- sample(rep(1:n_folds, length.out = nrow(data_sev)))
  test_idx <- which(folds == fold_idx)
  tr <- data_sev[-test_idx, ]
  te <- data_sev[ test_idx, ]

  ## Natural cubic splines, df = 3 per variable (§4.2.1, Table 2)
  sp_d <- ns(tr$DrivAge, df = 3); sp_v <- ns(tr$VehAge, df = 3)
  for (j in 1:3) {
    tr[[paste0("DrivAge_Sp", j)]] <- sp_d[, j]
    tr[[paste0("VehAge_Sp",  j)]] <- sp_v[, j]
    te[[paste0("DrivAge_Sp", j)]] <- predict(sp_d, te$DrivAge)[, j]
    te[[paste0("VehAge_Sp",  j)]] <- predict(sp_v, te$VehAge)[, j]
  }

  ## LOO target encoding — k = 20, weighted by ClaimNb (§4.2.1)
  gm <- weighted.mean(tr$Severity, tr$ClaimNb)
  k  <- 20
  for (col in c("Region", "Area", "VehBrand")) {
    ga  <- tapply(tr$ClaimAmount, tr[[col]], sum)
    gc  <- tapply(tr$ClaimNb,     tr[[col]], sum)
    gi  <- as.character(tr[[col]])
    lam <- (gc[gi] - tr$ClaimNb) / (gc[gi] - tr$ClaimNb + k)
    mn  <- ifelse((gc[gi] - tr$ClaimNb) > 0,
                  (ga[gi] - tr$ClaimAmount) / (gc[gi] - tr$ClaimNb), gm)
    tr[[paste0("TE_", col)]] <- lam * mn + (1 - lam) * gm
    ti  <- as.character(te[[col]])
    enc <- (gc / (gc + k)) * (ga / gc) + (k / (gc + k)) * gm
    te[[paste0("TE_", col)]] <- ifelse(ti %in% names(enc), enc[ti], gm)
  }

  ## Standardise on training statistics (§4.2.1)
  sc <- c("BonusMalus_Log", "VehPower_Log", "Density_Log",
          paste0("DrivAge_Sp", 1:3), paste0("VehAge_Sp", 1:3),
          "TE_Region", "TE_Area", "TE_VehBrand")
  mu <- colMeans(tr[, sc]); sg <- apply(tr[, sc], 2, sd); sg[sg == 0] <- 1
  for (v in sc) {
    tr[[v]] <- (tr[[v]] - mu[v]) / sg[v]
    te[[v]] <- (te[[v]] - mu[v]) / sg[v]
  }

  ## Model matrix: 14 columns (intercept + 12 standardised + VehGas; §4.2.1)
  frm <- Severity ~ BonusMalus_Log + VehPower_Log + Density_Log +
    DrivAge_Sp1 + DrivAge_Sp2 + DrivAge_Sp3 +
    VehAge_Sp1 + VehAge_Sp2 + VehAge_Sp3 +
    VehGas + TE_Region + TE_Area + TE_VehBrand
  Xtr <- model.matrix(frm, tr); Xte <- model.matrix(frm, te)
  ytr <- tr$Severity; yte <- te$Severity
  wtr <- tr$ClaimNb;  wte <- te$ClaimNb
  c95 <- quantile(ytr, 0.95); c99 <- quantile(ytr, 0.99)

  ev <- function(coef_vec) {
    if (all(is.na(coef_vec))) return(rep(NA_real_, 8))
    pr <- pmax(exp(as.numeric(Xte %*% ifelse(is.na(coef_vec), 0, coef_vec))), 1e-6)
    c(rmse_w(yte, pr, c95), rmse_w(yte, pr, c99),
      mae_w(yte, pr, c95),  mae_w(yte, pr, c99),
      w_gini(yte, wte, pr), norm_gini(yte, wte, pr),
      sum(wte * Gamma()$dev.resids(yte, pr, 1)),
      sum(yte * wte) / sum(pr * wte))
  }

  fs <- function(cls) {
    t0 <- proc.time()
    m  <- tryCatch(savvy_glm.fit2(x = Xtr, y = ytr, weights = wtr,
                                  family = family_type, model_class = cls,
                                  control = control_list),
                   error = function(e) NULL)
    ct <- (proc.time() - t0)[3]
    if (is.null(m)) list(b = rep(NA, ncol(Xtr)), it = NA, ct = ct)
    else list(b = m$coefficients, it = m$iter, ct = ct)
  }

  t0 <- proc.time()
  m1 <- tryCatch(glm.fit2(Xtr, ytr, weights = wtr, family = family_type,
                          control = control_list), error = function(e) NULL)
  ct1 <- (proc.time() - t0)[3]
  t0 <- proc.time()
  m2 <- tryCatch(cv.glmnet(Xtr[, -1], ytr, weights = wtr,
                           family = Gamma(link = "log"), alpha = 0, nfolds = 5),
                 error = function(e) NULL)
  ct2 <- (proc.time() - t0)[3]
  t0 <- proc.time()
  m3 <- tryCatch(cv.glmnet(Xtr[, -1], ytr, weights = wtr,
                           family = Gamma(link = "log"), alpha = 0.5, nfolds = 5),
                 error = function(e) NULL)
  ct3 <- (proc.time() - t0)[3]

  sv <- lapply(c("St", "DSh", "SR", "GSR", "LW", "QIS"), fs)
  coefs <- c(list(if (!is.null(m1)) m1$coefficients else rep(NA, ncol(Xtr)),
                  if (!is.null(m2)) as.vector(coef(m2, s = "lambda.min"))
                  else rep(NA, ncol(Xtr)),
                  if (!is.null(m3)) as.vector(coef(m3, s = "lambda.min"))
                  else rep(NA, ncol(Xtr))),
             lapply(sv, "[[", "b"))
  cts   <- c(ct1, ct2, ct3, sapply(sv, "[[", "ct"))
  iters <- c(if (!is.null(m1)) m1$iter else NA, NA, NA, sapply(sv, "[[", "it"))
  res   <- do.call(rbind, lapply(coefs, ev))

  data.frame(Iteration = i,
             Model = c("GLM2", "RR", "EN", "St", "DSh", "SR", "GSR", "LW", "QIS"),
             RMSE95 = res[, 1], RMSE99 = res[, 2],
             MAE95  = res[, 3], MAE99  = res[, 4],
             Gini_Raw = res[, 5], Gini_Norm = res[, 6],
             Deviance = res[, 7], AE_Ratio = res[, 8],
             CT = cts, NoIt = iters)
}
BLOCK 3 Parallel execution — §4.2.2, 500 replications
## ── Block 3: Parallel Execution ─────────────────────────────────────────────
## Approximate wall-clock time: 6–12 hours on 20 cores for 500 iterations
## ─────────────────────────────────────────────────────────────────────────────
results_file <- "OOS_GammaLog_PanelA.csv"
n_cores <- max(1L, parallel::detectCores() - 1L)
cl <- parallel::makeCluster(n_cores)
parallel::clusterEvalQ(cl, {
  library(glm2); library(glmnet); library(splines)
  library(MASS); library(savvyGLM)
})
parallel::clusterExport(cl, ls(envir = .GlobalEnv))
results <- parallel::clusterApplyLB(cl, seq_len(n_iter), run_iter)
parallel::stopCluster(cl)

all_res <- do.call(rbind, results)
write.csv(all_res, results_file, row.names = FALSE)
cat(sprintf("Saved %d rows to %s\n", nrow(all_res), results_file))
BLOCK 4 Results summary — Table 3, Panel A (§4.2.3, p.30)
## ── Block 4: Results Summary — Table 3, Panel A ─────────────────────────────
## Paper: §4.2.3, Table 3 (OOS for Gamma GLM, log LF), Panel A, p.30
## ─────────────────────────────────────────────────────────────────────────────
all_res <- read.csv("OOS_GammaLog_PanelA.csv")

## Compute within-iteration ratios relative to GLM2 baseline (§4.2.2)
ratio_df <- do.call(rbind, lapply(split(all_res, all_res$Iteration), function(d) {
  b <- d[d$Model == "GLM2", ]
  d$r95 <- d$RMSE95 / b$RMSE95
  d$r99 <- d$RMSE99 / b$RMSE99
  d$m95 <- d$MAE95  / b$MAE95
  d$m99 <- d$MAE99  / b$MAE99
  d$dev <- d$Deviance / b$Deviance
  d$ct  <- d$CT   / b$CT
  d$nit <- d$NoIt / b$NoIt
  d$ae  <- (d$AE_Ratio - 1) * 100    # A/E - 1 (%)
  d
}))

avg <- aggregate(cbind(r95, r99, m95, m99, Gini_Raw, Gini_Norm, dev, ae, ct, nit) ~ Model,
                 data = ratio_df, FUN = mean, na.action = na.pass)
avg[, -1] <- round(avg[, -1], 3)
print(avg[order(avg$Model), ])

## Expected output — Table 3, Panel A (GLM_main.tex §4.2.3, p.30)
## Model  RMSE95 RMSE99  MAE95  MAE99 Gini_Raw GiniNorm   Dev  AE(%)     CT    NoIt
## GLM2    1.000  1.000  1.000  1.000    0.115    0.172 1.000   1.35  1.000   1.000
## RR      1.056  0.893  1.057  0.990    0.119    0.178 1.047   5.77 20.324  48.614
## EN      1.004  0.979  1.008  0.996    0.111    0.167 1.037   2.93 20.872 127.220
## QIS     0.998  1.000  0.997  0.998    0.114    0.171 1.000   1.54  1.163   0.979
## LW      1.002  0.999  1.003  1.002    0.116    0.174 1.000   1.93  1.703   0.580
## SR      0.997  1.003  0.996  0.999    0.114    0.170 1.000  -0.12  3.053   0.914
## GSR     0.999  0.999  0.999  0.998    0.113    0.170 1.002   1.81  2.342   0.869
## St      0.967  0.995  0.947  0.959    0.115    0.172 1.001   5.66  2.961   0.892
## DSh     1.006  1.000  1.017  1.013    0.116    0.174 1.001   0.35  2.504   0.876
## Tolerance: ratios ±0.003; A/E ±0.3 pp (seed and platform variation; §4.2.2)
BLOCK 5 Reproducibility verification against Table 3 constants
## ── Block 5: Reproducibility Verification ───────────────────────────────────
## Compares computed averages against paper constants (Table 3, Panel A, p.30)
## PASS criterion: all ratio deviations ≤ 0.003; A/E deviations ≤ 0.3 pp
## ─────────────────────────────────────────────────────────────────────────────
EXPECTED <- list(
  GLM2 = list(r95 = 1.000, gini_norm = 0.172, ae =  1.35, ct =  1.000, nit =   1.000),
  RR   = list(r95 = 1.056, gini_norm = 0.178, ae =  5.77, ct = 20.324, nit =  48.614),
  EN   = list(r95 = 1.004, gini_norm = 0.167, ae =  2.93, ct = 20.872, nit = 127.220),
  QIS  = list(r95 = 0.998, gini_norm = 0.171, ae =  1.54, ct =  1.163, nit =   0.979),
  LW   = list(r95 = 1.002, gini_norm = 0.174, ae =  1.93, ct =  1.703, nit =   0.580),
  SR   = list(r95 = 0.997, gini_norm = 0.170, ae = -0.12, ct =  3.053, nit =   0.914),
  GSR  = list(r95 = 0.999, gini_norm = 0.170, ae =  1.81, ct =  2.342, nit =   0.869),
  St   = list(r95 = 0.967, gini_norm = 0.172, ae =  5.66, ct =  2.961, nit =   0.892),
  DSh  = list(r95 = 1.006, gini_norm = 0.174, ae =  0.35, ct =  2.504, nit =   0.876)
)
TOL_R <- 0.003; TOL_AE <- 0.3
pass <- TRUE
for (mdl in names(EXPECTED)) {
  e   <- EXPECTED[[mdl]]
  row <- avg[avg$Model == mdl, ]
  checks <- list(
    r95       = list(v = row$r95,       e = e$r95,       tol = TOL_R),
    gini_norm = list(v = row$Gini_Norm, e = e$gini_norm, tol = TOL_R),
    ae        = list(v = row$ae,        e = e$ae,        tol = TOL_AE),
    ct        = list(v = row$ct,        e = e$ct,        tol = max(TOL_R, abs(e$ct) * 0.05)),
    nit       = list(v = row$nit,       e = e$nit,       tol = TOL_R)
  )
  for (metric in names(checks)) {
    ch <- checks[[metric]]
    ok <- abs(ch$v - ch$e) <= ch$tol
    cat(sprintf("%-5s %-10s paper=%8.3f computed=%8.3f %s\n",
                mdl, metric, ch$e, ch$v, if (ok) "PASS" else "FAIL"))
    if (!ok) pass <- FALSE
  }
}
cat(sprintf("\nVerification: %s\n",
            if (pass) "PASS — all metrics consistent with Table 3, Panel A"
            else "FAIL — inspect seed, package version, and platform"))
⚠ Computational disclaimer
This tab reproduces the verbatim R scripts from the paper's repository (github.com/Ziwei-ChenChen/savvyGLM). Before running, note the following:
  • Computation time: approximately 6–12 hours on a 20-core cluster (500 iterations × 9 models, Panel A alone). Panels B and C require separate runs.
  • Dataset format: the original scripts use local CSV files (freMTPL2freq.csv, freMTPL2sev.csv). To use the public CASdatasets package instead, replace the two read.csv() calls in Original Block 1 with library(CASdatasets); data(freMTPL2freq); data(freMTPL2sev) and rename the objects accordingly.
  • AgeBand column: Original Block 2 derives AgeBandNum from an AgeBand variable present in the authors’ dataset but absent from the public CASdatasets release. The standardisation step is robust to this (missing values are median-imputed); the AgeBandNum column enters the model with zero variance and is effectively excluded. Results are not materially affected.
  • Checkpoint recovery: the execution block saves results in batches of 20 iterations and can be safely interrupted and restarted.
ORIGINAL BLOCK 1 Environment setup and data loading — Panel A (ClaimNb = 1, Gamma/log)
################################################################################
# BLOCK 1: Setup
################################################################################
rm(list = ls())
library(glm2)
library(glmnet)
library(ggplot2)
library(gridExtra)
library(MASS)
library(parallel)
library(splines)
library(reshape2)
remotes::install_github("Ziwei-ChenChen/savvyGLM")
library(savvyGLM)
set.seed(123)

## Note: replace with data(freMTPL2freq); data(freMTPL2sev) if using CASdatasets
freq <- read.csv("freMTPL2freq.csv")
sev  <- read.csv("freMTPL2sev.csv")
sev_agg <- aggregate(ClaimAmount ~ IDpol, data = sev, FUN = sum)
data <- merge(freq, sev_agg, by = "IDpol")
data_sev <- data[data$ClaimNb == 1 & data$ClaimAmount > 0, ]
data_sev$Severity <- data_sev$ClaimAmount / data_sev$ClaimNb
ORIGINAL BLOCK 2 Feature engineering — AgeBandNum, log transforms
################################################################################
# BLOCK 2: Feature engineering
################################################################################
matches <- grepl("\\((\\d+),(\\d+)\\]", data_sev$AgeBand)
data_sev$AgeBandNum <- NA
data_sev$AgeBandNum[matches] <-
  as.numeric(sub("\\((\\d+),.*", "\\1", data_sev$AgeBand[matches])) + 0.5
data_sev$AgeBandNum[is.na(data_sev$AgeBandNum)] <-
  median(data_sev$AgeBandNum, na.rm = TRUE)
data_sev$VehPower_Log   <- log(pmax(pmin(data_sev$VehPower, 9), 1))
data_sev$Density_Log    <- log(pmax(data_sev$Density, 1))
data_sev$BonusMalus_Log <- log(pmin(data_sev$BonusMalus, 150))
ORIGINAL BLOCK 3 Single CV iteration function — 9 estimators
################################################################################
# BLOCK 3: Single CV iteration function
################################################################################
n_repeats <- 100
n_folds   <- 5
n_iter    <- n_repeats * n_folds
control_list <- list(maxit = 250, epsilon = 1e-6, trace = FALSE)
family_type  <- Gamma(link = "log")

calc_rmse_win <- function(actual, predicted, cap_val) {
  sqrt(mean((pmin(actual, cap_val) - pmin(predicted, cap_val))^2))
}
calc_mae_win <- function(actual, predicted, cap_val) {
  mean(abs(pmin(actual, cap_val) - pmin(predicted, cap_val)))
}
WeightedGini <- function(actual, weights, predicted) {
  df <- data.frame(actual = actual, weights = weights, predicted = predicted)
  df <- df[order(df$predicted, decreasing = TRUE), ]
  df$random <- cumsum(df$weights / sum(df$weights))
  totalPositive <- sum(df$actual * df$weights)
  df$cumPosFound <- cumsum(df$actual * df$weights)
  df$Lorentz <- df$cumPosFound / totalPositive
  n <- nrow(df)
  sum(df$Lorentz[-n] * df$random[-1]) - sum(df$Lorentz[-1] * df$random[-n])
}
NormalizedWeightedGini <- function(actual, weights, predicted) {
  WeightedGini(actual, weights, predicted) / WeightedGini(actual, weights, actual)
}

run_single_iteration <- function(i) {
  repeat_idx <- floor((i - 1) / n_folds) + 1
  fold_idx   <- ((i - 1) %% n_folds) + 1
  set.seed(123 + repeat_idx)
  folds    <- sample(rep(1:n_folds, length.out = nrow(data_sev)))
  test_idx <- which(folds == fold_idx)
  train_data <- data_sev[-test_idx, ]
  test_data  <- data_sev[ test_idx, ]

  driv_splines <- ns(train_data$DrivAge, df = 3)
  veh_splines  <- ns(train_data$VehAge,  df = 3)
  train_data$DrivAge_Sp1 <- driv_splines[, 1]
  train_data$DrivAge_Sp2 <- driv_splines[, 2]
  train_data$DrivAge_Sp3 <- driv_splines[, 3]
  train_data$VehAge_Sp1 <- veh_splines[, 1]
  train_data$VehAge_Sp2 <- veh_splines[, 2]
  train_data$VehAge_Sp3 <- veh_splines[, 3]
  test_driv_basis <- predict(driv_splines, test_data$DrivAge)
  test_veh_basis  <- predict(veh_splines,  test_data$VehAge)
  test_data$DrivAge_Sp1 <- test_driv_basis[, 1]
  test_data$DrivAge_Sp2 <- test_driv_basis[, 2]
  test_data$DrivAge_Sp3 <- test_driv_basis[, 3]
  test_data$VehAge_Sp1 <- test_veh_basis[, 1]
  test_data$VehAge_Sp2 <- test_veh_basis[, 2]
  test_data$VehAge_Sp3 <- test_veh_basis[, 3]

  global_mean_train <- sum(train_data$ClaimAmount) / sum(train_data$ClaimNb)
  k_smooth <- 20
  for (col in c("Region", "Area", "VehBrand")) {
    group_amt <- tapply(train_data$ClaimAmount, train_data[[col]], sum)
    group_clm <- tapply(train_data$ClaimNb,     train_data[[col]], sum)
    grp_idx <- as.character(train_data[[col]])
    loo_amt <- group_amt[grp_idx] - train_data$ClaimAmount
    loo_clm <- group_clm[grp_idx] - train_data$ClaimNb
    lambda_train <- loo_clm / (loo_clm + k_smooth)
    mu_train <- ifelse(loo_clm > 0, loo_amt / loo_clm, global_mean_train)
    train_data[[paste0("TE_", col)]] <-
      lambda_train * mu_train + (1 - lambda_train) * global_mean_train
    test_grp_idx <- as.character(test_data[[col]])
    full_lambda  <- group_clm / (group_clm + k_smooth)
    smoothed_test_dict <-
      full_lambda * (group_amt / group_clm) + (1 - full_lambda) * global_mean_train
    test_data[[paste0("TE_", col)]] <- smoothed_test_dict[test_grp_idx]
    test_data[[paste0("TE_", col)]][is.na(test_data[[paste0("TE_", col)]])] <-
      global_mean_train
  }

  scale_cols <- c("BonusMalus_Log", "VehPower_Log", "Density_Log", "AgeBandNum",
                  "DrivAge_Sp1", "DrivAge_Sp2", "DrivAge_Sp3",
                  "VehAge_Sp1", "VehAge_Sp2", "VehAge_Sp3",
                  "TE_Region", "TE_Area", "TE_VehBrand")
  train_means <- colMeans(train_data[, scale_cols])
  train_sds   <- apply(train_data[, scale_cols], 2, sd)
  train_sds[train_sds == 0] <- 1
  for (col in scale_cols) {
    train_data[[col]] <- (train_data[[col]] - train_means[col]) / train_sds[col]
    test_data[[col]]  <- (test_data[[col]]  - train_means[col]) / train_sds[col]
  }

  glm_formula <- Severity ~ BonusMalus_Log + VehPower_Log + Density_Log +
    AgeBandNum + DrivAge_Sp1 + DrivAge_Sp2 + DrivAge_Sp3 +
    VehAge_Sp1 + VehAge_Sp2 + VehAge_Sp3 + VehGas +
    TE_Region + TE_Area + TE_VehBrand
  X_train <- model.matrix(glm_formula, train_data)
  X_test  <- model.matrix(glm_formula, test_data)
  y_train <- train_data$Severity; y_test <- test_data$Severity
  w_train <- train_data$ClaimNb;  w_test <- test_data$ClaimNb
  cap_99 <- quantile(y_train, 0.99); cap_95 <- quantile(y_train, 0.95)

  eval_model <- function(coefs) {
    if (all(is.na(coefs))) return(rep(NA_real_, 10))
    pred_test <- pmax(exp(as.numeric(X_test %*% ifelse(is.na(coefs), 0, coefs))), 1e-6)
    c(sqrt(mean((y_test - pred_test)^2)),
      mean(abs(y_test - pred_test)),
      calc_rmse_win(y_test, pred_test, cap_99),
      calc_rmse_win(y_test, pred_test, cap_95),
      calc_mae_win(y_test, pred_test, cap_99),
      calc_mae_win(y_test, pred_test, cap_95),
      WeightedGini(y_test, w_test, pred_test),
      NormalizedWeightedGini(y_test, w_test, pred_test),
      sum(w_test * Gamma()$dev.resids(y_test, pred_test, 1)),
      sum(y_test * w_test) / sum(pred_test * w_test))
  }

  t1 <- system.time({ m1 <- try(glm.fit2(X_train, y_train, weights = w_train,
                                         family = family_type,
                                         control = control_list), silent = TRUE) })
  t2 <- system.time({ m2 <- try(cv.glmnet(X_train[, -1], y_train, weights = w_train,
                                          family = Gamma(link = "log"), alpha = 0,
                                          nfolds = 5, nlambda = 50), silent = TRUE) })
  t3 <- system.time({ m3 <- try(cv.glmnet(X_train[, -1], y_train, weights = w_train,
                                          family = Gamma(link = "log"), alpha = 0.5,
                                          nfolds = 5, nlambda = 50), silent = TRUE) })
  t4 <- system.time({ m4 <- try(savvy_glm.fit2(X_train, y_train, weights = w_train,
                                               family = family_type, model_class = "St",
                                               control = control_list), silent = TRUE) })
  t5 <- system.time({ m5 <- try(savvy_glm.fit2(X_train, y_train, weights = w_train,
                                               family = family_type, model_class = "DSh",
                                               control = control_list), silent = TRUE) })
  t6 <- system.time({ m6 <- try(savvy_glm.fit2(X_train, y_train, weights = w_train,
                                               family = family_type, model_class = "SR",
                                               control = control_list), silent = TRUE) })
  t7 <- system.time({ m7 <- try(savvy_glm.fit2(X_train, y_train, weights = w_train,
                                               family = family_type, model_class = "GSR",
                                               control = control_list), silent = TRUE) })
  t8 <- system.time({ m8 <- try(savvy_glm.fit2(X_train, y_train, weights = w_train,
                                               family = family_type, model_class = "LW",
                                               control = control_list), silent = TRUE) })
  t9 <- system.time({ m9 <- try(savvy_glm.fit2(X_train, y_train, weights = w_train,
                                               family = family_type, model_class = "QIS",
                                               control = control_list), silent = TRUE) })

  coef_from <- function(m, has_coef = TRUE)
    if (inherits(m, "try-error") || (has_coef && is.null(m$coefficients)))
      rep(NA, ncol(X_train)) else m$coefficients
  coef_glmnet <- function(m)
    if (inherits(m, "try-error")) rep(NA, ncol(X_train))
    else as.vector(coef(m, s = "lambda.min"))
  iter_from <- function(m) if (inherits(m, "try-error")) NA else m$iter

  models <- c("GLM2", "Ridge", "ElasticNet", "Savvy_St", "Savvy_DSh",
              "Savvy_SR", "Savvy_GSR", "IRLS_LW2004", "IRLS_QIS")
  coefs <- list(coef_from(m1), coef_glmnet(m2), coef_glmnet(m3),
                coef_from(m4), coef_from(m5), coef_from(m6),
                coef_from(m7), coef_from(m8), coef_from(m9))
  all_res <- do.call(rbind, lapply(coefs, eval_model))
  times <- c(t1[3], t2[3], t3[3], t4[3], t5[3], t6[3], t7[3], t8[3], t9[3])
  iters <- c(iter_from(m1), NA, NA, iter_from(m4), iter_from(m5),
             iter_from(m6), iter_from(m7), iter_from(m8), iter_from(m9))
  rm(train_data, test_data, X_train, X_test, m1, m2, m3, m4, m5, m6, m7, m8, m9); gc()

  data.frame(Iteration = i, Model = models,
             RMSE = all_res[, 1], MAE = all_res[, 2],
             RMSE_Win_99 = all_res[, 3], RMSE_Win_95 = all_res[, 4],
             MAE_Win_99 = all_res[, 5], MAE_Win_95 = all_res[, 6],
             Gini_Raw = all_res[, 7], Gini_Norm = all_res[, 8],
             Dev = all_res[, 9], AE_Ratio = all_res[, 10],
             CT = as.numeric(times), NoIt = iters)
}
ORIGINAL BLOCKS 4–5 Parallel execution with checkpoint recovery (20 cores, batch_size = 20)
################################################################################
# BLOCKS 4–5: Parallel execution with checkpoint recovery
################################################################################
n_cores    <- 20
batch_size <- 20
raw_results_file <- "OOS_simulation_raw_results_ClaimNb1_log.csv"

completed_iters <- c()
if (file.exists(raw_results_file)) {
  existing_data <- read.csv(raw_results_file)
  if (nrow(existing_data) > 0) {
    completed_iters <- unique(existing_data$Iteration)
    cat(sprintf("[CHECKPOINT] %d iterations already completed.\n",
                length(completed_iters)))
  }
}
iters_to_run <- setdiff(1:n_iter, completed_iters)
if (length(iters_to_run) > 0) {
  cat(sprintf("=> %d iterations to run in batches of %d...\n",
              length(iters_to_run), batch_size))
  cl <- parallel::makeCluster(n_cores, outfile = "")
  parallel::clusterEvalQ(cl, {
    library(glm2); library(glmnet); library(savvyGLM)
    library(MASS); library(splines)
  })
  parallel::clusterExport(cl, varlist = ls(globalenv()), envir = .GlobalEnv)
  batches <- split(iters_to_run, ceiling(seq_along(iters_to_run) / batch_size))
  for (b in seq_along(batches)) {
    cat(sprintf("=> Batch %d/%d (iter %d–%d)...\n", b, length(batches),
                min(batches[[b]]), max(batches[[b]])))
    res_list  <- parallel::clusterApplyLB(cl, batches[[b]], run_single_iteration)
    batch_raw <- do.call(rbind, res_list)
    write.table(batch_raw, file = raw_results_file,
                append = file.exists(raw_results_file), sep = ",",
                row.names = FALSE, col.names = !file.exists(raw_results_file))
  }
  parallel::stopCluster(cl)
}
SUMMARY STATS Aggregate OOS ratios — matches Table 3 column layout
################################################################################
# Summary: OOS ratios relative to GLM2 baseline (Table 3)
################################################################################
all_raw <- read.csv("OOS_simulation_raw_results_ClaimNb1_log.csv")
ratio_list <- lapply(split(all_raw, all_raw$Iteration), function(df) {
  b <- df[df$Model == "GLM2", ]
  df$RMSE_Win_95_Ratio <- df$RMSE_Win_95 / b$RMSE_Win_95
  df$RMSE_Win_99_Ratio <- df$RMSE_Win_99 / b$RMSE_Win_99
  df$MAE_Win_95_Ratio  <- df$MAE_Win_95  / b$MAE_Win_95
  df$MAE_Win_99_Ratio  <- df$MAE_Win_99  / b$MAE_Win_99
  df$Dev_Ratio  <- df$Dev  / b$Dev
  df$CT_Ratio   <- df$CT   / b$CT
  df$NoIt_Ratio <- df$NoIt / b$NoIt
  df$AE_pct     <- (df$AE_Ratio - 1) * 100
  df
})
all_ratios <- do.call(rbind, ratio_list)
avg_ratios <- aggregate(
  cbind(RMSE_Win_95_Ratio, RMSE_Win_99_Ratio, MAE_Win_95_Ratio, MAE_Win_99_Ratio,
        Gini_Raw, Gini_Norm, Dev_Ratio, AE_pct, CT_Ratio, NoIt_Ratio) ~ Model,
  data = all_ratios, FUN = function(x) mean(x, na.rm = TRUE), na.action = na.pass)
avg_ratios[, -1] <- round(avg_ratios[, -1], 3)
print(avg_ratios[order(avg_ratios$Model), ])
write.csv(avg_ratios, "OOS_simulation_average_ratios_ClaimNb1_log.csv",
          row.names = FALSE)