Random Forest • XGBoost / LightGBM / CatBoost — PDP, ICE, SHAP, Tuning
I implement end-to-end tree-ensemble pipelines in R for classification and regression, with strong diagnostics and clear model explanations. I work fluently with randomForest/ranger, xgboost, lightgbm, and catboost, wrapped in tidymodels/caret or mlr3 for clean resampling and reproducibility.
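To show the working style, here is a minimal sketch of such a pipeline with tidymodels and ranger; `df` and its factor outcome `y` are placeholder names, not a fixed template.

    # Minimal sketch: a leakage-safe tidymodels + ranger classification pipeline.
    # `df` and its factor outcome `y` are placeholders for your data.
    library(tidymodels)

    set.seed(42)
    split <- initial_split(df, prop = 0.8, strata = y)

    rec <- recipe(y ~ ., data = training(split)) %>%
      step_impute_median(all_numeric_predictors()) %>%  # imputation learned on training data only
      step_dummy(all_nominal_predictors())              # one-hot encoding

    rf_spec <- rand_forest(trees = 500) %>%
      set_engine("ranger", importance = "permutation") %>%
      set_mode("classification")

    fit_rf <- workflow() %>%
      add_recipe(rec) %>%
      add_model(rf_spec) %>%
      fit(data = training(split))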
Modelling & tuning: grid/random/Bayesian search; cross-validation, repeated CV, and time-series CV; early stopping; class-imbalance handling (weights/SMOTE); monotonic constraints (where supported); categorical encodings (one-hot/target/ordered); missing-value strategies; and leakage-safe preprocessing pipelines. I provide feature importance (gain/cover/permutation), error curves, and calibration plots.
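As one example of the tuning workflow, a compact sketch of cross-validated early stopping with xgboost; `X` (a numeric matrix with column names), the 0/1 labels `y`, and all parameter values are illustrative placeholders.

    # Minimal sketch: xgboost with 5-fold CV and early stopping.
    # `X`, `y`, and the parameter values are placeholders, not recommendations.
    library(xgboost)

    dtrain <- xgb.DMatrix(data = X, label = y)
    params <- list(
      objective        = "binary:logistic",
      eval_metric      = "auc",
      eta              = 0.05,
      max_depth        = 5,
      scale_pos_weight = sum(y == 0) / sum(y == 1)  # simple class-imbalance weight
    )

    cv <- xgb.cv(params = params, data = dtrain, nrounds = 2000,
                 nfold = 5, early_stopping_rounds = 50, verbose = 0)
    fit <- xgb.train(params = params, data = dtrain,
                     nrounds = cv$best_iteration)   # refit at the CV-selected round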
Explainability: PDP/ICE for main effects and interactions, ALE as an alternative to PDP when features are correlated, and SHAP for global and local attributions (summary, dependence, and force plots). I deliver compact narratives that connect model behaviour to domain intuition, backed by robustness checks.
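A minimal sketch of one possible toolchain for these plots, reusing `fit` and `X` from the tuning sketch above: the pdp package for PDP/ICE and xgboost's built-in SHAP contributions. The feature name "age" is a placeholder.

    # Minimal sketch: PDP/ICE via the pdp package, SHAP via predcontrib.
    # Reuses `fit` and `X` from the sketch above; "age" is a placeholder feature.
    library(pdp)

    pd <- partial(fit, pred.var = "age", train = X, prob = TRUE)
    plotPartial(pd)                # partial-dependence curve on the probability scale

    ice <- partial(fit, pred.var = "age", train = X, prob = TRUE, ice = TRUE)
    plotPartial(ice, alpha = 0.2)  # one ICE curve per observation

    # Per-observation SHAP values: one column per feature plus a BIAS column
    shap <- predict(fit, xgb.DMatrix(X), predcontrib = TRUE)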
Evaluation: accuracy/ROC-AUC/PR-AUC/logloss for classification; RMSE/MAE/R² for regression; threshold analysis, decision curves, and fairness slices as needed. Deliverables include clean R scripts, tuned models, and publication-quality figures ready for reports or theses.
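For instance, a short sketch of metric bundles with yardstick; `preds`, `reg_preds`, and their column names are placeholders.

    # Minimal sketch: classification and regression metrics with yardstick.
    # `preds` is a placeholder data frame with truth (a factor), .pred_class,
    # and .pred_yes (probability of the event class).
    library(yardstick)

    cls_metrics <- metric_set(accuracy, roc_auc, pr_auc, mn_log_loss)
    cls_metrics(preds, truth = truth, estimate = .pred_class, .pred_yes)

    # Regression analogue on a `reg_preds` placeholder with truth column `y`
    reg_metrics <- metric_set(rmse, mae, rsq)
    reg_metrics(reg_preds, truth = y, estimate = .pred)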
I’m very familiar with these methods and can provide hands-on R programming help across the full workflow: data prep, modelling, tuning, interpretation, and deployment-oriented code.
Get help: engagements start at USD 150; fixed quotes follow a brief review of your data and scope.