CGMissingDataR is an R package based on the CGMissingData Python library for evaluating model performance under feature missingness by:

  • injecting missing values into feature columns at specified masking rates,
  • imputing missing values using a Multiple Imputation by Chained Equations (MICE)-style iterative imputer, and
  • training Random Forest and k-Nearest Neighbors regressors to report Mean Absolute Percentage Error (MAPE) and R² across missingness levels.
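
The three steps above can be sketched directly with the packages this README asks you to install. This is a minimal illustration under stated assumptions, not CGMissingDataR's own API: the synthetic data, the helper names (mask_features, mape, r2, evaluate_rate), the 150/50 train/test split, and every tuning value (m, maxit, num.trees, k) are hypothetical choices made for the example.

```r
# Hypothetical sketch of the mask -> impute -> evaluate workflow.
# None of the helper names below come from CGMissingDataR itself.
library(mice)
library(ranger)
library(FNN)

set.seed(42)

# Synthetic data: three features and a target kept well away from zero,
# because MAPE divides by the actual value.
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- 20 + 2 * X$x1 - X$x2 + 0.5 * X$x3 + rnorm(n, sd = 0.5)

# Step 1: set each feature cell to NA independently with probability `rate`.
mask_features <- function(X, rate) {
  M <- as.matrix(X)
  M[runif(length(M)) < rate] <- NA
  as.data.frame(M)
}

# Metrics reported per model and masking rate.
mape <- function(actual, pred) 100 * mean(abs((actual - pred) / actual))
r2 <- function(actual, pred) {
  1 - sum((actual - pred)^2) / sum((actual - mean(actual))^2)
}

evaluate_rate <- function(rate) {
  X_miss <- mask_features(X, rate)

  # Step 2: chained-equations imputation of the features only
  # (a single completed dataset, predictive mean matching).
  imp <- mice(X_miss, m = 1, maxit = 5, method = "pmm",
              printFlag = FALSE, seed = 1)
  X_imp <- mice::complete(imp, 1)

  # Step 3: fit both regressors on the first 150 rows, score on the rest.
  train_idx <- 1:150
  test_idx <- 151:n
  df <- cbind(X_imp, y = y)

  rf <- ranger(y ~ ., data = df[train_idx, ], num.trees = 200)
  rf_pred <- predict(rf, data = df[test_idx, ])$predictions

  knn_pred <- knn.reg(train = X_imp[train_idx, ], test = X_imp[test_idx, ],
                      y = y[train_idx], k = 5)$pred

  data.frame(
    rate  = rate,
    model = c("random_forest", "knn"),
    mape  = c(mape(y[test_idx], rf_pred), mape(y[test_idx], knn_pred)),
    r2    = c(r2(y[test_idx], rf_pred), r2(y[test_idx], knn_pred))
  )
}

results <- do.call(rbind, lapply(c(0.1, 0.2, 0.3), evaluate_rate))
print(results)
```

The result is one row per model and masking rate, which makes it easy to see how each regressor degrades as more feature values are masked. The package may order or configure these steps differently, for example by splitting before masking, so treat the sketch only as a picture of the idea.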

Before installing CGMissingDataR, make sure the following R packages are installed:

install.packages(c("FNN", "ranger", "mice"))

Install the development version of CGMissingDataR from GitHub (this requires the devtools package):

devtools::install_github("saraswatsh/CGMissingDataR")

Vignette

A brief vignette illustrating the usage of CGMissingDataR can be found here.

Changelog

The changelog is available here.