Using the CGMissingDataR Shiny App
Overview
CGMissingDataR includes an optional Shiny app for interactive imputation of missing glucose values. The app is a point-and-click interface around the main package function, run_missing_glucose_imputation().
The app is useful when users want to:
- upload a CSV file without writing R code;
- choose the target glucose, subject ID, timestamp, and feature columns from a user interface;
- load built-in example data sets for demonstration;
- inspect the observed missingness before running imputation;
- run the imputation workflow;
- preview only the rows where glucose was originally missing and then imputed;
- download the completed data as a CSV file.
The Shiny app does not implement a separate imputation algorithm. It
calls run_missing_glucose_imputation() internally and
returns the same type of completed data frame as the command-line
workflow.
Installation
Install CGMissingDataR from CRAN with:
install.packages("CGMissingDataR")
The app requires the optional R package shiny. If shiny is not already installed, install it with:
install.packages("shiny")
Then load the package:
library(CGMissingDataR)
Launching the app
Launch the app with:
run_cgmissingdata_app()
During package development, after running devtools::load_all(), the same launcher can be used:
devtools::load_all()
run_cgmissingdata_app()
The app is bundled inside the installed package, typically under:
system.file(
  "shiny",
  "cgm_imputation_app",
  package = "CGMissingDataR"
)
Users normally do not need to access this directory directly. The run_cgmissingdata_app() launcher finds it automatically.
Input options
The app provides two ways to load data.
Upload a CSV file
Use the Browse button to upload a CSV file containing CGM data. The file should contain, at minimum, columns corresponding to:
| Role | Example column | App selector |
|---|---|---|
| Subject identifier | USUBJID | Subject ID column |
| Glucose value | LBORRES | Target glucose column |
| Timestamp | Time | Timestamp column |
| Additional predictors | AGE, hba1c | Feature columns |
After the file is uploaded, the app displays a preview of the uploaded data and populates the column-selection controls.
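Before uploading, it can help to confirm that a file contains the expected columns. The following is a minimal base-R sketch, not part of the package: the helper name check_cgm_columns() and the temporary file are illustrative assumptions, and the column names follow the example table above.

```r
# Hypothetical helper (not part of CGMissingDataR): return the names of
# required columns that are absent from a CSV file's header.
check_cgm_columns <- function(path,
                              required = c("USUBJID", "LBORRES", "Time")) {
  header <- names(read.csv(path, nrows = 1, check.names = FALSE))
  setdiff(required, header)
}

# Write a tiny, invented CSV to a temporary file purely for illustration.
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    USUBJID = c("S1", "S1"),
    LBORRES = c(101, NA),
    Time = c("2020-01-16 00:00:00", "2020-01-16 00:05:00"),
    AGE = c(54, 54)
  ),
  csv_path,
  row.names = FALSE
)

# An empty result means every required column is present.
missing_cols <- check_cgm_columns(csv_path)
```

Passing a different `required` vector lets the same check cover any feature columns you plan to select in the app.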
Load built-in example data
The app can also load built-in example data sets for demonstration. These are useful for quickly showing how the workflow behaves without requiring users to upload their own data.
The example data sets are intended to include:
| Example data | Description |
|---|---|
| CGMExmplDat5Pct | Example CGM data with about 5% missing glucose values. |
| CGMExmplDat10Pct | Example CGM data with about 10% missing glucose values. |
After selecting an example data set and clicking Load example data, the app uses that data set exactly as if it had been uploaded by the user.
Selecting columns
Once data are loaded, select the columns that map to the imputation function.
Target glucose column
Choose the glucose column with missing values to impute. In the included example data, this is usually:
LBORRES
The original target column is preserved in the returned data. Values that were originally missing remain NA in this original column. Completed glucose values are written to a new column named:
imputed_glucose_value
Subject ID column
Choose the column identifying each subject or participant. In the example data, this is usually:
USUBJID
The subject ID is used for sorting, lag feature creation, rolling-mean feature creation, and subject-level time handling.
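To make the role of the subject ID concrete, here is a minimal base-R sketch of per-subject sorting, a one-step lag, and a trailing rolling mean. It is an illustration only: the column names lag1 and roll_mean3, the window of three readings, and the toy values are assumptions, and the package's internal feature construction may differ.

```r
# Toy data, invented for illustration; rows are deliberately out of order.
cgm <- data.frame(
  USUBJID = c("A", "A", "A", "B", "B", "B"),
  TimeSeries = c(2, 1, 3, 1, 2, 3),
  LBORRES = c(110, 100, 120, 90, NA, 95)
)

# Sort within subject by time so that "previous reading" is well defined.
cgm <- cgm[order(cgm$USUBJID, cgm$TimeSeries), ]

# One-step lag, computed separately for each subject.
lag_one <- function(v) c(NA, head(v, -1))
cgm$lag1 <- ave(cgm$LBORRES, cgm$USUBJID, FUN = lag_one)

# Trailing mean over up to k readings, ignoring missing values.
trailing_mean <- function(v, k = 3) {
  vapply(seq_along(v), function(i) {
    mean(v[max(1, i - k + 1):i], na.rm = TRUE)
  }, numeric(1))
}
cgm$roll_mean3 <- ave(cgm$LBORRES, cgm$USUBJID, FUN = trailing_mean)
```

Grouping by subject keeps the first reading of one participant from borrowing the last reading of another, which is why the ID column matters even when the rows already look sorted.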
Timestamp column
Choose the raw timestamp column. In the example data, this is usually:
Time
The imputation function creates or reuses a numeric TimeSeries column from the timestamp values. Common timestamp formats are supported, including colon-separated, hyphen-separated, slash-separated, ISO-style, and POSIXct values.
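To check ahead of time whether a timestamp column is likely to parse, a rough base-R sketch such as the one below can help. It is not the package's parser: the helper name parse_cgm_time(), the list of format strings, and the UTC time zone are assumptions chosen to cover the formats mentioned above.

```r
# Hypothetical helper: try several formats in turn and keep the first one
# that parses each value. Values that match no format stay NA.
parse_cgm_time <- function(x, tz = "UTC") {
  x <- as.character(x)
  formats <- c(
    "%Y:%m:%d:%H:%M",      # colon-separated
    "%Y-%m-%dT%H:%M:%S",   # ISO-style
    "%Y-%m-%d %H:%M:%S",   # hyphen-separated
    "%Y/%m/%d %H:%M:%S"    # slash-separated
  )
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = tz)
  for (fmt in formats) {
    todo <- is.na(out) & !is.na(x)
    if (!any(todo)) break
    out[todo] <- as.POSIXct(x[todo], format = fmt, tz = tz)
  }
  out
}

stamps <- c(
  "2020:01:16:00:00",
  "2020-01-16 00:00:00",
  "2020/01/16 00:00:00",
  "2020-01-16T00:00:00",
  "not a time"
)
parsed <- parse_cgm_time(stamps)

# Values that could not be parsed under any of the assumed formats.
unparsed <- stamps[is.na(parsed)]
```

Inspecting `unparsed` before running imputation shows which rows would need cleaning, or whether the wrong column was chosen as the timestamp.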
Missingness summary card
The app includes a missingness summary card beside the uploaded data preview. After a target glucose column is selected, this card shows:
- the percentage of missing values in the target column;
- the number of missing rows;
- the total number of rows;
- a warning style when missingness is greater than the chosen threshold, such as 20%.
This card is intended as a quick data-quality check before running the imputation workflow. Higher missingness does not necessarily mean imputation cannot be run, but users should interpret results carefully when a large portion of the target glucose column is missing.
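The quantities shown on the card can be reproduced with a few lines of base R. This is a sketch under assumptions, not the app's own code: the function name summarise_missingness() and the 20% default threshold are illustrative.

```r
# Illustrative summary of missingness in a target column.
summarise_missingness <- function(x, warn_above = 20) {
  n_total <- length(x)
  n_missing <- sum(is.na(x))
  pct_missing <- 100 * n_missing / n_total
  list(
    n_missing = n_missing,
    n_total = n_total,
    pct_missing = pct_missing,
    # Flag only when missingness is strictly greater than the threshold.
    warn = pct_missing > warn_above
  )
}

# Invented glucose values with two of ten readings missing.
glucose <- c(100, NA, 110, 115, NA, 120, 125, 130, 128, 126)
summary_card <- summarise_missingness(glucose)
```

With the values above, missingness sits exactly at the threshold, so the warning flag stays off; the flag turns on only once the percentage exceeds it.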
Backend selection
The app supports the same backends as
run_missing_glucose_imputation().
| Backend | Description | Recommended use |
|---|---|---|
| mice | R-native backend using the R package mice. | Default, CRAN-safe workflow. |
| sklearn | Optional Python-compatible backend through reticulate. | Closest agreement with the Python reference workflow. |
MICE backend
The default backend is:
imputer_backend = "mice"
This backend does not require Python and is the safest choice for most users. It is also the backend used in CRAN-safe examples and tests.
Optional sklearn backend
The optional Python-compatible backend is:
imputer_backend = "sklearn"
This path sends the data frame to Python through reticulate. Python then uses:
- pandas for data-frame operations;
- scikit-learn for IterativeImputer;
- statsmodels for ARIMA;
- Python xgboost for XGBoost regression.
To use the Python backend, install reticulate and
declare the Python requirements before launching or running the app:
install.packages("reticulate")
reticulate::py_require(c(
  "numpy",
  "pandas",
  "scikit-learn",
  "statsmodels",
  "xgboost"
))
The Python backend is optional. It is not required for installing or loading the package.
Running imputation
After loading data and selecting columns, click Run imputation.
Internally, the app calls code equivalent to:
out <- run_missing_glucose_imputation(
  data = uploaded_data,
  target_col = selected_target_col,
  feature_cols = selected_feature_cols,
  id_col = selected_id_col,
  time_col = selected_time_col,
  imputer_backend = selected_backend,
  use_arima_if_missing_leq = selected_threshold,
  xgb_nrounds = selected_xgb_rounds,
  seed = selected_seed,
  export = FALSE
)
The returned object is a data frame. The most important columns are:
| Column | Meaning |
|---|---|
| Original target column | Original glucose values; originally missing values remain NA. |
| TimeSeries | Numeric time feature derived from the timestamp column. |
| imputed_glucose_value | Completed glucose values after imputation. |
| imputation_method | Final method used, such as MICE+ARIMA or MICE+XGBoost. |
| missing_rate | Original missingness rate of the target glucose column. |
Previewing results
After imputation, the app displays a preview of rows where the original target glucose value was missing. This is more informative than showing only the first few rows of the completed data set because it lets users directly inspect the newly imputed values.
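The preview keeps only the rows in which the original target column is NA, so each displayed row shows a value that was missing and has now been imputed.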
The full completed data frame remains available for download.
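The preview filter can be sketched in base R as follows. The data frame here is a mock result with invented values, used only to show the filter; the column names follow this article, and the app's actual preview code may differ.

```r
# Mock result with invented values (illustration only, not real output).
out <- data.frame(
  USUBJID = c("S1", "S1", "S1", "S1"),
  LBORRES = c(100, NA, 110, NA),
  imputed_glucose_value = c(100, 104.2, 110, 112.7)
)

# Name of the original target column, as selected in the app.
target_col <- "LBORRES"

# Keep only rows where the original glucose value was missing.
preview <- out[is.na(out[[target_col]]), , drop = FALSE]
```

Indexing with `out[[target_col]]` rather than `out$LBORRES` keeps the filter independent of whichever column the user picked as the target.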
Downloading results
Use the Download imputed CSV button to save the completed data set. The CSV contains all returned columns from run_missing_glucose_imputation(), including:
- the original glucose column;
- TimeSeries;
- imputed_glucose_value;
- imputation_method;
- missing_rate;
- any original input columns retained by the workflow.
Internal lag and rolling-mean columns are used during imputation but are removed from the returned data frame before display or download.
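When working from the command line instead of the app, a comparable CSV can be written with base R. This is a sketch, not the app's download handler: the mock data frame, its values, and the file name are assumptions for illustration.

```r
# Mock completed data with invented values (illustration only).
out <- data.frame(
  USUBJID = c("S1", "S1"),
  LBORRES = c(100, NA),
  imputed_glucose_value = c(100, 104.2)
)

# Write without row names so the file contains only the data columns.
csv_path <- file.path(tempdir(), "imputed_glucose.csv")
write.csv(out, csv_path, row.names = FALSE)

# Reading the file back confirms that originally missing values stay NA.
reloaded <- read.csv(csv_path)
```

Keeping the original target column alongside imputed_glucose_value in the saved file makes it possible to tell observed readings from imputed ones later.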
Troubleshooting
The app does not launch
If you see an error saying that Shiny is not installed, run:
install.packages("shiny")
Then restart R and try:
run_cgmissingdata_app()
No column choices appear
Column choices appear only after data are loaded. Upload a CSV file or load one of the built-in example data sets.
Imputation fails because a timestamp cannot be parsed
Check the timestamp column selected in the app. The values should be parseable dates or datetimes, for example:
"2020:01:16:00:00"
"2020-01-16 00:00:00"
"2020/01/16 00:00:00"
"2020-01-16T00:00:00"
If the wrong column was selected as the timestamp column, select the correct column and rerun imputation.
Python backend fails because a Python module is missing
If imputer_backend = "sklearn" fails because Python
packages are missing, run:
reticulate::py_require(c(
  "numpy",
  "pandas",
  "scikit-learn",
  "statsmodels",
  "xgboost"
))
Then restart R and launch the app again.
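To see which Python modules are visible from R, a small helper along these lines may be useful. It is a sketch with an assumed helper name, check_python_modules(); it relies on reticulate::py_module_available() and returns NA for every module when reticulate itself is not installed. Note that scikit-learn is imported in Python as sklearn, so that is the name checked here.

```r
# Hypothetical helper: report whether each Python module can be imported.
# Module names are Python import names, which can differ from package names
# (for example, the scikit-learn package is imported as "sklearn").
check_python_modules <- function(modules = c("numpy", "pandas", "sklearn",
                                             "statsmodels", "xgboost")) {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    # reticulate is not installed, so availability cannot be determined.
    return(setNames(rep(NA, length(modules)), modules))
  }
  vapply(modules, reticulate::py_module_available, logical(1))
}
```

Calling check_python_modules() returns a named logical vector; any FALSE entry points to the module that still needs to be installed. The call initializes Python, so it is best run in a fresh R session.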
Developer notes
The recommended package structure for the app is:
inst/
└── shiny/
    └── cgm_imputation_app/
        └── app.R
The launcher should live in an exported R function, for example:
run_cgmissingdata_app <- function() {
  # Locate the app directory bundled under inst/shiny/ in the installed package.
  app_dir <- system.file(
    "shiny",
    "cgm_imputation_app",
    package = "CGMissingDataR"
  )
  shiny::runApp(app_dir, display.mode = "normal")
}
Because the app is optional, shiny should usually be listed in Suggests, not Imports, unless the package requires Shiny for normal operation.
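When shiny is listed in Suggests, a launcher usually benefits from checking for it at run time. The sketch below shows one way this could look; it is not the package's actual launcher, and the names find_app_dir() and launch_app_sketch() as well as the error messages are assumptions.

```r
# Hypothetical helper: locate the bundled app directory, failing with a
# clear message if it is not found (system.file() returns "" in that case).
find_app_dir <- function(package = "CGMissingDataR") {
  app_dir <- system.file("shiny", "cgm_imputation_app", package = package)
  if (!nzchar(app_dir)) {
    stop("Could not find the bundled Shiny app in package '", package, "'.",
         call. = FALSE)
  }
  app_dir
}

# Hypothetical launcher: check for the suggested package before using it.
launch_app_sketch <- function() {
  if (!requireNamespace("shiny", quietly = TRUE)) {
    stop('Package "shiny" is required to run the app. ',
         'Install it with install.packages("shiny").',
         call. = FALSE)
  }
  shiny::runApp(find_app_dir(), display.mode = "normal")
}
```

Separating the directory lookup from the launch step keeps the lookup testable without starting a Shiny session.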