Parallelized Spot-GLM Model Fitting (macOS / Linux)
Source: R/glm_extension_adam.R
run_model_parallel_mac.Rd
Fits a Spot-GLM model for multiple responses (e.g., genes) in parallel using memory-safe chunking and pbmclapply, which relies on mclapply. Only available on Unix-based systems.
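The chunk-then-fork pattern described above can be illustrated with a minimal sketch. It is not the package's implementation: `chunk_columns` is a hypothetical helper, the memory estimate assumes 8 bytes per double, and `parallel::mclapply` stands in for `pbmclapply` (which adds a progress bar on top of it). Column means stand in for the per-response model fit.

```r
library(parallel)

# Hypothetical helper (not part of the package): split column indices so that
# each chunk of a double matrix with `n_rows` rows stays under `G` gigabytes.
chunk_columns <- function(n_rows, n_cols, G = 0.1) {
  bytes_per_col <- 8 * n_rows                        # 8 bytes per double
  cols_per_chunk <- max(1, floor(G * 1e9 / bytes_per_col))
  split(seq_len(n_cols), ceiling(seq_len(n_cols) / cols_per_chunk))
}

set.seed(1)
Y <- matrix(rnorm(200 * 50), nrow = 200)             # spots x responses

# A deliberately tiny budget (20 KB) so the toy matrix is split into chunks.
chunks <- chunk_columns(nrow(Y), ncol(Y), G = 2e-5)

# mclapply forks, so it needs a Unix-like OS; fall back to one core elsewhere.
n_cores <- if (.Platform$OS.type == "unix") 2L else 1L

# Stand-in for per-response model fitting: column means of each chunk.
res <- mclapply(
  chunks,
  function(idx) colMeans(Y[, idx, drop = FALSE]),
  mc.cores = n_cores
)
col_means <- unlist(res, use.names = FALSE)
```

Each worker only touches the columns of its own chunk, so a smaller `G` trades more chunks for a lower peak memory footprint per task.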
Usage
run_model_parallel_mac(
  Y,
  X,
  lambda,
  family = "spot gaussian",
  beta_0 = NULL,
  fix_coef = NULL,
  initialization = T,
  G = 0.1,
  num_cores = 1,
  offset = NULL,
  CT = NULL,
  weights = NULL,
  ct_cov_weights = NULL,
  n_epochs = 100,
  batch_size = 500,
  learning_rate = 1,
  max_diff = 1 - 1e-06,
  improvement_threshold = 1e-06,
  max_conv = 10
)
Arguments
- Y
Response matrix (spots × responses).
- X
Covariate matrix (spots × covariates).
- lambda
Deconvolution matrix (spots × cell types).
- family
The GLM family to use. One of: "spot gaussian", "spot poisson", "spot negative binomial", or "spot binomial".
- beta_0
Optional initial coefficient matrix (covariates × cell types).
- fix_coef
Optional logical matrix indicating which coefficients to fix (same dimensions as beta_0).
- initialization
Logical; whether to perform initialization via single-cell approximation. Default TRUE.
- G
Maximum chunk size (in GB) to control memory usage during parallelization.
- num_cores
Number of CPU cores to use in parallel.
- offset
Optional numeric vector (length equal to the number of spots).
- CT
Optional vector of dominant cell types per spot.
- weights
Optional observation-level weight matrix (spots × genes).
- ct_cov_weights
Optional cell-type-specific weight matrix (cell types × genes).
- n_epochs
Number of training epochs.
- batch_size
Size of each mini-batch.
- learning_rate
Initial learning rate.
- max_diff
Convergence threshold based on likelihood improvement ratio.
- improvement_threshold
Minimum improvement ratio between epochs.
- max_conv
Number of low-improvement epochs before stopping.
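One way to read how improvement_threshold and max_conv might interact is sketched below. This is an illustration under stated assumptions, not the package's code: `stop_epoch` is a hypothetical helper, the improvement ratio is taken to be the relative decrease of a loss between epochs, and low-improvement epochs are counted cumulatively (whether the package resets the counter after a good epoch is not stated here). The role of max_diff is left out because its exact definition is not given.

```r
# Hypothetical helper (not part of the package): returns the epoch at which
# training would stop, given one loss value per epoch.
stop_epoch <- function(losses, improvement_threshold = 1e-06, max_conv = 10) {
  low_count <- 0L
  for (epoch in seq_along(losses)[-1]) {
    prev <- losses[epoch - 1]
    # Relative improvement of the loss over the previous epoch
    # (denominator guarded against a zero loss).
    ratio <- (prev - losses[epoch]) / max(abs(prev), .Machine$double.eps)
    if (ratio < improvement_threshold) low_count <- low_count + 1L
    if (low_count >= max_conv) return(epoch)
  }
  length(losses)  # stopping rule never triggered: all epochs were run
}

# Toy loss curve: fast improvement for a few epochs, then a plateau.
losses <- c(100, 50, 25, 12, 6, rep(6, 20))
stopped_at <- stop_epoch(losses, max_conv = 3)
```

Under these assumptions, a larger max_conv tolerates a longer plateau before training ends, while n_epochs remains the hard upper bound.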
Details
This version uses pbmcapply::pbmclapply for parallelism. On Windows systems, please use run_spot_glm_windows.
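A call with simulated inputs might look like the following sketch. The dimensions follow the argument descriptions above; the simulated values, the dimension names, and the choice to normalize the rows of lambda to sum to 1 are assumptions made for illustration. The fit is only attempted on a Unix-like system where the function is available, and the structure of the returned object is not shown because it is not documented on this page.

```r
set.seed(42)
n_spots <- 300
n_ct <- 3
n_genes <- 20

# Covariate matrix (spots x covariates): intercept plus one covariate.
X <- cbind(intercept = 1, x1 = rnorm(n_spots))

# Deconvolution matrix (spots x cell types). Rows are normalized to sum to 1
# here, which is an assumption about how proportions are encoded.
lambda <- matrix(rexp(n_spots * n_ct), nrow = n_spots)
lambda <- lambda / rowSums(lambda)
colnames(lambda) <- paste0("ct", seq_len(n_ct))

# Response matrix (spots x responses), e.g. counts for 20 genes.
Y <- matrix(rpois(n_spots * n_genes, lambda = 5), nrow = n_spots)
colnames(Y) <- paste0("gene", seq_len(n_genes))

# Forking is unavailable on Windows, so guard on the OS and on the function
# actually being loaded before attempting the fit.
if (.Platform$OS.type == "unix" && exists("run_model_parallel_mac")) {
  fit <- run_model_parallel_mac(
    Y = Y, X = X, lambda = lambda,
    family = "spot poisson",
    num_cores = 2
  )
}
```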