Fits a generalized linear model (GLM) with spatial deconvolution for a single response variable (e.g., gene expression), supporting Poisson, Gaussian, Binomial, and Negative Binomial families. This function handles coefficient initialization, model fitting via mini-batch gradient descent, and automatic coefficient filtering for weak covariates or poorly represented cell types.
Usage
run_model(
  y,
  X,
  lambda,
  family = "spot gaussian",
  beta_0 = NULL,
  fix_coef = NULL,
  offset = rep(0, length(y)),
  initialization = T,
  CT = NULL,
  weights = rep(1, length(y)),
  ct_cov_weights = rep(1, ncol(lambda)),
  n_epochs = 100,
  batch_size = 500,
  learning_rate = 1,
  max_diff = 1 - 1e-06,
  improvement_threshold = 1e-06,
  max_conv = 10
)
Arguments
- y
Numeric response vector (e.g., gene expression for one gene across spots).
- X
Covariate matrix (spots × covariates).
- lambda
Deconvolution matrix (spots × cell types).
- family
GLM family: "spot gaussian", "spot poisson", "spot negative binomial", or "spot binomial".
- beta_0
Optional initial coefficient matrix (covariates × cell types).
- fix_coef
Logical matrix (covariates × cell types) indicating coefficients to fix during optimization.
- offset
Optional numeric vector (same length as y), used for Poisson or NB normalization.
- initialization
Logical; whether to initialize the coefficients via single-cell approximation. Defaults to TRUE.
- CT
Optional vector of dominant cell type labels per spot.
- weights
Observation weights (same length as y).
- ct_cov_weights
Optional vector of cell-type–specific weights (length = number of cell types).
- n_epochs
Number of training epochs for gradient descent.
- batch_size
Size of mini-batches used during gradient descent.
- learning_rate
Initial learning rate for optimization.
- max_diff
Convergence threshold based on likelihood ratio.
- improvement_threshold
Minimum required improvement in likelihood ratio between epochs.
- max_conv
Number of consecutive low-improvement epochs before convergence is assumed.
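The dimension requirements in the argument list can be checked on simulated inputs before fitting. The following is a minimal sketch, not taken from the package: the simulated data, the row-normalized lambda, the log library-size offset, and the way CT is derived from lambda are all illustrative assumptions. The call to run_model() is guarded so that the block also runs when the package is not loaded.

```r
set.seed(1)
n_spots <- 200   # number of spots (hypothetical)
n_ct    <- 3     # number of cell types (hypothetical)

# Covariate matrix (spots x covariates): an intercept and one simulated covariate
X <- cbind(intercept = 1, covariate = rnorm(n_spots))

# Deconvolution matrix (spots x cell types).
# Assumption: non-negative proportions with each row summing to 1.
raw <- matrix(rexp(n_spots * n_ct), nrow = n_spots, ncol = n_ct)
lambda <- raw / rowSums(raw)
colnames(lambda) <- paste0("celltype", seq_len(n_ct))

# Simulated counts for one gene, with a log library-size offset.
# Assumption: the offset enters on the log scale for the Poisson family.
lib_size <- runif(n_spots, min = 500, max = 2000)
y <- rpois(n_spots, lambda = lib_size * 0.01)
offset <- log(lib_size)

# Dominant cell type per spot, taken here as the column with the largest proportion
CT <- colnames(lambda)[max.col(lambda, ties.method = "first")]

# Dimension checks implied by the argument descriptions
stopifnot(
  length(y) == nrow(X),
  nrow(X) == nrow(lambda),
  length(offset) == length(y),
  length(CT) == length(y)
)

# Only call run_model() if it is available in the current session
if (exists("run_model")) {
  fit <- run_model(
    y = y, X = X, lambda = lambda,
    family = "spot poisson",
    offset = offset, CT = CT
  )
}
```

All other arguments are left at the defaults shown in Usage.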
Value
A list containing:
- beta_estimate
Estimated coefficient matrix (covariates × cell types).
- standard_error_matrix
Standard error matrix for each coefficient.
- time
Elapsed fitting time (in seconds).
- disp
Estimated dispersion (for NB models).
- converged
Logical indicating if convergence was reached.
- likelihood
Final negative log-likelihood.
- vcov
Variance-covariance matrix.
- niter
Number of optimization epochs completed.
- fixed_coef
Final matrix indicating fixed coefficients.
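One common way to use beta_estimate together with standard_error_matrix is a coefficient-wise Wald test. The sketch below is illustrative only: the fit object is a hand-built stand-in for the returned list, the numbers are made up, and it assumes that fixed_coef is a logical matrix in which TRUE marks a fixed coefficient. The normal approximation is a standard asymptotic assumption and may not match whatever inference the package itself provides.

```r
# Hypothetical stand-in for the list returned by run_model(); values are made up
fit <- list(
  beta_estimate = matrix(
    c(1.0, 0.5, -0.2, 0.0), nrow = 2,
    dimnames = list(c("intercept", "covariate"), c("celltype1", "celltype2"))
  ),
  standard_error_matrix = matrix(c(0.1, 0.25, 0.1, 0.2), nrow = 2),
  # Assumption: TRUE marks a coefficient that was held fixed
  fixed_coef = matrix(c(FALSE, FALSE, FALSE, TRUE), nrow = 2)
)

# Wald z-scores: estimate divided by its standard error, element-wise
z <- fit$beta_estimate / fit$standard_error_matrix

# Two-sided p-values under a standard normal approximation
p <- 2 * pnorm(-abs(z))

# Fixed coefficients were not estimated, so do not report a test for them
z[fit$fixed_coef] <- NA
p[fit$fixed_coef] <- NA

print(round(z, 3))
print(signif(p, 3))
```

Both z and p keep the covariates × cell types layout of beta_estimate, so each entry can be read off by covariate and cell type.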