
Fits a generalized linear model (GLM) with spatial deconvolution for a single response variable (e.g., gene expression), supporting Poisson, Gaussian, Binomial, and Negative Binomial families. This function handles coefficient initialization, model fitting via mini-batch gradient descent, and automatic coefficient filtering for weak covariates or poorly represented cell types.

Usage

run_model(
  y,
  X,
  lambda,
  family = "spot gaussian",
  beta_0 = NULL,
  fix_coef = NULL,
  offset = rep(0, length(y)),
  initialization = T,
  CT = NULL,
  weights = rep(1, length(y)),
  ct_cov_weights = rep(1, ncol(lambda)),
  n_epochs = 100,
  batch_size = 500,
  learning_rate = 1,
  max_diff = 1 - 1e-06,
  improvement_threshold = 1e-06,
  max_conv = 10
)

Arguments

y

Numeric response vector (e.g., gene expression for one gene across spots).

X

Covariate matrix (spots × covariates).

lambda

Deconvolution matrix (spots × cell types).

family

GLM family: "spot gaussian", "spot poisson", "spot negative binomial", or "spot binomial".

beta_0

Optional initial coefficient matrix (covariates × cell types).

fix_coef

Logical matrix (covariates × cell types) indicating coefficients to fix during optimization.

offset

Optional numeric offset vector (same length as y), used for normalization in Poisson or Negative Binomial models.

initialization

Logical; whether to initialize the coefficients via a single-cell approximation. Default: TRUE.

CT

Optional vector of dominant cell type labels per spot.

weights

Observation weights (same length as y).

ct_cov_weights

Optional vector of cell-type–specific weights (length = number of cell types).

n_epochs

Number of training epochs for gradient descent.

batch_size

Size of mini-batches used during gradient descent.

learning_rate

Initial learning rate for optimization.

max_diff

Convergence threshold based on likelihood ratio.

improvement_threshold

Minimum required improvement in likelihood ratio between epochs.

max_conv

Number of consecutive low-improvement epochs before convergence is assumed.
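
The shapes described above can be checked on simulated data before fitting. The following is a minimal sketch, not taken from the package: the simulated values, the row-normalized lambda, the log library-size offset, and the max-proportion rule for CT are all illustrative assumptions. The call to run_model() is guarded so the block runs even when the function is not loaded.

```r
set.seed(1)

n_spots <- 200  # hypothetical number of spots
n_ct    <- 3    # hypothetical number of cell types

# Covariate matrix (spots x covariates): an intercept plus one binary covariate
X <- cbind(intercept = 1, condition = rbinom(n_spots, 1, 0.5))

# Deconvolution matrix (spots x cell types).
# Assumption: rows are cell-type proportions that sum to 1.
lambda <- matrix(rgamma(n_spots * n_ct, shape = 1), n_spots, n_ct,
                 dimnames = list(NULL, paste0("ct", seq_len(n_ct))))
lambda <- lambda / rowSums(lambda)

# Simulated count response and an offset on the log scale.
# Assumption: log library size, the usual convention for count GLMs.
lib_size <- round(runif(n_spots, 1e3, 1e4))
y <- rpois(n_spots, lambda = lib_size * 1e-3)
offset <- log(lib_size)

# Logical matrix (covariates x cell types); set an entry to TRUE
# to mark that coefficient as fixed during optimization.
fix_coef <- matrix(FALSE, nrow = ncol(X), ncol = ncol(lambda),
                   dimnames = list(colnames(X), colnames(lambda)))

# Dominant cell type per spot: the column with the largest proportion
# (one possible definition, assumed here).
CT <- colnames(lambda)[max.col(lambda, ties.method = "first")]

# Fit only if run_model() is available in the current session.
if (exists("run_model")) {
  fit <- run_model(y = y, X = X, lambda = lambda,
                   family = "spot poisson", offset = offset)
}
```

fix_coef and CT can be passed to run_model() in the same way once their contents reflect the analysis at hand.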

Value

A list containing:

beta_estimate

Estimated coefficient matrix (covariates × cell types).

standard_error_matrix

Standard error matrix for each coefficient.

time

Elapsed fitting time (in seconds).

disp

Estimated dispersion (for Negative Binomial models).

converged

Logical indicating whether convergence was reached.

likelihood

Final negative log-likelihood.

vcov

Variance-covariance matrix.

niter

Number of optimization epochs completed.

fixed_coef

Final matrix indicating fixed coefficients.
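
One common way to use beta_estimate and standard_error_matrix together is a per-coefficient Wald statistic. The sketch below works on a mock list that only mimics the documented fields. The numbers are placeholders, and both the assumption that fixed_coef is a logical matrix and the choice of a Wald test are illustrative rather than something this page prescribes.

```r
# Mock result carrying the documented fields (placeholder values only)
fit <- list(
  beta_estimate = matrix(c(0.5, 1.2, -0.3, 0.0), nrow = 2, ncol = 2,
                         dimnames = list(c("intercept", "condition"),
                                         c("ct1", "ct2"))),
  standard_error_matrix = matrix(c(0.1, 0.4, 0.15, NA), nrow = 2, ncol = 2),
  fixed_coef = matrix(c(FALSE, FALSE, FALSE, TRUE), nrow = 2, ncol = 2),
  converged = TRUE
)

# Wald z-scores: estimate divided by its standard error, element-wise
z <- fit$beta_estimate / fit$standard_error_matrix

# Fixed coefficients were not estimated, so mask them out
z[fit$fixed_coef] <- NA

# Two-sided p-values under a standard normal reference distribution
p <- 2 * pnorm(-abs(z))

# Only interpret the tables when the fit reports convergence
if (isTRUE(fit$converged)) {
  print(round(z, 2))
  print(signif(p, 3))
}
```

With a real fit, checking converged and niter first helps avoid interpreting coefficients from a run that stopped at n_epochs without converging.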