This function computes regularized least squares estimates for latent factor mixed models using a lasso penalty.

lfmm_lasso(Y, X, K, nozero.prop = 0.01, lambda.num = 100,
  lambda.min.ratio = 0.01, lambda = NULL, it.max = 100,
  relative.err.epsilon = 1e-06)

Arguments

Y

a response variable matrix with n rows and p columns. Each column is a response variable (e.g., SNP genotype, gene expression level, beta-normalized methylation profile, etc.). Response variables must be encoded as numeric.

X

an explanatory variable matrix with n rows and d columns. Each column corresponds to a distinct explanatory variable (e.g., phenotype). Explanatory variables must be encoded as numeric.

K

an integer for the number of latent factors in the regression model.

nozero.prop

a numeric value for the expected proportion of non-zero effect sizes.

lambda.num

(obscure parameter) a numeric value for the number of 'lambda' values.

lambda.min.ratio

(obscure parameter) a numeric value for the smallest 'lambda' value, expressed as a fraction of 'lambda.max', the data-derived entry value (i.e., the smallest value for which all coefficients are zero).

lambda

(obscure parameter) smallest value of 'lambda', given as a fraction of 'lambda.max', the data-derived entry value (i.e., the smallest value for which all coefficients are zero). Defaults to NULL.

it.max

an integer value for the maximum number of iterations of the algorithm.

relative.err.epsilon

a numeric value for the relative convergence error, used to decide whether the algorithm has converged.
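The interplay of lambda.num and lambda.min.ratio can be sketched as follows. This is only an illustration, assuming the 'lambda' values are log-spaced between 'lambda.max' and lambda.min.ratio * 'lambda.max'; the helper lambda_grid is hypothetical and not part of the lfmm package. The assumption is consistent with the example output on this page, where consecutive 'lambda' values differ by a constant factor of about 0.9545, which equals 0.01^(1/99) for the default settings.

```r
# Hypothetical helper, NOT part of the lfmm package.
# Assumption: the lambda values are log-spaced (constant ratio between
# consecutive values), running from lambda.max down to
# lambda.max * lambda.min.ratio.
lambda_grid <- function(lambda.max, lambda.num = 100, lambda.min.ratio = 0.01) {
  exp(seq(log(lambda.max),
          log(lambda.max * lambda.min.ratio),
          length.out = lambda.num))
}

grid <- lambda_grid(lambda.max = 1)

# Ratio between two consecutive values of the grid
step_ratio <- grid[2] / grid[1]
```

With the defaults, step_ratio is 0.01^(1/99), so each 'lambda' is roughly 4.5% smaller than the previous one.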

Value

an object of class lfmm with the following attributes:

  • U the latent variable score matrix with dimensions n x K,

  • V the latent variable axes matrix with dimensions p x K,

  • B the effect size matrix with dimensions p x d.
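The dimensions listed above can be checked with a small simulation. The sketch below uses simulated stand-ins for U, V and B rather than a fitted object, and assumes the usual latent factor mixed model decomposition, in which Y is approximated by U V^T + X B^T. All variable names are illustrative.

```r
set.seed(1)
n <- 20; p <- 50; d <- 1; K <- 2

# Simulated stand-ins for the components of a fitted model
U <- matrix(rnorm(n * K), n, K)   # latent scores, n x K
V <- matrix(rnorm(p * K), p, K)   # latent axes (loadings), p x K
B <- matrix(0, p, d)              # sparse effect sizes, p x d
B[c(3, 17, 42), 1] <- c(0.8, -1.2, 0.5)
X <- matrix(rnorm(n * d), n, d)   # explanatory variable, n x d

# Assumed decomposition: both parts have the shape of Y (n x p)
latent_part <- U %*% t(V)
effect_part <- X %*% t(B)

# Proportion of non-zero effect sizes (the quantity targeted by nozero.prop)
nonzero_prop <- mean(B != 0)

# Response variables (rows of B) with at least one non-zero effect
selected <- which(rowSums(B != 0) > 0)
```

On a real fit, the same two summaries could be computed from the B component of the returned object.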

Details

The algorithm minimizes the following penalized least-squares criterion

The response variable matrix Y and the explanatory variable matrix X are centered.
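The criterion itself is not reproduced in this text. As a rough sketch only, the code below evaluates a generic lasso-penalized least-squares objective, assuming the form 1/2 ||Y - U V^T - X B^T||_F^2 + lambda ||B||_1 on centered Y and X. The function lasso_criterion is hypothetical, and the criterion actually minimized by the package may contain additional terms.

```r
# Hypothetical helper, NOT part of the lfmm package.
# Assumption: squared Frobenius loss plus an L1 penalty on B.
lasso_criterion <- function(Y, X, U, V, B, lambda) {
  # Center the columns of Y and X (no rescaling)
  Yc <- scale(Y, center = TRUE, scale = FALSE)
  Xc <- scale(X, center = TRUE, scale = FALSE)
  # Residual after removing the latent part and the effect of X
  residual <- Yc - U %*% t(V) - Xc %*% t(B)
  0.5 * sum(residual^2) + lambda * sum(abs(B))
}

# Tiny worked case: n = 2, p = 2, d = 1, K = 1, no latent structure
Y <- matrix(c(1, 3, 2, 4), nrow = 2)
X <- matrix(c(0, 2), nrow = 2)
U <- matrix(0, nrow = 2, ncol = 1)
V <- matrix(0, nrow = 2, ncol = 1)
B <- matrix(c(1, 0), nrow = 2)

value <- lasso_criterion(Y, X, U, V, B, lambda = 0.5)
```

Larger values of lambda weigh the L1 term more heavily, which pushes more entries of B to exactly zero.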

Examples

library(lfmm)

## a GWAS example with Y = SNPs and X = phenotype
data(example.data)
Y <- example.data$genotype
X <- example.data$phenotype

## Fit an LFMM with 6 factors
mod.lfmm <- lfmm_lasso(Y = Y, X = X, K = 6, nozero.prop = 0.01)
#> It = 1/100, err2 = 0.163976365842701
#> It = 2/100, err2 = 0.150909974059675
#> It = 3/100, err2 = 0.150916204324369
#> === lambda = 0.140473176172642, no zero B proportion = 0.000445384701035519
#> It = 1/100, err2 = 0.150916343641328
#> It = 2/100, err2 = 0.150915474257531
#> === lambda = 0.134088453517981, no zero B proportion = 0.00103923096908288
#> It = 1/100, err2 = 0.150915412709902
#> It = 2/100, err2 = 0.150913975701531
#> === lambda = 0.127993926361761, no zero B proportion = 0.00196711576290688
#> It = 1/100, err2 = 0.150913871462831
#> It = 2/100, err2 = 0.150911733598199
#> It = 3/100, err2 = 0.150911574423686
#> === lambda = 0.122176404870708, no zero B proportion = 0.00300634673198976
#> It = 1/100, err2 = 0.150911562286119
#> It = 2/100, err2 = 0.150908693359626
#> It = 3/100, err2 = 0.150908471900048
#> === lambda = 0.116623298709825, no zero B proportion = 0.00456519318561407
#> It = 1/100, err2 = 0.150908454358739
#> It = 2/100, err2 = 0.150904858480186
#> It = 3/100, err2 = 0.150904571937453
#> === lambda = 0.111322589794276, no zero B proportion = 0.00627250120625023
#> It = 1/100, err2 = 0.15090454848927
#> It = 2/100, err2 = 0.150900036019481
#> It = 3/100, err2 = 0.150899664398151
#> === lambda = 0.106262806279724, no zero B proportion = 0.00868500167019263
#> It = 1/100, err2 = 0.150899632925777
#> It = 2/100, err2 = 0.150894188051445
#> It = 3/100, err2 = 0.150893724548089
#> === lambda = 0.101432997734866, no zero B proportion = 0.0117284637939353
## Perform association testing using the fitted model:
pv <- lfmm_test(Y = Y, X = X, lfmm = mod.lfmm, calibrate = "gif")

## Manhattan plot with causal loci shown
pvalues <- pv$calibrated.pvalue
plot(-log10(pvalues), pch = 19, cex = .2, col = "grey", xlab = "SNP")
points(example.data$causal.set, -log10(pvalues)[example.data$causal.set], type = "h", col = "blue")

## An EWAS example with Y = methylation data
## and X = exposure
Y <- scale(skin.exposure$beta.value)
X <- scale(as.numeric(skin.exposure$exposure))

## Fit an LFMM with 2 latent factors
mod.lfmm <- lfmm_lasso(Y = Y, X = X, K = 2, nozero.prop = 0.01)
#> It = 1/100, err2 = 0.987179487179487
#> It = 2/100, err2 = 0.74927396086806
#> It = 3/100, err2 = 0.749307386150913
#> === lambda = 0.655001715942563, no zero B proportion = 0.00935828877005348
#> It = 1/100, err2 = 0.749307558044645
#> It = 2/100, err2 = 0.748933355127548
#> It = 3/100, err2 = 0.748931596812741
#> === lambda = 0.625230877063826, no zero B proportion = 0.0113636363636364
## Perform association testing using the fitted model:
pv <- lfmm_test(Y = Y, X = X, lfmm = mod.lfmm, calibrate = "gif")

## Manhattan plot with true associations shown
pvalues <- pv$calibrated.pvalue
plot(-log10(pvalues), pch = 19, cex = .3, xlab = "Probe", col = "grey")
causal.set <- seq(11, 1496, by = 80)
points(causal.set, -log10(pvalues)[causal.set], col = "blue")