Package 'LOCOM2'

Title: A Logistic Regression Model for Testing Differential Abundance in Compositional Microbiome Data
Description: Testing differential abundance at individual taxa and in a whole microbial community. The tests are based on log ratios of relative abundances. The tests accommodate continuous, discrete (binary or categorical), and multivariate traits, and allow adjustment for confounders.
Authors: Yi-Juan Hu [aut, cre]
Maintainer: Yi-Juan Hu <[email protected]>
License: GPL (>= 2)
Version: 1.0
Built: 2026-06-04 06:55:53 UTC
Source: https://github.com/yijuanhu/locom2


A logistic regression model for testing differential abundance in compositional microbiome data (LOCOM2)

Description

This function tests (1) whether any OTU (or taxon) is associated with the trait of interest, with FDR control, based on log ratios of relative abundances between pairs of taxa; and (2) whether the whole community is associated with the trait (a global test), based on the harmonic mean method for combining individual p-values. The tests accommodate continuous, discrete (binary or categorical), and multivariate traits, and allow adjustment for confounders.

Usage

locom2(
  otu.table,
  Y,
  C = NULL,
  fdr.nominal = 0.1,
  filter = TRUE,
  permute = TRUE,
  n.perm.max = 1000,
  n.rej.stop = 100,
  n.cores = 1,
  seed = 123,
  verbose = TRUE,
  Firth.thresh = 0.4
)

Arguments

otu.table

The OTU table (or taxa count table), where rows correspond to samples and columns correspond to OTUs (taxa).

Y

The trait of interest, supplied as a vector, matrix, or data frame. It must be numeric; for example, a factor should be represented by its corresponding design matrix. When Y is a matrix or data frame, all of its components are tested jointly for microbial association.
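As an illustration of the design-matrix requirement, the sketch below converts a three-level factor into numeric 0/1 columns with base R's model.matrix(). The variable smoking and its levels are made up for this example and are not part of the package data. Dropping the intercept column is also an assumption made here, not something this help page states.

```r
# Hypothetical three-level trait (not from the package data)
smoking <- factor(c("Never", "Former", "Current", "Never", "Current"),
                  levels = c("Never", "Former", "Current"))

# Build the design matrix, then drop the intercept column (an assumption),
# leaving one numeric 0/1 column per non-reference level
Y <- model.matrix(~ smoking)[, -1, drop = FALSE]

# Y is a numeric matrix with columns "smokingFormer" and "smokingCurrent"
colnames(Y)
is.numeric(Y)
```

Because Y then has two columns, both would be tested jointly, as described above for matrix-valued traits.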

C

The additional (confounding) covariates to be adjusted for. The same format requirements as for Y apply. The default is NULL.

fdr.nominal

The nominal FDR level, with a default of 0.1.

filter

A logical value indicating whether to filter out rare taxa. The default is TRUE, using a filtering threshold of min(0.1*n.sam, 10).
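The threshold itself is simple to compute. The sketch below evaluates min(0.1*n.sam, 10) for a made-up table with 30 samples and then applies it under one possible reading: that a taxon is kept when it is present (count > 0) in at least that many samples. This reading, including the choice of ">=" rather than ">", is an assumption for illustration and may differ from what locom2 does internally.

```r
# Toy count table: 30 samples (rows) x 3 taxa (columns); values are made up
otu <- cbind(taxonA = rep(5, 30),                  # present in 30 samples
             taxonB = c(rep(1, 2), rep(0, 28)),    # present in 2 samples
             taxonC = c(rep(2, 5), rep(0, 25)))    # present in 5 samples

n.sam <- nrow(otu)
thresh <- min(0.1 * n.sam, 10)   # threshold from the help text; about 3 here

# ASSUMPTION: "rare" means present in fewer than `thresh` samples
n.present <- colSums(otu > 0)
otu.filtered <- otu[, n.present >= thresh, drop = FALSE]

colnames(otu.filtered)   # taxonB is dropped under this reading
```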

permute

A logical value indicating whether to perform permutation. The default is TRUE.

n.perm.max

The maximum number of permutations. The default is 1,000, which is the number used for the Wald-type test. When n.perm.max is set to NULL, the full permutation procedure as in LOCOM is performed. In this case, the total number of permutations is determined as n.otu * n.rej.stop * (1/fdr.nominal), where n.otu is the number of OTUs (that have non-zero counts in at least one sample). The full permutation procedure adopts a sequential stopping criterion (similar to Sandve et al. 2011), which stops when every taxon-level test has either reached the prespecified number of rejections (n.rej.stop, default 100) or yielded a q-value (by the Benjamini-Hochberg [BH] procedure) below the nominal FDR level (fdr.nominal, default 0.1).
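To give a sense of scale, the following sketch evaluates the formula n.otu * n.rej.stop * (1/fdr.nominal) with the default n.rej.stop and fdr.nominal. The value n.otu = 856 is used purely for illustration; the number of OTUs actually counted in a given analysis may differ.

```r
n.otu <- 856          # illustrative number of OTUs (an assumption)
n.rej.stop <- 100     # default
fdr.nominal <- 0.1    # default

# Total number of permutations when n.perm.max = NULL
n.perm.total <- n.otu * n.rej.stop * (1 / fdr.nominal)
n.perm.total   # 856000
```

This total is several hundred times the default of 1,000 replicates, which is why the full procedure relies on sequential stopping.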

n.rej.stop

The minimum number of rejections (i.e., instances where the permutation test statistic exceeds the observed test statistic) required before stopping the permutation procedure. The default is 100.

n.cores

The number of cores to be used for parallel computing. The default is 1.

seed

A user-supplied integer seed for the random number generator in the permutation procedure. The default shown in Usage is 123. If seed is set to NULL, an integer seed is generated internally at random. In either case, the seed is stored in the output object to enable reproducibility of the permutation replicates.
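The sketch below shows, with base R only, why storing the seed makes permutation replicates reproducible. The helper draw_perms is hypothetical and uses set.seed() and sample(); it is not a description of how locom2 generates its permutations internally.

```r
# Hypothetical helper: draw n.perm permutations of n.sam sample indices
draw_perms <- function(seed, n.sam = 10, n.perm = 3) {
  set.seed(seed)
  replicate(n.perm, sample(n.sam))   # one permutation per column
}

perms1 <- draw_perms(123)
perms2 <- draw_perms(123)   # same seed, so the same permutations

identical(perms1, perms2)   # TRUE
```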

verbose

A logical value indicating whether to produce verbose output during the permutation process. The default is TRUE.

Firth.thresh

The threshold (between 0 and 1) of taxon prevalence for applying the Firth correction. The default is 0.4.

Details

This function extends LOCOM (Hu et al., 2022, PNAS) in the following ways:

  • accommodating both relative abundance and read count data for OTUs;

  • refining the weighting scheme in LOCOM to eliminate confounding by library size;

  • incorporating a series of adjustments to ensure stable and reliable inference, even under extreme conditions such as rare taxa and highly unbalanced case–control designs;

  • replacing the computationally intensive permutation procedure with a Wald-type test (using a fixed 1,000 permutation replicates).

Value

A list consisting of

  • p.otu.Wald - Wald p-values for OTU-specific tests

  • q.otu.Wald - Wald q-values (adjusted p-values by BH) for OTU-specific tests

  • detected.otu.Wald - OTUs detected by the Wald test at the nominal FDR level

  • p.otu.perm - permutation p-values for OTU-specific tests

  • q.otu.perm - permutation q-values (adjusted p-values by BH) for OTU-specific tests

  • detected.otu.perm - OTUs detected by the permutation test at the nominal FDR level

  • p.otu.asymptotic - asymptotic p-values for OTU-specific tests

  • q.otu.asymptotic - asymptotic q-values (adjusted p-values by BH) for OTU-specific tests

  • detected.otu.asymptotic - OTUs detected by the asymptotic test at the nominal FDR level

  • beta - effect size at each OTU, defined as beta_j - median (beta_j'), after Yeo–Johnson transformation if the Wald test is used

  • beta.var - estimated variance for each beta

  • ref.otu - reference OTU

  • p.global - p-value for the global test (not available in the asymptotic version). The global test is based on the harmonic mean of individual p-values, using all permutation replicates generated up to the point when the procedure terminates.

  • n.perm.completed - number of permutations completed

  • seed - the seed used to generate the permutation replicates
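
As a rough guide to how the q-value, detection, and effect-size entries relate to one another, the sketch below applies the BH adjustment with base R's p.adjust() and centers a set of coefficients at their median. All numbers and OTU names are made up. Using "<=" for the detection cutoff is an assumption, and the Yeo–Johnson transformation mentioned for beta is omitted.

```r
# Made-up p-values and raw coefficients for five taxa
p.otu    <- c(otu1 = 0.001, otu2 = 0.20, otu3 = 0.03, otu4 = 0.75, otu5 = 0.004)
beta.raw <- c(otu1 = 1.2,   otu2 = 0.1,  otu3 = -0.8, otu4 = 0.0,  otu5 = 0.9)
fdr.nominal <- 0.1

# q-values: BH-adjusted p-values
q.otu <- p.adjust(p.otu, method = "BH")

# Detected taxa: q-value at or below the nominal FDR level (cutoff is assumed)
detected.otu <- names(q.otu)[q.otu <= fdr.nominal]
detected.otu   # "otu1" "otu3" "otu5"

# Effect size relative to the median coefficient: beta_j - median(beta_j')
beta.centered <- beta.raw - median(beta.raw)
```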

Examples

data("throat.otu.table.filter")
data("throat.meta.filter")
data("throat.otu.taxonomy")

Y <- ifelse(throat.meta.filter$SmokingStatus == "NonSmoker", 0, 1)
C <- ifelse(throat.meta.filter$Sex == "Male", 0, 1)

##################
# running LOCOM2
##################

## LOCOM2 (Wald test): the recommended analysis; setting n.cores = 4 is suggested to speed it up

res.Wald <- locom2(otu.table = throat.otu.table.filter, Y = Y, C = C, seed = 123)
res.Wald$detected.otu.Wald
res.Wald$p.otu.Wald[res.Wald$detected.otu.Wald]

Metadata of the throat microbiome samples

Description

This data set includes samples from the microbiome of the nasopharynx and oropharynx on each side of the body. It was generated to study the effect of smoking on the microbiota of the upper respiratory tract in 57 individuals, after filtering out three individuals with antibiotic use.

Usage

data("throat.meta.filter")

Format

A data frame with 57 observations on 16 variables.

Source

Charlson ES, Chen J, Custers-Allen R, Bittinger K, Li H, et al. (2010) Disordered Microbial Communities in the Upper Respiratory Tract of Cigarette Smokers. PLoS ONE 5(12): e15216.

References

R package "GUniFrac"

Examples

data(throat.meta.filter)

OTU count table from 16S sequencing of the throat microbiome samples

Description

This data set contains 57 subjects, after filtering out three subjects with antibiotic use. Microbiome data were collected from the right and left nasopharynx and oropharynx regions to form an OTU table with 856 OTUs.

Usage

data("throat.otu.table.filter")

Format

A data frame with 57 observations on 856 variables.

Source

Charlson ES, Chen J, Custers-Allen R, Bittinger K, Li H, et al. (2010) Disordered Microbial Communities in the Upper Respiratory Tract of Cigarette Smokers. PLoS ONE 5(12): e15216.

References

R package "GUniFrac"

Examples

data(throat.otu.table.filter)

Taxonomy names for OTUs from 16S sequencing of the throat microbiome samples

Description

This file contains 5683 taxonomy names.

Usage

data("throat.otu.taxonomy")

Format

A vector with 5683 taxonomy names

Source

Charlson ES, Chen J, Custers-Allen R, Bittinger K, Li H, et al. (2010) Disordered Microbial Communities in the Upper Respiratory Tract of Cigarette Smokers. PLoS ONE 5(12): e15216.

References

R package "GUniFrac"

Examples

data(throat.otu.taxonomy)