Main Dependencies
I am following the linux/mac instructions. See the microeco tutorial for windows-specific code.
# If devtools package is not installed, first install it
install.packages("devtools")
knitr::opts_chunk$set(message = FALSE,
warning = FALSE,
echo = TRUE,
include = TRUE,
eval = FALSE,
comment = "")
library(tidyverse)
library(conflicted)
library(bslib)
library(htmltools)
Microbiome data analysis has rapidly evolved into a cornerstone of biological and ecological research, offering insights into how microbial communities influence everything from human health to environmental ecosystems. However, this type of analysis often involves multiple complex steps: data normalization, diversity calculations, community composition comparisons, and advanced visualizations.
For more information on some of the statistical tests I often use/recommend, see the tutorial in this directory called Data_Notes.
The microeco R package provides an elegant and comprehensive solution by integrating many of the most current and popular microbiome analysis approaches into a unified framework. This package simplifies workflows, making it easy to prepare datasets, calculate metrics, and create publication-quality visualizations. Importantly, microeco is designed to work seamlessly with ggplot2 and other widely used R packages, offering flexibility for customization and compatibility with established workflows. If you click the link above, you will find a very comprehensive tutorial presenting the full array of analysis options.
In this workflow, we will prepare our local hard drive and working directory for use with microeco by properly installing the package and all its dependencies.
MicroEcoI am following the linux/mac instructions. See the microeco tutorial for windows-specific code.
# If devtools package is not installed, first install it
install.packages("devtools")
url <- "https://github.com/ChiLiubio/microeco_dependence/releases/download/v0.20.0/microeco_dependence.zip"
options(timeout = 2000)
download.file(url = url, destfile = "microeco_dependence.zip")
tmp <- "microeco_dependence"
unzip(paste0(tmp, ".zip"))
devtools::install_local("microeco_dependence/SpiecEasi-master.zip", dependencies = TRUE)
xcode-select --install
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install gcc
3.a. Make sure ~/.R exists
ls -a | grep '\.R'
mkdir .R
cd .R
cd ~
brew --prefix gcc
ls /opt/homebrew/opt/gcc/lib/gcc
nano Makevars
FC = /opt/homebrew/bin/gfortran-15
F77 = /opt/homebrew/bin/gfortran-15
FLIBS = -L/opt/homebrew/lib/gcc/15 -lgfortran -lquadmath -lm
Restart R (or RStudio) so it picks up your updated Makevars.
Re‐run the install code at the beginning.
devtools::install_local("microeco_dependence/mixedCCA-master.zip", dependencies = TRUE)
devtools::install_local("microeco_dependence/SPRING-master.zip", dependencies = TRUE)
devtools::install_local("microeco_dependence/NetCoMi-main.zip", repos = BiocManager::repositories())
devtools::install_local("microeco_dependence/beem-static-master.zip", dependencies = TRUE)
devtools::install_local("microeco_dependence/chorddiag-master.zip", dependencies = TRUE)
devtools::install_local("microeco_dependence/ggradar-master.zip", dependencies = TRUE)
devtools::install_local("microeco_dependence/ggnested-main.zip", dependencies = TRUE)
devtools::install_local("microeco_dependence/ggcor-1-master.zip", dependencies = TRUE)
Microeco leverages Tax4Fun2 to predict the functional potential of microbial communities based on 16S rRNA gene data on the KEGG database. Tax4Fun translates taxonomic profiles into functional profiles by mapping taxa to KEGG Orthologs (KOs) using pre-trained databases. This integration allows researchers to estimate the metabolic capabilities of microbial communities, such as energy production, nutrient cycling, and ecological interactions.
I will provide a series of steps below to set this up for your first run, but keep in mind that this sometimes works differently based on operating systems. I also recommend you set up your directory structure to follow mine if you are working from a clone of the github repository. Unlike other dependencies, I do not load these to the repo, as they take up too much unecessary space.
I also recommend you check on the instructions in the microeco tutorial, where they provide alternative methods for these steps.
Why Tax4Fun2? Tax4Fun2 is efficient and user-friendly, making it ideal for functional predictions when shotgun metagenomics data isn’t available. However, setting up Tax4Fun2 and its dependencies (e.g., reference databases) can be challenging but crucial for reliable results.
tar -xvzf Tax4Fun2_ReferenceData_v2.tar.gz
This code only works for newer MacOS systems
Run the following code in your terminal from the same directory as the last step:
curl -O ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.16.0+-aarch64-macosx.tar.gz
tar -xvzf ncbi-blast-2.16.0+-aarch64-macosx.tar.gz
chmod 755 ncbi-blast-2.16.0+/bin/*
chmod -R 755 Tax4Fun2_ReferenceData_v2/*
If you are on a mac and still get an error about executing blastn after running the Tax4Fun2 commands, then you may still need to go to your security preferences and select “Open anyway” where it says something about blocking Blastn.
You can also run the chunk below from the R console to make sure R is able to read and write to the directory.
dir_path <- "setup/microbiome/Tax4Fun2_ReferenceData_v2/Ref99NR/"
if (file.access(dir_path, mode = 4) == 0) {
print("Directory is readable.")
} else {
print("Directory is not accessible.")
}
MicroEco’s tutorial excerpt above guided us through installation of several dependencies, including BEEM-static, but to work with true longitudinal data, we want to use the original BEEM package instead. This is where microeco’s mecodev package comes in handy. Mecodev provides a set of extended classes based on the microeco package, including trans_ts, a class designed for handling time-series data.
devtools::install_github("ChiLiubio/mecodev")
# For linux or mac
install.packages("doMC")
# Then install the following packages
install.packages("lokern")
install.packages("monomvn")
install.packages("pspline")
install.packages("seqinr", dependencies = TRUE)
devtools::install_github('csb5/beem')
install.packages("file2meco", repos = BiocManager::repositories())
install.packages("MicrobiomeStat", repos = BiocManager::repositories())
install.packages("WGCNA", repos = BiocManager::repositories())
BiocManager::install("ggtree")
BiocManager::install("metagenomeSeq")
BiocManager::install("ALDEx2")
BiocManager::install("ANCOMBC")
mecodev Package# Then install the following packages
install.packages("lokern")
install.packages("monomvn")
install.packages("pspline")
mecoturn Packageinstall.packages("mecoturn")
# check and install dependent packages
packages <- c("agricolae")
for(x in packages){
if(!require(x, character.only = TRUE)){
install.packages(x)
}
}