This repository contains the scripts, tutorials, and templates for the Rich Lab’s mainstream bioinformatic workflows. You can find some of the current recommended tutorial versions of these workflows linked below.
metadata_setup
The purpose of a metadata file is to organize the potential
independent or predictor variables for your analysis into a single table
with one row per SampleID.
Then, when you produce a set of
potential dependent or outcome values, you organize those into a similar
structure with the SampleIDs
organized rowwise to
streamline the process of matching predictor variables to outcome
variables by SampleID.
It’s good practice to keep as much of your information in one tidy table as possible so that you can keep pulling from that source to further filter, wrangle and analyze without losing track of different versions and datasets over time.
That means you should brainstorm as many possible predictor variables you might use in downstream analysis as possible and organize them into one tidy table where each SampleID is matched to a value for every variable. You will end up ignoring most of these variables as you construct individual tests and visuals later, so consider this simply a rough draft of your information of interest.
This tutorial uses the Pygmy Loris dataset as an example. You may have a very different set of variables to organize for your own project.
compilation_loris.tsv
- produced
by the sample_inventory
scriptsamples_metadata.tsv
- This file gets carried forward
on several other scripts in this repository to help you build consistent
predictor variables across datasets.microbiome_alignments.html
wf-16s
pipeline and convert it into fully
formatted, tidy tables that you can normalize and analyze using
microeco
, phyloseq
or a similar package.samples_metadata.tsv
- produced
by the metadata_setup
script*_abundance_table_species.tsv
- produced
by the wf-16s
pipeline in
readprocessing_multiplex_16s
*_wf-16s-report.html
- produced
by the wf-16s
pipeline in
readprocessing_multiplex_16s
.csv
file per demultiplexed sample from this
.html
file for a given sequencing run and merge those into
a table with raw alignment datataxonomy_table.tsv
- This table is required for
microeco
and several other metagenomic packages. It must
contain exactly one row per taxon in your dataset.otu_table.tsv
- This table is required for
microeco
and several other metagenomic packages. It must
contain exactly one row per taxon and one column per sample in your
dataset. The numeric values represent the raw count of reads mapped to a
given taxon within a given sample. We will want to normalize these
before analyzing/interpreting them.sample_table.tsv
- This table is required for
microeco
and several other metagenomic packages. It must
contain exactly one row per sample in your dataset. The columns contain
metadata values that we may use as independent/predictor variables for
interpreting our results.microbiome_references.html
alignments.tsv
- *raw alignment data gathered from the
wf-16s output and organized into one table in the previous
microbiome_alignments
script*refs_aligned_mafft.fasta
- This contains one GenBank
reference sequence for each taxon in your dataset that we can use to
substitute for ASVs used in many approaches designed for short read 16S
data.refs_tree.treefile
- This is a phylogenetic tree
produced from the previous fasta multiple sequence alignment.microbiome_preprocess.html
refs_aligned_mafft.fasta
- This contains one GenBank
reference sequence for each taxon in your dataset that we can use to
substitute for ASVs used in many approaches designed for short read 16S
data.refs_tree.treefile
- This is a phylogenetic tree
produced from the previous fasta multiple sequence alignment.phyloseq_genus.RData
- Microbiome dataset stored as a
phylo object with counts summed and trees collapsed to the Genus
level.tree_genus.RData
- Only the phylogenetic tree stored
in the phylo object.
phyloseq_melt_genus.RData
- Long tibble
version of the phylo object dataset.
I created a secondary repository called workflows_first_use
.
This is where I will store some scripts/tutorials with specific
instructions for setting up your working directory and environment for
specialized packages (especially for microeco
). You can
find the tutorial version of those scripts under the First Use
option in the menu bar above.
params
option in the R Markdown HeaderHere is what my default yaml
header on most R Markdown
documents in this repository look like:
--- output: html_document: theme: bslib: true css: journal.css toc: true toc_float: true df_print: paged params: sampleset: "loris" seqrun: "hdz18" ---
I use the params
option for streamlining some automation
across different scripts. You can use the sampleset setting under params
in the header of this script to select which sampleset you will be
working with. Everywhere in a chunk of code where I have written
params$sampleset
will be replaced with whichever strong you
change your params value to in the header.
I also created a simple script containing lists of samplesets, subjects, and sequencing runs, to help with iterative programming functions.
subjects <- list( marmoset = list( HAM = "Hamlet", HER = "Hera", JAR = "JarJar BINKS", OPH = "Ophelia", KUB = "Kubo", KOR = "Korra", WOL = "Wolverine", IRI = "Iris", GOO = "Goose", LAM = "Lambchop", FRA = "Franc", IVY = "Ivy", CHA = "Charles", PAD = "Padme", BUB = "Bubblegum", GRO = "Grogu", MAR = "Marshmallow", BUD = "Buddy", JOA = "Joans", HEN = "Henry", GIN = "Ginger" ), loris = list( WARB = "Warble", CULI = "Culi" ), bats = list( UNKN = "Unknown", EPFU = "EPFU", MYLE = "MYLE", LABO = "LABO", MYCI = "MYCI", LANO = "LANO", NYHU = "NYHU" ), envir = list( UNK = "Unknown" ), isolates = list( UNK = "Unknown" ) ) seqruns <- list( loris = as.list(c(paste0("hdz", 1:19))), marmoset = as.list(sprintf("cm%03d", 1:10)), isolates = as.list(paste0("salci", 1)), bats = as.list(paste0("nwr", 1)) )
config
PackageI also use config
for streamlining and organization. If
you want to do the same, you should create your own
config.yml
file in your project working directory. You can
expand the content below to see what my example config.yml
file looks like as well as the R script I use to integrate this with the
params
values for consistent file sourcing and writing
across datasets.
default: setup: "setup/global/setup.R" conflicts: "setup/conflicted.R" functions: "setup/global/functions.R" packages: "setup/global/packages.R" inputs: "setup/default_inputs.R" fonts: "setup/global/fonts/FA6_Free-Solid-900.otf" test_prediction: "setup/microbiome/test_prediction/" visuals: "visuals/" tmp_tsv: "tmp/tmp_table.tsv" tmp_downloads: "tmp/downloads/" tmp_fetch: "tmp/fetch_references.txt" tmp_fasta3: "tmp/tmp3.fasta" tmp_fasta4: "tmp/tmp4.fasta" git_url: "https://rich-molecular-health-lab.github.io/" isolates: samplesets: "salci" minQC: 10 microbiome: micro_scripts: !expr list("setup/microbiome/functions.R", "setup/microbiome/inputs.R") setup_dir: "setup/microbiome" inputs: "setup/microbiome/inputs.R" packages: "setup/microbiome/packages.R" first_use_packages: "setup/microbiome/packages_first_use.R" functions: "setup/microbiome/functions.R" tax4fun: "setup/microbiome/Tax4Fun2_ReferenceData_v2/" Ref99NR: "setup/microbiome/Tax4Fun2_ReferenceData_v2/Ref99NR/" blast: "setup/microbiome/ncbi-blast-2.16.0+/bin" swan: functions: "setup/read_processing/functions.R" loris_tax: "/work/richlab/aliciarich/bioinformatics_stats/data/loris/taxonomy/" uid_file: "/work/richlab/aliciarich/bioinformatics_stats/tmp/fetch_references.txt" logs: "/work/richlab/aliciarich/bioinformatics_stats/logs" ont_reads: "/work/richlab/aliciarich/ont_reads/" dorado_model: "/work/richlab/aliciarich/ont_reads/dna_r10.4.1_e8.2_400bps_sup@v5.0.0/" bioinformatics_stats: "/work/richlab/aliciarich/bioinformatics_stats" samplesheets: "/work/richlab/aliciarich/bioinformatics_stats/dataframes/sample_sheet/loris" scripts: "/work/richlab/aliciarich/bioinformatics_stats/batch_scripts/" accessions: "/work/richlab/aliciarich/bioinformatics_stats/tmp/accessions.txt" tmp_fasta1: "/work/richlab/aliciarich/bioinformatics_stats/tmp/tmp1.fasta" tmp_fasta2: "/work/richlab/aliciarich/bioinformatics_stats/tmp/tmp2.fasta" tmp_fasta3: "/work/richlab/aliciarich/bioinformatics_stats/tmp/tmp3.fasta" tmp_fasta4: "/work/richlab/aliciarich/bioinformatics_stats/tmp/tmp4.fasta" raw_loris_mb: "/work/richlab/aliciarich/ont_reads/loris_microbiome/hdz_raw" basecalled_loris_mb: "/work/richlab/aliciarich/ont_reads/loris_microbiome/basecalled" trimmed_loris_mb: "/work/richlab/aliciarich/ont_reads/loris_microbiome/trimmed" filtered_loris_mb: "/work/richlab/aliciarich/ont_reads/loris_microbiome/filtered" loris_mb_aligned: "/work/richlab/aliciarich/bioinformatics_stats/data/loris/taxonomy/refseqs_aligned.fasta" loris_mb_tree: "/work/richlab/aliciarich/bioinformatics_stats/data/loris/taxonomy/refseqs_tree.newick" bats: sequencing: coverage: "visuals/bats_16s_depth_summary.html" depth_plot: "visuals/bats_16s_depth_hist.html" metadata: summary: "metadata/bats/samples_metadata.tsv" key: "metadata/bats/metadata_key.R" factors: "metadata/bats/factors.R" scripts: !expr list("metadata/bats/colors.R", "metadata/bats/metadata_key.R", "metadata/bats/factors.R") inventories: all_stages: "../read_processing/samples/inventories/bats/compilation_bats.tsv" collection: "../read_processing/samples/inventories/bats/samples_bats.csv" extraction: "../read_processing/samples/inventories/bats/extracts_bats.csv" libraries: "../read_processing/samples/inventories/bats/libraries_bats.csv" seqruns: "../read_processing/samples/inventories/bats/seqruns_bats.csv" barcodes_output: "../read_processing/samples/barcode_alignments/bats/" read_alignments: "data/bats/read_alignments" taxa_reps: aligned: "data/bats/taxonomy/refseqs_aligned.fasta" tree: "data/bats/taxonomy/refseqs_tree.newick" table: "data/bats/taxonomy/tax_table.tsv" loris: day1: "2023-10-26" last: "2024-10-25" visuals: demographics: "visuals/loris/demography" coverage: "visuals/loris_depth_summary.html" depth_plot: "visuals/loris_depth_hist.html" AZAstudbooks: btp: "metadata/loris/Studbook/BTP25.R" btp_current: "metadata/loris/Studbook/BTP25.tsv" lifetable: "metadata/loris/Studbook/lifetable25.tsv" lifetabStatic: "metadata/loris/Studbook/lifetable25_static.tsv" load_data: "metadata/loris/Studbook/LoadStudbookData.R" functions: "metadata/loris/studbookFunctions.R" reactables: "metadata/loris/Studbook/reactables.R" living25: "metadata/loris/Studbook/LivingPop_2025.csv" living21: "metadata/loris/Studbook/LivingPop_2021.csv" historic21: "metadata/loris/Studbook/HistoricPop_2021.csv" institutions: current25: "metadata/loris/Studbook/CurrentInstitutions_2025.csv" current21: "metadata/loris/Studbook/CurrentInstitutions_2021.csv" historic21: "metadata/loris/Studbook/HistoricInstitutions_2021.csv" working: "metadata/loris/Studbook/working_studbook_loris25.tsv" reactable_ready: "metadata/loris/Studbook/studbook_loris_reactableReady.tsv" timeline: "metadata/loris/Studbook/working_timeline_loris25.tsv" metadata: scripts: !expr list("metadata/loris/colors.R", "metadata/loris/metadata_key.R", "metadata/loris/diet_schedule.R") bristol: "metadata/loris/bristols.tsv" studbook: "metadata/loris/subjects_loris.csv" summary: "metadata/loris/samples_metadata.tsv" key: "metadata/loris/metadata_key.R" factors: "metadata/loris/factors.R" foods: "metadata/loris/foods.tsv" proteins: "metadata/loris/proteins.tsv" fats: "metadata/loris/fats.tsv" CHOs: "metadata/loris/CHOs.tsv" Ash: "metadata/loris/Ash.tsv" vitamins: "metadata/loris/vitamins.tsv" reactable: "metadata/loris/loris_metadata_summary.html" sample_table: identifiers: "metadata/loris/identifier_key.tsv" main: "metadata/loris/sample_table.tsv" merged: "metadata/loris/sample_table_merged.tsv" inventories: all_stages: "../read_processing/samples/inventories/loris/compilation_loris.tsv" collection: "../read_processing/samples/inventories/loris/samples_loris.csv" extraction: "../read_processing/samples/inventories/loris/extracts_loris.csv" libraries: "../read_processing/samples/inventories/loris/libraries_loris.csv" seqruns: "../read_processing/samples/inventories/loris/seqruns_loris.csv" outputs_wf16s: "data/loris/outputs_wf16s/" barcodes_output: "dataframes/barcodes/loris/" read_alignments: "data/loris/read_alignments" taxa_reps: aligned: "data/loris/taxonomy/refseqs_aligned.fasta" tree: "data/loris/taxonomy/refseqs_tree.newick" table: "data/loris/taxonomy/tax_table.tsv" abundance_wf16s: "data/loris/wf16s_abundance/" microeco: dataset: main: keg: "microeco/loris/datasets/main/keg" njc: "microeco/loris/datasets/main/njc" fpt: "microeco/loris/datasets/main/fpt" tax: "microeco/loris/datasets/main" culi: keg: "microeco/loris/datasets/culi/keg" njc: "microeco/loris/datasets/culi/njc" fpt: "microeco/loris/datasets/culi/fpt" tax: "microeco/loris/datasets/culi" warb: keg: "microeco/loris/datasets/warble/keg" njc: "microeco/loris/datasets/warble/njc" fpt: "microeco/loris/datasets/warble/fpt" tax: "microeco/loris/datasets/warble" abund: main: keg: "microeco/loris/abundance/main/keg" fpt: "microeco/loris/abundance/main/fpt" njc: "microeco/loris/abundance/main/njc" tax: "microeco/loris/abundance/main" culi: keg: "microeco/loris/abundance/culi/keg" fpt: "microeco/loris/abundance/culi/fpt" njc: "microeco/loris/abundance/culi/njc" tax: "microeco/loris/abundance/culi" warb: keg: "microeco/loris/abundance/warble/keg" fpt: "microeco/loris/abundance/warble/fpt" njc: "microeco/loris/abundance/warble/njc" tax: "microeco/loris/abundance/warble" alpha: main: "microeco/loris/alphadiversity/main" culi: "microeco/loris/alphadiversity/culi" warb: "microeco/loris/alphadiversity/warble" beta: main: kegg: "microeco/loris/betadiversity/main/keg" fpt: "microeco/loris/betadiversity/main/fpt" njc: "microeco/loris/betadiversity/main/njc" tax: "microeco/loris/betadiversity/main" culi: kegg: "microeco/loris/betadiversity/culi/keg" fpt: "microeco/loris/betadiversity/culi/fpt" njc: "microeco/loris/betadiversity/culi/njc" tax: "microeco/loris/betadiversity/culi" warb: kegg: "microeco/loris/betadiversity/warble/keg" fpt: "microeco/loris/betadiversity/warble/fpt" njc: "microeco/loris/betadiversity/warble/njc" tax: "microeco/loris/betadiversity/warble" data: main: feature: "microeco/loris/datasets/main/feature_table.tsv" tree: "microeco/loris/datasets/main/phylo_tree.tre" fasta: "microeco/loris/datasets/main/rep_fasta.fasta" samples: "microeco/loris/datasets/main/sample_table.tsv" taxa: "microeco/loris/datasets/main/tax_table.tsv" culi: feature: "microeco/loris/datasets/culi/feature_table.tsv" tree: "microeco/loris/datasets/culi/phylo_tree.tre" fasta: "microeco/loris/datasets/culi/rep_fasta.fasta" samples: "microeco/loris/datasets/culi/sample_table.tsv" taxa: "microeco/loris/datasets/culi/tax_table.tsv" warb: feature: "microeco/loris/datasets/warb/feature_table.tsv" tree: "microeco/loris/datasets/warb/phylo_tree.tre" fasta: "microeco/loris/datasets/warb/rep_fasta.fasta" samples: "microeco/loris/datasets/warb/sample_table.tsv" taxa: "microeco/loris/datasets/warb/tax_table.tsv" sample_sheets: compilations: bats: "../read_processing/samples/sample_sheets/bats/nwr_combined_sample_sheet.csv" loris: "../read_processing/samples/sample_sheets/loris/hdz_combined_sample_sheet.csv" nwr1: "../read_processing/samples/sample_sheets/bats/nwr1_sample_sheet.csv" hdz1: "../read_processing/samples/sample_sheets/loris/hdz1_sample_sheet.csv" hdz2: "../read_processing/samples/sample_sheets/loris/hdz2_sample_sheet.csv" hdz3: "../read_processing/samples/sample_sheets/loris/hdz3_sample_sheet.csv" hdz4: "../read_processing/samples/sample_sheets/loris/hdz4_sample_sheet.csv" hdz5: "../read_processing/samples/sample_sheets/loris/hdz5_sample_sheet.csv" hdz6: "../read_processing/samples/sample_sheets/loris/hdz6_sample_sheet.csv" hdz7: "../read_processing/samples/sample_sheets/loris/hdz7_sample_sheet.csv" hdz8: "../read_processing/samples/sample_sheets/loris/hdz8_sample_sheet.csv" hdz9: "../read_processing/samples/sample_sheets/loris/hdz9_sample_sheet.csv" hdz10: "../read_processing/samples/sample_sheets/loris/hdz10_sample_sheet.csv" hdz11: "../read_processing/samples/sample_sheets/loris/hdz11_sample_sheet.csv" hdz12: "../read_processing/samples/sample_sheets/loris/hdz12_sample_sheet.csv" hdz13: "../read_processing/samples/sample_sheets/loris/hdz13_sample_sheet.csv" hdz14: "../read_processing/samples/sample_sheets/loris/hdz14_sample_sheet.csv" hdz15: "../read_processing/samples/sample_sheets/loris/hdz15_sample_sheet.csv" hdz16: "../read_processing/samples/sample_sheets/loris/hdz16_sample_sheet.csv" hdz17: "../read_processing/samples/sample_sheets/loris/hdz17_sample_sheet.csv" hdz18: "../read_processing/samples/sample_sheets/loris/hdz18_sample_sheet.csv" barcode_alignments: compilations: loris: "../read_processing/samples/barcode_alignments/loris/hdz_combined_barcode_alignment.tsv" bats: "../read_processing/samples/barcode_alignments/bats/nwr_combined_barcode_alignment.tsv" nwr1: "../read_processing/samples/barcode_alignments/bats/nwr1_barcode_alignment.tsv" hdz1: "../read_processing/samples/barcode_alignments/loris/hdz1_barcode_alignment.tsv" hdz2: "../read_processing/samples/barcode_alignments/loris/hdz2_barcode_alignment.tsv" hdz3: "../read_processing/samples/barcode_alignments/loris/hdz3_barcode_alignment.tsv" hdz4: "../read_processing/samples/barcode_alignments/loris/hdz4_barcode_alignment.tsv" hdz5: "../read_processing/samples/barcode_alignments/loris/hdz5_barcode_alignment.tsv" hdz6: "../read_processing/samples/barcode_alignments/loris/hdz6_barcode_alignment.tsv" hdz7: "../read_processing/samples/barcode_alignments/loris/hdz7_barcode_alignment.tsv" hdz8: "../read_processing/samples/barcode_alignments/loris/hdz8_barcode_alignment.tsv" hdz9: "../read_processing/samples/barcode_alignments/loris/hdz9_barcode_alignment.tsv" hdz10: "../read_processing/samples/barcode_alignments/loris/hdz10_barcode_alignment.tsv" hdz11: "../read_processing/samples/barcode_alignments/loris/hdz11_barcode_alignment.tsv" hdz12: "../read_processing/samples/barcode_alignments/loris/hdz12_barcode_alignment.tsv" hdz13: "../read_processing/samples/barcode_alignments/loris/hdz13_barcode_alignment.tsv" hdz14: "../read_processing/samples/barcode_alignments/loris/hdz14_barcode_alignment.tsv" hdz15: "../read_processing/samples/barcode_alignments/loris/hdz15_barcode_alignment.tsv" hdz16: "../read_processing/samples/barcode_alignments/loris/hdz16_barcode_alignment.tsv" hdz17: "../read_processing/samples/barcode_alignments/loris/hdz17_barcode_alignment.tsv" hdz18: "../read_processing/samples/barcode_alignments/loris/hdz18_barcode_alignment.tsv" abund_wf16s_files: hdz1: "data/loris/wf16s_abundance/hdz1_abundance_table_species.tsv" hdz2: "data/loris/wf16s_abundance/hdz2_abundance_table_species.tsv" hdz3: "data/loris/wf16s_abundance/hdz3_abundance_table_species.tsv" hdz4: "data/loris/wf16s_abundance/hdz4_abundance_table_species.tsv" hdz5: "data/loris/wf16s_abundance/hdz5_abundance_table_species.tsv" hdz6: "data/loris/wf16s_abundance/hdz6_abundance_table_species.tsv" hdz7: "data/loris/wf16s_abundance/hdz7_abundance_table_species.tsv" hdz8: "data/loris/wf16s_abundance/hdz8_abundance_table_species.tsv" hdz9: "data/loris/wf16s_abundance/hdz9_abundance_table_species.tsv" hdz10: "data/loris/wf16s_abundance/hdz10_abundance_table_species.tsv" hdz11: "data/loris/wf16s_abundance/hdz11_abundance_table_species.tsv" hdz12: "data/loris/wf16s_abundance/hdz12_abundance_table_species.tsv" hdz13: "data/loris/wf16s_abundance/hdz13_abundance_table_species.tsv" hdz14: "data/loris/wf16s_abundance/hdz14_abundance_table_species.tsv" hdz15: "data/loris/wf16s_abundance/hdz15_abundance_table_species.tsv" hdz16: "data/loris/wf16s_abundance/hdz16_abundance_table_species.tsv" hdz17: "data/loris/wf16s_abundance/hdz17_abundance_table_species.tsv" hdz18: "data/loris/wf16s_abundance/hdz18_abundance_table_species.tsv" methods_16s: libprep_workflow: "'rapid16s'" dorado_model: "'dna_r10.4.1_e8.2_400bps_sup@v5.0.0'" min_length: 1000 max_length: 2000 min_qual: 7 min_id: 85 min_cov: 80 kit_name: "'SQK-16S114-24'" tax_rank: "S" n_taxa_barplot: 12 abund_threshold: 0 loris: rarefy: 3500 norm: "SRS" min_abund: 0.00001 min_freq: 0.01 include_lowest: TRUE unifrac: TRUE betadiv: "aitchison" alpha_pd: TRUE tax4fun_db: "Ref99NR" loris_rarefy: 3500 keg_minID: 97
global <- config::get(config = "default") swan <- config::get(config = "swan") micro <- config::get(config = "microbiome") loris <- config::get(config = "loris") marmoset <- config::get(config = "marmoset") isolates <- config::get(config = "isolates") bats <- config::get(config = "bats") methods_16s <- config::get(config = "methods_16s") sample_sheets <- config::get(config = "sample_sheets") abund_wf16s_files <- config::get(config = "abund_wf16s_files") barcode_alignments<- config::get(config = "barcode_alignments") seqruns <- seqruns %>% keep_at(params$sampleset) %>% list_flatten(name_spec = "") subject_list <- keep_at(subjects, paste0(params$sampleset)) %>% list_flatten(name_spec = "{inner}") path <- config::get(config = paste0(params$sampleset))
I use different language engines in some scripts named
terminal
and cat
. If you see a chunk of code
with {terminal}
or {cat}
written where you
would usually see {r} at the top of the chunk, then running the chunk
will not tell your R console to process this in the usual R
language.
terminal
chunks will echo the code in
raw text format for easy copying and pasting into the terminal console.
This just makes it easier for me to interact with other servers like
Swan from my R Studio window. There are ways to set R Studio up to run
code through multiple servers, but I find this the simplest way to
switch back and forth while still keeping a record of the code that have
used or changes I have made to it.terminal
language engine at the start of each script.
knitr::knit_engines$set(terminal = function(options) {
code <- paste(options$code, collapse = "\n")
options$warning <- FALSE
knitr::engine_output(options, code, out = code)
})
cat
chunks will write the text in that
chunk directly into a file in this repository. I use this to streamline
my batch scripts that I submit as jobs to the Swan cluster. This way,
you only need to tweak the code in one location and rerun the chunk to
ensure that update applies to all locations with working copies of the
code.
engine.opts=list(file='directory/script.sh')
string
tells this chunk which directory and file to write the text to.append=TRUE
to the string above tells the chunk
that the text should be added to the bottom of the existing file. If
this is not included, the text will overwrite anything already existing
in a file at that location. Here is a complete example of what that
chunk would look like:{cat, engine.opts=list(file='subdirectory/script.sh')}
This would create a new file named `script.sh` with this text inside. If the file already exists, it will be replaced.
{cat, engine.opts=list(file='subdirectory/script.sh', append=TRUE)}
This chunk would add this text to the bottom of the file we just created and added the text to.
Once you gain fluency in R and github, it becomes very easy to adapt just about any project, document, or task to an automated, easily transferrable format with streamlined version control through R Studio and R Markdown. The following two online textbooks are the best place to start if you are new to R Markdown (I still refer back to these constantly for different tips and tricks).
I also recommend downloading the following cheatsheet from The Markdown Guide for handy inline syntax when composing R Markdown files like those used in my tutorials.
This page also contains a list of very helpful cheatsheets for other packages or purposes in R Studio. Refer back to this often, as you may find some of these more useful the more fluent you become in R.