This tutorial walks through a typical pipeline for analyzing gene expression data. It uses the monocyte and macrophage data from the following paper:

van de Laar L, Saelens W, De Prijck S, Martens L et al. Yolk Sac Macrophages, Fetal Liver, and Adult Monocytes Can Colonize an Empty Niche and Develop into Functional Tissue-Resident Macrophages. Immunity 2016 Apr 19;44(4):755-68. PMID: 26992565

This tutorial contains the following steps:

  1. Download the data from GEO
  2. Process the expression table for analyses
  3. DE (Differential Expression) Analysis with limma
  4. PCA (Principal component analysis)
  5. Clustering by Heatmap (only top 250 genes)
  6. GO (Gene Ontology) Enrichment Analysis
  7. GSEA (Gene Set Enrichment Analysis)
  8. Combining another dataset for PCA and clustering

You can find the following 4 files in the handouts directory:

File                           Description
handout.pdf                    Step-by-step explanation of the tasks.
Imun_gen.ex.Rdump              Preprocessed object from GEO for loading in the analysis.
pheno.table                    A table storing the phenotype data for comparison.
mouse_hallmark_genesets.rdata  Hallmark mouse gene sets for GSEA.

Set the working directory

Before you start working through this handout, please set your working directory to the location where you saved the handouts directory. For example,

setwd("/home/grasshoff/Documents/BIR_course")
# Please use your path instead of the above example
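
After setting the working directory, it can help to confirm that the handout files are where R expects them. The following is a minimal sketch, not part of the original handout: the helper name missing_handouts is hypothetical, and the assumption that the files sit in a subdirectory called "handouts" of the working directory should be adjusted to your own layout.

```r
# File names taken from the file table above.
handout_files <- c(
  "handout.pdf",
  "Imun_gen.ex.Rdump",
  "pheno.table",
  "mouse_hallmark_genesets.rdata"
)

# Hypothetical helper: return the names of handout files not found in `dir`.
# The default "handouts" is an assumption about where the files were saved.
missing_handouts <- function(dir = "handouts", files = handout_files) {
  files[!file.exists(file.path(dir, files))]
}

missing <- missing_handouts()
if (length(missing) > 0) {
  message("Missing handout files: ", paste(missing, collapse = ", "))
} else {
  message("All handout files found.")
}
```

If any file is reported as missing, check the path given to setwd() before continuing.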

Loading the libraries

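rm(list = ls())  # clear the workspace

# BiocManager is needed to install the packages below
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
# Install the annotation and gene set packages only if they are missing
if (!requireNamespace("mogene10sttranscriptcluster.db", quietly = TRUE)) {
  BiocManager::install("mogene10sttranscriptcluster.db")
}
if (!requireNamespace("msigdbr", quietly = TRUE)) {
  BiocManager::install("msigdbr")
}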

library(Biobase)
library(GEOquery)
library(limma)
library(mogene