I keep putting off trying out the gganimate R package, but today’s the day. To make it more fun, rather than use the iris dataset as they’ve done in the package vignette, we’ll simulate some single cell RNA-seq data using the excellent Splatter R package.
Load R packages
library(splatter) # simulate scRNA-seq data
library(scater) # use the logNormCounts() function from here
library(Seurat) # manipulating the scRNA-seq data
library(tidyverse) # data wrangling
library(gganimate) # fancy plots
Simulate data
I’ll simulate 400 cells transitioning along a continuous differentiation trajectory with four paths, each cell having an equal probability of belonging to each. I’ve become more familiar with the structure of Seurat objects for handling single cell data, so along the way we’ll convert the SingleCellExperiment object to a Seurat object before identifying variable features and running PCA.
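To make the "equal probability" part concrete, here is a minimal base R sketch of what assigning 400 cells to four paths with probability 0.25 each looks like. It is only an illustration, not Splatter's internal code; the seed and the names paths, toy_groups and group_counts are made up for this example.

```r
# Illustration only: draw a path label for each of 400 cells,
# with every path equally likely (probability 0.25 each).
set.seed(42)  # arbitrary seed so the sketch is reproducible

paths <- paste0("Path", 1:4)
toy_groups <- factor(
  sample(paths, size = 400, replace = TRUE, prob = rep(0.25, 4)),
  levels = paths
)

# Count cells per path; expect roughly, but not exactly, 100 in each
group_counts <- table(toy_groups)
group_counts
```

Because membership is sampled, the group sizes in a real simulation will also wobble around 100 rather than being exactly equal.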
# Simulate data
params.groups <- newSplatParams(batchCells = 400,
                                nGenes = 800)
sim1 <- splatSimulatePaths(params.groups,
                           path.nSteps = 10,
                           group.prob = rep(0.25, 4),
                           de.prob = 0.5,
                           de.facLoc = 0.2,
                           path.from = c(0, 1, 2, 3)) %>%
  logNormCounts() %>%
  as.Seurat() %>%
  FindVariableFeatures() %>%
  ScaleData() %>%
  RunPCA()
Peek at the metadata:
head(sim1@meta.data)
## orig.ident nCount_originalexp nFeature_originalexp Cell Batch Group
## Cell1 SeuratProject 48738 676 Cell1 Batch1 Path2
## Cell2 SeuratProject 78535 714 Cell2 Batch1 Path2
## Cell3 SeuratProject 70102 688 Cell3 Batch1 Path4
## Cell4 SeuratProject 73821 700 Cell4 Batch1 Path2
## Cell5 SeuratProject 61166 684 Cell5 Batch1 Path3
## Cell6 SeuratProject 58364 713 Cell6 Batch1 Path1
## ExpLibSize Step sizeFactor
## Cell1 47689.61 9 0.7940128
## Cell2 78541.88 6 1.2794491
## Cell3 71215.27 3 1.1420633
## Cell4 72928.43 2 1.2026512
## Cell5 61810.54 7 0.9964829
## Cell6 58147.56 1 0.9508343
Splatter approximates a continuous trajectory by simulating a series of steps between groups. We specified four groups and 10 steps per group. Let’s add a new variable to approximate time:
# Get group info as numeric (integer between 1-4) then multiply by ten
sim1$Time <- as.numeric(str_remove_all(sim1$Group, "Path")) * 10
# Add step info (integer between 1-10)
sim1$Time <- sim1$Time + sim1$Step
# Subtract ten so that Path1 starts the timeline (Time runs from 1 to 40)
sim1$Time <- sim1$Time - 10
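As a quick check of that arithmetic, here is a small base R sketch on a toy metadata table. The data frame meta and its values are hypothetical. It assumes, as in the metadata shown earlier, that paths are labelled Path1 to Path4 and steps are numbered 1 to 10, and it collapses the three steps into one expression.

```r
# Toy metadata mimicking the Group and Step columns (hypothetical values)
meta <- data.frame(
  Group = c("Path1", "Path1", "Path2", "Path3", "Path4"),
  Step  = c(1, 10, 1, 7, 10)
)

# Extract the path index (1-4) from the label, using base R instead of stringr
path_index <- as.numeric(sub("Path", "", meta$Group))

# Same calculation in one step: each earlier path contributes ten time units
meta$Time <- (path_index - 1) * 10 + meta$Step

meta
```

The first step of Path1 maps to Time 1 and the last step of Path4 maps to Time 40, so each path occupies its own block of ten consecutive time points.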
Static plot
Create some static PCA plots highlighting the Time variable we created and the paths and steps.
# Plot group (paths), steps and our new Time variable using two of Seurat's built-in
# plot functions: DimPlot() and FeaturePlot()
DimPlot(sim1, group.by = "Group")
FeaturePlot(sim1, "Step") + scale_color_viridis_c()
FeaturePlot(sim1, "Time") + scale_color_viridis_c()
Dynamic plot
Now we’ll use the Time variable to animate the plot. We add shadow_mark() to keep the points from earlier frames on screen, so that the cells trace out their path rather than disappearing from frame to frame.
# Pull data for plotting
plt.data <- cbind(sim1@meta.data,
                  Embeddings(sim1)[, 1:2])
# Create a static plot
p <- ggplot(plt.data, aes(x = PC_1, y = PC_2, col = Time)) +
  geom_point() +
  scale_colour_viridis_c()
# Add some animation
anim <- p +
  transition_states(Time,
                    transition_length = 2,
                    state_length = 1) +
  ggtitle('Time: {closest_state}') +
  shadow_mark(alpha = 0.5, size = 0.7)
animate(anim)
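If you want to try the same animation recipe without running the simulation first, the sketch below rebuilds it on a small toy data frame and shows one way to control the frame count and save the result. The data, the object names (toy, anim_toy, gif) and the file name are all made up for illustration, and the frame settings are arbitrary choices rather than recommendations.

```r
library(ggplot2)
library(gganimate)

# Toy stand-in for plt.data: 5 points at each of 40 time values (hypothetical)
set.seed(1)  # arbitrary seed for reproducibility
toy <- data.frame(Time = rep(1:40, each = 5))
toy$PC_1 <- toy$Time + rnorm(nrow(toy))
toy$PC_2 <- sin(toy$Time / 6) * 10 + rnorm(nrow(toy))

# Same layers as the animation above, applied to the toy data
anim_toy <- ggplot(toy, aes(x = PC_1, y = PC_2, colour = Time)) +
  geom_point() +
  scale_colour_viridis_c() +
  transition_states(Time,
                    transition_length = 2,
                    state_length = 1) +
  ggtitle("Time: {closest_state}") +
  shadow_mark(alpha = 0.5, size = 0.7)

# Rendering is slow and needs a GIF renderer, so only do it interactively
if (interactive()) {
  gif <- animate(anim_toy, nframes = 80, fps = 10)
  anim_save("toy_trajectory.gif", animation = gif)
}
```

Fewer frames render faster but make the transitions between states choppier, so it is worth experimenting with nframes and fps for a given dataset.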
I think the colour in the static plot does the job of conveying the trajectory of the cells through time and adding animation doesn’t contribute much. But it was good to try out gganimate anyway and discover that the grammar makes it easy to add animation to normal ggplots. Now I’ve finally tried it out I can keep an eye out for future datasets where it might come in handy. Also first post!!! 🎉 🎉