R Bioinformatics Cookbook

You're reading from   R Bioinformatics Cookbook Use R and Bioconductor to perform RNAseq, genomics, data visualization, and bioinformatic analysis

Product type Paperback
Published in Oct 2019
Publisher Packt
ISBN-13 9781789950694
Length 316 pages
Edition 1st Edition
Authors (2):
Dr Dan Maclean
Dan MacLean

Differential peak analysis

Once you've discovered unannotated transcripts, you may want to see whether they are differentially expressed between experiments. We've already looked at how we might do that with edgeR and DESeq, but one problem is getting from an object such as a RangedSummarizedExperiment, which comprises the data and a GRanges object describing the peak regions, to the internal DESeq object. In this recipe, we'll look at how to summarize the data in those objects and get it into the correct format.

Getting ready

For this recipe, you'll need the RangedSummarizedExperiment version of the Arabidopsis thaliana RNAseq data, found in datasets/ch1/arabidopsis_rse.RDS in this book's repository. We'll also use the DESeq and SummarizedExperiment Bioconductor packages we used earlier.

How to do it...

  1. Load data and set up a function that creates region tags:
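library(SummarizedExperiment)

# Load the pre-prepared RangedSummarizedExperiment from the book's repository
arab_rse <- readRDS(file.path(getwd(), "datasets", "ch1", "arabidopsis_rse.RDS"))

# Build a "seqname:start-end" tag for each range in a GRanges object
make_tag <- function(grange_obj) {
  paste0(
    grange_obj@seqnames,
    ":",
    grange_obj@ranges@start,
    "-",
    # end is computed as start + width, which is one past the inclusive end
    (grange_obj@ranges@start + grange_obj@ranges@width)
  )
}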
  2. Extract data and annotate rows:
counts <- assay(arab_rse)

if ( ! is.null(names(rowRanges(arab_rse))) ){
  rownames(counts) <- names(rowRanges(arab_rse))
} else {
  rownames(counts) <- make_tag(rowRanges(arab_rse))
}

How it works...

Step 1 starts by loading our pre-prepared RangedSummarizedExperiment; note that the names slot of the GRanges object in it is not populated. We next create a custom function, make_tag(), which pastes together the seqnames, the start, and a computed end (start + width) from a passed GRanges object. Note the @ syntax: GRanges is an S4 object, so its slots are accessed with @ rather than the more familiar $.
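The @ slot access described above can be tried without any Bioconductor objects. The following is a minimal sketch using a hypothetical toy S4 class, ToyRanges, which is not part of GenomicRanges and only mimics the three pieces of information that make_tag() reads; the starts and widths are illustrative values chosen to reproduce the first two tags shown in this recipe's output.

```r
library(methods)  # normally attached already; listed so the sketch is self-contained

# Hypothetical toy class standing in for a GRanges-like object (assumption, not a real Bioconductor class)
setClass(
  "ToyRanges",
  representation(seqnames = "character", start = "integer", width = "integer")
)

toy <- new(
  "ToyRanges",
  seqnames = c("Chr1", "Chr1"),
  start    = c(3631L, 5928L),
  width    = c(2269L, 2810L)
)

# S4 slots are read with @ (or, equivalently, with slot())
toy@start
slot(toy, "width")

# $ is not defined for this S4 class, so the attempt fails with an error
dollar_fails <- inherits(try(toy$start, silent = TRUE), "try-error")

# Same tag construction as make_tag(): seqname, start, and start + width
toy_tags <- paste0(toy@seqnames, ":", toy@start, "-", toy@start + toy@width)
toy_tags
```

With real GRanges objects, the accessor functions seqnames(), start() and end() are an alternative to reaching into slots directly; be aware that end() returns the inclusive end, which is start + width - 1.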

In step 2, the code pulls the actual data out of the RangedSummarizedExperiment using the assay() function. The matrix returned has no row names, which is unhelpful, so we use the if clause to check the names slot: if names are available, we use them as row names; if not, we build a row name tag from the position information in the GRanges object using the make_tag() function we created. This gives the following output, a count matrix that has the location tag as the row name and can be used in DESeq and edgeR as described in Recipes 1 and 2 in this chapter:

head(counts)
##                  mock1 mock2 mock3 hrcc1 hrcc2 hrcc3
## Chr1:3631-5900      35    77    40    46    64    60
## Chr1:5928-8738      43    45    32    43    39    49
## Chr1:11649-13715    16    24    26    27    35    20
## Chr1:23146-31228    72    43    64    66    25    90
## Chr1:31170-33154    49    78    90    67    45    60
## Chr1:33379-37872     0    15     2     0    21     8
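A tagged count matrix like this one still needs a table of sample information before it can be handed to a differential expression package. The following is a rough sketch of that hand-off, not the recipe's own code: it rebuilds the six rows printed above, assumes that the two sample groups can be inferred from the column-name prefixes (mock and hrcc), and assumes that the DESeq2 package is the DESeq implementation in use. The DESeq2 step is skipped if the package is not installed.

```r
# The six rows printed by head(counts), re-entered by hand for illustration
counts <- matrix(
  c(35L, 77L, 40L, 46L, 64L, 60L,
    43L, 45L, 32L, 43L, 39L, 49L,
    16L, 24L, 26L, 27L, 35L, 20L,
    72L, 43L, 64L, 66L, 25L, 90L,
    49L, 78L, 90L, 67L, 45L, 60L,
     0L, 15L,  2L,  0L, 21L,  8L),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("Chr1:3631-5900", "Chr1:5928-8738", "Chr1:11649-13715",
      "Chr1:23146-31228", "Chr1:31170-33154", "Chr1:33379-37872"),
    c("mock1", "mock2", "mock3", "hrcc1", "hrcc2", "hrcc3")
  )
)

# Assumption: the grouping is encoded in the column names (strip the replicate number)
col_data <- data.frame(
  condition = factor(sub("[0-9]+$", "", colnames(counts)), levels = c("mock", "hrcc")),
  row.names = colnames(counts)
)

# Build the DESeq2 container only if the package is available
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = counts,
    colData   = col_data,
    design    = ~ condition
  )
  print(dim(dds))
}
```

The row names of col_data must match the column names of the count matrix, which is why both are derived from colnames(counts) here.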