Finding InterPro domains
InterPro is a database of predictive models, or signatures, provided by multiple protein databases. InterPro aggregates information from multiple sources to reduce redundancy in annotations and aid with interoperability. In this recipe, we’ll extend the approach we used for PFAM domains in the previous recipe and look at getting annotations of InterPro domains on sequences of interest. We’ll start this recipe with something similar for Ensembl core databases.
Getting ready
We’ll need the ensembldb, EnsDb.Rnorvegicus.v79, and biomaRt packages from Bioconductor.
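If these packages are not yet installed, one possible way to set them up is sketched below. It assumes a recent R installation with network access to CRAN and Bioconductor, and uses the BiocManager installer. The variable names `pkgs` and `missing_pkgs` are illustrative and not part of the recipe.

```r
# Install BiocManager from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Packages named in this recipe
pkgs <- c("ensembldb", "EnsDb.Rnorvegicus.v79", "biomaRt")

# Work out which of them cannot be loaded yet (assumed helper variable)
is_installed <- vapply(pkgs, requireNamespace, logical(1),
                       quietly = TRUE, USE.NAMES = FALSE)
missing_pkgs <- pkgs[!is_installed]

# Install only the missing packages from Bioconductor
if (length(missing_pkgs) > 0) {
  BiocManager::install(missing_pkgs)
}
```

Installing only the missing packages avoids re-downloading the annotation package, which is comparatively large, each time the setup is run.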
How to do it…
Finding InterPro domains can be done by performing the following steps:
- Load the necessary libraries and check for protein data in the database:
library(ensembldb)
library(EnsDb.Rnorvegicus.v79)
hasProteinData(EnsDb.Rnorvegicus.v79)
- Build a list of genes to query with:
listTables(EnsDb.Rnorvegicus.v79)
e <- EnsDb.Rnorvegicus.v79
k <- head...