R Bioinformatics Cookbook: Use R and Bioconductor to perform RNAseq, genomics, data visualization, and bioinformatic analysis

Product type: Paperback

Published in: Oct 2019

Publisher: Packt

ISBN-13: 9781789950694

Length: 316 pages

Edition: 1st Edition

Tools: ggplot

Concepts: Bioinformatics

Author:

Dr Dan MacLean

Table of Contents (13 chapters)

Preface

1. Performing Quantitative RNAseq

2. Finding Genetic Variants with HTS Data

3. Searching Genes and Proteins for Domains and Motifs

4. Phylogenetic Analysis and Visualization

5. Metagenomics

6. Proteomics from Spectrum to Annotation

7. Producing Publication and Web-Ready Visualizations

8. Working with Databases and Remote Data Sources

9. Useful Statistical and Machine Learning Methods

10. Programming with Tidyverse and Bioconductor

11. Building Objects and Packages for Code Reuse

12. Other Books You May Enjoy

Learning groupings within data and classifying with kNN

The k-Nearest Neighbors (kNN) algorithm is a supervised learning algorithm that classifies a data point according to its similarity to a set of training examples of known class. In this recipe, we'll take a dataset, divide it into training and test sets, and predict the classes of the test set from a model built on the training set. Approaches of this kind are widely applicable in bioinformatics and can be invaluable for assigning samples to groups when we have some known examples of our target classes.
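To make the idea of classifying by similarity concrete, here is a minimal base R sketch of the voting step for a single new point. It is an illustration only and not the book's code: `knn_one` is a hypothetical helper, the choice of Euclidean distance and k = 5 are assumptions, and the measurements in `new_flower` are made up.

```r
# Hypothetical helper, written for illustration only
knn_one <- function(train_x, train_y, new_point, k = 5) {
  # Euclidean distance from the new point to every training row
  diffs <- sweep(as.matrix(train_x), 2, new_point)  # subtract per column
  dists <- sqrt(rowSums(diffs^2))
  # Labels of the k closest training rows
  nearest <- train_y[order(dists)[seq_len(k)]]
  # Majority vote; ties go to whichever level comes first in the table
  names(which.max(table(nearest)))
}

# Made-up measurements in the same column order as iris[, 1:4]
new_flower <- c(5.0, 3.4, 1.5, 0.2)
predicted_class <- knn_one(iris[, 1:4], iris$Species, new_flower, k = 5)
print(predicted_class)
```

In practice, a tested implementation such as `knn()` from the class package is preferable to a hand-rolled function like this one.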

Getting ready

For this recipe, we'll need a few new packages: caret, class, dplyr, and magrittr. Our data will be the built-in iris dataset.
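The recipe's own steps are not shown in this excerpt, so what follows is only a rough sketch of how a split, train, and predict workflow on iris might look with caret and class. The 80/20 split, the seed, the scaling step, and k = 5 are all assumptions made here for illustration, and dplyr and magrittr are not needed for this minimal version.

```r
# Sketch only: not the book's code. Split size, seed and k are assumptions.
library(caret)  # createDataPartition(), confusionMatrix()
library(class)  # knn()

set.seed(123)  # arbitrary seed so the split is reproducible

# Stratified split on the class label: roughly 80% train, 20% test
train_idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_x <- iris[train_idx, 1:4]
test_x  <- iris[-train_idx, 1:4]
train_y <- iris$Species[train_idx]
test_y  <- iris$Species[-train_idx]

# Scale both sets using means and SDs from the training set only,
# so that no information from the test set leaks into the model
centre <- colMeans(train_x)
spread <- apply(train_x, 2, sd)
train_s <- scale(train_x, center = centre, scale = spread)
test_s  <- scale(test_x,  center = centre, scale = spread)

# Each test row gets the majority class of its k nearest training rows
predicted <- knn(train = train_s, test = test_s, cl = train_y, k = 5)

# Cross-tabulate predictions against the held-out labels
cm <- confusionMatrix(predicted, test_y)
print(cm$table)
print(cm$overall["Accuracy"])
```

Scaling matters for kNN because the distance calculation would otherwise be dominated by whichever measurement has the largest numeric range.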

...

Dr Dan MacLean

Professor Dan MacLean has a PhD in molecular biology from the University of Cambridge and gained postdoctoral experience in genomics and bioinformatics at Stanford University in California. Dan is now an honorary professor at the School of Computing Sciences at the University of East Anglia. He has worked in bioinformatics and plant pathogenomics, specializing in R and Bioconductor, and has developed analytical workflows in bioinformatics, genomics, genetics, image analysis, and proteomics at the Sainsbury Laboratory since 2006. Dan has developed and published software packages in R, Ruby, and Python, with over 100,000 downloads combined.
