Testing the fit of the model using cross-validation
Cross-validation provides a reliable estimate of a model's performance on unseen data. Evaluating the model on multiple subsets of the data reduces the effect of random variation in how the data happen to be split into training and testing sets, giving a more realistic assessment of its generalizability.
K-fold cross-validation divides a dataset into K subsets of approximately equal size, called folds, where K is a predefined number typically chosen between 5 and 10. The dataset is randomly partitioned into the K folds; a model is then trained on K-1 folds and evaluated on the fold left out. This means that K separate training and evaluation cycles are performed. The performance values from the K iterations are then averaged to obtain a single metric that represents the overall performance.
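The procedure above can be sketched in a few lines. The following is a minimal illustration, not a production implementation: the "model" is a simple least-squares line fit via `np.polyfit`, the metric is mean squared error, and the synthetic data are invented for the example; any estimator and metric could be substituted.

```python
import numpy as np

def k_fold_cv(X, y, k=5, seed=0):
    """Estimate performance with K-fold cross-validation (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))       # random partition of the data
    folds = np.array_split(indices, k)      # K folds of approximately equal size
    scores = []
    for i in range(k):
        test_idx = folds[i]                 # the fold left out this iteration
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Train on the K-1 remaining folds: fit a line by least squares
        coeffs = np.polyfit(X[train_idx], y[train_idx], deg=1)
        # Evaluate on the held-out fold (mean squared error)
        pred = np.polyval(coeffs, X[test_idx])
        scores.append(np.mean((pred - y[test_idx]) ** 2))
    # Average the K performance values into a single metric
    return float(np.mean(scores))

# Synthetic example data: y = 2x + unit-variance noise
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, 100)
y = 2 * X + rng.normal(0, 1, 100)
print(f"5-fold CV mean squared error: {k_fold_cv(X, y, k=5):.3f}")
```

Because the noise in the synthetic data has unit variance, the averaged cross-validated error should land near 1, reflecting the irreducible noise rather than an optimistic in-sample fit.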
Leave-one-out (LOO) cross-validation is a variant of cross-validation where the number of folds K equals the number of observations in the dataset: each model is trained on all observations except one and evaluated on the single observation left out.
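LOO can be sketched as a loop that holds out one observation at a time. As above, this is an illustrative sketch with an invented synthetic dataset and a simple least-squares line fit standing in for the model.

```python
import numpy as np

# Synthetic example data: y = 3x + unit-variance noise, n = 20 observations
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 20)
y = 3 * X + rng.normal(0, 1, 20)

errors = []
for i in range(len(X)):                 # one training/evaluation cycle per observation
    mask = np.ones(len(X), dtype=bool)
    mask[i] = False                     # leave observation i out
    coeffs = np.polyfit(X[mask], y[mask], deg=1)   # train on the n-1 remaining points
    pred = np.polyval(coeffs, X[i:i + 1])          # predict the held-out point
    errors.append((pred[0] - y[i]) ** 2)

print(f"LOO CV mean squared error over {len(X)} fits: {np.mean(errors):.3f}")
```

Note that LOO requires fitting the model n times, which is why it is usually reserved for small datasets or models that are cheap to refit.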