Who is this book for?
This book offers a practical introduction to machine learning and deep learning for genomic applications, showing how these methods can turn genomics data into novel biological insights. It combines theoretical fundamentals with hands-on sections that give a taste of how machine learning and deep learning are applied in real-world settings in the life sciences and biotech industries. It covers a range of topics that are not currently available in other textbooks, along with the challenges, pitfalls, and best practices of applying machine learning and deep learning to real-world scenarios. Each chapter includes code written in Python with industry-standard machine learning and deep learning libraries and frameworks, such as Keras, that readers can reproduce in their own working environment.
This book is designed for researchers, bioinformaticians, and data scientists in both academia and industry who want to use machine learning and deep learning technologies in genomic applications to extract insights from large datasets. Managers and leaders already established in the life sciences and biotechnology sectors will also find this book useful: they can adopt these methodologies to identify patterns, make predictions, and thereby contribute to data-driven decision-making in their companies.
The book is divided into three parts. The first part introduces the fundamentals of genomic data analysis and machine learning: the basic concepts of genomic data analysis, what machine learning is, why it matters for genomics, and what value it brings to the life sciences and biotechnology industries. The second part moves from machine learning to deep learning, introducing the basic concepts of deep learning and a diverse set of deep learning algorithms, and uses real-world examples to transform raw genomics data into biological insights. The final part describes how to operationalize deep learning models using open source tools to enable predictions for end users. In this part, you will learn how to build and tune state-of-the-art machine learning models using Python and industry-standard libraries to derive biological insights from large multimodal genomic datasets, and how to deploy these models on cloud platforms such as AWS and Azure. The last chapter of the final part is dedicated to the current challenges for deep learning approaches to genomics, the potential pitfalls, and the best practices for avoiding them.