Computing sequencing statistics using Spark
If you need parallel computing, Spark is an alternative to Dask. Its abstraction level is slightly higher: you get less granular control over the computation, but the code is more declarative. Spark is also somewhat language-agnostic (it is actually Java/Scala-based, with bindings for other languages such as Python). Here, we will compute some very basic statistics over the Parquet dataset that we generated in the previous recipe.
Getting ready
Preparing for this recipe can be quite tricky. First, we will have to start a Spark server. At the time of writing this book, the conda packages for accessing Spark were quite immature. We will still use conda here, but we will not install any Spark packages from conda. Follow these steps to prepare the environment:
- Make sure that you have Java 8 installed. Be careful with the Java version: an older version will not work, and a newer one may also be problematic.
- Download Spark (https://spark.apache.org/downloads.html). This code was tested...