Who should read this book

Biologists who need to use the command line. If you want to learn how to use the command line to install and run software, this book is for you. The command line is used throughout the book and you will quickly gain familiarity with the most important commands. You will also learn how to install software and how to work on remote machines. The latter will be important if you want to run bioinformatics software on your institute's high-performance cluster.

Biologists who want to create their own data analysis scripts. If you want to learn how to write your own data analysis scripts, this book is also for you. The book starts off by explaining fundamental computing concepts, teaching you how to think like a computer. You will then learn how to use Python by creating a script to analyse the guanine-cytosine (GC) content of a bacterial genome. There is also a chapter on data visualisation that teaches you how to work with R. Furthermore, programming best practices are highlighted and explained throughout the book.

Biologists who want to ensure their data analysis is reproducible. If you want to ensure that your data analysis is reproducible, this book is also for you. Early on you will learn how to use version control to track changes to your projects. Furthermore, the concept of using automation to ensure reproducibility is explored in detail.

No prior knowledge required. Important concepts and jargon are explained as they are introduced. All you need is a willingness to learn, experiment and have fun geeking out.


This work is released under the Creative Commons CC0 1.0 licence, so you can copy, modify, distribute and perform the work, even for commercial purposes, without having to ask for permission.

Source code

The source for this book is hosted on GitHub.


This book is still a work in progress. I would really appreciate your feedback. Please send me an email to let me know what you think. Alternatively, you can contact me via Twitter (@tjelvar_olsson or @bioguide2comp) or message me via the Facebook page.

If you want to receive updates about this book, do sign up to the mailing list. You can find the sign-up form on the website:


Thanks to Nadia Radzman, Sam Mugford and Anna Stavrinides for providing feedback on early versions of the initial chapters. Many thanks to Tyler McCleary for continued in-depth feedback and suggestions for improvements. Thanks to Nick Pullen for feedback and discussions on the data visualisation chapter. Many thanks to Matthew Hartley for discussions and encouragement.

I’m also grateful for feedback from Lucy Liu and Morten Grøftehauge.