The Biologist’s Guide to Computing
- Preface
- Getting some motivation
- How to think like a computer
- First steps towards automation
- First things first, how to find help
- Creating a new directory for our project
- Downloading the Swiss-Prot knowledge base
- Creating a work flow using pipes
- Examining files, without modifying them
- Finding FASTA identifier lines corresponding to human proteins
- Extracting the UniProt identifiers
- Using redirection to create an output file
- Viewing the command history
- Clearing the terminal window
- Copying and renaming files
- Removing files and directories
- Key concepts
- Structuring and storing data
- Keeping track of your work
- Data analysis
- What is Python?
- Using Python in interactive mode
- Variables
- Determining the GC count of a sequence
- Creating reusable functions
- List slicing
- Loops
- Creating a sliding window GC-content function
- Downloading the genome
- Reading and writing files
- Creating a function for reading in the Streptomyces sequence
- Writing out the sliding window analysis
- Key concepts
- Data visualisation
- Starting R and loading the Iris flower data set
- Understanding the structure of the iris data set
- A note on statistics in R
- Default plotting in R
- Installing the ggplot2 package
- Loading the ggplot2 package
- Plotting using ggplot2
- Available “Geoms”
- Scripting data visualisation
- Faceting
- Adding more colour
- Purpose of data visualisation
- Conveying a message to an audience
- Writing a caption
- Other useful tools for scripting the generation of figures
- Key concepts
- Collaborating on projects
- Creating scientific documents
- Automation is your friend
- Practical problem solving
- Working remotely
- Managing your system
- Next steps
- Glossary