Installing Software

If you want a particular piece of software installed on mozzie or rescue you can email me at jchung@unimelb.edu.au. Anyone is also free to compile and install software in their personal directories.

Installing software on Linux can be challenging. Sometimes it’s as easy as a downloading a pre-compiled binary file and sometimes you’re stuck in Dependency Hell. There are also be many different ways to install software depending on which form it comes in. The most common ways you should use listed below.


Downloading binaries or jar archives

If the software is pre-compiled or is a java archive (ending in *.jar), you can just download it to your directory and run it.

For example, let’s say you want to install the latest version of Bowtie2 because the bowtie already installed on the cluster doesn’t have the new feature you want.

You would usually search for the software in your favourite search engine, then make your way to the download page. Bowtie2 is hosted on Sourceforge and has pre-compiled binaries, so we’ll download the software from there.

The Bowtie2 2.3.3 directory on Sourceforge currently has three items:

The first two items contain binaries for MacOS and Linux respectively. The “x86_64"refers to the architecture it’s meant to run on. The last file is an archive containing the source code if we want to compile it ourselves.

Since our cluster runs Linux, we can download bowtie2-2.3.3-linux-x86_64.zip. Right click on the link and copy the link address so we can download it to the server.

# Login and change to a directory where you want to store your software
cd software
# Download the file
wget https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.3.3/bowtie2-2.3.3-linux-x86_64.zip/download -O bowtie2-2.3.3-linux-x86_64.zip
# Unzip the directory
unzip bowtie2-2.3.3-linux-x86_64.zip
# Enter the directory and see what's inside
cd bowtie2-2.3.3 && ls
# Run and view the help options of bowtie
./bowtie2 --help

Java archives are also simple to run. Just download the .jar file, and execute something like:

java -jar <my_program>.jar

Sometimes you’ll need to specifiy the memory required like so:

# Specify 4gb as the maximum heap size
java -Xmx4g -jar <my_program>.jar

Conda packages

Conda is a package manager written in Python and is available for use on our clusters. Package managers make installing easy by automating the process and handing dependencies. Many bioinformatics tools have been wrapped for Conda installation and are in the bioconda channel.

You’ll need to create your own virtual environment to use Conda, since you won’t have write permission to install software in the default location. A virtual environment is an environment where you have your own isolated copy of Conda and your own software packages that you installed via Conda. Virtual environments work by appending the virtual environment’s bin directory to your $PATH.

You can learn more about Conda here.

If you’re using the Melbourne Bioinformatics clusters, you’ll need to module load Python into your environment path first.

# If you're using barcoo
module load python-intel/3.6

Let’s create a new Conda virtual environment and install a package.

# Create a virtual environment called my_new_env
conda create -n my_new_env

# Activate the enviroment
source activate my_new_env

# Your command line prompt should now start with (my_new_env) $

# Look at your path and see the first bin directory is your
# conda environment in your home directory
echo $PATH

You can manage packages using the conda command.

# Print conda help
conda -h

# List conda packages
conda list

You can install packages with conda install. For example, bioawk is available in the bioconda channel and you can install it like so:

# Install bioawk from the bioconda channel
conda install -c bioconda bioawk

When you’re finished using your software in conda, you can exit the virtual environment with:

source deactivate

And when you need your virtual environment again, you can reactivate it with:

source activate <my-environment-name>

Compiling from source

Compiling from source is another way to install software. Sometimes, only the source code is provided and in this case, you’ll need to compile the software yourself.

A common case is that the source code of the program you want to install is available on GitHub. Often, there are installation instructions in the README or INSTALL file which you can follow.

In the most simple case, a make command will run a Makefile and compile the software into a binary file which you can execute. Let’s compile bwa as an example.

# Clone the repository
git clone https://github.com/lh3/bwa.git
# Change directory
cd bwa
# Compile
make

A new file called bwa should be in the directory which you can run with ./bwa.

Software may also specify it needs to be built with ./configure, make, then make install. The configure script checks the environment the software is to be built in and checks dependencies. Typically, you can also specify additional options when running ./configure such as --prefix=/home/my_username/bin to specify where to install the software. Running the configure script should output a Makefile. Running make will built the software and make install will copy the executable files into the default or specified directory.

You can read more about using configure, make, and make install here and here.