Stacks Databases
Database management
Databases are usually large in size and we have limited space on our clusters. To remedy this, please remove any databases you don’t need. If you have a database older than 3 months in the system, you will be asked to clean up your old databases.
Naming your databases
Your database name must end with _radtags
to show up in the web interface.
Make sure to include your name and the date your database was created in your database name. Some examples of good database names are:
alice_mozzie_20170304_radtags
bob_possum_filtered_20170701_radtags
carol_AaegL2_males_20170122_radtags
If your name and a creation date is not included in your database name, your database may be deleted without warning.
Removing databases
Delete databases by using the SQL DROP DATABASE
statement.
mysql -e "DROP DATABASE jess_20170815_radtags"
Backing up databases
You can also backup your database with the mysqldump
command. The output is
in plain-text, so use gzip
to compress the file to save space.
mysqldump --databases jess_20170815_radtags \
| gzip \
> ~/my_backups/jess_20170815_radtags.sql.gz
Restoring a database from backup
You can restore your saved databases by importing your file into MySQL.
# Uncompress your file
gunzip jess_20170815_radtags.sql.gz
# Import from file
mysql jess_20170815_radtags < ~/my_backups/jess_20170815_radtags.sql
Running Stacks on clusters which don’t allow databases
If you’re using the barcoo cluster for your analysis, you won’t be able to load your catalog to a database on the cluster. This means you’ll need to do your computation (on barcoo) separately to a machine that alows database access.
On barcoo
If you’re using ref_map.pl
or denovo_map.pl
you can use the -S
option
to disable all database interaction. The output from this step should be
many tsv files.
Example output:
batch_1.catalog.alleles.tsv
batch_1.catalog.snps.tsv
batch_1.catalog.tags.tsv
batch_1.haplotypes.tsv
batch_1.hapstats.tsv
batch_1.markers.tsv
batch_1.populations.log
batch_1.sumstats_summary.tsv
batch_1.sumstats.tsv
batch_1.vcf
mozzie-1.alleles.tsv
mozzie-1.matches.tsv
mozzie-1.models.tsv
mozzie-1.snps.tsv
mozzie-1.tags.tsv
mozzie-2.alleles.tsv
mozzie-2.matches.tsv
mozzie-2.models.tsv
mozzie-2.snps.tsv
mozzie-2.tags.tsv
...
Transfer your data
You’ll need to transfer your output files from ref_map.pl
or denovo_map.pl
to another computer which you have permission to create databases on. Since
tsv files are plain-text, it’s good practice to compress them before transferring.
You can use a graphical FTP client or the command line to transfer files.
Here we’ll use rsync
with the --compress
option to do transfer the files.
On barcoo, in an interactive slurm session, you can transfer the directory
called catalog
with the following command:
rsync -av --compress --update --progress \
catalog \
<username>@<IP-address>:<path_to_my_project_directory>
replacing <username>
, <IP-address>
, and <path_to_my_project_directory>
with your relevant information.
Load the catalog with load_radtags.pl
On your machine which you have permission to create databases, load your
catalog with load_radtags.pl
.
# Create an empty database
mysql -e "CREATE DATABASE jess_20170815_radtags"
# Load template tables
mysql jess_20170815_radtags < /mnt/galaxy/gvl/software/stacks/share/stacks/sql/stacks.sql
# Load RAD tags
load_radtags.pl \
-D jess_20170815_radtags \
-p catalog \
-b 1
-B -e "Description here" \
-c
Index the catalog with index_radtags.pl
You’ll then need to create indices for your database. Use -c
to generate
the catalog index and -d
to generate the unique tags index.
index_radtags.pl \
-D jess_20170815_radtags \
-c \
-t
View your data in the web interface
With your web browser, go to the server you have your data on, and navigate to the Stacks page. Click on your database name to view it.