get_db: download and install a database
The Hatfull lab server hosts the primary actinobacteriophage database instance, Actino_Draft, which is routinely updated with new genomics data, as well as databases that have been frozen for publication. These databases can be downloaded and installed using the pdm_utils get_db tool.
To view the list of available databases on the Hatfull lab server and select one interactively:
> python3 -m pdm_utils get_db server
The pdm_utils get_db interactive tool functions as a faux command line interface that allows you to view the database packages at the specified URL and navigate through subdirectories. Navigation through subdirectories is modelled after common CLI commands:
root@http://databases.hatfull.org/::$ ls
Abscessus_prophages PhageDatabase
Actino_Draft_v6 Actino_Draft
Actino_prophage Published/
root@http://databases.hatfull.org/::$ cd Published
root@http://databases.hatfull.org/::/Published$ ls
Actinobacteriophage_1060 Actinobacteriophage_1321
Actinobacteriophage_2422 Actinobacteriophage_554
Actinobacteriophage_685 Actinobacteriophage_692
../
With the pdm_utils get_db interactive tool you can see a description of a database package that is available at the specified URL:
root@http://databases.hatfull.org/::$ desc PhageDatabase
Name:
PhageDatabase
Date:
202x-xx-xx
Description:
This database contains sequenced phages from the John Doe lab.
A database of choice can be selected using the pdm_utils get_db interactive tool with the following:
root@http://databases.hatfull.org/::$ select PhageDatabase
To download and install the current version of a database, like the Actino_Draft database, without the interactive tool:
> python3 -m pdm_utils get_db server -db Actino_Draft
> python3 -m pdm_utils get_db server --database Actino_Draft
The -db argument ‘Actino_Draft’ names the database to download from the server; the tool installs it under the same name in your local MySQL instance. The database is downloaded and installed, and then the downloaded file is removed.
Note
This will overwrite an existing Actino_Draft database, if it exists.
To download and install a database from a non-standard server, specify the URL:
> python3 -m pdm_utils get_db server -db PhageDatabaseName -u http://custom/server/website
> python3 -m pdm_utils get_db server -db PhageDatabaseName --url http://custom/server/website
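The non-interactive calls above can also be assembled from a script. The following is a minimal sketch, not part of pdm_utils: the helper name build_get_db_command is hypothetical, and it uses only the --database and --url flags shown in this section. It prints the command rather than running it.

```python
import shlex
import sys


def build_get_db_command(database, url=None):
    """Return the argv list for a non-interactive `get_db server` call.

    Hypothetical helper: only flags shown in this section are used
    (--database and --url).
    """
    cmd = [sys.executable, "-m", "pdm_utils", "get_db", "server",
           "--database", database]
    if url is not None:
        # Non-standard server: pass its URL explicitly.
        cmd += ["--url", url]
    return cmd


# Print the command instead of executing it.
print(shlex.join(build_get_db_command("Actino_Draft")))
```

To actually run the download, the returned list could be passed to subprocess.run(cmd, check=True), assuming pdm_utils and MySQL are installed and configured.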
The pdm_utils get_db tool checks your local database version against the specified server database package and will not download the package if the local database version is equal to or higher than the package version. This prevents redundant downloads and loss of data. To ignore this check:
> python3 -m pdm_utils get_db server -db Actino_Draft -fp
> python3 -m pdm_utils get_db server -db Actino_Draft --force_pull
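The version check described above amounts to a simple comparison. The sketch below illustrates that rule only; it is not the pdm_utils implementation. The function name should_download, the integer version numbers, and the use of None for a missing local database are all assumptions made for illustration.

```python
def should_download(local_version, package_version, force_pull=False):
    """Illustrate the download rule described in the text.

    Assumptions (not taken from pdm_utils): versions are comparable
    integers, and local_version is None when no local database exists.
    """
    if force_pull:
        # Equivalent in spirit to passing -fp / --force_pull.
        return True
    if local_version is None:
        # Nothing installed locally, so there is nothing to protect.
        return True
    # Skip the download when the local copy is equal to or newer
    # than the package on the server.
    return local_version < package_version
```

Under these assumptions, should_download(400, 400) is False, while should_download(400, 400, force_pull=True) is True.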
Databases can be downloaded and installed in two steps, which can be used to install a database under a new name:
First, download the database SQL and version files from the Hatfull server using the ‘-d’ and ‘-v’ flags. This will save the database to your local computer as a SQL file (e.g. Actino_Draft.sql) without installing it in MySQL. Also specify where the file should be downloaded using the ‘-o’ flag (if omitted, the default is the /tmp/ directory):
> python3 -m pdm_utils get_db server -db Actino_Draft -o ./ -d -v
Next, indicate the new name of the database to be created (e.g. NewDB), indicate that a local file will be used with ‘file’, and indicate the path to the downloaded SQL file:
> python3 -m pdm_utils get_db NewDB file ./downloaded_db/Actino_Draft.sql
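The two steps can be scripted together. This is a minimal sketch, assuming the argument order shown in the two commands above. The helper name two_step_commands is hypothetical, and the SQL path is passed in by the caller rather than inferred from the output directory. The commands are printed, not executed.

```python
import shlex
import sys


def two_step_commands(database, new_name, sql_path, output_dir="./"):
    """Return (download, install) argv lists for the two-step workflow.

    Hypothetical helper that mirrors the two commands shown in the text.
    """
    base = [sys.executable, "-m", "pdm_utils", "get_db"]
    # Step 1: download the SQL and version files without installing.
    download = base + ["server", "-db", database, "-o", output_dir, "-d", "-v"]
    # Step 2: install the downloaded SQL file under a new database name.
    install = base + [new_name, "file", sql_path]
    return download, install


download, install = two_step_commands(
    "Actino_Draft", "NewDB", "./downloaded_db/Actino_Draft.sql")
for step in (download, install):
    print(shlex.join(step))
```

Running each list with subprocess.run(step, check=True) would stop the script if the download fails, so the install step never runs against a missing file.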
A database management configuration file can automate the pipeline so that user input, such as MySQL credentials or the server URL, is not needed:
> python3 -m pdm_utils get_db server -db Actino_Draft -o ./ -d -v -c config_file.txt