NCBI Tools¶
NCBI BLAST+¶
The BLAST+ software package is a suite of stand-alone BLAST applications maintained by the NCBI (Camacho et al., 2009). This software is required in the find_domains
pipeline to identify conserved domains within genes. Follow the installation instructions for your operating system in the BLAST+ user guide.
MacOS and Ubuntu installation¶
Open a Terminal window and start the Conda environment:
> conda activate pdm_utils
(pdm_utils)>
The most straightforward option is to use Conda to install blast:
(pdm_utils)> conda install -c bioconda blast -y
Test whether blast has been successfully installed:
(pdm_utils)> blastp -help
If successful, a detailed description of the software’s options should be displayed.
If unsuccessful, an error message should be displayed, such as:
-bash: blastp: command not found
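The same check can be scripted so that a missing executable is reported before anything else runs. The following is a minimal sketch, not part of pdm_utils: the helper name check_tool is hypothetical, and it only assumes a POSIX shell plus the standard blastp -version flag.

```shell
# check_tool is a hypothetical helper for illustration, not part of pdm_utils or BLAST+.
# It reports whether an executable can be found on the current PATH.
check_tool() {
    # command -v returns non-zero when the name cannot be resolved
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}

# Print the installed BLAST+ version only when blastp is available.
if check_tool blastp; then
    blastp -version
fi
```

If blastp is reported as not found, confirm that the pdm_utils Conda environment is active before reinstalling.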
NCBI Conserved Domain Database¶
The NCBI Conserved Domain Database is a curated set of protein domain families (Lu et al., 2020). This database is required in the find_domains
pipeline to identify conserved domains within genes.
Download the compressed NCBI CDD archive.
Extract the archive into its own directory of CDD files. The entire directory constitutes the database, so do not add or remove any files.
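The extraction step can be sketched in the shell as follows. This assumes the download is a gzip-compressed tar archive and that tar is available. The helper name extract_cdd and both paths in the usage line are placeholders, not names defined by NCBI or pdm_utils.

```shell
# extract_cdd is a hypothetical helper for illustration.
# Usage: extract_cdd <archive.tar.gz> <destination_directory>
extract_cdd() {
    archive="$1"
    dest="$2"
    # Refuse to continue if the archive path does not point to a file
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # Unpack into a dedicated directory so the database files stay together
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}
```

For example, extract_cdd ~/Downloads/cdd_archive.tar.gz ~/cdd_database would unpack the placeholder archive into a dedicated directory. Keeping that directory free of unrelated files follows the requirement that nothing be added to or removed from the database.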