NCBI Tools

NCBI BLAST+

The BLAST+ software package is a stand-alone version of the BLAST applications, maintained by the NCBI (Camacho et al., 2009). The find_domains pipeline requires it to identify conserved domains within genes. Install it by following the instructions for your operating system in the BLAST+ user guide, or on MacOS and Ubuntu use the Conda-based steps below.

MacOS and Ubuntu installation

  1. Open a Terminal window and activate the Conda environment:

    > conda activate pdm_utils
    (pdm_utils)>
    
  2. Install BLAST+ with Conda, which is the most straightforward option (the package is named blast on the bioconda channel):

    (pdm_utils)> conda install -c bioconda blast -y
    
  3. Test whether blast has been successfully installed:

    (pdm_utils)> blastp -help
    

If successful, a detailed description of the software’s options should be displayed.

If unsuccessful, an error message should be displayed, such as:

    -bash: blastp: command not found
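
The check in step 3 can also be scripted, which is convenient when setting up several machines. The following is a minimal sketch that assumes a POSIX-compatible shell. The function name check_blast_tool is hypothetical and is not part of pdm_utils or BLAST+; it only reports whether a command is on the PATH of the currently active environment.

```shell
# Hypothetical helper (not part of pdm_utils or BLAST+):
# report whether a command is available on the current PATH.
check_blast_tool() {
    tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        # command -v prints the resolved location of the executable
        echo "$tool found at: $(command -v "$tool")"
        return 0
    else
        echo "$tool not found; is the pdm_utils environment active?" >&2
        return 1
    fi
}
```

With the pdm_utils environment active, running check_blast_tool blastp should print the location of the executable if the installation succeeded, and return a non-zero status otherwise.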

NCBI Conserved Domain Database

The NCBI Conserved Domain Database is a curated set of protein domain families (Lu et al., 2020). This database is required in the find_domains pipeline to identify conserved domains within genes.

  1. Download the compressed NCBI CDD archive from the NCBI.

  2. Expand the archive into a directory of CDD files. The entire directory constitutes the database, so do not add or remove any files.
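
Step 2 can be sketched in the shell as follows. This assumes the downloaded file is a gzip-compressed tar archive; the function name expand_cdd and both of its arguments are hypothetical placeholders, so substitute the name of the file you actually downloaded and a destination directory of your choice.

```shell
# Hypothetical helper: expand a downloaded CDD archive into its own directory.
# Assumes a gzip-compressed tar archive; both arguments are placeholders.
expand_cdd() {
    archive="$1"   # path to the downloaded archive
    dest="$2"      # directory that will hold the CDD files
    mkdir -p "$dest" || return 1
    # -x extract, -z decompress gzip, -f archive file, -C target directory
    tar -xzf "$archive" -C "$dest" || return 1
    echo "CDD files expanded into: $dest"
}
```

Extracting into a dedicated, otherwise empty directory makes it easier to keep the database intact, since nothing else is written there.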