Official BioV Documentation

The link to the BioV Webpage is http://biov.tcdb.org/

To view documentation on all BioV tools, click this link:

To read the paper by Reddy & Saier, 2012, describing in detail how each program works, click here

Creating Super Family Trees

This tutorial will explain how to make Super Trees using the SFT1 approach

The SFT1 approach builds a tree at the level of individual TCDB sequences. To build an SFT2 instead, do not use getNcbiSeq.pl; build FASTA files that contain homologs of all the members of a subfamily or group, much like the FASTA files that represent an entire TC family. Once you have a FASTA file for each group to be included in your SFT2, run supertree.pl.

The programs used here have recently been updated to run faster, to fix some labeling problems (caused by PHYLIP's 10-character label limit), and to run fitch and consense. These instructions reflect the new usage protocols.


Using Ancient-Rep to find internal TMS repeat units

Ancient-Rep will find internal TMS repeats using a list of homologs.

Enter ‘ancient’ in the Terminal app to begin.


Create a list of homologs to represent an entire TC Family

Having a FASTA file that defines an entire family is very useful if you want to find repeats with Ancient or use TSSearch/Protocol2 to compare two entire families for homology.
A TC family ID looks like this: 2.A.1 (it has three parts, separated by periods).
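
As a quick sanity check before running any of the tools, the three-part shape of a family ID can be tested in the shell. This is a minimal sketch: the function name is_tc_family and the exact pattern (number, capital letter, number) are assumptions made for illustration and are not part of BioV.

```shell
# Hypothetical helper (not part of BioV): succeed only if the argument
# has the three-part family shape NUMBER.LETTER.NUMBER, e.g. 2.A.1.
is_tc_family() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[A-Z]\.[0-9]+$'
}

is_tc_family "2.A.1" && echo "2.A.1 looks like a family ID"
is_tc_family "2.A.1.1.1" || echo "2.A.1.1.1 has more than three parts"
```

An ID with extra parts, such as 2.A.1.1.1, is rejected by this check because it has more than three parts.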

The program we are using is called define_family.py

Usage: define_family.py FAMILY <P/PSI> OUTPUT

Open up your terminal application and type:

cd ~/Desktop/ # Changes your working directory to your desktop.
define_family.py 2.A.1 P output.faa # P or PSI

The “P” option refers to BLASTP. Alternatively we can use “PSI” if we are looking for more distant homologs. When comparing families or looking for repeats, it is best to use the “P” option. If no good results are found, then use “PSI”.
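
Once the command has finished, one simple way to see how many homologs ended up in the output file is to count the FASTA header lines, which start with ">". The helper below is a sketch; count_fasta is a hypothetical name and is not one of the BioV programs.

```shell
# Hypothetical helper (not part of BioV): count the sequences in a
# FASTA file by counting header lines, which begin with ">".
count_fasta() {
    grep -c '^>' "$1"
}

# Example, assuming output.faa was written to the current directory:
#   count_fasta output.faa
```

Comparing the counts from a "P" run and a "PSI" run of the same family gives a rough idea of how many additional, more distant homologs the PSI option retrieved.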

When prompted, enter 0.7 for the CD-Hit threshold if you are about to compare this family to another, or 0.9 if you are searching for repeats. These settings remove proteins that are at least 70% or 90% identical, respectively, to the representative sequence of their cluster.

We use forgiving thresholds because a very large FASTA list will not cost us much time, as long as we are using TSSearch. When looking for repeats, we don’t want to eliminate too many sequences. This becomes apparent when doing a vertical search with Ancient: a good example of a TMS repeat across two homologs can be masked if the threshold is any lower.
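
The threshold guidance above can be captured in a small shell helper to keep the two use cases straight. This is only a sketch: cdhit_threshold and its compare/repeats keywords are hypothetical names, not options of define_family.py, which still asks for the value at its own prompt.

```shell
# Hypothetical helper (not part of BioV): print the CD-Hit threshold
# suggested in this tutorial for a given purpose.
cdhit_threshold() {
    case "$1" in
        compare) echo 0.7 ;;   # comparing this family to another
        repeats) echo 0.9 ;;   # searching for internal repeats
        *)
            echo "usage: cdhit_threshold compare|repeats" >&2
            return 1
            ;;
    esac
}

echo "Threshold for a family comparison: $(cdhit_threshold compare)"
echo "Threshold for a repeat search: $(cdhit_threshold repeats)"
```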

Super Family Tree (SFT) Program

Download and Installation:

The SFT programs can be downloaded from the Lab-Software section of the Transporter Classification Database (TCDB; www.tcdb.org) website. They require a macOS computer with the BLAST programs (psiblast, blastcl3, blastall, etc.) and cd-hit installed. Download getNcbiSeq.pl and supertree.pl and place them in your bin folder; if necessary, use the chmod command to make them executable. Then download and install fitch and consense, which are part of the PHYLIP package (see http://evolution.genetics.washington.edu/phylip/doc/protpars.html). The older version of the Tree View (TV) program can be downloaded from the Biotools section of TCDB or from http://taxonomy.zoology.gla.ac.uk/rod/treeview.html; newer versions can be downloaded from http://darwin.zoology.gla.ac.uk/~rpage/treeviewx/download.html.
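
The copy-and-chmod step can be sketched as a small shell function. The function name install_sft_scripts and the example directories ~/Downloads and ~/bin are assumptions made for illustration; substitute wherever you saved the scripts and whichever bin folder is on your PATH.

```shell
# Hypothetical helper (not part of the SFT distribution): copy the two
# SFT scripts from a download directory into a bin directory and make
# them executable for the current user.
install_sft_scripts() {
    src_dir="$1"
    bin_dir="$2"
    mkdir -p "$bin_dir" || return 1
    for prog in getNcbiSeq.pl supertree.pl; do
        if [ ! -f "$src_dir/$prog" ]; then
            echo "missing: $src_dir/$prog (download it first)" >&2
            return 1
        fi
        cp "$src_dir/$prog" "$bin_dir/" && chmod u+x "$bin_dir/$prog" || return 1
    done
}

# Example, assuming the scripts were saved to ~/Downloads:
#   install_sft_scripts ~/Downloads ~/bin
```

The function stops at the first missing script, so a partial download is reported rather than silently ignored.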


Experiencing difficulties?

For assistance with these programs:

e-mail: jchen@alumni.ucsd.edu

Please include the following with your request:

(1)  name

(2)  lab

(3)  current project

(4)  contact info (phone number and email)

(5)  location of your files

(6)  the exact nature of your problem

Response times vary from 1 to 3 days. I will do my best to respond and resolve issues promptly. Thanks.

 
