Computational Biology Resource Center

Transcription Factor Searches

MUSC non-Net-based Transcription Factor Searches

If you wish to scan your DNA sequence for transcription factor binding sites, you have two choices in GCG.

I. Here's the first process

1) Start GCG and "fetch" tfsites.dat.
   tfsites.dat is a compilation of known transcription factor binding sites along with their literature references. It is a bit outdated (who knows why this is the version supplied by GCG, and there was NOT a newer release available from NCBI...). The current tfsites.dat file fetched this way is dated 1996. A 2003 version in GCG format (tfsites.gcg) may be retrieved from this site.

2) Select the sequence to be scanned and use findpatterns with the following option: -data=tfsites.dat
   i.e.:
      findpatterns -dat=tfsites.dat filename.seq
                   ^^^^^^^^^^^^^^^^

3) The output file, i.e. filename.find, contains all the locations where the recognition sites for transcription factors have been mapped. You can scan through the file to see if anything looks useful to you.

4) There is no provision for finding the references to these known sites. This is a flaw in GCG. To get around this, I wrote a short unix shell script, with a huge amount of help from Karen Jesmer at CCIT, which reads your findpatterns output file and then gets the references which match your hits.

Send an email to Starr Hazard requesting the unix shell program starr
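The reference lookup described in step 4 can be illustrated with a minimal sketch. This is NOT the starr script, and it is written in Python rather than as a unix shell script. It assumes that each entry line in the pattern file holds a site name as its first field and the literature reference after a "!" character, and that you have already collected the names of the sites that findpatterns reported. The function names, the entry layout, and the sample entries below are all hypothetical placeholders, not real tfsites.dat content.

```python
def load_references(pattern_lines):
    """Map each site name to the reference text found after '!' on its line.

    Assumed layout (an assumption, not confirmed by GCG documentation here):
        NAME  OFFSET  PATTERN  OVERHANG  ! reference text
    """
    refs = {}
    for line in pattern_lines:
        fields, sep, doc = line.partition("!")
        parts = fields.split()
        if not sep or not parts:
            continue  # skip lines with no '!' or no site name before it
        refs.setdefault(parts[0], []).append(doc.strip())
    return refs


def references_for_hits(hit_names, refs):
    """Return one 'name<TAB>reference' line per reference of each distinct hit."""
    report = []
    seen = set()
    for name in hit_names:
        if name in seen:
            continue  # report each site name only once
        seen.add(name)
        for doc in refs.get(name, ["(no reference found)"]):
            report.append(f"{name}\t{doc}")
    return report


# Hypothetical placeholder entries, NOT real tfsites.dat content.
SAMPLE_PATTERNS = [
    "SITE-A  1  ACGTAC  0  ! placeholder reference A",
    "SITE-B  1  TTGCAA  0  ! placeholder reference B",
]

if __name__ == "__main__":
    refs = load_references(SAMPLE_PATTERNS)
    for row in references_for_hits(["SITE-B", "SITE-B", "SITE-X"], refs):
        print(row)
```

A site that appears several times in the hit list is reported once, and a name with no matching entry is flagged rather than silently dropped, so missing references are easy to spot.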

5) Using "starr"
   a) Run findpatterns with -dat=tfsites.dat
   b) Type sh, starr, the findpatterns output file, and the ref file to be created, i.e.
         sh starr filename.find filename.findref
   c) Examine filename.findref for the references which your findpatterns search located.

II. Here's the second way

1) The tfsites.dat file may also be read by any of the map programs to locate the tfsites along your sequence. Type:
      mapplot -dat=tfsites.dat filename.seq
              ^^^^^^^^^^^^^^^^
   This will create a "digestion" map showing the places where recognition sites from tfsites.dat are found. Of course, the MAP program will use the tfsites.dat file as well. Type:
      map -dat=tfsites.dat filename.seq
          ^^^^^^^^^^^^^^^^

7) OR you could use Dan Prestridge's SignalScan program. To use this you must add two lines to your .cshrc file. Then save the modifications. Finally, type "source .cshrc" to activate the changes (or start a new shell, or log out and then log in again). Typing "signal" should initiate SignalScan. This is not a better program; it's just different. You can go back and look at the references, but only one at a time.

Two lines must be added to your .cshrc file. Send an e-mail to Starr Hazard to get these two lines.



8) OR finally, refer to the following links to Web resources. These do not generally work better or faster, but they do give you hypertext links to the references and are therefore more convenient in that regard.

Net-based Transcription Factor and Promoter Search Services

revised by ESH August 14, 2012


