Automated NOESY Assignment Using CYANA
Latest revision as of 21:54, 6 January 2010
Introduction
Below is the description of how to run CYANA 2.1 for automated NOE assignment if you are working with CARA. A tutorial for performing structure calculations with automated NOESY assignments using CYANA 3.0 is available on-line.
Input files
Required files
- Initialization file init.cya.
- SequenceList in XEASY format - usually XXXX.seq, where XXXX is the NESG ID.
- AtomList in XEASY format XXXX.prot . Chemical shifts should be real, not folded. Make sure that you are using the most recent file. Atom labels should be swapped if using stereospecific assignments.
- Separate unfolded PeakList for 15N and 13C NOESY: n.peaks, ali.peaks, aro.peaks.
Optional files
- Stereospecific assignment script (such as stereofound.cya from FOUND/HABAS). Note that this script should contain only atom stereo declarations, but no atom swap statements! Atom labels must be already swapped in the AtomList and external UPL files.
- External UPL files, such as short.upl. Atom labels should be swapped if using stereospecific assignments.
- External ACO files, such as gridsearch.aco output of FOUND/HABAS.
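Before starting a run, it can help to confirm that the required files listed above are actually present in the working directory. The following is a minimal sketch of such a check, not part of CYANA: the function name `missing_inputs` and the NESG ID `XX01` are hypothetical, and only the file names come from the list above.

```python
import tempfile
from pathlib import Path

# File names taken from the "Required files" list; {id} stands for the NESG ID.
REQUIRED_TEMPLATES = ["init.cya", "{id}.seq", "{id}.prot",
                      "n.peaks", "ali.peaks", "aro.peaks"]


def missing_inputs(workdir, nesg_id):
    """Return the required input files that are absent from workdir."""
    workdir = Path(workdir)
    names = [template.format(id=nesg_id) for template in REQUIRED_TEMPLATES]
    return [name for name in names if not (workdir / name).is_file()]


# Demo in a throwaway directory with a hypothetical NESG ID "XX01":
with tempfile.TemporaryDirectory() as tmp:
    for name in ("init.cya", "XX01.seq", "n.peaks"):
        (Path(tmp) / name).touch()
    demo_missing = missing_inputs(tmp, "XX01")

print(demo_missing)
```

The optional files are deliberately left out of the check, since a run without external constraints is legitimate.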
Format Conversion
The input files (sequence, atom list, ACOs and UPLs) must adhere to the IUPAC nomenclature used by CYANA 2.1 (e.g., H instead of HN). CARA is fully compatible with this nomenclature, while data from other programs may need to be converted.
Conversion from XEASY/DYANA/CYANA 1.X
For automated noeassign runs of CYANA, make sure that your chemical shift list conforms to the IUPAC nomenclature (e.g., H instead of HN). To update your atom names, do the following in CYANA:

    translate dyana
    read protein.prot
    translate off
    write protein-cyana.prot

The protein-cyana.prot file now contains all of the correct atom names for CYANA.
You may need to do the same with UPLs created in DYANA or CYANA 1.X.
See the ~/demo/details/MigrateFromDyanaCyana1.cya example script in the CYANA 2.1 installation directory for details.
Conversion from Sparky
CYANA can also read chemical shifts in BMRB format by using the following commands:

    ...
    read bmrb protein.bmrb
    write prot protein.prot

For Sparky users, please use Sparky command xe to write out XEASY format peaklists.
Splitting the simultaneous NOESY peaklist
When working with CARA it is not necessary to provide external ACO and UPL files. In CARA spin assignments are not derived from peak lists, and there is less impact from CYANA modifying existing peaks assignments. When external constraints are employed there are usually fewer peaks assigned and fewer UPLs derived. Thus it is recommended to use external UPL and ACO files only if there are convergence problems without them.
When using a simultaneous 3D NOESY peak list in XEASY format, you need to generate separate peak lists with UBNMR. The following UBNMR macro is provided as an example. It calculates proper 15N chemical shifts and peak positions, and writes out separate nnoe.peaks and cnoe.peaks peak lists. Modify the numbers to reflect the proper 15N and 13C carrier offsets (in ppm) and the spectral width ratio (sw2/sw2N).

    init
    read seq xxx.seq
    write seq xxxseq.bmrb autoBMRB
    read prot xxx-simnoesy.prot
    read peaks xxx-simnoesy.peaks
    update peak shift N -35.700 1.0822510
    update peak shift N 117.273 1
    write peaks ncnoe.peaks
    split ncnoe.peaks nnoe.peaks cnoe.peaks
    update proton shift N -35.700 1.0822510
    update proton shift ND2 -35.700 1.0822510
    update proton shift NE -35.700 1.0822510
    update proton shift NE2 -35.700 1.0822510
    update proton shift NE1 -35.700 1.0822510
    update proton shift N 117.273 1
    update proton shift ND2 117.273 1
    update proton shift NE 117.273 1
    update proton shift NE2 117.273 1
    update proton shift NE1 117.273 1
    write prot noe.prot

External UPL Files
noeassign employs the so-called "sum of r^-6" averaging method (peaks calibrate) to calibrate peak lists and to interpret UPLs during the calculation. Therefore, external UPLs should ideally be calibrated with the same method.
If you supply UPL constraints created with CALIBA (CALIBA uses "center" averaging), you should be aware that these constraints will be too loose.
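To illustrate why the averaging method matters, the sketch below evaluates the standard r^-6 summation, d_eff = (sum of r_i^-6)^(-1/6), for one proton interacting with the three protons of a methyl group. The distances are made-up illustration values and the function name is hypothetical; this is not CYANA code. The point is only that the r^-6-summed distance is always shorter than the shortest individual distance, so a limit calibrated for a single central position does not carry over directly.

```python
def r6_sum_distance(distances):
    """Effective distance (sum_i r_i**-6) ** (-1/6), in the unit of the inputs."""
    if not distances:
        raise ValueError("at least one distance is required")
    return sum(r ** -6 for r in distances) ** (-1.0 / 6.0)


# Hypothetical distances (in Angstrom) to the three protons of a methyl group
methyl = [3.0, 3.5, 4.0]
d_eff = r6_sum_distance(methyl)

print(f"shortest individual distance: {min(methyl):.2f} A")
print(f"r^-6 summed distance:         {d_eff:.2f} A")
```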
Using Unassigned Peaklists
If you are using a completely unassigned peak list (for example, one picked from scratch in CARA), then you will need to add the following line to the peak list header:
#CYANAFORMAT HNh
or
#CYANAFORMAT HCh
The lowercase h denotes the indirect (NOE) 1H dimension.
If your peaklist contains assigned peaks, then CYANA will be able to determine the peaklist dimensions based on these assignments.
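Adding the header line by hand is easy to forget when several peak lists are regenerated. The sketch below shows one way to append the line to the header programmatically. It assumes that the header consists of the leading lines starting with `#` (such as the XEASY `# Number of dimensions 3` line); the function name `add_cyanaformat` and the placeholder peak entry are hypothetical.

```python
def add_cyanaformat(peaklist_text, fmt):
    """Return peaklist_text with a '#CYANAFORMAT <fmt>' line appended to its header."""
    lines = peaklist_text.splitlines()
    if any(line.startswith("#CYANAFORMAT") for line in lines):
        return peaklist_text  # already declared, leave the text untouched

    # Assumption: the header is the block of leading lines that start with '#'.
    header_len = 0
    for line in lines:
        if not line.startswith("#"):
            break
        header_len += 1

    lines.insert(header_len, f"#CYANAFORMAT {fmt}")
    return "\n".join(lines) + "\n"


# Placeholder peak list: one header line followed by a stand-in for the peak entries
example = "# Number of dimensions 3\n<peak entries follow>\n"
updated = add_cyanaformat(example, "HNh")
print(updated)
```

Calling the function again on its own output returns the text unchanged, so it is safe to run repeatedly on the same file contents.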
Running Automated Structure Calculation with CYANA 2.1
- Create a working subdirectory (for example, structure/cyana21/calc1).
- Create an init.cya file as described in Getting Started or copy a previously used file. Set an appropriate RMSD calculation range.
- Copy the latest sequence (XXXX.seq) and peaklist files (n.peaks, ali.peaks and aro.peaks) into the working directory. The sequence file and peaklist should in principle be the same as those used to run FOUND.
- Copy the updated atomlist (XXXX.prot). The spin labels in it should be swapped according to the output of FOUND.
- If you used FOUND, then copy the gridsearch.aco file from the previous FOUND run.
- If you used FOUND, then copy the stereofound.cya file from the previous FOUND run. Make sure that incorrect stereospecific assignments have been commented out or removed.
- (Optional) Generate the short-range UPL (short.upl) file based on the existing peak assignments. This is more convenient to do on a workstation. You can use the make_short.cya script (see below). Alternatively, you can define a KEEP subroutine in the CALC.cya file.
- Download the CALC.cya script (see below) and modify it according to the input data.
You can run the structure calculation either on a local Linux workstation or on the U2 Linux cluster. Typical run times on a single workstation are 1.5 - 3 hours, depending on the protein size. Calculations on the cluster take only 15 - 30 minutes, but there may be additional queue waiting time. On weekdays during working hours (9 a.m. - 4 p.m.) there are 10 dual-processor nodes reserved for us only, and there is no waiting time.
Check the queue status page and the nodemap page to see the current system loads on U2.
To run calculations on the U2 Linux cluster:
- Log in to u2.ccr.buffalo.edu
- Change directory to /san/projects1/szypersk/.
- Create a working subdirectory (like username/XXXX/cyana21)
- Copy the entire subdirectory calc1. You can use gftp, scp or sftp.
- Download the PBS submission script cyana.pbs (see below). Modify it if needed.
- Type qsub cyana.pbs to submit your job.
To run calculations on a workstation:
- Start CYANA 2.1 by typing cyana21
- Enter CALC at the cyana prompt.
Output files
- final.pdb - resulting structure
- final.ovw - final overview file
- final.upl - final UPL file (unambiguous constraints; atom labels may be swapped)
- *-final.prot - final atom list (chemical shifts unchanged?; atom labels may be swapped)
- finalstereo.cya - stereospecific assignment file (to find swapped atom pairs see calculation log)
- *-cycle7.peaks - assigned peaklists (in CYANA 2.1 format with multiple assignments)
- cycleX.* - UPL, OVW, PDB and NOA files for cycle X (ambiguous constraints in UPL files)
The noeassign macro in CYANA 2.1 performs 7 routine calculation cycles and one final cycle. The output files are labeled cycle1.*, cycle2.* ... cycle7.* and final.* with appropriate extensions. An additional stereospecific assignment search is performed after cycle 7; therefore the files final.upl and *-final.prot are likely to have some labels swapped.
Assigned peak lists are saved after cycle 7. They may contain multiple assignments for some peaks and are therefore not fully compatible with XEASY.
Always check the output of the CYANA calculation for the results of the peakcheck command. It is executed before the first calculation cycle and reports various inconsistencies in the atom list and peak lists. In the end, many UPL violations can be traced back to mistakes in assignment or mis-picked peaks.
Example scripts
Below are the key scripts for running CYANA. See the demo subdirectory of CYANA installation for more details.
make_short.cya

    peaks      := n,ali,aro                 # names of peak lists
    prot       := $name                     # names of proton lists
    tolerance  := 0.05,0.02,0.3             # chemical shift tolerances
                                            # order: 1H(a), 1H(b), 13C/15N(b), 13C/15N(a)
    calibration:= 1.7E6,1.7E6,1.7E6         # calibration constants (will be determined
                                            # automatically, if commented out)
    dref       := 4.2                       # average upper distance limit for
                                            # automatic calibration
    peakcheck peaks=$peaks prot=$prot
    calibration prot=$prot peaks=$peaks constant=$calibration dref=$dref
    peaks calibrate "**" simple
    write upl short.upl

For the calibration parameter you can provide the list of calibration constants that you derived for the "backbone" class with caliba when you calibrated the initial peak lists for use with FOUND/HABAS. Do not comment out or delete this line; leave the value blank if you want automatic calibration. Automatic calibration uses the dref parameter as the presumed average distance for all peaks in a peak list (not just for the backbone, as in caliba).
CALC.cya

    peaks       := n,ali,aro                # names of NOESY peak lists
    prot        := $name                    # names of chemical shift lists
    constraints := gridsearch.aco,short.upl,stereofound.cya # additional (non-NOE) constraints
    tolerance   := 0.05,0.02,0.4            # chemical shift tolerances
                                            # order: 1H(a), 1H(b), 13C/15N(b), 13C/15N(a)
    #upl_values := 2.4,6.0                  # calibration cutoffs
    calibration := 1.7E6,1.7E6,1.7E6        # NOE calibration parameters
    structures  := 100,20                   # number of initial, final structures
    steps       := 10000                    # number of torsion angle dynamics steps
    rmsdrange   := 10..100                  # residue range for RMSD calculation
    randomseed  := 434726                   # random number generator seed
    dref        := 4.0                      # average distance for calibration, default 4.0
    keep        :=                          # set to KEEP to retain existing assignments

    subroutine KEEP
       peaks select "*,* number=20000..37999"
    end

    #protocol := noeassign.out              # output logging on
    noeassign peaks=$peaks prot=$prot calibration=$calibration keep=$keep autoaco
    #protocol :=

The constraints parameter can be a comma-separated list of any external constraints that can be read by the read data command in CYANA. You can supply UPLs, ACOs and even .cya scripts, for example one defining stereospecific assignments of methyl groups. Do not comment out this line; leave the value blank if you are not providing external constraints.
If you are providing stereospecific assignments, do not use atom swap in the stereofound.cya script. Use an atom list and a short.upl file with all required labels already swapped, so that stereofound.cya contains only atom stereo declarations.
For the tolerance parameter pay attention to the unintuitive dimension order. The recommended tolerances are: 0.03 ppm or less for 1H (0.02 ppm or less for 2D homonuclear peaklists) and 0.6 ppm or less for 15N and 13C.
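A small check can catch tolerance values that exceed the recommendations above. The sketch below parses a `tolerance := ...` line and compares it against the recommended maxima for 3D peak lists. It assumes, following the comment in the script, that the first two values are the 1H tolerances and the third is the 13C/15N tolerance; the function name `check_tolerance` is hypothetical and the check is not part of CYANA.

```python
# Recommended maxima (ppm) quoted in the text for 3D peak lists
RECOMMENDED_MAX = {"1H": 0.03, "13C/15N": 0.6}


def check_tolerance(line):
    """Return warnings for tolerance values above the recommended maxima."""
    # Keep only the value between ':=' and an optional trailing '#' comment.
    value = line.split(":=", 1)[1].split("#", 1)[0]
    # Assumed order: 1H(a), 1H(b), 13C/15N
    h_a, h_b, hetero = (float(v) for v in value.split(","))

    warnings = []
    for label, tol in (("1H(a)", h_a), ("1H(b)", h_b)):
        if tol > RECOMMENDED_MAX["1H"]:
            warnings.append(f"{label} tolerance {tol} ppm exceeds {RECOMMENDED_MAX['1H']} ppm")
    if hetero > RECOMMENDED_MAX["13C/15N"]:
        warnings.append(f"13C/15N tolerance {hetero} ppm exceeds {RECOMMENDED_MAX['13C/15N']} ppm")
    return warnings


tolerance_warnings = check_tolerance("tolerance := 0.05,0.02,0.4   # chemical shift tolerances")
for message in tolerance_warnings:
    print(message)
```

A warning is only a prompt to reconsider the value; wider tolerances may still be a deliberate choice for a particular data set.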
The lower and upper limit cutoffs can be changed by setting upl_values. The default values are 2.4 and 5.5 Å, respectively.
For the calibration parameter you can provide the list of calibration constants that you derived for the "backbone" class with caliba when you calibrated the initial peak lists for use with FOUND/HABAS. Do not comment out or delete this line; leave the value blank if you want automatic calibration. Automatic calibration uses the dref parameter as the presumed average distance for all peaks in a peak list (not just for the backbone, as in caliba). An initial calibration that is too tight is less of an issue with noeassign, because by default it "elastically" relaxes constraints that are consistently violated.
Use the protocol keywords to enable output logging when running CYANA on a workstation. They may not be necessary on a cluster, because the queue system generates its own log.
Use the subroutine KEEP to retain the assignments of peaks that you are confident in, which is helpful if your peak list contains simulated peaks for short-range NOEs.
cyana.pbs - PBS queue submission script

    #!/bin/csh
    #PBS -m abe
    #PBS -M yourname@domain
    #PBS -q short_c
    #PBS -l nodes=5:ppn=2
    #PBS -l walltime=02:00:00
    #PBS -o cyana.out
    #PBS -j oe
    #PBS -N cyana
    #
    cd $PBS_O_WORKDIR
    echo "working directory = "$PBS_O_WORKDIR
    set NN = `cat $PBS_NODEFILE | wc -l`
    echo "NN = "$NN
    module load mpich/intel-9/ch_p4/current
    module load cyana/2.1-p4
    limit stacksize unlimited
    limit coredumpsize 0
    source $MODULESHOME/init/tcsh
    cat $PBS_NODEFILE | awk '{printf "%s.ccr.buffalo.edu\n",$1}' > tmp.$$
    cyana -c '/util/mpich/1.2.7p1/intel-9/ch_p4/bin/mpiexec ' ./CALC
    #
    echo "ALL Done!"

The #PBS lines pass options to the PBS queue system. See this page for details.
The following options are important:
- #PBS -m abe tells the PBS queue system to send e-mail alerts when the calculation starts (b), aborts (a) or terminates successfully (e).
- Enter your e-mail address in #PBS -M myname@mydomain. Without this line e-mail alerts will go into the local mailbox.
- The line #PBS -l nodes=5:ppn=2 means that we are using five dual-processor nodes and get 10-fold parallelization during simulated annealing. It doesn't make much sense to request more than 5 nodes: first, the relative gain in speed drops since NOE assignment step cannot be parallelized; second, the queue wait time may be longer when more nodes are requested.
- #PBS -q short_c submits the job to the short_c queue. This queue is dedicated to short jobs and has higher priority. Members of Szyperski's lab have 10 nodes reserved for this queue every weekday 9 a.m. - 4 p.m.
- #PBS -l walltime=02:00:00 defines the maximum allocated job execution time. The limit for the short_c queue is 2 hours, but even the most demanding CYANA jobs finish in less than one hour.
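
The relation between the nodes/ppn request and the degree of parallelization can be checked with a few lines of code. The sketch below extracts the `nodes` and `ppn` values from the text of a submission script and multiplies them; the function name `requested_processes` is hypothetical and the script fragment is a shortened copy of the example above.

```python
import re


def requested_processes(pbs_script):
    """Return nodes * ppn from the '#PBS -l nodes=N:ppn=M' line of a PBS script."""
    match = re.search(r"^#PBS\s+-l\s+nodes=(\d+):ppn=(\d+)", pbs_script, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no '#PBS -l nodes=N:ppn=M' line found")
    nodes, ppn = int(match.group(1)), int(match.group(2))
    return nodes * ppn


# Shortened copy of the cyana.pbs header
script = """#!/bin/csh
#PBS -q short_c
#PBS -l nodes=5:ppn=2
#PBS -l walltime=02:00:00
"""
n_proc = requested_processes(script)
print(f"requested processes: {n_proc}")
```

For the example script this gives the 10-fold parallelization mentioned above (five nodes with two processors each).
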
- CALC.cya: CYANA 2.1 automated structure calculation script
- cyana.pbs: PBS queue submission script for CYANA 2.1 on U2 cluster
- make_short.cya: CYANA script to run manual calculation with local constraints