Structure Refinement Using CNS Energy Minimization With Explicit Water

From NESG Wiki

The final structure from CYANA should be refined using CNS energy minimization with explicit water before PDB deposition.

The common input files are:

  • PDB coordinates
  • NOE constraints
  • Dihedral angle constraints
  • Hydrogen bond constraints

Converting input files from CYANA to XPLOR/CNS

If the final structure calculation was performed with CYANA 2.1, then the constraint and coordinate files must first be converted to XPLOR/CNS format.

Using cyana2cns.cya script

This procedure can be used to prepare input files to use with the WaterRefCNS script (see below).

  1. Download cyana2cns.cya into the directory containing the final structure calculation.
  2. Modify the read pdb, read upl and read aco lines if necessary.
  3. Start CYANA 2.1 and run cyana2cns.cya.
  • All conformers will be stored in a single PDB file, KKK.pdb. If you plan to run water bath refinement manually, use the p2X program instead.
  • The resulting NOE constraint file KKK_noe.tbl will have all lower limits set to 0. If you want them set according to VdW radii, use the r2X program instead. Experience shows that this choice has no significant effect on geometry or on the quality scores reported by PSVS.

Using programs p2X, r2X and d2X

These programs were written by Alex Lemak at the University of Toronto. They are typically used to prepare input files for manual CNS water bath refinement.

Go to the ~1_projects/targetID/structure/cns/ directory and its conversion_scripts subdirectory, then convert the required input files from CYANA format to CNS format using the macro mkfil:

  p2X cyana2.pdb pref
  r2X found-c.upl noeupl.tbl
  r2X gr-hbonds.upl hbonds.tbl
  cat noeupl.tbl hbonds.tbl > noe.tbl
  d2X found-c.aco aco.tbl

  • p2X splits conformers from a PDB file generated by CYANA 2.1 into individual PDB files (one file per conformer) and converts atom names to CNS format. The second argument, pref, specifies a common prefix for the output PDB files. It should be no more than 4 characters long.
  • r2X converts distance constraints (noe/hbond) from CYANA 2.1 format to CNS format. The lower limits in the resulting constraint file are set according to VdW radii.
    • Distance constraints obtained from CYANA macro caliba or calibration can be used as input files.
    • Distance constraints obtained from the CYANA macro peak calibrate should have the pseudo-atom correction applied before they are used as input files. Run the following commands in CYANA:
      read upl final.upl
      distance correct
      write upl final_corrected.upl
      This, however, will not add corrections for multiplicity!
  • d2X converts angle constraints from CYANA format to CNS format. To run:
      d2X cyana2.aco cns.tbl
    If you don't have any dihedral angle constraints, create an empty file.
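
Two of the practical constraints above (a p2X prefix of at most 4 characters, and an empty table when no dihedral angle constraints exist) can be checked before conversion. The following is only a minimal shell sketch; check_prefix and ensure_table are hypothetical helper names and are not part of the p2X, r2X or d2X programs.

```shell
#!/bin/bash
# Hypothetical helpers; not part of the p2X/r2X/d2X distribution.

# Succeeds only if the p2X output prefix is 1 to 4 characters long.
check_prefix() {
    local pref="$1"
    [ -n "$pref" ] && [ "${#pref}" -le 4 ]
}

# Creates an empty constraint table if the file does not exist yet;
# an existing file is left untouched.
ensure_table() {
    local tbl="$1"
    [ -e "$tbl" ] || : > "$tbl"
}
```

A possible use would be check_prefix ufc1 && p2X cyana2.pdb ufc1, followed by ensure_table aco.tbl when no .aco file was converted.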

Both p2X and r2X require a translation table file atomtransC.tbl.

If you have used simplified pseudoatom names (H* instead of Q*) with option pseudo=2 in CYANA 2.1, you may first want to convert them before running p2X and r2X:

  pseudo=0
  read upl noe.upl
  read pdb cyana2.pdb
  write upl out.upl
  write pdb out.pdb all

Note that pseudo=0 should be set before loading the PDB and constraint files.

Conversion from DYANA

Use the CYANA macro MigrateFromDyanaCyana1.cya to convert data files from standard CYANA 1.x or DYANA nomenclature to the standard IUPAC nomenclature used by CYANA, then proceed as described above.

  translate dyana                        # use Cyana 1.x/Dyana nomenclature
  read seq demo-dyana.seq                # read sequence from Cyana 1.x/Dyana
  read upl demo-dyana.upl unknown=warn   # read upper distance limits from Cyana 1.x/Dyana
  read aco demo-dyana.aco unknown=warn   # read angle restraints from Cyana 1.x/Dyana
  translate off                          # return to standard (IUPAC) nomenclature
  write demo-cyana.seq                   # save sequence in standard Cyana format
  write upl demo-cyana.upl               # save upper distance limits in standard Cyana format
  write aco demo-cyana.aco               # save angle restraints in standard Cyana format

Running CNS using WaterRefCNS script at CABM

After downloading the CABM CNS refinement protocol, follow the instructions in ~/WaterRefinement_cns. All you need are four files in one directory. Assuming the name of your protein is KKK, the required files are:

  KKK.pdb        -- best set from XPLOR
  KKK_noe.tbl    -- noe
  KKK_dihe.tbl   -- dihedral
  KKK_hbond.tbl  -- hbonds

Here KKK.pdb can be taken from the CYANA final.pdb, and KKK_noe.tbl can be converted from the CYANA final.upl using pdbstat with a 10% relaxation of the constraints.

To run it:

  1. Log into hummer.
  2. Go to the directory containing those files.
  3. Type /farm/software/WaterRefinement_cns/WaterRefCNS -na KKK -que PBS

That's it. If you have cis residues, add -ci NUMRES to the options above. If you want to improve VdW violations, add -par PARAM19 to the options above.

For more explanation, type /farm/software/WaterRefinement_cns/WaterRefCNS with no options (or with -help as the only option) and it will print a terse summary.

WaterRefCNS at SUNY Buffalo

The CABM water refinement package was customized to run on computers at UB:

  • Fixed a bug in WaterRefCNS so that the assembled refined PDB structure file has a .pdb extension.
  • PBS queue submission code in WaterRefCNS modified for use on U2 cluster
  • Added -hisd and -hise options to use neutral histidine isoprotomers. This required modification of WaterRefCNS, cns_refine_h2o.inp and generate_h2o.inp.
  • Fixed a bug in the protein topology file topallhdg5.3.pro. The line atom CD1 type=CR1E charge=0.130 end in the HISE patch section was removed.
  • Fixed a bug that caused the order of conformers to be changed in the output PDB file by replacing the line $WATREFLIB/Agrupa *.pdb > All_${Name}_cns.pdb with $WATREFLIB/Agrupa `ls resa_?.pdb ; ls resa_??.pdb` > All_${Name}_cns.pdb in the WaterRefCNS script.
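
The ordering problem in the last item arises because a shell glob sorts names lexically, so resa_10.pdb comes before resa_2.pdb. As an illustration only, the sketch below lists conformer files in numeric order regardless of the number of digits. The function name list_conformers is hypothetical; the resa_N.pdb naming is taken from the fix described above.

```shell
#!/bin/bash
# Hypothetical helper: print resa_N.pdb file names from a directory in
# numeric order. A plain glob (resa_*.pdb) sorts lexically instead:
# resa_1.pdb, resa_10.pdb, resa_2.pdb, ...
list_conformers() {
    local dir="${1:-.}"
    local f
    for f in "$dir"/resa_*.pdb; do
        [ -e "$f" ] || continue      # skip the unexpanded pattern if nothing matches
        basename "$f"
    done | sort -t_ -k2,2n           # numeric sort on the part after "resa_"
}
```

Run inside the directory holding the refined conformers, its output could be passed to Agrupa in place of the two ls calls.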

The modified package is installed in /nsm/chem/cen2/HTP2/3_src/WaterRefinement_cns on the local server and in /san/projects1/szypersk/src/WaterRefinement_cns on the U2 Linux cluster. Make sure that the corresponding directory is included in your path on the spins* workstations and on U2, respectively.

For reference, the download location is here: WaterRefinement_UB.tar.Z

The required files are:

  • PDB coordinates (with atom names in XPLOR/CNS format)
  • NOE constraints (in XPLOR/CNS format)
  • Dihedral angle constraints (in XPLOR/CNS format)
  • Hydrogen bond constraints (in XPLOR/CNS format)

Assuming that your coordinate file is called KKK.pdb, the constraint files should be named KKK_noe.tbl, KKK_dihe.tbl and KKK_hbond.tbl. If you don't have dihedral angle and/or hydrogen bond constraints, then create the corresponding empty files.
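
The naming convention above can be checked with a short pre-flight script before starting a refinement. This is a sketch only: prepare_inputs is a hypothetical helper, and the requirement that the coordinate and NOE files be non-empty is an assumption made here, not something WaterRefCNS is documented to enforce.

```shell
#!/bin/bash
# Hypothetical pre-flight check for the four WaterRefCNS input files.
# Usage: prepare_inputs KKK   (run in the working directory)
prepare_inputs() {
    local name="$1"
    local f
    # Assumption: coordinates and NOE constraints must exist and be non-empty.
    for f in "$name.pdb" "${name}_noe.tbl"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
    done
    # Dihedral and hydrogen bond tables may be empty placeholder files.
    for f in "${name}_dihe.tbl" "${name}_hbond.tbl"; do
        [ -e "$f" ] || : > "$f"
    done
}
```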

You may need to add a line like nassign=1000 at the top of the constraint files. The value should be equal to or greater than the number of constraints in the file.
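
A suitable nassign value can be obtained by counting the constraints in the file. The sketch below assumes that every constraint starts on its own line with the keyword assign (or the abbreviation assi), which is the usual XPLOR/CNS layout but is an assumption about your files. The function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers. Assumption: every constraint starts on its own line
# with "assign" (or the abbreviation "assi"), in any letter case.

# Print the number of constraints in a table file.
count_constraints() {
    grep -ciE '^[[:space:]]*assi' "$1" || true   # grep exits 1 when the count is 0
}

# Print the table to stdout with an nassign line added on top.
with_nassign() {
    local tbl="$1"
    echo "nassign=$(count_constraints "$tbl")"
    cat "$tbl"
}
```

For example, with_nassign KKK_noe.tbl > KKK_noe.tmp && mv KKK_noe.tmp KKK_noe.tbl would rewrite the file in place. A larger round number such as 1000 works just as well, since the value only needs to be at least the constraint count.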

WaterRefCNS assumes "sum of r^-6" averaging by default. When using "center"-averaged UPLs (e.g. from caliba), add -av center when starting WaterRefCNS. For information on averaging conventions for calibration in CYANA, see NOE Calibration in CYANA.

It is recommended to run it on the U2 Linux cluster. Examples:

  1. WaterRefCNS -na KKK -que PBS
  2. WaterRefCNS -na KKK -que PBS -ci 21,49
    residues 21 and 49 are cis-Pro
  3. WaterRefCNS -na KKK -que PBS -hise 40,82 -hisd 32,65
    residues 40 and 82 are ε-protonated neutral His
    residues 32 and 65 are δ-protonated neutral His
  4. WaterRefCNS -na KKK -que PBS -av center
    for use with center-averaged calibration

To run on the spins* Linux workstations without a queue system, use the -que NO option instead of -que PBS. The default is no queue system, so you can omit the -que option altogether.
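
The option combinations shown in the examples can also be assembled by a small wrapper, which is convenient when the same target is refined repeatedly. The sketch below only prints the command line and does not run it. build_waterref is a hypothetical name, and only options mentioned on this page (-na, -que, -ci, -hise, -hisd, -av) are used.

```shell
#!/bin/bash
# Hypothetical wrapper: assemble (but do not run) a WaterRefCNS command line.
# Usage: build_waterref NAME [QUEUE] [CIS_LIST] [HISE_LIST] [HISD_LIST] [AVERAGING]
# Pass "" to skip an optional argument.
build_waterref() {
    local name="$1" que="${2:-}" cis="${3:-}"
    local hise="${4:-}" hisd="${5:-}" av="${6:-}"
    local cmd="WaterRefCNS -na $name"
    if [ -n "$que" ];  then cmd="$cmd -que $que";   fi
    if [ -n "$cis" ];  then cmd="$cmd -ci $cis";    fi
    if [ -n "$hise" ]; then cmd="$cmd -hise $hise"; fi
    if [ -n "$hisd" ]; then cmd="$cmd -hisd $hisd"; fi
    if [ -n "$av" ];   then cmd="$cmd -av $av";     fi
    echo "$cmd"
}
```

For instance, build_waterref KKK PBS "" 40,82 32,65 prints the command of example 3.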

Running WaterRefCNS

  1. Convert the coordinate and constraint files as described above.
  2. Copy the converted files into a CNS working directory on the U2 cluster or a workstation (e.g. structure/cns/calc1).
  3. Run WaterRefCNS with the proper arguments.
  4. If the calculation was successful, the refined coordinate file will be stored in the refinedPDB subdirectory. Check the BeSureToREADME for details.

Running CNS "manually"

Go to the ~1_projects/targetID/structure/cns/ directory and its cns_scripts subdirectory, then follow the procedures described below.

  1. Generating MTF topology file from a PDB file.
    Modify the header of the input file generate_mtf.inp and then run cns < generate_mtf.inp > mtf.log.
    • Modify the input file paths and names for the individual target as follows:
      ...
      {in} pdb_file="/san/user/gliu2/u2/cns/cns/convertion_scripts/ufc1_1.pdb";
      {in} param_file="/san/user/gliu2/u2/cns/cns/cns_scripts/parallhdg5.3C.pro";
      {in} topol_file="/san/user/gliu2/u2/cns/cns/cns_scripts/topallhdg5.3.pro";
      {in} plink_file="/san/user/gliu2/u2/cns/cns/cns_scripts/protein.link";
      {out} struct_file="ufc1_1.mtf";
      ...
    • Requires these files

    • For cis-proline, specify the residue ID of the residue preceding the cis-proline by adding this line to generate_mtf.inp before the WRITE command:
      patch cisp reference=nil=( resid 90 ) end
    • If you get an error message about the atom type, you can also try:
      patch cipp reference=nil=( resid 90 ) end
    • For neutral Nε- and Nδ-protonated histidines, add these lines to generate_mtf.inp before the WRITE command:
      patch hise reference=nil=( resid 40 ) end
      patch hisd reference=nil=( resid 65 ) end
    • For dimer refinement, add a line containing only TER between the coordinates of the two monomer units in the PDB conformer that is defined as the input file in generate_mtf_cis.inp.

  2. Rebuilding hydrogen atom positions for each structure.
    Modify the input file rebuild.inp and then run cns < rebuild.inp > rebuild.log
    • Modify the input file paths and names for the individual target as follows:
      ...
      evaluate ($mtf_file="ufc1_1.mtf")
      evaluate ($pdbname_in="../convertion_scripts/ufc1")
      evaluate ($pdbname_out="ufc1_rb")
      evaluate ($number_of_struct= 20 )
      ...
      evaluate ($topol_p_file="/san/user/gliu2/u2/cns/cns/cns_scripts/topallhdg5.3.pro")
      evaluate ($param_p_file="/san/user/gliu2/u2/cns/cns/cns_scripts/parallhdg5.3C.pro")
      ...
    • Requires these files

  3. Refining structures with explicit water.
    • To run CNS on a workstation with a single processor, modify the input file re_h2oc.inp and then run cns < re_h2oc.inp > h2o.log
      • Modify the input file paths and names for the individual target as follows:
        ...
        evaluate ($mtf_file="ufc1_1.mtf")
        evaluate ($noe_file="../noe.tbl")
        evaluate ($dihe_file="../aco.tbl")
        evaluate ($hb_file="../hbond.tbl")
        evaluate ($pdbname_in="ufc1_rb")
        evaluate ($pdbname_out="ufc1_ref")
        evaluate ($number_of_struct= 20 )
        ...
      • Modify the weights of the energy terms if necessary, e.g. use 2x to 10x the following scales:
        scale ambi 50
        scale dist 50
        scale hbond 50
        end
        restraints dihedral scale=200 end
    • Requires
    • Run CNS on workstation with multiple processors.
      CNS refines each conformer independently, so the refinement is much faster if all 20 conformers are refined in parallel at u2.ccr.buffalo.edu.
      1. Run cns_convertion as described above.
      2. Go to the cns_scripts directory, modify the input file generate_mtf_cis.inp and then run cns < generate_mtf_cis.inp as described above.
      3. Run the macro getfil:
        cp ../convertion_scripts/noe.tbl .
        cp ../convertion_scripts/hbonds.tbl .
        cp ../convertion_scripts/aco.tbl .
        cp ufc1_1.mtf com/.
      4. Modify the input file rebuild.inp and then run cns < rebuild.inp as described above.
      5. Modify the input file re_h2oc.inp as described above; remember to set $number_of_struct = 20.
      6. Run the macro subaba:
        #!/bin/bash
        for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
        do
          mkdir ref$i
          cd ref$i
          cp ../com/* .
          cp ../jan24/ufc1_rb_$i.pdb ufc1_rb_1.pdb
          qsub cns.sc
          cd ..
        done
        where cns.sc is a PBS script used to submit the CNS job:
        #!/bin/csh
        #PBS -m e
        #PBS -q short_c
        #PBS -l nodes=1:GM:ppn=1
        #PBS -l walltime=00:56:00
        #PBS -o out1
        #PBS -j oe
        #PBS -N clean1
        cd $PBS_O_WORKDIR
        echo "working directory = "$PBS_O_WORKDIR
        set NN = `cat $PBS_NODEFILE | wc -l`
        echo "NN = "$NN
        # Run Job
        cns < re_h2oc.inp
        echo "ALL Done!"
      7. Run getpdb to collect the refined structures:
        #!/bin/bash
        for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
        do
          cd ref$i
          cp ufc1_ref_1.pdb ../pdb/ufc1_ref_$i.pdb
          cp ufc1_ref_1.vio ../pdb/ufc1_ref_$i.vio
          cd ..
        done
      8. Run the script Agrupa to put all 20 conformers into a single PDB file.
        To run: Agrupa ufc1_ref*.pdb > ufc1all.pdb
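
Because the 20 jobs run independently, one of them can fail without the others noticing. The following is a minimal sketch of a getpdb variant that copies the refined conformers and reports any that are missing. The function name collect_refined and its arguments are hypothetical; the directory layout (ref1..refN, pdb/) and file names follow the macros in steps 6 and 7.

```shell
#!/bin/bash
# Hypothetical variant of the getpdb macro: copy refined conformers from
# ref1..refN into pdb/ and report any conformer whose output is missing.
# Usage: collect_refined PREFIX N    e.g. collect_refined ufc1 20
collect_refined() {
    local prefix="$1" n="$2" i src missing=0
    mkdir -p pdb
    for ((i = 1; i <= n; i++)); do
        src="ref$i/${prefix}_ref_1.pdb"
        if [ -s "$src" ]; then
            cp "$src" "pdb/${prefix}_ref_$i.pdb"
            # The violation file is copied only when it is present.
            if [ -e "ref$i/${prefix}_ref_1.vio" ]; then
                cp "ref$i/${prefix}_ref_1.vio" "pdb/${prefix}_ref_$i.vio"
            fi
        else
            echo "conformer $i: no refined structure in ref$i" >&2
            missing=$((missing + 1))
        fi
    done
    # Return success only if every conformer was found.
    [ "$missing" -eq 0 ]
}
```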

    Running CNS for Protein with Metal Ions

    CNS refinement of proteins with metal ions can be performed with the new WaterRefCNS script. Until the new WaterRefCNS is available, perform the refinement as follows:
    1. Set up the environment for CNS 1.1 by running source /farm/users/gliu/alias.cns
      alias cns1 /farm/software/cns/cns_solve_1.1/intel-i686-linux_g77/bin/cns
      setenv CNS_TOPPAR /farm/data/gliu/cns1/
      In addition to the topology and parameter files, the metal ion parameter file ion.top is required. An example with all required input files can be found in /farm/users/gliu/projects/cns_cuttha_cis.
    2. Prepare the required files as described above (final.tbl and the final CNS-format PDB files, placed in the xplorPDB directory and named sa_#.pdb), except that the PDB files should include the metal ions, formatted according to the CNS library file ion.top. Copy sa_1.pdb to template.pdb, the input file for creating the mtf file. Note that column alignment is important, e.g.:
      ATOM   1249  OT2 ALA    83      69.296  13.232   5.744  1.00  0.00
      ATOM   1250 ZN+2 ZN2   150      63.086  13.789 -10.407  1.00  0.00      zinc
    3. Run generate_h2o.inp once to create temp_h2o.pdb and temp_h2o.mtf. The extra protons on the ligand residues, e.g. HIS HD1 or CYS HG, are removed by editing generate_h2o.inp; cis-proline is also defined in generate_h2o.inp (resid is the number of the residue preceding the proline).
      {* any special prosthetic group patches can be applied here *}
      {===>}
      delete select (name hg and resname cys and resid 61) end
      delete select (name hg and resname cys and resid 85) end
      delete select (name hd1 and resname his and resid 46) end
      delete select (name he2 and resname his and resid 83) end
      patch cisp reference=1=( resid 13 ) end
    4. Edit generate_1.inp to remove the extra protons in the same way.
    5. Run generate_20.com. This runs generate_#.inp 20 times, updating the PDB number each time, and creates cnsPDB/sa_cns_#.pdb.
    6. Edit and run re_h2o_cu.inp; the refined PDB files are kept in refinedPDB. Alternatively,
    7. Use subcns to submit the CNS refinement through PBS, e.g. type "sh subcns". Before running subcns, make a folder "com" containing the following files. After the jobs have finished, type getpdb to collect the refined PDB files in refinedPDB.
      1. cns.sc: PBS submission
      2. cutc_h2o.mtf: mtf file created as described above
      3. topology and parameter files: parallhdg5.3C.pro, parallhdg5.3.pro, topallhdg5.3.pro
      4. re_h2o_cu.inp: input file for cns refinement
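
Since subcns copies everything from the "com" folder into each job, a missing file there makes every job fail. As a sketch only, the check below verifies that the files listed above are present. check_com_dir is a hypothetical name, and cutc_h2o.mtf is the mtf name of this particular example, so it would need to be changed for another target.

```shell
#!/bin/bash
# Hypothetical check that the "com" folder holds the files listed above
# before running subcns. File names are the ones given in this example.
# Usage: check_com_dir [DIR]   (DIR defaults to "com")
check_com_dir() {
    local dir="${1:-com}" f status=0
    for f in cns.sc cutc_h2o.mtf parallhdg5.3C.pro parallhdg5.3.pro \
             topallhdg5.3.pro re_h2o_cu.inp; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            status=1
        fi
    done
    return "$status"
}
```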

    Note: Example files can be downloaded below:

    -- Main.GaohuaLiu - 29 Jan 2007

    • cyana2cns.cya: CYANA script to convert coordinates and constraints to XPLOR/CNS format
  • zn.tar: Example of CNS refinement for a protein with Zn (from Alex Lemak)