Finding Consensus NOE Assignments
Introduction
Comparing the results from the programs CYANA and AutoStructure identifies the NOE assignments on which both programs agree; these identical assignments are regarded as correct. Manual refinement can thus focus mostly on the remaining peaks. Consensus peaklists can be created with UBNMR, EASYNMR, CARA/LUA or David's perl scripts. Ultimately, the UBNMR and CARA/LUA scripts should contain the accepted protocol, and the other scripts will be abandoned.
If you ran your calculations on a cluster, download the results into a local directory tree.
Using UBNMR
- Download Con_ubnmr.tar.gz: Self-training data for Consensus NOE Assignment
- Go to one of the MERGE calculation directories, e.g. ~/1_projects/targetID/structure/merge/calc1, and run getfil to get the required files including NOE assignments from both CYANA and AutoStructure.
getfil example
cp ../../cyana/calc1/cnoe.peaks .
cp ../../cyana/calc1/nnoe.peaks .
cp ../../cyana/calc1/noe_sw.prot .
cp ../../cyana/calc1/noe.seq .
cp ../../cyana/calc1/cnoe4abs-cycle7.peaks .
cp ../../cyana/calc1/nnoe4abs-cycle7.peaks .
cp ../../autos/calc1/CYCLE8-1_final/cnoe4abs.peaks_assSparky .
cp ../../autos/calc1/CYCLE8-1_final/nnoe4abs.peaks_assSparky .
- Run UBNMR with the macro consrun.mac, or step by step as described below:
- Start UBNMR and load your XEASY/CYANA format sequence file SequenceList and chemical shift file AtomList.
- Run the UBNMR command: consRun [originalPeaklist] [cyanaPeaklist] [sparkyAssList] [-R ####], where #### is an XEASY peak number. All peaks with peak numbers less than #### will be "reserved"; that is, the assignment from the original peaklist will be used independent of the new CYANA and AutoStructure assignments.
- NOTE: consRun assumes that the same original peaklist was used for both the CYANA and AutoStructure runs. It also assumes that the peaks in your peaklist are in ascending order. If this is not the case, a FATAL ERROR will result.
- The UBNMR consRun command will produce four files:
- An xeasy peaklist starting with "consRun" that contains the assignments for matched (and reserved) peaks.
- An xeasy ".assign" file which contains the assignments for unmatched peaks plus any multiple assignments.
- A log file showing the actual input for each matched peak.
- A log file showing the actual input for each unmatched peak.
- See UbnmrConsensusDetails for details on the consensus run matching algorithm.
- If needed, convert the XEASY/CYANA format PeakList to Sparky format with XeasytoSparky, as follows:
awk -f peaks2sparky.awk noe.seq noe_sw.prot noe_consensus.peaks > noe_sparky.list
To load the consensus peak list onto the simNoesy spectrum in XEASY, first convert the N shifts back to carbon frequencies and then join the peak lists. A macro that does this, macrojoin, is attached below.
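The reserved-peak rule and the ordering assumption of consRun described above can be sketched in a few lines of Python. This is only an illustration, not UBNMR code: the function name split_reserved and its arguments are hypothetical, the peak numbers are taken as an already-parsed list, and "ascending" is assumed to mean strictly ascending.

```python
def split_reserved(peak_numbers, reserve_below):
    """Check peak ordering, then split peaks into reserved and open ones.

    peak_numbers:  peak numbers in the order they appear in the peaklist
    reserve_below: the value given to -R (hypothetical argument name)
    """
    # Mimic the documented requirement that peaks are in ascending order.
    # Assumption: duplicates also count as a violation.
    for prev, cur in zip(peak_numbers, peak_numbers[1:]):
        if cur <= prev:
            raise ValueError(
                f"peak {cur} follows peak {prev}: peaklist is not ascending"
            )
    # Peaks numbered below the threshold keep their original assignment.
    reserved = [n for n in peak_numbers if n < reserve_below]
    # All other peaks are open to the consensus comparison.
    open_peaks = [n for n in peak_numbers if n >= reserve_below]
    return reserved, open_peaks
```

Running such a check on the original peaklist before starting consRun may help catch an unsorted list early, rather than at the point where the run aborts.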
Using CARA
Protocol
Assuming that you have 15N, 13C aliphatic and 13C aromatic peaklists n.peaks, ali.peaks and aro.peaks, do the following:
- Run the SparkyPeakListToXeasy LUA script to convert the corresponding n.peaks_assSparky peaklist in the final-cycle subdirectory (for example, autostructure/calc1/CYCLE10-0_final) to an XEASY-style peaklist. The resulting file will contain multiple assignments in a format compatible with CYANA 2.1. Save it as n_auto.peaks.
- Run the ImportCyana2xPeakList LUA script to load the corresponding n-cycle7.peaks peaklist created by CYANA 2.1 and the converted n_auto.peaks peaklist from AutoStructure. Both peaklists will be stored in the CARA repository. Peaks with multiple assignments will be loaded as 'assignment guesses'.
- Run CreateConsensusPeakList LUA script. You will be asked to select two input peaklists from the repository. You will be prompted to save the consensus statistics in a log file (use n_cons.txt). Save the resulting peak list in the repository as n_cons.
- Open the corresponding NOESY spectrum with Open MonoScope (rotated)... so that the X-Y-Z dimension order matches the peaklists. Load the peaklist from the repository by clicking Peaks -> Open PeakList.... Click Peaks -> Set Peaklist Owner... to link the peaklist to the spectrum. This is needed for proper subsequent peak display irrespective of the spectrum orientation. Later you can open MonoScope in a normal orientation.
- Run the Label_3D_NOESY_peaks LUA script to set peak labels; these are helpful for peak sorting in MonoScope.
- Repeat steps 1-5 for the other peaklists.
How a consensus peaklist is created with CreateConsensusPeakList (compare to other methods):
- Peaks with matching unique assignments are retained.
- Peaks with mismatching unique assignments are unassigned, and the original assignments are saved as 'assignment guesses'. The same procedure is followed when a peak is uniquely assigned in one peaklist and unassigned in the other.
- Peaks with multiple assignments in at least one peaklist are unassigned. All assignments are combined and saved as 'assignment guesses'. This also applies to peaks with matching multiple assignments.
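The three rules above can be condensed into one small Python function. This is a minimal sketch of the rules as stated here, not the CreateConsensusPeakList LUA script itself; the function name consensus and the representation of an assignment as a tuple of atom labels are assumptions made for illustration.

```python
def consensus(a, b):
    """Combine the assignments of one peak taken from two peaklists.

    a, b: sets of assignment tuples; an empty set means the peak is
          unassigned in that peaklist.
    Returns (assignment, guesses): the retained unique assignment (or None
    if the peak ends up unassigned) and the set of assignments kept as
    'assignment guesses'.
    """
    a, b = frozenset(a), frozenset(b)
    # Rule 1: identical unique assignments are retained.
    if len(a) == 1 and a == b:
        return next(iter(a)), frozenset()
    # Rules 2 and 3: in every other case the peak is unassigned and all
    # assignments seen in either peaklist are kept as guesses.
    return None, a | b
```

For example, a peak assigned to A35HB-A35CB-L53HD1 in one peaklist and to A35HB-A35CB-L53HD2 in the other comes back unassigned, with both assignments as guesses.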
Loading Structure information into CARA
You can load structure information into CARA in the form of spin links by using the UplsToSpinLinks2 LUA script. It is possible to load the resulting UPL files, either combined or consensus, or to load the distance statistics for the structures prepared in MolMol.
Using EASYNMR
1. Go to one of the MERGE calculation directories, e.g. ~/1_projects/targetID/structure/merge/calc1, and run getfil to get the required files, including the NOE assignments from both CYANA and AutoStructure.
2. Run macro run, which will run the following EASYNMR macros (any individual macro below can also be run on its own, depending on the need):
- prot2auto: Modifies the XEASY format atom list by replacing the atom names with BMRB atom names.
- auto2easy: Reformats the NOE assignment file from AutoStructure.
- runc13: Converts the NOE assignment file of the 13C NOESY peak list from AutoStructure format to an XEASY format peak list.
- runn15: Converts the NOE assignment file of the 15N NOESY peak list from AutoStructure format to an XEASY format peak list.
- runautonew: Selects the newly assigned NOE peaks by excluding the simulated peaks that were already assigned.
- cyana2easy: Converts the CYANA assigned peak list to XEASY format and selects the newly assigned NOE peaks by excluding the simulated peaks.
- runc13B: Compares the NOE assignments of the 13C peak list, adopts the consensus NOE assignments, merges them into the original XEASY format peak list, and writes out:
- a complete merged XEASY PeakList, autocyanaxeasycnoe.peaks, that is used for further consensus or manual analysis;
- a partial PeakList, autocyanacnoenew.peaks, that contains only the consensus peaks;
- two XEASY format assignment files that contain the non-consensus NOE assignments from both programs and can be loaded back into XEASY for analysis:
- diffc.assign (all peaks that do not match);
- diffc1.assign (peaks that are assigned by both programs but do not match).
- runn15B: Compares the NOE assignments of the 15N peak list, adopts the consensus NOE assignments, merges them into the original XEASY format peak list, and writes out:
- a complete merged XEASY PeakList, autocyanaxeasynnoe.peaks, that is used for further consensus or manual analysis;
- a partial PeakList, autocyanannoenew.peaks, that contains only the consensus peaks;
- two XEASY format assignment files that contain the non-consensus NOE assignments from both programs and can be loaded back into XEASY for analysis:
- diffn.assign (all peaks that do not match);
- diffn1.assign (peaks that are assigned by both programs but do not match).
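The way the comparison macros sort peaks into consensus peaks and the two kinds of non-matching peaks can be sketched as follows. This is a hedged Python illustration, not the EASYNMR macros themselves: the function name classify and the dictionary inputs are hypothetical, and it assumes that a peak left unassigned by both programs belongs to neither output.

```python
def classify(cyana, autos):
    """Sort peaks by how their CYANA and AutoStructure assignments compare.

    cyana, autos: dicts mapping peak number -> assignment tuple,
                  or None when the program left the peak unassigned.
    Returns (consensus, diff_all, diff_both):
      consensus - peaks with identical assignments from both programs
      diff_all  - peak numbers that do not match (cf. diffc.assign)
      diff_both - peak numbers assigned by both programs but differently
                  (cf. diffc1.assign)
    """
    consensus, diff_all, diff_both = {}, [], []
    for peak in sorted(set(cyana) | set(autos)):
        a, b = cyana.get(peak), autos.get(peak)
        if a is None and b is None:
            continue  # assumption: nothing to compare, so skip the peak
        if a == b:
            consensus[peak] = a
            continue
        diff_all.append(peak)
        if a is not None and b is not None:
            diff_both.append(peak)
    return consensus, diff_all, diff_both
```

In this sketch every peak in diff_both also appears in diff_all, which mirrors the description of the second file as the subset where both programs made an assignment.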
Key Issues
AutoStructure significantly modifies the atom list by replacing atoms that have near-degenerate chemical shifts with pseudoatoms. This makes distinguishing between unique and multiple assignments, and establishing matches, non-trivial, especially for diastereotopic methyl groups. Here are some examples:
L53 in BoR54
Input BMRB file
1323 53 LEU HD1 H 0.729 0.000 1
1322 53 LEU HD2 H 0.721 0.000 1
1321 53 LEU CD1 C 25.438 0.000 1
1320 53 LEU CD2 C 22.674 0.000 1
AutoStructure replaces HD1 and HD2 with QD, even though CD1 and CD2 are distinct. Assignments in the resulting peaklists will have the following meanings:
- A35HB-A35CB-L53QD - multiple assignment, A35HB-A35CB-L53HD1 and A35HB-A35CB-L53HD2
- L53QD-L53CD1-A35HB - unique assignment, L53HD1-L53CD1-A35HB
- L53QD-L53CD2-A35HB - unique assignment, L53HD2-L53CD2-A35HB
V13 in BoR54
Input BMRB file
1227 13 VAL HG1 H 0.841 0.000 1
1225 13 VAL HG2 H 0.897 0.000 1
1228 13 VAL CG1 C 21.353 0.000 1
1226 13 VAL CG2 C 21.294 0.000 1
AutoStructure replaces CG1 and CG2 with QCG, even though HG1 and HG2 are distinct. Assignments in the resulting peaklists will have the following meanings:
- V13HG2-V13QCG-C14H - unique assignment, V13HG2-V13CG2-C14H
- V13HG1-V13QCG-V13H - unique assignment, V13HG1-V13CG1-V13H
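The interpretation of the pseudoatom assignments in the L53 and V13 examples can be reproduced with a short Python sketch. It is an illustration only, not part of any of the scripts above: the lookup tables PSEUDO and BONDED and the function expand are hypothetical, they cover just the atoms in these two examples, and the sketch assumes that the first atom of each assignment is the proton attached to the carbon in the second position.

```python
from itertools import product

# Hypothetical table: pseudoatom -> the atoms it stands for.
PSEUDO = {
    "L53QD": ["L53HD1", "L53HD2"],
    "V13QCG": ["V13CG1", "V13CG2"],
}

# Hypothetical table: proton -> directly attached carbon.
BONDED = {
    "A35HB": "A35CB",
    "L53HD1": "L53CD1",
    "L53HD2": "L53CD2",
    "V13HG1": "V13CG1",
    "V13HG2": "V13CG2",
}


def expand(assignment):
    """Expand pseudoatoms in an 'H-C-H' assignment into explicit atoms.

    Only combinations in which the first proton is bonded to the carbon
    are kept, so a pseudoatom in the first or second position is resolved
    by its bonded partner. One result means a unique assignment, several
    mean a multiple assignment.
    """
    atoms = assignment.split("-")
    options = [PSEUDO.get(atom, [atom]) for atom in atoms]
    return [
        "-".join(combo)
        for combo in product(*options)
        if BONDED.get(combo[0]) == combo[1]
    ]
```

With these tables, expand("A35HB-A35CB-L53QD") yields two assignments, while expand("L53QD-L53CD1-A35HB") and expand("V13HG2-V13QCG-C14H") each yield one, in line with the meanings listed above. Expanding both peaklists in this way before comparison is one possible approach to making matches well defined.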
peaks2sparky.awk: format conversion from XEASY to Sparky
Con_ubnmr.tar.gz: Self-training data for Consensus NOE Assignment
macrojoin: UBNMR macro to update and join consensus run peak lists for XEASY simNoesy
easynmr2008a: easynmr