Resonance Assignment/Abacus: Difference between revisions

Revision as of 22:46, 24 November 2009


INTRODUCTION TO ABACUS

ABACUS (Applied BACUS) is a novel approach for protein structure determination that has been applied successfully to more than 20 NESG targets. ABACUS is characterized by the use of BACUS, a procedure for automated probabilistic interpretation of NOESY spectra in terms of unassigned proton chemical shifts, based on known information about "connectivity" between proton resonances. BACUS is used in both the resonance assignment and structure calculation steps. ABACUS is distinguished from conventional approaches to NMR structure determination mostly by its resonance assignment strategy (see Fig. 1.1A).

Figure 1.1. A. (on the left) Flowchart of resonance assignment by ABACUS.

B. (on the top) Schematic description of two types of molecular fragments: the traditional spin-system (AA-fragment) includes all the atoms belonging to the same residue; the PB-fragment includes all the atoms from one residue except the backbone amide group, plus the amide group from the next residue in the protein.

¹⁾ Lemak, A., Steren, C., Arrowsmith, C.H. and Llinás, M. (2008) J. Biomol. NMR, 41, 29-41.

²⁾ Grishaev, A., Steren, C.A., Wu, B., Pineda-Lucena, A., Arrowsmith, C. and Llinás, M. (2005) Proteins, 61, 36-43.

³⁾ Grishaev, A. and Llinás, M. (2004) J. Biomol. NMR, 28, 1-10.

Some features/advantages of the ABACUS protocol:

- It does not rely on sequential connectivities from less sensitive experiments, such as HNCACB, that are indispensable for most traditional sequential assignment procedures;

- Inter-residue sequential connectivities are established mainly from NOE data, which saves time at a later stage in “troubleshooting” NOE and resonance assignments;

- The probabilistic nature of the ABACUS procedure provides a measure of the reliability of assignments, so one can obtain a partial, yet highly reliable, assignment (even when the NMR data are sub-optimal) together with the knowledge of where to focus manual intervention;

- It can make use of partial spin-systems;

- It can efficiently identify manual errors in the input peak lists.

NMR spectra required for ABACUS

The spectra typically needed for the ABACUS approach are most conveniently separated into 3 groups: NH-rooted, CH-rooted, and aromatic (also CH-rooted). Table 1 shows the optimal set of NMR spectra. This, of course, is neither an exclusive nor an exhaustive list. For example, a simultaneous CN-NOESY could be recorded instead of the three different NOESY spectra listed in the table. If there are very few aromatic residues in the protein, a single aromatic spectrum, namely the aromatic NOESY, could be enough for assignment of the aromatic resonances.

Table 1. ABACUS optimal set of experiments

NH-rooted	CH-rooted	Aromatic
¹⁵N-HSQC	¹³C-CT-HSQC	¹³C-HSQC-aro
HNCO	¹³C-HSQC	H(C)CH-TOCSY-aro
HNCA	H(C)CH-TOCSY	(H)CCH-TOCSY-aro
CBCA(CO)NH	(H)CCH-TOCSY	¹³C-NOESY-HSQC-aro
HBHA(CO)NH	¹³C-NOESY-HSQC
¹⁵N-NOESY-HSQC
CCCONH-TOCSY (optional)
H(CCCO)NH-TOCSY (optional)

Spin-system identification strategy

The resonance assignment procedure starts by grouping resonances into spin systems (PB-, or peptide bond, fragments) comprising correlated resonances from the side chain of residue i and the NH resonances of residue i+1 (see Figure 1.1B). The uncompleted HN-rooted PB spin-systems, which include resonances of atoms only, are called bPB-fragments in this manual.

Spin-system identification in the ABACUS approach consists of 3 main steps.

1. In the first step, bPB-fragments are collected from high-sensitivity NMR correlation experiments (such as HNCO, CBCA(CO)NH, and HBHA(CO)NH) that transfer magnetization via the intervening peptide bond (see Figure 1.2A).

2. In the second step, completion of bPB-fragments with side-chain aliphatic resonances, as well as identification of additional spin-systems (lacking HN resonances), is performed using HCCH-TOCSY and 13C-NOESY spectra (see Figure 1.2B).

3. Finally, spin-system validation and correction is performed. This step allows one to find mistakes made during peak-picking of the spectra and to correct them by going back to the spectra.

For each spin-system, 20 scores S(T) are calculated during the validation (see Figure 1.3). Here T corresponds to the amino acid type, T = A, R, D, …, V. The scores evaluate the goodness-of-fit of the spin-system resonances to those observed in the BMRB database. If the best score, that is the highest S(T) over all 20 types T, is too low, it means that either the spin-system has very unusual chemical shifts or the spin-system does not make sense and needs to be corrected.
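The validation scoring can be illustrated with a minimal sketch. It assumes that each score S(T) is a product of Gaussian likelihoods of the observed shifts given a per-type mean and standard deviation; the functional form actually used by ABACUS is not stated here. The statistics in `TOY_STATS` are illustrative placeholders (not BMRB values), the cutoff is a placeholder, and names such as `score_spin_system` and `validate` are hypothetical.

```python
import math

# Illustrative placeholder statistics (mean, std in ppm) per residue type and atom.
# These are NOT BMRB values; a real run would cover all 20 residue types.
TOY_STATS = {
    "A": {"CA": (53.0, 2.0), "CB": (19.0, 2.0)},
    "S": {"CA": (58.5, 2.0), "CB": (64.0, 1.5)},
    "G": {"CA": (45.0, 1.5)},
}

def gaussian(x, mean, std):
    """Normal probability density of x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def score_spin_system(shifts, stats=TOY_STATS):
    """Return {type: S(T)} for a spin system given as {atom: shift in ppm}.

    A type that lacks one of the observed atoms gets a score of 0.
    """
    scores = {}
    for aa, atom_stats in stats.items():
        s = 1.0
        for atom, shift in shifts.items():
            if atom not in atom_stats:
                s = 0.0
                break
            mean, std = atom_stats[atom]
            s *= gaussian(shift, mean, std)
        scores[aa] = s
    return scores

def validate(shifts, threshold=1e-4):
    """Return (best_type, best_score, needs_review); threshold is a placeholder."""
    scores = score_spin_system(shifts)
    best = max(scores, key=scores.get)
    return best, scores[best], scores[best] < threshold

good = validate({"CA": 52.5, "CB": 18.7})   # alanine-like shifts
bad = validate({"CA": 70.0, "CB": 5.0})     # shifts that fit no type well
```

A spin-system flagged with `needs_review` would be the kind of case where one goes back to the spectra and checks the peak picking.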

Fragment assignment by FMC

Sequence-specific assignment of PB-fragments is achieved using a Fragment Monte Carlo (FMC) stochastic search procedure. The scoring function used in the FMC procedure is based on both fragment amino acid typing (matching the spin system to amino acid types) and fragment contact map (reflecting which residue is next to which) derived from HNCA data and the analysis of NOEs interpreted by BACUS (see Figure 1.4).

The FMC procedure performs probabilistic assignment of PB-fragments. The assignment probabilities are calculated by Simulated Annealing (SA) or Replica Exchange Method (REM) Monte Carlo (MC) simulations. Each assignment probability is the probability that fragment k occupies position s, where k = 1, …, N_f and s = 1, …, N_s+1. Sequence-specific assignment of PB-fragments is achieved by analyzing these probabilities (see Figure 1.5) as well as the sub-optimal fragment mappings provided by the MC simulations.
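The idea of estimating assignment probabilities from several annealing trajectories can be sketched as follows. This is a toy illustration, not the FMC implementation: the energy (negative log of typing and contact scores), the swap move, the geometric cooling schedule, and all function names are assumptions, and the pool for unassigned fragments is omitted for brevity.

```python
import math
import random

def energy(slots, type_score, contact):
    """Negative log-score of a mapping; slots[s] is a fragment index or None."""
    e = 0.0
    for s, k in enumerate(slots):
        if k is None:
            continue
        e -= math.log(type_score[k][s])
        nxt = slots[s + 1] if s + 1 < len(slots) else None
        if nxt is not None:
            e -= math.log(contact[k][nxt])
    return e

def anneal(n_frag, n_pos, type_score, contact, rng,
           steps=2000, t_start=2.0, t_final=0.05):
    """One simulated annealing trajectory; returns the final slot occupancy."""
    slots = list(range(n_frag)) + [None] * (n_pos - n_frag)
    rng.shuffle(slots)
    e = energy(slots, type_score, contact)
    for step in range(steps):
        # Geometric cooling from t_start down to t_final.
        t = t_start * (t_final / t_start) ** (step / (steps - 1))
        i, j = rng.sample(range(n_pos), 2)
        slots[i], slots[j] = slots[j], slots[i]
        e_new = energy(slots, type_score, contact)
        if e_new <= e or rng.random() < math.exp((e - e_new) / t):
            e = e_new  # accept the move (Metropolis criterion)
        else:
            slots[i], slots[j] = slots[j], slots[i]  # reject: undo the swap
    return slots

def assignment_probabilities(n_frag, n_pos, type_score, contact,
                             n_traj=10, seed=0):
    """Fraction of trajectories in which fragment k ends at position s."""
    rng = random.Random(seed)
    counts = [[0] * n_pos for _ in range(n_frag)]
    for _ in range(n_traj):
        for s, k in enumerate(anneal(n_frag, n_pos, type_score, contact, rng)):
            if k is not None:
                counts[k][s] += 1
    return [[c / n_traj for c in row] for row in counts]

# Toy input: fragment k fits position k best and is followed by fragment k+1.
n = 4
type_score = [[0.9 if s == k else 0.05 for s in range(n)] for k in range(n)]
contact = [[0.9 if f == k + 1 else 0.05 for f in range(n)] for k in range(n)]
probs = assignment_probabilities(n, n, type_score, contact)
```

Here `probs[k][s]` plays the role of the assignment probability of fragment k at position s; with well-separated scores most trajectories are expected to agree, while poor data would spread the probability over several positions.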

FMCGUI

FMCGUI is a graphical interface that assists the user in carrying out resonance assignment and structure calculation using the ABACUS approach.

FMCGUI_2.2 COMMANDS

0. FMCGUI objects.

Most FMCGUI commands operate mainly on the following three objects held in computer memory: protein sequence, peak lists, and PB-fragments.

Protein sequence. This object can be created in memory using [Data>Protein sequence>load] or [Project>load] commands.

The position ID of the first residue in the sequence should be specified by the user when loading the sequence file (if it is not specified in the input file). Some commands in FMCGUI assume that the first residue of the protein sequence has a position ID of 1. Therefore, if there is a His-tag in the loaded sequence, it should be numbered accordingly, starting with a negative position ID for the first residue.
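The numbering rule can be made concrete with a small sketch. It assumes that position IDs are consecutive integers that pass through 0 (the text does not say whether 0 is skipped), and the sequence and helper names are hypothetical.

```python
def first_residue_id(tag_length):
    """ID to give the first loaded residue so the first native residue gets ID 1."""
    if tag_length < 0:
        raise ValueError("tag_length must be non-negative")
    return 1 - tag_length

def number_sequence(sequence, tag_length):
    """Pair every residue of the loaded sequence with its position ID."""
    start = first_residue_id(tag_length)
    return [(start + i, res) for i, res in enumerate(sequence)]

# Hypothetical construct: an 8-residue tag (MGHHHHHH) followed by the protein SAKT.
numbered = number_sequence("MGHHHHHHSAKT", tag_length=8)
```

With these assumptions the first loaded residue gets ID -7 and the first native residue (S) gets ID 1.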

Peak lists. Different peak list objects can be created in memory using the [Data>”Peak list name”>load] or [Project>load] commands. For some peak lists, peaks in the list can be referenced by spin-system (fragment) user ID.

The following table shows which peak lists require referencing (+), which peak lists are optionally referenced (+/-), and the peak lists for which referencing is not used even if present in the input file (-):

N15 NOESY	-
C13 NOESY H2O	-
Arom NOESY	-
N15 HSQC	+
C13 HSQC	+
HNCA	+/-
HNCO	-
CBCACONH	+/-
HBHACONH	+

List of Fragments. This object can be created in memory using [Fragment>Load], [Fragment>Create], or [Project>load] commands. Each fragment in the list has the following main properties:

- Fragment ID assigned by the user, U_id;

U_id cannot be changed within FMCGUI.

- Assignment ID, A_id, that indicates the sequence position ID to which the fragment is assigned (A_id = -99 if the fragment is not assigned to any position in the sequence);

A_id can be set or modified by the commands [Assignment>Fix Assignment>Manually], [Assignment>Fix Assignment>Using probability Map], and [Assignment>Fix Assignment>Reset all].

- Typing probabilities for each type t, where t corresponds to one of the 20 AA residue types.

The typing probabilities can be calculated or modified by the commands [Fragment>Type>Calculate] and [Fragment>Type>Fix].

- Three fragment contact maps: one derived from HNCA data and two derived from NOESY data (the abacus and fawn maps). Each contact map scores the possibility for any fragment f to be next to fragment U_id in the protein sequence, where f and U_id stand for fragment user IDs.

The HNCA contact map is calculated from the HNCA spectrum by the command [Assignment>Contact>HNCA].

The abacus and fawn contact maps are calculated from NOESY spectra with and without using the BACUS procedure, respectively;

The fawn contact map can be calculated by the commands [Assignment>Contact>NOE>fawn] and [Assignment>Contact>NOE>abacus], while the abacus contact map is calculated by [Assignment>Contact>NOE>abacus].

- Two sets of fragment assignment probabilities for each sequence position s, calculated using SA and REM Monte Carlo simulations, respectively. Here s stands for the protein sequence position ID.

The SA probabilities are calculated by the command [Assignment>Calculate probabilities>SA], or they can be loaded into memory using the command [Assignment>Load probabilities];

The REM probabilities are calculated by the command [Assignment>Calculate probabilities>REM], or they can be loaded into memory using the command [Assignment>Load probabilities];

The current values of all these properties for a particular fragment can be viewed in the “Fragment Graph” window, which is opened by the command [View>Fragment].

1. Main window

The main frame of the FMC Graphical Interface consists of 4 sections (see Figure )

- the title bar displays the name of the current project and the directory where the project is located;

- the bar with six menus: Project, Data, Fragment, Assignment, Structure, and View, respectively;

- the main message window, where the message from the last executed command is displayed;

- the log window, where the history of executed commands is shown.

2. PROJECT menu

[Project>New] : To start a new project.

The user has to provide a name for the project, PROJECTNAME, and to select a directory that will host the project. The project root directory with the same name PROJECTNAME is created.

[Project>Load] : To continue to work on a previously saved project.

The user has to select the file PROJECTNAME.prj in the directory PROJECTNAME, where PROJECTNAME is the name of the root directory of the project.

[Project>Save] : To save the current state of the project.

What is currently in the computer memory is saved in the file PROJECTNAME.prj located in the root directory of the project.

[Project>Quit] : To save the current state of the project and to quit.

3. DATA menu

The DATA section serves to load and save data (such as the protein sequence and peak lists). Since there are different formats of data files that can be loaded into memory or saved on disk, one can use this section as a format converter as well.

[Data>Protein Sequence>Load] : To load a protein sequence into memory.

The input formats:

- 1-letter code (fasta format);

- 3-letter code (standard format).

The user has to select the file with the sequence and to specify the first residue ID if the ID is not specified in the input file. It is recommended that, if there is a His-tag in the sequence file, the first residue ID be set to a negative number so that the first residue of the protein has an ID of 1.

[Data>Protein Sequence>Save as] : To save protein sequence in the file on disk.

The output formats:

- 1-letter code (fasta format);

- 3-letter code ("standard" format, for cyana)

- 3-letter code (for AutoStructure);

- 3-letter code (for RCI);

There are separate buttons for different peak lists.

[Data>N15 NOESY>load/Save as]

[Data>C13 NOESY>load/Save as]

[Data>Arom NOESY>load/Save as]

[Data>N15 HSQC>load/Save as]

[Data>C13 HSQC>load/Save as]

[Data>HNCA>load/Save as]

[Data>HNCO>load/Save as]

[Data>CBCACONH>load/Save as]

[Data>HBHACONH>load/Save as] : To load or save a peak list.

Input and output formats:

- Sparky;

- Xeasy;

- Standard;

[Data>Tolerances] : To set tolerances for chemical shift matching in different spectral dimensions.
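How tolerances enter chemical shift matching can be sketched as follows. This is a minimal illustration, not FMCGUI code: the tolerance values, the dimension labels, the peak dictionaries, and the function names are all assumptions.

```python
# Illustrative tolerances in ppm per spectral dimension; not FMCGUI defaults.
TOLERANCES = {"H": 0.03, "N": 0.3, "C": 0.4}

def within_tolerance(peak, root, dims, tol=TOLERANCES):
    """True if the peak agrees with the root resonance in every listed dimension."""
    return all(abs(peak[d] - root[d]) <= tol[d] for d in dims)

def peaks_for_root(peaks, root, dims=("H", "N")):
    """Select the peaks whose H and N shifts match an HSQC root within tolerance."""
    return [p for p in peaks if within_tolerance(p, root, dims)]

# Hypothetical N15 HSQC root and HNCA peaks (shifts in ppm).
hsqc_root = {"H": 8.25, "N": 120.1}
hnca_peaks = [
    {"H": 8.26, "N": 120.0, "C": 56.2},
    {"H": 8.24, "N": 120.3, "C": 61.0},
    {"H": 8.40, "N": 120.1, "C": 54.8},  # H shift is off by 0.15 ppm
]
matched = peaks_for_root(hnca_peaks, hsqc_root)
```

Tighter tolerances reduce accidental matches in crowded regions, while looser ones tolerate small referencing differences between spectra.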

4. FRAGMENT menu

[Fragment>Load>assigned] : To load assigned chemical shifts (spin-systems) in the memory.

Prerequisites:

- Loaded sequence

Input formats:

- assigned AA-fragments in standard format;

- CYANA chemical shift file (prot-file);

[Fragment>Load>PB fragments] : To load unassigned spin-systems in the memory.

Input format :

PB-fragments in standard format.

[Fragment>Save>PB fragments] : To save PB-fragments in a file on disk.

Output format:

PB-fragments in standard format.

The name of the saved file and its location are specified by the user.

There are 3 options to save PB-fragments in the file:

- in order of fragments index, that is in the order by which fragments are stored in memory;

- in order of fragments user ID, U_id;

- in order of fragments assignment ID, A_id. In this case 2 files are saved. One file, with user specified name 'user_name', contains only fragments assigned to protein sequence positions, that is to positions with residue ID of >= 1. The second file, with the name 'user_name_na', contains all not assigned fragments (that is fragments with A_id = -99).

[Fragment>Save>cyana] : To save assigned chemical shifts (that is fragments with A_id >0 ) in CYANA format.

[Fragment>Save>bmrb] : To save assigned chemical shifts in the format suitable for BMRB deposition (star2.1)

[Fragment>Save>talos] : To save assigned chemical shifts in the format suitable for TALOS/CS-Rosetta;

[Fragment>Save>abacus] :To save unassigned PB-fragments in the format suitable for BACUS;

[Fragment>Create>fawn] : To create/evaluate bPB-fragments.

Prerequisites:

- loaded in memory referenced peak lists of CBCA(CO)HN, HBHA(CO)HN, N15HSQC, and HNCA spectra;

- Specified tolerances.

There are two steps in executing this command.

In the first step, a fake C13HSQC peak list is created and shown in the pop-up window “fake C13HSQC”.

The user can use the information shown in the main FMCGUI window to check/edit the list in the entry section of the “fake C13HSQC” window. Pressing OK will load the peak list from the entry window into memory as the C13HSQC peak list.

In the second step, a number of bPB-fragments corresponding to 20 different AA types are generated from user-identified spin-systems. Each generated bPB-fragment is evaluated by a score SpS that measures how well the spin-system chemical shifts match the corresponding statistical chemical shifts derived from the BMRB database. The bPB-fragment with the highest score is selected to form a list of bPB-fragments.

As a result, a new window ‘Create Fragment’ pops up and warning messages of the ‘sps_create’ script are shown in the main FMCGUI window.

The window consists of three sections. The left section contains the suggested bPB-fragments, while the other sections contain two reports of fragment scoring, with both C and H resonances and with only C resonances, respectively. Following the warning messages shown in the main FMCGUI window, the user can accept/modify the generated bPB-fragments. Alternatively, when ‘poor’ bPB-fragments are present, the user can go back to the spectra, fix the peak lists accordingly, and repeat the fragment generation.

User-approved bPB-fragments will be loaded into memory by pressing the OK button.

Results:

- C13HSQC peak list loaded in memory

- bPB-fragments are loaded in memory

[Fragment>Create>abacus] : To create/evaluate PB-fragments.

Prerequisites:

- referenced C13HSQC, N15HSQC, and HNCA peak lists and a non-referenced CBCA(CO)HN peak list loaded in memory (as an option, the HNCA peak list may be non-referenced as well);

- Specified tolerances.

A number of PB-fragments corresponding to 20 different AA types are generated from user-identified spin-systems. Each generated PB-fragment is evaluated by a score SpS that measures how well the spin-system chemical shifts match the corresponding statistical chemical shifts derived from the BMRB database. The PB-fragment with the highest score is selected to form a list of PB-fragments.

Spin-systems for which all SpS scores are less than 10⁻⁴ are reported in the main FMCGUI window.

Following these warnings, the user can accept or modify the generated PB-fragments in the left section of the “Create Fragment” window. Alternatively, the user can go back to the spectra, fix the peak lists accordingly, and repeat the fragment generation.

User-approved PB-fragments will be loaded into memory by pressing the OK button.

Results:

- PB-fragments are loaded in memory

[Fragment>Type>Calculate>fawn/abacus] : Probabilistic typing of bPB-fragments (fawn) or PB-fragments (abacus) .

Prerequisites:

- loaded in memory protein sequence

- loaded in memory PB-fragments

- specified tolerances.

Results:

- Fragment typing probabilities are calculated and loaded in memory.

The main FMCGUI window displays:

- the summary table that shows how many fragments of each AA-residue type are expected and how many fragments were actually recognized by the typing script;

- warning messages that suggest the user check, and possibly modify manually, the typing of some fragments.

[Fragment>Type>fix] :To modify typing probabilities .

Prerequisites:

- loaded in memory protein sequence

- loaded in memory PB-fragments

A new "Fragment Property Modification" (FPM) window is opened. This window has 3 sections.

In the top section of the FPM window the user can select a fragment user ID, U_id. The typing probabilities for all AA types t will then be shown on the graph. The chemical shifts of the fragment and its assignment status (A_id) are shown as well. The user can modify the typing probabilities of the selected fragment by selecting AA types with the right mouse button and pressing the ‘Update’ button. As a result, only the probabilities corresponding to the selected AA types will be set to the same non-zero value.

In the top section of the FPM window the user can select an AA residue type t1. The graph will then show the typing probabilities that correspond to the selected residue type t1 for all available fragment IDs f. Selecting a particular fragment U_id by clicking the right mouse button (the color of U_id changes to red) and pressing the “Update” button will set the probability for that fragment to 1, while for all other f it will be set to 0. Selecting a particular fragment U_id by clicking the left mouse button (the color of U_id changes to blue) and pressing the “Update” button will set the probability for that fragment to 0.
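The effect of these updates on a fragment's typing probabilities can be sketched as follows. This is an illustration only: the text says the selected types receive "the same non-zero value", and the choice of 1/n (so the values sum to 1) is an assumption, as are the function names and the three-type example.

```python
def restrict_types(typing, selected):
    """Give the selected AA types equal non-zero probability and zero the rest.

    typing: {aa_type: probability} for one fragment.
    The shared value 1/len(selected) is an assumption made for this sketch.
    """
    selected = set(selected)
    if not selected:
        raise ValueError("select at least one AA type")
    unknown = selected - set(typing)
    if unknown:
        raise KeyError(f"unknown AA types: {sorted(unknown)}")
    p = 1.0 / len(selected)
    return {t: (p if t in selected else 0.0) for t in typing}

def exclude_type(typing, aa_type):
    """Return a copy with the probability of one AA type set to 0."""
    updated = dict(typing)
    updated[aa_type] = 0.0
    return updated

# Hypothetical fragment with three candidate types.
before = {"A": 0.5, "S": 0.3, "T": 0.2}
after = restrict_types(before, {"S", "T"})
no_ser = exclude_type(before, "S")
```

Both helpers return new dictionaries, so the original typing stays available if the user wants to revert the change.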

[Fragment>Expected Peaks>”spectra name”] :To generate different peak lists expected from covalent structure of fragments.

Prerequisites:

- protein sequence is loaded in memory

- PB-fragments are loaded in memory

[Fragment>Modify assigned] : To correct assigned fragments.

Prerequisites:

- loaded in memory protein sequence

- loaded in memory PB-fragments

- loaded HNCO, CBCACONH, and HNCA peak lists.

- specified tolerances.

As a result, CO chemical shifts are added and chemical shift names are corrected for the PB-fragments that are assigned (that is, those that have A_id > -99).

5. Assignment menu

[Assignment>Contacts>HNCA] :To calculate the HNCA fragment contact map.

Prerequisites:

- PB-fragments are loaded in memory ;

- typing probabilities are calculated;

- HNCA peak list (recommended to be referenced) is loaded in memory.

- tolerances are specified.

As a result, the HNCA contact map is calculated and loaded into memory.

[Assignment>Contacts>NOE>fawn] :To calculate the fawn fragment contact map.

Prerequisites:

- PB-fragments are loaded in memory;

- typing probabilities are calculated;

- N15_NOESY peak list is loaded in memory.

- tolerances are specified.

As a result, the fawn contact map is calculated using the N15 NOESY peak list and loaded into memory.

[Assignment>Contacts>NOE>abacus] :To calculate both the fawn and abacus fragment contact maps.

Prerequisites:

- PB-fragments are loaded in memory;

- typing probabilities are calculated;

- N15_NOESY and C13_NOESY peak lists are loaded in memory.

- tolerances are specified.

As a result, the fawn contact map is calculated using only the N15_NOESY peak list, while the abacus contact map is calculated using both the N15_NOESY and C13_NOESY peak lists as interpreted by the BACUS procedure. Both calculated maps are loaded into memory.

[Assignment>Calculate Probabilities>SA] :To calculate assignment probabilities.

Prerequisites:

- protein sequence is loaded in memory;

- PB-fragments are loaded in memory;

- typing probabilities are calculated;

- the HNCA fragment contact map is calculated;

- at least one of the fawn and abacus fragment contact maps is calculated;

Probabilistic mapping of PB-fragments onto protein sequence is performed using Simulated Annealing Monte Carlo simulations.

A new window “Calculate SA” is opened where the user can specify different parameters in the control file of the SA simulations.

The main parameters to consider are:

- “Name of the SA run”. Normally the name is sa_run#. A new directory under this name will be created within the PROJECTNAME/assign directory. The SA calculations will be carried out and the results will be stored in this directory.

- “Size of the pool for unassigned fragments”. The number of positions appended to the protein sequence; discarded (unassigned) fragments, if there are any, will be placed there. It is safe to over-estimate this number. (If this number is under-estimated, spurious spin-systems will be forced onto the protein sequence.)

- “Number of SA trajectories”. The time needed for the calculations is proportional to this number. On the other hand, with more SA trajectories the assignment probabilities can be calculated more accurately. In the case of good data, when all SA trajectories converge to assignments with the same energy, 10-15 trajectories should be enough. In the case of poor data, it is better to calculate 40-50 SA trajectories.

- “NOE bbcmap type”. The user should specify which NOE contact map, abacus or fawn, should be used in the calculations.

- “Fixing position flag”. If the flag is set to 1, the sequence positions of all fragments that have an assignment ID > -99 will be fixed during the simulation.

- “Final temperature”. Setting the optimal final temperature ensures that all SA trajectories converge to optimal or sub-optimal assignments (assignments in the vicinity of the global energy minimum). The optimal final temperature can be found by running one or a few SA runs with 3-4 trajectories and analysing the convergence of the trajectories from the report shown in the main FMCGUI window after each run.

As a result of the SA calculations, the assignment probability map is calculated and loaded into memory. The map is also saved in the file 'sa.probmap' located in the PROJECTNAME/assignment/sa_run# directory.

[Assignment>Calculate Probabilities>REM] :To calculate assignment probabilities.

Prerequisites:

- protein sequence is loaded in memory;

- PB-fragments are loaded in memory;

- typing probabilities are calculated;

- the HNCA fragment contact map is calculated;

- at least one of the fawn and abacus fragment contact maps is calculated;

Probabilistic mapping of PB-fragments onto protein sequence is performed using Replica Exchange Method Monte Carlo simulations.

A new window “Calculate REM” is opened where the user can specify different parameters in the control file of the REM simulations.

The main parameters to consider are:

- “Name of the REM run”. Normally the name is rem_run#. A new directory under this name will be created within the PROJECTNAME/assign directory. The REM calculations will be carried out and the results will be stored in this directory.

- “Size of the pool for unassigned fragments”. The number of positions appended to the protein sequence; discarded (unassigned) fragments, if there are any, will be placed there. It is safe to over-estimate this number. (If this number is under-estimated, spurious spin-systems will be forced onto the protein sequence.)

- “Number of REM steps”. The time needed for the calculations is proportional to this number. On the other hand, with more REM steps a more extensive sampling of assignment space will be achieved, which in turn results in more accurate assignment probabilities. This number should be increased for large proteins.

- “NOE bbcmap type”. The user should specify which NOE contact map, abacus or fawn, should be used in the calculations.

- “Fixing position flag”. If the flag is set to 1, the sequence positions of all fragments that have an assignment ID > -99 will be fixed during the simulation.

- “Low temperature”. The optimal low temperature provides extensive sampling of sub-optimal assignments during the REM simulation.

The user can check that the low temperature and the number of REM steps are set correctly by analysing the report shown in the main FMCGUI window after the calculations are done.

As a result, the 50 lowest-energy assignments are used to calculate the assignment probabilities. The probabilities are loaded into memory and saved in the file ‘rem.prbmap’ located in the PROJECTNAME/assign/rem_run# directory.
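One way to turn a set of low-energy assignments into probabilities is sketched below. The text does not say how the 50 lowest-energy assignments are weighted, so this sketch assumes equal weights (a simple frequency count); the data layout and the function name are hypothetical.

```python
from collections import Counter

def probabilities_from_lowest(samples, n_keep=50):
    """Estimate assignment probabilities from the n_keep lowest-energy samples.

    samples: list of (energy, mapping), with mapping = {fragment_id: position}.
    Returns {(fragment_id, position): fraction of kept samples}.
    Equal weighting of the kept samples is an assumption of this sketch.
    """
    if not samples:
        raise ValueError("no samples given")
    kept = sorted(samples, key=lambda sample: sample[0])[:n_keep]
    counts = Counter()
    for _, mapping in kept:
        for frag, pos in mapping.items():
            counts[(frag, pos)] += 1
    return {key: n / len(kept) for key, n in counts.items()}

# Hypothetical sampled assignments; only the two lowest energies are kept here.
samples = [
    (1.0, {"f1": 1, "f2": 2}),
    (1.2, {"f1": 1, "f2": 3}),
    (5.0, {"f1": 2, "f2": 1}),
]
rem_probs = probabilities_from_lowest(samples, n_keep=2)
```

In this toy case fragment f1 sits at position 1 in both kept assignments, so its probability there is 1, whereas f2 is split evenly between positions 2 and 3.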

[Assignment>Fix Assignment>Using Probability map] :To perform sequence specific assignment of PB-fragments using results of SA or REM calculations.

Prerequisites:

- protein sequence is loaded in memory;

- PB-fragments loaded in memory;

- at least one SA or REM calculation of assignment probabilities has been done

Calculation of assignment probabilities with FMCGUI could be repeated a few times using different methods and parameters. Results of each calculation are stored in a separate directory with the user-specified name. Therefore, there could be a few different directories (for example, sa_run1, sa_run2, rem_run0, rem_run1, rem_run2) located within PROJECTNAME/assign/ directory that contain different assignment probability maps.

The user will be asked to select the calculation directory (sa_run# or rem_run#) and to specify the value of the acceptance probability P_a. Normally, P_a = 0.9 is appropriate. A fragment is considered to be assigned to a sequence position if the corresponding assignment probability (taken from the selected directory) is >= P_a.

As a result, some of the PB-fragments will be assigned; that is, their assignment IDs A_id will be set. The assignment report will be shown in the main FMCGUI window and saved in the corresponding simulation directory (‘sa.fix’ or ‘rema.fix’ files). For each sequence position, the IDs of both unambiguously and ambiguously assigned fragments are shown in the report. The list of discarded fragments is also presented.
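The acceptance rule can be sketched in a few lines. The probability map layout and the function name are hypothetical, and the sketch ignores the reporting of ambiguous and discarded fragments described above.

```python
UNASSIGNED = -99  # A_id value used for fragments without a sequence position

def fix_assignments(prob_map, p_accept=0.9):
    """Return {fragment_id: A_id} using the acceptance probability P_a.

    prob_map: {fragment_id: {position: probability}}, each row non-empty.
    A fragment is assigned to its most probable position only if that
    probability is >= p_accept; otherwise it stays unassigned.
    """
    assigned = {}
    for frag, row in prob_map.items():
        pos, p = max(row.items(), key=lambda item: item[1])
        assigned[frag] = pos if p >= p_accept else UNASSIGNED
    return assigned

# Hypothetical probability map for two fragments.
prob_map = {
    "f1": {5: 0.95, 6: 0.05},
    "f2": {7: 0.60, 8: 0.40},
}
a_ids = fix_assignments(prob_map)
```

Lowering P_a assigns more fragments automatically at the cost of reliability, which is why fragments left at -99 are natural candidates for manual inspection.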

[Assignment>Fix Assignment>Manually] :To perform sequence specific assignment of PB-fragments manually.

Prerequisites:

- protein sequence is loaded in memory;

- PB-fragments are loaded in memory;

This command allows user to fix sequence position of individual fragments.

The “Fragment Property Modification” window will be opened. The user can select a fragment in the bottom section of the window, and the information regarding the fragment assignment will be displayed in this section. Namely, the graph shows the assignment probabilities (SA or REM) that are currently loaded in memory, and the text part shows the chemical shifts making up the fragment and its assignment ID (A_id).

To modify the current fragment assignment, the user has to select a sequence position on the graph (by clicking on it with the mouse) and then press the ‘Update’ button.

As a result, the assignment ID of the selected fragment will be set to the selected sequence position.

There are two special positions “U” and “B” shown on the graph at the end of the protein sequence.

Selecting “U” and pressing “Update” button results in changing assignment status of the fragments to Unassigned, that is A_id is set to -99.

Selecting “B” and pressing “Update” button results in fixing fragment position in the pool of discarded fragments.

[Assignment>Fix Assignment>Reset all] :To change assignment status of all fragments to “Unassigned”.

Prerequisites:

- protein sequence is loaded in memory;

- PB-fragments are loaded in memory;

In the result, for all fragments A_id is set to -99.

[Assignment>Load Probabilities] :To load assignment probabilities in memory.

Prerequisites:

- protein sequence is loaded in memory;

- PB-fragments are loaded in memory;

- SA / REM calculations of assignment probabilities have been done

The user has to select a directory where SA or REM calculations were done (sa_run# or rem_run#).

The assignment probabilities from the selected directory will be loaded in memory.

6. Structure menu

[Structure>Constraints>Talos>calculate] :To generate dihedral angle constraints.

Prerequisites:

- protein sequence is loaded in memory;

- assigned chemical shifts are loaded in memory;

Backbone dihedral angles are predicted using TALOS and then transformed into dihedral angle constraints.

As a result, the constraints are saved in two formats: files 'belok.aco' and 'prot_dihe.tbl' for CYANA and CNS calculations, respectively. Both files are saved in the root project directory PROJECTNAME.

[Structure>Constraints>Talos>load] :To load dihedral angle constraints in CYANA format (aco-file).

[Structure>Constraints>H-bonds>Specify] :To prepare H-bond distance constraints manually.

Prerequisites:

- protein sequence is loaded in memory;

A new window “O/HN Pairs” is opened.

For each H-bond constraint, the user has to specify an O-HN pair by typing in the IDs of the residues corresponding to the O and HN atoms. Pressing “OK” will save the H-bond constraints in two formats: files 'hbond.upl' and 'prot_hbond.tbl' for CYANA and CNS calculations, respectively. Both files are saved in the root project directory PROJECTNAME.

[Structure>Constraints>H-Bonds>load] :To load HBond distance constraints in CYANA format (upl-file).

[Structure>Calculate>Cyana] :To set up a new structure calculation run with CYANA.

Prerequisites:

- protein sequence is loaded in memory;

- assigned chemical shifts are loaded in memory;

- N15_NOESY, C13_NOESY, and Arom_NOESY peak lists are loaded in memory

- Specified tolerances.

- dihedral angle constraints are created (file “belok.aco” is present in the root project directory PROJECTNAME)

Optional:

- Hbond distance constraints (file “hbond.upl” is present in the root project directory PROJECTNAME)

- file that contains ZN ion ligands (file “zn_ligands” is present in the root project directory PROJECTNAME)

A new directory under the name that is specified by user (normally “crun#”) will be created inside project root directory PROJECTNAME. This directory contains all files required to start automatic structure calculations with CYANA.

[Structure>Calculate>Cyana] :To set up ABACUS structure calculations. (Under construction)

[Structure>RPF>RP] :To perform Recall/Precision analysis of structural ensemble.

Prerequisites:

- protein sequence is loaded in memory;

- assigned chemical shifts are loaded in memory;

- N15_NOESY, C13_NOESY, and Arom_NOESY peak lists are loaded in memory

- Specified tolerances.

- coordinates of the structural ensemble in CYANA format (final.pdb) or in CNS format (prot_ref_al.pdb).

The parameters to set up:

- “RP directory name”. Normally the name is rp#. A new directory under this name will be created within the PROJECTNAME/assign directory. The results of the RPF analysis will be stored in this directory.

- “sequence gap”. Residue pairs separated by less than the value of the sequence gap will be excluded when generating expected peak lists.

- “cutting distance for recall”. Distance threshold for evaluating whether an experimental peak matches the structural ensemble (Recall score).

- “cutting distance for precision”. Distance threshold for generating expected peaks from the structural ensemble (Precision score).

The results of the RPF analysis will be saved in a new directory (the name of which is specified by the user) that is located in the project root directory PROJECTNAME. The results include peak lists in SPARKY format of both false negative and false positive peaks for the different NOESY spectra, in separate files.
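The roles of the three parameters can be illustrated with a simplified sketch. It treats every observed NOESY peak as an already identified proton pair and uses one distance per pair, which is a strong simplification of a real Recall/Precision analysis on ambiguous peaks and structural ensembles; the default thresholds, the data layout, and the function names are assumptions.

```python
def pair(a, b):
    """Canonical, order-independent key for two protons given as (residue, atom)."""
    return tuple(sorted((a, b)))

def rp_scores(observed, distances, d_recall=5.0, d_precision=4.0, seq_gap=1):
    """Return (recall, precision, unexplained, unobserved).

    observed:  set of proton pairs seen in the experimental peak list.
    distances: {proton pair: distance in angstrom} taken from the structure.
    """
    # Recall: observed pairs that the structure explains within d_recall.
    explained = {p for p in observed
                 if distances.get(p, float("inf")) <= d_recall}
    # Expected pairs: close in the structure and far enough apart in sequence.
    expected = {p for p, d in distances.items()
                if d <= d_precision and abs(p[0][0] - p[1][0]) >= seq_gap}
    recall = len(explained) / len(observed) if observed else 0.0
    precision = len(expected & observed) / len(expected) if expected else 0.0
    return recall, precision, observed - explained, expected - observed

# Hypothetical distances (angstrom) and observed contacts.
distances = {
    pair((1, "HA"), (2, "HN")): 2.5,
    pair((1, "HN"), (5, "HA")): 3.8,
    pair((2, "HN"), (9, "HB")): 7.5,
    pair((3, "HN"), (3, "HA")): 2.9,  # intra-residue, excluded by seq_gap=1
}
observed = {pair((1, "HA"), (2, "HN")), pair((2, "HN"), (9, "HB"))}
recall, precision, unexplained, unobserved = rp_scores(observed, distances)
```

The two returned sets correspond to the kinds of peaks one would inspect in the spectra: observed contacts the structure does not explain, and contacts the structure predicts that were not observed.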

[Structure>RPF>DP] :To set up calculations of DP score with AutoStructure.

Prerequisites:

- protein sequence is loaded in memory;

- assigned chemical shifts are loaded in memory;

- N15_NOESY, C13_NOESY, and Arom_NOESY peak lists are loaded in memory

- Specified tolerances.

- coordinates of the structural ensemble in CYANA format (final.pdb) or in CNS format (prot_ref_al.pdb).

[Structure>Water refinement>calculate] :To set up water refinement of structural ensemble obtained with CYANA.

Prerequisites:

- structure calculation with CYANA should be done

Optional:

- RDC data in PALES format.

- file that contains ZN ion ligands (file “zn_ligands” )

The user will be asked to specify the name of the directory for water refinement calculations (WATDIR) and to select a number of files with coordinates and constraints. In addition, the user has to indicate cis-proline residues (if there are any) and to specify the protonation state of HIS residues, which are doubly protonated by default.

As a result, a new directory WATDIR will be created inside the PROJECTNAME directory; it contains all files and scripts required to carry out water refinement calculations with CNS. It is recommended to run the calculations on a Linux cluster.

[Structure>Water refinement>summary] :To create a summary.

Prerequisites:

- water refinement with CNS should be done.

User will be asked to specify residues used for structure superposition and to select the water refinement directory WATDIR.

The refined structural models will be superimposed and combined in one file. The created summary reports the values of different energy components and constraint violation statistics for each structural model. A new directory WATDIR_results will be created. The directory contains the final superimposed coordinates, distance and dihedral angle constraints in a format suitable for PDB deposition, the summary, and a constraint violations report.

[Structure>Add ZN ligands] :To create zn_ligands file.

The user has to type the IDs of the residues that ligate the zinc ion into the pop-up entry window “ZN ligands”. If there are several zinc ions, the information for each ion should be provided in a separate row.

The file “zn_ligands” will be created inside the project root directory PROJECTNAME by pressing “OK” button.

[Structure>RCI] :To calculate Random Coil Index.

Random Coil Index is calculated using rci_v_1c.py script. New directory PROJECTNAME/rci that contains the results of calculations is created.

7. View menu

This section provides means to visualize data and calculation results.

[View>Fragment] :To display current fragment properties.

This command opens the “Fragment Graph” (FG) window, where all properties of the selected fragment that are currently loaded in memory are displayed.

The chemical shifts making up the fragment are shown in the middle part of the window. The heading line (for example, the line “Fragment #21 rem_run1 L108” shown in the figure) gives the fragment ID (#21), the name of the directory that contains the assignment probabilities used to assign the fragment (rem_run1), and the sequence position to which the fragment is assigned (L108). All other fragment properties are shown in the form of graphs. The typing probabilities and both assignment probabilities are shown on three graphs in the top part of the FG window, together with the names of the directories from which the displayed assignment probabilities were loaded. The two graphs in the bottom part of the FG window show a column and a row of a contact map calculated from NOESY data, respectively. Here U_id is the selected fragment ID, while f is any fragment ID for which the corresponding score is > 0. Which contact map is shown on the graphs is indicated on the bar at the bottom of the FG window. Clicking on this bar switches from one contact map to another.

The HNCA contact map is also shown on these graphs implicitly, through the color of the graph bars.

For example, the figure shows the elements of the contact map related to fragment 21. The scores for fragment 21 to precede fragments 37, 20, and 17 in the protein sequence are shown in magenta, which means that these connections, derived from NOESY data, are supported by HNCA spectra as well. The scores for fragment 21 to precede fragments 22 and 4 are shown in red, meaning that these connections are not supported by HNCA spectra.

[View>Assignment] :To display assignment results.

The user is asked to select a directory where assignment probabilities were calculated (sa_run# or rem_run#).

A new “Assignments Graphs” (AG) window pops up. This window provides the user with graphical means to manipulate PB-fragment assignments without changing the assignment status of PB-fragments in memory. The current fragment assignment, taken from memory, together with the different scores associated with this assignment, is shown when the AG window is opened. The fragments that are currently not assigned are also shown in the bottom-right part of the AG window. The user can modify the current assignment and observe the resulting changes in the scores.

A particular fragment assignment is displayed in the left part of the AG window. The graph at the bottom shows the IDs of the fragments assigned to each sequence position, while the four graphs at the top show different scores that correspond to this assignment, namely the HNCA score, the NOE scores, and the typing and assignment probabilities that are currently loaded in memory. The scroll bar at the bottom allows the user to move along the protein sequence.

In order to modify the assignment, the user first has to select the protein sequence segment of interest by clicking on the IDs of the two residues that correspond to the edges of the segment. The colour of these residues, for example K56 and D59, will change to red (see Figure). Then pressing the “Select Segment” bar at the bottom of the AG window will show the possible assignments for the selected segment in the top-right corner of the AG window. The possible assignments for the selected sequence segment are taken from the selected directory shown on the title bar and correspond to sub-optimal fragment assignments sampled during SA or REM calculations.

The new assignment for the sequence segment should be selected from the list of possible assignments using the mouse. The color of the selected line will turn red (see Figure). If the user wants to consider an assignment that is not present in the list, they should select any assignment from the list and modify it by pressing the “Edit” button at the bottom of the AG window. A new window “Edit Assignment” pops up, and the user can modify the assignment by typing changes in this window.

Once a new assignment for the sequence segment is selected, the current fragment assignment shown in the left part of the AG window can be updated by pressing the “Update” button (see Figure).

FMCGUI2.2 DATA FORMATS

A. Sequence formats.

A1. "Fasta" format. The first line should start with '>'. Next one or more

lines contain sequence in 1-letter code.

Example_A1:

>

MDSKEVLVHVKNLEKNKSNDAAVLEILHVL

DKEFVPTE KLLRETKVGVE VNKFKKSTN

VEISKLVKKMISSWKDAIN
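As an illustration of the layout described in A1, a file in this format could be read as sketched below. The function name `read_fasta_sequence` is hypothetical, and the decision to ignore blanks inside a sequence line (as they appear in Example_A1) is an assumption, not documented FMCGUI behaviour.

```python
def read_fasta_sequence(text):
    """Return the 1-letter sequence from "Fasta"-style text as one string."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("the first non-empty line must start with '>'")
    # Join all sequence lines and drop any embedded whitespace.
    return "".join("".join(ln.split()) for ln in lines[1:])


example = """>
MDSKEVLVHV
DKEFVPTE KLLRE
"""
sequence = read_fasta_sequence(example)
```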

A2. "Standard" format. Each line contains name of one residue in 3-letter code

and optionally the residue ID. (Only the residue ID of the first residue is used.)

Example_A2.1 (position ID is not specified)

GLN

GLY

HIS

MET

PRO

GLY

ILE

TYR

GLU

GLY

LYS

GLY

THR

ASN

MET

GLU

....

Example_A2.2 (all position IDs are specified):

GLN -3

GLY -2

HIS -1

MET 0

PRO 1

GLY 2

ILE 3

ILE 4

TYR 5

.....

Example_A2.3 (only the first position ID is specified):

GLN -3

GLY

HIS

MET

PRO

GLY

ILE

TYR

.....
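The rule that only the ID of the first residue is used can be sketched as follows. This is a minimal illustration, not FMCGUI code; the function name `read_standard_sequence` and the default starting ID of 1 (used when no ID is given, as in Example_A2.1) are assumptions.

```python
def read_standard_sequence(text, default_first_id=1):
    """Parse the "Standard" sequence format into (residue_id, name) pairs."""
    names = []
    first_id = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or set(fields[0]) == {"."}:
            continue                      # skip blank lines and "...." elisions
        if not names and len(fields) > 1:
            first_id = int(fields[1])     # only the first residue's ID is used
        names.append(fields[0].upper())
    start = default_first_id if first_id is None else first_id
    # All later IDs follow consecutively from the first one.
    return [(start + i, name) for i, name in enumerate(names)]


residues = read_standard_sequence("GLN -3\nGLY\nHIS\nMET\nPRO\n")
```

With this numbering, the first-ID-only layout of Example_A2.3 and the fully numbered layout of Example_A2.2 give the same residue IDs.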

B. Peak lists formats.

B1. Sparky format.

Peak lists can be referenced or not referenced, with or without Volume/Height.

Referencing, if present, has the format F#, where # is the user-defined

PB-fragment ID. The volume/height can be given as an integer

or as a floating-point number in E format (0.141E+10).

Example_B1.1: (referenced HNCA)

User w1 w2 w3

F1 55.269 106.560 9.027

F1 43.558 106.560 9.026

F2 51.232 114.375 9.101

F2 59.634 114.375 9.096

F3 57.686 118.215 9.514

F3 59.071 118.215 9.519

F4 55.762 118.966 9.306

F4 59.071 118.966 9.306

F5 52.547 119.119 9.250

F5 59.071 119.119 9.249

F6 59.071 120.109 9.230

F6 55.128 120.109 9.232

F7 55.598 122.600 9.599

F9 56.184 122.430 9.316

F9 57.616 122.430 9.319

F11 58.719 122.464 9.158

F11 55.292 122.464 9.157

Example_B1.2: (not referenced N15NOESY)

Assignment w1 w2 w3 Data Height

?-?-? 3.592 106.850 8.261 1435751

?-?-? 4.244 106.850 8.259 1096259

?-?-? 1.625 108.915 8.438 544482

?-?-? 6.898 108.915 8.438 428124

?-?-? 2.017 119.793 7.851 858875

?-?-? 4.358 119.793 7.855 1260421

?-?-? 4.573 126.081 8.518 590579

?-?-? 3.891 126.081 8.522 651350

?-?-? 5.665 125.092 8.261 793816

?-?-? 3.567 125.092 8.264 1123055

?-?-? 3.300 125.092 8.262 814461

?-?-? 4.713 125.092 8.262 6211645

?-?-? 6.976 125.092 8.262 2048095

?-?-? 7.884 125.092 8.261 651915

Example_B1.3: (not referenced HNCO)

w1 w2 w3

176.935 106.560 9.025

175.679 114.375 9.107

175.310 118.215 9.517

174.027 118.966 9.307

178.232 119.102 9.255

175.174 120.109 9.227

173.781 122.600 9.599

175.720 122.430 9.312

172.143 122.464 9.165

173.699 121.269 9.058

176.102 121.525 9.099

173.645 123.727 9.143

173.781 124.819 9.113

Example_B1.4: (referenced N15HSQC)

User w1 w2

F34 123.334 8.843

F99 122.498 7.685

F144 117.711 7.610

F163 121.227 8.178

F50 116.577 8.451

F1 106.560 9.025

F100 122.361 7.821

F101 121.372 7.868

F102 122.361 8.006

F103 122.378 8.048

F105 122.703 8.123

F106 122.327 8.231

F107 121.440 8.358

F109 120.570 8.577

F11 122.464 9.164

F111 120.535 8.306

F112 121.218 8.200

F113 120.843 8.157

F114 121.338 8.105
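The variants shown in Examples B1.1 to B1.4 (with or without referencing, with or without height) can be handled by one small reader. The sketch below is an illustration only; the function name `read_sparky` and the returned dictionary layout are assumptions, and the number of dimensions is taken from the `w1 w2 ...` header line.

```python
import re


def read_sparky(text):
    """Parse a Sparky-style peak list into a list of dictionaries."""
    peaks, ndim = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "w1" in fields:                       # header line: count w1, w2, ...
            ndim = sum(1 for f in fields if re.fullmatch(r"w\d", f))
            continue
        if ndim is None:
            raise ValueError("header line with w1, w2, ... not found")
        label = None
        try:
            float(fields[0])
        except ValueError:
            label = fields.pop(0)                # "F12" or "?-?-?"
        shifts = tuple(float(f) for f in fields[:ndim])
        # Integers and E-format numbers such as 0.141E+10 both parse as float.
        height = float(fields[ndim]) if len(fields) > ndim else None
        is_ref = label is not None and re.fullmatch(r"F\d+", label)
        peaks.append({"fragment": int(label[1:]) if is_ref else None,
                      "shifts": shifts, "height": height})
    return peaks


hnca = read_sparky("User w1 w2 w3\nF1 55.269 106.560 9.027\n")
noesy = read_sparky("Assignment w1 w2 w3 Data Height\n"
                    "?-?-? 3.592 106.850 8.261 1435751\n")
```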

B2. XEASY format.

First (one or a few) lines should start with character '#'.

There is no referencing in this format.

Example_B2: (not referenced Arom_NOESY)

# Number of dimensions 3

#FORMAT xeasy3D

#INAME 1 C

#INAME 2 H

#INAME 3 HC

#CYANAFORMAT ChH

1 132.896 7.221 7.216 1 U 0.141E+10 0.000E+00 e 0 0 0 0

2 132.896 6.799 7.219 1 U 0.177E+09 0.000E+00 e 0 0 0 0

3 132.896 4.695 7.217 1 U 0.205E+08 0.000E+00 e 0 0 0 0

4 132.896 3.271 7.220 1 U 0.274E+08 0.000E+00 e 0 0 0 0

5 132.896 3.006 7.219 1 U 0.180E+08 0.000E+00 e 0 0 0 0

6 132.896 1.848 7.228 1 U 0.108E+08 0.000E+00 e 0 0 0 0

7 132.896 1.436 7.233 1 U 0.112E+08 0.000E+00 e 0 0 0 0

8 118.540 6.796 6.796 1 U 0.452E+10 0.000E+00 e 0 0 0 0

9 118.540 7.221 6.792 1 U 0.228E+09 0.000E+00 e 0 0 0 0

10 118.540 7.054 6.794 1 U 0.167E+09 0.000E+00 e 0 0 0 0

11 118.540 4.708 6.795 1 U 0.641E+08 0.000E+00 e 0 0 0 0

12 118.540 2.032 6.794 1 U 0.203E+08 0.000E+00 e 0 0 0 0

13 118.540 0.537 6.797 1 U 0.380E+08 0.000E+00 e 0 0 0 0

14 131.288 6.911 6.900 1 U 0.261E+09 0.000E+00 e 0 0 0 0

15 131.288 7.170 6.899 1 U 0.897E+08 0.000E+00 e 0 0 0 0

...
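A reader for the layout of Example_B2 might look as follows. This is a sketch under stated assumptions: the function name `read_xeasy` is hypothetical, the number of dimensions is read from the `# Number of dimensions` header line, and the volume is assumed to be the third column after the chemical shifts, as it appears in the example.

```python
def read_xeasy(text):
    """Parse XEASY-style peak lines into (peak_number, shifts, volume) tuples."""
    ndim, peaks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line or set(line) == {"."}:
            continue                              # blank line or "..." elision
        if line.startswith("#"):
            if "Number of dimensions" in line:
                ndim = int(line.split()[-1])
            continue
        if ndim is None:
            raise ValueError("'# Number of dimensions' header line not found")
        f = line.split()
        shifts = tuple(float(x) for x in f[1:1 + ndim])
        volume = float(f[ndim + 3])               # follows the two flag columns
        peaks.append((int(f[0]), shifts, volume))
    return peaks


arom = read_xeasy("""# Number of dimensions 3
#FORMAT xeasy3D
1 132.896 7.221 7.216 1 U 0.141E+10 0.000E+00 e 0 0 0 0
2 132.896 6.799 7.219 1 U 0.177E+09 0.000E+00 e 0 0 0 0
""")
```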

B3. "Standard" ABACUS format.

Example_B3.1: (not referenced N15NOESY)

1 106.850 4.244 8.259 1096259

2 108.915 1.625 8.438 544482

3 108.915 6.898 8.438 428124

4 119.793 2.017 7.851 858875

5 119.793 4.358 7.855 1260421

6 126.081 4.573 8.518 590579

7 126.081 3.891 8.522 651350

8 125.092 5.665 8.261 793816

9 125.092 3.567 8.264 1123055

…

Example_B3.2: (not referenced C13NOESY)

1 61.234 2.083 4.157 0.124E+08

2 61.234 7.471 4.152 0.751E+07

3 31.252 4.219 2.974 0.814E+07

4 31.252 4.222 3.029 0.985E+07

5 50.670 4.728 3.651 0.114E+09

6 17.853 2.445 0.865 0.506E+08

7 17.853 3.055 0.868 0.189E+08

8 17.853 3.837 0.868 0.285E+08

9 17.853 7.122 0.873 0.278E+08

10 17.853 6.847 0.868 0.185E+08

11 17.853 8.292 0.860 0.263E+08

12 17.853 8.393 0.860 0.198E+08
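Comparing Example_B1.2 with Example_B3.1 suggests that a "standard" ABACUS line holds a peak number, the heteronucleus shift, the two proton shifts, and the intensity, that is, the Sparky w1 and w2 columns swapped. The sketch below writes such lines under that inferred column order; the function name `sparky_to_abacus` is hypothetical, and the column order is an inference from the examples rather than a documented rule.

```python
def sparky_to_abacus(peaks):
    """Format (w1, w2, w3, height) tuples as "standard" ABACUS lines.

    Assumes w2 is the heteronucleus (N or C) shift, as in Example_B1.2.
    """
    lines = []
    for number, (w1, w2, w3, height) in enumerate(peaks, start=1):
        # Peak number, heteronucleus shift, proton shifts, intensity.
        lines.append(f"{number} {w2:.3f} {w1:.3f} {w3:.3f} {int(height)}")
    return lines


abacus_lines = sparky_to_abacus([(4.244, 106.850, 8.259, 1096259),
                                 (1.625, 108.915, 8.438, 544482)])
```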

C. Spin-systems file.

One can load both assigned and not assigned spin-systems (fragments).

There is only one format to load not assigned spin-systems.

Assigned spin-systems can be loaded using two different formats.

C1. PB-fragments in standard format.

Each fragment is represented by a number of lines. The groups of lines

corresponding to different fragments are separated by empty line. The line

that follows the last fragment should have 'Q' at the first position.

A fragment is described by the following format:

31 X 7 0.000

9.025 HN 106.560 N

5.221 HA 55.302 CA

1.940 HB* 33.679 CB

1.480 HG1 27.976 CG

1.775 HG2 27.976 CG

2.943 HD1 43.958 CD

3.205 HD2 43.958 CD

The first line contains userID, 1-letter residue type ('X' if not known), the

number of the following lines, and chemical shift of CO (0.000 if not known).

The fragment's user ID can be any integer < 500.

There is no restriction on the order of fragments in the file.

Example_C1:(not assigned PB-fragments)

.....

90 X 5 0.000

8.462 HN 122.771 N

4.262 HA 56.242 CA

1.880 HB1 30.528 CB

2.006 HB2 30.528 CB

2.222 HG* 36.201 CG

91 S 4 174.082

8.089 HN 123.488 N

4.626 HA 55.058 CA

3.413 HB1 65.338 CB

3.801 HB2 65.338 CB

247 X 5 0.000

8.070 HN 123.607 N

4.034 HA 62.358 CA

2.048 HB 32.775 CB

0.873 HG1* 21.111 CG1

0.884 HG2* 20.501 CG2

18 X 4 0.000

8.110 HN 123.044 N

5.042 HA 58.559 CA

4.179 HB 70.135 CB

0.087 HG2* 16.018 CG2

Q
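The header line of each fragment states how many resonance lines follow, and the terminating 'Q' line ends the file, so the format can be read with a simple loop. The sketch below is illustrative only; the function name `read_pb_fragments` and the dictionary keys are assumptions. Blank lines are skipped everywhere, so the reader relies on the line count in the header rather than on the empty separator lines.

```python
def read_pb_fragments(text):
    """Parse PB-fragments in the standard format up to the 'Q' line."""
    fragments = []
    lines = iter(text.splitlines())
    for line in lines:
        fields = line.split()
        if not fields or set(fields[0]) == {"."}:
            continue
        if fields[0] == "Q":                      # end-of-file marker
            break
        user_id, res_type = int(fields[0]), fields[1]
        n_lines, co_shift = int(fields[2]), float(fields[3])
        atoms = []
        while len(atoms) < n_lines:
            row = next(lines, None)
            if row is None:
                raise ValueError(f"fragment {user_id} is truncated")
            row = row.split()
            if row:                               # proton shift/name, heavy atom shift/name
                atoms.append((float(row[0]), row[1], float(row[2]), row[3]))
        fragments.append({"user_id": user_id, "type": res_type,
                          "co": co_shift, "atoms": atoms})
    return fragments


fragments = read_pb_fragments("""91 S 4 174.082
8.089 HN 123.488 N
4.626 HA 55.058 CA
3.413 HB1 65.338 CB
3.801 HB2 65.338 CB

18 X 4 0.000
8.110 HN 123.044 N
5.042 HA 58.559 CA
4.179 HB 70.135 CB
0.087 HG2* 16.018 CG2
Q
""")
```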

C2. Assigned AA-fragments in standard format.

AA-fragments are ordered in a file according to their assignment to

protein sequence. Each fragment is described using the following format:

11 T 4 174.109 106

8.114 HN 113.206 N

4.325 HA 61.900 CA

4.283 HB 69.855 CB

1.126 HG2* 21.639 CG2

The first line contains protein sequence ID to which the fragment is assigned,

1-letter residue type, the number of the following lines, chemical shift of

CO, and user ID of the fragment. The last number, user ID, is optional.

Example_C2.1:(assigned AA-fragments)

.....

10 G 2 174.136 158

8.605 HN 110.399 N

3.708 HA* 46.225 CA

11 T 4 174.109 106

8.114 HN 113.206 N

4.325 HA 61.900 CA

4.283 HB 69.855 CB

1.126 HG2* 21.639 CG2

12 N 1 175.100 999

8.231 HN 122.327 N

16 A 2 179.269 50

3.224 HA 56.082 CA

1.518 HB* 18.860 CB

17 D 3 179.378 74

8.451 HN 116.577 N

4.370 HA 57.335 CA

2.656 HB* 39.698 CB

18 V 5 177.276 117

7.579 HN 122.993 N

3.720 HA 65.701 CA

2.068 HB 31.930 CB

1.160 HG1* 22.719 CG1

0.642 HG2* 20.806 CG2

...........

Example_C2.2:(assigned AA-fragments)

.....

10 G 2 174.136

8.605 HN 110.399 N

3.708 HA* 46.225 CA

11 T 4 174.109

8.114 HN 113.206 N

4.325 HA 61.900 CA

4.283 HB 69.855 CB

1.126 HG2* 21.639 CG2

12 N 1 175.100

8.231 HN 122.327 N

16 A 2 179.269

3.224 HA 56.082 CA

1.518 HB* 18.860 CB

17 D 3 179.378

8.451 HN 116.577 N

4.370 HA 57.335 CA

2.656 HB* 39.698 CB

18 V 5 177.276

7.579 HN 122.993 N

3.720 HA 65.701 CA

2.068 HB 31.930 CB

1.160 HG1* 22.719 CG1

0.642 HG2* 20.806 CG2

...........
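The only difference between Example_C2.1 and Example_C2.2 is the optional user ID at the end of the header line. A sketch of a header parser that accepts both variants, plus a check that fragments appear in sequence order, is given below; the names `parse_aa_header` and `is_ordered` are hypothetical.

```python
def parse_aa_header(line):
    """Parse the first line of an assigned AA-fragment."""
    f = line.split()
    if len(f) not in (4, 5):
        raise ValueError(f"unexpected AA-fragment header: {line!r}")
    return {
        "seq_id": int(f[0]),                              # protein sequence ID
        "type": f[1],                                     # 1-letter residue type
        "n_lines": int(f[2]),                             # resonance lines that follow
        "co": float(f[3]),                                # CO chemical shift
        "user_id": int(f[4]) if len(f) == 5 else None,    # optional user ID
    }


def is_ordered(headers):
    """True if the fragments are listed in increasing sequence order."""
    ids = [h["seq_id"] for h in headers]
    return all(a < b for a, b in zip(ids, ids[1:]))


headers = [parse_aa_header("10 G 2 174.136 158"),
           parse_aa_header("11 T 4 174.109"),
           parse_aa_header("16 A 2 179.269 50")]
ordered = is_ordered(headers)
```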

C3. CYANA chemical shift file (prot-file).

Example_C3:(assigned chemical shifts)

1 8.414 0.000 H 1

2 123.795 0.000 N 1

3 4.723 0.000 HA 1

4 53.298 0.000 CA 1

5 1.898 0.000 HB3 1

6 32.288 0.000 CB 1

7 2.030 0.000 HB2 1

8 2.475 0.000 HG3 1

9 32.024 0.000 CG 1

10 2.547 0.000 HG2 1

11 177.440 0.000 C 2

12 4.379 0.000 HA 2

13 63.455 0.000 CA 2

14 1.914 0.000 HB3 2

15 32.194 0.000 CB 2

16 2.268 0.000 HB2 2

17 1.971 0.000 HG3 2

18 27.313 0.000 CG 2

19 2.008 0.000 HG2 2

20 3.650 0.000 QD 2

21 50.670 0.000 CD 2

22 173.945 0.000 C 3

23 8.438 0.000 H 3

24 108.915 0.000 N 3

25 3.941 0.000 QA 3

26 45.341 0.000 CA 3

..................................
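In Example_C3 each line holds five columns, commonly read as atom number, chemical shift, shift error, atom name, and residue number. Under that reading, a prot-file can be loaded into a dictionary as sketched below; the function name `read_prot` and the returned layout are assumptions.

```python
def read_prot(text):
    """Parse a CYANA prot-file into {(residue_number, atom_name): shift}."""
    shifts = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comment lines and "....." elisions.
        if len(fields) != 5 or fields[0].startswith("#"):
            continue
        shift, atom, residue = float(fields[1]), fields[3], int(fields[4])
        shifts[(residue, atom)] = shift
    return shifts


prot = read_prot("""1 8.414 0.000 H 1
2 123.795 0.000 N 1
11 177.440 0.000 C 2
..........
""")
```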

FMCGUI2.2 HOW-TOs

Spin-system identification

Normally, the NMR spectra shown in Table 1 should be collected for ABACUS. Spectra should ideally be collected from the same protein preparation and under the same conditions to make sure that peaks are within tolerance between spectra. All spectra need to be appropriately synchronized and calibrated against reference spectra of your choice. It is a good idea to use the 2D HN-projection of the 15N-NOESY and the 2D CH-projection of the 13C-NOESY as references for spectra calibration.

Step 1. Generate HN-rooted spin-systems (bPB-fragments) .

1.1 These spin-systems can be identified using the following spectra

15N_HSQC

HNCO

CBCA(CO)NH

HBHA(CO)NH

HNCA

15N_NOESY

It is a good idea to analyze all these spectra simultaneously, for example, using SPARKY, in order to obtain peak lists of the 15N_HSQC, HNCA, and CBCA(CO)NH spectra (see Figure 4.1A). The first two spectra should be referenced.

1.2. Create initial HN-rooted spin-systems (bPB-fragments) using FMCGUI:

- start new project PRJ_1 [Project>new]

- load protein sequence [DATA>Protein Sequence>load]

- load referenced HNCA peak list [DATA>HNCA>load]

- load referenced CBCA(CO)NH peak list [DATA>CBCACONH>load]

- load referenced HBHA(CO)NH peak list [DATA>HBHACONH>load]

- load referenced 15N_HSQC peak list [DATA>N15 HSQC>load]

- set tolerances for matching of resonances in different spectral dimensions [DATA>Tolerances]

- create bPB-fragments [Fragment>create>fawn]

First, a referenced C13_hsqc peak list is created and shown in a new window ‘fake C13 HSQC’. Warning messages are shown in the project main window. You have to check the list and modify it if needed. Then press the ‘OK’ button in the ‘fake C13 HSQC’ window. As a result, a new window ‘Create Fragment’ pops up. The window consists of three sections. The left section contains the suggested bPB-fragments, while the other two sections contain reports of fragment scoring with both C and H resonances and with only C resonances, respectively. Consider the warning messages shown in the project main window and check/modify the generated bPB-fragments in the left section of the ‘Create Fragment’ window. When you are satisfied, click the ‘OK’ button to load the bPB-fragments into memory.

- save project PRJ_1. [Project>Save] or [Project>Quit]

Step 2. Complete bPB-fragments with aliphatic side-chain resonances and generate additional spin-systems (without HN) using the following spectra.

13C_HSQC

(H)CCH-TOCSY

H(C)CH-TOCSY

13C_NOESY

2.1. Generate expected peak lists for C13HSQC, (H)CCH_TOCSY, and H(C)CH_TOCSY spectra using the created bPB-fragments:

- open project PRJ_1 [Project>load]

- generate expected C13hsqc_exp.list that consists of Cα-Hα and Cβ-Hβ moieties [Fragment>Expected Peaks>C13HSQC]

- generate expected CCH_tocsy.list [Fragment>Expected Peaks>(H)CCH]

- generate expected HCH_tocsy.list [Fragment>Expected Peaks>H(C)CH]

2.2 Read the generated peaks into SPARKY.

2.3 Using SPARKY, complete bPB-fragments with aliphatic side-chain resonances by analyzing Cα-Hα and Cβ-Hβ strips in all CH-rooted spectra (see Figure 2.B,C). New resonances should be picked and referenced only in the 13C_HSQC spectrum.

2.4 When all peaks corresponding to HN-rooted spin-systems are picked in the 13C_HSQC spectrum, the remaining unpicked peaks are used as a starting point for identification of spin-systems without backbone HN resonances (PB-fragments corresponding to residues before prolines in the protein sequence, the last residue, and residues missing in HN-rooted spectra).

Step 3. Spin-systems validation and correction.

Validate PB-fragments using FMCGUI:

- start a new project PRJ2 [Project>new]

- load protein sequence [DATA>Protein Sequence>load]

- load referenced 15N_HSQC peak list [DATA>N15 HSQC>load]

- load referenced 13C_HSQC peak list [DATA>C13 HSQC>load]

- load referenced HNCA peak list [DATA>HNCA>load]

- load CBCA(CO)NH peak list [DATA>CBCACOHN>load]

- load HNCO peak list [DATA>HNCO>load]

- set tolerances for matching of resonances in different spectral dimensions [DATA>Tolerances]

- create PB-fragments [Fragment>create>abacus]

This command starts the script ‘sps_create’. As a result, a new window ‘Create Fragment’ pops up. The warning messages of the ‘sps_create’ script are shown in the main window and indicate spin-systems that have a low score, namely S_max < 10^-4. Following the warnings, check and, if necessary, modify the generated PB-fragments in the left section of the ‘Create Fragment’ window. Alternatively, go back to the spectra, fix the peak lists accordingly, and repeat the fragment generation/validation.

When you are satisfied, press the OK button in the ‘Create Fragment’ window; the PB-fragments will be saved (file ‘sps_pb.dat’ in the directory PRJ2/sps) and loaded into memory.

- save project PRJ2 [Project>Save] or [Project>Quit]

Sequence specific assignment of PB-fragments

Step 1. Peak picking of NOESY spectra.

To facilitate peak-picking of NOESY spectra, you can first generate expected tocsy peaks of PB-fragments using FMCGUI:

- open project PRJ2 [Project>load]

- generate tocsy peaks of N15 NOESY spectrum [Fragment>Expected Peaks>N15NOESY]

- generate tocsy peaks of C13NOESY spectrum [Fragment>Expected Peaks>C13NOESY]

Then, using SPARKY, read in the expected peaks into corresponding spectra and complete peak picking manually.

Step 2. Probabilistic assignment of PB-fragments to protein sequence.

According to FMC procedure (see Introduction) you have to do the following:

- Probabilistic typing of PB-fragments.

- Calculation of two fragment “contact” maps, C_NOE and C_HNCA. C_NOE is a contact map based on 15N- and 13C-NOESY data (it can be calculated by two methods, “abacus” and “fawn”), and C_HNCA is a map based on HNCA data.

- Calculation of assignment probabilities by Simulated Annealing (SA) or Replica Exchange Method (REM) Monte Carlo simulations.

This can be done by the following commands:

- open project PRJ2 [Project>load]

- load 15N NOESY peak list [DATA>N15 NOESY>load]

- load 13C NOESY peak list [DATA>C13NOESY H2O>load]

- set tolerances [Data>Tolerances]

- calculate typing probabilities for all fragments [Fragment>Type>calculate>abacus]

You have to consider the warning messages shown in the project main window and analyze/modify the typing probabilities manually using the “Fragment Property Modification” (FPM) window. To open the FPM window, click on [Fragment>Type>fix]. This window has 3 sections. The top section allows you to select a fragment (by user ID) and modify its typing probabilities. The middle section shows the typing probabilities that correspond to the selected amino acid type for all fragments. Here you can fix, for any fragment, its typing probability for the selected amino acid type to the value 1 or 0.

- calculate fragment contact map from HNCA data [Assignment>Contacts>HNCA]. As a result, the map is calculated and loaded into memory. It is strongly recommended to check the messages in the project main window regarding the HNCA peak list. If there are inconsistencies in the input HNCA peak list, go back to the spectra and fix the list.

- calculate fragment contact map from NOESY data [Assignment>Contacts>NOE>abacus]. As a result, two contact maps (one of them C_{NOE_F}) are calculated and loaded into memory. Calculation of one of the maps involves the BACUS procedure for NOESY data interpretation, while calculation of the other does not.

- Calculate assignment probabilities. This can be done using two different Monte Carlo simulation methods, namely Simulated Annealing (SA) [Assignment>Calculate Probability>SA] and the Replica Exchange Method (REM) [Assignment>Calculate Probability>REM]. Before starting the calculations you have to specify the control parameters. The main parameters to consider are: ‘Name of the SA/REM run’, ‘Size of the pool for unassigned fragments’, ‘number of SA runs’, ‘Final Temperature’ (SA), ‘Low Temperature’ (REM), and ‘NOE contact map’. The results of the calculations will be stored in a directory under the specified name, which is created inside the PRJ2/assign/ directory. The main results are the optimal and sub-optimal fragment assignments and the assignment probability map. The latter is also automatically loaded into memory.

Calculation of assignment probabilities could be repeated a few times using different methods and parameters. Result of each calculation is stored in a separate directory. Therefore, there could be a few different directories (for example, sa_run1, sa_run2, rem_run0, rem_run1, rem_run3… ) within PRJ2/assign/ directory that contain different assignment probability maps.
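To give a feeling for what a Simulated Annealing run does, the toy sketch below places fragments on sequence positions and accepts or rejects random swaps with the Metropolis criterion while the temperature is lowered to a final value. It is not the FMC scoring function or the FMCGUI algorithm: the function name `anneal_assignment`, the geometric cooling schedule, and the toy contact score are all assumptions chosen for illustration.

```python
import math
import random


def anneal_assignment(score, n, t_start=1.0, t_final=0.01, steps=2000, seed=0):
    """Toy simulated annealing over orderings of n fragments (n >= 2).

    score -- function taking a list `order` (order[s] = fragment at position s)
             and returning a number to be maximised.
    """
    rng = random.Random(seed)
    current = list(range(n))
    cur_score = score(current)
    best, best_score = current[:], cur_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_final.
        t = t_start * (t_final / t_start) ** (step / max(steps - 1, 1))
        i, j = rng.sample(range(n), 2)
        current[i], current[j] = current[j], current[i]       # trial swap
        new_score = score(current)
        delta = new_score - cur_score
        if delta >= 0 or rng.random() < math.exp(delta / t):  # Metropolis test
            cur_score = new_score
            if cur_score > best_score:
                best, best_score = current[:], cur_score
        else:
            current[i], current[j] = current[j], current[i]   # undo rejected swap
    return best, best_score


# Toy contact map: fragment 2 precedes 0, 0 precedes 3, 3 precedes 1.
contact = {(2, 0): 1.0, (0, 3): 1.0, (3, 1): 1.0}


def chain_score(order):
    """Sum of contact scores over sequential neighbours."""
    return sum(contact.get(pair, 0.0) for pair in zip(order, order[1:]))


best, best_score = anneal_assignment(chain_score, n=4)
```

Repeating such a run with different seeds or temperatures, as described above for sa_run# and rem_run# directories, gives a set of optimal and sub-optimal assignments from which assignment probabilities can be estimated.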

Step 3. Sequence-specific assignment of PB-fragments by analyzing probabilities .

A fragment can be assigned to a sequence position in FMCGUI in two ways: manually or using an assignment probability map.

- manual assignment [Assignment>Fix Assignment>manually].

This command pops up ‘Fragment Property Modification’ window. You can change the assignment ID of any selected fragment using the bottom section of the window.

- assignment using probability map [Assignment>Fix Assignment>using probability map]. You have to select a calculation directory (sa_run# or rem_run#) that contains an assignment probability map and specify the probability threshold. A fragment k will be assigned to position s if the corresponding assignment probability satisfies the threshold condition (see Figure 1.).
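The threshold rule can be sketched as follows. This is an illustration only: the function name `assign_by_probability`, the nested-dictionary layout of the probability map, and the use of a strict "greater than" comparison are assumptions, since the exact condition is not reproduced in this text.

```python
def assign_by_probability(prob_map, threshold):
    """Assign fragment k to position s when its best probability exceeds threshold.

    prob_map -- dict: fragment ID -> {sequence position: assignment probability}
    """
    assignment = {}
    for k, row in prob_map.items():
        s, p = max(row.items(), key=lambda item: item[1])   # most probable position
        if p > threshold:                                    # assumed strict comparison
            assignment[k] = s
    return assignment


# Made-up probabilities for two fragments.
prob_map = {21: {108: 0.97, 109: 0.03},
            37: {109: 0.55, 110: 0.45}}
fixed = assign_by_probability(prob_map, threshold=0.9)
```

Fragments that stay below the threshold, such as fragment 37 here, remain unassigned and are candidates for the manual analysis described in the next step. The sketch does not check for two fragments claiming the same position.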

Step 4. Assignment analysis.

In the case of poor data, only a part of the fragments get assigned unambiguously. The uncertainty in fragments assignment could be resolved manually with the help of FMCGUI command [View>Assignment]. This command pops up “Assignment Graph” window that provides you with graphical tool to visualize the current assignment and to analyze sub-optimal fragment assignments.

Step 5. Final resonance assignment.

When the sequence-specific PB-fragment assignment is done, you have to put the assigned fragments in order and assign CO resonances using the command [Fragment>Modify assigned].

Structure calculation using FMCGUI

Step 1. Load Data.

- open new project PRJ3 [Project>new]

- load 15N NOESY peak list [DATA>N15 NOESY>load]

- load 13C_aliphatic NOESY peak list [DATA>C13NOESY H2O>load]

- load 13C_aromatic NOESY peak list [DATA>AromNOESY>load]

- set tolerances [Data>Tolerances]

- load 13C NOESY peak list [DATA>C13NOESY H2O>load]

- load assigned chemical shifts [Fragment>Load>assigned]

The file with assigned chemical shifts could be either in “standard” format (assigned AA-fragments) or in cyana format (prot-file).

Step 2. Set up constraints

The structure calculation requires dihedral angle constraints in the cyana format (aco-file). These constraints are usually prepared using the results of dihedral angle prediction by TALOS. H-bond constraints are optional.

- calculate dihedral angle constraints [Structure>Constraints>Talos>Calculate]

- set up H-bond constraints [Structure>Constraints>H-bonds>Specify]

If dihedral angle or H-bond constraints in cyana format (aco-file or upl-file, respectively) have already been prepared, the constraints can be loaded from the corresponding files with [Structure>Constraints>Talos>Load] or [Structure>Constraints>H-bonds>Load].

Step 3. Specify ligands coordinating ZN ions (if there are any).

If there are ZN ions as a part of a protein structure the file “zn_ligands” should be present inside FMCGUI project directory. This file can be created by the following command

- specify residues that coordinate ZN ion(s) [Structure>Add ZN ion]

Step 4. Set up CYANA calculations

- set up structure calculations with CYANA [Structure>Calculate>cyana]

All files that are necessary for CYANA run are prepared and saved in the user specified directory, crun#, which is located inside the project directory. These files include chemical shifts (belok.prot file), sequence file, peak lists, dihedral angles constraints (file belok.aco), H-bond constraints, if available, (file hbond.upl), and constraints for ZN ions, if present, (files zn.upl, zn.lol).

Step 5. Water refinement

- set up water refinement calculations of the ensemble of structures obtained by CYANA [Structure>Water Refinement>calculate]. The pop-up window allows the user to select the file with the cyana structural ensemble (final.pdb), cyana dihedral angle constraints (belok.aco), cyana distance constraints (final.upl), H-bond constraints (hbond.upl), RDC data, and the ZINC ligands file, and to specify cis-proline residues and the protonation state of HIS residues. The command will set up the water refinement calculations in the user-specified directory WRdir.

- carry out the water refinement calculations (running on a Linux cluster is recommended) following the instructions given in the project main window

- analyze and superimpose refined structures [Structure>Water Refinement>Summary]. The refined structural models are superimposed and combined in one file. Also, for each refined structure, different energy components are calculated and an analysis of constraint violations is performed. The results of this analysis are placed in the newly created directory WRdir_results.

Step 6. Structure evaluation and peak list refinement.

RPF analysis and DP score allow one to estimate goodness-of-fit of a structural ensemble to NOESY peak lists. The results of RPF analysis can serve both for structure validation and peak lists refinement.

- to run RPF analysis [Structure>RPF>RP]. The results of the RPF analysis include peak lists in the SPARKY format of both false negative and false positive peaks for C13_aliphatic_NOESY, C13_aromatic_NOESY, and N15_NOESY spectra in separate files.

- to set up DP-score calculations with AutoStructure [Structure>RPF>DP].

@@ Line 1: / Line 1: @@
 <div style="margin: 12pt 0cm 3pt;">'''<font size="6">&nbsp;</font>'''</div>
 '''&lt;span style="font-size: 16pt;" /&gt;'''
+<div style="margin: 12pt 0cm 3pt;">'''<font size="6"><span><font size="5">INTRODUCTION TO ABACUS</font></span></font>'''</div><div align="left" style="line-height: 12pt;">&nbsp;</div><div style="text-indent: 36pt;"><span style="font-size: 11pt; color: black;">ABACUS (''A''pplied ''BACUS'') is a novel approach for protein structure determination that has been applied successfully for more than 20 NESG targets. ABACUS is characterized by use of BACUS, a procedure for automated probabilistic interpretation of NOESY spectra in terms of unassigned proton chemical shifts based on the known information on "connectivity" between proton resonances. BACUS is used in both the resonance assignment and structure calculation steps. The</span><span style="font-size: 11pt;"> ABACUS<span style="color: black;"> is distinguished from conventional approaches to NMR structure determination mostly by its resonance assignment strategy (see Fig.1.1A). </span></span></div><div>&nbsp;</div><div>&nbsp;</div><div>&nbsp;</div>
-= '''<font size="6"><span><font size="5">INTRODUCTION TO ABACUS</font></span></font>''' =
-<div align="left" style="line-height: 12pt;">&nbsp;</div><div style="text-indent: 36pt;"><span style="font-size: 11pt; color: black;">ABACUS (''A''pplied ''BACUS'') is a novel approach for protein structure determination that has been applied successfully for more than 20 NESG targets. ABACUS is characterized by use of BACUS, a procedure for automated probabilistic interpretation of NOESY spectra in terms of unassigned proton chemical shifts based on the known information on "connectivity" between proton resonances. BACUS is used in both the resonance assignment and structure calculation steps. The</span><span style="font-size: 11pt;"> ABACUS<span style="color: black;"> is distinguished from conventional approaches to NMR structure determination mostly by its resonance assignment strategy (see Fig.1.1A). </span></span></div><div>&nbsp;</div><div>&nbsp;</div><div>&nbsp;</div>
 {| width="666" cellspacing="0" cellpadding="0" border="0" style="width: 499.75pt; border-collapse: collapse;" class="FCK__ShowTableBorders"
 |- style="height: 269.1pt;"
@@ Line 10: / Line 8: @@
 |}
 <div>'''<sup><span style="font-size: 11pt;">1)</span></sup>'''<span style="font-size: 11pt;">Lemak A., Steren, C., Arrowsmith, C.H. and Llinás, M. (2008) ''J. Biomol. NMR'', 41, 29-41.''' <sup>2)</sup> '''Grishaev, A., Steren, C.A., Wu, B., Pineda-Lucena, A., Arrowsmith, C. and Llinás, M. (2005) ''Proteins'', 61,36-43.''' &nbsp;<sup>3)</sup>'''Grishaev, A. and Llinás, M. (2004) ''J. Biomol. NMR'', 28, 1-10</span><span style="font-size: 10pt;">.</span></div><div>&nbsp;</div><div>&nbsp;</div><div><span style="font-size: 11pt;">Some features /advantages of the ABACUS protocol:</span></div><div style="margin: 0cm 0cm 0pt 18pt; text-indent: -18pt;"><span style="font-size: 11pt;">-<span style="font-family: 'times new roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span></span><span style="font-size: 11pt;">It does not rely on sequential connectivities from less sensitive experiments such as HNCACB indispensable for most traditional sequential assignment procedures</span><span style="font-size: 11pt;">;</span></div><div style="margin: 0cm 0cm 0pt 18pt; text-indent: -18pt;"><span style="font-size: 11pt;">-<span style="font-family: 'times new roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span></span><span style="font-size: 11pt;">Inter-residue sequential connectivities are established mainly from NOE data, which saves time at a later stage in “troubleshooting” NOE and resonance assignments.;</span></div><div style="margin: 0cm 0cm 0pt 18pt; text-indent: -18pt;"><span style="font-size: 11pt;">-<span style="font-family: 'times new roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; 
font-size-adjust: none; font-stretch: normal;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span></span><span style="font-size: 11pt;">Probabilistic nature of the ABACUS procedure provides measure of reliability of assignments, and therefore one</span><span style="font-size: 11pt;"> can obtain a partial, yet highly reliable assignment (even when the NMR data are sub-optimal) with the knowledge of</span><span style="font-size: 11pt;"> where to focus manual intervention</span>;</div><div style="margin: 0cm 0cm 0pt 18pt; text-indent: -18pt;"><span style="font-size: 11pt;">-<span style="font-family: 'times new roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span></span><span style="font-size: 11pt;">It can make use of&nbsp;partial spin-systems; </span></div><div style="margin: 0cm 0cm 0pt 18pt; text-indent: -18pt;"><span style="font-size: 11pt;">-<span style="font-family: 'times new roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span></span><span style="font-size: 11pt;">It can efficiently identify manual errors in the input peak lists;</span></div><div>&nbsp;</div><div>&nbsp;</div>
-== <font size="4"><span>NMR spectra required for ABACUS</span></font>  ==
+<font size="4"><span>NMR spectra required for ABACUS</span></font>
 &nbsp;
@@ Line 48: / Line 46: @@
 | width="590" valign="top" colspan="3" style="border: medium none rgb(212, 208, 200); padding: 0cm 5.4pt; width: 442.8pt; background-color: transparent;" | <div>''H(CCCO)NH-TOCSY (optional)''</div>
 |}
-<div>&nbsp;</div><div>&nbsp;</div>
+<div>&nbsp;</div><div>&nbsp;</div><div style="margin: 12pt 0cm 3pt;"><font size="4">Spin-system identification strategy</font></div><div>'''''&nbsp;'''''</div><div><span style="font-size: 11pt; color: black;">The resonance assignment procedure starts by grouping resonances into spin systems (PB-fragments, or peptide-bond fragments), each comprising correlated resonances from the side chain of residue i and the NH resonances of residue i+1 (see Figure 1.1B). Incomplete HN-rooted PB spin-systems, which include resonances of&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;atoms only, are called bPB-fragments in this manual.</span></div><div><span style="font-size: 11pt; color: black;">Spin-system identification in the ABACUS approach consists of three main steps.</span></div><div><span style="font-size: 11pt; color: black;">1. In the first step, bPB-fragments are collected from high-sensitivity NMR correlation experiments (such as HNCO, CBCA(CO)NH, and HBHA(CO)NH) that transfer magnetization via the intervening peptide bond (see Figure 1.2A).</span></div><div>&nbsp;</div><div><span style="font-size: 11pt; color: black;">2. In the second step, the bPB-fragments are completed with side-chain aliphatic resonances, and additional spin-systems (lacking HN resonances) are identified, using HCCH-TOCSY and 13C-NOESY spectra (see Figure 1.2B).</span></div><div>&nbsp;</div><div><span style="font-size: 11pt; color: black;">3. Finally, spin-system validation and correction are performed. </span><span style="font-size: 11pt;">This step allows one to find mistakes made during peak picking and to correct them by going back to the spectra. 
</span></div><div><span style="font-size: 11pt;">For each spin-system, 20 scores S(T) are calculated during validation (see Figure 1.3). Here T denotes the amino acid type (T = A, R, D, …, V). Each score evaluates how well the spin-system resonances fit the chemical shifts observed for that amino acid type in the BMRB database.&nbsp;If the best of the 20 scores is too low, either the spin-system has very unusual chemical shifts or it is inconsistent and needs to be corrected.''''' '''''</span></div><div>'''''&nbsp;'''''</div><div>'''''&nbsp;'''''</div><div style="margin: 12pt 0cm 3pt;"><font size="4">Fragments assignment by FMC</font></div><div>'''''&nbsp;'''''</div><div><span style="font-size: 11pt; color: black;">Sequence-specific assignment of PB-fragments is achieved using a Fragment Monte Carlo (FMC) stochastic search procedure. The FMC scoring function combines fragment amino acid typing (how well each spin system matches each amino acid type) with a fragment contact map (which residue is next to which) derived from HNCA data and from the analysis of NOEs interpreted by BACUS (see Figure 1.4).</span></div><div>'''''&nbsp;'''''</div><div>'''''&nbsp;'''''</div><br><div>&nbsp;</div><div><span style="font-size: 11pt;">&nbsp;The FMC procedure performs ''<u>probabilistic assignment</u>'' of PB-fragments. The assignment </span><span style="font-size: 11pt;">probabilities are calculated by Simulated Annealing (SA) or Replica Exchange Method (REM) Monte Carlo (MC) simulations. 
&nbsp;Each assignment probability is the </span><span style="font-size: 11pt;">probability that fragment ''k'' occupies position ''s'',</span><span style="font-size: 11pt;"> where ''k'' = 1, …, N<sub>f</sub> </span><span style="font-size: 11pt;">and ''s''</span><span style="font-size: 11pt;"> = 1, …, N<sub>s</sub>+1.&nbsp;Sequence-specific assignment of PB-fragments is achieved by analyzing these probabilities </span><span style="font-size: 11pt;">(see Figure 1.5) as well as the sub-optimal fragment mappings provided by the MC simulations.</span></div><div>&nbsp;</div><div>&nbsp;</div><div style="margin: 12pt 0cm 3pt;">'''<font size="4"><span>''FMCGUI''</span></font>'''</div><div>&nbsp;</div><div>FMCGUI is a graphical interface that assists the user in carrying out resonance assignment and structure calculation using the ABACUS approach. </div><div>&nbsp;</div><div>&nbsp;</div><div>&nbsp;</div><div style="margin: 12pt 0cm 3pt;">'''<font size="6"><span><font size="5">FMCGUI_2.2 COMMANDS</font></span></font>'''</div><div>&nbsp;</div><div style="margin: 12pt 0cm 3pt;">'''<font size="4"><span>''0. FMCGUI objects.''</span></font>'''</div><div>&nbsp;</div><div><span style="font-size: 11pt;">Most FMCGUI commands operate on the following three objects, which are held in computer memory:&nbsp;protein sequence, peak list, and PB-fragments. </span></div><div>&nbsp;</div><div>'''<span style="font-size: 11pt;">Protein sequence</span>'''<span style="font-size: 11pt;">. This object can be created in memory using the [<span style="color: rgb(153, 51, 102);">Data&gt;Protein sequence&gt;load</span>] or [<span style="color: rgb(153, 51, 102);">Project&gt;load</span>] commands.</span></div><div>&nbsp;</div><div><span style="font-size: 11pt;">&nbsp;The position ID of the first residue in the sequence should be specified by the user when loading the sequence file (if it is not specified in the input file). 
Some FMCGUI commands assume that the first residue of the protein sequence has a position ID of 1. Therefore, if the loaded sequence contains a His-tag, the sequence should be numbered accordingly, starting with a negative position ID for the first residue.</span></div><div>&nbsp;</div><div>'''<span style="font-size: 11pt;">Peak Lists.</span>'''<span style="font-size: 11pt;"> Different peak list objects can be created in memory using the ['''<span style="color: rgb(153, 51, 102);">Data&gt;</span>'''”<span style="color: blue;">Peak list name</span>'''<span style="color: rgb(153, 51, 102);">”&gt;load</span>'''] or ['''<span style="color: rgb(153, 51, 102);">Project&gt;load</span>'''] commands. For some peak lists, peaks can be referenced by spin-system (fragment) user ID.</span></div><div><span style="font-size: 11pt;">The following table shows which peak lists require referencing (+), which peak lists </span></div><div><span style="font-size: 11pt;">are optionally referenced (+/-), and which peak lists do not use referencing </span></div><div><span style="font-size: 11pt;">even if it is present&nbsp;in the input file (-):</span></div><div><span style="font-size: 11pt;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span></div>
-== <font size="4">Spin-system identification strategy</font> ==
-<div>'''''&nbsp;'''''</div><div><span style="font-size: 11pt; color: black;">The resonance assignment procedure starts by grouping resonances into spin systems (PB-fragments, or peptide-bond fragments), each comprising correlated resonances from the side chain of residue i and the NH resonances of residue i+1 (see Figure 1.1B). Incomplete HN-rooted PB spin-systems, which include resonances of&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;atoms only, are called bPB-fragments in this manual.</span></div><div><span style="font-size: 11pt; color: black;">Spin-system identification in the ABACUS approach consists of three main steps.</span></div><div><span style="font-size: 11pt; color: black;">1. In the first step, bPB-fragments are collected from high-sensitivity NMR correlation experiments (such as HNCO, CBCA(CO)NH, and HBHA(CO)NH) that transfer magnetization via the intervening peptide bond (see Figure 1.2A).</span></div><div>&nbsp;</div><div><span style="font-size: 11pt; color: black;">2. In the second step, the bPB-fragments are completed with side-chain aliphatic resonances, and additional spin-systems (lacking HN resonances) are identified, using HCCH-TOCSY and 13C-NOESY spectra (see Figure 1.2B).</span></div><div>&nbsp;</div><div><span style="font-size: 11pt; color: black;">3. Finally, spin-system validation and correction are performed. </span><span style="font-size: 11pt;">This step allows one to find mistakes made during peak picking and to correct them by going back to the spectra. </span></div><div><span style="font-size: 11pt;">For each spin-system, 20 scores S(T) are calculated during validation (see Figure 1.3). 
Here T denotes the amino acid type (T = A, R, D, …, V). Each score evaluates how well the spin-system resonances fit the chemical shifts observed for that amino acid type in the BMRB database.&nbsp;If the best of the 20 scores is too low, either the spin-system has very unusual chemical shifts or it is inconsistent and needs to be corrected.''''' '''''</span></div><div>'''''&nbsp;'''''</div><div>'''''&nbsp;'''''</div>
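The validation scores S(T) described above can be illustrated with a minimal sketch. It assumes that each score is a mean Gaussian log-likelihood of the observed shifts given per-type chemical shift statistics; the actual ABACUS scoring formula is not given in this text and may differ. The names `SHIFT_STATS`, `typing_score`, `best_type`, the three-residue table, and the threshold of -6.0 are all hypothetical, and the shift statistics are rough placeholders rather than values taken from the BMRB.

```python
import math

# Placeholder (mean, standard deviation) of chemical shifts in ppm per amino acid type.
# These are rough illustrative numbers, NOT values taken from the BMRB.
SHIFT_STATS = {
    "A": {"CA": (53.1, 2.0), "CB": (19.0, 1.8)},
    "S": {"CA": (58.7, 2.1), "CB": (63.8, 1.5)},
    "T": {"CA": (62.2, 2.6), "CB": (69.7, 1.7)},
}


def typing_score(spin_system, residue_type, stats=SHIFT_STATS):
    """Mean Gaussian log-likelihood of the observed shifts for one amino acid type.

    This is an assumed stand-in for S(T), not the ABACUS formula.
    """
    expected = stats[residue_type]
    terms = []
    for atom, shift in spin_system.items():
        if atom not in expected:
            # The type has no such atom in this toy table: treat as impossible.
            return float("-inf")
        mean, sd = expected[atom]
        z = (shift - mean) / sd
        terms.append(-0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi)))
    return sum(terms) / len(terms) if terms else float("-inf")


def best_type(spin_system, threshold=-6.0, stats=SHIFT_STATS):
    """Return (best type, best score, needs_review) for one spin system.

    needs_review is True when even the best score falls below the
    (arbitrary, hypothetical) threshold.
    """
    scores = {t: typing_score(spin_system, t, stats) for t in stats}
    best = max(scores, key=scores.get)
    return best, scores[best], scores[best] < threshold


if __name__ == "__main__":
    print(best_type({"CA": 52.8, "CB": 18.7}))   # alanine-like shifts
    print(best_type({"CA": 80.0, "CB": 40.0}))   # fits no type in the toy table
```

A spin system whose best score falls below the threshold is the kind of case the text says should be checked against the spectra; in practice the threshold would have to be calibrated on real data.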
-== <font size="4">Fragments assignment by FMC</font> ==
-== '''''&nbsp;''''' ==
-<div><span style="font-size: 11pt; color: black;">Sequence-specific assignment of PB-fragments is achieved using a Fragment Monte Carlo (FMC) stochastic search procedure. The FMC scoring function combines fragment amino acid typing (how well each spin system matches each amino acid type) with a fragment contact map (which residue is next to which) derived from HNCA data and from the analysis of NOEs interpreted by BACUS (see Figure 1.4).</span></div><div>'''''&nbsp;'''''</div><div>'''''&nbsp;'''''</div><div><span style="font-size: 11pt;">''Figure 1.4.''</span>'''''<span style="font-size: 11pt;">&nbsp;PB-fragments mapping onto protein sequence.</span>'''''</div><div>'''''&nbsp;'''''</div><div><span style="font-size: 11pt; color: rgb(51, 153, 102);">''Set of PB-fragments''</span>'''''<b><span style="font-size: 16pt; color: rgb(51, 153, 102);">:</span></b><span style="font-size: 16pt;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; F<sub>1</sub> F<sub>2</sub> F<sub>3</sub> F<sub>4</sub> ....</span>'''''</div><div>'''''&nbsp;'''''</div><div>''<span style="color: rgb(51, 153, 102);">Positions:</span>'''<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span style="color: blue;">protein sequence</span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span style="color: red;">recycle bin</span></span>'''''</div><div><span>'''&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1 2 3 
……&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;… N<sub>s</sub>… .N<sub>total</sub>'''</span></div><div>&nbsp;</div><div><span style="font-size: 11pt; color: rgb(51, 153, 102);">Scoring function:</span></div><div>'''''&nbsp;'''''</div><div>'''''&nbsp;'''''</div><div>'''''&nbsp;'''''</div>
-<br> <br>
-<div><span style="font-size: 11pt;">Where</span></div><div><span style="font-size: 11pt;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;'''''A '''''is any possible mapping (assignment) of the fragments onto the protein sequence;</span></div><div><span style="font-size: 11pt;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; N<sub>f</sub> is the number of PB-fragments;</span></div><div><span style="font-size: 11pt;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; N<sub>s</sub> is the number of residues in the protein sequence;</span></div><div><span style="font-size: 11pt;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;N<sub>total</sub> is the total number of positions for fragment mapping. Positions in the “recycle bin” are reserved for discarded (unassigned)&nbsp;PB-fragments.</span></div><div>&nbsp;</div><div><span style="font-size: 11pt;">&nbsp;The FMC procedure performs ''<u>probabilistic assignment</u>'' of PB-fragments. The assignment </span><span style="font-size: 11pt;">probabilities are calculated by Simulated Annealing (SA) or Replica Exchange Method (REM) Monte Carlo (MC) simulations. 
&nbsp;Each assignment probability is the </span><span style="font-size: 11pt;">probability that fragment ''k'' occupies position ''s'',</span><span style="font-size: 11pt;"> where ''k'' = 1, …, N<sub>f</sub> </span><span style="font-size: 11pt;">and ''s''</span><span style="font-size: 11pt;"> = 1, …, N<sub>s</sub>+1.&nbsp;Sequence-specific assignment of PB-fragments is achieved by analyzing these probabilities </span><span style="font-size: 11pt;">(see Figure 1.5) as well as the sub-optimal fragment mappings provided by the MC simulations.</span></div><div>&nbsp;</div><div>&nbsp;</div><div style="margin: 12pt 0cm 3pt;">'''<font size="4"><span>''FMCGUI''</span></font>'''</div><div>&nbsp;</div><div>FMCGUI is a graphical interface that assists the user in carrying out resonance assignment and structure calculation using the ABACUS approach. </div><div>&nbsp;</div><div>&nbsp;</div><div>&nbsp;</div>
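The probabilistic fragment assignment described above can be sketched as follows. This is only an illustration under stated assumptions: it uses plain fixed-temperature Metropolis sampling in place of the SA or REM schemes named in the text, and a toy score (a typing term per placed fragment plus a bonus for supported sequential neighbours) in place of the FMC scoring function, which is not reproduced here. The names `toy_score` and `fmc_probabilities`, and the layout with one recycle-bin slot per fragment, are hypothetical.

```python
import math
import random


def toy_score(assign, typing, contact, n_seq):
    """Toy stand-in for the FMC score of one mapping.

    assign[k] is the position of fragment k; positions >= n_seq are recycle-bin slots.
    typing[k][s] rewards fragment k at sequence position s; contact[k][j] rewards
    fragment j sitting directly after fragment k.
    """
    at_pos = {s: k for k, s in enumerate(assign) if s < n_seq}
    total = 0.0
    for s, k in at_pos.items():
        total += typing[k][s]
        nxt = at_pos.get(s + 1)
        if nxt is not None:
            total += contact[k][nxt]
    return total


def fmc_probabilities(typing, contact, n_seq, steps=20000, temperature=0.5, seed=0):
    """Estimate occupancy probabilities P[k][s] by Metropolis sampling.

    Column index n_seq collects all recycle-bin slots, so each row has n_seq + 1 entries.
    """
    rng = random.Random(seed)
    n_frag = len(typing)
    n_total = n_seq + n_frag                      # one bin slot per fragment
    assign = [n_seq + k for k in range(n_frag)]   # start with every fragment discarded
    current = toy_score(assign, typing, contact, n_seq)
    counts = [[0] * (n_seq + 1) for _ in range(n_frag)]
    for _ in range(steps):
        k = rng.randrange(n_frag)
        target = rng.randrange(n_total)
        trial = list(assign)
        if target in trial:
            j = trial.index(target)               # occupied: swap the two fragments
            trial[j], trial[k] = trial[k], trial[j]
        else:
            trial[k] = target                     # free: move fragment k there
        new = toy_score(trial, typing, contact, n_seq)
        if new >= current or rng.random() < math.exp((new - current) / temperature):
            assign, current = trial, new
        for frag, s in enumerate(assign):
            counts[frag][min(s, n_seq)] += 1
    return [[c / steps for c in row] for row in counts]


if __name__ == "__main__":
    n = 3
    typing = [[3.0 if k == s else -3.0 for s in range(n)] for k in range(n)]
    contact = [[1.0 if j == k + 1 else 0.0 for j in range(n)] for k in range(n)]
    for row in fmc_probabilities(typing, contact, n_seq=n):
        print([round(p, 3) for p in row])
```

Rows with one dominant entry correspond to reliably assigned fragments, while flat rows or a large recycle-bin probability point to fragments that need manual inspection.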
-= '''<font size="6"><span><font size="5">FMCGUI_2.2 COMMANDS</font></span></font>''' =
-<div>&nbsp;</div>
-== '''<font size="4"><span>''0. FMCGUI objects.''</span></font>''' ==
-<div>&nbsp;</div><div><span style="font-size: 11pt;">Most FMCGUI commands operate on the following three objects, which are held in computer memory:&nbsp;protein sequence, peak list, and PB-fragments. </span></div><div>&nbsp;</div><div>'''<span style="font-size: 11pt;">Protein sequence</span>'''<span style="font-size: 11pt;">. This object can be created in memory using the [<span style="color: rgb(153, 51, 102);">Data&gt;Protein sequence&gt;load</span>] or [<span style="color: rgb(153, 51, 102);">Project&gt;load</span>] commands.</span></div><div>&nbsp;</div><div><span style="font-size: 11pt;">&nbsp;The position ID of the first residue in the sequence should be specified by the user when loading the sequence file (if it is not specified in the input file). Some FMCGUI commands assume that the first residue of the protein sequence has a position ID of 1. Therefore, if the loaded sequence contains a His-tag, the sequence should be numbered accordingly, starting with a negative position ID for the first residue.</span></div><div>&nbsp;</div><div>'''<span style="font-size: 11pt;">Peak Lists.</span>'''<span style="font-size: 11pt;"> Different peak list objects can be created in memory using the ['''<span style="color: rgb(153, 51, 102);">Data&gt;</span>'''”<span style="color: blue;">Peak list name</span>'''<span style="color: rgb(153, 51, 102);">”&gt;load</span>'''] or ['''<span style="color: rgb(153, 51, 102);">Project&gt;load</span>'''] commands. 
For some peak lists, peaks can be referenced by spin-system (fragment) user ID.</span></div><div><span style="font-size: 11pt;">The following table shows which peak lists require referencing (+), which peak lists </span></div><div><span style="font-size: 11pt;">are optionally referenced (+/-), and which peak lists do not use referencing </span></div><div><span style="font-size: 11pt;">even if it is present&nbsp;in the input file (-):</span></div><div><span style="font-size: 11pt;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span></div>
 {| cellspacing="0" cellpadding="0" border="1" style="margin: auto auto auto 4cm; border-collapse: collapse;"
 |-
@@ Line 90: / Line 76: @@
 | width="88" valign="top" style="border-style: none solid solid none; border-color: rgb(212, 208, 200) windowtext windowtext rgb(212, 208, 200); border-width: medium 1pt 1pt medium; padding: 0cm 5.4pt; width: 65.95pt; background-color: transparent;" | <div align="center"><span style="font-size: 11pt;">+</span></div>
 |}
-<div align="center"><span style="font-
+<div align
 </div>
