Extensions in Python

If Sparky doesn't do what you want you can write your own extension in Python. You can also make use of extensions that other people have written. If you want to do something similar to what an existing extension does you can customize it.

Standard Extensions

New developments in Sparky are often done in Python. Core features are implemented in C++. Python is simpler than C++ and is interpreted, so an extension can be added or modified without building a new Sparky executable. Some of the extensions below are inconveniently slow, their reliability is lower than that of the core C++ features, and they are more likely to change in future releases than core features are. In spite of these drawbacks many do tasks that cannot be readily done using only core features.

An important limitation of all current Python extensions is that they are not updated when you add or delete spectra from your project. This means if you use an extension (say restricted peak picking) which displays a spectrum menu, and subsequently load a new spectrum, it will not appear in the menu. To work around this problem you can destroy the window with the spectrum menu (using the close button on the window frame, not the close button at the bottom of the window), and then display the window again with the appropriate Sparky command. This causes a new window and menu to be created with the full list of spectra.

Errors that occur in a Python extension generally do not cause Sparky to crash. Instead the Python shell window pops up and displays a stack trace. The stack trace shows exactly what functions in which files of Python code were executing when the error occurred. The last line indicates the type of error. For instance, if you mistype a PDB file name to an extension it may fail with an I/O error indicating the file could not be found. Many of the Python extensions do not adequately check their input and catch such problems. In these cases looking at the last line of the stack trace may indicate what caused the problem.
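As a rough illustration of why the last line of a stack trace is the one to read first, here is a small sketch in current Python 3 using the standard traceback module. It is not Sparky code: the function names and the file name are made up for the example, and the Python bundled with Sparky may be older than the one assumed here.

```python
import traceback


def read_pdb_lines(path):
    """Return the lines of a PDB file; raises an I/O error if the path is wrong."""
    with open(path) as f:
        return f.readlines()


def last_trace_line(path):
    """Return the last line of the stack trace, or None if no error occurred."""
    try:
        read_pdb_lines(path)
    except Exception:
        # format_exc() gives the same text the interpreter would display.
        return traceback.format_exc().strip().splitlines()[-1]
    return None


# A mistyped (hypothetical) file name produces an I/O error on the last line.
print(last_trace_line("no_such_model.pdb"))
```

The lines above the last one list the chain of function calls, which helps locate the failing extension file.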

Extension                      File                Description
AutoAssign (aa)                autoassign.py       Run protein backbone automatic assignment program AutoAssign
Assignment distances (ad)      distance.py         List assignments with far atoms, peaks with multiple close atom assignments, close atom pairs with no assignment, ...
Atom name translation (ax)     atomnames.py        Show translations used to convert non-standard atom names to standard names.
Center view setup (cv)         centerview.py       Center spectrum view on mouse position or peak in another view.
Chemical shift plot (cs)       chemshift.py        Plot chemical shifts for atoms
Chimera molecule view (km)     chimeraview.py      Display molecule and restraint violations with Chimera
Copy peak linewidths (cl)      copylinewidth.py    Copy peak linewidths and positions
CORMA simulated spectrum (cx)  cormaspectrum.py    Create a simulated NOESY spectrum using CORMA predicted peak intensities
Linewidth plot (lp)            linewidthplot.py    Plot linewidths for peaks for each resonance
Midas atom picking (ma)        midaspick.py        Pick atoms using Midas 3D display and show peaks
Midas constraints (mc)         midasconstraint.py  Display Mardigras constraint violations using molecular display program Midas
Mirror peaks (md, mp)          mirror.py           Find mirror peaks in HSQC-noesy spectra
Molecule sequence (sq)         sequence.py         Enter molecule sequence for use by other extensions
Peak list (LT)                 peaklist.py         List peak assignments, volumes, linewidths, PDB model distances, Mardigras distance bounds, Corma predicted intensities, ...
Peak table (pb)                peaktable.py        List peaks with columns for several spectra
Python shell (py)              pythonshell.py      Shell for typing Python commands and display errors
Read peak list (rp)            readpeaks.py        Read a peak list from a file and create peaks on a spectrum
Relaxation peak heights (rh)   relax.py            Produce a table of peak heights for a set of spectra for use in calculating relaxation time constants.
Reposition sequence (rs)       reposition.py       Reposition assigned protein fragment using chemical shift statistics
Restricted peak picking (kr)   restrictedpick.py   Pick peaks using peaks from another spectrum as a guide
Shift resonances (mv)          movepeaks.py        Move one assigned peak and have all other peaks on resonance lines moved by the same amount.
Spectrum labelled axis (la)    axes.py             Specify spectrum proton axis labelled by a heavy atom
Spectrum type (sy)             spectrumtype.py     Specify the type of the selected spectrum
Spin graphs (sg)               spingraph.py        Show a diagram of atoms connected by lines, one line for each assigned peak
Spin graph assigner (ga)       assigngraph.py      Explore possible assignments using spin graph display
Strip plot (sp)                strips.py           Show strips of 3-d spectra in a single window
Volume errors (ve)             volumeerror.py      Set volume error estimates based on several criteria

Automated Protein Backbone Assignment

Setup and run automatic assignment program AutoAssign (aa).

This extension is not usable unless you have obtained the AutoAssign "ascii client" program. This is not part of the standard AutoAssign distribution. It allows Sparky to run AutoAssign without invoking the AutoAssign graphical user interface.

AutoAssign is a program that does protein backbone assignment. It was developed by Gaetano Montelione's group at Rutgers. It is not part of the Sparky distribution. You can get it from http://www-nmr.cabm.rutgers.edu/software. It takes lists of picked peaks from a 2D N15 HSQC spectrum and the following 7 triple resonance spectra: HNCO, HNCACB, CBCA(CO)NH, HNCA, CA(CO)NH, HNHA, HA(CO)NH. The following paper describes the program: "Automated Analysis of Protein NMR Assignments Using Methods from Artificial Intelligence", J. Mol. Biol. (1997) 269, 592-610. The paper tested the program on six proteins ranging in size from 58 to 121 amino acids. In all cases almost all assignable resonances (more than 90%) were correctly assigned.

This Sparky extension exports the peak lists needed by AutoAssign, runs the program (taking seconds to minutes), imports the calculated resonance assignments, infers peak assignments, and displays the peak assignments using the spin graph assignment tool. Eight files containing peak lists and a spectrum table file are written and passed to AutoAssign. You specify the table file path. The peak list files are created in the same directory. The peak list file names use the Sparky molecule name and the spectrum name and a .pks suffix. All picked peaks in the 8 spectra are exported to AutoAssign. The restricted peak picking extension is useful for creating initial peak lists for the triple resonance spectra. The previous peak assignments for the 8 spectra are deleted and the new calculated assignments are made to the Sparky peaks. You can inspect the assignments using the spin graph display, add or remove peaks to fix problems, and rerun AutoAssign. Or you can use the calculated assignments as a starting point for manual assignment.

Problems using AutoAssign

You must have the AutoAssign ascii client in your execution search path before you run Sparky. It should be executable as ascii_autoclient. Also the AutoServer program must be in your search path.

The 8 spectra needed by AutoAssign must be opened in Sparky. This extension tries to identify the 8 spectra by comparing their Sparky names to a list of spectrum types n15hsqc, hnco, hnca, hncoca, caconh, hncacb, cbcaconh, hncocacb, hnha, haconh, hncoha, .... This may incorrectly identify the type of some of the spectra. You can override the guessed spectrum types using the spectrum type dialog. If you try to run AutoAssign and not all the spectra are identified an error message is displayed. If there is more than one spectrum of a type needed by AutoAssign this will also be reported as an error. For example, if you have two hnco spectra open in Sparky AutoAssign will not be run. Currently there is no way to specify which should be used. The way to proceed is to close one of the spectra.
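The two error conditions described above (a needed spectrum type is missing, or a type occurs more than once) can be sketched as a simple check. This is only an illustration, not the extension's code. The type strings come from the list in the text, and the pairing of those strings with the 8 spectra AutoAssign needs is inferred from their names. The spectrum names in the example are hypothetical, and the types are taken as already known, since how the extension guesses them from spectrum names is not described here.

```python
from collections import Counter

# Type strings for the 8 spectra AutoAssign needs (pairing inferred from names).
REQUIRED_TYPES = ["n15hsqc", "hnco", "hncacb", "cbcaconh",
                  "hnca", "caconh", "hnha", "haconh"]


def check_spectrum_types(spectrum_types):
    """spectrum_types maps spectrum name -> type string.

    Returns a list of problem descriptions; an empty list means the
    set of spectra is usable.
    """
    counts = Counter(spectrum_types.values())
    problems = []
    for spectrum_type in REQUIRED_TYPES:
        if counts[spectrum_type] == 0:
            problems.append("missing " + spectrum_type)
        elif counts[spectrum_type] > 1:
            problems.append("more than one " + spectrum_type)
    return problems


# Hypothetical project with two hnco spectra and no hnha spectrum.
project = {"hsqc1": "n15hsqc", "hnco_a": "hnco", "hnco_b": "hnco",
           "hncacb1": "hncacb", "cbca1": "cbcaconh", "hnca1": "hnca",
           "caconh1": "caconh", "haconh1": "haconh"}
print(check_spectrum_types(project))
```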

AutoAssign is able to handle spectra folded in the N15 dimension but this is not currently supported by this Sparky extension.

Checking Assignment Distances

Use atom distances to check noesy assignments (ad).

You can list assignments with protons farther apart than a specified distance, peaks with more than one assignment where the resonance lines are close and atoms are close, all pairs of close atoms having no assigned peak, unassigned peaks for which reasonable assignments exist, .... This works for 2-D noesy and 3-D hsqc-noesy spectra.

Atom coordinates are read from a PDB file. If a Sparky atom name does not match a PDB file atom name then assignment distances involving that resonance are not available. This extension is currently not able to translate atom names. To get distances for all assignments you can make a PDB file that uses your Sparky atom names.
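The first of these checks, listing assignments whose protons are farther apart than a chosen distance, might look roughly like the following. This is a minimal sketch and not the code in distance.py. It reads only the standard fixed columns of PDB ATOM records, identifies atoms by residue number and atom name (ignoring chains and multiple models), and skips any assignment whose atom names are not found in the file, mirroring the limitation described above. All function names are hypothetical.

```python
import math


def read_atom_coordinates(pdb_lines):
    """Map (residue number, atom name) -> (x, y, z) from ATOM records."""
    coords = {}
    for line in pdb_lines:
        if line.startswith("ATOM"):
            name = line[12:16].strip()      # columns 13-16: atom name
            resnum = int(line[22:26])       # columns 23-26: residue number
            xyz = (float(line[30:38]),      # columns 31-38: x
                   float(line[38:46]),      # columns 39-46: y
                   float(line[46:54]))      # columns 47-54: z
            coords[(resnum, name)] = xyz
    return coords


def far_assignments(assignments, coords, max_distance):
    """Return (atom1, atom2, distance) for pairs farther apart than max_distance.

    Pairs with an atom name that is not in the PDB file are skipped,
    since no distance is available for them.
    """
    far = []
    for atom1, atom2 in assignments:
        if atom1 in coords and atom2 in coords:
            distance = math.dist(coords[atom1], coords[atom2])
            if distance > max_distance:
                far.append((atom1, atom2, distance))
    return far
```

Each atom is given as a (residue number, atom name) pair, for example (4, "H1'").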

Atom name translation

Show atom name translations used in converting Sparky atom names (ax), PDB file atom names, Mardigras file atom names, ..., to standard atom names.

Some Sparky extensions need to compare atom names from different sources. For example, the Python peak list extension (PL) can show proton-proton distances for assigned peaks calculated from a PDB model. If the atom names used in the PDB file do not exactly match the atom names used in Sparky assignments then translations need to be done to match the names. The spin graph extension tries to display residues using standard templates. For this to work the Sparky atom names must be matched to the standard atom names for which the templates are defined. The mirror peak extension needs to determine the name of the heavy atom attached to Sparky assigned protons. It finds the heavy atom name by looking up the proton atom name in a table based on standard atom names.

Sparky attempts to accommodate minor atom name variations, for example, GLY <-> G, HN <-> H, HB2 <-> 2HB and H2" <-> H2'2. It does this by defining a set of standard names and translations between non-standard and standard names. When a new source of atom names is read (eg. PDB file, Mardigras constraint file, Corma predicted intensities file, ...) Sparky guesses what translations to apply. Translations define a one to one mapping between the non-standard and standard names. They are used to compare atom names and also when lookup in standard atom tables is needed (eg. to determine attached heavy atom, or determine atom layout for display of a residue).
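The idea of a one to one translation can be pictured with a pair of dictionaries. The sketch below is only an illustration and does not reproduce the format used in atomnames.py. The name pairs are taken from the examples in the text, but which member of each pair is treated as the standard name is an assumption made for the example.

```python
# Hypothetical translation table: non-standard name -> standard name.
# Which side counts as "standard" is assumed here for illustration.
TO_STANDARD = {"HN": "H", "2HB": "HB2", 'H2"': "H2'2"}

# Because the mapping is one to one it can be inverted without losing entries.
FROM_STANDARD = {standard: other for other, standard in TO_STANDARD.items()}


def standard_name(atom_name):
    """Translate an atom name to its standard form; unknown names pass through."""
    return TO_STANDARD.get(atom_name, atom_name)


def same_atom(name_a, name_b):
    """Compare two atom names from different sources via their standard names."""
    return standard_name(name_a) == standard_name(name_b)


print(same_atom("HN", "H"), same_atom("2HB", "HB2"), same_atom("HA", "HB2"))
```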

The atom name translation dialog (ax) shows what translations are being used for each set of atom names (PDB files, Sparky names, ...). Initially Sparky guesses the translations to use. The dialog lets you select a different set of translations and lets you see what atom names are not recognized and not translated to standard names.

The translations are defined in the file atomnames.py. You can copy that file to your ~/Sparky/Python directory and add additional translations following the instructions and examples in that file.

There are many limitations of the atom translation facility. Here are a few.

Center Views

With the center view dialog (cv) you choose a spectrum window to be centered by subsequent uses of the center on mouse (cm) and center on peak (cp) commands. The center on mouse command will center the prespecified view at the current mouse position. The center on peak command centers the prespecified view on the selected peak. If "center when peak selected" is turned on, then selecting a peak causes the specified view to center without typing any command. When the peak or mouse position comes from a spectrum other than the one to be centered then a correspondence is made between the spectrum axes. Each axis of the view to be centered is matched with the axis of the source spectrum having the same nucleus type. If more than one axis in the source spectrum has matching nucleus and one of the matches has the same axis number (w1, w2, ...) then that axis is used. In other cases no correspondence is made with the axis of the centered spectrum and so that axis does not get centered.
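The axis matching rule just described can be written out as a short function. This is a sketch of the rule as stated in the text, not the code in centerview.py. Nuclei are represented as plain strings, and the axis order given for the 3D spectrum in the example is hypothetical.

```python
def match_axes(target_nuclei, source_nuclei):
    """For each axis of the view to be centered, return the index of the
    matching source spectrum axis, or None if no correspondence is made."""
    mapping = []
    for axis, nucleus in enumerate(target_nuclei):
        candidates = [i for i, n in enumerate(source_nuclei) if n == nucleus]
        if len(candidates) == 1:
            mapping.append(candidates[0])   # unique nucleus match
        elif axis in candidates:
            mapping.append(axis)            # several matches: same axis number wins
        else:
            mapping.append(None)            # this axis does not get centered
    return mapping


# Center a 2D N15-HSQC view (w1 = 15N, w2 = 1H) from a 3D peak whose axes
# are assumed to be w1 = 13C, w2 = 15N, w3 = 1H.
print(match_axes(["15N", "1H"], ["13C", "15N", "1H"]))
```

An entry of None in the result corresponds to an axis that is left where it is.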

There is a simpler view center command vc that centers the spectrum window it is typed in, rather than a pre-specified window as the commands described above do. The vc command centers at the position of the selected peak. All of the view centering commands can be used between spectra of different dimensions (eg. select a 3D HNCA peak and center 2D N15-HSQC spectrum).

Chemical Shift Plot

Plot resonance lines by group/atom type (cs).

Chemical Shift Image

Plot resonance frequencies as tick marks along a horizontal scale. Each group/atom type is plotted on a separate line. Only resonances in the specified ppm range are shown. A subset of atoms to show can be specified as a comma separated list of atoms or groups and atoms (eg. H2', C H5, K CD, N).

Chimera Molecule Display

Display molecule and restraint violations using Chimera (km).

Chimera is a molecular display program developed by the Computer Graphics Lab at UC San Francisco. It is the successor to their Midas and MidasPlus molecular display programs. An alpha release is out as of May 1999. This extension uses Chimera to display a model and restraint violations. The restraints are read from a file in Mardigras format. Mardigras is a program to compute NMR distance restraints from NOESY peak intensities using complete relaxation methods. It was developed by Tom James' NMR group at UC San Francisco. With the alpha release of Chimera it is necessary to have Chimera start Sparky for this extension to work. This is because it is not yet possible to start Chimera as a Python extension using the normal Python mechanisms. So it is necessary for Chimera to start Sparky as a Python extension. This is done with the following command:

	% chimera /usr/local/sparky/python/start_in_chimera.py

You select a PDB file and a Mardigras restraint file. You can show satisfied and / or violated restraints as colored lines connecting pairs of atoms. Red restraint violation lines indicate the pair of atoms are separated by more than 1.2 times the restraint distance upper bound. Magenta lines indicate a violation in the range of 1.0 to 1.2 times the upper bound. Cyan lines indicate an atom separation of .8 to 1.0 times the lower bound distance. Blue lines indicate atoms closer than .8 times the lower bound distance. Satisfied restraints are shown as green lines. The interface does not allow you to change these colors. The atoms for which you have resonance assignments are colored by a color of your choice. Select restraint lines in Chimera (hold control and press the left mouse button) and the currently selected spectrum will be centered to show the corresponding peak. A status line in the Sparky extension dialog reports what restraint was selected and gives an explanation if the spectrum could not be centered (for example, because chemical shifts are not known). When you select a peak the corresponding restraint line is highlighted if one exists. If one does not exist a blue line is created and highlighted provided the two atoms associated with the peak assignment can be identified. Atom identification uses the atom name translation facility. Assignment names that refer to more than one atom (eg methyl group ALA MB) are not matched to any molecule atom. Restraints involving such NMR pseudo-atoms are not shown.
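The color scheme for restraint lines amounts to a comparison of the atom separation with the two bounds. The function below is a sketch of that rule, not code from chimeraview.py. How a separation exactly equal to a threshold is colored is not stated in the text, so the handling of those boundary cases here is an assumption.

```python
def restraint_color(distance, lower_bound, upper_bound):
    """Return the line color for a restraint given the atom separation.

    Boundary cases (distance exactly at a threshold) are an assumption.
    """
    if distance > 1.2 * upper_bound:
        return "red"        # more than 1.2 times the upper bound
    if distance > upper_bound:
        return "magenta"    # 1.0 to 1.2 times the upper bound
    if distance < 0.8 * lower_bound:
        return "blue"       # closer than 0.8 times the lower bound
    if distance < lower_bound:
        return "cyan"       # 0.8 to 1.0 times the lower bound
    return "green"          # satisfied restraint


# Hypothetical restraint with bounds 2.0 to 5.0 angstroms.
for d in (6.5, 5.5, 3.0, 1.8, 1.0):
    print(d, restraint_color(d, 2.0, 5.0))
```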

Copy Peak Linewidths

Copy peak linewidths and positions to aid peak fitting (cl).

The positions and linewidths of selected assigned peaks are set to the average position and linewidth along each axis from other peaks in the spectrum having the same assignment on that axis. This is intended to be used to help fit overlapped regions. You assign peaks in an overlapped region and set their positions and linewidths from other (better resolved) peaks with like assignments. Then you fit the overlapped region by adjusting only peak heights. In practice, not all of the overlapped peaks will be assigned and there will be assignments for which there is no other peak in the spectrum with a known linewidth. The copy linewidth command sets the linewidth to zero along the axes for which no linewidth is known. Zero linewidths are always adjusted in the fitting procedure, even when the "Adjust linewidths?" option is turned off.
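The averaging step can be sketched as follows. This is an illustration only and does not use the Sparky peak objects. Peaks are represented by hypothetical dictionaries holding a per-axis assignment and a per-axis linewidth, and a linewidth of zero is taken to mean "not known", in line with the convention described above.

```python
def average_linewidth(peaks, resonance, axis, skip=None):
    """Average the known linewidths along one axis over all peaks assigned
    to the given resonance on that axis, leaving out the peak being set.

    Returns 0.0 when no other peak supplies a linewidth.
    """
    widths = [p["linewidth"][axis] for p in peaks
              if p is not skip
              and p["assignment"][axis] == resonance
              and p["linewidth"][axis] > 0]
    if not widths:
        return 0.0
    return sum(widths) / len(widths)


# Hypothetical peaks: per-axis assignments and linewidths in Hz.
peaks = [
    {"assignment": ("G1H1'", "G1H8"), "linewidth": (12.0, 14.0)},
    {"assignment": ("G1H1'", "C2H6"), "linewidth": (16.0, 0.0)},
]
print(average_linewidth(peaks, "G1H1'", 0))
```

The same averaging would apply to peak positions along each axis.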

Corma Simulated Spectrum

Create a simulated NOESY spectrum using CORMA predicted peak intensities (cx).

This command creates a simulated 2D homonuclear NOESY spectrum from predicted peak intensities. The peak positions and linewidths are derived from a specified assigned NOESY spectrum (ie. experimental data). The simulated peak intensities are read from a file in CORMA format. Here is a sample of this file format:

ATOM1   ATOM2    DISTANCE      RATE     Icalc     Iobs     error         
H1'   4 2H2'  4     2.357  -4.97628   0.11698   0.06497   0.05201 ****      
H1'   4 H8   41     3.989  -0.21205   0.03088   0.03152  -0.00064           

The header line is necessary and there can be lines preceding the header which are ignored. The column locations are significant (this is Fortran style output). The simulated spectrum extension expects the first atom name in columns 1-4, first residue number in columns 5-7, second atom name in columns 9-12, second residue number in columns 13-15, and the predicted intensity in columns 36-45. All other columns are ignored.
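To make the column positions concrete, here is a small sketch that reads the fields the extension is said to use from the sample lines shown above. It is not the code in cormaspectrum.py. Recognizing the header by the text "ATOM1" at the start of the line is an assumption made for this example.

```python
def parse_corma(lines):
    """Return (atom1, residue1, atom2, residue2, intensity) tuples
    from the lines of a CORMA format file."""
    entries = []
    in_table = False
    for line in lines:
        if not in_table:
            # Lines before the header, and the header itself, are skipped.
            in_table = line.startswith("ATOM1")
            continue
        if not line.strip():
            continue
        entries.append((line[0:4].strip(),      # columns 1-4: first atom name
                        int(line[4:7]),         # columns 5-7: first residue number
                        line[8:12].strip(),     # columns 9-12: second atom name
                        int(line[12:15]),       # columns 13-15: second residue number
                        float(line[35:45])))    # columns 36-45: predicted intensity
    return entries


sample = [
    "ATOM1   ATOM2    DISTANCE      RATE     Icalc     Iobs     error",
    "H1'   4 2H2'  4     2.357  -4.97628   0.11698   0.06497   0.05201 ****",
    "H1'   4 H8   41     3.989  -0.21205   0.03088   0.03152  -0.00064",
]
for entry in parse_corma(sample):
    print(entry)
```

Because the fields are located by column rather than by splitting on white space, atom names and residue numbers that run together are still read correctly.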

A peak is made in the simulated spectrum for each entry in the CORMA file provided a position and linewidth can be determined from the specified experimental NOESY spectrum. The position is found by looking for assigned resonances for the two atoms. The atom name translation facility is used to match the atom names in the CORMA file with the Sparky assignment names. The linewidths are derived from assigned peaks in the experimental NOESY spectrum. An average linewidth is calculated for each resonance separately for each spectrum axis using all assigned peaks in the experimental spectrum for which linewidths are known. If no peak linewidth is available from the experimental data for a resonance along one of the axes, then the default linewidth specified in the dialog is used. If the default is zero no simulated peak will be created in this case. The peak intensity (ie volume) from the CORMA file is divided by the product of the linewidths and multiplied by a scale factor 1e6 to produce a peak height. This normalization is of course arbitrary so the resulting peak heights cannot be directly compared to experimental peak heights. The peaks in the simulated spectrum are Gaussian and truncated beyond 5 standard deviations. There is no noise added. The simulated spectrum file is a normal UCSF format spectrum file which can be opened in Sparky. This extension writes the file but does not open it in Sparky. Comparison can be done by overlaying the simulated spectrum on the experimental spectrum.
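The conversion from a CORMA intensity to a simulated peak height is a one line formula. The sketch below restates it in Python under the description given above. The function name is hypothetical, and the example linewidths are arbitrary values assumed to be in Hz.

```python
def simulated_peak_height(intensity, linewidths, scale=1e6):
    """Divide the CORMA intensity (volume) by the product of the linewidths
    and multiply by the scale factor.

    Returns None when a linewidth is zero, in which case no peak is made.
    """
    product = 1.0
    for linewidth in linewidths:
        if linewidth == 0:
            return None
        product *= linewidth
    return intensity / product * scale


# Intensity from the sample file with two arbitrary linewidths.
print(simulated_peak_height(0.11698, (13.2, 15.1)))
```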

Creating simulated spectra with peaks2ucsf

The above described extension uses a program peaks2ucsf that comes with Sparky to create the simulated spectrum file. You can use this program directly to create other simulated data sets. The parameter file that describes the desired spectrum and peaks is illustrated below. (This is the documentation displayed when you run peaks2ucsf with no arguments.)

Syntax: peaks2ucsf output-file < parameter-file

Creates a UCSF format spectrum file from a list of Gaussian peaks.
The parameter file has the following format.

2				# dimension
1024 2048			# matrix size
8.37 9.21			# ppm at index 0,0
H H				# nuclei (H, N or C)
500.123 500.123			# nucleus frequencies (MHz)
4000.732 5200.183		# spectral widths (Hz)
4.37 2.15 1.95e6 13.2 15.1	# Gaussian center (ppm), height, linewidth (Hz)
    .
    .
    .
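If you want to generate a parameter file from a script, something along the following lines may serve as a starting point. It only builds the text in the layout shown above and does not run peaks2ucsf. The function name is hypothetical, and it is assumed that values on a line may be separated by single spaces and that the trailing # comments are optional.

```python
def peaks2ucsf_parameters(matrix_size, ppm_origin, nuclei,
                          frequencies_mhz, widths_hz, peaks):
    """Build parameter file text for peaks2ucsf.

    peaks is a list of (center_ppm, height, linewidths_hz) where center_ppm
    and linewidths_hz each have one value per spectrum axis.
    """
    def row(values):
        return " ".join(str(v) for v in values)

    lines = [str(len(matrix_size)),     # dimension
             row(matrix_size),          # matrix size
             row(ppm_origin),           # ppm at index 0,0
             row(nuclei),               # nuclei (H, N or C)
             row(frequencies_mhz),      # nucleus frequencies (MHz)
             row(widths_hz)]            # spectral widths (Hz)
    for center_ppm, height, linewidths_hz in peaks:
        lines.append(row(list(center_ppm) + [height] + list(linewidths_hz)))
    return "\n".join(lines) + "\n"


text = peaks2ucsf_parameters((1024, 2048), (8.37, 9.21), ("H", "H"),
                             (500.123, 500.123), (4000.732, 5200.183),
                             [((4.37, 2.15), 1.95e6, (13.2, 15.1))])
print(text)
```

The resulting text can be saved to a file and passed to peaks2ucsf on standard input as shown in the syntax line above.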

Linewidth Plot

Plot peak linewidths for each resonance (lp).

This extension is similar to the chemical shift plot but displays peak linewidths for each resonance. Each row in the plot corresponds to one resonance. The horizontal axis is linewidth and tick marks are placed to show the linewidth of each peak assigned with that resonance. The linewidth for just one of the spectrum axes is displayed. You specify which axis in the dialog. This helps you examine whether peak linewidths for a resonance are all the same as expected. It can identify integration errors by showing abnormally large or small linewidths. It can also be used to view linewidth variation for different resonances giving an indication of atom dynamics.

You can click on the individual tick marks and the corresponding peak will be centered in a spectrum window. You can limit the plot to certain atoms. You enter these in the dialog as a comma separated list of simple atom names (eg HA, HB), or residue symbol and atom name separated by a single space character (eg G HA, V HB).

Midas Atom Picking

Recenter spectra to show peak for a pair of atoms chosen in a 3D model (ma).

Once you have started Midas using the Midas constraint dialog (mc) you can select pairs of protons on the 3D model and have the selected spectrum recenter to the location where a crosspeak for the atom pair should appear. You turn on the "Show peaks?" switch in the atom picking dialog (ma). After selecting a pair of assigned protons the currently selected spectrum is automatically recentered to the relevant spectrum region. This works for 2D noesy and 3D noesy-hsqc spectra. The atom picking dialog also has switches to automatically label each picked atom in Midas, to show distances between picked pairs of atoms, and to show all assigned atoms (in yellow).

Midas Constraint Display

Display Mardigras constraint violations using molecular display program Midas (mc).

A PDB model and Mardigras constraints are displayed using Midas. This is similar to the Midas delegate NOEShow with fewer features. Violated distance constraints are shown as colored lines connecting atoms. Distances less than .8 times the lower bound are shown in blue, .8 - 1.0 times the lower bound are shown in cyan, greater than 1.2 times the upper bound are shown in red, 1.0 - 1.2 times the upper bound are shown in magenta, and satisfied constraints are shown in green.

Checking Mirror Peaks

Use mirror peaks to help make assignments of 3-D labelled noesy spectra.

For a pair of close protons H1, H2 you expect two hsqc-noesy peaks. One corresponds to the two protons and the C13 or N15 attached to the first proton, and the other is for the two protons and the C13/N15 attached to the second proton. To check a proposed assignment for a peak you can look to see if the "mirror peak" (the other peak associated with the pair of protons) exists. This code helps you look for mirror peaks. For each possible assignment it lists the signal/noise at the mirror peak location and allows you to jump to that place in the spectrum.

To find mirror peaks, this code must be able to look up the attached heavy atom for any proton. In order to do this your Sparky atom names must be recognized. They must either be the standard names used by Sparky extensions or you must define translations between the atom names you use in Sparky and the standard atom names.

Molecule Sequence

Enter molecule sequence for use by other extensions (sq)

The sequence is specified as a string of one letter amino acid or nucleic acid codes. White space characters (space, tab, newline) are ignored and lower or upper case can be used. You specify the number of the first residue. It is not possible to put breaks in the numbering sequence. Instead of typing in or pasting in the sequence you can specify a file which contains the sequence string. The file must contain only the sequence string.
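The rules for the sequence string (white space ignored, either case accepted, consecutive numbering from a chosen first residue) can be sketched like this. It is an illustration and not the code in sequence.py; the function name is hypothetical.

```python
def number_sequence(sequence_text, first_residue=1):
    """Return a list of (residue number, one letter code) pairs.

    White space is dropped and letters are converted to upper case.
    Numbering is consecutive, so breaks in numbering cannot be expressed.
    """
    codes = "".join(sequence_text.split()).upper()
    return [(first_residue + i, code) for i, code in enumerate(codes)]


# A sequence pasted with line breaks and mixed case, starting at residue 5.
print(number_sequence("mk t\nA", first_residue=5))
```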

Peak List (Python extension)

List peak assignments, volumes, linewidths, PDB model distances, Mardigras distance bounds, Corma predicted intensities, ...

This extension is similar to built-in Sparky peak lists but will show fields not supported by the built-in peak lists. PDB model distances for multiple models can be shown for each assigned peak. Fields for Mardigras distance bounds and Corma predicted peak intensities can also be shown. The Corma intensities are shown as an absolute intensity and as a fraction of the peak volume. The intensities are normalized to make the average fractional value equal 1. Data will not be displayed when PDB, Mardigras, or Corma file residue and atom names do not exactly match Sparky residue and atom names.

The peak list initially shows no fields. Press the Setup button to select fields. Unlike built-in peak lists, these lists are not updated when properties of the peaks (volume, position, ...) are changed. To update the list you must press the Update button.

Peak Table

List peaks from several spectra on one line

You select one or more spectra and a table is shown where the rows are assignments and the columns are different spectra. An entry indicates whether the assignment has been made in that spectrum. The entry can be 'yes', 'no', '.', or the peak volume if the peak has been integrated. A '.' entry means one of the resonances has not been determined for that spectrum condition. Clicking on any entry will select the peak in the corresponding spectrum. Double clicking on an entry recenters the spectrum to show the peak position (or where the peak should be if no assigned peak exists).

The 'Noesy Format?' switch changes the format of the table. The assignment is displayed as a pair of protons in Mardigras format. Peaks above and below the diagonal and peaks from 3-D noesy spectra are combined on a single row.

Python Shell

The Python shell window lets you type commands to the Python interpreter and displays their output. Refer to the Python documentation to see what kinds of commands you could type here. Invoking a Sparky extension from the Extensions menu or with a two letter accelerator causes the associated Python command to be sent to the Python shell. If an error occurs in an extension the shell window is displayed and a long and ugly and very useful stack trace tells exactly where the error occurred. You can turn off the automatic shell window display using the switch in the preference dialog. This is accessed by clicking the preferences button at the bottom of the shell window.

Reading Peak Lists

Make peaks on a spectrum from a peak list file (rp).

Peaks are read from a peak list file and placed on the selected spectrum. The file should contain a line for each peak. The line should have an assignment followed by chemical shifts for each axis. For example

	C2H5-G1H1      5.395      6.030

The assignment can contain ? components. It can omit a group name for a component -- the shorthand G1H1'-H2' where the second group is omitted is equivalent to G1H1'-G1H2'. The residue name is separated from the atom name by looking for a residue number followed by one of the letters H, C, N, Q, or M. Extra columns are used to set the peak note. Peaks for 3-D or 4-D spectra can be read.
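One way to read such a line is sketched below. This is an interpretation of the rules stated above, not the code in readpeaks.py. In particular, the regular expression is one possible reading of "a residue number followed by one of the letters H, C, N, Q, or M", and it is assumed that the number of chemical shift columns equals the number of assignment components, with anything after that treated as the note.

```python
import re

# Group (letters then residue number) followed by an atom name that starts
# with H, C, N, Q or M. A component that does not fit is an atom name only.
COMPONENT = re.compile(r"^([A-Za-z]*\d+)([HCNQM]\S*)$")


def parse_peak_line(line):
    """Return (components, shifts, note) for one peak list line.

    components is a list of (group, atom) pairs; a ? component gives
    (None, None).
    """
    fields = line.split()
    components = []
    group = None
    for part in fields[0].split("-"):
        match = COMPONENT.match(part)
        if match:
            group = match.group(1)
            components.append((group, match.group(2)))
        elif part == "?":
            components.append((None, None))
        else:
            # Group omitted: reuse the group of the previous component.
            components.append((group, part))
    count = len(components)
    shifts = [float(f) for f in fields[1:1 + count]]
    note = " ".join(fields[1 + count:])
    return components, shifts, note


print(parse_peak_line("C2H5-G1H1      5.395      6.030"))
print(parse_peak_line("G1H1'-H2' 5.8 2.5 weak"))
```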

Relaxation Peak Heights

The rh command lets you create a table of peak intensities for several spectra. It was intended to collect data for fitting exponentials to calculate relaxation time constants. When you first invoke rh a dialog appears. You choose the spectra you want intensities for. Then you select a peak or peaks and when you type rh a line is added to the list of peaks. The line lists the peak assignment and intensities for each spectrum. The spectrum intensities are listed in the order in which you initially selected the spectra. If a peak marker with the same assignment as the selected peak does not exist in any of the chosen spectra then it is created. When a new peak marker is created it will automatically be centered if the "Center new peaks?" check button is on. Instead of peak heights you can choose to put volumes in the peak list by turning on the "Show volumes?" check button. This requires that the peaks are already integrated. Peak integration is not done automatically. The list can be saved or appended to a file with the Save and Append buttons for subsequent fitting with other software. The list is cleared with the Clear button.

Reposition Sequence

Reposition assigned protein fragment using chemical shift statistics (rs).

This extension helps determine the correct sequential position of a stretch of assigned protein backbone resonances. You specify a numeric range of residues. Every possible repositioning of the resonances for these residues up and down the sequence is considered. For each position the deviation of the resonance chemical shifts from the expected shifts based on residue and atom type is determined. A score describing how well the assignments fit at this sequence position is calculated based on a database of expected shifts. The positions and scores are displayed in a list ordered from best to worst score. This list is produced by pressing the Positions button. You can select a line in the list and press the Move button to update all peak assignments involving the moved resonances.

The chemical shift statistics come from BioMagResBank and reflect an average of assigned resonances for all proteins in their database. (Resonances where unusual chemical shift referencing was used, or that are outside 8 standard deviations, or that are aberrant in other ways were excluded. The actual data used is in the Sparky Python file shiftstats.py and contains more details.) Each database residue/atom shift has a standard deviation. The score listed by the repositioning extension is obtained by averaging for all moved resonances the magnitude of the difference between experimental and expected shift divided by the standard deviation for that expected shift. So a score of 1.0 means that at the new sequence position the experimental shifts on average are 1 standard deviation from the expected values. The list also has a column called Mismatches. A mismatch is where a resonance does not make sense in the new sequence position. An example is moving a CB to a glycine residue (which has no CB atom). If you move assignments to a location where there are mismatches the resonances are moved and resonances not normally defined for a residue will be made. Another column in the scored positions list is called Collisions. A collision is where a moved resonance lands on an already assigned resonance. It is inadvisable to move assignments onto already assigned resonances (ie when there are collisions). It only makes sense to do so if the chemical shift of the destination is the same as that of the resonance being moved.
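The score calculation can be sketched in a few lines. This is an illustration of the averaging described above and not the code in reposition.py. The dictionaries, the function name and the numbers in the example are hypothetical (they are not BioMagResBank values), and leaving resonances without an expected shift out of the average is an assumption.

```python
def position_score(experimental, expected):
    """Average over resonances of |experimental - expected| / standard deviation.

    experimental maps atom -> shift (ppm); expected maps atom -> (mean, sd).
    Atoms without an expected shift are left out of the average (assumption).
    Returns None if no resonance can be scored.
    """
    deviations = [abs(shift - expected[atom][0]) / expected[atom][1]
                  for atom, shift in experimental.items()
                  if atom in expected]
    if not deviations:
        return None
    return sum(deviations) / len(deviations)


# Made-up numbers chosen only to make the arithmetic easy to follow.
experimental = {"CA": 58.0, "N": 120.0}
expected = {"CA": (56.0, 2.0), "N": (121.0, 2.0)}
print(position_score(experimental, expected))
```

Here the two resonances are 1.0 and 0.5 standard deviations from their expected values, so the score is their average.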

In addition to scoring all possible sequence positions for a segment of residues, you can list individual chemical shifts and their deviations from database values. This is done by pressing the Shifts button at the bottom of the dialog. The shifts for all resonances in the specified range of residues are listed as well as the expected shift and deviation from the expected shift in standard deviation units. Also the number of peaks assigned with each resonance is shown. Expected chemical shifts are not shown for atoms with non-standard names.

Restricted Peak Picking

The restricted peak pick command kr automatically picks peaks using existing peak markers from another spectrum as a guide. For example you can pick peaks in a 3D HNCA spectrum (correlating amide proton and nitrogen and alpha carbon) using already picked peaks from a 2D N15-HSQC (correlating amide proton and nitrogen). Only peaks in the 3D spectrum which have nearly the same 1H and 15N shifts as a peak in the 2D spectrum would be found. The advantage of restricting the search is that lower picking thresholds can be used. More real peaks and fewer noise peaks will be found. And the search is much faster. Other useful cases include picking 3D N15-NOESY peaks restricted with 2D N15-HSQC peaks, or picking 3D HCCH-TOCSY peaks using 2D C13-HSQC peaks. Another useful case is picking 3D HNCACB peaks using already picked 3D HNCA peaks. The HNCACB spectrum is just like the HNCA only it sees beta carbons in addition to alpha carbons. Because it has lower sensitivity you can first pick HNCA peaks and then pick HNCACB peaks that are near HNCA peaks using low thresholds.

The kr command shows a dialog where you specify the spectrum to be peak picked and the reference spectrum. Only the selected peaks in the reference spectrum will be used. You specify which axes of the pick spectrum will be restricted to have shifts near reference peaks. And you specify the allowable deviation in ppm for each reference axis. Pressing the Pick Peaks button will then search for peaks using the selected peaks in the reference spectrum. The minimum pick heights and linewidths used are exactly the same as the ones used in manual peak picking. They can be set with the peak pick thresholds command (kt). There is also a Select Peaks button which just selects existing peaks in the pick spectrum using the selected reference peaks to narrow the search.
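The restriction test itself is simple to express. The sketch below is not the code of the kr extension (which searches the spectrum data in a corridor around each reference peak). It only shows the idea of keeping candidate positions that lie within a ppm tolerance of some reference peak, much as the Select Peaks button narrows existing peaks. The function name, the tuple layout, the axis order of the 3D spectrum and all numbers are assumptions made for the example.

```python
def near_reference(peak, reference_peaks, axis_map, tolerances):
    """Return True if peak lies within tolerance of some reference peak.

    peak            -- tuple of ppm positions in the pick spectrum
    reference_peaks -- list of tuples of ppm positions in the reference spectrum
    axis_map        -- axis_map[i] is the pick spectrum axis restricted by
                       reference axis i
    tolerances      -- allowed deviation in ppm for each reference axis
    """
    for ref in reference_peaks:
        if all(abs(peak[axis_map[i]] - ref[i]) <= tolerances[i]
               for i in range(len(ref))):
            return True
    return False


# Made-up example: 2D N15-HSQC peaks (1H, 15N) guide a 3D HNCA
# whose axes are assumed to be ordered (13C, 15N, 1H).
hsqc_peaks = [(8.20, 120.5), (7.95, 118.0)]
hnca_candidates = [(56.1, 120.4, 8.21), (45.0, 110.0, 9.50)]
axis_map = [2, 1]         # HSQC 1H -> HNCA axis 2, HSQC 15N -> HNCA axis 1
tolerances = [0.05, 0.5]  # ppm, one per reference axis

kept = [p for p in hnca_candidates
        if near_reference(p, hsqc_peaks, axis_map, tolerances)]
```

Only the first candidate survives, because its 1H and 15N positions both fall within the tolerances of the first HSQC peak.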

Technical note / known bug: A corridor around each reference peak position is searched in the pick spectrum. Because this corridor is defined in index units rather than ppm units, peaks up to half a spectrum index unit outside the specified ppm range may be found, and peaks up to half an index unit inside the range may be missed.

Shifting Resonances

Move one assigned peak and have all other peaks along its resonance lines moved by the same amount (mv).

This is used to assign a new spectrum taken under new experimental conditions when a previous spectrum has already been assigned. You can select all peaks from the old spectrum (pa) and copy and paste them to the new spectrum (accelerators oc and op). Then start this tool and move a peak for each resonance line to its shifted location.

For each resonance in the assignment of the hand moved peak all other peaks in the spectrum assigned with that resonance are located. For each peak with this resonance the corresponding peak in the old spectrum is found and the new peak is shifted relative to the old peak to match the size of the shift for the hand moved peak. The reason the shifts are referenced to the old spectrum is as follows. Suppose you move some peaks without using this tool. Then you turn this automatic peak mover on. Now if you move a peak, other peak markers that you already moved by hand would be moved away from their correct locations. This is not what you want. By referencing the shift to positions of the peaks in the old spectrum this problem is avoided.
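The reason for referencing the old spectrum is easier to see in code. The following is a minimal sketch, not the implementation of the mv extension. It assumes each peak is a plain dictionary, that every assignment occurs only once per spectrum, and that peaks in the old and new spectrum can be looked up by assignment. All names and shifts are made up.

```python
import copy


def propagate_shift(moved_peak, old_peaks, new_peaks):
    """Move peaks in new_peaks that share a resonance with moved_peak.

    Each peak is a dict with 'assignment' (a tuple of resonance names,
    one per axis) and 'position' (a list of ppm values, one per axis).
    old_peaks and new_peaks map an assignment tuple to the peak with
    that assignment in the old and new spectrum.
    """
    old_moved = old_peaks[moved_peak['assignment']]
    for axis, resonance in enumerate(moved_peak['assignment']):
        # Offset of the hand-moved peak relative to the old spectrum.
        offset = moved_peak['position'][axis] - old_moved['position'][axis]
        for assignment, peak in new_peaks.items():
            if peak is moved_peak:
                continue
            for a, name in enumerate(assignment):
                if name == resonance:
                    # Reference the old position, not the current one, so a
                    # marker already moved by hand is not pushed further.
                    old_position = old_peaks[assignment]['position'][a]
                    peak['position'][a] = old_position + offset


old_peaks = {
    ('T7H', 'T7HA'): {'assignment': ('T7H', 'T7HA'), 'position': [8.10, 4.30]},
    ('T7H', 'T7HB'): {'assignment': ('T7H', 'T7HB'), 'position': [8.10, 4.10]},
}
new_peaks = copy.deepcopy(old_peaks)   # peaks pasted into the new spectrum

moved = new_peaks[('T7H', 'T7HA')]
moved['position'][0] = 8.25            # hand move along the T7H axis
propagate_shift(moved, old_peaks, new_peaks)
```

After the call the T7H-T7HB peak sits at about 8.25 ppm on the T7H axis and is unchanged on the other axis. Calling the function a second time gives the same result, which is the point of measuring the offset from the old spectrum.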

Specify Labelled Axis

Specify spectrum proton axis labelled by a heavy atom (la)

This dialog allows you to specify the proton axis labelled by (i.e. directly attached to) a heavy atom axis of 3D spectra. It applies to spectra such as HSQC-NOESY or HNHA where there are two proton axes and one heavy atom axis. You identify which of the two protons is attached to the heavy atom. This is needed by some extensions (e.g. the AutoAssign extension) to correctly interpret the spectra. When such an extension needs this information it will ask, so cases where you need to use this dialog are rare.

Specify Spectrum Type

Specify the type of the selected spectrum (sy)

Some extensions need to know the type of a spectrum (e.g. hnco, hcch-tocsy, ...). The peaks that are expected can be inferred from the spectrum type and the primary structure of the molecule. This is used by the spin graph assigner and AutoAssign extensions. The type of a spectrum is guessed by matching its name against a list of known types: n15hsqc, hnco, hnca, hncoca, caconh, hncacb, cbcaconh, hncocacb, hnha, haconh, hncoha, .... If the spectrum name contains one of these type names then it is used as the type of the spectrum. If more than one type name is contained in the spectrum name then the longest match is used. If the guess is incorrect you can fix it with this command.
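The longest match rule can be sketched as follows. This is an illustration rather than the code Sparky uses. The function name is hypothetical, the type list contains only the names spelled out above, and how Sparky treats upper and lower case is not described here, so the sketch does a plain case-sensitive substring test.

```python
KNOWN_TYPES = ['n15hsqc', 'hnco', 'hnca', 'hncoca', 'caconh', 'hncacb',
               'cbcaconh', 'hncocacb', 'hnha', 'haconh', 'hncoha']


def guess_spectrum_type(spectrum_name, known_types=KNOWN_TYPES):
    """Return the longest known type name contained in spectrum_name,
    or None if no type name occurs in it."""
    matches = [t for t in known_types if t in spectrum_name]
    if not matches:
        return None
    # With several matches of equal length, max() keeps the first one
    # in list order. That tie rule is a choice made for this sketch.
    return max(matches, key=len)
```

For example a spectrum named ubq_hncacb_298K contains both hnca and hncacb, and the longer name hncacb wins. A name such as noesy150 contains none of the types, so the sketch returns None, the case where you would set the type yourself with sy.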

Spin Graphs

Show a diagram of atoms connected by lines, a line for each assigned peak (sg).

Spin Graph Image

A spin graph is a diagram with vertices and lines connecting vertices. The vertices represent atoms and the lines represent NMR interactions. This extension shows spin graphs for assigned peaks. A peak in a 2-D spectrum is shown as a line between the two atoms. A peak in a 3-D spectrum is shown as two lines connecting the w1 and w2 axis atoms and the w2 and w3 axis atoms. Peaks in more than one spectrum are shown as different color lines. The graph is laid out based on templates for DNA, RNA, and proteins. In order to use the templates your Sparky atom names must be recognized. You can define translations between the atom names you use and the standard atom names used by Sparky extensions. You can print a diagram with the Save Postscript entry under the Commands menu. Only the region visible on the screen is printed.

Spin Graph Options

Under the Spectra menu there are checkbutton entries for each loaded spectrum. Pushing in a button causes all peaks in that spectrum to be displayed. The color of the button matches the color of the lines in the graph. Deselecting a button undisplays the peaks for that spectrum.

Spin Graph Options Image

The What... entry under the Layout menu brings up a dialog for controlling what types of atoms and peaks are displayed. This dialog will only appear when you have selected at least one spectrum to display. When you select a spectrum the residue label for each residue is shown but not individual atoms. The Show Residue Atoms and Hide Residue Atoms buttons in the What to Display dialog show and hide all atoms. To limit the displayed atoms to certain types you can enter a space separated list of atom names. Also you can display only specified ranges of residues by entering ranges of residue numbers. The text labels identifying atoms and residues can be turned on or off. If an atom is not being displayed but the residue it belongs to has its label displayed then peak lines can be drawn to the residue label. This is controlled with the Lines to Residue Labels switch. If it is off then peaks assigned to undisplayed atoms are not shown. If you are displaying residue labels and lines between the residues, a single line can correspond to many peaks. You can display a numeric count of how many peaks are represented by that line by turning on the Peak Counts option. If you are displaying multiple spectra, the count includes lines from all spectra. There are switches to display or undisplay intra-residue, sequential or long range peaks. And there is an option to shade the intra-residue, sequential and long range lines in 3 distinct shadings emphasizing the intra-residue lines. After you have changed any of these options you need to press the Ok or Apply button for them to take effect. The Hide/Show Residue Atoms buttons have an immediate effect.

Label sizes, line thickness and spacing, and dot radii can all be set. This is done with the Sizes... entry under the Layout menu. This shows a dialog where numbers in units of screen pixels can be entered for all sizes. Label sizes represent height of characters. The line spacing parameter controls the offset between parallel lines of different colors. If the offset is 0 then the lines will be on top of each other and the top one will obscure any others. The line spacing should be greater than or equal to the line thickness to avoid parallel lines overlapping one another. There are checkbuttons to specify whether text labels should be automatically scaled as you zoom in and out.

Spin Graph Layout

The initial layout of the residues zig zags down the screen with increasing residue numbers to form a roughly square pattern. The first row goes left to right, the second right to left, ..., and so on. Choosing Row Layout under the Layout menu puts all residues in a single row from lowest residue number to highest. The scale is set so that the residue that takes up the most vertical space is fully visible using the current window height.

You can rearrange the layout of spin graph atom labels, atom dots, and residues with the left mouse button. Dragging a residue label moves all atoms in that residue. A box is drawn around these atoms while the left button is held down. If you want to move the residue label but not the atoms, drag with the middle mouse button held down. If you click on a residue label but do not move it, its atoms are displayed or undisplayed. If you press the left button over a blank spot and drag, a rectangle will be drawn. The spin graph will be zoomed to display the delimited region when you release the button. The vertical slider on the right edge of the spin graph window allows you to zoom in and out.

To reposition a sequence of residues drag a residue label using the right mouse button. When the button is depressed a yellow line will be drawn connecting the residues in sequential order. You drag the chain around like a piece of string and then drop it. You can chop your sequence up into smaller pieces for purposes of this string dragging operation. To do this press and release the right button on a residue label. The label will be highlighted in yellow. Now you can drag the string of higher numbered residues or the string of lower numbered residues. The highlighted residue label represents a break point and is included with the string of lower numbered residues. You can right click to make a number of these break points and drag the individual strings around. To eliminate a break point right click on the yellow residue label again.

You can save new layouts. When you load a layout from a file you should already be displaying a graph. Loading the new layout just repositions the atoms, it does not create the graph. This allows you to use the same layout for different spectra.

The atoms of a residue are initially positioned according to templates defined in the spinlayout.py file that is part of the Sparky distribution. You can use the Template... command under the Layout menu to interactively adjust the layout templates and create your own template file. Instructions are given in the Template... dialog.

Spin Graph to Spectrum

You can click on a line with the left mouse button to select the corresponding peak. Using the middle mouse button selects the peak and centers a spectrum window and raises it to the top to show you the peak. If the spin graph shows peaks of several spectra in different colors then you can click on the individual parallel colored lines to select or show peaks from the different spectra. In cases where a line represents more than one peak, for example a 2D NOESY peak and its transpose peak, one of them is arbitrarily chosen. If there is no line between two atoms and you wish to see the region of a spectrum where the corresponding peak would be, click with the left button on each atom. The currently selected view window will be recentered so that the hypothetical peak would be in the middle. The selected atoms should be clicked in the w1, w2 axis order of the current spectrum window. If you click them in the wrong order you will be shown the wrong region. If the hypothetical peak lies outside the bounds of the spectrum it will be aliased onto the spectrum giving seemingly random positions. The same procedure works to show a region of a 3D spectrum. You left click on the 3 atoms in the w1, w2, w3 axis order. The top line of the spin graph window indicates what atoms have been chosen.

Spin Graph Assigner

Explore possible assignments using spin graph display (ga)

Assignment List Assignment Graph

This extension helps you make protein backbone resonance assignments using spin graphs. A spin graph is a diagram showing atoms as dots and assigned peaks as lines between the atoms. You can click on an atom and a list of possible chemical shifts is shown. Under each candidate chemical shift is a group of peaks from different spectra which support the proposed assignment. The suggested chemical shifts for the selected atom are ordered to put the most probable assignment at the top. The ordering is based on the total number of peaks consistent with the proposed assignment. Each resonance assignment line also shows the number of standard deviations of this chemical shift from the average shift for this atom and residue type using statistics from BioMagResBank. The peak assignment lines show the peak position or the deviation (in ppm) from the resonance shift for each spectrum axis. You can select a resonance assignment line from the list and press the Assign button to make all the associated peak assignments. You can click on atoms which already have an assignment and press the Unassign button to remove all associated peak assignments and then examine alternatives. As you assign and unassign resonances the spin graph lines are updated to display the assigned peaks. This extension uses peaks that you have already picked in the spectra. It does not find new peaks that have not been picked.

The Setup button on the assignment graph dialog brings up a window for choosing the spectra to use and chemical shift tolerances. Only the following spectra for use in protein backbone assignment are supported: n15hsqc, hnco, hnca, hncoca, hncacb, cbcaconh, hnha, haconh. Information about what peaks are expected in each of these spectra is encoded in the experiments.py Python code that comes with Sparky. In the future (sometime after 5/99) I will add spectra for making protein side chain assignments and for assigning DNA and RNA. The type of each of your spectra is guessed from the spectrum name which is derived from the file name. If your spectrum name does not contain one of the above listed spectrum types then you will be asked to specify the spectrum type. If it does contain one of the above type names but this is not the correct type for the spectrum you will have to change it using the spectrum type command sy. After you press Ok on the setup dialog a spin graph window will appear showing the existing assignments.

Here are details of the algorithm that produces the proposed resonance assignments and lists of supporting peak assignments. When you click on an atom in the spin graph display all expected peaks (from the spectra chosen in the setup dialog) involving that atom are considered. For example if the atom selected is a protein backbone CA and an HNCA spectrum is being used then an intra-residue peak connecting the CA to the amide H and N of the same residue is expected and an inter-residue peak connecting the next residue's amide H and N to this CA is expected. For each expected peak a check is made to see if any of the resonances have been assigned. If any is assigned then all picked peaks in the spectrum consistent with those resonances (using the tolerances set in the setup dialog) are considered. Each such peak gives a possible chemical shift value for the CA you are trying to assign. At this point we have a list of peaks connecting already assigned resonances to the CA resonance you want to assign. Each peak suggests a chemical shift for the CA. For each of these possible CA shifts, the peaks that agree with the shift to within the setup dialog tolerances are collected. These are the possible resonance assignments and sets of supporting peak assignments that are displayed.
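The last step of this algorithm, grouping candidate shifts with their supporting peaks, can be sketched as below. This is a simplified illustration and not the code of the extension. It assumes the candidate peaks have already been found from assigned resonances, represents each as a hypothetical (spectrum name, suggested shift) pair, uses a single tolerance, and drops proposals whose supporting peaks are identical to an earlier proposal. The ordering by number of supporting peaks follows the description of the assignment list given earlier.

```python
def propose_assignments(candidate_peaks, tolerance):
    """Group candidate shifts with the peaks that support them.

    candidate_peaks -- list of (spectrum_name, suggested_shift_ppm) pairs
    tolerance       -- maximum ppm difference for a peak to support a shift

    Returns a list of (shift, supporting_peaks) ordered so the proposal
    with the most supporting peaks comes first.
    """
    proposals = []
    seen = set()
    for _, shift in candidate_peaks:
        support = tuple(p for p in candidate_peaks
                        if abs(p[1] - shift) <= tolerance)
        if support in seen:
            continue  # same supporting peaks as an earlier proposal
        seen.add(support)
        proposals.append((shift, list(support)))
    # sort() is stable, so equally supported proposals keep their order.
    proposals.sort(key=lambda proposal: len(proposal[1]), reverse=True)
    return proposals


# Made-up CA shift suggestions from three spectra.
candidates = [('hnca', 58.10), ('hncacb', 58.15),
              ('hncoca', 58.05), ('hnca', 62.40)]
proposals = propose_assignments(candidates, tolerance=0.2)
```

With these numbers the first proposal is a CA shift near 58.1 ppm supported by three peaks, and the second is 62.4 ppm supported by a single HNCA peak.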

Since resonance assignments are only proposed using peaks that connect to already assigned resonances you must have at least one already assigned resonance to get started. To start I suggest choosing an HSQC peak and assigning it with the standard assignment dialog (at) to an arbitrary place along the sequence. It is best to put it not too close to prolines or glycines so that you will be able to extend your assignments without running into these difficult residues. After you have extended to assign a few residues you can reposition the assignments using chemical shift statistics with the sequence repositioning tool (rs).

To make all peak assignments for a proposed resonance assignment you select the resonance assignment line (the top line of the group) and press the Assign button. Some peaks in the group may already be assigned. These preexisting assignments are shown in the last column. These peaks will not be reassigned when you select a resonance assignment line and press the Assign button. You can reassign the individual peaks by clicking on individual peak lines and pressing the Assign button. Or you can first unassign them by selecting the peak line and pressing the Unassign button. When you click on a spin graph atom that is already assigned you are shown just one group of peaks consistent with the current assignment. If you wish to see a list of alternative resonance assignments you need to unassign the resonance and then click on the atom again. (I should change this so that you see the alternatives in this case. The case of clicking on an already assigned resonance is handled differently because I want to show assignments that have been made that are outside of tolerances. This would be better handled by just putting the current assignment group at the top and include all existing assignments in it.)

When you assign and unassign peaks or groups of peaks with the Assign and Unassign buttons the spin graph lines are updated but the text window is not. (I don't update the text window because then all the alternative assignments disappear. This is related to the last comment of the previous paragraph.) If you assign or unassign peaks by other means such as the standard assignment dialog (at) or using the unassign peak command aD, then neither the spin graph display nor the text is updated. Pressing the Update button on the assignment graph dialog will update the spin graph to reflect the current assignments. When you pick new peaks or delete peaks you also need to press the Update button so that the new peak lists are used for proposing assignments.

Strip Plot

Strip Plot Image

The strip plot window (command sp) is used for displaying narrow 2D portions of 3D spectra. It can display many strips of one or more spectra and can search for strips having peaks with shifts matching selected peaks. There are many possible uses. For example, the HNCACB strip for each residue of a protein can be displayed with the amide proton on the x-axis, the alpha and beta carbon shifts on the y-axis, and the amide nitrogen on the z-axis (i.e. determining which plane is viewed). This spectrum correlates amide proton and nitrogen with intra and preceding residue CA and CB atoms. Every pair of strips from adjacent residues should show CA and CB peaks with matching y positions. So a walk down a protein backbone is displayed. Since only 20 or so strips will fit on the screen, there is a horizontal scrollbar to shift through strips. You can find strips with peaks having y positions matching selected peaks. So you can produce a walk down a protein backbone by starting with one HNCACB strip. Select the intra CA and CB peaks and invoke a command to find strips with peaks with matching y positions. This will find the amide strip for the next residue (recall each strip shows intra and preceding CA and CB). Then you can select the CA and CB for the new strip and find the next residue. In practice this will probably not work because peaks will be missing due to the low sensitivity of the HNCACB experiment. To accommodate this you can display strips for both HNCACB and CBCACONH. The latter spectrum shows only peaks from amide H and N to preceding residue CA and CB and may have better signal to noise. You can request that the strip plot window display strips for both spectra and you can search for matching strips in the CBCACONH spectrum. Besides backbone assignment using triple resonance spectra, strip plots are also useful for side chain assignments.
For instance, starting with one strip of an HCCH-TOCSY (correlating attached proton and carbon to all other protons connected by chains of carbons) with the xy axes being the two proton axes, you can select all peaks in the strip and find all other strips that match a large subset of those peaks. This can produce a collection of all strips for a sidechain spin system. Strip plots are also useful for making assignments in N15 and C13 edited NOESY spectra. To read more about how to take advantage of strip plots see: Journal of Biomolecular NMR, 6 (1995), 1-10 where Christian Bartels, Tai-he Xia, Martin Billeter, Peter Güntert and Kurt Wüthrich describe their strip plot program XEASY. The Sparky strip plot extension was based on that paper.

How to Show Strips

Strip Plot Parameters Image

To display strips show the strip plot window with the sp command and choose "Select strip spectra... (ss)" under the Show menu. This displays a dialog listing the currently loaded 3D spectra. You click the checkbuttons next to the spectra you want to display strips for. You need to select which spectrum axes will be the x (horizontal), y (vertical) and z (plane) axes for each spectrum you will display strips for. Now you can select a peak and use the add selected peak strips command sk to show strips corresponding to the peak position for all of the chosen spectra. The strips are displayed in the same order that the spectra were chosen. Additionally each strip label has a border whose color matches the color of the checkbutton for that spectrum. A spectrum strip is only displayed if the x and z axes can be matched with corresponding axes of the spectrum the peak comes from. The peak must have a single axis with matching nucleus. (Note: This should be generalized to use the matching rules used by view centering that fall back on matching corresponding axis numbers.) The sk command adds strips at the end of the current set of strips. To delete individual strips type the Delete Strip command sd in the strip to be deleted. The sD command deletes all strips.

To display strips for all assigned peaks use the All Assigned Strips command sn. All x-axis and z-axis assignments are found for all chosen spectra and strips are displayed at these positions, ordered by residue number. The Goto Assigned Peak Strip command sj (under the Find menu) can then be used to scroll to the first strip whose assignment matches that of the selected peak.

To find all strips in a spectrum with peaks matching the y positions of currently selected peaks use the add strips matching peaks command sm. The search is done in the spectrum you typed the sm command in. Typically you pick one or two peaks in one strip and then wish to find strips in a different spectrum. To change the input focus to the spectrum in which you want to find matching strips without deselecting your peaks you can hold the shift key while clicking in the desired spectrum window. The peaks of the spectrum are clustered into strips using ppm ranges for each nucleus type. These ranges are specified in the Select Strip Spectra (ss) dialog. Then each strip has its peak y positions compared to the y positions of the selected peaks. A strip peak matches the y position of one of the selected peaks if it is within the range for that nucleus type. You can show only strips that have peaks matching every selected peak, or you can allow some number of selected peaks to have no match. The allowable number of mismatched peaks is also set in the Select Strip Spectra dialog. The displayed matching strips are ordered by how good the match is measured as the sum of the absolute y position errors. An unmatched position counts as an error equal to the ppm range in this score. Many matching strips can be found. They are appended to the current set of strips. To delete all the matching strips found by the most recent sm command use the Delete Matched Strips command sM.
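The match score described above can be sketched in Python. This is an illustration only, not the code behind the sm command. It assumes a single ppm range (one nucleus type on the y axis), and it assumes that each selected peak is compared with the closest peak in the strip, a detail the description above does not spell out. The function name and all numbers are made up.

```python
def strip_match_score(strip_y, selected_y, ppm_range, max_unmatched):
    """Return the match score for a strip, or None if too many of the
    selected peaks have no matching peak in the strip.

    strip_y    -- y positions (ppm) of the peaks in the strip
    selected_y -- y positions (ppm) of the selected peaks
    """
    score = 0.0
    unmatched = 0
    for y in selected_y:
        errors = [abs(y - sy) for sy in strip_y]
        if errors and min(errors) <= ppm_range:
            score += min(errors)   # absolute y position error
        else:
            unmatched += 1
            score += ppm_range     # an unmatched position costs the range
    if unmatched > max_unmatched:
        return None
    return score


selected = [58.2, 30.1]            # selected CA and CB peaks
strips = {
    'A': [58.25, 30.0, 55.0, 33.0],
    'B': [58.9, 30.15],
    'C': [61.0, 40.0],
}
scored = []
for name in sorted(strips):
    s = strip_match_score(strips[name], selected, ppm_range=0.4,
                          max_unmatched=1)
    if s is not None:
        scored.append((s, name))
scored.sort()                      # best (smallest) score first
ranked = [name for s, name in scored]
```

Strip A matches both selected peaks and comes first, strip B matches one of them and is kept because one unmatched peak is allowed, and strip C matches neither and is dropped.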

Each strip is a view window that can be manipulated like all other spectrum view windows. For instance the contour scale can be displayed (vC) and contour levels adjusted. To apply changes made to one strip to all other strips of the same spectrum use the copy view strip options command sv in the strip whose settings you want to propagate.

Volume Errors

Integrated peaks have a volume and volume error associated with them. The seldom used volume error is intended to estimate errors in integration arising from poor fitting, offset baselines, other spectrum processing artifacts, overlap with other peaks, .... It can be output as a column in peak lists for use by programs which compute distance bounds from NOESY peak intensities. The volume error is not set automatically when peak integration is performed. Peak fitting produces a "fit residual" which can be displayed as a separate column in peak lists. The ve command allows you to set volume errors for all integrated peaks based on a few criteria. There are default percent error values for fitting and for box/ellipse methods, there is a penalty for being near the spectrum diagonal, and there are penalties for being overlapped with smaller, bigger, and comparable volume peaks. All error values are expressed as a percent of total volume.

How to customize a Sparky extension

Here is how to modify a Sparky extension to do something new. Copy the Python code for the extension (standard extensions are in /usr/local/sparky/python) to a directory named Python under your home Sparky directory. Sparky looks first in this personal Python directory when you ask it to load Python code. If you want to use both the original and customized versions rename the file. Now look through the Python code and modify it as desired. Python is a very clean language so even if you don't know it you can probably figure out many of its features just by looking at some code. There are several books on Python and there is online documentation. Python is a general purpose language. The way Sparky data is represented in Python and how you can use the Sparky user interface are described in sparky.py.

To use a modified Sparky extension (in file mycode.py) bring up the Python shell window in Sparky (py) and type the command:

	import mycode

You can have your extension loaded every time you start Sparky by adding the above import command to a file called sparky_init.py in your Python directory. You will probably want to add a menu entry and accelerator to invoke the extension. You can do this by adding a line to the end of the Python code like:

	sparky.add_command('sg',
	                   'Spin graph (sg)',
	                   'spingraph.show_spin_graph()')

This adds an accelerator and extension menu item that execute the specified Python command.

If your modifications to the Python code generate an error, Python will print an error message including a trace indicating where the problem is. Sparky should never crash because of incorrect Python code. The error message will appear in the Python shell window (py). You can modify the code and then reload it with the command:

	reload(mycode)

You have to use this reload command instead of import because import does not reread the file.

How to write your own Sparky extension

You can look at the extensions in /usr/local/sparky/python for examples of Sparky extensions. A good way to learn Python is to take an existing extension that does something similar to what you want and modify it. Here is an example of a Sparky extension.

def write_peak_list(peak_list, path) :
  file = open(path, 'w')
  for peak in peak_list:
    if peak.assignment != None and peak.volume != None:
      line = "%15s %9.0f\n" % (peak.assignment, peak.volume)
      file.write(line)
  file.close()

The function write_peak_list() is defined. It takes two arguments, the first is a list of peaks and the second is the name of a file to write a peak list to. The first line opens the file for writing. The second line starts a loop where the variable "peak" will be set to each peak in "peak_list" in succession. The "if" statement checks that the peak has an assignment and a volume. If it does the variable "line" is set to a string containing the assignment and the volume. The next line writes the string to the file. When the loop is finished the file is closed.

The keywords def, for, in, if and "and" are defined by the Python language, as are open, None and the file methods write and close. The words "volume" and "assignment" are defined by Sparky. The other words are variable names I chose. The format of the string uses the C language format specifiers. The assignment name is placed in a field 15 characters long, then there is a space, and the volume with zero digits after the decimal place in a field 9 characters long followed by a newline character.
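You can try the format string on its own to see the field widths. The assignment label and volume below are made up for the example.

```python
# "%15s" right-justifies the label in 15 characters, "%9.0f" rounds the
# volume to zero decimal places and right-justifies it in 9 characters.
line = "%15s %9.0f\n" % ("G12H-G12N", 1234567.8)
```

The resulting line is 26 characters long including the newline: 6 spaces of padding and the 9 character label, one space, then 2 spaces of padding and the rounded volume 1234568.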

To use this function in Sparky you do the following. First put the code in a file called "writepeaks.py" in the directory "~/Sparky/Python". Then bring up the Python interpreter in Sparky with the command py and type:

	>>> from writepeaks import *
	>>> peaks = selected_view().spectrum().peak_list()
	>>> write_peak_list(peaks, "mypeaks")

The ">>>" is the Python prompt. The first line loads your function. The second line calls 3 Sparky defined functions to get the currently selected view, find the spectrum it is associated with, and produce a list of all that spectrum's peaks. The third line writes the peaks to a file called "mypeaks".

That's a lot of typing to produce a list of peaks. You could much more simply produce this list by bringing up a peak list (lt) and saving it to a file with the Save button. But if you needed to precisely control the format of the list so that it could be used as input to another program it might be worth writing the above Python function with exactly the right format specifier. If you were going to use this more than once you could make it callable with a keyboard accelerator like all other Sparky commands. I'd write a slightly different function for this purpose.

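import os
# Assumption: selected_view() is defined in sparky.py, like add_command().
from sparky import selected_view

def write_spectrum_peaks():
  spectrum = selected_view().spectrum()
  peak_list = spectrum.peak_list()
  # open() does not expand "~", so expand it to the home directory first.
  path = os.path.expanduser("~/Sparky/Lists/" + spectrum.name + ".peaks")
  file = open(path, 'w')
  for peak in peak_list:
    # Write only peaks that have both an assignment and a volume.
    if peak.assignment != None and peak.volume != None:
      line = "%15s %9.0f\n" % (peak.assignment, peak.volume)
      file.write(line)
  file.close()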

This version of the function takes the peaks from the spectrum of the current view and writes them to a file whose name is the spectrum name with a ".peaks" suffix. To have this function called when I type "wp" in a view window I would put the following lines in the file "~/Sparky/Python/sparky_init.py".

	from sparky import add_command
	from writepeaks import write_spectrum_peaks

	add_command("wp", "Write spectrum peaks (wp)",
		    "write_spectrum_peaks()")

The first two lines make the add_command() and write_spectrum_peaks() functions available. The three arguments to the add_command() function are the new keyboard accelerator "wp", the text to use for a menu entry, and the Python code to execute when the command is typed. Sparky executes the sparky_init.py file when it is started. So now whenever you start Sparky this command will be available, and will have a menu entry in the "Extensions" menu. To test this without exiting and restarting Sparky you would type at the Python interpreter prompt:

	>>> from sparky_init import *

There are more examples of Python extensions to Sparky in the file example.py that comes with the Sparky distribution (look in directory /usr/local/sparky/python). Copying example code and modifying it to suit your needs is an easy way to learn the Python language. You can also learn how Sparky data (peaks, spectra, ...) is represented in Python from example. A more direct way is to look at the file sparky.py distributed with Sparky. This file defines peaks, spectra, ..., and all the functions for getting data from Sparky and manipulating the Sparky user interface.

More information about Python

For more on Python look at the excellent tutorial or, for a crash course, the quick reference. For more details see the language manual and the library manual. To get the latest version of the Python language, or to learn about all kinds of optional packages, go to Python's home page at http://www.python.org