Atom Specification

Atoms can be specified in commands using:

hierarchical specifiers
zones
built-in classifications
attributes

or combinations of these. Some commands accept specifications of non-atomic models.

**For a quick start, see the examples on the second page of the Chimera Quick Reference Guide (PDF)**

In many commands where the specification is the last argument (e.g. color), a blank specification means “all.”

In commands where the specification is not necessarily the last argument (e.g. findclash), specifications with embedded spaces should be enclosed in single or double quote marks.

When a plus sign (+) has been typed into the Command Line, it will be replaced by the atom specification string of the next picked atom. Each plus sign must be preceded and followed by a space (or the end of the line). In addition, the following are valid atom specifications:

selected, sel, or picked, indicating atoms that are selected
names of previously saved selections, using sel=selection_name or just selection_name; if a saved selection has the same name as a surface category, the saved selection will be used.

Those familiar with atom specification in Midas may wish to consult the summary of differences.

Hierarchical Specifiers

BASIC ATOM SPECIFICATION
Symbol	Reference Level	Definition
#	model	number assigned to the model by default or by the user with the open command.
:	residue	residue name OR residue sequence number, with any insertion code appended
::	residue	residue name
@	atom	atom name

Each file of coordinates opened in Chimera becomes a model with an associated model ID number. Model numbers are assigned by the user with the open command or sequentially by default. Each model contains one or more residues, and each residue contains one or more atoms with names that are unique within that residue. Thus, an atom can be specified by its model number, residue number, and atom name. The lack of a specifier is interpreted to mean all units of the associated reference level; for example, if a model number is not given, the specification refers to all models.

Residue and atom names are read from the input file. In PDB format, a standard nomenclature is used for standard amino acid and nucleic acid residues. Asterisks (*) in input atom names will be translated to prime symbols ('), but translated back if coordinates are later saved to a file. Prime symbols in input atom names are not translated on input or output.

Model and residue numbers are integers, although a residue number may have an insertion code directly appended. Multiple model numbers or residue numbers may be indicated by comma-separated lists and/or one or more ranges of the form start-end. The literal words start and end can be used in place of the first and last numbers of a range, respectively, and the word all can stand for the whole range (same as giving no specification at that level).
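The list-and-range notation described above can be illustrated with a short Python sketch. It is not Chimera's parser: the function name expand_numbers and its first/last arguments are hypothetical, insertion codes are ignored, and numbers are assumed to be non-negative.

```python
def expand_numbers(spec, first, last):
    """Expand a list such as '45-83,90-98' into a sorted list of integers.

    first and last are the lowest and highest numbers available
    (hypothetical inputs; Chimera would take them from the model itself).
    """
    def bound(word):
        # The words "start" and "end" stand for the ends of the available range.
        word = word.strip().lower()
        if word == "start":
            return first
        if word == "end":
            return last
        return int(word)

    numbers = set()
    for term in spec.split(","):
        term = term.strip().lower()
        if term == "all":
            numbers.update(range(first, last + 1))
        elif "-" in term:
            low, high = term.split("-", 1)
            numbers.update(range(bound(low), bound(high) + 1))
        else:
            numbers.add(bound(term))
    return sorted(numbers)
```

For example, expand_numbers("start-3,98-end", 1, 100) gives the first three and last three numbers of a 1-100 range.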

Residue names and numbers cannot be specified together using a single colon, but both can be specified in a single line by using more than one colon. For example, to specify all residues named HEM and residue 52:

:hem:52 (NOT :hem,52)

More examples:

#0

- all atoms in model 0

#3:45-83,90-98

- residues 45-83 and 90-98 in model 3

#0:12@CA,N

- alpha carbon and nitrogen of residue 12 in model 0

:12,14@CA

- alpha carbons of residues 12 and 14

:12:14@CA

- all atoms of residue 12 and alpha carbon of residue 14

:12-20@CA:14@N

- alpha carbons of residues 12-20 and nitrogen of residue 14

:lys
- or -
::lys

- all lysine residues

In some cases, the basic notations cannot uniquely specify the model, residue, or atom of interest:

SUBCATEGORIES
General Form	Explanation
#model(s).submodel(s)	when a single PDB file contains multiple MODELs, they are considered submodels 1, 2, ... of a single model in Chimera
:residue(s).chain(s)	when a single model contains multiple chains, a unique specification includes both residue number and chain identifier
@atom(s).altloc(s)	when a single residue contains alternate locations of certain atoms, an independent specification includes both atom name and alternate location identifier

Submodel(s) are integers and may be indicated by a single value or a range of the form start-end. The literal words start and end can be used in place of the first and last numbers of a range, respectively, and the word all can stand for the whole range (same as giving no submodel specification).

Chain(s) and altloc(s) are alphabetical characters. Any residue in PDB HETATM records (or the mmCIF equivalent) that does not already have a chain identifier is assigned to chain het, unless the residue is named WAT or HOH, in which case it is assigned to chain water; residue numbers are unchanged. (See also the residue attribute isHet.) Residue specifications with chain IDs omitted will match residues in chains with single-character IDs but not residues in these special chains. For example, :12 includes :12.A and :12.B but not :12.water or :12.het.
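The special-chain rule above can be summarized in a few lines of Python. This is only a sketch of the stated behavior, not Chimera code; the function names are hypothetical, and treating a blank chain ID as matched by a chain-less specification is an assumption.

```python
SPECIAL_CHAINS = ("het", "water")

def assign_chain(chain_id, residue_name, is_hetatm):
    """Return the chain a residue ends up in, following the rule in the text."""
    chain_id = chain_id.strip()
    if chain_id or not is_hetatm:
        return chain_id              # an existing ID (or a non-HETATM residue) is left alone
    if residue_name.upper() in ("WAT", "HOH"):
        return "water"
    return "het"

def matches_without_chain(chain_id):
    """True if a residue specification with no chain ID (e.g. :12) can match.

    Assumption: blank chain IDs are matched; only het and water are excluded.
    """
    return chain_id not in SPECIAL_CHAINS
```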

Capitalization of residue and atom names, chain identifiers, insertion codes, or alternate location identifiers is not important, with one exception: when a model contains both uppercase and lowercase chain identifiers, case matters for chain specification in that model only.

Subcategorizations are appended to the basic specification. The symbol for the relevant category (#, :, or @) must precede the subcategory specification, although they need not be immediately adjacent. Because commas are used only to separate values of the basic reference levels (model, residue, and atom), they cannot be used to separate values of the sublevels directly. For example,

#0.1-3,5

is interpreted as submodels 1-3 of model 0 and all of model 5, while

#0.1-3,.5

indicates submodels 1-3 of model 0 and submodel 5 of all models.

A subcategory specification applies only to the preceding category value(s) not separated from it by any commas. Thus,

:50.B,.D

indicates residue 50 in chain B and all residues in chain D;

:12-15,26-28.a,45.b@ca

specifies CA atoms in residues 12-15 in all chains (except the special chains het and water), 26-28 in chain A, and 45 in chain B. The following specifies residues 12-15 in no-ID chains only:

:12-15.
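How commas and periods group values and subcategories can be sketched in Python as follows. The function split_terms is hypothetical and handles only the text after the #, :, or @ symbol; it is meant to show the grouping rule, not to reproduce Chimera's parser.

```python
def split_terms(spec):
    """Split the text after '#', ':' or '@' into (value, subcategory) pairs.

    None means "not given" (all values, or no subcategory restriction).
    An empty string as the subcategory means a trailing period, i.e. a blank ID.
    """
    pairs = []
    for term in spec.split(","):
        value, dot, sub = term.partition(".")
        pairs.append((value or None, sub if dot else None))
    return pairs

# "#0.1-3,5"  -> submodels 1-3 of model 0, and all of model 5
print(split_terms("0.1-3,5"))
# "#0.1-3,.5" -> submodels 1-3 of model 0, and submodel 5 of all models
print(split_terms("0.1-3,.5"))
# ":12-15."   -> residues 12-15 in chains with a blank ID
print(split_terms("12-15."))
```

Each subcategory stays attached only to the value in its own comma-separated term, which is why :50.B,.D refers to all residues in chain D rather than residue 50 in chain D.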

More examples:

:.b

- chain B

:.a@n,ca,c,o

- peptide backbone atoms in chain A

:195.a,221.a@n,ca,c,o

- peptide backbone atoms in residues 195 and 221 of chain A

:522.water

- water residue 522 (a HETATM residue named HOH or WAT without a chain ID in the input file)

@.a

- all atoms with alternate location identifier A

WILD CARDS

The global wild card * matches all atoms in a residue or all residues in a model. It stands alone as a symbol, that is, it cannot be used to match parts of names, such as G* or *A. The partial wild card = matches parts of atom or residue names but not parts of residue sequence numbers; similarly, the single-character wild card ? matches single characters within residue or atom names but not single digits within residue sequence numbers. For example:

#1:12@*
- or -
#1:12

- all atoms in residue 12 of model 1

#0,1,2:50-*@CA

- all alpha carbon atoms in residues 50 to the end of models 0, 1 and 2

#2:G??

- all residues which have three-letter names beginning with G in model 2

:fmn@?1

- atoms within residue FMN which have two-letter names ending with 1

@S=

- all atoms which have names beginning with S; in general, this will be all sulfur atoms

#0@H@H?@H??

- all atoms with one-, two-, or three-letter names beginning with H in model 0
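The name wild cards can be approximated with regular expressions, as in the Python sketch below. The helper names are hypothetical, and two points are assumptions rather than documented behavior: that = may match zero characters, and that matching ignores case in the same way as ordinary name matching.

```python
import re

def name_pattern(spec):
    """Translate a name with = and ? wild cards into a compiled regular expression."""
    parts = []
    for ch in spec:
        if ch == "=":
            parts.append(".*")       # partial wild card (assumed to allow zero characters)
        elif ch == "?":
            parts.append(".")        # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matching_names(spec, names):
    """Return the names that match the whole wild-card pattern."""
    pattern = name_pattern(spec)
    return [name for name in names if pattern.fullmatch(name)]
```

With this sketch, matching_names("G??", ["GLY", "GLU", "G", "ALA"]) keeps only the three-letter names GLY and GLU.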

Zones

Zone specifiers indicate atoms and residues that are within or beyond a given distance of the referenced atom(s). z< and zr< specify all residues with any atom within the given distance from the referenced atoms. za< specifies all atoms within the given distance. z>, zr>, and za> yield the sets complementary to their < counterparts. For example,

#1:gtp za<10.5

specifies all atoms within 10.5 Å of any atom in residue GTP, model 1.
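The difference between atom-based and residue-based zones can be illustrated with plain Python. The sketch below is not Chimera code: atoms are represented as hypothetical dictionaries with "res" and "xyz" keys, and treating the cutoff as inclusive is an assumption.

```python
import math

def zone_atoms(atoms, reference, cutoff):
    """Like za<cutoff: atoms within cutoff of any reference atom."""
    return [a for a in atoms
            if any(math.dist(a["xyz"], r["xyz"]) <= cutoff for r in reference)]

def zone_residues(atoms, reference, cutoff):
    """Like z<cutoff or zr<cutoff: all atoms of residues with any atom in range."""
    hit = {a["res"] for a in zone_atoms(atoms, reference, cutoff)}
    return [a for a in atoms if a["res"] in hit]

def complement(atoms, subset):
    """Like the > forms: everything that is not in the corresponding < zone."""
    chosen = {id(a) for a in subset}
    return [a for a in atoms if id(a) not in chosen]
```

Because a reference atom is at distance zero from itself, both zone functions include the reference atoms, which is why a specification such as ions za<4 & ~ ions is needed to exclude them.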

Zone specifiers refer to atoms only, not surfaces; however, the command zonesel allows basing zones on surfaces and including surfaces in zones.

Built-in Classifications

Any of the entries in the following sections of the Select menu can be used for command-line atom specification:

Chemistry
Residue... amino acid category
Structure

For example, Structure includes

type of molecule: protein, nucleic acid, markers
protein secondary structure: helix, strand, coil
automatic categories for surface calculation: main, ligand, solvent, ions

... whereas Chemistry includes all of the element symbols (C, Fe, etc.) and many functional groups (disulfide, phosphate, ether O, etc.). The same pattern of capitalization and spaces, if any, as shown in the Select menu should be used. Where there is ambiguity, the parent menu should be included in the specification, for example, “IDATM type.H” or “element.H” instead of “H” alone. The parent menu can be included in this manner even when not necessary.

One can also use custom amino acid categories that have been defined with ResProp and custom categories for surface calculation that have been defined with the command msms cat (surfcat). If a saved selection has the same name as a surface category, the saved selection will be used.

Some examples:

side chain/base.without CA/C1'
- or -
without CA/C1'

- atoms in amino acid side chains (not including CA) and atoms in nucleic acid bases (not including C1'); using “with” instead of “without” would include CA/C1'

#1 & Mg

- magnesium atoms/ions in model 1

helix & positive

- residues in helices that are also in the positive amino acid category

carboxylate

- atoms in carboxylate groups

solvent

- atoms automatically categorized as solvent

Attributes

Attributes are properties of atoms, residues and models. The slash mark / indicates specification by attribute name and value. The symbol for the relevant category (@ for atom attributes, : for residue attributes, # for model attributes) must precede the slash mark, although it need not be immediately adjacent.

Multiple attributes at the same reference level (different atom properties, for example) can follow a single slash mark and should be separated by and or or. When and and or occur in the same list, and has higher priority (and-separated lists can be considered as grouped within parentheses).

The attribute names are case-sensitive; the attribute values, if any, are case-sensitive if specified with ==, but not if specified with =. Attribute values containing spaces (some color names, for example) must be enclosed by double quotes. The exclamation mark ! indicates that the atoms, residues, or models must not match the subsequent attribute specification. For yes/no properties the syntax is !attribute_name, and for multivalued properties the syntax is attribute_name!=value.

Attributes with numerical values can also be used with > (greater than), < (less than), >= (greater than or equal to), and <= (less than or equal to).
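The way attribute tests combine can be sketched with a toy evaluator in Python. It is not how Chimera evaluates specifications: attributes are held in a plain dictionary, the function names are hypothetical, an unassigned attribute is assumed never to match a comparison, and != is assumed to ignore case like =.

```python
import operator

# Longest symbols first so that ">=" is not read as ">" followed by "=".
_OPS = [("!=", operator.ne), (">=", operator.ge), ("<=", operator.le),
        ("==", operator.eq), ("=", operator.eq),
        (">", operator.gt), ("<", operator.lt)]

def _test(term, attrs):
    """Evaluate one test such as 'color!=green', 'bfactor>=50', 'label' or '!label'."""
    term = term.strip()
    for symbol, compare in _OPS:
        if symbol in term:
            name, wanted = (part.strip() for part in term.split(symbol, 1))
            actual = attrs.get(name)
            if actual is None:
                return False                     # assumption: unassigned never matches
            if isinstance(actual, (int, float)) and not isinstance(actual, bool):
                return compare(actual, float(wanted))
            actual, wanted = str(actual), wanted.strip('"')
            if symbol != "==":                   # only == is case-sensitive
                actual, wanted = actual.lower(), wanted.lower()
            return compare(actual, wanted)
    if term.startswith("!"):
        return not attrs.get(term[1:].strip())   # yes/no attribute must be false
    return bool(attrs.get(term))                 # yes/no attribute must be true

def matches(tests, attrs):
    """'and' binds more tightly than 'or': an or-list of and-groups."""
    return any(all(_test(t, attrs) for t in group.split(" and "))
               for group in tests.split(" or "))
```

With this sketch, the test list color=yellow or color=blue and label accepts any yellow atom but accepts a blue atom only if it is also labeled.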

Color values can be specified:

by name, built-in or defined previously with colordef
as a comma-separated list of component values red,green,blue,alpha where each value can range from zero to 1 and specification of alpha (1 – transparency) is optional
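As an illustration of the component-list form, the following Python sketch checks a red,green,blue[,alpha] string and converts alpha to transparency. The function names are hypothetical and the sketch does not reflect how Chimera stores colors internally.

```python
def parse_color(value):
    """Parse 'red,green,blue' or 'red,green,blue,alpha' into a tuple of floats."""
    parts = [float(p) for p in value.split(",")]
    if len(parts) not in (3, 4):
        raise ValueError("expected red,green,blue or red,green,blue,alpha")
    if any(not 0.0 <= p <= 1.0 for p in parts):
        raise ValueError("each component must lie between 0 and 1")
    return tuple(parts)

def transparency(alpha):
    """alpha is 1 - transparency, so transparency is 1 - alpha."""
    return 1.0 - alpha
```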

When placed before an attribute name, the caret ^ indicates that the atoms, residues, or models have not been assigned any value for the attribute. For example, :/^kdHydrophobicity designates residues (such as water or nucleic acids) that lack a Kyte-Doolittle hydrophobicity assignment.

The operators ~ and !~ can be used instead of = and !=, respectively, to indicate that the subsequent string should be treated as a regular expression.

SELECTED ATOM ATTRIBUTES ^**
Atomspec Usage	Explanation
altLoc=altloc	altloc is the alternate location identifier of the atom
areaSAS=sasa	sasa is the solvent-accessible surface area of the atom (available when a molecular surface has been computed)
areaSES=sesa	sesa is the solvent-excluded surface area of the atom (available when a molecular surface has been computed)
bfactor=bfactor	bfactor is the B-factor value of the atom (see also Thermal Ellipsoids)
color=color	color is the color of the atom (assigned on a per-atom basis; see coloring hierarchy)
defaultRadius=rad	rad is the default VDW radius of the atom in Å
display	whether display is enabled at the atom level (see display hierarchy)
drawMode=mode	mode can be 0 (synonyms: dot, wire, wireframe), 1 (sphere, cpk, space-filling), 2 (endcap, stick), or 3 (ball, ball and stick, ball-and-stick, ball+stick, bs, b+s); see draw mode
element=atno	atno is the atomic number or the element symbol
idatmType=type	type is the atom type
label	whether the atom is labeled
label=label	label is the text of the atom label
labelColor=labcolor	labcolor is the color of the atom label
name=name	name is the atom name
occupancy=occupancy	occupancy is the occupancy value of the atom
pdbSegment=segid	segid is the segment identifier of the atom
radius=radius	radius is the current VDW radius of the atom in Å (may have been changed by the user from the default VDW radius)
serialNumber=n	n is the atom serial number in the input file
surfaceCategory=category	category is the name of the surface calculation category to which the atom has been assigned automatically or manually using surfcat
surfaceDisplay	whether molecular surface display is turned on for the atom (however, this can be true even for atoms that do not contribute to the molecular surface)

Examples:

@ca/!label and color!=green and color!=red
- or -
@/name=ca and !label and color!=green and color!=red

- atoms named CA which are not labeled, and are not green or red

@n/drawMode=1 and color=green

- atoms named N that are green and drawn as spheres

@n/drawMode=1 or color=green

- atoms named N that are green and/or drawn as spheres

@/color=yellow or color=blue and label

- atoms that are yellow and atoms that are both blue and labeled

@/color!=yellow or color!=blue

- all atoms, because if an atom is yellow it fulfills the criterion of not being blue, and vice versa

@/bfactor>=50

- atoms with B-factor values greater than or equal to 50

@/bfactor>=20 and bfactor<=40

- atoms with B-factor values ranging from 20 to 40

SELECTED RESIDUE ATTRIBUTES ^**
Atomspec Usage	Explanation
areaSAS=sasa	sasa is the solvent-accessible surface area of the residue (available when a molecular surface has been computed)
areaSES=sesa	sesa is the solvent-excluded surface area of the residue (available when a molecular surface has been computed)
isHelix	whether the residue is in a helix (true is only possible for amino acids)
isHet	whether the residue is in PDB HETATM records (or the mmCIF equivalent)
isSheet OR isStrand	whether the residue is in a beta strand (true is only possible for amino acids)
kdHydrophobicity=value	value is the Kyte-Doolittle hydrophobicity of the amino acid residue
numAtoms=N	N is the total number of atoms in the residue
phi=angle	angle is the protein/peptide backbone φ dihedral angle (C(i-1)-N-CA-C)
psi=angle	angle is the protein/peptide backbone ψ dihedral angle (N-CA-C-N(i+1))
ribbonColor=ribcolor	ribcolor is the color of the residue's ribbon segment (see coloring hierarchy)
ribbonDisplay	whether ribbon display is turned on for the residue (however, this can be true even when the residue is a type that does not have any ribbon, such as water)
segment=segname	segname is the name of the segment to which the residue belongs (read from PSF, a format used for trajectory input)
ssId=N	N is the secondary structure element identifier, for example, 1 for residues in the first helix and first strand (starting from the N-terminus)
type=resname	resname is the residue name
uniprotIndex=N	N is the residue number in the corresponding UniProt sequence (discerned from HEADER and SEQRES information in the structure PDB file using a web service provided by the RCSB PDB)

Helix and strand assignments are taken from the input structure file. When the input file lacks secondary structure information, ksdssp is called automatically to generate helix and strand assignments. The ksdssp command (or compute SS in the Model Panel) can also be used to recompute assignments.

Examples:

:/type!=gly and type!=pro

- all residues not named GLY or PRO

:/isStrand :/isHelix
- or -
:/isStrand or isHelix

- all amino acid residues in beta strands or helices

:/isStrand and isHelix

- nothing, because the criteria are mutually exclusive

SELECTED MOLECULE MODEL ATTRIBUTES ^**
Atomspec Usage	Explanation
ballScale=factor	factor is ball radius relative to VDW radius
color=color	color is the color assigned on a per-model basis (see coloring hierarchy)
display	whether display is enabled at the model level (see display hierarchy)
lineWidth=width	width is the linewidth of bonds in the model in the wire representation
name=name	name is the name of the model
numAtoms=N	N is the total number of atoms in the model
numResidues=M	M is the total number of residues in the model
ribbonInsideColor=color	color is the color used for the insides of peptide/protein helix ribbon segments
stickScale=factor	factor is stick radius relative to bond radius (the default bond radius is 0.2 Å)

^**Additional attributes can be created:

arbitrarily with Define Attribute, defattr, or setattr
by combining other attributes using the Attribute Calculator
by various other Chimera tools such as Add Charge, Values at Atom Positions, and Multalign Viewer

Attribute Names

The preceding tables (for atoms, residues, and molecule models) include the names of some commonly used attributes. Additional attributes are listed in attribute inspector dialogs, and yet others may be generated only when a particular tool is used, or created arbitrarily by the user.

Attribute name lookup in Chimera:

attribute names are shown in the balloon help of attribute inspector dialogs
Select by Attribute lists most numerical, string-valued, and boolean attributes of atoms, residues, and molecule models; Render by Attribute lists most numerical ones
the command list resattr lists most residue attributes
data descriptors (attributes) defined in the C++ layer of Chimera can be listed by entering the following into IDLE (under Tools... General Controls):
help(chimera.Class)
where Class can be Atom, Residue, or Molecule. This approach also works for Bond, PseudoBond, and PseudoBondGroup. Attributes defined in the Python layer will not be included, however.

Combinations

Atom specifications can be combined with the operators:

& for intersection (AND)
| for union (OR)
~ for negation (NOT)

Space-delimiting these operators is optional. When & and | occur in the same list, & has higher priority (&-separated lists can be considered as grouped within parentheses).

Note that a different set of operators (and, or, !, etc.) is used to combine attribute tests; however, both types of operators can be used within the same specification.
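These operators behave like set operations, which can be illustrated with Python sets (where & also binds more tightly than |). The atom labels and category contents below are made-up toy data, and negate is a hypothetical helper standing in for ~ applied relative to all atoms.

```python
def negate(universe, chosen):
    """Like ~ : every atom that is not in the given set."""
    return universe - chosen

# Toy data: labels stand in for atoms.
universe = {"CYS5@SG", "CYS9@SG", "CYS20@SG", "MET1@SD", "CYS5@CA"}
cys = {"CYS5@SG", "CYS9@SG", "CYS20@SG", "CYS5@CA"}
sulfur = {"CYS5@SG", "CYS9@SG", "CYS20@SG", "MET1@SD"}
disulfide = {"CYS5@SG", "CYS9@SG"}

# Analogous to  :cys & S & ~disulfide
free_cys_sulfur = cys & sulfur & negate(universe, disulfide)

# & has higher priority than |, so this groups as  sulfur | (cys & disulfide)
grouped_by_priority = sulfur | cys & disulfide
grouped_explicitly = (sulfur | cys) & disulfide
```

In this toy data, free_cys_sulfur contains only CYS20@SG, and the two groupings of | and & give different results, which is why the priority rule matters.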

Examples:

#1:/type=asp or type=glu & #0 z<10
- or -
#1:asp,glu & #0 z<10

- aspartate residues and glutamate residues in model 1 that are within 10 Å of model 0

:cys@sg & ~disulfide
- or -
:cys&S&~disulfide

- cysteine sulfur atoms not participating in a disulfide bond

ions za<4 & ~ ions

- atoms within 4 Å of atoms categorized as ions, excluding the ions themselves

ligand z<5 &~ ligand &~ solvent

- residues with any atoms within 5 Å of residues categorized as ligand, excluding ligand and solvent

Ng+|N3+

- guanidinium nitrogens and sp³-hybridized, formally positive nitrogens (see atom types)

Surfaces and Other Model Types

Some commands can apply to things other than molecule models (atomic coordinates) and molecular surfaces. To handle such cases, “atom specification” has been extended as follows:

an entire surface model (all of its pieces) can be specified by model number preceded by #, or by selection and using the word selected, sel, or picked
surface pieces (independently selectable subparts of a surface model) can be specified individually by selection from the screen and using the word selected, sel, or picked
a single surface piece can be specified by :name or #N:name where N is model number, but note:
- this is only available when the individual surface piece has a name (not the surface model name), e.g., multiscale or sym low-resolution surfaces, IMOD segmentation surfaces; on mouseover, an “atomspec balloon” shows surface model number and surface piece name, with “?” indicating a lack of name
- only a single piece at a time can be specified by name in a command, as name is handled as a simple string, not as a list or range
other types of models can be specified by model number preceded by #
selections containing surfaces or other model types can be named and later referred to by name

Which of these can be used and whether they can be combined with atom specifications depends on the command.

UCSF Computer Graphics Laboratory / May 2019