
Sequence Coloring Format

Sequence coloring format (SCF) is a simple, plain-text file format used to color sequences and structures to match. Opening an SCF file using the Sequence Viewer context menu adds colored regions to the sequence window and colors the corresponding residues of any associated structures.

SCF files were first written by the program JEvTrace and subsequently by the ConSurf Server for display in Chimera (and now ChimeraX). They can also be generated “by hand” in a text editor or programmatically. JEvTrace is described in:

JEvTrace: refinement and variations of the evolutionary trace in JAVA. Joachimiak MP, Cohen FE. Genome Biol. 2002;3(12):RESEARCH0077.

The regions may contain any subset of residues in any subset of the sequences shown in the Sequence Viewer. Only the positions to be colored are encoded in an SCF file. In the file, columns are space-separated and need not be aligned. Comments starting with “//” or “#” at the end of a line are optional. If a comment is present, it is used as the region name, and lines with both the same comment and the same color are used to define a single region.
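As a rough illustration of the layout described above, the following Python sketch separates the space-separated integer fields of a line from its optional trailing comment. The function name split_scf_line is hypothetical and is not part of ChimeraX; the sketch assumes that the first “//” or “#” on a line starts the comment, which may not match every detail of the actual ChimeraX reader.

```python
def split_scf_line(line):
    """Split one SCF line into its integer fields and an optional region name.

    Assumption: the first "//" or "#" on the line begins the comment.
    """
    markers = [i for i in (line.find("//"), line.find("#")) if i != -1]
    if markers:
        cut = min(markers)
        data, comment = line[:cut], line[cut:]
        # Drop the comment marker and surrounding whitespace.
        name = comment.lstrip("/#").strip() or None
    else:
        data, name = line, None
    # Columns are space-separated and need not be aligned.
    return [int(tok) for tok in data.split()], name
```

Lines that return the same name and the same color fields would then be collected into a single region.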

Two variants of the format can be read:

In the older format, each line refers to a single position in the alignment. A line includes five integer fields: the alignment position minus one, the sequence index number (0 to indicate all sequences), and the red, green, and blue components of the color, each on a scale from 0 to 255. Example:

337    0      0     0   255  # column 338
340    1      0   255   255 
338    9    255   255     0

The first line creates a blue region named “column 338” at position 338 in all of the sequences. The second line creates a cyan region at position 341 in the first (top) sequence. The third line creates a yellow region at position 339 in the ninth sequence.
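The off-by-one position and the special meaning of sequence index 0 can be made concrete with a short Python sketch. The function parse_old_scf_line and the dictionary keys it returns are hypothetical names chosen for illustration; this is not the ChimeraX implementation.

```python
def parse_old_scf_line(line):
    """Interpret one older-format SCF line (five integer fields)."""
    # Discard any trailing comment before splitting into fields.
    data = line.split("#")[0].split("//")[0]
    pos0, seq, r, g, b = (int(tok) for tok in data.split())
    return {
        "position": pos0 + 1,                  # file stores position minus one
        "sequence": None if seq == 0 else seq,  # 0 means all sequences
        "rgb": (r, g, b),
    }

for text in ("337    0      0     0   255  # column 338",
             "340    1      0   255   255 ",
             "338    9    255   255     0"):
    print(parse_old_scf_line(text))
```

Here a "sequence" value of None stands for all sequences, and the other values count sequences from 1 at the top of the alignment.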

The newer format is similar, but contains additional columns for easier specification of residue and sequence ranges. Example:

8   28    -1     -1    0   255   255      
8   28    9     9    255   175   175    // pinkish
8   28    10     10    255   175   175   // pinkish

The first two fields in each line give a range of positions in the alignment: the starting position minus one and the ending position minus one. The next two fields give a range of sequences in the alignment, from starting sequence to ending sequence (zeroes indicate all sequences, whereas values of -1 indicate that the line should be ignored by ChimeraX). The last three fields specify the red, green, and blue components of the color, each on a scale from 0 to 255. In the example, the first line is ignored because its sequence fields are -1, and the other two lines together create a region named “pinkish” spanning positions 9-29 in the ninth and tenth sequences.
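Since SCF files can be generated programmatically, a minimal Python sketch of writing one newer-format line is given here. The function format_scf_line and its parameters are hypothetical; it takes 1-based, inclusive positions as they appear in the Sequence Viewer and subtracts one when writing, as the format requires.

```python
def format_scf_line(start, end, first_seq, last_seq, rgb, name=None):
    """Build one newer-format SCF line from 1-based, inclusive ranges."""
    if not all(0 <= c <= 255 for c in rgb):
        raise ValueError("color components must be in the range 0-255")
    if start < 1 or end < start:
        raise ValueError("positions must be 1-based with start <= end")
    # Positions are stored minus one; sequence numbers are written as given.
    fields = [start - 1, end - 1, first_seq, last_seq, *rgb]
    line = " ".join(str(f) for f in fields)
    if name:
        line += f" // {name}"  # the comment becomes the region name
    return line

print(format_scf_line(9, 29, 9, 10, (255, 175, 175), "pinkish"))
# prints: 8 28 9 10 255 175 175 // pinkish
```

This writes the “pinkish” region as a single line covering sequences 9 through 10, rather than one line per sequence as in the example above; by the field definitions, both forms describe the same positions and sequences.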

UCSF Resource for Biocomputing, Visualization, and Informatics / June 2018