Opened 5 years ago
Closed 5 years ago
#3570 closed enhancement (fixed)
Morph mismatches protein glycans
| Reported by: | | Owned by: | Tom Goddard |
|---|---|---|---|
| Priority: | moderate | Milestone: | |
| Component: | Structure Comparison | Version: | |
| Keywords: | | Cc: | |
| Blocked By: | | Blocking: | |
| Notify when closed: | | Platform: | all |
| Project: | ChimeraX | | |
Description
The glycans fly all over the place when morphing between these two sars-cov-2 spike structures because they have different residue numbers for all the sugars.
Some really fancy atom pairing code might be able to fix this. It would have to be able to match up two branched glycans which may or may not be identical and consider their attachment to already matched amino acids. Tristan Croll is using some approximate graph matching code that could be helpful.
This is probably way too much work for the small expected use.
Attachments (2)
Change History (7)
by , 5 years ago
| Attachment: | 6vxx_1_1_1.pdb added |
|---|
by , 5 years ago
| Attachment: | 6vsb_1_1_1.pdb added |
|---|
sars-cov-2 spike from http://www.charmm-gui.org/?doc=archive&lib=covid19
comment:1 by , 5 years ago
| Reporter: | changed from to |
|---|
Begin forwarded message:
From: "Browne, Kristen (NIH/NIAID) [C]" <>
Subject: morphing glycosylated structures
Date: July 30, 2020 at 5:16:53 PM PDT
To: Tom Goddard
Tom:
Do you have a strategy to do a morph between two structures with glycans attached? I’m doing a morph between the open and closed spike from structures here: http://www.charmm-gui.org/?doc=archive&lib=covid19
But the glycans go a bit crazy in the output. They’re… stretched and weird… Is there a way to do this?
Thanks,
K
Begin forwarded message:
From: Tom Goddard
Subject: Re: morphing glycosylated structures
Date: July 30, 2020 at 7:54:30 PM PDT
To: "Browne, Kristen (NIH/NIAID) [C]"
Hi Kristen,
The morph between 6vxx and 6vsb has the glycans flying all over because the residue numbering of the glycans is completely different in the two structures, which is not surprising since the structures were solved by different labs. The morph aligns the two amino acid sequences, so those are ok and the amino acids don't fly around. But PDB models with glycans give the glycans different residue numbers than the amino acids they are attached to. In fact each glycan is made up of multiple residues, I guess one per sugar. So I see in the 6vxx structure you gave me glycan residue 2178 in chain A is attached to amino acid 1158. In 6vsb the same glycan residue number 2178 in chain A is attached to amino acid 801. So morph tries to fly glycan 2178 between two distant amino acids.
If the morph code were much smarter it might ignore all those glycan residue numbers and try to figure out which glycans match by looking at which amino acids they are connected to. Matching the glycans would also require figuring out which sugar branches match which branches. It is not that smart. If we had twice as many programmers and twice as much funding we might be able to do that kind of data rescue operation.
So the basic problem is the glycan numbering is more or less random and the two labs that solved 6vxx and 6vsb have completely different numbering. It would be a lot of work to fix the numberings so they match. Maybe some script could assign every glycan a numbering derived from its amino acid numbering, like the glycan on amino acid 801 could have glycan sugars numbered 80100, 80101, 80102, ... The branches of the glycan would have to be put in a consistent order too.
I'll make a ChimeraX ticket to describe the issue. We've seen it before. Good to have it documented.
Tom
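The renumbering idea in the message above could be sketched roughly as follows. This is only an illustration in plain Python, not ChimeraX code: the function name `renumber_glycan`, the `links` input format, and the choice to order branches by linkage atom name are all assumptions, and it assumes fewer than 100 sugars per glycan.

```python
from collections import deque

def renumber_glycan(anchor_number, root_sugar, links):
    """Return {old_sugar_id: new_number} for one glycan tree.

    anchor_number: residue number of the amino acid the glycan is attached to.
    root_sugar:    id of the sugar bonded to that amino acid.
    links:         {sugar_id: [(linkage_atom_name, child_sugar_id), ...]}
                   (hypothetical input format, not a ChimeraX structure).
    """
    new_numbers = {}
    queue = deque([root_sugar])
    while queue:
        sugar = queue.popleft()
        # 801 -> 80100, 80101, 80102, ... in breadth-first order.
        new_numbers[sugar] = anchor_number * 100 + len(new_numbers)
        # Visit branches sorted by linkage atom name so the order does not
        # depend on the original (arbitrary) residue numbers.
        for _, child in sorted(links.get(sugar, [])):
            queue.append(child)
    return new_numbers

# Example: a small branched glycan on amino acid 801.
links = {
    2178: [("O4", 2179)],
    2179: [("O4", 2180)],
    2180: [("O6", 2182), ("O3", 2181)],
}
numbers = renumber_glycan(801, 2178, links)
```

Two glycans only get matching numbers this way if they have the same branching pattern, so this would not help when the sugars differ between the two models.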
comment:2 by , 5 years ago
Maybe there could be an option so that any residue that is not an amino acid or nucleic acid (those are handled by sequence alignment) gets paired up only by its connectivity to amino and nucleic acids. All atom names in those residues would have to match exactly, and no two atoms in such a residue could have the same atom name. All those restrictions would probably be satisfied by the glycans. I looked at the glycans attached to amino acid /A:74 of the attached two structures and it looks like they might match, although they are so complex I did not verify that they are even the same glycan. Maybe something like this could be funded by the NIAID / NETE contract -- not sure if it fits any of the deliverables in that contract (which has not yet been finalized).
comment:3 by , 5 years ago
This should be fairly doable for N-linked glycans. In mammals the core always has a fixed structure: ASN - NAG - (1,4) - NAG - (1,4) - BMA, sometimes with a FUC (1,6) linked to the first NAG. Beyond that, you almost always have two MAN residues (1,3) and (1,6) linked to the BMA... then the branching gets a bit more complicated. But if you work outwards from the ASN it shouldn't be incredibly difficult to match up.
comment:4 by , 5 years ago
I am hoping it is easy to match up. At least in the one case I looked at, all of the sugar atom names were unique, so I can just match requiring exactly the same atom names and bonds. Of course different sugars won't match, but morph in general just drops any atoms that don't match. I expect the glycan motions will be bad, passing atoms right through others in most cases, so it would really need some MD with dihedral angle driving to get something that looks nice. I don't plan on doing that -- although I am curious if the dihedral driving would get stuck as the glycan tries to unfurl into a totally different conformation.
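A minimal sketch of the residue-level test described in this comment (unique atom names, same atom names, same bonds) might look like the following. It is plain Python on a made-up dictionary representation of a residue, not the ChimeraX morph code; `residues_match` and the `"atoms"` / `"bonds"` keys are hypothetical names.

```python
def atom_names_unique(atom_names):
    """True if no atom name occurs twice in the residue."""
    return len(set(atom_names)) == len(atom_names)

def residues_match(res_a, res_b):
    """Decide whether two non-sequence residues can be paired atom by atom.

    Each residue is a dict (hypothetical format):
      "atoms": list of atom names, e.g. ["C1", "C2", "O5"]
      "bonds": list of (atom_name, atom_name) pairs within the residue
    """
    # Atom names must be unique, otherwise pairing by name is ambiguous.
    if not (atom_names_unique(res_a["atoms"]) and atom_names_unique(res_b["atoms"])):
        return False
    # Both residues must have exactly the same set of atom names.
    if set(res_a["atoms"]) != set(res_b["atoms"]):
        return False
    # Bonds are compared as unordered pairs of atom names.
    bonds_a = {frozenset(bond) for bond in res_a["bonds"]}
    bonds_b = {frozenset(bond) for bond in res_b["bonds"]}
    return bonds_a == bonds_b

# Example with a cut-down, illustrative sugar fragment.
sugar_1 = {"atoms": ["C1", "C2", "O5"], "bonds": [("C1", "C2"), ("C1", "O5")]}
sugar_2 = {"atoms": ["O5", "C2", "C1"], "bonds": [("O5", "C1"), ("C2", "C1")]}
sugar_3 = {"atoms": ["C1", "C2", "O5", "N2"], "bonds": [("C1", "C2"), ("C1", "O5"), ("C2", "N2")]}
```

Note that a test like this looks only at atom names and bonds, so two different sugars that happen to use identical atom names would still be treated as a match.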
comment:5 by , 5 years ago
| Resolution: | → fixed |
|---|---|
| Status: | assigned → closed |
Fixed.
I investigated how morph is messing up the glycans. According to the morph documentation it is supposed to handle covalently linked ligands such as glycans. I fixed the errors in that code and now it correctly morphs the attached files including the glycans.
With covalently bound ligands, morph does not consider the residue numbers. Instead it looks for a non-sequence residue attached to an already paired (e.g. sequence aligned) residue, and pairs such residues when both morphed structures have one bound to the same matched residue through the same atom name in that already paired residue. This allows it to pair up a sugar bound to an amino acid. Then it does another round, which allows it to pair a sugar bound to that sugar. It keeps pairing residues until it cannot pair any more.
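The round-by-round pairing described above can be illustrated with a small standalone sketch. This is not the actual morph implementation: `pair_attached`, `attachment_keys`, and the tuple format for inter-residue bonds are assumptions made for the example, and it ignores corner cases such as a ligand bonded to two already paired residues.

```python
def attachment_keys(links, paired_ids):
    """Map (paired residue, atom name on that residue) -> unpaired neighbor.

    links: iterable of (res1, atom1, res2, atom2) inter-residue bonds
           (hypothetical format).
    """
    keys = {}
    for res1, atom1, res2, atom2 in links:
        for paired, atom, other in ((res1, atom1, res2), (res2, atom2, res1)):
            if paired in paired_ids and other not in paired_ids:
                keys[(paired, atom)] = other
    return keys

def pair_attached(links_a, links_b, pairs):
    """Extend pairs ({residue in A: residue in B}) to covalently attached residues.

    Residue numbers of the attached residues are never compared; only the
    paired residue they are bonded to and the atom name used for the bond.
    """
    pairs = dict(pairs)
    while True:
        keys_a = attachment_keys(links_a, set(pairs))
        keys_b = attachment_keys(links_b, set(pairs.values()))
        new_pairs = {}
        for (res_a, atom), neighbor_a in keys_a.items():
            neighbor_b = keys_b.get((pairs[res_a], atom))
            if neighbor_b is not None:
                new_pairs[neighbor_a] = neighbor_b
        if not new_pairs:
            return pairs          # no further residues could be paired
        pairs.update(new_pairs)   # next round can pair sugars bound to these

# Example: same glycan, different sugar numbering in the two models.
links_a = [(801, "ND2", 2178, "C1"), (2178, "O4", 2179, "C1")]
links_b = [(801, "ND2", 3001, "C1"), (3001, "O4", 3002, "C1"), (3002, "O4", 3003, "C1")]
result = pair_attached(links_a, links_b, {801: 801})
```

In this example the first round pairs 2178 with 3001 through the ASN ND2 atom, the second round pairs 2179 with 3002 through O4, and sugar 3003 stays unpaired because the first model has nothing attached at that position.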