[chimera-dev] About Backbone Dependent Library
Eric Pettersen
pett at cgl.ucsf.edu
Wed Sep 16 17:31:34 PDT 2009
Hi Yang,
The details of the library format can be found here: http://dunbrack.fccc.edu/bbdep/bbdepformat.php
It includes a list of the 850 proteins they used and their filtering
criteria. According to that document, N is "the number of ... side
chains in this bin in the data set".
I ran a script across the PDBs in the set and got a number very
similar to yours: 379. So either N is wrong, or the description of N
is wrong, or the description of the data set is wrong. There are only
8919 ASNs total in the chains of the data set, so there is basically no
way that 2348 of them (more than a quarter) could fall in a single
10 degree by 10 degree bin.
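
For reference, here is a minimal sketch of this kind of per-bin count. It
is not the script that was actually run. It assumes the (phi, psi) values
have already been extracted (for example from Chimera) into tuples of
(residue name, phi, psi), and it assumes that a bin labeled -70 covers
[-75, -65) and that 180 is folded into -180; both conventions should be
checked against the format document. The names bin_label, count_bins, and
sample are hypothetical, and the sample angles are made up for
illustration.

```python
import math
from collections import Counter

BIN_WIDTH = 10  # degrees; the 10 x 10 degree bins discussed above


def bin_label(angle, width=BIN_WIDTH):
    """Return the label of the bin containing `angle` (in degrees).

    Assumption: bins are centered on multiples of `width`, so the bin
    labeled -70 covers [-75, -65). The label 180 is folded into -180.
    """
    label = math.floor(angle / width + 0.5) * width
    return -180 if label == 180 else label


def count_bins(residues, resname):
    """Count residues of type `resname` in each (phi, psi) bin.

    `residues` is an iterable of (name, phi, psi) tuples. Residues with
    a missing phi or psi (e.g. at chain termini) are skipped.
    """
    counts = Counter()
    for name, phi, psi in residues:
        if name != resname or phi is None or psi is None:
            continue
        counts[(bin_label(phi), bin_label(psi))] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    ("ASN", -72.3, -38.0),
    ("ASN", -66.1, -44.9),
    ("ASN", -64.9, -41.0),   # falls in the neighboring phi=-60 bin
    ("GLN", -70.0, -40.0),   # different residue type, not counted
    ("ASN", None, 150.0),    # no phi available, skipped
]
asn_counts = count_bins(sample, "ASN")
print(asn_counts[(-70, -40)])

# Sanity check on the numbers quoted in this thread: the fraction of
# all 8919 ASNs that each reported count would place in one bin.
total_asn = 8919
for n in (379, 2348):
    print(n, "->", round(100.0 * n / total_asn, 1), "% of all ASNs")
```

The last loop just restates the argument above as arithmetic: 379 is
about 4% of the ASNs in the set, while 2348 would be about 26% of them
in one bin.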
Chimera doesn't really care about N; it only cares about the p
values, so as long as they're right it's not a problem for Chimera.
Of course, wildly wrong N values are alarming! I'll be sending mail
to Roland Dunbrack to see if he has any insight on this issue, and
I'll cc you.
--Eric
Eric Pettersen
UCSF Computer Graphics Lab
http://www.cgl.ucsf.edu
On Sep 16, 2009, at 2:09 PM, ylei at ecs.umass.edu wrote:
> Hi all
>
> I am currently working on a project that needs to rebuild the
> statistics using modified protein structures, say, some other set of
> proteins.
>
> There is one quick question about interpreting the BBDep Library.
> What is the exact definition of the fourth column, "N", in the library
> file format? I thought that, for each residue type within a given
> (phi,psi) bin, it is the total number of sidechains found in all of
> the 850 proteins. However, I checked two amino acid types, ASN and
> GLN, using Chimera to return the (phi,psi) dihedral angles for each
> ASN and GLN residue among the 850 proteins, with tens of exceptions
> for which Chimera returned no dihedral angles (I just ignored them).
> Then I counted the number of sidechains in each 10 x 10 degree
> (phi,psi) bin. It turned out that this number is not consistent with
> the "N" value in BBDep. For example, for ASN at bin (phi=-70,psi=-40),
> Chimera returns 372 sidechains in total from those 850 proteins, while
> BBDep says 2348. The difference is this large in almost all of the
> bins. I thought there might be some error in my understanding of the
> "N" value; could you point it out for me? Or did the Dunbrack group
> use a set of proteins other than those 850?
>
> Best regards and thanks,
>
> Yang Lei
>
> Electrical & Computer Engineering
> University of Massachusetts, Amherst