Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#7901 closed defect (fixed)

PDB QS 2CSE

Reported by: kristen.browne@…
Owned by: Eric Pettersen
Priority: normal
Milestone:
Component: Performance
Version:
Keywords:
Cc: Tom Goddard, phil.cruz@…, michal.stolarczyk@…
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

Another hang at a similar spot:
27 October 2022,12:08:15           prefect.ShellTask            INFO     Using preset: Initial Styles / Original Look
27 October 2022,12:08:18           prefect.ShellTask            INFO     INFO:
27 October 2022,12:08:18           prefect.ShellTask            INFO     Preset implemented in Python; no expansion to individual ChimeraX commands available.

Kristen Browne, MSc, MscBMC


logs (10).txt

Attachments (1)

logs (10).txt (7.6 KB ) - added by kristen.browne@… 3 years ago.
Added by email2trac


Change History (12)

by kristen.browne@…, 3 years ago

Attachment: logs (10).txt added

Added by email2trac

comment:1 by Eric Pettersen, 3 years ago

Cc: Eric Pettersen added
Component: Unassigned → Performance
Owner: set to Tom Goddard
Platform: set to all
Project: set to ChimeraX
Status: new → assigned

It took 9 hours to execute "surface #1 enclose #1 grid 2.5 sharp true"

in reply to:  3 comment:2 by kristen.browne@…, 3 years ago

Is it possible to have some kind of progress message logged every 30 minutes, so that we know these long tasks are still processing rather than frozen? This would help us decide whether or not to kill a process.
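
One way to get a periodic message on the pipeline side is sketched below. It is only an illustration: `heartbeat`, `interval_s` and `emit` are hypothetical names, not part of ChimeraX or Prefect, and the sketch assumes the long task is launched from a Python wrapper process. It shows that the wrapper is still alive and waiting; it cannot by itself prove that the surface calculation is making progress.

```python
import threading
import time
from contextlib import contextmanager


@contextmanager
def heartbeat(label, interval_s=1800.0, emit=print):
    """Call emit() every interval_s seconds until the with-block exits."""
    stop = threading.Event()
    start = time.monotonic()

    def run():
        # Event.wait() returns False on timeout and True once stop is set,
        # so the loop ends promptly when the with-block finishes.
        while not stop.wait(interval_s):
            elapsed_min = (time.monotonic() - start) / 60.0
            emit(f"{label}: still running after {elapsed_min:.1f} min")

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    try:
        yield
    finally:
        stop.set()
        worker.join()
```

Wrapping the step that launches ChimeraX in `with heartbeat("2CSE"):` would then emit one line every 30 minutes for as long as that step runs.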


comment:3 by Tom Goddard, 3 years ago

The slow calculation and high memory use when making a solvent-excluded surface (SES) for the million-atom virus capsid 2CSE were discussed in ticket #7905.

https://www.rbvi.ucsf.edu/trac/ChimeraX/ticket/7905

The proposed solution was to not compute the SES surface for these very large structures. In ticket #7905 I computed the SES surface with grid spacing 2, which took 30 minutes, but the resulting surface produces a 2 GB glTF file, which is too large to be useful. I proposed setting a limit of, say, 300,000 atoms for calculating SES surfaces; larger structures should get a lower-resolution Gaussian surface instead.

As long as ChimeraX is still running, it is still computing the surface. I have never seen it hang. But run times can be arbitrarily long (tens of hours), and it will certainly crash when it runs out of memory. For these reasons the only sensible approach is for the script not to compute SES surfaces for very large structures.

comment:4 by Tom Goddard, 3 years ago

Cc: Tom Goddard added; Eric Pettersen removed
Owner: changed from Tom Goddard to Eric Pettersen

Reassigning to Eric because the NIH 3D script will need modifying to not run SES surface calculations on very large structures.

comment:5 by Eric Pettersen, 3 years ago

Cc: phil.cruz@… added

Looking back at the original NIH3D script, for structures of more than 250K atoms it would use "surface #1 enclose #1 resolution 18 grid 6", which for this structure takes 7.7 seconds. I will change the presets to do this instead of the hours-long "grid 2.5" that they do now in these cases.
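
The selection described above might look roughly like the following. This is a sketch, not the actual preset code: the function name `surface_command` and the constant `ATOM_CUTOFF` are made up for illustration, and only the two command strings and the 250K figure come from this ticket.

```python
# Hypothetical constant; 250K is the threshold quoted from the original script.
ATOM_CUTOFF = 250_000


def surface_command(num_atoms, model="#1"):
    """Return the ChimeraX surface command to run for a structure of this size."""
    if num_atoms > ATOM_CUTOFF:
        # Low-resolution Gaussian surface: seconds rather than hours for 2CSE.
        return f"surface {model} enclose {model} resolution 18 grid 6"
    # Solvent-excluded surface; only practical for smaller structures.
    return f"surface {model} enclose {model} grid 2.5 sharp true"
```

For a million-atom structure this returns the "resolution 18 grid 6" form; for a few thousand atoms it returns the "grid 2.5 sharp true" form.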

in reply to:  7 comment:6 by kristen.browne@…, 3 years ago

Thanks!

Do you think there's a practical time limit that we could set to "time out" processing of models with this in place?  Is there any reason an entry should take more than 1 hour to process?  2 hours?  I'm hoping to have a safety switch of sorts to cut off processing if things happen to get stuck (for whatever reason).

K


comment:7 by Eric Pettersen, 3 years ago

Resolution: fixed
Status: assigned → closed

In addition to using the faster "surface" command for large structures, I also had to improve the performance of the "protein" selector, which was looking up missing-structure pseudobond atoms in the overall atom list instead of a set -- which heavily impacts this CA-only structure with many missing-structure pseudobonds. That fix will be in tomorrow's daily build and release candidate. Will send a new version of the NIH Presets bundle in a separate email.
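
The general effect of that change can be illustrated outside ChimeraX. The sketch below is not the selector code; it uses plain integers as stand-ins for atoms, and all function names are hypothetical. It only shows why testing membership in a list is costly when done many times, while a set gives the same answer with constant-time lookups on average.

```python
import time


def hits_with_list(query_atoms, atoms):
    # 'atoms' is a list, so each 'in' test scans it linearly.
    return [a for a in query_atoms if a in atoms]


def hits_with_set(query_atoms, atoms):
    # Build the set once; each 'in' test is then a hash lookup.
    atom_set = set(atoms)
    return [a for a in query_atoms if a in atom_set]


def time_call(fn, *args):
    """Return the wall-clock seconds taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start
```

Comparing `time_call(hits_with_list, queries, atoms)` with `time_call(hits_with_set, queries, atoms)` on a few thousand queries against tens of thousands of atoms makes the difference visible; with many missing-structure pseudobonds the list version grows with the product of the two sizes.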

in reply to:  9 comment:8 by goddard@…, 3 years ago

I think it is a good idea to have the processing jobs killed after some timeout.  One hour is probably OK, but the value should be chosen by seeing how long jobs actually take, for example when running 3DPX data through the new pipeline.  It would be wise to log the killed jobs so they can be debugged.  The timeout is probably something Michal would add to the pipeline.
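
A minimal sketch of such a timeout, assuming the pipeline launches each job as a child process from Python. `run_with_timeout` and the logger name are hypothetical, and the one-hour default is just the figure floated above. It relies on `subprocess.run`, which kills the child and raises `TimeoutExpired` once the timeout expires.

```python
import logging
import subprocess
import sys

log = logging.getLogger("pipeline.timeout")


def run_with_timeout(cmd, timeout_s=3600.0):
    """Run cmd; return its exit code, or None if it was killed on timeout."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Record the killed job so it can be debugged later.
        log.error("Killed after %.1f s: %s", timeout_s, cmd)
        return None
    return completed.returncode


if __name__ == "__main__":
    # Demo with a stand-in child process that outlives a short timeout.
    slow = [sys.executable, "-c", "import time; time.sleep(2)"]
    print(run_with_timeout(slow, timeout_s=0.2))
```

Note that only the direct child is killed; if the command is a shell wrapper that starts further processes, those would need separate handling.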



comment:9 by Eric Pettersen, 3 years ago

Cc: michal.stolarczyk@… added

Because large structures now use Gaussian surfaces, the "surface hydrophobicity" preset no longer works on them (the surface patches are no longer associated with atoms), so there are two calls to that preset that need to be guarded by "num_atoms < HUGE_CUTOFF" tests in the NIH3D script. Michal, do you want me to make a pull request or do you want to make the changes?
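
The guard could take a shape like the one below. This is a sketch only: `HUGE_CUTOFF` and `num_atoms` are the names quoted above, but the value assigned here and the helper `presets_to_apply` are assumptions for illustration, not the contents of the NIH3D script.

```python
# Assumed value; the ticket names HUGE_CUTOFF but does not state its value.
HUGE_CUTOFF = 250_000

# Presets that need surface patches to be associated with atoms.
ATOM_DEPENDENT_PRESETS = {"surface hydrophobicity"}


def presets_to_apply(num_atoms, requested):
    """Return the requested presets, skipping atom-dependent ones for huge structures."""
    if num_atoms < HUGE_CUTOFF:
        return list(requested)
    return [p for p in requested if p not in ATOM_DEPENDENT_PRESETS]
```

Filtering the list up front keeps the cutoff test in one place instead of repeating it around each preset call.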

in reply to:  11 comment:10 by michal.stolarczyk@…, 3 years ago

Thanks for the update, Eric. Yes, please create a PR with these changes.

Best,
Michal



comment:11 by Eric Pettersen, 3 years ago

I have opened a pull request.
