Opened 5 years ago

Closed 5 years ago

#3339 closed defect (fixed)

Easy to get surface cap deletion wrong

Reported by: Tristan Croll Owned by: Tom Goddard
Priority: normal Milestone:
Component: Surface Version:
Keywords: Cc:
Blocked By: Blocking:
Notify when closed: Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        Linux-3.10.0-1062.9.1.el7.x86_64-x86_64-with-centos-7.7.1908-Core
ChimeraX Version: 1.0rc202005270557 (2020-05-27 05:57:28 UTC)
Description
Looks related to #3222. I'd openend and closed a few different model/map combos with Clipper, and had one open when I closed - but try as I might, I can't find a scenario in this pipeline where the "cap far" drawing actually exists. Clipper doesn't use clip caps, and clipping planes weren't enabled before initialising the Clipper object. Printing [d.name for d in {top-level model}.all_drawings()] doesn't show a drawing of that name.

Possibly related (?): I'm also still seeing the occasional segmentation fault on exit (at least once in a session where I'd never used ISOLDE or Clipper). Finally managed to capture one under gdb, but it's the most spectacularly unhelpful stack trace I've ever seen (more helpful Python stack trace underneath):

{{{
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffee6ffd700 (LWP 10493)]
0x00007fffb30ed3a4 in ?? ()
(gdb) bt
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffee6ffd700 (LWP 10493)]
0x00007fffb30ed3a4 in ?? ()
(gdb) bt
#0  0x00007fffb30ed3a4 in  ()
#1  0x00007ffee6ffcdb0 in  ()
#2  0x00007ffee6ffcda0 in  ()
#3  0x000000005ed22d77 in  ()
#4  0x0000000000026f11 in  ()
#5  0x000000005ed22d77 in  ()
#6  0x000000000f77bb68 in  ()
#7  0x0000000000000001 in  ()
#8  0x0000000002d78da0 in  ()
#9  0x00007fffb3ff69e0 in  ()
#10 0x0000000002d78da0 in  ()
#11 0x00007fffb3ff6200 in  ()
#12 0x00007fffb3ff6200 in  ()
#13 0x00007fffb3ff69e0 in  ()
#14 0x00007fffb315efcc in  ()
#15 0x0000000000000001 in  ()
#16 0x00007ffee6ffd700 in  ()
#17 0x0000000000000000 in  ()
}}}


NOTE: Exception ignored in: <function Drawing.__del__ at 0x7f047851d200>
Traceback (most recent call last):
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/drawing.py", line 1113, in __del__
    self.delete()
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/core/models.py", line 94, in delete
    Drawing.delete(self)
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/drawing.py", line 1126, in delete
    raise RuntimeError('OpenGL context make_current() raised an error during Drawing.delete()  for drawing "%s"' % self.name) from e
RuntimeError: OpenGL context make_current() raised an error during Drawing.delete()  for drawing "cap far"
Exception ignored in: <function Drawing.__del__ at 0x7f047851d200>
Traceback (most recent call last):
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/drawing.py", line 1113, in __del__
    self.delete()
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/core/models.py", line 94, in delete
    Drawing.delete(self)
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/drawing.py", line 1126, in delete
    raise RuntimeError('OpenGL context make_current() raised an error during Drawing.delete()  for drawing "%s"' % self.name) from e
RuntimeError: OpenGL context make_current() raised an error during Drawing.delete()  for drawing "cap near"
Exception ignored in: <function Buffer.__del__ at 0x7f04761a8830>
Traceback (most recent call last):
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/opengl.py", line 2502, in __del__
    % self.shader_variable_name)
chimerax.graphics.opengl.OpenGLError: OpenGL buffer "position" was not deleted before graphics.Buffer destroyed
Exception ignored in: <function Buffer.__del__ at 0x7f04761a8830>
Traceback (most recent call last):
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/opengl.py", line 2502, in __del__
    % self.shader_variable_name)
chimerax.graphics.opengl.OpenGLError: OpenGL buffer "normal" was not deleted before graphics.Buffer destroyed
Exception ignored in: <function Buffer.__del__ at 0x7f04761a8830>
Traceback (most recent call last):
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/opengl.py", line 2502, in __del__
    % self.shader_variable_name)
chimerax.graphics.opengl.OpenGLError: OpenGL buffer "vcolor" was not deleted before graphics.Buffer destroyed
Exception ignored in: <function Buffer.__del__ at 0x7f04761a8830>
Traceback (most recent call last):
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/opengl.py", line 2502, in __del__
    % self.shader_variable_name)
chimerax.graphics.opengl.OpenGLError: OpenGL buffer "tex_coord" was not deleted before graphics.Buffer destroyed
Exception ignored in: <function Buffer.__del__ at 0x7f04761a8830>
Traceback (most recent call last):
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/opengl.py", line 2502, in __del__
    % self.shader_variable_name)
chimerax.graphics.opengl.OpenGLError: OpenGL buffer "position" was not deleted before graphics.Buffer destroyed
Exception ignored in: <function Buffer.__del__ at 0x7f04761a8830>
Traceback (most recent call last):
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/opengl.py", line 2502, in __del__
    % self.shader_variable_name)
chimerax.graphics.opengl.OpenGLError: OpenGL buffer "normal" was not deleted before graphics.Buffer destroyed
Exception ignored in: <function Buffer.__del__ at 0x7f04761a8830>
Traceback (most recent call last):
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/opengl.py", line 2502, in __del__
    % self.shader_variable_name)
chimerax.graphics.opengl.OpenGLError: OpenGL buffer "vcolor" was not deleted before graphics.Buffer destroyed
Exception ignored in: <function Buffer.__del__ at 0x7f04761a8830>
Traceback (most recent call last):
  File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/graphics/opengl.py", line 2502, in __del__
    % self.shader_variable_name)
chimerax.graphics.opengl.OpenGLError: OpenGL buffer "tex_coord" was not deleted before graphics.Buffer destroyed




Log:
UCSF ChimeraX version: 1.0rc202005270557 (2020-05-27)  
© 2016-2020 Regents of the University of California. All rights reserved.  
How to cite UCSF ChimeraX  




OpenGL version: 3.3.0 NVIDIA 440.33.01
OpenGL renderer: TITAN Xp/PCIe/SSE2
OpenGL vendor: NVIDIA Corporation
Manufacturer: Dell Inc.
Model: Precision T5600
OS: CentOS Linux 7 Core
Architecture: 64bit ELF
CPU: 32 Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz
Cache Size: 20480 KB
Memory:
	              total        used        free      shared  buff/cache   available
	Mem:            62G        4.2G         46G        265M         11G         57G
	Swap:          4.9G          0B        4.9G

Graphics:
	03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [TITAN Xp] [10de:1b02] (rev a1)	
	Subsystem: NVIDIA Corporation Device [10de:11df]	
	Kernel driver in use: nvidia
PyQt version: 5.12.3
Compiled Qt version: 5.12.4
Runtime Qt version: 5.12.8

Change History (10)

in reply to:  1 ; comment:1 by Tristan Croll, 5 years ago

Minor point: the Python traceback and the C++ stack trace are from two 
different ChimeraX runs.

On 2020-05-30 11:03, ChimeraX wrote:

in reply to:  2 ; comment:2 by Tristan Croll, 5 years ago

Running the same scenario with ChimeraX.valgrind gives some better info:

{{{
==9625== Thread 39:
==9625== Jump to the invalid address stated on the next line
==9625==    at 0x3C80D3A4: ???
==9625==    by 0x6A988DAF: ???
==9625==    by 0x6A988D9F: ???
==9625==    by 0x5ED23702: ??? (in 
/opt/UCSF/ChimeraX/lib/python3.7/site-packages/PyQt5/Qt/plugins/imageformats/libqsvg.so)
==9625==    by 0x92F6: ???
==9625==    by 0x5ED23702: ??? (in 
/opt/UCSF/ChimeraX/lib/python3.7/site-packages/PyQt5/Qt/plugins/imageformats/libqsvg.so)
==9625==    by 0x833F5D7: ??? (in /usr/lib/locale/locale-archive)
==9625==  Address 0x3c80d3a4 is not stack'd, malloc'd or (recently) 
free'd
==9625==
{
    <insert_a_suppression_name_here>
    Memcheck:Jump
    obj:*
    obj:*
    obj:*
    
obj:/opt/UCSF/ChimeraX/lib/python3.7/site-packages/PyQt5/Qt/plugins/imageformats/libqsvg.so
    obj:*
    
obj:/opt/UCSF/ChimeraX/lib/python3.7/site-packages/PyQt5/Qt/plugins/imageformats/libqsvg.so
    obj:/usr/lib/locale/locale-archive
}
==9625==
==9625== Process terminating with default action of signal 11 (SIGSEGV)
==9625==  Access not within mapped region at address 0x3C80D3A4
==9625==    at 0x3C80D3A4: ???
==9625==    by 0x6A988DAF: ???
==9625==    by 0x6A988D9F: ???
==9625==    by 0x5ED23702: ??? (in 
/opt/UCSF/ChimeraX/lib/python3.7/site-packages/PyQt5/Qt/plugins/imageformats/libqsvg.so)
==9625==    by 0x92F6: ???
==9625==    by 0x5ED23702: ??? (in 
/opt/UCSF/ChimeraX/lib/python3.7/site-packages/PyQt5/Qt/plugins/imageformats/libqsvg.so)
==9625==    by 0x833F5D7: ??? (in /usr/lib/locale/locale-archive)
==9625==  If you believe this happened as a result of a stack
==9625==  overflow in your program's main thread (unlikely but
==9625==  possible), you can try to increase the size of the
==9625==  main thread stack using the --main-stacksize= flag.
==9625==  The main thread stack size used in this run was 8388608.
}}}

Looks suspiciously similar to https://github.com/lxqt/lxqt/issues/1193, 
which would indicate it's independent of the Drawing delete issues.

On 2020-05-30 11:08, ChimeraX wrote:

in reply to:  3 ; comment:3 by Tristan Croll, 5 years ago

Ah - I see how the clip caps get formed. After the first model is 
created Clipper's zoom mouse mode (which adjusts clip planes on zooming) 
remains active, so the next volume loaded gets clipped if I zoom before 
making the Clipper object.

On 2020-05-30 11:42, ChimeraX wrote:

in reply to:  4 ; comment:4 by Tristan Croll, 5 years ago

Split the segmentation fault into a separate ticket (#3341).

On 2020-05-30 12:03, ChimeraX wrote:

in reply to:  5 ; comment:5 by Tristan Croll, 5 years ago

It has something to do with what happens when Clipper's `NXmapHandler` 
takes over the `Volume`'s data then deletes it (code at 
https://github.com/tristanic/chimerax-clipper/blob/ab0af918e3ecc7adf76f5bf44b5cedcfe0b72c6c/src/maps/nxmapset.py#L142). 
If I grab references to the volume and its submodels before Clipper 
takes over:

{{{
v, surface, cap_far, cap_near = session.models.list()[-4:]
}}}

... then do "clipper assoc #2 to #1" to take control, afterwards:

{{{
v.deleted
Out: True

surface.deleted
Out: True

cap_far.deleted
Out: False

cap_near.deleted
Out: False
}}}

If I amend my code from:

{{{
             if volume in session.models.list():
                 session.models.remove([volume])
             name = volume.name
             volume.delete()

}}}

to:

{{{
             if volume in session.models.list():
                 session.models.close([volume])
             else:
                 volume.delete()
}}}

... then the problem goes away. Not sure if this counts as a bug in the 
API or not, but it looks like this particular instance is fixed.


On 2020-05-30 12:03, ChimeraX wrote:

comment:6 by Eric Pettersen, 5 years ago

Component: UnassignedSurface
Owner: set to Tom Goddard
Platform: all
Project: ChimeraX
Status: newassigned
Summary: ChimeraX bug report submissionEasy to get surface cap deletion wrong

comment:7 by Tom Goddard, 5 years ago

What is this bug about? Please explain in the first sentence of the bug report "I do X and bad thing Y happens". I read every comment and I still have little clue what problem you are describing -- something with clipping caps apparently.

I don't know what the fix in the last comment is supposed to be fixing, but both versions of the code should work. Randomly changing the code without understanding the problem is not a way to fix bugs.

comment:8 by Tom Goddard, 5 years ago

Ok, I think I see what all this is about. This is the OpenGL error at exit and you believe it is a clip cap that was not deleted. And I also see why your original code would cause that error. The models.remove(volume) call removes the volume and disconnects the tree of child models from their parents. So then volume.delete() only deletes the Volume but none of the former child models, since remove() made those no longer children.

This does seem like horrible API. If remove() disassembles the tree of children then how will those ever get deleted? I also am perplexed that it is disassembling the tree. I would not expect that. I'll see if I can have models.remove() keep the tree and just unassign all the model ids used by the children.

in reply to:  9 ; comment:9 by Tristan Croll, 5 years ago

This appears to be the (a?) root cause of the drawing-not-properly-deleted bug that’s been occurring for a while (#3222). If I understand correctly, session.models.remove() unlinks the clip-cap drawings from the parent Volume, so they don’t get deleted when the Volume’s delete() is called.
 

 


comment:10 by Tom Goddard, 5 years ago

Resolution: fixed
Status: assignedclosed

Ok, I changed Models.remove() so it does not disassemble the tree of child models. So now models.remove([v]) followed by v.delete() will properly delete all the child models. I also added a doc string to explain what remove() does.

Note: See TracTickets for help on using tickets.