Today I managed to fix 3 issues.
weston-subsurfaceswas leaking regions. This was apparent from rapidly resizing the window.
cheesewas leaking…something. Apparent from the object IDs always increasing.
geditwould display a subsurface and when the subsurface was dismissed the window would stop responding.
I had realised there was an in issue with
weston-subsurfaces. Namely, one of
the subsurfaces would, briefly, not be in the correct set_position. I reasoned
that I was either mapping the window too early, or not dealing with synchronisation
properly or a combination of the two.
This lead me sort out the
window.zig) function so that it also flips
synchronised subwindows (after the parent itself flips). I also changed
wl_surface.zig) so that it will only map a window when a buffer is attached.
The combination of the two seemed to do the trick and the window appears all
weston-subsurfaces desynchronises its children after set up,
but I reasoned that if you resize the window, it properly sets everything to
synchronous (otherwise how do you avoid one surface lagging behind). And right
enough, resizing causes a temporary
What this then revealed was that I was leaking
Regions. I was a bit confused
about this because I was pretty sure I was cleaning up regions properly
(especially after adding some logic for a case I hadn’t handled). But still
the leak persisted.
It was around this time that I realiased there was an issue with
the outstanding cheese issue put me into a period of self doubt.
Which is why I’m pleased to say that I fixed all three issues!
The regions issue came about because sometimes a region gets created but is then never set on a particular window. Because all of the extant clean up code was actually happening with relation to a window these regions were never released. The solution then was to store a
?*Windowagainst the region so that if
destroy()is called on a region and it has the null window then we can release immediately. See this, this and that.
This then revealed another issue. I was also leaking
ShmPools! This was a similar issues as above where a
wl_shm_poolwas created but a buffer was never created from that pool before the pool was destroyed. In that case we need to deinit the pool immediately on
destroy(). See this commit.
At that the
weston-subsurfaces issues were fixed.
gedit issue I was dreading the most. But it actually turned out to be not
so bad. It boiled down to not removing the subsurface from the window tree when
it was destroyed.
To help explain, here’s output from
➜ foxwhale git:(master) ✗ bin/foxwhalectl window-tree window[2 ^ null] @16 (xdg_toplevel) (0, 0) (617, 399): window[6 ^ 2] @20 (wl_subsurface) (157, 149) (233, 194) [2, -1] [-1, -1] window[2 ^ null] @16 (xdg_toplevel) (0, 0) (617, 399) [-1, -1] [-1, 6]
The unindented line is a toplevel window. Then the indented lines show the order
that the (sub)windows are rendered in. The window at the bottom is rendered first,
the window at the top is rendered last.
[6 ^ 2] means window with index 6 is
the child of window with index 2.
And this makes sense here: we draw our main toplevel window first (2) and then draw the “popup” subsurface above that.
But was happening when we clicked away from that subsurface (and it disappeared leaving only the main window was) this:
➜ foxwhale git:(master) ✗ bin/foxwhalectl window-tree window[2 ^ null] @16 (xdg_toplevel) (0, 0) (617, 399): window[6 ^ null] @20 (wl_surface) (0, 0) (233, 194) [-1, -1] [-1, -1]
What this implies is that we attempt to draw window 2 but we seem to get stuck and can only draw window 6 (which shouldn’t even exist any more). Note that we SHOULD always have the toplevel window (2) as one of the indented rows, but here it’s gone.
This boiled down to not detaching that subsurface window such that window 2 was
still pointing to it even though it was gone. As we iterated through the (incorrect)
drawing order we (incorrectly) move to window 6, but because window 6’s
prev pointers we deinitialised we can no longer see the toplevel window.
Again, the solution was just to make sure we
detach() inside the
This then gives the correct output after the subwindow disappears:
➜ foxwhale git:(master) ✗ bin/foxwhalectl window-tree window[2 ^ null] @16 (xdg_toplevel) (0, 0) (617, 399): window[2 ^ null] @16 (xdg_toplevel) (0, 0) (617, 399) [-1, -1] [-1, -1]
This issue I had already been aware and was consistent, though relatively quiet, noise in my brain.
I had been worried that it was something in the underlying wire protocol implementation
delete_id events were delayed after a callback
done such that the
client would go and take a new id rather than reuse and old ID.
However, having been looking at some of the other
destroy() functions and with
wl_shm_pool stuff fresh in my mind I realised that we were never releasing the
wl_shm_pool object. A quick fix later
cheese no longer causes leaks!
One of the cool things that’s come up from using fixed arrays of resources is that it’s absolutely fantastic for detecing memory leaks. If you set the size to be sensibly long (long enough for a well behaved program) then you get a nice crash (in debug build) when it exhausts the available objects.
If I was just malloc’ing resources then I think it would be much harder to track things down.
One thing I have been considering is that resources that belong to a
should not be global as the currently are. Rather each
Client should have a
fixed size of, say,
Windows that it can allocate. This then means, if we run
out of memory because a
Client is misbehaving then it will be killed rather
than the unfortunate client that happens to request a
Window that we’ve just
run out of.
I also think that perhaps we should do some dynamic allocation, so that we keep
our memory usage to a minimum, but still have some fixed maximum value. I imagine
this would be very similar to
Stalloc except we would grow (realloc) the underlying
array up to a certain maximum capacity. We still get our nice block of resources
but we can start small and grow when we need to but still kill bad clients if
they get excessive.
If we’re doing this we’ll need to, and I actually quite like this, is rather
than use pointers to resources (e.g.
*Window), we’ll just use an index. This
means that the allocator can quite happily reallocate the array and copy the
existing content and all our references will still work.
Perhaps we swap out what type of allocation is being used a build time, such
that a debug build would use a quite small (but big enough so that well behaved
clients don’t trigger
OutOfMemory) and fixed sized array so that memory leaks
become easily apparent, but a release build grows dynamically up to some max
Generally I just really like that Zig makes memory allocation quite an explicit thing. By that I mean that there’s no assumption of just malloc and that it encourages the right type of allocation for a particular problem.