[Solved] Surge in memory usage by Minetest process (7 GB in just 6 minutes)

Post Reply
User avatar
sorcerykid
Member
Posts: 1877
Joined: Fri Aug 26, 2016 15:36
GitHub: sorcerykid
In-game: Nemo
Location: Illinois, USA

[Solved] Surge in memory usage by Minetest process (7 GB in just 6 minutes)

by sorcerykid » Post

The past several hours I've been experiencing regular server crashes where the minetestserver process just terminates without reporting any error. So I tried running minetestserver under gdb, and that shows that the process is being killed.

Code: Select all

Program terminated with signal SIGKILL, Killed.
The program no longer exists.
So I checked the system log, and that's when I discovered it's an OOM error.

Code: Select all

Jan 29 17:14:23 da kernel: Out of memory: Kill process 15015 (minetestserver) score 126 or sacrifice child
Jan 29 17:14:23 da kernel: Killed process 15015, UID 500, (minetestserver) total-vm:5867224kB, anon-rss:5188036kB, file-rss:1212kB
Jan 29 17:27:51 da kernel: Emerge-0 invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
Jan 29 17:27:51 da kernel: Emerge-0 cpuset=/ mems_allowed=0
Jan 29 17:27:51 da kernel: Pid: 27505, comm: Emerge-0 Tainted: G           -- ------------  T 2.6.32-642.11.1.el6.x86_64 #1
:
Jan 29 17:27:51 da kernel: Out of memory: Kill process 27472 (minetestserver) score 126 or sacrifice child
Jan 29 17:27:51 da kernel: Killed process 27472, UID 500, (minetestserver) total-vm:6227732kB, anon-rss:5176460kB, file-rss:20kB
I did some more investigation using top and vmstat. That's when I noticed that the minetestserver process is consuming ridiculous amounts of memory in a very short time. In the matter of only 6 minutes, memory usage skyrocketed from 1 GB (which is what the server normally starts at) up to almost 7 GB before finally being terminated by the kernel.

Code: Select all

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  520 minetest  20   0 7361m 6.3g 3056 S 43.8 20.0   5:14.46 minetestserver
What could possibly cause such a surge in memory usage of nearly 1 GB per minute? There were only 7 players logged on at the time of the recent crash. So it doesn't seem that the server is overloaded. The debug log doesn't indicate anything particularly unusual, although it might be worth enabling verbose logging. But I'm not sure if verbose logging would reveal the culprit, particularly if it's a mod-related issue.

The only thing that even remotely comes to mind is that one of the players is doing something nefarious that consumes extraordinary amount of RAM. The first thought that comes to mind is someone dropping millions of items in a single mapblock or someone filling chests with items containing large amounts of inventory metadata, but I'm not sure if there is any way to definitively determine what is the root cause. So any suggestions would be most welcome.
Last edited by sorcerykid on Mon Feb 03, 2025 01:13, edited 1 time in total.

User avatar
Blockhead
Moderator
Posts: 2252
Joined: Wed Jul 17, 2019 10:14
GitHub: Montandalar
IRC: Blockhead256
In-game: Blockhead Blockhead256
Location: Land Down Under
Contact:

Re: Surge in memory usage by Minetest process (7 GB in just 6 minutes)

by Blockhead » Post

Given your main server and it not being luantiserver, we can safely assume we're not talking about 5.10. If you check the git logs, there are a few references to leaks being fixed in the 5.x series, which your version may not have.
/˳˳_˳˳]_[˳˳_˳˳]_[˳˳_˳˳\ Advtrains enthusiast | My map: Noah's Railyard | My Content on ContentDB ✝️♂

User avatar
cx384
Member
Posts: 717
Joined: Wed Apr 23, 2014 09:38
GitHub: cx384
IRC: cx384

Re: Surge in memory usage by Minetest process (7 GB in just 6 minutes)

by cx384 » Post

There are many potential causes.
You can try to profile the memory usage e.g. with heaptrack
But for your description it seems likely that it's not a memory leak from the engine, but a mod problem instead.
Can you show your minetest log to see what happened, and provide a list of used mods?
Can your read this?

User avatar
Blockhead
Moderator
Posts: 2252
Joined: Wed Jul 17, 2019 10:14
GitHub: Montandalar
IRC: Blockhead256
In-game: Blockhead Blockhead256
Location: Land Down Under
Contact:

Re: [Solved] Surge in memory usage by Minetest process (7 GB in just 6 minutes)

by Blockhead » Post

The issue was solved by applying patches from upstream on top of JT2 engine's 0.4.x base - read about it below.
Spoiler
I was right about upstream, but it was actually from pre-5.0. I was vaguely aware JT2 wasn't based directly on 0.4.17.1 series, but not aware the fork went back as far as 0.4.14.
Last edited by Blockhead on Mon Feb 03, 2025 05:23, edited 1 time in total.
/˳˳_˳˳]_[˳˳_˳˳]_[˳˳_˳˳\ Advtrains enthusiast | My map: Noah's Railyard | My Content on ContentDB ✝️♂

User avatar
sorcerykid
Member
Posts: 1877
Joined: Fri Aug 26, 2016 15:36
GitHub: sorcerykid
In-game: Nemo
Location: Illinois, USA

Re: [Solved] Surge in memory usage by Minetest process (7 GB in just 6 minutes)

by sorcerykid » Post

I'm happy to report that I was able to resolve this issue. It was not a memory leak nor was it related to fixes in the 5.x series.

Rather it was a bug with mapblocks being continuously reloaded at the world border even after they were already in memory causing the associated static objects to be multiplied exponentially. So it was an unusual confluence of factors (two different bugs), hence why it never became apparent until exactly the right conditions were met.

Code: Select all

2025-01-30 12:56:20: ACTION[Server]: Recording login time for player sorcerykid
2025-01-30 12:56:20: ACTION[Server]: sorcerykid spawns at (-2148.3200683594,5.6659998893738,30927.201171875)
2025-01-30 12:56:22: ERROR[Server]: suspiciously large amount of objects detected: 460784 in (-135,-2,1933); removing all of them.
2025-01-30 12:56:22: ERROR[Server]: suspiciously large amount of objects detected: 328320 in (-134,-2,1933); removing all of them.

2025-01-30 13:07:45: ERROR[Main]: ServerEnv: Trying to store id=13 statically but block (-134,-2,1933) already contains 10046592 objects.
Thankfully I deduced from the onset that the cause was related to excessive objects in mapblocks. I was just wrong about the source. It was not items intentionally dropped by players, but rather the temporary entities that I was using in my Ambience II mod to track placement of environmental sounds in mapblocks surrounding each player.

Image

This proved to be a situation where my RocketLib Toolkit was indispensible. I was able to put togther a quick script to locate all mapblocks with unusually large numbers of objects.

Code: Select all

local maplib = require "maplib"

local source_path = assert( arg[ 1 ] )
local decode_pos = maplib.decode_pos
local pos_to_string = maplib.pos_to_string
local map_db = MapDatabase( source_path, false )

for index, block in map_db.iterate( ) do
        local fpos = decode_pos( index )
        local counts = { }
        for i, v in ipairs( block.object_list ) do
                local count = counts[ v.name ]
                counts[ v.name ] = count and count + 1 or 1
        end
        if #block.object_list > 100 then
                for k, v in pairs( counts ) do
                        print( string.format( "Found: %d of %s", v, k ) )
                end
                print( string.format( "%d total objects in mapblock %s",
                        #block.object_list, pos_to_string( decode_pos( index ) ) ) )
        end
end
Of course, it was immediately apparent that the objects from my Ambience II mod were being duplicated outside the world border, but nowhere else. So the real puzzle at that point was to determine why.

Image

My biggest fear was that I somehow had introduced a bug in my fork of the engine. So I scoured the CPP source for an entire day, double-checking every instance where static objects were being stored, and nothing stood out as unusual.

Image

To eliminate any possibility that this bug was specific to my fork, i compiled a clean version of Minetest 0.4.14-stable from source. Yet the problem persisted. I then compiled a clean version of Minetest 0.4.15-dev and the problem altogether disappeared. That afforded some sense of relief, even though it still didn't solve the mystery.

Image

So I was one step closer to a resolution, yet I still needed to determine at which point the bug had been fixed. There were a myriad of commits relating to world border oddities in Minetest 0.4.15, but none of them referenced this issue specifically. I realized the only choice was to work backwards day-by-day through the commit history, compiling and retesting until I could identify the commit responsible.

After several hours of research, I managed to narrow it down to one commit titled "Fix unnecessary block loading". Yet nowhere in the commit message itself was there any mention of duplicating objects or excessive memory use. Thankfully, I checked the associated PR because not only did the title specifically reference duplicating objects, but the comments and even a related issue also confirmed this was was the correct fix.

Image

Ultimately I ended up patching my fork with PR #4847 and PR #4888, both of which specifically addressed this issue. That has sense completely solved the out of memory errors :)

User avatar
Blockhead
Moderator
Posts: 2252
Joined: Wed Jul 17, 2019 10:14
GitHub: Montandalar
IRC: Blockhead256
In-game: Blockhead Blockhead256
Location: Land Down Under
Contact:

Re: [Solved] Surge in memory usage by Minetest process (7 GB in just 6 minutes)

by Blockhead » Post

sorcerykid wrote:
Mon Feb 03, 2025 04:57
...

So I was one step closer to a resolution, yet I still needed to determine at which point the bug had been fixed. There were a myriad of commits relating to world border oddities in Minetest 0.4.15, but none of them referenced this issue specifically. I realized the only choice was to work backwards day-by-day through the commit history, compiling and retesting until I could identify the commit responsible.

...
Glad to hear you fixed the issue. It's a good write-up and you have good tooling and expertise. Just wanted to add, you would probably appreciate knowing that rather than going on a commit-by-commit basis, git has a tool called git bisect, which can help save a lot of time by doing a binary search instead, based on a good and a bad commit.
/˳˳_˳˳]_[˳˳_˳˳]_[˳˳_˳˳\ Advtrains enthusiast | My map: Noah's Railyard | My Content on ContentDB ✝️♂

Post Reply