High memory usage during backup, sometimes ending in OOM #32

Closed
chinese-wzq opened this issue Jun 9, 2024 · 11 comments
Labels
info needed Further information is requested

Comments

@chinese-wzq

chinese-wzq commented Jun 9, 2024

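While a backup is running, I can see memory usage climbing steadily in the background until it has consumed all physical memory and swap.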

@chinese-wzq chinese-wzq changed the title from "High memory usage during backup, ending in OOM" to "High memory usage during backup, sometimes ending in OOM" Jun 9, 2024
@Fallen-Breath
Contributor

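Please provide the complete environment and a reproducible procedure, rather than a couple of brief sentences that carry little credibility.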

@Fallen-Breath Fallen-Breath added the info needed Further information is requested label Jun 9, 2024
@chinese-wzq
Author

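This is the first time the server has used this plugin. I tried to make a backup, after which memory usage kept growing, reaching as much as 18G. My world save is about 23G.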

@Fallen-Breath
Contributor

#32 (comment): not enough information, cannot reproduce.

@chinese-wzq
Author

chinese-wzq commented Jun 9, 2024

#32 (comment): not enough information, cannot reproduce.

OK, I will try to provide more information.

> plugins
[Server] [13:15:28 INFO]: Plugins (20): Beenfo, ChestSort, CoreProtect, dynmap, Essentials, EssentialsAntiBuild, EssentialsChat, EssentialsProtect, EssentialsSpawn, Graves, GroupManager, LWC, OpenInv, Playtimes*, ProtocolLib, SkinsRestorer, Slimefun, spark, TogglePvp, voicechat
> version
[Server] [13:16:38 INFO]: Current: git-Purpur-1632 (MC: 1.18.2)*
[Server] Previous: git-Leaves-"e59fda8" (MC: 1.18.2)

Configuration file:

{
    "enabled": true,
    "debug": false,
    "storage_root": "./pb_files",
    "concurrency": 2,
    "command": {
        "prefix": "!!pb",
        "permission": {
            "root": 0,
            "abort": 1,
            "back": 2,
            "confirm": 1,
            "crontab": 3,
            "database": 4,
            "delete": 2,
            "delete_range": 3,
            "diff": 4,
            "export": 4,
            "help": 0,
            "import": 4,
            "list": 1,
            "make": 2,
            "prune": 3,
            "rename": 2,
            "show": 1,
            "tag": 3
        },
        "confirm_time_wait": "1m",
        "backup_on_restore": true,
        "restore_countdown_sec": 10
    },
    "server": {
        "turn_off_auto_save": true,
        "commands": {
            "save_all_worlds": "save-all flush",
            "auto_save_off": "save-off",
            "auto_save_on": "save-on"
        },
        "saved_world_regex": [
            "Saved the game",
            "Saved the world"
        ],
        "save_world_max_wait": "10m"
    },
    "backup": {
        "source_root": "./server",
        "source_root_use_mcdr_working_directory": false,
        "targets": [
            "world",
            "world_nether",
            "world_the_end",
            "plugins",
            "data-storage"
        ],
        "ignored_files": [
            "session.lock",
            "*.tmp"
        ],
        "follow_target_symlink": false,
        "hash_method": "xxh128",
        "compress_method": "plain",
        "compress_threshold": 64
    },
    "scheduled_backup": {
        "enabled": true,
        "interval": "48h",
        "crontab": null,
        "jitter": "10s",
        "reset_timer_on_backup": true,
        "require_online_players": false,
        "require_online_players_blacklist": []
    },
    "prune": {
        "enabled": true,
        "interval": "6h",
        "crontab": null,
        "jitter": "1m",
        "timezone_override": null,
        "regular_backup": {
            "enabled": true,
            "max_amount": 5,
            "max_lifetime": "0s",
            "last": -1,
            "hour": 0,
            "day": 0,
            "week": 0,
            "month": 0,
            "year": 0
        },
        "temporary_backup": {
            "enabled": true,
            "max_amount": 10,
            "max_lifetime": "30d",
            "last": -1,
            "hour": 0,
            "day": 0,
            "week": 0,
            "month": 0,
            "year": 0
        }
    },
    "database": {
        "compact": {
            "enabled": true,
            "interval": null,
            "crontab": "0 7 * * *",
            "jitter": "1m"
        },
        "backup": {
            "enabled": true,
            "interval": null,
            "crontab": "0 6 * * 0",
            "jitter": "1m"
        }
    }
}

Command attempted:

> !!pb make new
[MCDR] [13:17:56] [PB@19fc-worker-heavy/INFO] [prime_backup]: [PB] 创建备份中...请稍等
[Server] [13:17:57 INFO]: Automatic saving is now disabled
[Server] [13:17:57 INFO]: Saving the game (this may take a moment!)
[Server] [13:18:10 INFO]: ThreadedAnvilChunkStorage (world): All chunks are saved
[Server] [13:18:10 INFO]: ThreadedAnvilChunkStorage (DIM-1): All chunks are saved
[Server] [13:18:10 INFO]: ThreadedAnvilChunkStorage (DIM1): All chunks are saved
[Server] [13:18:10 INFO]: ThreadedAnvilChunkStorage: All dimensions are saved
[Server] [13:18:10 INFO]: Saved the game
[Server] [13:18:10 WARN]: Can't keep up! Is the server overloaded? Running 13199ms or 263 ticks behind
[MCDR] [13:18:39] [PB@19fc-worker-heavy/INFO] [prime_backup]: Creating backup for ['world', 'world_nether', 'world_the_end', 'plugins', 'data-storage'] at path 'server', timestamp 1717910319283012252, creator 'console:', comment 'new', tags {}
[MCDR] [13:19:17] [PB@19fc-worker-heavy/INFO] [prime_backup]: Pre-calculate all file hash done

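After that it hung. Memory usage kept increasing, and even after a long wait the backup never completed.
The server has 7.5G of physical memory, 20G of swap, an SSD, and plenty of free disk space.

Addendum: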

OS: Ubuntu 22.04.4 LTS aarch64
Host: Orange Pi 5
Kernel: 6.1.43-rockchip-rk3588
Uptime: 37 days, 2 hours, 28 mins
Packages: 603 (dpkg)
Shell: zsh 5.8.1
Terminal: /dev/pts/0
CPU: (8) @ 1.800GHz
Memory: 6528MiB / 7683MiB

@Fallen-Breath
Contributor

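Still not enough information. The cause of this kind of problem is usually closely tied to the structure of the content being backed up, for example:

  • Total file count (whether it exceeds ten million. PB has to keep per-file data in memory, so the amount held grows with the number of files)
  • File layout (whether there are complex layouts such as symbolic links or hard links)

Configuration and environment alone are not enough to reproduce the issue. As stated in #32 (comment), please provide a way to reproduce it: describe how to build, starting from scratch, an environment in which the issue reproduces.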

@chinese-wzq
Author

chinese-wzq commented Jun 9, 2024

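Still not enough information. The cause of this kind of problem is usually closely tied to the structure of the content being backed up, for example:

  • Total file count (whether it exceeds ten million. PB has to keep per-file data in memory, so the amount held grows with the number of files)
  • File layout (whether there are complex layouts such as symbolic links or hard links)

Configuration and environment alone are not enough to reproduce the issue. As stated in #32 (comment), please provide a way to reproduce it: describe how to build, starting from scratch, an environment in which the issue reproduces.

First of all, thank you very much for looking into this. I can set up an environment that runs this project from source. If you can tell me where to add debugging code, I will do my best to cooperate.
File count: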

find ./ -type f|wc -l
1886975
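
A per-directory breakdown makes it easier to see which part of the tree contributes most of those files. The following is a minimal Python sketch, not part of Prime Backup; the function name `count_files_by_prefix` and the `depth` parameter are made up for illustration. It counts regular files only, like `find -type f`, and groups them by the first `depth` path components.

```python
import os
import stat
from collections import Counter


def count_files_by_prefix(root: str, depth: int = 2) -> Counter:
    """Count regular files under `root`, grouped by their first `depth` directory components."""
    counts: Counter = Counter()
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        parts = [] if rel == "." else rel.split(os.sep)
        key = os.sep.join(parts[:depth]) or "."
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # the file vanished while walking, e.g. on a live server
            if stat.S_ISREG(st.st_mode):  # same notion of "file" as `find -type f`
                counts[key] += 1
    return counts
```

Calling `count_files_by_prefix("./server").most_common(10)` would list the ten largest groups, assuming `./server` is the backup source root as in the configuration above.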

I searched for symbolic links and hard links using commands generated by ChatGPT and found none:

Find symbolic links
$ find ./ -type l
Find hard links
$ find ./ -type f -printf "%i %p\n" | sort | uniq -w 10 -D
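
One caveat about the hard-link command: `uniq -w 10` compares only the first 10 characters of each line, so whether it groups lines by inode alone depends on how many digits the inode numbers have. A check based on the link count avoids that. The sketch below is a hedged alternative in Python, assuming a POSIX filesystem; `find_links` is a hypothetical helper name, not something from Prime Backup.

```python
import os
import stat
from collections import defaultdict


def find_links(root: str):
    """Return (symlink paths, {(device, inode): [paths]}) for entries under `root`."""
    symlinks = []
    hard_linked = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        # Symlinks that point to directories show up in dirnames, so check both lists.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat inspects the link itself, not its target
            except OSError:
                continue
            if stat.S_ISLNK(st.st_mode):
                symlinks.append(path)
            elif stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                # More than one name refers to this inode; the other name(s)
                # may live outside `root`.
                hard_linked[(st.st_dev, st.st_ino)].append(path)
    return symlinks, dict(hard_linked)
```

Two empty results would confirm the finding reported here, namely that the tree contains neither kind of link.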

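I turned on PB's debug switch, but the console did not print anything more.
As for reproducing the issue: I copied the same configuration file and plugins into a new folder (which means no world save, none of the folders that plugins generate for themselves under plugins, and no data-storage folder). After starting that server, running !!pb make new again completed quickly and successfully. This is the closest I can get to recreating the same environment, but the result suggests that the problem is closely tied to the number or size of the files. How can this be confirmed?
If it really is caused by too many files, would you consider caching the data on disk and reading it back when needed, at the cost of some speed?
I hope the problem can be pinned down before tomorrow afternoon, because that is when I have to go back to school 😭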
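
The disk-caching idea raised above can be illustrated in general terms. The sketch below is not how Prime Backup works and makes no claim about its internals; it only shows the generic technique of writing per-file metadata to an on-disk SQLite table in batches, so that at most `batch_size` records sit in memory at once. The names `index_tree`, `_flush`, and the `files` table are hypothetical.

```python
import os
import sqlite3


def _flush(conn: sqlite3.Connection, batch: list) -> int:
    """Write the pending rows to disk, empty the batch, and return how many were written."""
    written = len(batch)
    if written:
        conn.executemany("INSERT OR REPLACE INTO files VALUES (?, ?, ?)", batch)
        conn.commit()
        batch.clear()
    return written


def index_tree(root: str, db_path: str, batch_size: int = 10_000) -> int:
    """Record path, size, and mtime of every entry os.walk lists as a file under `root`."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS files ("
            "path TEXT PRIMARY KEY, size INTEGER NOT NULL, mtime_ns INTEGER NOT NULL)"
        )
        batch = []
        total = 0
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    st = os.lstat(path)
                except OSError:
                    continue  # skip entries that disappear mid-walk
                batch.append((path, st.st_size, st.st_mtime_ns))
                if len(batch) >= batch_size:
                    total += _flush(conn, batch)
        total += _flush(conn, batch)  # write whatever is left over
        return total
    finally:
        conn.close()
```

The trade-off is the one named in the question: memory stays bounded, but every later lookup becomes a database query instead of an in-memory access.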

@Fallen-Breath
Contributor

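1886975 files is already far too many for a vanilla MC world save, so I suggest investigating that first. If some mod or plugin writes a large number of small scattered files, see whether they can be cleaned up or excluded from the backup. For reference, the TIS server's world save contains only about 11,000 files, two orders of magnitude fewer than in your case.

With millions of files, PrimeBackup does use a lot of memory, although the reported 18G+ of memory usage has yet to be reproduced. In this situation, even if PrimeBackup could back the files up normally, its performance would degrade significantly, and the database in which PrimeBackup stores metadata would also perform very poorly. I still suggest finding a way to bring the file count down.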

@chinese-wzq
Author

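1886975 files is already far too many for a vanilla MC world save, so I suggest investigating that first. If some mod or plugin writes a large number of small scattered files, see whether they can be cleaned up or excluded from the backup. For reference, the TIS server's world save contains only about 11,000 files, two orders of magnitude fewer than in your case.

With millions of files, PrimeBackup does use a lot of memory, although the reported 18G+ of memory usage has yet to be reproduced. In this situation, even if PrimeBackup could back the files up normally, its performance would degrade significantly, and the database in which PrimeBackup stores metadata would also perform very poorly. I still suggest finding a way to bring the file count down.

OK, I will try submitting a PR that lets ignored_files accept regular expressions.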

@Fallen-Breath
Contributor

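If it is convenient, please describe this world save with millions of files: what the scenario is and what it is used for. That way, anyone who runs into a similar problem can track it down more easily.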

@chinese-wzq
Author

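If it is convenient, please describe this world save with millions of files: what the scenario is and what it is used for. That way, anyone who runs into a similar problem can track it down more easily.

I have the Dynmap plugin installed, so a large number of map images are generated under plugins/dynmap. On my server that folder alone contains 1883376 files.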

Fallen-Breath added a commit that referenced this issue Jul 17, 2024
…ing logic during backup creation,

resolved #31, closed #33
see also #32
@AnzhiZhang
Contributor

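I have the Dynmap plugin installed, so a large number of map images are generated under plugins/dynmap. On my server that folder alone contains 1883376 files.

Keeping dynmap's data inside the backed-up directories is not recommended. If it has to stay there, exclude it from the backup; there is little need to manage it through backups.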
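
To gauge how much an exclusion would help before changing any configuration, the effect can be estimated with a directory walk that prunes the excluded subtree. This is a minimal Python sketch under stated assumptions: `count_files` and its `excluded_dirs` parameter are hypothetical, and it does not reproduce how Prime Backup's `ignored_files` option matches paths, which this thread does not describe.

```python
import os


def count_files(root: str, excluded_dirs=()) -> int:
    """Count non-directory entries under `root`, skipping directories listed relative to `root`."""
    excluded = {os.path.normpath(p) for p in excluded_dirs}
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Assigning to dirnames[:] prunes the walk in place, so an excluded
        # subtree is never listed at all.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normpath(os.path.join(rel_dir, d)) not in excluded
        ]
        total += len(filenames)
    return total
```

Comparing `count_files("./server")` with `count_files("./server", ["plugins/dynmap"])` would show how many files remain once the map tiles are left out, assuming `./server` is the source root from the configuration above.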
