Continuous Replication Job Causing XO to Crash
-
Hello,
I have been experimenting with the continuous replication part of backups. I created my job and all was running well, then my XO crashed. Now the VMs that were replicating when XO crashed show an "operation blocked" error when I try to migrate them to another host. Please see below for more information.
XO commit: 919d2
nodejs version: 20.15.1
OS: Rocky Linux 9
Crash that I saw in the log:
xo:plugin INFO Cannot find module '/opt/xo/xo-builds/xen-orchestra-202407221847/packages/xo-server-test/dist'. Please verify that the package.json has a valid "main" entry {
  error: Error: Cannot find module '/opt/xo/xo-builds/xen-orchestra-202407221847/packages/xo-server-test/dist'. Please verify that the package.json has a valid "main" entry
      at tryPackage (node:internal/modules/cjs/loader:445:19)
      at Function.Module._findPath (node:internal/modules/cjs/loader:716:18)
      at Function.Module._resolveFilename (node:internal/modules/cjs/loader:1131:27)
      at requireResolve (node:internal/modules/helpers:190:19)
      at Xo.call (file:///opt/xo/xo-builds/xen-orchestra-202407221847/packages/xo-server/src/index.mjs:354:32)
      at Xo.call (file:///opt/xo/xo-builds/xen-orchestra-202407221847/packages/xo-server/src/index.mjs:406:25)
      at from (file:///opt/xo/xo-builds/xen-orchestra-202407221847/packages/xo-server/src/index.mjs:442:95)
      at Function.from (<anonymous>)
      at registerPlugins (file:///opt/xo/xo-builds/xen-orchestra-202407221847/packages/xo-server/src/index.mjs:442:27)
      at main (file:///opt/xo/xo-builds/xen-orchestra-202407221847/packages/xo-server/src/index.mjs:921:5) {
    code: 'MODULE_NOT_FOUND',
    path: '/opt/xo/xo-builds/xen-orchestra-202407221847/packages/xo-server-test/package.json',
    requestPath: '/opt/xo/xo-builds/xen-orchestra-202407221847/packages/xo-server-test'
  }
}
The backup log and the operation blocked logs have been attached as well.
Thanks for the help!
Attachments: 2024-07-23T18_45_30.613Z - XO.log.txt, 2024-07-23T18_22_34.453Z - backup NG.json.txt
-
Hi,
I would check how you built it. How did you install it in the first place? Via our official doc or a 3rd party script?
-
Hi,
The xo-server-test error log is benign (it displays every time xo-server starts). I would suggest looking earlier in the log for something more troubling, like running out of memory.
The issue with operation blocked should clear itself up after the next backup job is run on this VM.
How much memory do you have allocated to the XO VM?
HTH, Dan
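The suggestion above, to look earlier in the log for signs of memory exhaustion, could be sketched as a small scan like the one below. This is only an illustrative sketch: the marker strings are an assumption based on typical Node.js/V8 out-of-memory output, the helper name `find_oom_lines` is hypothetical, and the sample text is made up for the example.

```python
# Substrings that commonly accompany a Node.js/V8 out-of-memory crash.
# This list is an assumption for illustration, not an exhaustive catalogue.
OOM_MARKERS = (
    "JavaScript heap out of memory",
    "Ineffective mark-compacts near heap limit",
    "OOMErrorHandler",
)


def find_oom_lines(log_text):
    """Return (line_number, line) pairs for lines containing an OOM marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in OOM_MARKERS):
            hits.append((number, line.strip()))
    return hits


# Made-up sample: one benign plugin message, one OOM line, one systemd line.
sample = (
    "xo:plugin INFO Cannot find module (benign plugin message)\n"
    "FATAL ERROR: Ineffective mark-compacts near heap limit "
    "Allocation failed - JavaScript heap out of memory\n"
    "systemd[1]: xo-server.service: Failed with result 'core-dump'.\n"
)

for number, line in find_oom_lines(sample):
    print(number, line)
```

In practice the input would be the saved service log rather than the sample string, for example the journal output for the xo-server unit written to a file and read with `open(path).read()`.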
-
@olivierlambert I am currently using a 3rd party script. I can follow the documentation and rebuild by hand to see what happens.
Here is a more telling error.
Jul 23 14:30:23 xo-server[85312]: <--- Last few GCs --->
Jul 23 14:30:23 xo-server[85312]: [85312:0x64ad290] 70880873 ms: Mark-Compact 2002.7 (2082.8) -> 1988.6 (2082.5) MB, 807.86 / 0.07 ms (average mu = 0.144, current mu >
Jul 23 14:30:23 xo-server[85312]: [85312:0x64ad290] 70881716 ms: Mark-Compact 2002.9 (2083.2) -> 1986.5 (2081.8) MB, 721.13 / 0.11 ms (average mu = 0.144, current mu >
Jul 23 14:30:23 xo-server[85312]: <--- JS stacktrace --->
Jul 23 14:30:23 xo-server[85312]: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Jul 23 14:30:23 xo-server[85312]: ----- Native stack trace -----
Jul 23 14:30:23 xo-server[85312]: 1: 0xb80c98 node::OOMErrorHandler(char const*, v8::OOMDetails const&) [node]
Jul 23 14:30:23 xo-server[85312]: 2: 0xeede90 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
Jul 23 14:30:23 xo-server[85312]: 3: 0xeee177 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
Jul 23 14:30:23 xo-server[85312]: 4: 0x10ffd15 [node]
Jul 23 14:30:23 xo-server[85312]: 5: 0x11002a4 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [node]
Jul 23 14:30:23 xo-server[85312]: 6: 0x1117194 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, cha>
Jul 23 14:30:23 xo-server[85312]: 7: 0x11179ac v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallback>
Jul 23 14:30:23 xo-server[85312]: 8: 0x117076c v8::internal::MinorGCJob::Task::RunInternal() [node]
Jul 23 14:30:23 xo-server[85312]: 9: 0xd368e6 [node]
Jul 23 14:30:23 xo-server[85312]: 10: 0xd39e8f node::PerIsolatePlatformData::FlushForegroundTasksInternal() [node]
Jul 23 14:30:23 xo-server[85312]: 11: 0x18af2d3 [node]
Jul 23 14:30:23 xo-server[85312]: 12: 0x18c3d4b [node]
Jul 23 14:30:23 xo-server[85312]: 13: 0x18afff7 uv_run [node]
Jul 23 14:30:23 xo-server[85312]: 14: 0xbc7be6 node::SpinEventLoopInternal(node::Environment*) [node]
Jul 23 14:30:23 xo-server[85312]: 15: 0xd0ae44 [node]
Jul 23 14:30:23 xo-server[85312]: 16: 0xd0b8dd node::NodeMainInstance::Run() [node]
Jul 23 14:30:23 xo-server[85312]: 17: 0xc6fc8f node::Start(int, char**) [node]
Jul 23 14:30:23 xo-server[85312]: 18: 0x7f004c029590 [/lib64/libc.so.6]
Jul 23 14:30:23 xo-server[85312]: 19: 0x7f004c029640 __libc_start_main [/lib64/libc.so.6]
Jul 23 14:30:23 xo-server[85312]: 20: 0xbc430e _start [node]
Jul 23 14:30:23 systemd-coredump[89957]: [] Process 85312 (node) of user 0 dumped core.
Jul 23 14:30:23 systemd[1]: xo-server.service: Main process exited, code=dumped, status=6/ABRT
Jul 23 14:30:23 systemd[1]: xo-server.service: Failed with result 'core-dump'.
My XO VM currently has 6 GiB allocated to it.
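For anyone reading the two Mark-Compact lines in the log above, here is a rough sketch of how the numbers could be pulled out to see how little each collection freed. It assumes the usual V8 GC trace layout of "used (total) -> used (total) MB"; the regular expression and the helper name `parse_mark_compact` are made up for this example and are not part of XO or Node.js.

```python
import re

# Matches "Mark-Compact <used> (<total>) -> <used> (<total>) MB" in a GC trace line.
# The layout is assumed from the log lines quoted in this thread.
GC_PATTERN = re.compile(
    r"Mark-Compact\s+([\d.]+)\s+\(([\d.]+)\)\s+->\s+([\d.]+)\s+\(([\d.]+)\)\s+MB"
)


def parse_mark_compact(line):
    """Return heap figures (in MB) from one Mark-Compact line, or None if absent."""
    match = GC_PATTERN.search(line)
    if match is None:
        return None
    before, before_total, after, after_total = (float(g) for g in match.groups())
    return {
        "before_mb": before,
        "after_mb": after,
        "total_after_mb": after_total,
        "freed_mb": round(before - after, 1),
    }


# First GC line from the log in this thread (timestamp prefix omitted).
line = (
    "[85312:0x64ad290] 70880873 ms: Mark-Compact 2002.7 (2082.8) "
    "-> 1988.6 (2082.5) MB, 807.86 / 0.07 ms"
)
print(parse_mark_compact(line))
```

On the two lines in the log, each full collection recovers only about 14 to 16 MB out of roughly 2000 MB in use, which matches the "Ineffective mark-compacts near heap limit" message that follows.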
-
I increased the memory of my XO VM to 12GiB and so far everything seems to be working. I will keep an eye on things.
-
@Delgado said in Continuous Replication Job Causing XO to Crash:
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
That's the issue. It's hard to give more precise advice since we do not have any control over the environment. As a reminder, for any issue installing or running a 3rd party script, it's better to open a ticket at the 3rd party script's GitHub repo (since we have zero control over how it's installed).
-
@olivierlambert Thanks. I will do that. I did bump the memory up to 12GiB and the backup ran successfully, but I will pursue a ticket with the 3rd party script maker as well. Thank you for your time!