XCP-ng

    Weird performance alert. Start importing VM for no reason.

      ph7

      XCP-ng, updated
      XO CE, commit 749f0 (now 2 behind)

      For some strange reason, I got performance alerts every minute starting at 05:00 this morning.

      Screenshot 2025-02-24 at 10-27-12 Papperskorg jake.blues@protonmail.com Proton Mail.png

      XO was trying to start/import a VM for no apparent reason.

      The backup of this VM is started by a sequence job, and that job ran from 03:04 to 03:05.

      Screenshot 2025-02-24 at 10-53-28 Backup.png

      The ed2b job has health check enabled, but it should only run on the 1st and 15th.

      Screenshot 2025-02-24 at 11-01-32 Backup.png

      Screenshot 2025-02-24 at 10-48-34 Backup.png

      I also run a replication job during the day, outside the backup schedule.

      Screenshot 2025-02-24 at 11-58-56 Backup.png

      The VM has autostart enabled, so maybe the host crashed and, when it restarted, it somehow thought it should auto-start the replicated VM.
      Unfortunately I destroyed the VM, but I did read that it had been started 32 min ago, and this was around 05:30.
      When I then checked the host performance, there was nothing in the graph.

      Screenshot 2025-02-24 at 10-25-22 X2 🚀 (Ryssen 🪐).png

      [11:23 x2 ~]# uptime
       11:26:44 up 6:37, 2 users, load average: 0,10, 0,09, 0,10
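
The boot time implied by that uptime output can be worked out by subtracting the uptime from the wall-clock time. A minimal Python sketch, assuming the shell clock shows local time and that the date is 2025-02-24 (taken from the screenshot names); the function name `boot_time` is only illustrative:

```python
from datetime import datetime, timedelta


def boot_time(now: datetime, up_hours: int, up_minutes: int) -> datetime:
    """Return the approximate boot time: current time minus uptime."""
    return now - timedelta(hours=up_hours, minutes=up_minutes)


# Values copied from the uptime output: 11:26:44, up 6:37.
# The date is an assumption taken from the screenshot names.
now = datetime(2025, 2, 24, 11, 26, 44)
print(boot_time(now, 6, 37).strftime("%H:%M"))  # 04:49
```

Since uptime only reports whole minutes here, the result is accurate to about a minute.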

      So the host did restart for some reason.
      I ran dmesg, and at line 618 it told me to run fsck:
      [ 3.459057] FAT-fs (sda4): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
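
Finding such warnings by eye in a long dmesg dump is tedious, so here is a small sketch that scans saved dmesg text for the "not properly unmounted" message and reports the line number and the seconds-since-boot stamp. The function name and the regular expression are my own assumptions about the line layout, not anything XCP-ng provides:

```python
import re

# Matches kernel log lines such as:
# [    3.459057] FAT-fs (sda4): Volume was not properly unmounted. ...
DMESG_LINE = re.compile(r"^\[\s*(?P<secs>\d+\.\d+)\]\s+(?P<msg>.*)$")


def find_unclean_unmounts(dmesg_text: str) -> list[tuple[int, float, str]]:
    """Return (line_number, seconds_since_boot, message) for each fsck warning."""
    hits = []
    for lineno, line in enumerate(dmesg_text.splitlines(), start=1):
        m = DMESG_LINE.match(line.strip())
        if m and "not properly unmounted" in m.group("msg"):
            hits.append((lineno, float(m.group("secs")), m.group("msg")))
    return hits


# The single line quoted above, used as sample input.
sample = (
    "[ 3.459057] FAT-fs (sda4): Volume was not properly unmounted. "
    "Some data may be corrupt. Please run fsck."
)
for lineno, secs, msg in find_unclean_unmounts(sample):
    print(lineno, secs, msg)
```

To run it on a live host, the text could come from `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`. A hit only shows that the filesystem was not cleanly unmounted before this boot; it does not say why.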

      Maybe this isn't a backup problem, but that is how I started investigating it.

        olivierlambert Vates 🪐 Co-Founder CEO

        Continuous replication uses the export/import mechanism, so I think that's the reason for this task 🙂

          ph7 @olivierlambert

          @olivierlambert
          All backups and continuous replication ran fine during the ~15 hours that the graphs were missing.
          Do you have any idea which logs I can check?

            ph7 @ph7

            In the replication job I keep Replication Retention = 2.
            So at the time of the reboot, 03:48 UTC, there were 2 saved CR runs, from 01:10 and 01:40.
            Screenshot 2025-02-24 at 20-59-29 Backup.png

            Screenshot 2025-02-24 at 20-58-47 Backup.png

            According to the times marked in red, the first alert fired at 05:00 CET (UTC+1).

            Screenshot 2025-02-24 at 20-31.png

            And this is from dmesg:
            [ 0.994105] rtc_cmos 00:02: setting system clock to 2025-02-24 03:48:56 UTC (1740368936)
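
To line the dmesg stamp up with the alert time, the epoch value can be converted to UTC and CET. A minimal sketch, assuming a fixed UTC+1 offset for CET (fine for February, it ignores DST) and treating the 05:00 alert time as exact to the minute:

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset CET; an assumption that holds in winter only.
CET = timezone(timedelta(hours=1), "CET")

# Epoch value copied from the rtc_cmos line in dmesg.
boot_utc = datetime.fromtimestamp(1740368936, tz=timezone.utc)
boot_cet = boot_utc.astimezone(CET)

# First alert time as read from the alert mail (minute resolution).
first_alert = datetime(2025, 2, 24, 5, 0, tzinfo=CET)

print(boot_utc.isoformat())    # 2025-02-24T03:48:56+00:00
print(boot_cet.isoformat())    # 2025-02-24T04:48:56+01:00
print(first_alert - boot_cet)  # 0:11:04
```

So the first alert came roughly 11 minutes after the host booted, which fits the alerts being a consequence of the reboot rather than of the backup schedule.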

            I cannot figure out:

            1. Why did the host reboot?
            2. Why did one of the CR jobs start?
            3. Why were there ~15 hours without any graph?

            I have a UPS with NUT shutdown on the host and on my TrueNAS, and there was no indication of a power failure.

              ph7 @ph7

              I increased the dom0 RAM from the default 1.75 GiB to 2 GiB.
              Hopefully this will do.
