XCP-ng

    Posts by dave

    • RE: Server Locks Up Periodically with ASRock X570D4I-2T AMD Ryzen 9 3900X and Intel X550-AT2

      @R2rho Yeah, there are Supermicro systems with AM5 which can handle a decent amount of load, for example based on the H13SAE-MF:

      https://www.supermicro.com/de/products/system/mainstream/1u/as-1015a-mt
      (with less depth)

      They seem to be stable, but we have a small issue with the onboard graphics at the moment:

      https://xcp-ng.org/forum/topic/9976/black-screen-after-install-on-supermicro-h13sae-mf-with-ryzen-9950x/3?_=1734419502978

      posted in XCP-ng
    • RE: Server Locks Up Periodically with ASRock X570D4I-2T AMD Ryzen 9 3900X and Intel X550-AT2

      @R2rho We have built dozens of ASRock Rack mainboard- and barebone-based systems over the past few years, starting with the X470D4U, which worked really well. Since the X570D4 it started to get messy; the B650D4U is also affected. We had random periodic reboots and freezes, mostly after some weeks or months of uptime.

      Interestingly, we have identical systems with an uptime of over a year. I would say about 60% of the systems were affected.

      BIOS version and attached hardware did not really matter.

      I once contacted ASRock support, but they did not know of a general problem; instead, they suggested checking other components (which we also did).

      We went the RMA route, and even some of the exchanged RMA mainboards were faulty as well.

      But: the most recent mainboard returned from RMA seems to work... so maybe you're lucky 🙂

      posted in XCP-ng
    • RE: Troubleshooting Backups (in general)

      @olivierlambert Yes, I will update and check whether it changes the behaviour at all.

      "For the rest, it's hard to answer without digging more."

      That's exactly what I was looking for: any information on where I can dig deeper?

      I'm looking for logs, traces of errors, or the like.

      It is not that I just want this particular problem to be solved 🙂

      In my limited understanding of the internal processes, I think a backup process somehow dies and xo-server does not recognize or report this.

      But shouldn't it, somehow?

      posted in Xen Orchestra
    • RE: Troubleshooting Backups (in general)

      @olivierlambert OK. Sorry 🙂 Xen Orchestra, commit 17027 installed on Debian 10.

      Just to add: I think in general it's not something that only happens with this version. I have seen such things happening for a few years on different systems.

      Until now, I have lived with it (quite well) 🙂

      I'm just trying to explore where I can find additional information and improve my understanding of what's happening under the hood.

      posted in Xen Orchestra
    • RE: Troubleshooting Backups (in general)

      @olivierlambert It's XO, as I wrote: Server 5.107.5 and Web 5.109.0.

      posted in Xen Orchestra
    • Troubleshooting Backups (in general)

      Hi, I want to learn how I can troubleshoot backup job problems.

      When an error happens, in most cases the job status gets set to "failed" and I get an error message which I can then trace and resolve.

      But occasionally this does not happen, like in the following example:

      I have a job which runs a full backup once a week and a delta backup on the other days.

      1438d9ef-dc99-4c08-bc07-7eabd7a8f974-image.png

      This job started a full backup on Dec 31, 2022 at 5:00 AM. It was still in the state "started" 24 hours later.

      • there was no visible activity anymore (no tasks, no traffic)

      • 3 VMs were backed up successfully

      • the timeout of this job was set to 23 hours (so it should have been killed already?)

      Because this job is stuck in "started", the jobs on the following days fail with "job already running".

      67d5f441-a6a5-4723-9803-f467a12afab4-image.png

      On Jan 3 I restarted the xo-server service, and the job was then set to "interrupted" without an end time.

      The delta backup on Jan 4 started as planned, but is stuck again.

      I would probably be able to reconfigure the job and it would be OK, but since this happens from time to time, I would like to understand what is going on.

      Where can I get additional information?
      Why is it not set to "failed", at least after the configured timeout?

      BTW: I am running from source with Server 5.107.5 and Web 5.109.0, but I had such things happening in earlier versions too.
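
      (So far the only place I have looked myself is the systemd journal of xo-server, assuming the service is really called xo-server on my Debian box, roughly like this:)

      # follow the xo-server log while a backup job is running
      journalctl -u xo-server -f -n 200
      # afterwards, search the journal for backup-related messages around the job start
      journalctl -u xo-server --since "2022-12-31 04:55" | grep -iE "backup|error"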

      posted in Xen Orchestra
    • Backup on encrypted, exchangeable disks

      Hi,
      in small environments with a single host, we often use https://github.com/NAUbackup/VmBackup for backups to USB disks encrypted with LUKS.

      This basically does a "xe vm-export" and works well.

      We wrap this in a small shell script that identifies the currently attached USB drive, mounts it, runs the backup, and then unmounts it.

      Everything is handled by the host itself.
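
      In case it is useful, here is a stripped-down sketch of what the wrapper does (the device detection, mount point and NAUbackup invocation below are simplified placeholders, not our exact script):

      #!/bin/bash
      set -e
      # find the first disk attached via USB (placeholder logic)
      DEV=$(lsblk -rno NAME,TRAN | awk '$2=="usb"{print "/dev/"$1; exit}')
      # unlock the LUKS-encrypted partition and mount it
      cryptsetup luksOpen "${DEV}1" backup_usb
      mount /dev/mapper/backup_usb /mnt/usb-backup
      # run NAUbackup; the exact arguments depend on the local configuration
      /opt/VmBackup/VmBackup.py "$XE_PASSWORD" /opt/VmBackup/example.cfg
      # clean up so the disk can be swapped
      umount /mnt/usb-backup
      cryptsetup luksClose backup_usb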

      Now I was thinking about using an XO VM for such tasks, and the following questions came to mind:

      • What would be the best way to expose the attached USB drives to an XO VM? NFS, USB passthrough, a udev SR, or something similar?

      • How do I configure XO for backup remotes that get plugged and unplugged?

      • What backup strategy would be best suited for such a task?

      posted in Xen Orchestra
    • RE: Alert: Control Domain Memory Usage

      @stormi

      I upgraded an affected pool from 8.1 to 8.2 this weekend and installed the driver on one of the hosts. It's a little early, but as you can see, there seems to be a difference in memory usage:

      Stock Driver

      c53f2add-8bbe-4203-bba8-97e94b466c56-image.png

      Stormi's driver:

      57bf6a63-eff5-47ca-a155-b918c12b95b2-image.png

      One can already see a constant, slowly growing memory usage in "small steps" on the server with the stock driver, whereas the server with Stormi's driver seems to be stable.

      posted in Compute
    • RE: Can't install guest utilities in pfSense

      @ascar Maybe this helps:

      https://forum.netgate.com/topic/97553/pfsense-2-3-on-xen-server/5

      Also, don't forget to disable offloading.
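
      (One way to do that on the XCP-ng side, assuming the VM is actually named "pfsense" and you pick the right VIF UUID, is roughly:)

      # look up the VIFs of the pfSense VM, then disable TX checksum offload on them
      xe vif-list vm-name-label=pfsense params=uuid,device
      xe vif-param-set uuid=<vif-uuid> other-config:ethtool-tx="off"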

      posted in Compute
    • RE: Alert: Control Domain Memory Usage
      [10:36 xs03 ~]# free -m
                    total        used        free      shared  buff/cache   available
      Mem:          11921       11322         171         151         427         175
      Swap:          1023          37         986
      [10:36 xs03 ~]# ps -ef | grep CROND | wc -l
      1
      
      

      BTW: none of my affected pools ever had dynamic memory.
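
      (To double-check that, I simply look at the control domain's memory parameters; dynamic-min equal to dynamic-max means no dynamic memory:)

      # dom0 memory settings for every host in the pool
      xe vm-list is-control-domain=true params=name-label,memory-dynamic-min,memory-dynamic-max,memory-static-max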

      posted in Compute
    • RE: Alert: Control Domain Memory Usage

      I currently have:

      top - 13:35:31 up 59 days, 17:11,  1 user,  load average: 0.43, 0.36, 0.34
      Tasks: 646 total,   1 running, 436 sleeping,   0 stopped,   0 zombie
      %Cpu(s):  0.8 us,  1.1 sy,  0.0 ni, 97.5 id,  0.3 wa,  0.0 hi,  0.1 si,  0.2 st
      KiB Mem : 12205936 total,   149152 free, 10627080 used,  1429704 buff/cache
      KiB Swap:  1048572 total,  1048572 free,        0 used.  1153360 avail Mem
      
      
      top - 13:35:54 up 35 days, 17:29,  1 user,  load average: 0.54, 0.73, 0.77
      Tasks: 489 total,   1 running, 324 sleeping,   0 stopped,   0 zombie
      %Cpu(s):  3.5 us,  3.4 sy,  0.0 ni, 92.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.4 st
      KiB Mem : 12207996 total,   155084 free,  9388032 used,  2664880 buff/cache
      KiB Swap:  1048572 total,  1048572 free,        0 used.  2394220 avail Mem
      
      

      both with:

      # uname -a
      Linux xs01 4.19.0+1 #1 SMP Thu Jun 11 16:18:33 CEST 2020 x86_64 x86_64 x86_64 GNU/Linux
      # yum list installed | grep kernel
      kernel.x86_64                   4.19.19-6.0.11.1.xcpng8.1   @xcp-ng-updates
      
      

      Shall I test something?

      posted in Compute
    • RE: Alert: Control Domain Memory Usage

      Today another customer called:

      He had a host (pool master) with 16 GB of dom0 memory and an uptime of 119 days.

      Currently, all my affected systems are using megaraid_sas, iSCSI, and 10G Intel NICs.

      megaraid_sas shows up in the module lists from @MrMike and @inaki-martinez too.

      This is the customer's lsmod:

      Module                  Size  Used by
      tun                    49152  0
      ebtable_filter         16384  0
      ebtables               36864  1 ebtable_filter
      nls_utf8               16384  0
      cifs                  929792  0
      ccm                    20480  0
      fscache               380928  1 cifs
      iscsi_tcp              20480  16
      libiscsi_tcp           28672  1 iscsi_tcp
      libiscsi               61440  2 libiscsi_tcp,iscsi_tcp
      scsi_transport_iscsi   110592  3 iscsi_tcp,libiscsi
      bonding               176128  0
      bridge                196608  1 bonding
      8021q                  40960  0
      garp                   16384  1 8021q
      mrp                    20480  1 8021q
      stp                    16384  2 bridge,garp
      llc                    16384  3 bridge,stp,garp
      ipt_REJECT             16384  3
      nf_reject_ipv4         16384  1 ipt_REJECT
      xt_tcpudp              16384  8
      xt_multiport           16384  1
      xt_conntrack           16384  5
      nf_conntrack          163840  1 xt_conntrack
      nf_defrag_ipv6         20480  1 nf_conntrack
      nf_defrag_ipv4         16384  1 nf_conntrack
      libcrc32c              16384  1 nf_conntrack
      iptable_filter         16384  1
      dm_multipath           32768  0
      sunrpc                413696  1
      sb_edac                24576  0
      intel_powerclamp       16384  0
      crct10dif_pclmul       16384  0
      crc32_pclmul           16384  0
      ghash_clmulni_intel    16384  0
      pcbc                   16384  0
      aesni_intel           200704  0
      aes_x86_64             20480  1 aesni_intel
      crypto_simd            16384  1 aesni_intel
      cryptd                 28672  3 crypto_simd,ghash_clmulni_intel,aesni_intel
      glue_helper            16384  1 aesni_intel
      dm_mod                151552  285 dm_multipath
      ipmi_si                65536  0
      ipmi_devintf           20480  0
      intel_rapl_perf        16384  0
      ipmi_msghandler        61440  2 ipmi_devintf,ipmi_si
      i2c_i801               28672  0
      sg                     40960  0
      lpc_ich                28672  0
      acpi_power_meter       20480  0
      ip_tables              28672  2 iptable_filter
      x_tables               45056  7 ebtables,xt_conntrack,iptable_filter,xt_multiport,xt_tcpudp,ipt_REJECT,ip_tables
      hid_generic            16384  0
      usbhid                 57344  0
      hid                   122880  2 usbhid,hid_generic
      sd_mod                 53248  9
      isci                  163840  0
      ahci                   40960  0
      libsas                 86016  1 isci
      libahci                40960  1 ahci
      scsi_transport_sas     45056  2 isci,libsas
      xhci_pci               16384  0
      ehci_pci               16384  0
      igb                   233472  0
      libata                274432  3 libahci,ahci,libsas
      ehci_hcd               90112  1 ehci_pci
      xhci_hcd              258048  1 xhci_pci
      e1000e                286720  0
      megaraid_sas          167936  12
      scsi_dh_rdac           16384  0
      scsi_dh_hp_sw          16384  0
      scsi_dh_emc            16384  0
      scsi_dh_alua           20480  1
      scsi_mod              253952  15 isci,scsi_dh_emc,scsi_transport_sas,sd_mod,dm_multipath,scsi_transport_iscsi,scsi_dh_alua,iscsi_tcp,libsas,libiscsi,megaraid_sas,libata,sg,scsi_dh_rdac,scsi_dh_hp_sw
      ipv6                  548864  545 bridge
      crc_ccitt              16384  1 ipv6
      
      
      posted in Compute
    • RE: Alert: Control Domain Memory Usage

      @stormi Usually I migrate all VMs of affected hosts to other hosts when memory is nearly full, but that does not free any memory. Could LVM operations still be happening with no VMs running?
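
      This is roughly what I would check next on such a drained host, just a guess at where leftover storage activity would show up:

      # any storage helper processes still around after all VMs are migrated away?
      ps -eo pid,rss,etime,cmd | grep -E 'tapdisk|vhd-util|lvchange|lvcreate|lvremove' | grep -v grep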

      posted in Compute
    • RE: Alert: Control Domain Memory Usage

      Current Top:

      top - 15:38:00 up 62 days,  4:22,  2 users,  load average: 0.06, 0.08, 0.08
      Tasks: 295 total,   1 running, 188 sleeping,   0 stopped,   0 zombie
      %Cpu(s):  0.6 us,  0.0 sy,  0.0 ni, 99.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
      KiB Mem : 12210160 total,  3596020 free,  7564312 used,  1049828 buff/cache
      KiB Swap:  1048572 total,  1048572 free,        0 used.  4420052 avail Mem
      
        PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
       2516 root      20   0  888308 123224  25172 S   0.0  1.0 230:49.43 xapi
       1947 root      10 -10  712372  89348   9756 S   0.0  0.7 616:24.82 ovs-vswitc+
       1054 root      20   0  102204  30600  15516 S   0.0  0.3  23:00.23 message-sw+
       2515 root      20   0  493252  25388  12884 S   0.0  0.2 124:03.44 xenopsd-xc
       2527 root      20   0  244124  25128   8952 S   0.0  0.2   0:24.59 python
       1533 root      20   0  277472  23956   7928 S   0.0  0.2 161:16.62 xcp-rrdd
       2514 root      20   0   95448  19204  11588 S   0.0  0.2 104:18.98 xapi-stora+
       1069 root      20   0   69952  17980   9676 S   0.0  0.1   0:23.74 varstored-+
       2042 root      20   0  138300  17524   9116 S   0.0  0.1  71:06.89 xcp-networ+
       2524 root      20   0  211832  17248   7728 S   0.0  0.1   8:15.16 python
       2041 root      20   0  223856  16836   7840 S   0.0  0.1   0:00.28 python
      26502 65539     20   0  334356  16236   9340 S   0.0  0.1 603:42.74 qemu-syste+
       5724 65540     20   0  208404  15400   9240 S   0.0  0.1 469:19.79 qemu-syste+
       2528 root      20   0  108192  14760  10284 S   0.0  0.1   0:00.01 xapi-nbd
       9482 65537     20   0  316948  14204   9316 S   0.0  0.1 560:47.71 qemu-syste+
      24445 65541     20   0  248332  13704   9124 S   0.0  0.1  90:45.58 qemu-syste+
       1649 root      20   0   62552  13340   6172 S   0.0  0.1  60:28.97 xcp-rrdd-x+
      

      Requested Files:

      xl top.txt
      dom0 param list.txt
      grub.cfg.txt

      posted in Compute
    • RE: Alert: Control Domain Memory Usage

      Don't restart openvswitch if you have active iSCSI storage attached.

      posted in Compute
    • RE: Alert: Control Domain Memory Usage

      I have found a host with around 7 GB of memory used, mostly without any visible process accounting for it.
      This is a host which runs fewer VMs, so it takes longer to fill up the RAM.
      slabtop.txt meminfo.txt
      ps aux.txt
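
      To put a rough number on the "invisible" part, I compare the summed RSS of all processes against what free reports as used (a crude estimate that ignores shared pages and kernel memory):

      # total process RSS vs. "used" as reported by free
      ps -eo rss= | awk '{sum+=$1} END {printf "process RSS total: %d MiB\n", sum/1024}'
      free -m | awk '/^Mem:/ {print "reported as used:  " $3 " MiB"}'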

      top - 12:15:02 up 60 days, 59 min,  2 users,  load average: 0.25, 0.13, 0.10
      Tasks: 297 total,   1 running, 189 sleeping,   0 stopped,   0 zombie
      %Cpu(s):  0.5 us,  0.4 sy,  0.0 ni, 98.6 id,  0.4 wa,  0.0 hi,  0.0 si,  0.1 st
      KiB Mem : 12210160 total,  3879276 free,  7295660 used,  1035224 buff/cache
      KiB Swap:  1048572 total,  1048572 free,        0 used.  4691716 avail Mem
      
        PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
       2516 root      20   0  866796  92696  25116 S   0.3  0.8 222:40.81 xapi
       1947 root      10 -10  712372  89348   9756 S   0.7  0.7 594:52.86 ovs-vswitchd
       1054 root      20   0  102204  30600  15516 S   0.3  0.3  22:13.27 message-switch
       2515 root      20   0  493252  25328  12884 S   0.0  0.2 119:46.39 xenopsd-xc
       2527 root      20   0  244124  25128   8952 S   0.0  0.2   0:24.59 python
       1533 root      20   0  277472  23956   7928 S   0.0  0.2 155:35.64 xcp-rrdd
       2514 root      20   0   95448  19204  11588 S   0.0  0.2 100:44.55 xapi-storage-sc
       1069 root      20   0   69952  17980   9676 S   0.0  0.1   0:22.94 varstored-guard
       2042 root      20   0  138300  17524   9116 S   0.3  0.1  68:39.86 xcp-networkd
       2524 root      20   0  211576  17248   7728 S   0.0  0.1   7:57.55 python
       2041 root      20   0  223856  16836   7840 S   0.0  0.1   0:00.28 python
      26502 65539     20   0  331284  16236   9340 S   1.0  0.1 580:03.42 qemu-system-i38
       5724 65540     20   0  208404  15400   9240 S   0.7  0.1 450:29.20 qemu-system-i38
       2528 root      20   0  108192  14760  10284 S   0.0  0.1   0:00.01 xapi-nbd
       9482 65537     20   0  316948  14204   9316 S   0.3  0.1 541:50.85 qemu-system-i38
      24445 65541     20   0  247308  13704   9124 S   0.7  0.1  71:45.92 qemu-system-i38
       1649 root      20   0   62552  13340   6172 S   0.0  0.1  58:24.21 xcp-rrdd-xenpm
       1650 root      20   0  109848  13320   6388 S   0.0  0.1 102:33.45 xcp-rrdd-iostat
       1294 root      20   0  127660  11044   5848 S   0.0  0.1  43:57.60 squeezed
       1647 root      20   0  115764  10944   6008 S   0.0  0.1  47:06.07 xcp-rrdd-squeez
      26131 root      20   0   45096  10920   3024 S   0.0  0.1  10065:02 tapdisk
       4781 root      20   0  180476  10816   5832 S   0.0  0.1  41:45.65 mpathalert
       1725 root      20   0  987212  10024   8116 S   0.0  0.1   0:02.70 lwsmd
      25383 root      20   0  155244   9824   8488 S   0.0  0.1   0:00.06 sshd
       1068 root      20   0  222612   9756   5544 S   0.0  0.1  39:12.40 v6d
       1648 root      20   0  196692   9688   5364 S   0.0  0.1  38:58.31 xcp-rrdd-gpumon
       3198 root      20   0 4178388   9488   4160 S   0.0  0.1  22:03.95 stunnel
       1603 root      20   0 1187748   8476   6724 S   0.0  0.1   0:00.05 lwsmd
       1055 root      20   0   67656   8432   4764 S   0.0  0.1 118:55.38 forkexecd
       1691 root      20   0 1060428   7840   6256 S   0.0  0.1   0:00.01 lwsmd
       1073 root      20   0  112824   7752   6724 S   0.0  0.1   0:00.01 sshd
       1558 root      20   0  322832   7652   6292 S   0.0  0.1   2:47.05 multipathd
       1263 root      20   0   73568   7548   3620 S   0.0  0.1  52:55.82 oxenstored
       1651 root      20   0  774588   7144   5732 S   0.0  0.1   0:00.01 lwsmd
      23598 root      20   0   67656   6664   2988 S   0.0  0.1   0:00.00 forkexecd
       1576 root      20   0 1016092   6348   4920 S   0.0  0.1   0:00.02 lwsmd
       5170 root      10 -10   34412   5784   4112 S   0.0  0.0   0:00.00 iscsid
      23599 root      20   0   44980   5696   4968 S   0.0  0.0   0:00.00 stunnel
          1 root      20   0   43816   5460   3792 S   0.0  0.0  17:48.63 systemd
      26109 root      20   0   39700   5396   3024 S   0.0  0.0 272:18.60 tapdisk
       1032 root      20   0  266820   5352   3284 S   0.0  0.0  31:45.39 rsyslogd
       1935 root      10 -10   44740   5260   3800 S   0.0  0.0  20:40.87 ovsdb-server
      26226 root      20   0   39460   5160   3284 S   1.0  0.0 975:42.44 tapdisk
      14571 root      20   0  196608   5044   4388 S   0.0  0.0   0:00.01 login
      25491 root      20   0  162332   4676   3764 R   0.0  0.0   0:00.62 top
       5305 root      20   0   38944   4668   3024 S   0.3  0.0  88:02.03 tapdisk
       9231 root      20   0   38676   4528   3024 S   0.0  0.0  24:38.80 tapdisk
       1469 root      20   0   21428   4508   1764 S   0.0  0.0   9:47.96 cdrommon
      24991 root      20   0  162116   4508   3664 S   0.0  0.0   0:00.73 top
      14758 root      20   0  116504   4420   3008 S   0.0  0.0   0:00.02 bash
      24342 root      20   0   38560   4412   3024 S   0.0  0.0   1:04.19 tapdisk
       1042 dbus      20   0   58120   4328   3824 S   0.0  0.0   1:14.70 dbus-daemon
       2049 root      20   0   63560   4288   2988 S   0.0  0.0   0:00.00 forkexecd
      25437 root      20   0  116500   4264   2916 S   0.0  0.0   0:00.03 bash
       1064 root      20   0   24504   4008   3328 S   0.0  0.0   0:00.11 smartd
       6542 root      20   0  115968   3808   2932 S   0.0  0.0   0:23.13 sh
      
      
      posted in Compute
    • RE: Alert: Control Domain Memory Usage

      @stormi Of course, sorry. That's strange, I would have expected the usage to be quite high after 30 days of uptime, as it was every time over the last year... At the moment the usage is quite low on two servers which were affected before.
      Before the 30 days of uptime seen here, I did a yum update on 25.09.2020. A lot of driver and kernel packages were updated in this run.
      I have another affected pool which was restarted 7 days ago because of memory consumption, but without updating (latest yum update on 08.05.2020). I will keep an eye on it. Maybe some updates for 8.1 released between 08.05.2020 and 25.09.2020 fixed this error. We will see.

      posted in Compute
    • RE: Alert: Control Domain Memory Usage

      @stormi I currently have this one:

      top - 15:55:55 up 30 days, 19:31,  1 user,  load average: 0.13, 0.19, 0.23
      Tasks: 645 total,   1 running, 437 sleeping,   0 stopped,   0 zombie
      %Cpu(s):  0.7 us,  0.7 sy,  0.0 ni, 97.9 id,  0.5 wa,  0.0 hi,  0.0 si,  0.2 st
      KiB Mem : 12205936 total,   159044 free,  6327592 used,  5719300 buff/cache
      KiB Swap:  1048572 total,  1048572 free,        0 used.  5455076 avail Mem
      
        PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
      11785 root      20   0   38944   4516   3256 S   3.6  0.0  27:50.89 tapdisk
      16619 root      20   0   71988  37640  35464 S   2.0  0.3   1048:44 tapdisk
       2179 root      10 -10 1302860 155032   9756 S   1.7  1.3 699:20.93 ovs-vswitchd
       8627 root      20   0   42496   8276   5896 S   1.3  0.1 645:07.94 tapdisk
      12127 65572     20   0  220692  14508   9220 S   1.3  0.1 105:51.34 qemu-system-i38
      15573 65567     20   0  228884  14880   9168 S   1.3  0.1 113:17.76 qemu-system-i38
      16713 root      20   0   71244  37060  35636 S   1.3  0.3 431:04.58 tapdisk
      17124 65565     20   0  253460  15536   9212 S   1.3  0.1 230:28.27 qemu-system-i38
        507 65547     20   0  204308  13576   9176 S   1.0  0.1 374:00.32 qemu-system-i38
       1348 65548     20   0  199188  15852   9268 S   1.0  0.1 478:44.62 qemu-system-i38
       1822 root      20   0  122268  15792   6292 S   1.0  0.1 251:54.49 xcp-rrdd-iostat
       3560 65549     20   0  236052  15696   9272 S   1.0  0.1 478:25.30 qemu-system-i38
       4049 65550     20   0  211476  13712   9096 S   1.0  0.1 374:53.29 qemu-system-i38
       9089 65566     20   0  225812  16328   9236 S   1.0  0.1 226:40.10 qemu-system-i38
      19051 65555     20   0  213524  14960   9444 S   1.0  0.1 312:44.65 qemu-system-i38
      22650 65540     20   0  231956  14016   9104 S   1.0  0.1 476:19.21 qemu-system-i38
      28280 65543     20   0  284180  14356   9180 S   1.0  0.1 481:22.74 qemu-system-i38
      28702 65544     20   0  194068  13636   9020 S   1.0  0.1 373:26.97 qemu-system-i38
      28981 65568     20   0  174604  15528   9244 S   1.0  0.1 107:15.89 qemu-system-i38
      29745 65541     20   0  171532  13792   9132 S   1.0  0.1 476:38.74 qemu-system-i38
       1244 root      20   0   67656   8252   4576 S   0.7  0.1 160:47.13 forkexecd
       4993 root      20   0  180476  10244   3608 S   0.7  0.1  50:10.80 mpathalert
       7194 root      20   0  162508   5052   3824 R   0.7  0.0   0:00.67 top
      15180 root      20   0   44744  10500   9328 S   0.7  0.1  26:43.32 tapdisk
      16643 65573     20   0  229908  14280   9220 S   0.7  0.1  66:42.94 qemu-system-i38
      18769 root      20   0   46616  12316  10912 S   0.7  0.1 241:10.00 tapdisk
      22133 65539     20   0   13.3g  16384   9180 S   0.7  0.1 374:26.35 qemu-system-i38
         10 root      20   0       0      0      0 I   0.3  0.0  47:35.79 rcu_sched
       2291 root      20   0  138300  16168   7660 S   0.3  0.1  65:30.99 xcp-networkd
       3029 root      20   0       0      0      0 I   0.3  0.0   0:02.12 kworker/6:0-eve
       3100 root      20   0   95448  17028   9280 S   0.3  0.1  76:30.01 xapi-storage-sc
       3902 root      20   0       0      0      0 I   0.3  0.0   0:07.16 kworker/u32:0-b
       3909 root      20   0       0      0      0 I   0.3  0.0   0:07.48 kworker/u32:4-b
       6663 root      20   0       0      0      0 S   0.3  0.0  70:40.93 kdmwork-253:0
       7826 root      20   0  193828   4224   3668 S   0.3  0.0   0:00.01 login
       8626 root      20   0   71368  37184  35636 S   0.3  0.3 345:42.82 tapdisk
      

      Please contact me with a DM.

      posted in Compute
    • RE: Alert: Control Domain Memory Usage

      Hi, I still have this problem on 5 hosts in 2 pools. I increased the dom0 memory to 12 GB and 16 GB, but it's still happening. Both XCP-ng 8.0 and 8.1 are involved. On hosts with more VMs it occurs more often than on hosts with fewer VMs. It happens after somewhere between 40 and 140 days, depending on the number of VMs running.

      Yes, openvswitch has the greatest memory usage, but it's still only a small percentage. I still can't see what eats up the memory. Restarting XAPI doesn't change anything.

      top - 18:37:06 up 144 days, 19:43,  1 user,  load average: 2.23, 2.12, 2.16
      Tasks: 443 total,   1 running, 272 sleeping,   0 stopped,   0 zombie
      %Cpu(s):  1.3 us,  1.7 sy,  0.0 ni, 95.7 id,  0.8 wa,  0.0 hi,  0.1 si,  0.4 st
      KiB Mem : 12205932 total,    91920 free, 11932860 used,   181152 buff/cache
      KiB Swap:  1048572 total,   807616 free,   240956 used.    24552 avail Mem
      
        PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
       2248 root      10 -10 1302696 158708   9756 S   1.0  1.3   2057:54 ovs-vswitchd
       3018 root      20   0  597328  25804      4 S   0.3  0.2 635:40.46 xapi
       1653 root      20   0  255940  20628   1088 S   0.0  0.2   1517:12 xcp-rrdd
       1321 root      20   0  142596  15100   7228 S   0.3  0.1  40:49.02 message-switch
       6571 root      20   0  213720  12164   4920 S   0.0  0.1   9:30.58 python
       1719 root      20   0   62480   9980   3488 S   0.0  0.1 269:35.54 xcp-rrdd-xenpm
      13506 root      20   0   43828   9652   2856 S   0.0  0.1   0:05.11 tapdisk
       1721 root      20   0  111596   8684   1592 S   0.0  0.1 337:51.20 xcp-rrdd-iostat
       2342 root      20   0  138220   8656   2744 S   0.0  0.1 218:17.74 xcp-networkd
       1639 root      20   0 1241012   8428   6024 S   0.0  0.1 150:25.56 multipathd
       6092 root      20   0   42428   7924   3924 S  17.2  0.1   2987:48 tapdisk
       1649 root      20   0   75116   6980   2192 S   0.0  0.1 294:06.89 oxenstored
       5436 root      10 -10   35432   6760   4112 S   0.0  0.1   0:00.03 iscsid
      13898 root      20   0   40824   6648   2856 S   0.0  0.1   0:09.13 tapdisk
       3547 root      20   0   39852   5564   3376 S   0.7  0.0  54:01.72 tapdisk
       3006 root      20   0   40028   5460   2496 S  14.2  0.0  19:10.48 tapdisk
       1326 root      20   0   67612   5220   2840 S   0.0  0.0 529:23.01 forkexecd
       3027 root      20   0  108028   5176   5176 S   0.0  0.0   0:00.02 xapi-nbd
      15298 root      20   0   39644   5156   3940 S   0.7  0.0 853:39.92 tapdisk
       3694 root      20   0  238044   5084   5084 S   0.0  0.0   0:01.39 python
       6945 root      20   0   39484   4860   3804 S  15.8  0.0 591:05.22 tapdisk
      24422 root      20   0   44980   4844   4756 S   0.0  0.0   0:00.22 stunnel
      11328 root      20   0   44980   4684   4640 S   0.0  0.0   0:00.06 stunnel
       2987 root      20   0   44980   4608   4440 S   0.0  0.0   0:00.29 stunnel
       6095 root      20   0   38768   4588   2912 S   0.0  0.0 764:33.14 tapdisk
       1322 root      20   0   69848   4388   3772 S   0.0  0.0   1:15.05 varstored-guard
      14873 root      20   0   38688   4360   2744 S   0.0  0.0  57:33.78 tapdisk
       1329 root      20   0  371368   4244   3664 S   0.0  0.0   0:41.49 snapwatchd
       1328 root      20   0  112824   4212   4212 S   0.0  0.0   0:00.02 sshd
       2219 root      10 -10   44788   4004   3064 S   0.0  0.0 138:30.56 ovsdb-server
       3278 root      20   0  307316   3960   3764 S   0.0  0.0   3:15.34 stunnel
      17064 root      20   0  153116   3948   3772 S   0.0  0.0   0:00.16 sshd
      30189 root      20   0   38128   3716   2828 S   0.0  0.0  97:30.65 tapdisk
      
      

      @olivierlambert Do you have an idea how to get to the root of this? Would it maybe be possible to get some (paid) support to look into it?
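
      Before the next restart of an affected host I will also collect the kernel-side counters, since that memory would not show up as process RSS (assuming slabtop is available in dom0):

      # kernel memory that is invisible in per-process RSS
      grep -E 'Slab|SReclaimable|SUnreclaim|VmallocUsed|PageTables' /proc/meminfo
      slabtop -o | head -n 20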

      posted in Compute
    • RE: XcpNG - Xen kernel crash (FATAL TRAP: vector = 2 (nmi))

      @petr-bena Thanks.

      I can confirm: until now, everything is stable for us too (with nmi=dom0).
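
      (For reference, this is how we added the parameter, assuming the xen-cmdline helper that ships with XCP-ng; a host reboot is needed afterwards:)

      # add nmi=dom0 to the Xen boot command line, then verify the setting
      /opt/xensource/libexec/xen-cmdline --set-xen nmi=dom0
      /opt/xensource/libexec/xen-cmdline --get-xen nmi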

      @olivierlambert Since I only have production servers with the affected hardware at the moment, I can't test the 8.1 beta right now, but after release I will try the 8.1 final. Do you think there is a real chance that this error won't appear in stock 8.1? Or should I make the same change?

      posted in Compute