XCP-ng

    Hardware Health Monitoring

    10 Posts 4 Posters 2.6k Views 5 Watching
    • hawk223

      I'm looking at switching from ESXi. As far as I can tell there is no hardware health monitoring in XCP-ng/XO. I'm hoping that I'm just not finding it. How is this handled? This is a requirement for switching.

      In ESXi I get notifications when a hard drive in a RAID array fails, and for fan failures, power supply failures, etc. I don't feel comfortable switching without these features. How is everybody handling this?

      The only thing I found was RAID array monitoring coming in 8.3, but I don't see anything else.

      • olivierlambert (Vates, Co-Founder CEO)

        Hi,

        What exactly do you need, and on which server model? There's no universal answer. In the short term, we are working with 2CRSi to provide a range of servers with all the hardware details. It might come with other manufacturers 🙂

        • KPS (Top contributor) @hawk223

          @hawk223
          There is no built-in ("default") hardware monitoring, but this should not be a big problem.
          If you are using hardware from any big vendor, there should be tools such as storcli that can be used for monitoring.

          --> ipmi-sensors is installed
          --> storcli is available as an RPM (and works fine)
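
          As a rough illustration of what a check built on ipmi-sensors could look like, here is a minimal Python 3 sketch. It assumes the FreeIPMI `ipmi-sensors` options `--output-sensor-state`, `--comma-separated-output` and `--no-header-output`, and assumes the resulting column order is ID, Name, Type, State, Reading, Units, Event; verify both against your installed version. The function names are made up for this example.

```python
import csv
import io
import subprocess

# Assumed FreeIPMI invocation; check the options against your installed version.
IPMI_CMD = [
    "ipmi-sensors",
    "--output-sensor-state",
    "--comma-separated-output",
    "--no-header-output",
]


def find_unhealthy_sensors(ipmi_csv):
    """Return (name, type, state, event) for sensors in Warning or Critical state.

    Assumes columns: ID, Name, Type, State, Reading, Units, Event.
    Sensors reporting N/A or Nominal are skipped.
    """
    unhealthy = []
    for row in csv.reader(io.StringIO(ipmi_csv)):
        if len(row) < 7:
            continue  # skip blank or unexpected lines
        _id, name, kind, state, _reading, _units, event = [f.strip() for f in row[:7]]
        if state in ("Warning", "Critical"):
            unhealthy.append((name, kind, state, event))
    return unhealthy


def read_sensors():
    """Run ipmi-sensors locally (needs root on the host) and return its output."""
    result = subprocess.run(
        IPMI_CMD, stdout=subprocess.PIPE, universal_newlines=True, check=True
    )
    return result.stdout
```

          Calling `find_unhealthy_sensors(read_sensors())` on the host returns an empty list when every readable sensor is nominal; anything else can be handed to whatever alerting you already use.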

          • hawk223 @olivierlambert

            @olivierlambert What I'm hoping for is to get an email if there is a hardware failure. Mostly for the RAID array, power supplies, and fans; ECC errors, temperature and other stuff would be nice as well.

            I have some older Intel R2224gz models with the s1600gz board. I'm looking at getting some HP Gen9 stuff as well.

            • hawk223 @KPS

              @KPS I have storcli running already. I needed it to set up the RAID, and I've been using it on ESXi to manage the array when a drive needs replacing. I'll have to try out the IPMI stuff. I tried sensor detection and it didn't find any sensors over IPMI, although it did detect IPMI itself.

              While I can manually check with storcli is there a way to automate the check and email issues?

              • KPS (Top contributor) @hawk223

                @hawk223 said in Hardware Health Monitoring:

                While I can manually check with storcli is there a way to automate the check and email issues?

                I run a script from my monitoring software (PRTG) every 5 minutes that checks the array through storcli.
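
                For anyone without PRTG, the same idea (poll storcli, email on trouble) can be sketched in plain Python 3. This is not KPS's script. It assumes that `storcli64 /c0 /vall show J` prints JSON shaped like `Controllers[].Response Data.Virtual Drives[]` with a `State` field where `Optl` means optimal; the binary path, controller index, mail addresses and local SMTP relay are all placeholders to adapt.

```python
import json
import smtplib
import socket
import subprocess
from email.message import EmailMessage

# Everything below is an assumption for illustration, not taken from the thread.
STORCLI = "/opt/MegaRAID/storcli/storcli64"  # assumed install path
MAIL_FROM = "xcp-host@example.com"           # hypothetical sender
MAIL_TO = "admin@example.com"                # hypothetical recipient
SMTP_HOST = "localhost"                      # assumed local mail relay


def find_bad_virtual_drives(storcli_json):
    """Describe every virtual drive whose State is not 'Optl' (assumed JSON layout)."""
    data = json.loads(storcli_json)
    problems = []
    for controller in data.get("Controllers", []):
        response = controller.get("Response Data") or {}
        for vd in response.get("Virtual Drives", []):
            state = vd.get("State", "unknown")
            if state != "Optl":
                problems.append(
                    "VD %s (%s) is %s" % (vd.get("DG/VD", "?"), vd.get("TYPE", "?"), state)
                )
    return problems


def build_alert(problems, host):
    """Build the alert email listing one problem per line."""
    msg = EmailMessage()
    msg["Subject"] = "RAID alert on %s" % host
    msg["From"] = MAIL_FROM
    msg["To"] = MAIL_TO
    msg.set_content("\n".join(problems))
    return msg


def run_check():
    """Query controller 0 and send one email if any virtual drive is not optimal."""
    result = subprocess.run(
        [STORCLI, "/c0", "/vall", "show", "J"],
        stdout=subprocess.PIPE, universal_newlines=True, check=True,
    )
    problems = find_bad_virtual_drives(result.stdout)
    if problems:
        with smtplib.SMTP(SMTP_HOST) as smtp:
            smtp.send_message(build_alert(problems, socket.gethostname()))
    return problems
```

                To use it, add a call to `run_check()` at the bottom of the file and schedule the script, for example from cron every 5 minutes. Because `check=True` raises when storcli itself fails, a broken check surfaces as an error rather than passing silently.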

                • olivierlambert (Vates, Co-Founder CEO)

                  We'll work on better hardware reporting/integration in XO; that's one of our targets 🙂

                  • john.c

                    I have three Dell PowerEdge R620 servers; one of them was obtained in December 2023 to become the storage for the other two.

                    Their upgrade potential is massive (for my needs): up to 768 GB of RAM, up to 96 TB of storage, and dual sockets supporting 12-core CPUs.

                    The maximum CPU speed with 12 cores is 2.7 GHz.

                    Anyway, I would like to see better hardware health monitoring; it could be aided by interfacing in both directions with Dell's iDRAC.

                    @hawk223 In the meantime, do you have permission to configure yourself as an email alert recipient on the out-of-band management controller (BMC) on those servers?

                    I ask because the BMC mainly sends hardware health alerts. If you don't have permission, and whoever does isn't able or willing to add you, then you may need to wait for this addition.

                    • hawk223 @john.c

                      @john-c I have access to configure myself as an email recipient; however, I don’t recall why I never did that. There is some complication with doing so. If I recall correctly, it needs a hack to configure an email server. I will investigate.

                      I have also been looking into something like Nagios or Zabbix, but that’s an extra component to run. I would prefer a simple solution that doesn’t require a lot of effort and hacking.

                      • john.c @hawk223


                        Most modern BMCs include a web interface, so you can navigate to it and log in via its form. From there it's fairly simple to set up, though the account for sending the email may be a challenge, as some implementations require the sending account to have no password!

                        The exact host name to use depends on how the BMC's IP address and/or FQDN is configured on your network.
