Seeking advice on debugging unexplained change in server fan speed
-
@DustinB I wish I had asked the question here earlier. I asked it a little while ago on ServerFault.com, figuring that was the best place for this question since it has nothing to do with XCP-ng. Nobody has answered and one person even downvoted it without saying why.
If you use ServerFault and you answer over there, I'll mark it as an answer if this works, so you can get some internet points.
https://serverfault.com/questions/1169753/what-might-cause-server-fans-to-double-in-rpm-after-a-simple-reboot
-
@CodeMercenary said in Seeking advice on debugging unexplained change in server fan speed:
@DustinB I wish I had asked the question here earlier. I asked it a little while ago on ServerFault.com, figuring that was the best place for this question since it has nothing to do with XCP-ng. Nobody has answered and one person even downvoted it without saying why.
If you use ServerFault and you answer over there, I'll mark it as an answer if this works, so you can get some internet points.
https://serverfault.com/questions/1169753/what-might-cause-server-fans-to-double-in-rpm-after-a-simple-reboot
I don't think I've ever signed up over there, but I'll take a look.
Just replied, for anyone else who may need it in the future. I'm Jarli over there.
-
@CodeMercenary Any update?
-
@DustinB Nothing useful yet. I rebooted the servers and explored a bit in the BIOS to see if there were any settings, or to at least tweak some things to see if it would reset whatever went wrong in the reboot in mid December. While doing that I found that one of the two impacted servers was a version behind for the BIOS as well as for the iDRAC so I updated both of them. Unfortunately, that made no change to the fan speeds.
I've been out sick all of this week so far, but I'll be looking into this more when I get back to the office. I've read about ways to manually control the fans, but I'd rather not depend on a script running somewhere that makes those kinds of decisions. I'd much rather have iDRAC, or whatever normally controls it, handle it like it used to.
-
@CodeMercenary said in Seeking advice on debugging unexplained change in server fan speed:
@DustinB Nothing useful yet. I rebooted the servers and explored a bit in the BIOS to see if there were any settings, or to at least tweak some things to see if it would reset whatever went wrong in the reboot in mid December. While doing that I found that one of the two impacted servers was a version behind for the BIOS as well as for the iDRAC so I updated both of them. Unfortunately, that made no change to the fan speeds.
I've been out sick all of this week so far, but I'll be looking into this more when I get back to the office. I've read about ways to manually control the fans, but I'd rather not depend on a script running somewhere that makes those kinds of decisions. I'd much rather have iDRAC, or whatever normally controls it, handle it like it used to.
Sorry you're not feeling well. When you're back on your feet, look specifically for firmware for your fans.
Hope you're feeling better soon.
-
@DustinB I forgot to mention that I did look for firmware for the fans, and I see nothing on Dell's downloads for the R630 that indicates there is any fan-related firmware at all. That's why I started tweaking the power and cooling settings in the BIOS and iDRAC, to see if I could get it to go back to the way it was.
-
@CodeMercenary Do you mind sending me your serial number? I'd be happy to take a look for you to confirm.
-
So, a bit after I originally posted this, the fans on one of the two servers slowed back down and I don't know why. I only noticed it weeks later.
Then this morning we had a power outage and all the servers were shut down. When I booted them back up after power was restored, the other server was running its fans at normal speed. No idea why it went back; I didn't do anything to fix it.
After that reboot, though, a single fan in the server that originally didn't have the problem is now running fast. That makes me wonder if the fan is failing, so I'm looking for some spares to keep around.
Any future reboots are going to make me a bit stressed, wondering if the fans will speed up again.
I did install the pool patches today and that reboot didn't impact the fans, thankfully. I wish I understood what happened, but if it happens again I might use this Docker container to take over control of them: https://github.com/tigerblue77/Dell_iDRAC_fan_controller_Docker
-
@CodeMercenary Thanks for providing this. My fans suddenly spun up like crazy and I used that GitHub container to calm them down. Works a treat!
That package installs a container that runs every X seconds, connects to the server, and resets the fans. I don't find that necessary. In my case, the fans went nuts after I installed a third-party (non-Dell) SAS controller. I noticed that one of the IPMI parameters the container sets is THIRD_PARTY_PCIE_CARD_COOLING_RESPONSE. I set that to false, and the fans haven't spun up super high since. I think setting it once is persistent, so there's no need to have this script connecting every so often to reset the fans.
If anyone is looking for a one-time fix to this, I think running this command one time is sufficient.
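For anyone who wants to try the one-time approach without the container, here is a minimal Python sketch that builds (and can optionally run) an ipmitool call against the iDRAC. Treat all of it as assumptions rather than the container's exact method: the raw byte sequences are the ones commonly reported for 12th/13th-generation PowerEdge servers and are not taken from this thread, the host and credentials are placeholders, and the function names are made up for illustration. It also assumes ipmitool is installed and IPMI over LAN is enabled in iDRAC. Check the bytes against the linked repository before running anything.

```python
import subprocess

# ASSUMPTION: raw IPMI payloads as commonly reported for Dell 12th/13th-gen
# PowerEdge servers. Verify against the container's source before use.
DISABLE_THIRD_PARTY_RESPONSE = (
    "0x30 0xce 0x00 0x16 0x05 0x00 0x00 0x00 0x05 0x00 0x01 0x00 0x00"
)
ENABLE_THIRD_PARTY_RESPONSE = (
    "0x30 0xce 0x00 0x16 0x05 0x00 0x00 0x00 0x05 0x00 0x00 0x00 0x00"
)


def build_ipmitool_command(host, user, password, disable=True):
    """Return the ipmitool argument list (hypothetical helper).

    disable=True turns off the extra cooling response that iDRAC applies
    when it detects a third-party PCIe card; disable=False restores it.
    """
    payload = DISABLE_THIRD_PARTY_RESPONSE if disable else ENABLE_THIRD_PARTY_RESPONSE
    return [
        "ipmitool", "-I", "lanplus",
        "-H", host, "-U", user, "-P", password,
        "raw", *payload.split(),
    ]


def apply_setting(host, user, password, disable=True):
    """Run the command once against the iDRAC and return ipmitool's output."""
    cmd = build_ipmitool_command(host, user, password, disable)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Placeholder iDRAC address and credentials; replace with your own.
    # This only prints the command so it can be reviewed before running.
    print(" ".join(build_ipmitool_command("192.0.2.10", "root", "changeme")))
```

Running the script as-is only prints the command for review. Calling `apply_setting(...)` actually sends it, and passing `disable=False` should put the default behavior back if the change doesn't help.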
-
@paco Oh, that's great information, thank you. I'd much rather reset that flag instead of constantly adjusting the fans.