XCP-ng
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Groups
    • Register
    • Login
    1. Home
    2. mpyusko
    Offline
    • Profile
    • Following 0
    • Followers 0
    • Topics 2
    • Posts 23
    • Groups 0

    mpyusko

    @mpyusko

    HyperVisors SuperMicro 6016T-NTF (3x)
    CPU 2x Dual X5660 , 1x Dual L5640
    RAM 96GB ECC
    GPU AMD Radeon Pro W6600 8GB (1x)
    Controller USB 3.0
    SAN/NAS Dell R710
    CPU Dual Intel Xeon X5660 @ 2.8GHz
    RAM 168 GB DDR3 ECC RDIMM
    TrueNAS 13
    Controller LSI SAS 9211-8i
    PCIe x2 SATA III
    USB 3.0
    PCIe x16 to NVMe
    Boot Samsung 860 EVO 250 GB
    Storage WD RED NAS 5400RPM 3TB (6x)
    L2ARC 118 GB Intel OPTANE SSD P1600X Series NVMe (1292 TBW)
    ZIL/SLOG Intel Optane M10 64GB NVMe
    2
    Reputation
    410
    Profile views
    23
    Posts
    0
    Followers
    0
    Following
    Joined
    Last Online

    mpyusko Unfollow Follow

    Latest posts made by mpyusko

    • RE: XOce "Still Loading"

      @olivierlambert Yes, direct access is fine.

      posted in Xen Orchestra
      mpyuskoM
      mpyusko
    • XOce "Still Loading"

      948e1087-8a9b-46c8-8427-2abcef0263b7-image.png
      My Nginx RP .conf

      location / {
          proxy_pass https://192.168.1.200:443;
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
         proxy_set_header X-Forwarded-Proto $scheme;
         proxy_set_header Connection "upgrade";
         proxy_set_header Upgrade $http_upgrade;
         proxy_http_version 1.1;
         proxy_redirect default;
         proxy_read_timeout 1800;
      
      }
      
      

      Yes, I used the https://xen-orchestra.com/docs/reverse_proxy.html instructions. No, it's not working right. Please help.

      posted in Xen Orchestra
      mpyuskoM
      mpyusko
    • RE: vGPU - which graphics card supported?

      I am running a WX7100 under XCP-ng 7.6. It only allows full GPU passthrough. So you can run multiple VMS on the server, but only one at a time can be assigned the GPU. I have tested it to work properly with Server 2016 std, Server 2019 std, and Win10 pro. Currently I have it crunching BOINC under win10pro and here are the stats.... BOINCstats for Norby

      posted in Development
      mpyuskoM
      mpyusko
    • RE: XCP-ng Windows Management Agent

      I did try that, but it still did not want to recognize and load the PV drivers properly. The VM is still limping with the realtek divers and. It is performing noticeably slow. Oddly others upgraded without a hitch.

      posted in Development
      mpyuskoM
      mpyusko
    • RE: XCP-ng 7.5 - MegaRAID SAS 9240-8i hang/reboot issue.

      @r1 I have since replaced the LSI 9240-8i in this machine with a LSI 9210-8i. The Server has been running stable for a few months. I took the LSI 9240-8i and put it in a new build and installed XCP-ng on it, and it is running stable too. You can see the specs in my profile. What this boils down to, Is I believe it must be something where the hardware drivers don't want to play nice together. Both are running the same versions of XCP-ng (and are in the same pool just for kicks).

      posted in Development
      mpyuskoM
      mpyusko
    • RE: XCP-ng Windows Management Agent

      @borzel

      1, Upgrade Pool Master First.

      • Disable HA for Pool
      • Shuffle all running VMs to another server in Pool.
      • Place server in Maintenance mode+, designate new Pool Master
      • Mount XCP-ng iso in Virtual Media drive on Server
      • Reboot Server to Virtual Media
      • Follow Upgrade steps
      • Unmount Virtual Media and reboot server.
      • Designate as Pool Master

      2, Upgrade Subsequent servers in Pool

      • Shuffle all running VMs to another server in Pool.
      • Place server in Maintenance mode+
      • Mount XCP-ng iso in Virtual Media drive on Server
      • Reboot Server to Virtual Media
      • Follow Upgrade steps
      • Unmount Virtual Media and reboot server.

      3, Repeat part 2 for all remaining servers in the Pool.

      • Enable HA for Pool

      + = I manually shuffle the VMs before placing it in Maintenance Mode. The autmation does ok for a couple VMs but when there are a half-dozen in play it gets a little wonky and fails part way through. You wind up attempting two or three times and restarting the toolstack a few timesbefore they are all "automatically" moved. Manually I can move two or 3 at once to differing servers and get done a bit quicker and less of a headache.

      I had 3 servers (Xen 1,2 & 3 we'll call them) in this pool running approximately 20 VMs under XS 7.1. Between the remaining two, I was able to keep them running while any particular Server was being upgraded to XCP-ng 7.6. When I upgraded Xen1, Xen 2 and 3 ran all the VMs from a shared SR. When it came time to upgrade Xen2, I shuffled all the sites off the server onto Xen1. Xen3 wasn'e be messed with. Then I upgraded Xen3 and all the sites on it went to Xen2. After I was done with Xen3, I balanced the VMs out across the pool. It seemed to go smoothly, but shortly after I was done, I noticed alarms going off for one of the VMs (VM2 we'll call it). VM1, 2, 3 and 4 are all windows VMs. VMs 1, 2, and 3 are Server 2012 R2, VM4 is a Windows 10 Ent virtual workstation I use to do all this stuff with. VM4 actually floats around from Pool to Pool depending where I need it and what for. During this process It was running from Local Storage on Xen2 or Xen3. The bulk of the rest of the VMs are Linux based, they all chose to play nice too.

      So when I investigated the alarm, it turns out the VM detected new hardware and unloaded the PV drivers and loaded the QEMU, Realtek and so-on. The other VM1, 3 and 4 did not react the same and all still have their PV optimizations even after subsequent reboots. (I couldn't live migrate the VMs, I had to shut them down, move them, and reboot them until all servers in the pool were on the same version)

      In an attempt to resolve the issue...

      • I tried installing the Citrix Drivers (All other VMs are using Citrix drivers carried over from XS 7.1) That didn't work.
        -I found RC2 and the directions to uninstall the Citrix Drivers and install RC2. I uninstalled, cleaned as directed (your directions fail to mention a service needs to be disable in order to delete the last xen file from System32) and instaRC2. That didn't work.
      • I doublechecked the files and Device manager, and ran c:\Program Files\xcp-ng\XenTools\InstallAgent.exe DEFAULT (cmd as Admin). I got the notice to reboot and followed through. On reboot another notice popped up but it still wasn't loading.
      • I rebooted 3 more times (as was mentioned) and still no luck.
      • I found RC3 and followed the same process with RC3. Still not luck.
      • Now another SysAdmin is working with me and we've tried cleansing the system as prescribed and installing manually. I even found a how-to from a Veeam issue with similar symptoms (https://forums.veeam.com/veeam-agent-for-windows-f33/veeam-bare-metal-recovery-on-xenserver-with-pv-nic-drivers-t48304.html) so I made an edit to a couple registry keys and still no luck.
        As of right now, we're still poking around the one copy, while the other copy goes "Time Traveling." (see Previous post).
      posted in Development
      mpyuskoM
      mpyusko
    • RE: XCP-ng Windows Management Agent

      So I've been testing XCP-ng since the 7.5 release first came out. It has been my experience the PV drivers are automatically detected and install from Windows. (I have created new windows VMs and they come up already optimized.)

      Recently I have been upgrading our servers from XS 7.1 to XCP-ng 7.6. The vast majority took the upgrade, like a textbook and are fine. VMs running either the previous Citrix or the Windows Update drivers and up and optimized without a need to install these packages. EXCEPT (there's always an exception, right?) I have a Windows Server 2012 R2 VM and it doesn't want to play nice. After the Hypervisors were upgraded, and the shuffling was all said and done. this machine will not load the PV drivers. I've tried the "clean your system" how-to, the manual install, drivers RC2 and RC3, even harvested the Citrix drivers in a somewhat creative way and they'll all install, but Windows insists on loading the Realtek driver for the NIC and other 'less than optimized drivers." Like Kirk is ST IV, "I'm going to attempt time travel" to save this whale, by exporting the VM and then importing it into an older version of XS where I can try to install the guest utils properly. Xen won't let you simply "move" to an older version of the server.

      The odd part is, I have an essentially Identical VM which went through the upgrade process the same way and it's fine. I also have a Windows 2010 Enterprise VM that survived fine too. This one for some odd reason, isn't.

      I have two copies of this server running and I'm trying different solutions concurrently. I'm open to other suggestions.

      posted in Development
      mpyuskoM
      mpyusko
    • RE: XCP-ng 7.5 - MegaRAID SAS 9240-8i hang/reboot issue.

      @mpyusko BTW.... Yes, it did "repair" the SR, but it did not rebuild the array. It is functioning off one drive only. the other is now out of sync. Not a perfect transition, but it's nice to know my data isn't lost by simply swapping adapters. Clearly I'll need to build a software array to rectify the issue.

      posted in Development
      mpyuskoM
      mpyusko
    • RE: XCP-ng 7.5 - MegaRAID SAS 9240-8i hang/reboot issue.

      I just received an LSI SAS-9211 I ordered. I have the exact same controller running in a production environment for a web-hosting company. Similar architecture (HP DL380 G6) and it operates flawlessly with XS 7.1 so I figured I would try on in this machine. Initially I wanted the 9240 for it's hardware based RAID6. However with ZFS support rolling out in the new versions of XS/XCP-ng, there are greater gains to be had there. So opting for a different JBOD mode controller was a fair choice.

      Interesting things to note.

      • Both brand new LSI cards
      • they each use different drivers.
      • 9240 supports a broad range of RAID levels 0,1,10,5,6,etc and JBOD
      • 9211supports RAD levels 0,1,10, 1E/10E and JBOD
      • My system was originally configure for two drives in a 1TB RAID1 volume and 4 drives in a 9TB RAID5 volume.

      I removed the 9240 and installed the 9211 making sure the port 0 = 0 and 1 = 1. I then booted the system and entered the LSI setup. unlike the 924, there was no option to import existing arrays. Rather than detsroy everything on the 1TB array (the 9TB was still empty) I opted to just boot straight to XCP-ng 7.6 without any changes. The last time I shut down the server I detached all the storage volumes from the VMs. (A quick trick I learned... detatch volumes, export VMs - only takes a couple minutes and a few KB - then dd flashdrive to backup flash. Upgrade Xen, reattach drives. Keeps you from risking your data, especially on critical machines.) When I booted this time, all the SR's were broken but a quick repair brought them back and when I reattached the volumes, the VMs booted. Remember this was originally a Hardware level Array. I'm still trying to peek into the Array to see if both drives are functioning, but appears healthy so far.

      The big question is will the system still generate the same issue? Well, one is a megaraid controller and the other isn't so they use different drivers. Here is the output...

      [root@vincent ~]# lspci |grep LSI
      07:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
      [root@vincent ~]# lspci -vv -s 07:00.0
      07:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
              Subsystem: LSI Logic / Symbios Logic Device 3020
              Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
              Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
              Latency: 0, Cache Line Size: 64 bytes
              Interrupt: pin A routed to IRQ 40
              Region 0: I/O ports at ec00 [size=256]
              Region 1: Memory at df2bc000 (64-bit, non-prefetchable) [size=16K]
              Region 3: Memory at df2c0000 (64-bit, non-prefetchable) [size=256K]
              Expansion ROM at df200000 [disabled] [size=512K]
              Capabilities: [50] Power Management version 3
                      Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                      Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
              Capabilities: [68] Express (v2) Endpoint, MSI 00
                      DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                              ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
                      DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
                              RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                              MaxPayload 256 bytes, MaxReadReq 512 bytes
                      DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                      LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <64ns, L1 <1us
                              ClockPM- Surprise- LLActRep- BwNot-
                      LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                              ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                      LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                      DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported
                      DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
                      LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                               Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                               Compliance De-emphasis: -6dB
                      LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                               EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
              Capabilities: [d0] Vital Product Data
      pcilib: sysfs_read_vpd: read failed: Input/output error
                      Not readable
              Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
                      Address: 0000000000000000  Data: 0000
              Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
                      Vector table: BAR=1 offset=00002000
                      PBA: BAR=1 offset=00003800
              Capabilities: [100 v1] Advanced Error Reporting
                      UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                      UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                      UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
                      CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                      CEMsk:  RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
                      AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
              Capabilities: [138 v1] Power Budgeting <?>
              Capabilities: [150 v1] Single Root I/O Virtualization (SR-IOV)
                      IOVCap: Migration-, Interrupt Message Number: 000
                      IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                      IOVSta: Migration-
                      Initial VFs: 16, Total VFs: 16, Number of VFs: 0, Function Dependency Link: 00
                      VF offset: 1, stride: 1, Device ID: 0072
                      Supported Page Size: 00000553, System Page Size: 00000001
                      Region 0: Memory at 0000000000000000 (64-bit, non-prefetchable)
                      Region 2: Memory at 0000000000000000 (64-bit, non-prefetchable)
                      VF Migration: offset: 00000000, BIR: 0
              Capabilities: [190 v1] Alternative Routing-ID Interpretation (ARI)
                      ARICap: MFVC- ACS-, Next Function: 0
                      ARICtl: MFVC- ACS-, Function Group: 0
              Kernel driver in use: mpt3sas
      
      [root@vincent ~]#
      
      

      And specifically the Module in use...

      [root@vincent ~]# modinfo mpt3sas |grep -i version
      version:        22.00.00.00
      srcversion:     80624A1362CD953ED59AF65
      vermagic:       4.4.0+10 SMP mod_unload modversions
      [root@vincent ~]#
      

      (Yes, my server is named after the robot in the Black Hole)

      posted in Development
      mpyuskoM
      mpyusko
    • RE: XCP-ng 7.5 - MegaRAID SAS 9240-8i hang/reboot issue.

      @olivierlambert It will be. Kali with 4.15 and Debian with 4.9 both do not exhibit the issue. However Xenserver and XCP-ng both do. I'd be interested to compare their compiler settings as to what they do and do not include.

      posted in Development
      mpyuskoM
      mpyusko