VDI_IO_ERROR(Device I/O errors) when you run scheduled backup

olivierlambert

Hmm nothing obvious here…

tuxen

This got my attention:

Jan 15 19:17:40 xcp-ng-xen12-lon2 xapi: [error||623653 INET :::80||import] Caught exception in import handler: VDI_IO_ERROR: [ Device I/O errors ]
Jan 15 19:17:40 xcp-ng-xen12-lon2 xapi: [error||623653 INET :::80||backtrace] VDI.import D:378e6880299b failed with exception Unix.Unix_error(Unix.EPIPE, "single_write", "")
Jan 15 19:17:40 xcp-ng-xen12-lon2 xapi: [error||623653 INET :::80||backtrace] Raised Unix.Unix_error(Unix.EPIPE, "single_write", "")

This Unix.EPIPE error on the remote target means that the pipe stream is being closed before VDI.import receives all the data. The outcome is a VDI I/O error caused by a broken, partially sent/received VDI.
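
To illustrate what EPIPE means in general, here is a minimal Python sketch. It is not xapi's code: a local pipe stands in for the import stream, and the function name is made up for the example. The writer gets EPIPE as soon as it tries to send data after the reading side has gone away.

```python
import errno
import os


def write_after_reader_closed(payload: bytes) -> int:
    """Write to a pipe whose read end is already closed; return the errno raised."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the receiving side disappears before the transfer completes
    try:
        os.write(write_fd, payload)
    except BrokenPipeError as exc:  # Python maps EPIPE to BrokenPipeError
        return exc.errno
    finally:
        os.close(write_fd)
    return 0  # only reached if the write unexpectedly succeeded


if __name__ == "__main__":
    code = write_after_reader_closed(b"partial VDI data")
    print(errno.errorcode[code])
```

The point is only that the error is reported on the writing side, while the actual cause (why the other end closed) has to be looked for elsewhere.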

Since a remote link over the internet is more prone to latency and intermittent drops, you might need to adjust the remote NFS soft timeout/retries, or mount the target with the hard option.

I would also check whether the remote target is running out of space during the backup process.

shorian

@tuxen The target has plenty of space: over 5 TB free, or about 20x the cumulative VM sizes. No NFS involved; it's a locally mounted ext4 RAID 1 array on the target box.

If the same backup takes place behind the firewall, it runs successfully 95% of the time; across the WAN it fails 95% of the time. Both are over a 1 Gbps link.

Sometimes the failures clean themselves up; sometimes they end up with a VM/disk marked [importing.....<backup name><VM name>] that needs to be manually removed.

Any help hugely appreciated.

olivierlambert

Interesting. It's like the data stream is interrupted somehow for a bit and that's enough to trigger the issue.

EddieCh08666741

I have a similar issue too. I'm using ext4, and it was perfectly fine when I was on 7.6. After moving to ext4 on a fresh 8.2 install, CR doesn't work anymore.

EddieCh08666741

I just tried installing Xen Orchestra from the sources on Debian 11. The same CR works well.

My previous Xen Orchestra from the sources install on Ubuntu 18 was having the VDI error issue. I will do more testing.

olivierlambert

Note: XOA is only the version distributed by Vates. Everything else is "Xen Orchestra from the sources"

EddieCh08666741

@olivierlambert Thanks, corrected. I'm so happy for XCP-ng. I'm one of the early backers on Kickstarter. We hung the shirt up in our office.

rauly94

Good Morning guys,

It looks like I'm having the same issue on a host where I'm doing replications. I'm not getting this error on the normal Delta Backups, but I'm getting it on the scheduled replication for only 1 of the VMs. Here is what shows up in the log for that VM:

"id": "1679486747187",
              "message": "transfer",
              "start": 1679486747187,
              "status": "failure",
              "end": 1679487904889,
              "result": {
                "code": "VDI_IO_ERROR",
                "params": [
                  "Device I/O errors"
                ],
                "url": "https://192.168.2.11/import_raw_vdi/?format=vhd&vdi=OpaqueRef%3Aa5f60c35-64c2-497e-ae15-77aa63d14274&session_id=OpaqueRef%3A1593dbd0-4347-4dee-ad02-58ed84f6dbf6&task_id=OpaqueRef%3Aced050e9-1f49-4bbe-8b74-ef787d536d62",
                "task": {
                  "uuid": "a7e3ba73-9d02-c211-48be-6a246828211c",
                  "name_label": "[XO] Importing content into VDI HEALmycroft 0",
                  "name_description": "",
                  "allowed_operations": [],
                  "current_operations": {},
                  "created": "20230322T12:05:52Z",
                  "finished": "20230322T12:25:03Z",
                  "status": "failure",
                  "resident_on": "OpaqueRef:259273e3-6fe1-4f9a-b688-ab0847cb81f8",
                  "progress": 1,
                  "type": "<none/>",
                  "result": "",
                  "error_info": [
                    "VDI_IO_ERROR",
                    "Device I/O errors"
                  ],
                  "other_config": {},
                  "subtask_of": "OpaqueRef:NULL",
                  "subtasks": [],
                  "backtrace": "(((process xapi)(filename ocaml/xapi/vhd_tool_wrapper.ml)(line 77))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename ocaml/xapi/import_raw_vdi.ml)(line 170)))"
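
For anyone trying to read an entry like this, here is a rough Python sketch that pulls out the error code and how long the transfer ran before it failed. It works on a trimmed, hand-copied subset of the fields above (the pasted log is cut off, so it cannot be parsed as-is). It assumes `start` and `end` are epoch milliseconds, which lines up with the task's `created` time. The `summarize` helper is made up for this example.

```python
from datetime import datetime

# Trimmed copy of the fields posted above; everything else is omitted.
entry = {
    "message": "transfer",
    "start": 1679486747187,
    "end": 1679487904889,
    "status": "failure",
    "result": {
        "code": "VDI_IO_ERROR",
        "task": {
            "created": "20230322T12:05:52Z",
            "finished": "20230322T12:25:03Z",
        },
    },
}


def summarize(entry: dict) -> dict:
    """Return the error code and the durations (in seconds) of one transfer entry."""
    task = entry["result"]["task"]
    fmt = "%Y%m%dT%H:%M:%SZ"  # timestamp layout used in the task record
    created = datetime.strptime(task["created"], fmt)
    finished = datetime.strptime(task["finished"], fmt)
    return {
        "code": entry["result"]["code"],
        # Assumption: start/end are milliseconds since the Unix epoch.
        "transfer_seconds": (entry["end"] - entry["start"]) / 1000,
        "task_seconds": (finished - created).total_seconds(),
    }


if __name__ == "__main__":
    print(summarize(entry))
```

Both durations come out at roughly 19 minutes, so the import appears to have been running for a while before it failed rather than being rejected at the start.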

olivierlambert

Device I/O error isn't a good sign. Check dmesg and the disk health.

rauly94

@olivierlambert Sorry to ask, but where should I run this?

AtaxyaNetwork

@rauly94 Hi!

You can type "dmesg" directly on your host.
You can check your disk's health with the "smartctl" command, also directly on the host.
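
If the dmesg output is long, a small filter can help spot the relevant lines. This is only a sketch: the keyword list is an assumption about what disk-related kernel messages typically contain, so adjust it to whatever your host actually logs, and `find_io_errors` is a made-up helper name.

```python
import re
from typing import List

# Assumed keywords for disk-related kernel messages; extend as needed.
IO_PATTERN = re.compile(
    r"I/O error|medium error|blk_update_request|EXT4-fs error",
    re.IGNORECASE,
)


def find_io_errors(dmesg_text: str) -> List[str]:
    """Return the lines of a dmesg dump that match one of the keywords."""
    return [line for line in dmesg_text.splitlines() if IO_PATTERN.search(line)]


# Usage idea: save the output first with `dmesg > dmesg.txt`, then
#   for line in find_io_errors(open("dmesg.txt").read()):
#       print(line)
```

For the disk side, `smartctl -H /dev/sdX` prints the overall health assessment, where `/dev/sdX` is a placeholder for the actual device.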

rauly94

@olivierlambert said in VDI_IO_ERROR(Device I/O errors) when you run scheduled backup:

dmesg

That's what I got. Am I doing it correctly?

The issue is happening on 1 VM. I tried doing a copy of the same VM and got the same error, just FYI.

rauly94

@rauly94 @AtaxyaNetwork

rauly94

@rauly94 Hello everyone.
Can anyone help me with this issue? It has now started happening on 2 VMs instead of 1. It is happening on the backup replication.

Error: VDI_IO_ERROR(Device I/O errors) This is a XenServer/XCP-ng error
Start: Jul 5, 2023, 09:03:18 AM
End: Jul 5, 2023, 09:44:46 AM
Duration: 41 minutes
Error: VDI_IO_ERROR(Device I/O errors) This is a XenServer/XCP-ng error

Start: Jul 5, 2023, 09:03:09 AM
End: Jul 5, 2023, 09:45:12 AM
Duration: 42 minutes
Error: VDI_IO_ERROR(Device I/O errors) This is a XenServer/XCP-ng error
Type: delta