VM backup retry - status failed even though the backup succeeded on the second attempt

icompit

Hi,

I see that the changes to the backup engine are causing a lot of new errors.

1. There are many "Error: EEXIST: file already exists" errors, which never happened in the past. Restarting the same backup usually just works.
2. Because of 1, I've added the "retry" option to each backup. Now, even if an error occurs, the second attempt is successful, but the overall status of the backup task is still set to failed.

This is what it looks like.

This backup job is an old one that has been running in my environment for months, if not years.
Backups are stored on NAS via NFS.

The other VMs in this job are processed successfully.

Backup job logs attached.
2025-07-01T01_00_00.011Z - backup NG.json.txt

olivierlambert

Another one @lsouai-vates

lsouai-vates

@icompit Hello, thanks for the report.

I am asking XO Team... @florent FYI

lsouai-vates

@icompit hello again and sorry for the late answer...

The current backup logging system is being reworked: instead of using a custom task system and compiling logs on the fly, it will soon rely on the generic task system and store precompiled logs.

As part of this change, the logic that currently causes a backup job to appear as "failed" — even when a retry succeeded — will be deprecated. So while fixing this specific issue now would be technically easy, it might introduce regressions and won’t be relevant once the new system is in place.

In short: this will be naturally fixed by the upcoming logging overhaul. Thanks for your patience!
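
To make the expected behaviour concrete, here is a minimal sketch of a status rule where a VM counts as successful as soon as its last attempt succeeds. This is not the actual XO code: the `Attempt` structure and the `vm_status` / `job_status` helpers are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Attempt:
    """One try at backing up a single VM (hypothetical structure)."""
    number: int
    succeeded: bool
    error: str = ""


def vm_status(attempts: List[Attempt]) -> str:
    """A VM is successful if its last attempt succeeded, whatever happened before."""
    if not attempts:
        return "skipped"
    return "success" if attempts[-1].succeeded else "failure"


def job_status(vms: Dict[str, List[Attempt]]) -> str:
    """The job fails only if at least one VM ends in failure after all retries."""
    statuses = [vm_status(attempts) for attempts in vms.values()]
    return "failure" if "failure" in statuses else "success"


# Example matching the report: one VM hits EEXIST, then succeeds on retry.
job = {
    "vm-a": [Attempt(1, True)],
    "vm-b": [
        Attempt(1, False, "EEXIST: file already exists"),
        Attempt(2, True),
    ],
}
print(job_status(job))
```

With this rule, the failed first attempt stays visible in the per-VM history, but it no longer drives the overall job status.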

icompit

@lsouai-vates
Sure, I understand the issues might be related to the changes in backup processing under the hood.
I hope my report helps with identifying bugs.
Is the EEXIST error also related to this?

icompit

From this morning...

Yesterday everything was OK.

lsouai-vates

@icompit @Bastien-Nollet can you help to answer?

Bastien Nollet

@icompit The EEXIST error appeared with @florent's recent work on backups. I think he's investigating this problem, but he is on vacation now and will be back in two weeks.

If this causes too much trouble for you, I would recommend going back to a previous XO version for the moment.

I'm not familiar with the new "Unknown system error" you got. Could you give us the log of that backup job execution?

icompit

@Bastien-Nollet here is the log:
2025-07-09T01_00_00.011Z - backup NG.json.txt

Bastien Nollet

@icompit Thank you.

It seems that the backup process fails to get the file lock on the backup directory. Have you modified anything on your remote recently?

Also, could you tell me if this is happening on all of this job's runs, all of your backup runs, or if it only happened once on this specific backup job execution?
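
For readers unfamiliar with directory locks, here is a generic sketch of a lock based on exclusive file creation. It is only an illustration of the general technique, not how XO implements its lock, and the `try_lock` / `unlock` helpers and the `.lock` file name are assumptions made for the example. It shows why a leftover or concurrently held lock makes the next acquisition fail.

```python
import os
import tempfile


def try_lock(directory: str, name: str = ".lock") -> bool:
    """Try to take the lock by creating the lock file exclusively.

    O_CREAT | O_EXCL makes the creation fail with EEXIST
    (FileExistsError in Python) if the file is already there.
    """
    path = os.path.join(directory, name)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # someone else holds the lock, or a stale file was left behind
    os.close(fd)
    return True


def unlock(directory: str, name: str = ".lock") -> None:
    """Release the lock by removing the lock file."""
    os.remove(os.path.join(directory, name))


with tempfile.TemporaryDirectory() as backup_dir:
    first = try_lock(backup_dir)   # lock is free, so this succeeds
    second = try_lock(backup_dir)  # lock already held, so this fails
    unlock(backup_dir)
    third = try_lock(backup_dir)   # free again after the release
    unlock(backup_dir)
```

If a process dies without releasing such a lock, the file stays on the remote and later runs keep failing until it is cleaned up, which is one reason to check whether anything changed on the remote.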