I didn't set up SSH on the work XOA, I just set the password, but I need to reboot it for that to take effect. The tunnel is still open if you don't mind doing it; otherwise I will need to reboot the XOA to get in myself.
I can do it on Monday.
@acebmxer
If you are OK with it, you can do a heap memory export (note that you will probably have to restart the XOA afterwards to free the memory).
If you can do it, we will collect the memory file on Monday (or when you are ready) and see if this is a new cause or the return of an old one.
@poddingue there is a cache system that keeps the disk open for a few minutes. We will rework it in the near future; it should improve the situation with these errors (at the cost of being a little slower).
ping @julienxovates I think we can change this behavior (accepting an xo-server start even if a zombie process is still running)
florent said:
@McHenry yes
your replicated VM may be a little slower, and removing these snapshots can lead to a big and slow coalesce, but this tradeoff can be worth it for replication
we will try to clarify it for the 6.5 version (end of May)
ping @julienxovates
@ducatijosh my bet is that xo-server was still not completely stopped, and was thus still serving the old code. As strange as it seems, if you start a second xo-server, it won't fail, but will only show a warning in the logs.
a full reboot really restarts xo-server
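If it helps to confirm, here is a hedged shell sketch for spotting a lingering process on the XOA (it assumes xo-server is started from dist/cli.mjs, as mentioned further down in this thread):

```
# list xo-server processes with their start time; a process whose start
# time predates the update is likely the zombie still serving the old code
ps -eo pid,lstart,args | grep 'dist/cli.mjs' | grep -v grep
```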
@McHenry yes
your replicated VM may be a little slower, and removing these snapshots can lead to a big and slow coalesce, but this tradeoff can be worth it for replication
we will try to clarify it for the 6.5 version (end of May)
Too many snapshots
the warning limit is 3
this makes sense for a VM used for daily production, but maybe we shouldn't apply the same limit to replicated VMs
@acebmxer the connectNbdClientIfPossible error won't be in the task log, but in the system log
qcow2 backups need NBD
connectNbdClientIfPossible
yes, connectNbdClientIfPossible is the error message nearest to the real cause
Now, why does it fail?
NBD should be enabled on the network used for backups. Do you have a default backup network defined?
XO must have at least one VIF on this network
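For reference, NBD can be enabled on a network from the pool view in XO or, on the pool master, with something like the following xe sketch (hedged; `<network-uuid>` is a placeholder for your backup network's UUID):

```
# mark the backup network as allowed for NBD connections
xe network-param-add param-name=purpose param-key=nbd uuid=<network-uuid>

# verify that the purpose field now contains "nbd"
xe network-param-get param-name=purpose uuid=<network-uuid>
```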
@acebmxer do you have something in the XO logs (journalctl, probably)?
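For example, assuming xo-server runs as the usual systemd service on an XOA (adjust the unit name if yours differs), something along these lines should surface the relevant entries:

```
# show the last xo-server log entries and filter for NBD-related errors
journalctl -u xo-server -n 500 --no-pager | grep -i nbd
```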
@florent Thanks!
I did not have time (yet) to look into the heap export due to the weekend being in between.
I will update to current master and provide you with the heap export in the following days.
Best regards
No problem. The sooner we have these exports, the sooner we will be able to identify what is leaking in your usage, since we have already plugged the leaks we reproduced in our labs.
We merged the code into master (including the code that allows a heap memory export).
@MajorP93 I am not sure which branch.
can you switch to https://github.com/vatesfr/xen-orchestra/pull/9725
then
when running this branch (a shell sketch of these steps follows below):
wait for the memory to fill
find the process running XO (dist/cli.mjs)
send it the signal: kill -SIGUSR2
a file is created in `/tmp/xo-server-${process.pid}-${Date.now()}.heapsnapshot`
Note that these exports can contain sensitive data. This will freeze your XO for anywhere from a few seconds to 2 minutes, and memory consumption will be higher afterwards, but it will be very useful to identify exactly what is leaking.
The file will be a few hundred MB.
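Putting the steps above together, a minimal shell sketch (assuming xo-server was started from dist/cli.mjs and that pgrep is available on the machine):

```
# find the PID of the xo-server process; if several PIDs come back,
# pick the main xo-server one
pid=$(pgrep -f 'dist/cli.mjs')

# ask it to write a heap snapshot (XO may freeze for up to ~2 minutes)
kill -SIGUSR2 "$pid"

# the snapshot appears under /tmp, named after the PID and a timestamp
ls -lh /tmp/xo-server-*.heapsnapshot
```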
@acebmxer did you enable NBD on the network (in the pool view)? Is it a network accessible by XO?
@acebmxer you need to enable NBD on the backup to ensure it works with qcow2
we'll check the progress bar
Maybe some good news?

It has been completely stable since yesterday's patch.
@MajorP93 I think there are multiple memory issues. Pillow and acebmxer use a proxy, so the backup memory is handled differently.
Are you using XO from source or an XOA? Would you be OK with exporting a memory heap?
That's better, but it still grew during the night

(and the process ran out of memory when I asked for a memory dump)
I reduced the memory allocated to the REST API and will continue monitoring.
@ph7

Let's call it a day.
@acebmxer after 3 hours the consumption was slightly rising, so we also deployed the latest REST API patch, which is already on master: https://github.com/vatesfr/xen-orchestra/commit/0fef765ce6b4bc96b28e6af9be3d3dba3fa7dc1e