No reply to this topic, but I've just seen the same thing.
And the same "workaround": disabling vTPM at VM creation from the template to avoid the crash.
I can confirm that XOA wants to add a second vTPM. Any fix?
Ah... so I missed the fix by one commit and two hours...
I can confirm it resolves it.
Thanks!
@bleader Hi,
After a restart of the entire host, port 6640 is now listed when I run ss.
But unfortunately, the tunnels are not working: every VM on this host loses its connection to the others in the same SDN network.
Example with a ping between two hosts:
2025-10-09T12:22:54.781Z|00026|tunnel(handler1)|WARN|receive tunnel port not found (arp,tun_id=0x1f1,tun_src=192.0.0.1,tun_dst=192.0.0.3,tun_ipv6_src=::,tun_ipv6_dst=::,tun_gbp_id=0,tun_gbp_flags=0,tun_tos=0,tun_ttl=64,tun_erspan_ver=0,gtpu_flags=0,gtpu_msgtype=0,tun_flags=key,in_port=33,vlan_tci=0x0000,dl_src=56:30:10:5c:4d:ad,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.10.10,arp_tpa=192.168.10.20,arp_op=1,arp_sha=56:30:10:5c:4d:ad,arp_tha=00:00:00:00:00:00)
2025-10-09T12:22:54.781Z|00027|ofproto_dpif_upcall(handler1)|INFO|Dropped 61 log messages in last 59 seconds (most recently, 1 seconds ago) due to excessive rate
2025-10-09T12:22:54.781Z|00028|ofproto_dpif_upcall(handler1)|INFO|received packet on unassociated datapath port 33
If I migrate the VM from the third host to another one, the network comes back.
This is very strange, because the network I chose for the test is one of the first ones created, not the last one, so it worked before but doesn't now. I don't understand why, or what to do...
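In case it helps anyone debugging the same thing, the OVS tunnel state can be compared between a working host and the broken one with something like this (the bridge and port names are placeholders, not my actual values):

# list the ports on the bridge backing the private network; the tunnel interfaces should appear here
ovs-vsctl list-ports <bridge>
# each tunnel interface should carry an "options" column with remote_ip pointing to the other hosts
ovs-vsctl get Interface <tunnel-port> options
# maps OpenFlow port numbers (like the "port 33" in the log above) to interface names
ovs-ofctl show <bridge>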
Hi,
After deleting the certs in /etc/stunnel/certs on every host, and stopping/starting the sdn-controller plugin on XOA, things came back to normal.
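For anyone landing on this thread with the same issue, this is roughly what the fix above amounts to (a sketch only; the plugin can be stopped and started again from Settings > Plugins in the XO UI):

# on every XCP-ng host of the pool: remove the stale stunnel certificates used by the SDN controller
rm -f /etc/stunnel/certs/*
# then, from XOA: stop and start the sdn-controller plugin
# (the expectation, per the post above, is that this recreates whatever it needs and rebuilds the tunnels)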
Have a good day.
@olivierlambert
Weeeell... I can't really say that...
After the creation finishes, I only see VMs 1, 3 and 4... no VM 2.
In the logs, again: "TOO_MANY_STORAGE_MIGRATES". It seems VM 4 started migrating before VM 2 finished...
So you're right, putting the custom template on the same storage is the best solution.
Right, I now understand all of this!
And since it's explained so well, I can also explain it to users (and especially teachers).
And I imagine that if I don't use "Fast Clone", it's a VM copy (not a migration): it will work but take much more time...
Thank you again.
EDIT: forget it, I've tested a multi-VM deployment of the same template to a local SR without Fast Clone. A VM copy is made, and after that, an Async.VM.migrate_send to the local SR. I can see that only 3 migrations run at the same time; the fourth one keeps "blinking" in the task list until one finishes, and then it starts.
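Side note: the migration queue can also be watched from dom0 with something like this (just a convenience; the XO task list shows the same information):

# refresh the XAPI task list every 5 seconds to see the storage migrations being queued
watch -n 5 "xe task-list params=name-label,progress,status"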
@olivierlambert Hi,
So, if I understand correctly, the VM creation itself is done, but XO has to create a copy of the base disk and move it through a migration to the local SR.
I think I didn't explain myself clearly, because I don't understand the options provided.
Just to be clear, the user journey is:
I can indeed see 2 "base copy" disks of the VMs the users wanted on the local SR, but for the same template (called de13-xfce).
So I imagine that the copy of the base disk is made only once, and the other VMs get their differencing disks from it.
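(For what it's worth, the base copies can be listed directly on the SR with xe; as far as I know, "base copy" is the name-label XAPI gives them, and the SR UUID below is a placeholder:)

# list the VDIs on the local SR; the shared parent disks show up with the name-label "base copy",
# and the per-VM differencing disks reference one of them as their parent
xe vdi-list sr-uuid=<local-SR-uuid> params=uuid,name-label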
What I don't understand is that when they create another VM from a template (called deb13-nograph), there is no error and there is no base disk. However, Fast Clone is used here too...
Sorry for my misunderstanding; I'm pretty new to this kind of use of XCP-ng/XO (educational use) and I'm trying to understand how it works.
Hi,
I have two questions regarding the creation of new VMs from a custom template.
For context:
When users launch several VM creations from one template, some of them fail with an error ("unknown error from the peer").
If I check the logs, the real error is "TOO_MANY_STORAGE_MIGRATES".
And they have to retry many times.
I see in this article that there is a queue, but it seems that in this situation there isn't one.
When I migrate more than 3 VMs from one host to another (and to another SR), I don't get this kind of error, and the queue seems to work: when one of the first three VMs finishes, another one starts.
So, my questions:
@bleader Hi, thank you for your response.
However, here is what I can see:
iptables-save | grep 6640
-A xapi-INPUT -p tcp -m conntrack --ctstate NEW -m tcp --dport 6640 -j ACCEPT
systemctl status openvswitch
● openvswitch.service - Open vSwitch
Loaded: loaded (/usr/lib/systemd/system/openvswitch.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/openvswitch.service.d
└─local.conf, slice.conf
Active: active (running) since mer. 2025-10-01 12:00:35 CEST; 1 day 22h ago
Process: 44006 ExecStop=/usr/share/openvswitch/scripts/ovs-ctl stop (code=exited, status=0/SUCCESS)
Process: 44042 ExecStart=/usr/share/openvswitch/scripts/ovs-start (code=exited, status=0/SUCCESS)
CGroup: /control.slice/openvswitch.service
├─44085 ovsdb-server: monitoring pid 44086 (healthy)
├─44086 ovsdb-server /run/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --ssl-ciphers=AES256-GCM-SHA38:A...
├─44100 ovs-vswitchd: monitoring pid 44101 (healthy)
└─44101 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach --monitor
ps -aux | grep ovsdb-server
root 44085 0.0 0.0 44128 556 ? S<s oct.01 0:00 ovsdb-server: monitoring pid 44086 (healthy)
root 44086 0.2 0.0 52252 12544 ? S< oct.01 6:52 ovsdb-server /run/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --ssl-ciphers=AES256-GCM-SHA38:AES256-SHA256:AES256-SHA:AES128-GCM-SHA256:AES128-SHA256:AES128-SHA --ssl-protocols=TLSv1.2 --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
But port 6640 is not listening:
tcp LISTEN 0 1 127.0.0.1:5900 0.0.0.0:* users:(("vncterm",pid=1322,fd=3))
tcp LISTEN 0 128 0.0.0.0:111 0.0.0.0:* users:(("rpcbind",pid=1265,fd=8))
tcp LISTEN 0 128 0.0.0.0:22 0.0.0.0:* users:(("sshd",pid=1158,fd=3))
tcp LISTEN 0 64 0.0.0.0:36183 0.0.0.0:*
tcp LISTEN 0 5 0.0.0.0:10809 0.0.0.0:* users:(("xapi-nbd",pid=2864,fd=6))
tcp LISTEN 0 1 127.0.0.1:9500 0.0.0.0:* users:(("vncterm",pid=1322,fd=4))
tcp LISTEN 0 128 127.0.0.1:8125 0.0.0.0:* users:(("netdata",pid=2783309,fd=61))
tcp LISTEN 0 128 0.0.0.0:19999 0.0.0.0:* users:(("netdata",pid=2783309,fd=7))
tcp LISTEN 0 128 0.0.0.0:48863 0.0.0.0:* users:(("rpc.statd",pid=5012,fd=9))
tcp LISTEN 0 64 [::]:44169 [::]:*
tcp LISTEN 0 128 [::]:111 [::]:* users:(("rpcbind",pid=1265,fd=11))
tcp LISTEN 0 128 *:80 *:* users:(("xapi",pid=2859,fd=11))
tcp LISTEN 0 128 [::]:22 [::]:* users:(("sshd",pid=1158,fd=4))
tcp LISTEN 0 128 *:443 *:* users:(("stunnel",pid=3140,fd=9))
tcp LISTEN 0 128 [::]:19999 [::]:* users:(("netdata",pid=2783309,fd=8))
tcp LISTEN 0 128 [::]:57023 [::]:* users:(("rpc.statd",pid=5012,fd=11))
And yes, you're right: on the third node (which raises this error), there is no "option" line on the bridges.
I've tried stopping/starting OVS and stopping/starting the sdn-controller. No change.
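In case it helps someone with the same issue, the listener configuration can be compared between hosts with something like this (I'm not certain this is how XCP-ng wires up port 6640, so treat it as a guess to verify against a working host):

# compare what ovsdb-server is actually told to listen on, on a working host vs. the broken one
ovs-appctl -t ovsdb-server ovsdb-server/list-remotes
# and compare the OVSDB manager target, which is often where a 6640 listener comes from
ovs-vsctl get-manager
# if the broken host is missing a remote/manager that the working hosts have, re-adding the exact
# same value (copied from a working host) should make ovsdb-server listen on 6640 again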
Hi,
Using XO from source, last update two days ago.
I have some scripts to provision self-service VMs, which are working well.
But two days ago, I put a node in maintenance mode. After restarting it, maintenance mode was automatically disabled. I enabled it again because I hadn't finished my work.
After that, when I create an SDN network via xo-cli, I get this message:
✖ JsonRpcError: connect ECONNREFUSED X.X.X.X:6640
at Peer._callee$ (/home/uga/.nvm/versions/node/v22.17.0/lib/node_modules/xo-cli/node_modules/json-rpc-peer/dist/index.js:139:44)
at Peer.<anonymous> (/home/uga/.nvm/versions/node/v22.17.0/lib/node_modules/xo-cli/node_modules/@babel/runtime/helpers/regeneratorRuntime.js:52:18)
at Generator.<anonymous> (/home/uga/.nvm/versions/node/v22.17.0/lib/node_modules/xo-cli/node_modules/@babel/runtime/helpers/regenerator.js:52:51)
at Generator.next (/home/uga/.nvm/versions/node/v22.17.0/lib/node_modules/xo-cli/node_modules/@babel/runtime/helpers/regeneratorDefine.js:17:23)
at asyncGeneratorStep (/home/uga/.nvm/versions/node/v22.17.0/lib/node_modules/xo-cli/node_modules/@babel/runtime/helpers/asyncToGenerator.js:3:17)
at _next (/home/uga/.nvm/versions/node/v22.17.0/lib/node_modules/xo-cli/node_modules/@babel/runtime/helpers/asyncToGenerator.js:17:9)
at /home/uga/.nvm/versions/node/v22.17.0/lib/node_modules/xo-cli/node_modules/@babel/runtime/helpers/asyncToGenerator.js:22:7
at new Promise (<anonymous>)
at Peer.<anonymous> (/home/uga/.nvm/versions/node/v22.17.0/lib/node_modules/xo-cli/node_modules/@babel/runtime/helpers/asyncToGenerator.js:14:12)
at Peer.exec (/home/uga/.nvm/versions/node/v22.17.0/lib/node_modules/xo-cli/node_modules/json-rpc-peer/dist/index.js:182:20) {
code: -32000,
data: {
address: 'X.X.X.X',
code: 'ECONNREFUSED',
errno: -111,
message: 'connect ECONNREFUSED X.X.X.X:6640',
name: 'Error',
port: 6640,
stack: 'Error: connect ECONNREFUSED X.X.X.X:6640\n' +
' at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1637:16)\n' +
' at TCPConnectWrap.callbackTrampoline (node:internal/async_hooks:130:17)',
syscall: 'connect'
}
}
I have checked all three nodes; on the first and second, port 6640 is listening:
tcp LISTEN 0 10 0.0.0.0:6640 0.0.0.0:* users:(("ovsdb-server",pid=1559,fd=20))
But NOT on the third node. Even so, on the third node, the openvswitch service is up, and ovsdb-server is running:
openvswitch.service - Open vSwitch
Loaded: loaded (/usr/lib/systemd/system/openvswitch.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/openvswitch.service.d
└─local.conf, slice.conf
Active: active (running) since mer. 2025-10-01 12:00:35 CEST; 22h ago
Process: 44006 ExecStop=/usr/share/openvswitch/scripts/ovs-ctl stop (code=exited, status=0/SUCCESS)
Process: 44042 ExecStart=/usr/share/openvswitch/scripts/ovs-start (code=exited, status=0/SUCCESS)
CGroup: /control.slice/openvswitch.service
├─44085 ovsdb-server: monitoring pid 44086 (healthy)
├─44086 ovsdb-server /run/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --ssl-ciphers=AES256-GCM-SHA38:A...
├─44100 ovs-vswitchd: monitoring pid 44101 (healthy)
└─44101 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach --monitor
And if I check in XO, the networks are created on every node:
xe pif-list uuid=bb594bff-0d27-5477-8e13-94691411fb18
uuid ( RO) : bb594bff-0d27-5477-8e13-94691411fb18
device ( RO): tunnel1234
MAC ( RO): 3a:ac:6d:14:86:7a
currently-attached ( RO): true
VLAN ( RO): -1
network-uuid ( RO): 4ee84cf6-8c8a-9b5f-5038-1fd980b22d1e
host-uuid ( RO): baeeaf84-c362-4057-999c-1c8fc57f3f33
So if the networks are created on the host, what is the impact of the connection being refused?
Hi,
We are facing an issue: when selecting a template (bundled or custom), the default resources are not shown.
But, as you can see, Interface and Disk are "OK" (in green). If I add an interface and no disk, the VM is created as expected with the template disk. If I add a disk, it's added as a second disk.
The problem is that the disk is created on the default SR, even though the user doesn't have rights on it!
On my pool, only the admin has a proper interface:
I've created another local admin, but I get the same issue.
Another fact: I have a second pool, and on it, even the admin has this issue...
I've updated XO from source just now, before posting, but it's the same thing.