Device stuck in booting state

Problem

Your device may become stuck with its status shown as booting. This can happen during an upgrade on some versions of EVE-OS.
 
Note: Although your device may appear to be inoperative, its applications are still running.
 

Affected versions

EVE-OS versions 8.5.x and 9.3.0, and all versions prior to 8.12.0

Error messages

The device takes a long time to fetch the controller certificate:

Aug 30, 2023 @ 12:23:12.662782081 error getCertsFromController failed: All attempts to connect to zedcloud.pmwtus.zededa.net/api/v2/edgedevice/certs failed: [send via eth0: interface eth0: no suitable IP address available send via eth1: interface eth1: no suitable IP address available] - 
Aug 30, 2023 @ 14:32:00.098321055 error getCertsFromController failed: All attempts to connect to zedcloud.pmwtus.zededa.net/api/v2/edgedevice/certs failed: [send via eth0: interface eth0: no suitable IP address available send via eth1: interface eth1: no suitable IP address available] -
Aug 31, 2023 @ 11:55:53.671565858 error getCertsFromController failed: All attempts to connect to zedcloud.pmwtus.zededa.net/api/v2/edgedevice/certs failed: [send via eth0 with src IP 192.168.112.66: interface eth0: no DNS server available send via eth1: interface eth1: no suitable IP address available] -
Aug 31, 2023 @ 11:55:53.671652620 info Initial getCertsFromController failed; switching to short timer -
Aug 31, 2023 @ 11:55:53.672296643 error getCertsFromController failed: All attempts to connect to zedcloud.pmwtus.zededa.net/api/v2/edgedevice/certs failed: [send via eth0 with src IP 192.168.112.66: interface eth0: no DNS server available send via eth1: interface eth1: no suitable IP address available] -
Aug 31, 2023 @ 11:56:18.672213189 info getCertsFromController succeeded; switching to long timer 86400 seconds
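
To see how long the certificate fetch was stalled, you can compare the timestamp of the first failed attempt with the timestamp of the first success. The following is a minimal sketch, assuming log lines in the same format as the excerpt above. The function names parse_time and fetch_delay are illustrative and are not part of EVE-OS.

```python
import re
from datetime import datetime

# Matches the timestamp prefix used in the log excerpt, for example
# "Aug 30, 2023 @ 12:23:12.662782081". Fractional seconds are ignored.
TIMESTAMP = re.compile(r"^(\w{3} \d{1,2}, \d{4}) @ (\d{2}:\d{2}:\d{2})")


def parse_time(line):
    """Return the line's timestamp as a datetime, or None if it has none."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    # %b expects English month abbreviations (Python's default C locale).
    return datetime.strptime(" ".join(match.groups()), "%b %d, %Y %H:%M:%S")


def fetch_delay(lines):
    """Time from the first failed certificate fetch to the first success.

    Returns None if either event is missing from the lines.
    """
    first_failure = None
    for line in lines:
        when = parse_time(line)
        if when is None:
            continue
        if " error getCertsFromController failed" in line:
            if first_failure is None:
                first_failure = when
        elif "getCertsFromController succeeded" in line and first_failure is not None:
            return when - first_failure
    return None


if __name__ == "__main__":
    # Two lines from the excerpt above; the first message is truncated for brevity.
    sample = [
        "Aug 30, 2023 @ 12:23:12.662782081 error getCertsFromController failed: All attempts to connect",
        "Aug 31, 2023 @ 11:56:18.672213189 info getCertsFromController succeeded; switching to long timer 86400 seconds",
    ]
    print(fetch_delay(sample))  # prints 23:33:06
```

For the excerpt above, the gap between the first failure and the first success is about 23.5 hours.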

The device is unable to fetch its configuration because the controller certificate was updated:

Aug 30, 2023 @ 12:23:34.194593713 error removeAndVerifyAuthContainer: local server cert hash(32) does not match in authen (32) [222 82 5 20 17 105 67 225 86 197 57 238 207 12 236 224 17 93 68 0 21 112 211 248 114 190 66 141 159 49 166 135], [217 38 129 189 139 107 238 195 202 170 10 57 252 69 108 48 213 102 148 40 5 39 249 62 125 183 140 101 210 255 117 42]
Aug 30, 2023 @ 12:23:34.194743385 error RemoveAndVerifyAuthContainer verify auth error removeAndVerifyAuthContainer: local server cert hash 32bytes does not match in authen, V2 server true, content len 0, url https://zedcloud.pmwtus.zededa.net/api/v2/edgedevice/id/173dh503-8436-4c0c-9e11-b01bfdde52a1/config, senderStatus 4
Aug 30, 2023 @ 12:23:34.194826880 error RemoveAndVerifyAuthContainer failed: removeAndVerifyAuthContainer: local server cert hash 32bytes does not match in authen

The issue might resolve itself once the new controller certificate is fetched:

Aug 31, 2023 @ 11:52:05.625398542 info Controller cert delete
Aug 31, 2023 @ 11:52:05.643679567 info Controller cert create

The upgrade proceeds successfully once the certificate is fetched:

Aug 31, 2023 @ 11:52:55.226886575 info BaseOs status create 9.4.4-lts-kvm-amd64
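
If you have exported the device logs, you can check which of the stages shown above your device has reached. The following is a minimal sketch that matches substrings copied from the log excerpts above. The stage labels and the function names classify and stages_seen are this sketch's own naming and are not EVE-OS terms.

```python
# Substrings taken from the log excerpts above, paired with illustrative labels.
# Note that the info line "Initial getCertsFromController failed; ..." also
# matches the first signature.
SIGNATURES = [
    ("cert_fetch_failed", "getCertsFromController failed"),
    ("cert_fetch_succeeded", "getCertsFromController succeeded"),
    ("cert_hash_mismatch", "local server cert hash"),
    ("cert_refreshed", "Controller cert create"),
    ("upgrade_proceeding", "BaseOs status create"),
]


def classify(line):
    """Return the label of the first signature found in the line, or None."""
    for label, needle in SIGNATURES:
        if needle in line:
            return label
    return None


def stages_seen(lines):
    """Return the distinct stages found in the lines, in first-seen order."""
    seen = []
    for line in lines:
        label = classify(line)
        if label is not None and label not in seen:
            seen.append(label)
    return seen


if __name__ == "__main__":
    # Lines from the excerpts above; the first message is truncated for brevity.
    sample = [
        "Aug 30, 2023 @ 12:23:34.194826880 error RemoveAndVerifyAuthContainer failed: removeAndVerifyAuthContainer: local server cert hash 32bytes does not match in authen",
        "Aug 31, 2023 @ 11:52:05.643679567 info Controller cert create",
        "Aug 31, 2023 @ 11:52:55.226886575 info BaseOs status create 9.4.4-lts-kvm-amd64",
    ]
    print(stages_seen(sample))
    # prints ['cert_hash_mismatch', 'cert_refreshed', 'upgrade_proceeding']
```

If the output stops at cert_fetch_failed or cert_hash_mismatch, the device has not yet received the new controller certificate.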

Cause

This can happen when your edge node cannot fetch the controller certificate because the management interface has no internet connection. As a result, the edge node may skip fetching the certificate during the boot process.
 
Note that fetching the certificate may take longer on older EVE-OS versions. The time also depends on the quality of the internet connection.

Workaround

You can force the device to fetch the certificate by power cycling it. Turn the device off and on until it successfully receives the certificate. This might take a few power cycles.

Solution

Upgrade to an LTS version.