
WAPT server crashes constantly

Published: October 15, 2019 - 3:19 PM
by renaud.counhaye
Good morning,

My server: Linux wapt 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3.1 (2019-02-19) x86_64

WAPT Server version: 1.7.4
WAPT Agent version: 1.7.4.6143
WAPT Setup version: 1.7.4.6143

It tends to crash sporadically and randomly, to the point of blocking access to the console (Error 504), web access (Error 502), and client updates with a timeout.
On the server, I find the wapttasks service in a failed state.

● wapttasks.service loaded failed failed WAPT Tasks startup script

root@wapt # systemctl status wapttasks
● wapttasks.service - WAPT Tasks startup script
Loaded: loaded (/usr/lib/systemd/system/wapttasks.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2019-10-15 15:00:05 CEST; 18s ago
Process: 611 ExecStart=/opt/wapt/bin/python /opt/wapt/waptserver/wapthuey.py waptenterprise.waptserver.wsus_tasks.huey
Main PID: 611 (code=exited, status=1/FAILURE)

Oct 15 15:00:04 wapt systemd[1]: wapttasks.service: Unit entered failed state.
Oct 15 15:00:04 wapt systemd[1]: wapttasks.service: Failed with result 'exit-code'.
Oct 15 15:00:05 wapt systemd[1]: wapttasks.service: Service hold-off time over, scheduling restart.
Oct 15 15:00:05 wapt systemd[1]: Stopped WAPT Tasks startup script.
Oct 15 15:00:05 wapt systemd[1]: wapttasks.service: Start request repeated too quickly.
Oct 15 15:00:05 wapt systemd[1]: Failed to start WAPT Tasks startup script.
Oct 15 15:00:05 wapt systemd[1]: wapttasks.service: Unit entered failed state.
Oct 15 15:00:05 wapt systemd[1]: wapttasks.service: Failed with result 'exit-code'.

The waptserver service is running, but its status output tells a different story:

[ ~ ] root@wapt # systemctl status waptserver
● waptserver.service - WAPT Server startup script
Loaded: loaded (/usr/lib/systemd/system/waptserver.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2019-09-20 14:59:28 CEST; 3 weeks 4 days ago
Main PID: 918 (python)
Tasks: 1 (limit: 4915)
CGroup: /system.slice/waptserver.service
└─918 /opt/wapt/bin/python /opt/wapt/waptserver/server.py

Oct 14 21:00:05 wapt python[918]: 2019-10-14 21:00:05,820 WARNING Invalid session None
Oct 14 23:57:04 wapt python[918]: 2019-10-14 23:57:04,954 WARNING Invalid session None
Oct 15 08:51:02 wapt python[918]: 2019-10-15 08:51:02,071 WARNING Invalid session None
Oct 15 12:57:09 wapt python[918]: peewee 2019-10-15 12:57:09,728 WARNING SocketIO connection refused for uuid , sid 07fe
Oct 15 12:57:09 wapt python[918]: 2019-10-15 12:57:09,728 WARNING SocketIO connection refused for uuid , sid 07fe1be33bf
Oct 15 12:57:09 wapt python[918]: 2019-10-15 12:57:09,731 WARNING Application rejected connection
Oct 15 13:47:01 wapt python[918]: peewee 2019-10-15 13:47:01,889 CRITICAL Get_websocket_auth_token failed EWaptAuthentic
Oct 15 13:47:01 wapt python[918]: 2019-10-15 13:47:01,889 CRITICAL Get_websocket_auth_token failed EWaptAuthenticationFa
Oct 15 14:48:02 wapt python[918]: peewee 2019-10-15 14:48:02,176 CRITICAL Get_websocket_auth_token failed EWaptAuthentic
Oct 15 14:48:02 wapt python[918]: 2019-10-15 14:48:02,176 CRITICAL Get_websocket_auth_token failed EWaptAuthenticationFa

Rebooting the machine or stopping and then restarting Nginx, Waptserver, and wapttasks resolves the problem, but it's not ideal...
In fact, even after this procedure, wapttasks returns to its failed state.

Thank you for reading, and for any help you can offer. Have a good day.

Sep 20 14:59:19 wapt nginx[753]: nginx: [warn] "ssl_stapling" ignored, issuer certificate not found
Sep 20 14:59:20 wapt nginx[836]: nginx: [warn] "ssl_stapling" ignored, issuer certificate not found
Sep 20 14:59:20 wapt systemd[1]: nginx.service: Failed to read PID from file /run/nginx.pid: Invalid argument

Sep 20 14:59:23 wapt cron[850]: (CRON) INFO (Running @reboot jobs)
Sep 20 14:59:24 wapt python[697]: #033[91mError importing waptenterprise.waptserver.wsus_tasks.huey#033[0m
Sep 20 14:59:24 wapt python[697]: Traceback (most recent call last):
Sep 20 14:59:24 wapt python[697]: File "/opt/wapt/waptserver/wapthuey.py", line 37, in <module>
Sep 20 14:59:24 wapt python[697]: huey_consumer.consumer_main()
Sep 20 14:59:24 wapt python[697]: File "/opt/wapt/lib/python2.7/site-packages/huey/bin/huey_consumer.py", line 43, in consumer_main
Sep 20 14:59:24 wapt python[697]: huey_instance = load_huey(args[0])
Sep 20 14:59:24 wapt python[697]: File "/opt/wapt/lib/python2.7/site-packages/huey/bin/huey_consumer.py", line 18, in load_huey
Sep 20 14:59:24 wapt python[697]: return load_class(path)
Sep 20 14:59:24 wapt python[697]: File "/opt/wapt/lib/python2.7/site-packages/huey/utils.py", line 46, in load_class
Sep 20 14:59:24 wapt python[697]: __import__(path)
Sep 20 14:59:24 wapt python[697]: ImportError: No module named waptenterprise.waptserver.wsus_tasks
Sep 20 14:59:24 wapt systemd[1]: wapttasks.service: Main process exited, code=exited, status=1/FAILURE
Sep 20 14:59:24 wapt systemd[1]: wapttasks.service: Unit entered failed state.
Sep 20 14:59:24 wapt systemd[1]: wapttasks.service: Failed with result 'exit-code'.
Sep 20 14:59:24 wapt systemd[1]: wapttasks.service: Service hold-off time over, scheduling restart.
Sep 20 14:59:24 wapt systemd[1]: Stopped WAPT Tasks startup script.
Sep 20 14:59:24 wapt systemd[1]: Started WAPT Tasks startup script.
Sep 20 14:59:25 wapt python[856]: #033[91mError importing waptenterprise.waptserver.wsus_tasks.huey#033[0m
Sep 20 14:59:25 wapt python[856]: Traceback (most recent call last):
Sep 20 14:59:25 wapt python[856]: File "/opt/wapt/waptserver/wapthuey.py", line 37, in <module>
Sep 20 14:59:25 wapt python[856]: huey_consumer.consumer_main()
Sep 20 14:59:25 wapt python[856]: File "/opt/wapt/lib/python2.7/site-packages/huey/bin/huey_consumer.py", line 43, in consumer_main
Sep 20 14:59:25 wapt python[856]: huey_instance = load_huey(args[0])
Sep 20 14:59:25 wapt python[856]: File "/opt/wapt/lib/python2.7/site-packages/huey/bin/huey_consumer.py", line 18, in load_huey
Sep 20 14:59:25 wapt python[856]: return load_class(path)
Sep 20 14:59:25 wapt python[856]: File "/opt/wapt/lib/python2.7/site-packages/huey/utils.py", line 46, in load_class
Sep 20 14:59:25 wapt python[856]: __import__(path)
Sep 20 14:59:25 wapt python[856]: ImportError: No module named waptenterprise.waptserver.wsus_tasks
Sep 20 14:59:25 wapt systemd[1]: wapttasks.service: Main process exited, code=exited, status=1/FAILURE
Sep 20 14:59:25 wapt systemd[1]: wapttasks.service: Unit entered failed state.
Sep 20 14:59:25 wapt systemd[1]: wapttasks.service: Failed with result 'exit-code'.
Sep 20 14:59:26 wapt systemd[1]: wapttasks.service: Service hold-off time over, scheduling restart.
Sep 20 14:59:26 wapt systemd[1]: Stopped WAPT Tasks startup script.
Sep 20 14:59:26 wapt systemd[1]: Started WAPT Tasks startup script.
Sep 20 14:59:26 wapt systemd[1]: Started Daily apt download activities.

Sep 20 14:59:26 wapt systemd[1]: apt-daily.timer: Adding 1h 59min 42.052407s random time.
Sep 20 14:59:26 wapt systemd[1]: apt-daily.timer: Adding 1h 23min 1.241385s random time.
Sep 20 14:59:27 wapt python[874]: #033[91mError importing waptenterprise.waptserver.wsus_tasks.huey#033[0m
Sep 20 14:59:27 wapt python[874]: Traceback (most recent call last):
Sep 20 14:59:27 wapt python[874]: File "/opt/wapt/waptserver/wapthuey.py", line 37, in <module>
Sep 20 14:59:27 wapt python[874]: huey_consumer.consumer_main()
Sep 20 14:59:27 wapt python[874]: File "/opt/wapt/lib/python2.7/site-packages/huey/bin/huey_consumer.py", line 43, in consumer_main
Sep 20 14:59:27 wapt python[874]: huey_instance = load_huey(args[0])
Sep 20 14:59:27 wapt python[874]: File "/opt/wapt/lib/python2.7/site-packages/huey/bin/huey_consumer.py", line 18, in load_huey
Sep 20 14:59:27 wapt python[874]: return load_class(path)
Sep 20 14:59:27 wapt python[874]: File "/opt/wapt/lib/python2.7/site-packages/huey/utils.py", line 46, in load_class
Sep 20 14:59:27 wapt python[874]: __import__(path)
Sep 20 14:59:27 wapt python[874]: ImportError: No module named waptenterprise.waptserver.wsus_tasks
Sep 20 14:59:27 wapt systemd[1]: wapttasks.service: Main process exited, code=exited, status=1/FAILURE
Sep 20 14:59:27 wapt systemd[1]: wapttasks.service: Unit entered failed state.
Sep 20 14:59:27 wapt systemd[1]: wapttasks.service: Failed with result 'exit-code'.
Sep 20 14:59:27 wapt systemd[1]: wapttasks.service: Service hold-off time over, scheduling restart.
Sep 20 14:59:27 wapt systemd[1]: Stopped WAPT Tasks startup script.
Sep 20 14:59:27 wapt systemd[1]: Started WAPT Tasks startup script.
Sep 20 14:59:27 wapt python[903]: #033[91mError importing waptenterprise.waptserver.wsus_tasks.huey#033[0m
Sep 20 14:59:27 wapt python[903]: Traceback (most recent call last):

Sep 20 15:00:10 wapt peewee 2019-09-20 15:00:10,626 CRITICAL Get_websocket_auth_token failed EWaptAuthenticationFailure(u"Request signature verification failed: SSL signature verification failed for certificate {'commonName': u'1591875c-95e0-433c-b448-38427397885d', 'organizationName': u'Microsoft'} issued by 1591875c-95e0-433c-b448-38427397885d",)
Sep 20 15:00:10 wapt python[918]: 2019-09-20 15:00:10,626 CRITICAL Get_websocket_auth_token failed EWaptAuthenticationFailure(u"Request signature verification failed: SSL signature verification failed for certificate {'commonName': u'1591875c-95e0-433c-b448-38427397885d', 'organizationName': u'Microsoft'} issued by 1591875c-95e0-433c-b448-38427397885d",)
Sep 20 15:00:17 wapt peewee 2019-09-20 15:00:17,724 CRITICAL Get_websocket_auth_token failed EWaptAuthenticationFailure('No uuid supplied',)
Sep 20 15:00:17 wapt python[918]: 2019-09-20 15:00:17,724 CRITICAL Get_websocket_auth_token failed EWaptAuthenticationFailure('No uuid supplied',)
Sep 20 15:01:10 wapt peewee 2019-09-20 15:01:10,684 CRITICAL Get_websocket_auth_token failed EWaptAuthenticationFailure(u"Request signature verification failed: SSL signature verification failed for certificate {'commonName': u'1591875c-95e0-433c-b448-38427397885d', 'organizationName': u'Microsoft'} issued by 1591875c-95e0-433c-b448-38427397885d",)
Sep 20 15:01:10 wapt python[918]: 2019-09-20 15:01:10,684 CRITICAL Get_websocket_auth_token failed EWaptAuthenticationFailure(u"Request signature verification failed: SSL signature verification failed for certificate {'commonName': u'1591875c-95e0-433c-b448-38427397885d', 'organizationName': u'Microsoft'} issued by 1591875c-95e0-433c-b448-38427397885d",)
Sep 20 15:01:17 wapt peewee 2019-09-20 15:01:17,871 CRITICAL Get_websocket_auth_token failed EWaptAuthenticationFailure('No uuid supplied',)
Sep 20 15:01:17 wapt python[918]: 2019-09-20 15:01:17,871 CRITICAL Get_websocket_auth_token failed EWaptAuthenticationFailure('No uuid supplied',)

Re: WAPT server constantly crashing

Published: October 16, 2019 - 11:56 AM
by htouvet
Good morning,
There are actually two separate problems in your question.

The wapttasks service currently only handles downloading Windows updates and may not be relevant to you. If it fails to start, that is most likely because the Python modules of the Enterprise version are not installed. However, this shouldn't affect the other functions of WAPT.
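
One way to confirm this is to attempt the same import the service performs, using WAPT's bundled interpreter. A minimal sketch, assuming the interpreter path and module name shown in the traceback of the first post; the helper name has_module is made up for illustration, and disabling the unit is only a suggestion for installs that do not need Windows-update downloads:

```shell
#!/bin/sh
# has_module PYTHON MODULE: succeed if PYTHON can import MODULE.
# (Hypothetical helper, not part of WAPT.)
has_module() {
    "$1" -c "import $2" >/dev/null 2>&1
}

# Interpreter path and module name are taken from the traceback in the first post.
WAPT_PYTHON="${WAPT_PYTHON:-/opt/wapt/bin/python}"
WAPT_MODULE="${WAPT_MODULE:-waptenterprise.waptserver.wsus_tasks}"

if has_module "$WAPT_PYTHON" "$WAPT_MODULE"; then
    echo "module present: wapttasks should be able to start"
else
    echo "module missing: wapttasks cannot start on this install"
    # Assumption: if Windows-update downloads are not needed, the unit could
    # be disabled so that systemd stops retrying it:
    #   systemctl disable --now wapttasks
fi
```

If the module is reported missing, the "Start request repeated too quickly" message is simply systemd giving up after several identical import failures.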

The waptserver service is the culprit. If you get a 504 error (gateway timeout) in the console, but the waptserver service appears to be running, it's an internal problem within the waptserver process, such as a deadlock.
The waptserver process is single-threaded and uses cooperative multitasking ("greenlets") to keep a large number of TCP (web)sockets (the 'connected' machines) active. A multithreaded waptserver would spend more time switching contexts while waiting for data on the sockets than doing actual work. The cooperative design, however, may make the process vulnerable to deadlocks.
We have three other clients (with a significant number of connected workstations) who experience this blocking issue intermittently, and the analysis has not yet been conclusive. The most likely cause is a deadlock on database transactions in this cooperative multitasking context.
Restarting the waptserver service is enough to resolve the situation, so as a temporary workaround we have installed a watchdog on these clients, run from cron, that pings the waptserver service (https://waptserver/ping) and restarts it if there's an error. Something like this:
wget -q -O - http://127.0.0.1:8080/ping --no-check-certificate | grep "WAPT Server running" || (echo Restart; systemctl restart waptserver)
Not ideal, but helpful while awaiting further investigation.
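
The same watchdog idea can be written as a small script, which makes it easier to add a timeout and a log entry. This is only a sketch: the URL and the expected text come from the one-liner above, while the 10-second timeout, the syslog tag, and the function names are assumptions.

```shell
#!/bin/sh
# Watchdog sketch for cron; restarts waptserver when /ping stops answering.
PING_URL="${PING_URL:-http://127.0.0.1:8080/ping}"
EXPECTED="WAPT Server running"

# response_ok: read a /ping response on stdin, succeed if it looks healthy.
response_ok() {
    grep -q "$EXPECTED"
}

# check_once: fetch /ping with a hard timeout and a single attempt, so that a
# hung server cannot hang the watchdog itself.
check_once() {
    wget -q -O - --timeout=10 --tries=1 "$PING_URL" | response_ok
}

# Only act when called with the "run" argument, so the file can be sourced.
if [ "${1:-}" = "run" ]; then
    if ! check_once; then
        logger -t wapt-watchdog "ping failed, restarting waptserver"
        systemctl restart waptserver
    fi
fi
```

Saved under a hypothetical path such as /usr/local/sbin/wapt-watchdog.sh, it could be scheduled from /etc/cron.d with a line like "*/5 * * * * root /usr/local/sbin/wapt-watchdog.sh run". The syslog entries then give a record of how often the hang actually occurs.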

Re: WAPT server constantly crashing

Published: October 16, 2019 - 12:14 PM
by dcardon
Hello Renaud,
renaud.counhaye wrote: Oct 15, 2019 - 3:19 PM ...
It tends to crash sporadically and randomly, to the point of blocking access to the console (Error 504), web access (Error 502), and client updates with a timeout.
On the server, I find the wapttasks service in a failed state

...
Rebooting the machine or stopping and then restarting Nginx, Waptserver, and wapttasks resolves the problem, but it's not ideal...
In fact, even after this fix, wapttasks returns to its failed state.


Thanks for your detailed post. As Hubert mentioned above, this problem has already been reported to us by other users. It seems we're running into a deadlock issue, probably related to PostgreSQL. The next time your WAPT server hangs, could you run the following as root?
sudo -u postgres psql wapt
SELECT datname, usename, client_addr, xact_start, query_start, state_change, state, query FROM pg_stat_activity;

and send me the result via private message (dcardon AT tranquil DOT it).
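
Since the hang may occur when nobody is at the keyboard, the query can also be captured non-interactively. A sketch under assumptions: the database name and column list are those of the query above, while the output directory and the function names are made up for illustration.

```shell
#!/bin/sh
# Dump pg_stat_activity to a timestamped file while the server is hung.
OUT_DIR="${OUT_DIR:-/tmp}"

# activity_sql: print the diagnostic query requested above.
activity_sql() {
    echo "SELECT datname, usename, client_addr, xact_start, query_start, state_change, state, query FROM pg_stat_activity;"
}

# capture_activity: run the query as the postgres user and save the result.
capture_activity() {
    out="$OUT_DIR/pg_stat_activity-$(date +%Y%m%d-%H%M%S).txt"
    # -x prints one column per line, which keeps long queries readable.
    sudo -u postgres psql -d wapt -x -c "$(activity_sql)" > "$out" && echo "$out"
}

# Only act when called with the "run" argument, so the file can be sourced.
if [ "${1:-}" = "run" ]; then
    capture_activity
fi
```

Calling this just before the watchdog's "systemctl restart waptserver" would preserve the state of the database sessions, which a blind restart otherwise discards.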

Regards,

Denis