Scan stopped

Hi,

I have had a monthly scan of a specific VLAN running for 2 years now, but this September it suddenly stopped by itself.
This is the error I see in /var/log/gvm/ospd-openvas.log:

OSPD[3407698] 2023-09-07 14:04:24,955: INFO: (ospd.command.command) Scan dc199ed4-9183-4875-8667-b09bf0c00b49 added to the queue in position 2.
OSPD[3407698] 2023-09-07 14:04:25,079: INFO: (ospd.ospd) Currently 1 queued scans.
OSPD[3407698] 2023-09-07 14:04:25,278: INFO: (ospd.ospd) Starting scan dc199ed4-9183-4875-8667-b09bf0c00b49.
OSPD[3407698] 2023-09-07 16:15:01,619: ERROR: (ospd.ospd) dc199ed4-9183-4875-8667-b09bf0c00b49: Exception Error while reading from connection : (104, ‘Connection reset by peer’) while scanning
Traceback (most recent call last):
File “/usr/lib/python3/dist-packages/redis/connection.py”, line 824, in read_response
response = self._parser.read_response(disable_decoding=disable_decoding)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/connection.py”, line 467, in read_response
self.read_from_socket()
File “/usr/lib/python3/dist-packages/redis/connection.py”, line 421, in read_from_socket
bufflen = self._sock.recv_into(self._buffer)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/usr/lib/python3/dist-packages/ospd/ospd.py”, line 584, in start_scan
self.exec_scan(scan_id)
File “/usr/lib/python3/dist-packages/ospd_openvas/daemon.py”, line 1174, in exec_scan
target_is_finished = kbdb.target_is_finished(scan_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/ospd_openvas/db.py”, line 576, in target_is_finished
status = self._get_single_item(f'internal/{scan_id}')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/ospd_openvas/db.py”, line 470, in _get_single_item
return OpenvasDB.get_single_item(self.ctx, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/ospd_openvas/db.py”, line 268, in get_single_item
return ctx.lindex(name, index)
^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/commands/core.py”, line 2564, in lindex
return self.execute_command(“LINDEX”, name, index)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1238, in execute_command
return conn.retry.call_with_retry(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/retry.py”, line 49, in call_with_retry
fail(error)
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1242, in <lambda>
lambda error: self._disconnect_raise(conn, error),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1228, in _disconnect_raise
raise error
File “/usr/lib/python3/dist-packages/redis/retry.py”, line 46, in call_with_retry
return do()
^^^^
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1239, in <lambda>
lambda: self._send_command_parse_response(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1215, in _send_command_parse_response
return self.parse_response(conn, command_name, **options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1254, in parse_response
response = connection.read_response()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/connection.py”, line 830, in read_response
raise ConnectionError(f"Error while reading from {hosterr}" f" : {e.args}")
redis.exceptions.ConnectionError: Error while reading from connection : (104, ‘Connection reset by peer’)
OSPD[3407698] 2023-09-07 16:15:01,626: INFO: (ospd.ospd) dc199ed4-9183-4875-8667-b09bf0c00b49: Host scan got interrupted. Progress: 98, Status: RUNNING
OSPD[3407698] 2023-09-07 16:15:01,627: INFO: (ospd.ospd) dc199ed4-9183-4875-8667-b09bf0c00b49: Scan interrupted.

How can I check which host failed, or what exactly happened?

Many thanks!

I updated everything, rebooted and launched it again and it stopped at 96%:

OSPD[1331] 2023-09-08 07:55:51,772: INFO: (ospd.ospd) Currently 1 queued scans.
OSPD[1331] 2023-09-08 07:55:51,933: INFO: (ospd.ospd) Starting scan 128e34f6-229a-498a-abf5-fba13b442fb2.
OSPD[1331] 2023-09-08 10:17:52,784: ERROR: (ospd.ospd) 128e34f6-229a-498a-abf5-fba13b442fb2: Exception Error while reading from connection : (104, ‘Connection reset by peer’) while scanning
Traceback (most recent call last):
File “/usr/lib/python3/dist-packages/redis/connection.py”, line 824, in read_response
response = self._parser.read_response(disable_decoding=disable_decoding)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/connection.py”, line 467, in read_response
self.read_from_socket()
File “/usr/lib/python3/dist-packages/redis/connection.py”, line 421, in read_from_socket
bufflen = self._sock.recv_into(self._buffer)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/usr/lib/python3/dist-packages/ospd/ospd.py”, line 584, in start_scan
self.exec_scan(scan_id)
File “/usr/lib/python3/dist-packages/ospd_openvas/daemon.py”, line 1174, in exec_scan
target_is_finished = kbdb.target_is_finished(scan_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/ospd_openvas/db.py”, line 576, in target_is_finished
status = self._get_single_item(f'internal/{scan_id}')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/ospd_openvas/db.py”, line 470, in _get_single_item
return OpenvasDB.get_single_item(self.ctx, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/ospd_openvas/db.py”, line 268, in get_single_item
return ctx.lindex(name, index)
^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/commands/core.py”, line 2564, in lindex
return self.execute_command(“LINDEX”, name, index)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1238, in execute_command
return conn.retry.call_with_retry(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/retry.py”, line 49, in call_with_retry
fail(error)
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1242, in <lambda>
lambda error: self._disconnect_raise(conn, error),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1228, in _disconnect_raise
raise error
File “/usr/lib/python3/dist-packages/redis/retry.py”, line 46, in call_with_retry
return do()
^^^^
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1239, in <lambda>
lambda: self._send_command_parse_response(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1215, in _send_command_parse_response
return self.parse_response(conn, command_name, **options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/client.py”, line 1254, in parse_response
response = connection.read_response()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/redis/connection.py”, line 830, in read_response
raise ConnectionError(f"Error while reading from {hosterr}" f" : {e.args}")
redis.exceptions.ConnectionError: Error while reading from connection : (104, ‘Connection reset by peer’)
OSPD[1331] 2023-09-08 10:17:52,801: INFO: (ospd.ospd) 128e34f6-229a-498a-abf5-fba13b442fb2: Host scan got interrupted. Progress: 91, Status: RUNNING
OSPD[1331] 2023-09-08 10:17:52,801: INFO: (ospd.ospd) 128e34f6-229a-498a-abf5-fba13b442fb2: Scan interrupted.
OSPD[1082900] 2023-09-08 10:19:00,192: INFO: (ospd.main) Starting OSPd OpenVAS version 22.5.4.
OSPD[1082900] 2023-09-08 10:19:00,198: INFO: (ospd_openvas.messaging.mqtt) Successfully connected to MQTT broker
OSPD[1082900] 2023-09-08 10:19:10,246: INFO: (ospd_openvas.daemon) Loading VTs. Scans will be [requested|queued] until VTs are loaded. This may take a few minutes, please wait…
OSPD[1082900] 2023-09-08 10:19:49,299: INFO: (ospd_openvas.daemon) Finished loading VTs. The VT cache has been updated from version 0 to 202309070551.

How can I see what causes this? The machines are all on the same VLAN as the vulnerability scanner (so no inter-VLAN firewall issues), and they are all up. I scan the complete VLAN, not a list of IPs.
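For a quick overview of when scans were interrupted, here is a minimal Python sketch that pulls the "Host scan got interrupted" lines out of the log. The regular expression is modelled only on the log lines quoted in this thread, and `interrupted_scans` is a hypothetical helper name, so other ospd-openvas versions may need a different pattern. These lines do not name a host, so this only shows when and at what progress each scan stopped.

```python
import re

# Pattern modelled on the "Host scan got interrupted" lines quoted in this
# thread (assumption: other ospd-openvas versions may format them differently).
INTERRUPTED = re.compile(
    r"^OSPD\[(?P<pid>\d+)\] (?P<time>\S+ \S+): INFO: \(ospd\.ospd\) "
    r"(?P<scan_id>[0-9a-f-]{36}): Host scan got interrupted\. "
    r"Progress: (?P<progress>\d+), Status: (?P<status>\w+)"
)


def interrupted_scans(lines):
    """Yield (time, scan_id, progress) for every interrupted scan found."""
    for line in lines:
        match = INTERRUPTED.match(line)
        if match:
            yield match["time"], match["scan_id"], int(match["progress"])


# Example with one of the lines from the log above.
example = (
    "OSPD[3407698] 2023-09-07 16:15:01,626: INFO: (ospd.ospd) "
    "dc199ed4-9183-4875-8667-b09bf0c00b49: Host scan got interrupted. "
    "Progress: 98, Status: RUNNING"
)
for time, scan_id, progress in interrupted_scans([example]):
    print(f"{time}  {scan_id}  interrupted at {progress}%")
```

To run it on the real log, open /var/log/gvm/ospd-openvas.log and pass the file object to `interrupted_scans`.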

Did you check why this happens? The OOM killer, for example?
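One way to check for the OOM killer is to search the kernel log for its messages. The sketch below is only an illustration: the patterns are based on common Linux kernel wording, which can differ between kernel versions, and `find_oom_events` and `killed_processes` are hypothetical helper names.

```python
import re

# Common kernel messages for OOM-killer activity (assumption: the exact
# wording varies between kernel versions, so adjust as needed).
INVOKED = re.compile(r"invoked oom-killer")
KILLED = re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)")


def find_oom_events(lines):
    """Return the kernel log lines that mention the OOM killer."""
    return [
        line.rstrip("\n")
        for line in lines
        if INVOKED.search(line) or KILLED.search(line)
    ]


def killed_processes(lines):
    """Return (pid, name) for each process reported as killed."""
    killed = []
    for line in lines:
        match = KILLED.search(line)
        if match:
            killed.append((int(match["pid"]), match["name"]))
    return killed
```

Feed it the kernel log as a list of lines, for example the output of `journalctl -k` or `dmesg` saved to a file. An empty result suggests the kernel did not kill anything and the process gave up on its own.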

Indeed, it’s happening “randomly” between 80% and 96% of the job (about 100 VMs being scanned).
I’ll allocate more RAM/CPUs to the machine.

It looks more like a CPU issue. Something has changed in the latest version of the scanner, as it is now loading the CPUs heavily.
I already have 16 CPUs on this machine, and the load is at 48 (nothing else is running).
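To put those numbers in perspective, load is easier to judge when divided by the CPU count. A minimal sketch (the helper names are made up for illustration; `os.getloadavg` is only available on Unix-like systems):

```python
import os


def load_per_cpu(load_average, cpu_count):
    """Return the load average divided by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def current_load_per_cpu():
    """Read the 1-minute load average and the CPU count from the OS."""
    one_minute, _, _ = os.getloadavg()
    return load_per_cpu(one_minute, os.cpu_count() or 1)


# Figures from the post: a load of 48 on 16 CPUs.
print(load_per_cpu(48, 16))  # 3.0
```

A value well above 1 means more runnable work than the CPUs can handle, so a load of 48 on 16 CPUs is roughly three times oversubscribed.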

It looks like the Redis server is not happy (this happens with 16 CPUs and 32 GB of RAM):

Sep 11 09:39:47 svx-vs1 redis[1288]: Out Of Memory allocating 16390 bytes!
Sep 11 09:39:47 svx-vs1 redis[1288]:
=== REDIS BUG REPORT START: Cut & paste starting from here ===
Sep 11 09:39:47 svx-vs1 redis[1288]: ------------------------------------------------
Sep 11 09:39:47 svx-vs1 redis[1288]: !!! Software Failure. Press left mouse button to continue
Sep 11 09:39:47 svx-vs1 redis[1288]: Guru Meditation: Redis aborting for OUT OF MEMORY. Allocating 16390 bytes! #server.c:6658
Sep 11 09:39:47 svx-vs1 redis[1288]:

You need to find out why this happens. Also, how do you schedule your tasks (max parallel, etc.)? It is clearly a resource issue.

Check for the latest stable GVM toolchain, and turn on memory accounting to see which process is taking all your memory. Check overcommitment and swap as well.
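As a starting point for the overcommit and swap check, the relevant figures can be read from /proc/meminfo on Linux. This is just a sketch with hypothetical helper names, not a replacement for proper memory accounting, and it does not show per-process usage.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: number}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_summary(meminfo):
    """Pick out the fields relevant to overcommit and swap usage."""
    swap_used = meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)
    return {
        "available_kb": meminfo.get("MemAvailable"),
        "swap_used_kb": swap_used,
        "committed_kb": meminfo.get("Committed_AS"),
        "commit_limit_kb": meminfo.get("CommitLimit"),
    }
```

On the scanner host you would call `memory_summary(parse_meminfo(open("/proc/meminfo").read()))` while a scan is running. The kernel's overcommit policy itself is in /proc/sys/vm/overcommit_memory.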

Indeed, it looks like Redis is not happy. I added more RAM and swap, and the same thing happened (swap is never used). It’s a monthly task I have run for years without problems; it’s sequential, 20 hosts at a time, nothing fancy.

I’ll reduce it to about 10.

Thanks for the help! Reducing “Max concurrently scanned hosts” did the trick. It looks to me like the sweet spot for tuning the scan is 1 CPU per concurrently scanned host (RAM is fine at 24 GB). I now have 90 hosts scanned (Full and Fast) in under 10 hours.
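That rule of thumb is easy to turn into a starting value. The sketch below is only my reading of it: `suggested_max_hosts` is a made-up helper, and the ratio of 1 host per CPU is an observation from this one setup rather than an official recommendation.

```python
import os


def suggested_max_hosts(cpu_count, hosts_per_cpu=1.0):
    """Suggest a 'Max concurrently scanned hosts' value from the CPU count."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    # Never suggest fewer than one host, even with a ratio below 1.
    return max(1, int(cpu_count * hosts_per_cpu))


# With the 16 CPUs mentioned earlier in the thread:
print(suggested_max_hosts(16))  # 16

# For the machine this runs on:
print(suggested_max_hosts(os.cpu_count() or 1))
```

Treat the result as a value to start from and then watch the load and Redis memory during a scan.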