Redis crashes - double free or corruption

GVM versions

gsad: 21.04.0~git-e2a6556ec-gsa-21.04
gvmd: 21.4.0~git-2f2c39ee-gvmd-21.04
openvas-scanner: 21.4.1~git-2c77939e-openvas-21.04
gvm-libs: 21.4.1~git-04dd8db-gvm-libs-21.04

Environment

Operating system: Debian 10
Installation method / source: Source

Firstly, thanks to everyone at Greenbone for all the great work.

I know this is not a support channel, but maybe you can give me some hints.

At 99% of a particular scan, redis-server shuts down and the scan status changes to Interrupted.
Here is the stack trace:

I have been monitoring it, and right at 99% redis uses all of the memory (or at least very close to it).
Openvas sets redis up to use everything it can get from the OS, since no maxmemory is set in the config.
Should it be crashing from this, given that it has no eviction policy? What happens when there is no memory left?
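In case it helps anyone reproduce this, a quick way to watch the redis side while the scan runs might look like the following (the socket path is taken from my config; the field names are standard redis INFO output):

```shell
# Poll redis memory state over the openvas unix socket. With maxmemory 0
# (the default) and the noeviction policy, redis never frees anything on
# its own; allocations simply fail or the kernel OOM killer steps in.
redis-cli -s /run/redis-openvas/redis.sock info memory \
  | grep -E '^(used_memory_human|maxmemory|maxmemory_policy):'
```

If maxmemory reports 0 here, redis will keep growing until the kernel intervenes, which would match a crash right at 99%.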

Once it shuts down I have to restart redis-server and the scanner.
The build uses libglib2.0-dev for glib.

(A more or less related issue where someone had a similar error and ran out of memory.)

Thank you in advance


The topic below should give an answer:

especially this comment:


Thanks for the quick reply @cfi.
If the permissions were wrong, would the task even get to 99%, or run at all?
I believe everything is correct, but I double-checked:

openvas -s

non_simult_ports = 139, 445, 3389, Services/irc
allow_simultaneous_ips = yes
safe_checks = yes
nasl_no_signature_check = yes
time_between_request = 0
expand_vhosts = yes
max_checks = 10
optimize_test = yes
report_host_details = yes
debug_tls = 0
config_file = /opt/gvm/etc/openvas/openvas.conf
unscanned_closed_udp = yes
drop_privileges = no
test_empty_vhost = no
plugins_timeout = 320
cgi_path = /cgi-bin:/scripts
checks_read_timeout = 5
unscanned_closed = yes
auto_enable_dependencies = yes
log_whole_attack = no
db_address = /run/redis-openvas/redis.sock
vendor_version = 
test_alive_hosts_only = no
log_plugins_name_at_load = no
scanner_plugins_timeout = 36000
timeout_retry = 3
max_hosts = 30
include_folders = /opt/gvm/var/lib/openvas/plugins
open_sock_max_attempts = 5
plugins_folder = /opt/gvm/var/lib/openvas/plugins 

(/run/redis-openvas/redis.sock seems alright)

cat /opt/gvm/etc/openvas/openvas.conf

db_address = /run/redis-openvas/redis.sock
(it is in openvas.conf)

cat ./etc/redis/redis.conf

(screenshot of the relevant redis.conf lines)

ps aux | grep ospd


(gvm user is running the process)

getent group redis

redis:x:109:gvm
(gvm is in redis’ group)

/run/redis-openvas# ls -hal

total 4.0K
drwxr-xr-x 1 redis redis 48 May 18 17:08 .
drwxr-xr-x 1 root root 178 May 19 06:43 ..
-rw-rw---- 1 redis redis 4 May 18 17:08 redis-server.pid
srwxrwx--- 1 redis redis 0 May 18 17:08 redis.sock

(redis owns the socket)

Perhaps it really does crash when it has no memory left. Any ideas on what to debug?

Thanks!

Unfortunately I don’t have any deeper knowledge of this topic, but @jjnicola might be able to give some insight.

Otherwise, trying the referenced pull request, which prevents openvas from crashing, could also yield a few additional log entries / hints.

I’m certainly not a redis expert, but some time ago, based on a recommendation from another user, I raised the number of databases redis creates to 512. My original build had been using 128. If you are starting redis without the --databases option, then I think the default is only 16.

--databases 512

I remember it being needed for larger scans. Maybe this could resolve it for you.
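For anyone else reading, the equivalent setting in the config file looks like this (the file path is whatever your redis-openvas service loads; 512 is just the value that worked for me):

```
# redis.conf used by the openvas redis instance
# Default is 16 logical databases; larger scans can exhaust that.
databases 512
```

This is the same as passing --databases 512 on the redis-server command line.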

-Scott


Same situation here with redis.

It’s an out-of-memory (OOM) kill.
You can find it with “dmesg -T”.

Every time.
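A grep like the following (the pattern is just a suggestion) pulls the relevant lines out of the kernel log:

```shell
# Show recent OOM-killer activity with human-readable timestamps.
dmesg -T | grep -iE 'out of memory|oom-killer|killed process' | tail -n 20
```

If redis-server shows up as the killed process, that confirms the kernel is reclaiming memory rather than redis crashing on its own.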

I tried to fix it with maxmemory, but then the ospd-scanner daemon is not happy.

So, no idea how to proceed at the moment.

@drm I don’t think it is a permission issue; in that case the scan would not start at all. Please check what @mik said; you probably have a memory-related issue.

I don’t know how many tasks you are running in parallel, how many hosts in parallel, how many plugins per host, whether your hosts have virtual hosts, or whether you have the “expand_vhosts” option enabled (default yes). This information would be useful for finding the problem/solution.

Both ospd-openvas and openvas are able to limit the number of running processes when available memory is low. Ospd-openvas will simply avoid starting a new task. Openvas will stop launching plugins against the target until there is enough available memory. Of course this slows the scan down, but it avoids memory issues.
You can run fewer hosts in parallel (default 20) and fewer plugins per host in parallel (default 4). If it is not necessary, disable the expand_vhosts option.
If you need to scan many vhosts, you can split the task and scan some vhosts while excluding the others (add those vhosts to the excluded hosts).
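As a sketch, the suggestions above map onto these openvas.conf preferences (the values are illustrative examples, not recommendations; the preference names match the openvas -s output earlier in the thread):

```
# /opt/gvm/etc/openvas/openvas.conf - reduce parallelism to ease memory pressure
max_hosts = 10        # hosts scanned in parallel (20 by default)
max_checks = 2        # plugins per host in parallel (4 by default)
expand_vhosts = no    # skip automatic vhost expansion if not needed
```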

I found a solution for me -> 10 GB of memory for the openvas VM.
I tried a lot, also with redis settings. No chance.

Now, with increased memory, my scans run…

I had 6 GB before; this was not enough -> OOM.
Now it’s OK, but I have tested only with one task against two hosts, and in sequence.


Thank you for your time @cfi, @immauss, @jjnicola and @mik!

It certainly looks like a memory issue. I made the recommended changes to redis (THP, overcommit, and so on) but have yet to retest.
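For reference, the host-level changes meant here are the standard ones redis itself warns about at startup; as root, roughly (one-shot, not persistent across reboots):

```
# Allow the kernel to overcommit memory so redis allocations and
# background forks behave as redis expects under pressure.
sysctl -w vm.overcommit_memory=1
# Disable transparent huge pages, which redis also warns about.
echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

To make these persistent, vm.overcommit_memory = 1 goes in /etc/sysctl.conf, and the THP line needs a boot-time hook (rc.local or a systemd unit).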

It was my bad not to give the appropriate details earlier; my openvas -s output is the one posted above.

As @jjnicola stated, expand_vhosts is enabled.
The concurrency settings are default (4 for NVTs and 20 for hosts).
I will try tweaking all of the latter, including the amount of memory.

Again, thank you for the help!


For me it is solved with enough memory. I decreased the machine’s memory to 10 GB and it’s OK; it uses 8 GB at the moment.
I wonder why memory consumption is so much higher than before, but I don’t understand it. We have to deal with it.

You really might try disabling the expand_vhosts preference. I suspect you are scanning a server with a lot of vhosts.