Operating system: Alpine Linux 3.12 based container on a FlatCar Linux 2345.3.0 Kernel: 4.19.106-flatcar Installation method / source: Alpine packages
We want to evaluate GVM/OpenVAS as an internal security scan service for our cloud resources. We’re using Kubernetes running on FlatCar Linux nodes in AWS, which means we can’t use the official Community Edition ISO either. I managed to start everything. It was a big task, as this stack isn’t designed for Kubernetes/cloud. That’s not a complaint; I just want to say that I had to be creative and I might have made some mistakes during this process. (If you’re interested, I can share the deployment details.)
Now I can log in to the web UI, I can run a scan, and I can see that the report is ready, but I can’t view or export it. I can see the vulnerabilities relevant to that one test node on the Vulnerabilities page. So I think everything went well under the hood; I just can’t see the report itself.
When I try to view it, I get back an error (changed // to __): Error while loading Results for Report db70a025-f89d-46d8-890a-1e27e4666c97 Please try again.
Thank you. I’m working on getting a working gvm-cli.
Meanwhile I did another try, this time without any non-ASCII characters in the names, without single quoting, and accessing the UI directly rather than through an ALB. The report now shows on the web UI, sometimes. Sometimes I have to wait minutes and it either times out or appears. While waiting I tried to list the users; that process got stuck as well, with the output below. After it exited, the report suddenly appeared on the UI.
md main:MESSAGE:: Greenbone Vulnerability Manager version 9.0.1 (DB revision 221)
md manage: INFO:: Getting users.
md manage:WARNING:: database must be initialised from scanner
md manage:WARNING:: sql_exec_internal: PQexec failed: ERROR: deadlock detected
DETAIL: Process 4330 waits for AccessExclusiveLock on relation 18126 of database 16385; blocked by process 4182.
Process 4182 waits for AccessShareLock on relation 18121 of database 16385; blocked by process 4330.
HINT: See server log for query details.
md manage:WARNING:: sql_exec_internal: SQL:
CREATE OR REPLACE VIEW result_new_severities AS
SELECT results.id as result, users.id as user, dynamic, override,
  CASE WHEN dynamic != 0 THEN
    CASE WHEN override != 0 THEN
      coalesce ((SELECT ov_new_severity FROM result_overrides
                 WHERE result = results.id
                   AND result_overrides.user = users.id
                   AND severity_matches_ov (current_severity (results.severity, results.nvt), ov_old_severity)
                 LIMIT 1),
                current_severity (results.severity, results.nvt))
    ELSE current_severity (results.severity, results.nvt) END
  ELSE
    CASE WHEN override != 0 THEN
      coalesce ((SELECT ov_new_severity FROM result_overrides
                 WHERE result = results.id
                   AND result_overrides.user = users.id
                   AND severity_matches_ov (results.severity, ov_old_severity)
                 LIMIT 1),
                results.severity)
    ELSE results.severity END
  END AS new_severity
FROM results, users,
  (SELECT 0 AS override UNION SELECT 1 AS override) AS override_opts,
  (SELECT 0 AS dynamic UNION SELECT 1 AS dynamic) AS dynamic_opts;
md manage:WARNING:: sqlv: sql_exec_internal failed
Also, I’m experiencing very long running times for every gvmd command (--get-users, --get-scanners). It looks as if it reinitializes the database every time. So it might not be a UI issue; the backend is just so slow that the UI times out.
I also see these two processes with different PIDs, so I assume they’re constantly restarting:
I dug a little deeper. The results page on the UI doesn’t work because the reverse proxy in front of GSA kills connections that are stale for more than a minute. It doesn’t work from gvm-pyshell either (I copied the filter from the request captured on the UI), and for the same reason, a timeout, even though there is no reverse proxy in that path.
>>> results = gmp.get_results( details=True, filter="apply_overrides=0 levels=hml rows=100 min_qod=70 first=1 sort-reverse=severity _and_report_id=ec4f3697-68a3-4c83-a69a-641e6430b73a" )
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/usr/lib/python3.8/site-packages/gvm/protocols/gmpv7/__init__.py", line 3748, in get_results
File "/usr/lib/python3.8/site-packages/gvm/protocols/base.py", line 86, in _send_xml_command
File "/usr/lib/python3.8/site-packages/gvm/protocols/base.py", line 140, in send_command
File "/usr/lib/python3.8/site-packages/gvm/protocols/base.py", line 137, in send_command
response = self._read()
File "/usr/lib/python3.8/site-packages/gvm/protocols/base.py", line 63, in _read
File "/usr/lib/python3.8/site-packages/gvm/connections.py", line 136, in read
data = self._read()
File "/usr/lib/python3.8/site-packages/gvm/connections.py", line 98, in _read
File "/usr/lib/python3.8/ssl.py", line 1226, in recv
File "/usr/lib/python3.8/ssl.py", line 1101, in read
socket.timeout: The read operation timed out
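The trace above is a plain client-side read timeout: python-gvm’s connection classes default to a 60-second socket timeout (`gvm.connections.DEFAULT_TIMEOUT`), which a multi-minute `get_results` call will always exceed. A minimal sketch follows; the hostname is a placeholder and `build_filter` is a hypothetical convenience helper of mine, not part of python-gvm:

```python
# Sketch: build the GMP filter string programmatically instead of
# hand-assembling the long one-liner. build_filter is a hypothetical
# helper, not a python-gvm API.

def build_filter(terms: dict) -> str:
    """Join GMP filter terms, e.g. {'rows': 100} -> 'rows=100'."""
    return " ".join(f"{key}={value}" for key, value in terms.items())

filter_string = build_filter({
    "apply_overrides": 0,
    "levels": "hml",
    "rows": 100,
    "min_qod": 70,
    "first": 1,
    "sort-reverse": "severity",
})

# With a live gvmd, the connection would be created with a longer timeout:
#   from gvm.connections import TLSConnection
#   connection = TLSConnection(hostname="gvmd.example.internal", timeout=600)
# and the filter passed as gmp.get_results(details=True, filter=filter_string).
print(filter_string)
```

Raising the timeout only moves the client out of the way, of course; the underlying query still needs to get faster.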
If I disable the reverse proxy I do get an answer, but it takes nearly 5 minutes. After increasing the PostgreSQL resources to 2 vCPU and 8 GiB RAM, it takes about 2 minutes. The whole database is only about 2 GiB.
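On the proxy side: if the reverse proxy in front of GSA is the ALB mentioned earlier, the one-minute cutoff matches an ALB’s default idle timeout of 60 seconds. That timeout is configurable (up to 4000 seconds), which at least lets slow responses arrive while the root cause is investigated. A sketch with a placeholder ARN:

```shell
aws elbv2 modify-load-balancer-attributes \
  --load-balancer-arn <your-alb-arn> \
  --attributes Key=idle_timeout.timeout_seconds,Value=600
```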
So, I’m a bit lost about where to look next. Does anyone have any ideas on how to improve the performance?
I tried what I could, but without any luck. Fetching any result is still unusably slow. I would love to use OpenVAS, and even buy commercial support after the evaluation, sparing us a Nessus subscription, but it’s a no-go with this issue. It’s very annoying that everything works (the updates, the scanning, etc.) except the results, which are basically just a SELECT from the database.
We all know that virtualized IO can be painfully slow, so you might want to check a native installation. Don’t compare an uncoordinated installation with a Greenbone appliance built for scan performance; we have customers with millions of results in the database, so there must be something wrong with your self-compiled installation.
Thanks for the answers. I know that something is wrong, but I don’t know what. I spent my last day/week investigating this more deeply. I ran a lot of benchmarks and stress tests (from simple dd to pgbench), and based on those, the virtual machine is at least as fast as my own notebook, where I can see the reports. During this I also found a small SQL injection, but I don’t know where to report it (gsad/gvmd/gvm-libs).
We don’t have any on-premise infrastructure, so our only option is the cloud. We prefer to use Kubernetes as we already have all the basic infrastructure elements there (RBAC, logging, monitoring, etc.).
I’m using TCP connections where possible to avoid having huge pods, so I have standalone Redis, PostgreSQL, GVM/OpenVAS, and GSAD pods (and two standalone CronJobs to update the feeds). As a test I put them all into one huge pod where everything used a local UNIX socket, but that didn’t help either; the performance was the same.
After the gvm-pyshell test I copied the SQL statement and executed it directly on PostgreSQL over a local UNIX socket. It took 90 seconds to complete in Kubernetes, and 30 seconds both on a test EC2 instance and on my notebook.
Now I’m trying to switch PostgreSQL from Alpine to Ubuntu, hoping the musl/glibc difference is what causes this issue. After that I can run the get_results tests you recommended. Unfortunately it takes a day to bring up a new test environment, finish the initial updates, and run a test scan to gather some results.
I still have some time to continue the experiments, but after that we’ll buy a Nessus subscription and use a 3rd-party AMI.
I did some tests. The Ubuntu-based installation is significantly faster than the Alpine-based one. It could be a musl/glibc difference or something else; I didn’t dig deeper.
I did the tests on this deployment. Enabling or disabling the details flag didn’t introduce any change:
45.775 seconds with details.
45.942 seconds without details.
The results are the same as on the web UI, and these are the only results in the DB. I managed to make this a bit faster with some “HW optimization”, so now I get back the same results in 34 seconds. Is that usual/acceptable, or is it still too slow?
I continued to dig deeper. I found out that this is about only one specific query, and it’s not the disk I/O at all (I managed to try it on physical hardware); it’s CPU bound. One postgres process runs at 100% CPU for 35 seconds.
I started to cut off parts of the query to identify the problematic piece; the EXPLAIN ANALYZE output is huge and I didn’t find anything in it (I’m not an SQL expert).
The original query ran for: 32.421s
Without ORDER BY: 25.040s
Changed auto_type to static 0: 31.867s
Without ORDER BY AND changed auto_type to static 0: 1.548s
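The variant timings above can be gathered systematically. A minimal sketch of that bisection approach, with throwaway CPU-bound lambdas standing in for the real query variants (in the real setup each variant would be executed via psql or a PostgreSQL driver):

```python
# Sketch: time each query variant once and sort by runtime so the
# expensive combination stands out. The lambdas are placeholders for
# the actual SQL variants.
import time

def time_variant(fn):
    """Return fn's runtime in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

def bench(variants):
    """Time every named variant once and return {name: seconds}."""
    return {name: time_variant(fn) for name, fn in variants.items()}

# Placeholder workloads standing in for the four query variants:
variants = {
    "original": lambda: sum(i * i for i in range(200_000)),
    "no_order_by": lambda: sum(i * i for i in range(150_000)),
    "static_auto_type": lambda: sum(i * i for i in range(190_000)),
    "both": lambda: sum(i * i for i in range(10_000)),
}
for name, seconds in sorted(bench(variants).items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds:.3f}s")
```

Sorting by runtime makes the outlier combination obvious at a glance, which is essentially what the manual measurements above show.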
It looks like it gets slow when certain views are involved, and it burns CPU only; no disk is involved at all (which is why it’s slow even on physical hardware).
I’m using PostgreSQL 12.2. Should I try with an older one? Is there an officially supported/recommended PostgreSQL version?
It looks like gvmd doesn’t support recent PostgreSQL versions, as the get_results operation is 150 times slower than before. It’s not a problem (yet), it was just very annoying that this is not documented anywhere. But at least I learned how to deploy the whole stack blindfolded.
Hi, our reference system is the Debian version that was called stable at the release date. Our Greenbone OS is based on that system, and of course Greenbone OS is always our target system. We mention this in the INSTALL.md files, e.g. here.
Thanks, I read that. Unfortunately it only says PostgreSQL >= 9.6; it doesn’t say PostgreSQL < 12 (Buster has PostgreSQL 11). BTW, I tried to use Debian but didn’t find any pre-compiled packages for it. I also understand that you’re concentrating on your official product. That’s okay; there is no problem with that.
My issue originated from the fact that the GVM/OpenVAS stack is not cloud-ready. If there were a “pre-packaged” cloud-ready product, we would prefer to buy it instead of creating our own deployment; it’s still cheaper than spending literally more than a hundred engineering hours. During that work I had to introduce a vast number of workarounds to make it run, and when it didn’t work I faced a complex, heavily modified system that I had never used before, so I didn’t have any previous experience either.
I’ll try to make our deployment open source to help other people who want to deploy GVM/OpenVAS to Kubernetes, if there are any.
TL;DR: You might have issues when you start using PostgreSQL 12. I’ve never seen this magnitude of performance degradation between two PostgreSQL versions; it was totally unexpected.
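One hypothesis worth checking for anyone who hits this (my assumption, not something confirmed in this thread): PostgreSQL 12 is the first release that ships with JIT compilation enabled by default (it exists in 11 but is off by default). For queries full of large expressions, such as gvmd’s severity views, JIT compilation time can dwarf the query itself, which would look exactly like a CPU-bound regression between 11 and 12:

```sql
-- Quick check in one session: turn JIT off and re-run the slow query.
SET jit = off;
-- ...re-run the slow get_results query here and compare timings...

-- If that fixes it, disable JIT cluster-wide (requires superuser):
ALTER SYSTEM SET jit = off;
SELECT pg_reload_conf();
```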
Anyone can buy a ready product as a VM, a physical appliance, or as a Platform Service (cloud-ready) at large scale. If you want to use our free-as-in-freedom software, that is fine; if you don’t have the technical experience or don’t want to invest the hours, you can buy a solution from Greenbone. That is your freedom. There is always a hard way and an easy one.
As far as I know, the packagers for Kali are working directly on Debian; the sources of the packages can be found at https://salsa.debian.org/pkg-security-team. We (Greenbone) offer a ready-to-use VM called the Greenbone Community Edition, which is directly derived from our products. At the moment we are not going to provide packages for any distribution.
Welcome to Free Software. Nobody can guarantee that a free software project works for your use case and environment. That’s also a reason why we only provide support for our own products; it is nearly impossible to support all possible setups and differences. But it also shows the strength of free software: everybody can try it out, find solutions for their use case, make changes, and let others know about them.
Thanks for this info! This will help us a lot in the future. As I already wrote, this is exactly what I like so much about Free Software: learn from others and share. Personally I am using PostgreSQL 12 too, but I haven’t noticed this huge performance degradation yet.
Hi, I stumbled across your topic today. I would be interested in your deployment (maybe this one: https://github.com/admirito/gvm-containers?), and I guess some other people on this forum would be too. At the moment my Docker solution (https://github.com/carlstegmann/snippets/tree/master/gvm2008) is not ready to be deployed inside k8s. Some of the missing parts are splitting the components into separate containers and sharing the NVT/SCAP data between multiple k8s clusters, so that you don’t need to keep the same data synced all day long in the same DC where your k8s clusters reside.
We created a totally different deployment, splitting everything we could to avoid running one huge container.
We implemented a few scripts to replace the official ones and created our own images. I have to ask my manager whether I can share the details and will be back with the answer.
To respond to the other answers: I looked first for a cloud-ready product from Greenbone but didn’t find one; I saw only virtual appliances. They’re fine for an on-premise DC, but for the cloud I was looking for an AMI or, better, an AWS Marketplace subscription.