prompt user before changing NetBox database passwords out from underneath existing database #565

Closed
purplealien51 opened this issue Jan 26, 2025 · 6 comments
Assignees
Labels
bug (Something isn't working), control.py (Related to control.py script), netbox (Related to Malcolm's use of NetBox)
Milestone

Comments

@purplealien51

purplealien51 commented Jan 26, 2025

Describe the bug
NetBox is not loading. The Docker logs for the netbox container say "password authentication failed for user netbox".

Screenshots and/or Logs

get_for_models(*apps.get_models()).values()', '                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/contrib/contenttypes/models.py", line 91, in get_for_models', '    for ct in cts:', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/db/models/query.py", line 400, in __iter__', '    self._fetch_all()', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/db/models/query.py", line 1928, in _fetch_all', '    self._result_cache = list(self._iterable_class(self))', '                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/db/models/query.py", line 91, in __iter__', '    results = compiler.execute_sql(', '              ^^^^^^^^^^^^^^^^^^^^^', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/db/models/sql/compiler.py", line 1560, in execute_sql', '    cursor = self.connection.cursor()', '             ^^^^^^^^^^^^^^^^^^^^^^^^', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/utils/asyncio.py", line 26, in inner', '    return func(*args, **kwargs)', '           ^^^^^^^^^^^^^^^^^^^^^', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/db/backends/base/base.py", line 316, in cursor', '    return self._cursor()', '           ^^^^^^^^^^^^^^', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/db/backends/base/base.py", line 292, in _cursor', '    self.ensure_connection()', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/utils/asyncio.py", line 26, in inner', '    return func(*args, **kwargs)', '           ^^^^^^^^^^^^^^^^^^^^^', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/db/backends/base/base.py", line 274, in ensure_connection', '    with self.wrap_database_errors:', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/db/utils.py", line 91, in __exit__', '    raise dj_exc_value.with_traceback(traceback) from 
exc_value', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/db/backends/base/base.py", line 275, in ensure_connection', '    self.connect()', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/utils/asyncio.py", line 26, in inner', '    return func(*args, **kwargs)', '           ^^^^^^^^^^^^^^^^^^^^^', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/db/backends/base/base.py", line 256, in connect', '    self.connection = self.get_new_connection(conn_params)', '                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/utils/asyncio.py", line 26, in inner', '    return func(*args, **kwargs)', '           ^^^^^^^^^^^^^^^^^^^^^', '  File "/opt/netbox/venv/lib/python3.12/site-packages/django/db/backends/postgresql/base.py", line 277, in get_new_connection', '    connection = self.Database.connect(**conn_params)', '                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^', '  File "/opt/netbox/venv/lib/python3.12/site-packages/psycopg/connection.py", line 119, in connect', '    raise last_ex.with_traceback(None)', 'django.db.utils.OperationalError: connection failed: connection to server at "172.18.0.9", port 5432 failed: FATAL:  password authentication failed for user "netbox"', '', " loaded config '/etc/netbox/config/configuration.py'", " loaded config '/etc/netbox/config/extra.py'", " loaded config '/etc/netbox/config/logging.py'", " loaded config '/etc/netbox/config/plugins.py'", '']
django.db.utils.OperationalError: connection failed: connection to server at "172.18.0.9", port 5432 failed: FATAL:  password authentication failed for user "netbox"
[ Use DB_WAIT_DEBUG=1 in netbox.env to print full traceback for errors here ]
⏳ Waiting on DB... (0s / 30s)

Malcolm Version:

  • Version [e.g. v25.01.0]

How are you running Malcolm?

  • Malcolm running in Ubuntu 22.04 using docker

Additional context
I tried restarting the netbox container after resetting the password with the ./auth_setup script, but nothing works.

@purplealien51 purplealien51 added the bug (Something isn't working) label Jan 26, 2025
@mmguero mmguero added this to Malcolm Jan 26, 2025
@trwagner1

Is this a VM deployment? Did you configure the primary IP, or did it request an IP with DHCP?

I've seen this when my VM or hardware has a DHCP assigned address and I configured the primary interface, but failed to restart the Malcolm server after completing the configuration and auth_setup script.

@mmguero, I'd recommend inserting a section into the initial configuration that detects whether the primary NIC uses DHCP and prompts the user to configure the primary IPv4 address manually before the system reboots. That way, when the Malcolm configuration script runs, it does so under the permanent IP address.

@purplealien51
Author

Hi @trwagner1,
I have it deployed in an Ubuntu VM running in Proxmox, using the git repo with Docker.
The VM is assigned a static IP through the pfSense DHCP server.
All containers are running except NetBox, which is stuck in the "starting" state. I can use OpenSearch Dashboards, Arkime, and CyberChef, and File Upload is working as well.
I have configured one interface, which is mirrored, and I can see all traffic in Arkime and Dashboards.
Only NetBox is giving a Bad Gateway error.

Image

@mmguero
Collaborator

mmguero commented Feb 3, 2025

Hi, my apologies, I was on travel for DHS all last week and unable to monitor all of this. Getting caught up now. I'm working on seeing if I can reproduce the issue.

Thanks for the ideas, @trwagner1. In this case, where the networking is all internal to the Docker network, I don't think it's an issue with the IP address/DHCP.

If it's a matter of getting up and running again, I'm pretty sure that shutting down Malcolm, running rm -rf ./netbox/postgres/* ./netbox/redis/*, and then starting Malcolm back up should get you going again. That would, however, delete your existing NetBox inventory. If you've got a backup of it, you could then restore it.
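Before running a destructive command like the one above, it can help to preview what would be removed. The following is a minimal sketch, not part of Malcolm: the helper name `clear_contents` is hypothetical, and it assumes it is run from the Malcolm installation directory with Malcolm already stopped. Unlike the shell glob `*`, it also lists dotfiles.

```python
from pathlib import Path
import shutil


def clear_contents(directory, dry_run=True):
    """Remove everything inside `directory` but keep the directory itself.

    Returns the entries that were removed (or, in a dry run, would be removed).
    Hypothetical helper for illustration; note it includes dotfiles,
    which the shell glob `*` would skip.
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    entries = sorted(root.iterdir())
    if not dry_run:
        for entry in entries:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)  # real subdirectory: remove recursively
            else:
                entry.unlink()  # regular file or symlink
    return entries


if __name__ == "__main__":
    # Directories taken from the rm -rf command quoted in this comment.
    for name in ("./netbox/postgres", "./netbox/redis"):
        for entry in clear_contents(name, dry_run=True):
            print("would remove", entry)
```

Calling `clear_contents(name, dry_run=False)` performs the deletion, which, as noted above, discards the existing NetBox inventory.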

I'm working on trying to reproduce the issue locally.

@mmguero mmguero changed the title from "netbox not loading - 502 Bad gateway" to "NetBox fails to start due to invalid internal password" Feb 3, 2025
@mmguero mmguero added the netbox (Related to Malcolm's use of NetBox) label Feb 3, 2025
@mmguero mmguero self-assigned this Feb 3, 2025
@mmguero mmguero moved this to Todo (investigate) in Malcolm Feb 3, 2025
@mmguero mmguero added this to the v25.02.0 milestone Feb 3, 2025
@mmguero
Collaborator

mmguero commented Feb 3, 2025

Okay, I was able to reproduce. Here's the scenario:

  1. Use ./scripts/auth_setup to set the netbox-related passwords in their respective environment variable files in ./config/
  2. Start Malcolm from a fresh/empty state, let NetBox bootstrap and populate itself
  3. Stop Malcolm (do not wipe, just stop)
  4. Use ./scripts/auth_setup to set the netbox-related passwords to new values
  5. Start Malcolm again, which will use the netbox database populated previously in step 2

Results:

Error messages like:

netbox-postgres-1     | 2025-02-03 16:52:07.184 UTC [57] FATAL:  password authentication failed for user "netbox"
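To check whether a deployment is in this state, one could scan the container logs for the message above. This is only a sketch: the function name `find_auth_failures` is made up for illustration, and the sample lines are the ones quoted in this thread.

```python
import re

# Matches the PostgreSQL authentication failure quoted in this thread.
AUTH_FAILURE = re.compile(r'password authentication failed for user "(?P<user>[^"]+)"')


def find_auth_failures(log_lines):
    """Return the set of database users that failed password authentication."""
    users = set()
    for line in log_lines:
        match = AUTH_FAILURE.search(line)
        if match:
            users.add(match.group("user"))
    return users


sample = [
    'netbox-postgres-1     | 2025-02-03 16:52:07.184 UTC [57] FATAL:  password authentication failed for user "netbox"',
    "⏳ Waiting on DB... (0s / 30s)",
]
print(find_auth_failures(sample))  # {'netbox'}
```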

Prognosis: Once the passwords have been changed "out from underneath" the NetBox postgresql database, there's not a whole lot one can do. That's kind of the point of the password. So here's what I propose we do to prevent this from happening:

  • During ./scripts/auth_setup, ONLY populate new NetBox-related passwords if either of the following is true:
    • the passwords have not been set yet (empty or all x's)
    • the passwords are already set, and the user confirms the change after being warned that they will need to wipe their netbox database to reset it
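The rule proposed above could be sketched roughly as follows. This is an illustration of the logic only, not the actual auth_setup code: `is_unset`, `may_overwrite`, and the `confirm` callback are hypothetical names, and the warning text is invented.

```python
def is_unset(password):
    """Treat an empty value or a string made up only of x's as "not set yet"."""
    return password == "" or set(password) == {"x"}


def may_overwrite(current_password, confirm):
    """Decide whether new NetBox passwords may be written.

    `confirm` is a callable that shows a warning and returns True or False
    (hypothetical; the real prompt mechanism may differ).
    """
    if is_unset(current_password):
        return True  # nothing to clobber, so no prompt is needed
    return confirm(
        "NetBox passwords are already set. Changing them requires "
        "wiping the NetBox database. Continue?"
    )


print(may_overwrite("", lambda prompt: False))          # True
print(may_overwrite("xxxxxxxx", lambda prompt: False))  # True
print(may_overwrite("s3cret", lambda prompt: False))    # False
print(may_overwrite("s3cret", lambda prompt: True))     # True
```

Passing the prompt in as a callable keeps the decision testable without a terminal.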

@mmguero mmguero changed the title from "NetBox fails to start due to invalid internal password" to "NetBox fails to start due to invalid internal password if NetBox passwords have been changed" Feb 3, 2025
@mmguero mmguero moved this from Todo (investigate) to Todo (develop) in Malcolm Feb 3, 2025
@mmguero mmguero assigned mmguero and unassigned mmguero Feb 3, 2025
@mmguero mmguero moved this from Todo (develop) to In Progress in Malcolm Feb 3, 2025
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Feb 3, 2025
@mmguero mmguero changed the title from "NetBox fails to start due to invalid internal password if NetBox passwords have been changed" to "prompt user before changing NetBox database passwords out from underneath existing database" Feb 3, 2025
@mmguero
Collaborator

mmguero commented Feb 3, 2025

I've added a commit to prompt the user if netbox passwords have already been set:

Kazam_screencast_00000.mp4

I know this doesn't get you up and running if you're already in this state, but it should at least help to prevent users from doing so in the future.

@mmguero mmguero closed this as completed Feb 3, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in Malcolm Feb 3, 2025
@mmguero mmguero added the control.py (Related to control.py script) label Feb 3, 2025
@mmguero mmguero marked this as a duplicate of #501 Feb 3, 2025
@purplealien51
Author

By the way, my NetBox did not start even with the fresh install,
but I have now run ./scripts/stop, then ./scripts/wipe, and then ./scripts/start,
and NetBox is working now.
Thanks for your support @mmguero. Appreciate it

piercema added a commit to piercema/Malcolm that referenced this issue Feb 12, 2025
* Bump development for v25.01.0, also update copyright year

* bump netbox to v4.1.10, osd_transform to v2.18.0, and fluent-bit to v3.2.4

* for cisagov#354, work in progress for Malcolm directly accepting syslog

* for cisagov#354, work in progress for Malcolm directly accepting syslog; (dashboard)

* cisagov#543, add navigation pane to non-network dashboards

* bump jinja to 3.1.5

* Documentation for cisagov#354, syslog

* replace old filebeat input for syslog with tcp/udp input and syslog processor, for cisagov#354

* Documentation for cisagov#354, syslog

* install.py tweak for cisagov#354

* minor fix for cisagov#354, set host.name correctly

* bump netbox to v4.11.1 and elasticsearch-dsl to v8.17.1

* start of cisagov#356, normalize winlogbeats

* WIP of cisagov#356, normalize winlogbeats

* WIP of cisagov#356, normalize winlogbeats

* WIP of cisagov#356, fix for a dashboard

* WIP of cisagov#356, normalize winlogbeats

* Work in progress for cisagov#541, making sure conn.log and known_services.log get the ICS protocols assigned to them correctly and tagged appropriately

* Work in progress for cisagov#541

* standardize ICS protocols in network.protocol field, so they all get tagged with 'ics' properly cisagov#541

* fix cisagov#533, allow keystores to be created on startup even in hedgehog mode

* forgot to add file for cisagov#356

* For cisagov#524, handle filenames with spaces in extracted_files_http_server.py

* work for cisagov#542, preserve custom field formatting for index pattern on update of index pattern

* work for cisagov#542, preserve custom field formatting for index pattern on update of index pattern

* bump yq to v4.45.1

* for cisagov#551, URL pivot links from dashboards to arkime

* for cisagov#551, URL pivot links from dashboards to arkime

* fix pivot from arkime to dashboards and vice-versa when using a traefik or other reverse proxy

* for cisagov#551, URL pivot links from dashboards to netbox

* for cisagov#551, URL pivot links from dashboards to netbox

* for cisagov#551, URL pivot links from netbox to arkime/dashboards

* start of cisagov#553, update zeek to v7.1.0

* cisagov#553, handle conn.log for zeek v7.1.0 and documentation update

* cisagov#553, handle postgresql.log

* cisagov#553, handle postgresql.log

* cisagov#553, added PostgreSQL dashboard

* for cisagov#551, URL pivot links in dashboards (ignore date/times)

* start of omron fins integration, cisagov#554

* wip omron fins integration, cisagov#554

* arkime to v5.6.0

* bump logstash and filebeat to v8.17.0

* Fix nginx filebeat

* WIP omron fins integration, cisagov#554

* WIP omron fins integration, cisagov#554

* WIP omron fins integration, cisagov#554

* WIP omron fins integration, cisagov#554

* WIP omron fins integration, cisagov#554

* dashboards tweaks

* fix links for hh redirect download

* First pass at adding suricata socket optimization

* fix issue with nginx proxy

* Setting debug to false

* Fixing permissions for socket

* html formatting

* documentation for workaround for UFW software firewall for Malcolm ISO should automatically open ports for syslog cisagov#560)

* Bump for v25.02.0 development

* restore _config.yml

* fix version

* I don't think we need a separate pod for the socket-based suricata, that's what the offline one does now anyway, right?

* restore some comments, black python style

* some tweaks for cisagov#457, pulled jjrush's branch into mine for some fixes

* some tweaks for cisagov#457

* allow suricata to spawn threads

* logging tweaks

* more flexible verbosity for suricata

* some tweaks for cisagov#457, try to wait until PCAP is finished processing before moving on

* First pass at adding suricata socket optimization

* Setting debug to false

* Fixing permissions for socket

* for cisagov#457, a few tweaks of the suricata pcap processing mode after reviewing @jjrush's code

* for cisagov#457, monitor suricata.log to know when PCAP is done processing

* for cisagov#457, monitor suricata.log to know when PCAP is done processing

* for cisagov#457, signal suricata rules to reload after update

* for cisagov#457, signal suricata rules to reload after update

* for cisagov#457, fix processing of other log types

* for cisagov#457, fix processing of other log types

* for cisagov#457, signal suricata rules to reload after update

* decrease verbosity for log

* fix logic for autoarkime/forcearkime

* some tweaks for cisagov#457, don't bother keeping track of when suricata is done with a PCAP file. just let filebeat handle it and pick up the resultant eve.json files directly

* Standardizing healthcheck scripts, updating docker-compose, updating kubernetes

* Adding livenessProbe to htadmin

* cisagov#457, handle multiple Suricata PCAP processing threads

* cisagov#574, clear screen after auth_setup when using Dialog mode

* add the related.user field to the 'nginx Access Logs' table

* bump fluent bit to v3.2.5

* fixed import of ECS templates

* handle ARKIME_PORT value formatted like a URL in the init of the API container

* cisagov#565, warn user about overwriting netbox passwords if they've already been set

* fix cisagov#559, ANSI color codes from croc displayed

* Exception in build triggers

* for cisagov#557, try building dirinit with arm runner

* cisagov#557, use arm-hosted runners for github build actions

* restore _config.yml

* a bit of cleanup for Dockerfiles/health check scripts

* minor fixes for health checks

* Tweaks for health checks

* restore _config.yml

* Tweaks for health checks

* build tweaks for health scripts

* bump capa to v9.0.0

* workaround for issue blocking cisagov#475, integration of sigma rules

* improvements to workaround for issue blocking cisagov#475, integration of sigma rules

* improvements to workaround for issue blocking cisagov#475, integration of sigma rules

* for cisagov#475, automatically apply aliases via index templates

* for cisagov#475, starting on mappings for security analytics

* for cisagov#585, include corelight/zeek-long-connections plugin for long connections (WIP)

* for cisagov#585, include corelight/zeek-long-connections plugin for long connections (WIP)

* for cisagov#585, include corelight/zeek-long-connections plugin for long connections (WIP)

* for cisagov#585, include corelight/zeek-long-connections plugin for long connections (WIP)

* demo fix

* for cisagov#585, show long connection count on connections dashboard

* decouple redis from netbox (cisagov#580)

* one more minor change to cisagov#491, moved all container health scripts into one place to make it easier to keep track of them

* decouple redis from netbox (cisagov#580) and reorganized some of the other netbox password stuff

* updated fluent bit

* fix filebeat health

---------

Co-authored-by: Seth Grover <seth.d.grover@gmail.com>
Co-authored-by: Jason Rush <jjrush-github@proton.me>
piercema added a commit to piercema/Malcolm that referenced this issue Feb 13, 2025