Agent leaves hanging child processes #2820

Closed
3 tasks
mssalvatore opened this issue Jan 11, 2023 · 1 comment · Fixed by #2824
Labels: Bug (an error, flaw, misbehavior or failure in the Monkey or Monkey Island), Complexity: High, Impact: High, Plugins

Comments

@mssalvatore
Collaborator

mssalvatore commented Jan 11, 2023

Description

After the agent completes its task, it is unable to exit. The process appears to be kept open by its child processes, and it's unclear why those child processes do not exit. This occurs intermittently but happens most frequently when the agent is run within a Docker container.

The effects of this are:

  1. The agent leaves processes hanging around and consuming some small amount of resources
  2. Other agents cannot start on the machine since the SystemSingleton is never properly released. It seems that one of the child processes may hold a copy of the file descriptor for the Unix socket, but it's not clear how that's possible, since this occurs even if the child processes are spawned before the socket is opened.
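
To illustrate the suspected mechanism, here is a minimal sketch of how a child process can keep a singleton held. It assumes the singleton works by binding a Linux abstract-namespace Unix socket; `try_acquire` and `demo_inherited_lock` are hypothetical names, and this is not the agent's actual SystemSingleton code. It only shows that a forked child holding a copy of the descriptor keeps the name bound after the parent closes its own copy.

```python
import os
import socket
from typing import Optional, Tuple


def try_acquire(name: str) -> Optional[socket.socket]:
    """Bind an abstract-namespace Unix socket (Linux only); None if the name is taken."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind("\0" + name)  # a leading NUL byte selects the abstract namespace
    except OSError:
        sock.close()
        return None
    return sock


def demo_inherited_lock(name: str) -> Tuple[bool, bool]:
    """Return (name still held while the child lives, name free after the child exits)."""
    lock = try_acquire(name)
    if lock is None:
        raise RuntimeError(f"{name!r} is already held")

    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: it inherited a copy of the lock's descriptor. Block until the
        # parent closes the pipe, then exit without running parent cleanup.
        os.close(write_end)
        os.read(read_end, 1)
        os._exit(0)

    os.close(read_end)
    lock.close()  # the parent "releases" the singleton

    # A second agent would try to bind the same name at this point.
    retry = try_acquire(name)
    held_while_child_alive = retry is None
    if retry is not None:
        retry.close()

    os.close(write_end)  # EOF on the pipe lets the child exit
    os.waitpid(pid, 0)

    final = try_acquire(name)
    free_after_child_exit = final is not None
    if final is not None:
        final.close()
    return held_while_child_alive, free_after_child_exit
```

In this sketch the child is forked after the socket is bound. It does not explain the case noted above where the children are started before the socket is opened.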

Tasks

  • Fix it (0.5d) @cakekoa
    • Rebase on develop
    • Run ETE tests
mssalvatore added the Bug, Impact: High, Complexity: High, and Plugins labels Jan 11, 2023
@mssalvatore
Collaborator Author

mssalvatore commented Jan 12, 2023

It turns out the root cause was that the Log4Shell exploiter's LDAP server was not being properly closed. This is resolved by 69c2c88.

EDIT:
This still isn't resolved

2023-01-12 03:02:34,039 [115:ExploiterThread-04:DEBUG] ldap_server.stop.198: Stopping LDAP exploit server
2023-01-12 03:02:34,039 [211:ExploiterThread-04:INFO] _legacy.publishToNewObserver.147: Received SIGTERM, shutting down.
2023-01-12 03:02:34,040 [211:ExploiterThread-04:INFO] _legacy.publishToNewObserver.147: (TCP Port 8008 Closed)
2023-01-12 03:02:34,041 [211:ExploiterThread-04:INFO] _legacy.publishToNewObserver.147: Stopping factory <infection_monkey.exploit.log4shell_utils.ldap_server.LDAPServerFactory object at 0x7fa8b80bf7d0>
2023-01-12 03:02:34,041 [211:ExploiterThread-04:INFO] _legacy.publishToNewObserver.147: Main loop terminated.
2023-01-12 03:02:37,000 [115:AgentEventForwarder:DEBUG] agent_event_forwarder.flush.94: Sending 3 Agent events to the Island: [PropagationEvent(source=UUID('d2fbc46f-72d2-44dc-8341-460f77b75f3b'), target=IPv4Address('10.2.3.46'), timestamp=1673492534.8145223, tags=frozenset({'log4shell-exploiter', 'attack-t1203', 'attack-t1105'}), success=True, exploiter_name='Log4ShellExploiter', error_message=''), ExploitationEvent(source=UUID('d2fbc46f-72d2-44dc-8341-460f77b75f3b'), target=IPv4Address('10.2.3.46'), timestamp=1673492553.9918516, tags=frozenset({'log4shell-exploiter', 'attack-t1203', 'attack-t1110'}), success=True, exploiter_name='Log4ShellExploiter', error_message=''), PropagationEvent(source=UUID('d2fbc46f-72d2-44dc-8341-460f77b75f3b'), target=IPv4Address('10.2.3.46'), timestamp=1673492534.8145223, tags=frozenset({'log4shell-exploiter', 'attack-t1203', 'attack-t1105'}), success=True, exploiter_name='Log4ShellExploiter', error_message='')]
2023-01-12 03:02:37,001 [115:AgentEventForwarder:DEBUG] http_client._send_request.132: POST https://10.2.2.250:5000/api/agent-events, timeout=5
2023-01-12 03:02:37,068 [115:AutomatedMasterThread:DEBUG] http_client._send_request.132: GET https://10.2.2.250:5000/api/agent-signals/d2fbc46f-72d2-44dc-8341-460f77b75f3b, timeout=2.5
2023-01-12 03:02:37,884 [115:TCPConnectionHandler:DEBUG] tcp_connection_handler.run.42: New connection received from: ('10.2.2.12', 39970)
2023-01-12 03:02:37,886 [115:TCPConnectionHandler:DEBUG] relay_user_handler.add_relay_user.64: Added relay user RelayUser(address='10.2.2.12', time_remaining=0.000)
2023-01-12 03:02:37,899 [115:SocketsPipeThread-2:DEBUG] tcp_pipe_spawner._handle_pipe_closed.58: Closing pipe <SocketsPipe(SocketsPipeThread-2, started daemon 140362366555904)>
2023-01-12 03:02:42,078 [115:AutomatedMasterThread:DEBUG] http_client._send_request.132: GET https://10.2.2.250:5000/api/agent-signals/d2fbc46f-72d2-44dc-8341-460f77b75f3b, timeout=2.5
2023-01-12 03:02:42,924 [115:TCPConnectionHandler:DEBUG] tcp_connection_handler.run.42: New connection received from: ('10.2.2.12', 39976)
2023-01-12 03:02:42,925 [115:TCPConnectionHandler:DEBUG] relay_user_handler.disconnect_user.88: Disconnected user 10.2.2.12
2023-01-12 03:02:46,893 [115:AgentHeart:DEBUG] http_client._send_request.132: POST https://10.2.2.250:5000/api/agent/d2fbc46f-72d2-44dc-8341-460f77b75f3b/heartbeat, timeout=5
2023-01-12 03:02:47,089 [115:AutomatedMasterThread:DEBUG] http_client._send_request.132: GET https://10.2.2.250:5000/api/agent-signals/d2fbc46f-72d2-44dc-8341-460f77b75f3b, timeout=2.5
2023-01-12 03:02:49,053 [115:ExploiterThread-04:WARNING] ldap_server.stop.206: Timed out while waiting for the LDAP exploit server to stop

mssalvatore added a commit that referenced this issue Jan 12, 2023
Just because the Twisted reactor failed to start doesn't mean that the
server process is not running. The server process should be stopped
before raising the LDAPServerStartError, otherwise the server or process
may be left running indefinitely.

Fixes #2820
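
The idea in this commit message can be sketched roughly as follows. The sketch assumes the server runs in a `multiprocessing.Process` and reports readiness through an event that the child sets; `ServerStartError`, `stop_process`, and `start_server` are hypothetical stand-ins (the real exception is LDAPServerStartError, and the real server code is not shown in this issue).

```python
from multiprocessing.process import BaseProcess


class ServerStartError(Exception):
    """Hypothetical stand-in for the issue's LDAPServerStartError."""


def stop_process(process: BaseProcess, timeout: float = 5.0) -> None:
    """Send SIGTERM to the process if it is running, then wait for it to exit."""
    if process.is_alive():
        process.terminate()
    process.join(timeout)


def start_server(process: BaseProcess, ready, timeout: float) -> None:
    """Start the server process and wait for its readiness signal.

    `ready` is any object with a `wait(timeout) -> bool` method, for example
    a multiprocessing Event that the child sets once it is serving.
    """
    process.start()
    if not ready.wait(timeout):
        # No readiness signal arrived, but the process itself may still be
        # running. Stop it first so the failure does not leave it behind.
        stop_process(process)
        raise ServerStartError("server did not become ready in time")
```

The point of the ordering is that the cleanup happens before the exception propagates, so a caller that only handles the error never has to know a process was started.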
mssalvatore added a commit that referenced this issue Jan 12, 2023
Using Twisted for the Log4Shell exploiter has been nothing but trouble
since the beginning. When we refactor this exploiter we should use
another solution. In the meanwhile, we must be doing something wrong WRT
stopping Twisted. The heavy-handed approach is to SIGKILL the process.
This isn't ideal, but will be changed when we refactor this component.

Issue #2820
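
A minimal sketch of the heavy-handed stop, assuming the server runs in a `multiprocessing.Process`. `stop_or_kill` is a hypothetical helper, and trying SIGTERM first with a grace period is an assumption of this sketch rather than something the commit message states.

```python
from multiprocessing.process import BaseProcess


def stop_or_kill(process: BaseProcess, grace: float = 5.0) -> bool:
    """Stop a child process; return True if SIGKILL was needed."""
    process.terminate()  # SIGTERM: gives the child a chance to shut down
    process.join(grace)
    if not process.is_alive():
        return False

    process.kill()  # SIGKILL: cannot be caught or ignored by the child
    process.join()
    return True
```

Because SIGKILL gives the child no chance to run cleanup code, anything it was responsible for releasing has to be handled by the parent or by the operating system.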
mssalvatore added a commit that referenced this issue Jan 13, 2023
Just because the Twisted reactor failed to start doesn't mean that the
server process is not running. The server process should be stopped
before raising the LDAPServerStartError, otherwise the server or process
may be left running indefinitely.

Fixes #2820
mssalvatore added a commit that referenced this issue Jan 13, 2023
Using Twisted for the Log4Shell exploiter has been nothing but trouble
since the beginning. When we refactor this exploiter we should use
another solution. In the meanwhile, we must be doing something wrong WRT
stopping Twisted. The heavy-handed approach is to SIGKILL the process.
This isn't ideal, but will be changed when we refactor this component.

Issue #2820
cakekoa pushed a commit that referenced this issue Jan 13, 2023
Using Twisted for the Log4Shell exploiter has been nothing but trouble
since the beginning. When we refactor this exploiter we should use
another solution. In the meanwhile, we must be doing something wrong WRT
stopping Twisted. The heavy-handed approach is to SIGKILL the process.
This isn't ideal, but will be changed when we refactor this component.

Issue #2820
cakekoa added a commit that referenced this issue Jan 13, 2023
Forked processes will inherit all resources from the parent process.
This includes the socket we use for ensuring only a single agent is
running at any given time. Additionally, threads will also be inherited
by the forked process, which could cause problems.

Using a spawn context should fix our singleton issue, and give the
process a cleaner environment in which to run.

Issue #2820
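
The difference between the two start methods can be sketched as below. This assumes the single-agent socket is an abstract-namespace Unix socket on Linux; `try_acquire` and `released_while_child_runs` are hypothetical names, and `time.sleep` stands in for a long-running child. It is an illustration of the inheritance behavior, not the agent's code.

```python
import multiprocessing
import socket
import time
from typing import Optional


def try_acquire(name: str) -> Optional[socket.socket]:
    """Bind an abstract-namespace Unix socket (Linux only); None if the name is taken."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind("\0" + name)  # a leading NUL byte selects the abstract namespace
    except OSError:
        sock.close()
        return None
    return sock


def released_while_child_runs(start_method: str, name: str) -> bool:
    """Return True if the name can be re-acquired while a child is still running."""
    ctx = multiprocessing.get_context(start_method)
    lock = try_acquire(name)
    if lock is None:
        raise RuntimeError(f"{name!r} is already held")

    # The child is started while the parent holds the socket.
    child = ctx.Process(target=time.sleep, args=(30,), daemon=True)
    child.start()
    try:
        lock.close()  # the parent releases the singleton
        again = try_acquire(name)
        if again is not None:
            again.close()
        return again is not None
    finally:
        child.kill()
        child.join()
```

With `"fork"` the child holds a copy of the socket's descriptor, so the name stays bound until the child exits. With `"spawn"` the child is a fresh interpreter that does not receive the descriptor. Note that spawn re-imports the main module in the child, so calls that start processes belong under an `if __name__ == "__main__":` guard.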
2 participants