[Ingest-Management]: "Enable elastic security agent" page instead of host appears under "Administrator>Host" tab, when user first forcefully un-enroll the agent and then re-enrolled the agent from Fleet tab. #73272
Comments
Please review the defect @rahulgupta-qasource
Pinging @elastic/ingest-management (Team:Ingest Management)
Reviewed and assigned to @EricDavisX
I'm sorry this sat idle for so many days - can you re-test on BC 6 (not BC 7) please? Specific fixes for unenrolling went into BC5 and BC6 that I hope help with this. If it still reproduces, please provide the browser dev console output so we can see what calls are made and whether any had errors or strange responses.
@rahulgupta-qasource can you take the re-test on this if you have time?
Please re-assign this to me when action is back on my side. I'm also removing the impact:high label; this feels more moderate to me if it still amounts to Endpoint being installed. Also, reviewing more... I think maybe @kevinlog should review the screenshots, as this might be on the Security App workflow side. Can you poke in please?
@rahulgupta-qasource this will most likely happen when the Endpoint hasn't sent any documents to ES. Can you verify that the Endpoint is successfully stood up and communicating with ES in this scenario? I'll run through the scenario myself as well to see. FYI @EricDavisX
@EricDavisX @rahulgupta-qasource here's my test: BC8 Stack and Agent/Endpoint
Endpoint gone (shows onboarding screen - potentially confusing, but expected for now):
I think this is because the Endpoint is still running (I assume because the forced unenroll didn't fully send all the correct messages). What has your experience been when re-enrolling after a forced unenroll? Have you seen this case?
I think it may genuinely just take a full minute or more for Endpoint to finish uninstalling and deleting files, so your results may be expected. If you have any time estimates for the gap between the unenroll and the re-enroll attempt, that could help. But this is good evidence, I think, that it's working.
@EricDavisX thanks for the insight. I went ahead and manually uninstalled the Endpoint, re-enrolled the Agent and everything is back up and running again.
@EricDavisX @rahulgupta-qasource after running through unenroll + force unenroll again, I'm seeing that the Endpoint is still not stopped after 15 or so minutes. I'm not sure what the expected behavior is here. Note that I just ran the Agent from the cmd; I didn't install the service on Windows. I'm not sure if that makes a difference for force unenroll. FYI @ph @ruflin @blakerouse
@kevinlog Do you by chance have any log files from Agent / Endpoint so we can see what is happening there?
@ruflin here are the Endpoint logs. Here are the Agent logs (I zipped the entire folder). I'm not seeing anything in the Endpoint logs about receiving a "stop", etc., although I'm not quite sure what that would look like. FYI @ferullo
@EricDavisX @ruflin sorry to spam you - but as I was collecting the logs above, the Endpoint did finally stop running, but it took about 30 min after I force unenrolled the Agent. So it seems like it is working, it just takes significantly longer than when you unenroll normally.
Endpoint stopping after 30 mins seems like an Endpoint-side feature: it hadn't heard from Agent in 30 mins, so it shut itself down. With the logs, we can hopefully track what Agent did and didn't send prior to that, so we might learn where there may be an Agent/Endpoint integration bug.
Endpoint does not have this feature. @gogochan can you help with any Endpoint coordination needed for this?
@michalpristas @blakerouse Would be great to get your eyes on this when you are back (both are out at the moment).
Seems like Endpoint is not able to index documents into Elasticsearch, as @kevinlog described. I see 401 in the Endpoint log
When a user clicks unenroll and then does a force unenroll, the Agent remains running along with Elastic Endpoint and Beats. If the user comes back to the machine and re-enrolls the Agent, I suppose this process terminates the Agent from the previous enrollment, but it leaves Elastic Endpoint untouched. I think this is where we have a potential problem. The token Endpoint received from the previous Agent is no longer valid; it needs to be reloaded.
Further investigation shows that Elastic Agent didn't send the new API token to Elastic Endpoint even after re-enroll, leaving Elastic Endpoint with the old, invalid API token. A workaround is to trigger a revision number change by modifying the configuration from Fleet.
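The failure mode described above can be modelled in a few lines. This is an illustration only, not the actual Elastic Agent code: the `Agent` and `Endpoint` classes, the token strings, and in particular the idea that the push is skipped when the revision number is unchanged are all assumptions, inferred from the fact that changing the revision works around the problem.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Endpoint:
    """Hypothetical stand-in for Elastic Endpoint: keeps the last config it was given."""
    api_token: Optional[str] = None
    revision: Optional[int] = None

    def apply(self, api_token: str, revision: int) -> None:
        self.api_token = api_token
        self.revision = revision


@dataclass
class Agent:
    """Hypothetical stand-in for an enrolled Elastic Agent."""
    api_token: str
    revision: int

    def push_config(self, endpoint: Endpoint) -> bool:
        # ASSUMPTION: the push is gated on the revision number only,
        # so a changed API token alone does not trigger a resend.
        if endpoint.revision == self.revision:
            return False
        endpoint.apply(self.api_token, self.revision)
        return True


def simulate() -> dict:
    endpoint = Endpoint()

    # First enrollment: Endpoint receives the original token.
    Agent(api_token="token-old", revision=1).push_config(endpoint)

    # Force unenroll invalidates "token-old", but Endpoint keeps running.
    # Re-enroll yields a new token while the config revision is unchanged.
    reenrolled = Agent(api_token="token-new", revision=1)
    pushed = reenrolled.push_config(endpoint)
    token_after_reenroll = endpoint.api_token

    # Workaround from the thread: edit the configuration in Fleet,
    # which changes the revision number and forces a resend.
    reenrolled.revision += 1
    reenrolled.push_config(endpoint)

    return {
        "pushed_after_reenroll": pushed,
        "token_after_reenroll": token_after_reenroll,
        "token_after_revision_bump": endpoint.api_token,
    }
```

In this model the Endpoint still holds the old token right after re-enroll, which would be consistent with the 401 responses seen in the Endpoint log, and it only picks up the new token once the revision changes.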
I'm so pleased the team persisted and we found the bug to fix! Excellent work folks. @kamalpreetpahwa-qasource @rahulgupta-qasource I think we should add some new content to the regression suite, as there is actually a much larger matrix of state changes to cover than I realized. I'd like to review with @gogochan @kevinlog and @blakerouse to see what we have automated and what we need to cover better manually until we have more automation around this. The 'timing' of when the user unenrolls and then possibly 'too quickly' clicks the force-unenroll is challenging, as we don't have much insight into it. To start, I'd like to get some help drawing out a nicer state diagram to track what test cases there are. Something like the below (but better): test content that covers a few scenarios, all starting from a known working, happy Endpoint/Agent state, as:
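As a rough sketch of how such a state-change matrix could be enumerated for review (not the scenario list from this comment), the snippet below generates every sequence of up to two user actions per platform, each assumed to start from a healthy enrolled Agent/Endpoint. The action and platform labels are hypothetical names chosen for illustration, and no pruning of impossible sequences is attempted.

```python
from itertools import product
from typing import Iterator, Tuple

# Hypothetical labels for illustration; not taken from any test plan.
PLATFORMS = ["Windows", "Linux", "macOS"]
ACTIONS = ["unenroll", "force_unenroll", "re_enroll", "restart"]


def scenarios(max_steps: int = 2) -> Iterator[Tuple[str, Tuple[str, ...]]]:
    """Yield (platform, action sequence) pairs.

    Every sequence is assumed to begin from a known-good enrolled state.
    Sequences that make no sense in practice are deliberately not filtered,
    so a reviewer can decide which ones to drop.
    """
    for platform in PLATFORMS:
        for steps in range(1, max_steps + 1):
            for sequence in product(ACTIONS, repeat=steps):
                yield platform, sequence


if __name__ == "__main__":
    for platform, sequence in scenarios():
        print(f"{platform}: {' -> '.join(sequence)}")
```

A list like this does not capture the timing question raised above (how quickly force-unenroll follows unenroll); that would need an extra dimension, such as a delay bucket per step.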
Hi @EricDavisX Thank you for sharing the feedback. We have validated this ticket and the above-mentioned scenarios on Windows 10, Linux 'CentOS 7' VM and Mac Mojave 10.14.1 on the Kibana BC9 cloud environment and found it fixed. We executed the steps below to validate the ticket:
Observation: Moreover, we have created 21 test cases for the above-mentioned scenarios (07 each for Windows, Linux and Mac) and passed them under the Agent status on Unenroll, Force unenroll, Re-enroll and restarting TestRun. Hence, we are closing this bug.
Bug Conversion: 21 test cases (07 each for Windows, Linux and macOS) already exist for this ticket under the following sections:
Kibana version:
Kibana: 7.9 BC4
Elasticsearch version:
Elasticsearch: 7.9 BC4
Agent version:
Agent: 7.9 BC4
Browser version:
Windows 10, Chrome
Original install method (e.g. download page, yum, from source, etc.):
From 7.9 BC4
Description
[Ingest-Management]: "Enable elastic security agent" page instead of host appears under "Administrator>Host" tab, when user first forcefully un-enroll the agent and then re-enrolled the agent from Fleet tab.
Preconditions
Steps to Reproduce
Test data
N/A
Impacted Test case id
N/A
Actual Result
"Enable elastic security agent" page instead of host appears under "Administrator>Host" tab, when user first forcefully un-enroll the agent and then re-enrolled the agent from Fleet tab.
Expected Result
Host with Online status should appear under "Administrator>Host" tab, when user first forcefully un-enroll the agent and then re-enrolled the agent from Fleet tab.
What's working
N/A
What's not working
N/A
Screenshot
Logs
N/A