win_domain fails using vmware_tools connection plugin #57607

Closed
agowa opened this issue Jun 10, 2019 · 16 comments · Fixed by #57661
Labels
affects_2.9 This issue/PR affects Ansible v2.9 bug This issue/PR relates to a bug. cloud has_pr This issue has an associated PR. module This issue/PR relates to a module. python3 support:community This issue/PR relates to code supported by the Ansible community. support:core This issue/PR relates to code supported by the Ansible Engineering Team. traceback This issue/PR includes a traceback. vmware VMware community windows Windows community

Comments

@agowa
Contributor

agowa commented Jun 10, 2019

SUMMARY

I tried to use this module against a Windows Server 2019 Core machine over the vmware_tools connection plugin. It fails with an exception, and no meaningful error message is shown.

Edit: The win_domain module invokes Install-ADDSForest, which stops the Netlogon service. The module starts it again, but that is too late: the vmware_tools connection plugin has already failed and terminated script execution prematurely. Execution of the win_domain module stops right after invoking Install-ADDSForest.

ISSUE TYPE
  • Bug Report
COMPONENT NAME

win_domain
vmware_tools

ANSIBLE VERSION
ansible-playbook 2.9.0.dev0
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/user/PycharmProjects/ansible-2/lib/ansible
  executable location = /home/user/PycharmProjects/ansible-2/bin/ansible-playbook
  python version = 3.7.3 (default, Mar 26 2019, 21:43:19) [GCC 8.2.1 20181127]
CONFIGURATION

OS / ENVIRONMENT

ArchLinux

STEPS TO REPRODUCE

Run this module against a Windows Server 2019 Core server using the vmware_tools connection plugin.

EXPECTED RESULTS

Deploys a domain controller

ACTUAL RESULTS
TASK [activedirectory : Create Domain and Forest] *****************************************
task path: /home/user/****/roles/internal/activedirectory/tasks/install_server.yml:23
Using module file /home/user/PycharmProjects/ansible-2/lib/ansible/modules/windows/win_domain.ps1
The full traceback is:
Traceback (most recent call last):
  File "/home/user/PycharmProjects/ansible-2/lib/ansible/executor/task_executor.py", line 145, in run
    res = self._execute()
  File "/home/user/PycharmProjects/ansible-2/lib/ansible/executor/task_executor.py", line 635, in _execute
    result = self._handler.run(task_vars=variables)
  File "/home/user/PycharmProjects/ansible-2/lib/ansible/plugins/action/normal.py", line 46, in run
    result = merge_hash(result, self._execute_module(task_vars=task_vars, wrap_async=wrap_async))
  File "/home/user/PycharmProjects/ansible-2/lib/ansible/plugins/action/__init__.py", line 917, in _execute_module
    res = self._low_level_execute_command(cmd, sudoable=sudoable, in_data=in_data)
  File "/home/user/PycharmProjects/ansible-2/lib/ansible/plugins/action/__init__.py", line 1060, in _low_level_execute_command
    rc, stdout, stderr = self._connection.exec_command(cmd, in_data=in_data, sudoable=sudoable)
  File "/home/user/PycharmProjects/ansible-2/lib/ansible/plugins/connection/vmware_tools.py", line 453, in exec_command
    pid_info = self._get_pid_info(pid)
  File "/home/user/PycharmProjects/ansible-2/lib/ansible/plugins/connection/vmware_tools.py", line 395, in _get_pid_info
    processes = self.processManager.ListProcessesInGuest(vm=self.vm, auth=self.vm_auth, pids=[pid])
  File "/usr/lib/python3.7/site-packages/pyVmomi/VmomiSupport.py", line 580, in <lambda>
    self.f(*(self.args + (obj,) + args), **kwargs)
  File "/usr/lib/python3.7/site-packages/pyVmomi/VmomiSupport.py", line 386, in _InvokeMethod
    return self._stub.InvokeMethod(self, info, args)
  File "/usr/lib/python3.7/site-packages/pyVmomi/SoapAdapter.py", line 1374, in InvokeMethod
    raise obj # pylint: disable-msg=E0702
pyVmomi.VmomiSupport.vmodl.fault.SystemError: (vmodl.fault.SystemError) {
   dynamicType = <unset>,
   dynamicProperty = (vmodl.DynamicProperty) [],
   msg = 'A general system error occurred: vix error codes = (1, 0).\n',
   faultCause = <unset>,
   faultMessage = (vmodl.LocalizableMessage) [],
   reason = 'vix error codes = (1, 0).\n'
}

fatal: [ADS01]: FAILED! => {
    "msg": "Unexpected failure during module execution.",
    "stdout": ""
}
@ansibot
Contributor

ansibot commented Jun 10, 2019

Files identified in the description:

If these files are inaccurate, please update the component name section of the description or use the !component bot command.

@ansibot
Contributor

ansibot commented Jun 10, 2019

@ansibot ansibot added affects_2.9 This issue/PR affects Ansible v2.9 bug This issue/PR relates to a bug. module This issue/PR relates to a module. needs_triage Needs a first human triage before being processed. python3 support:core This issue/PR relates to code supported by the Ansible Engineering Team. traceback This issue/PR includes a traceback. windows Windows community labels Jun 10, 2019
@ShachafGoldstein
Contributor

ShachafGoldstein commented Jun 10, 2019

Do other modules work correctly?

Did you verify all requirements at https://docs.ansible.com/ansible/latest/plugins/connection/vmware_tools.html?

@jborean93
Contributor

We don't support PSCore right now. The main problem is that we don't test it in CI and can never guarantee that it will continue to work. I won't close this because I'm unsure whether vmware_tools needs to be changed to report the proper error, but the Windows module will require PS Desktop to run.

@ansibot ansibot removed the needs_triage Needs a first human triage before being processed. label Jun 10, 2019
@agowa
Contributor Author

agowa commented Jun 10, 2019

@jborean93: It is not about PSCore, it is about Server Core, i.e. Windows Server without the Desktop Experience. It has the normal PowerShell installed.
[image attachment]

@ShachafGoldstein: Other modules work correctly. It might just be incorrect error propagation, as it works after patching the win_domain_controller module to no longer try to install GUI components.

@jborean93
Contributor

Ah sorry, it's late at night and I misread what you said.

@agowa
Contributor Author

agowa commented Jun 10, 2019

OK, this issue is not caused by trying to install the administration center; apparently that one can be installed on Core servers. It only looked fixed because I had a win_feature step before invoking this module that had already installed the features.
This error occurs every time this module tries to install anything.

So some investigation is still required...

@ShachafGoldstein
Contributor

Can you post the output of a verbose run (-vvv) here?

@agowa agowa changed the title win_domain_controller fails on windows server core win_domain fails on windows server core Jun 10, 2019
@agowa
Contributor Author

agowa commented Jun 10, 2019

!component win_domain
Apparently I was looking at the wrong file (I confused win_domain with win_domain_controller and should have looked more closely). No wonder nothing seemed to have any effect. The failure is just intermittent: sometimes it works, sometimes it does not.

@ShachafGoldstein: The output above is already from -vvv, or do you mean -vvvv?

@agowa
Contributor Author

agowa commented Jun 10, 2019

I think I finally found the issue. It is a race condition in conjunction with the vmware_tools connection plugin.
The win_domain module invokes Install-ADDSForest, which stops the Netlogon service. The module starts it again, but that is too late: the vmware_tools connection plugin has already failed and terminated script execution prematurely. Execution of the win_domain module stops right after invoking Install-ADDSForest.

The same error occurs when replacing the win_domain modules content with:

#!powershell

#Requires -Module Ansible.ModuleUtils.Legacy
Set-StrictMode -Version 2
$ErrorActionPreference = "Stop"

# Mimic what Install-ADDSForest does: stop Netlogon, wait, then start it again.
Stop-Service "Netlogon"
Start-Sleep -Seconds 20
Start-Service "Netlogon"
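
The failure mode described above can be mimicked without any VMware infrastructure. What follows is a minimal Python sketch, not the plugin's actual code: `FakeGuest`, `GuestOpsError` and `wait_for_exit` are hypothetical stand-ins, and the sketch assumes, based on the traceback, that the plugin polls the guest for the process state and that a fault raised during polling propagates unhandled.

```python
class GuestOpsError(Exception):
    """Hypothetical stand-in for pyVmomi's vmodl.fault.SystemError."""


class FakeGuest:
    """Simulated guest whose process listing fails while Netlogon is stopped."""

    def __init__(self, script):
        # script: one Netlogon state (True = running) per poll; once it is
        # exhausted, the module process counts as finished.
        self._states = iter(script)

    def list_process(self, pid):
        try:
            netlogon_running = next(self._states)
        except StopIteration:
            return {"pid": pid, "exit_code": 0}  # module finished
        if not netlogon_running:
            raise GuestOpsError("vix error codes = (1, 0).")
        return {"pid": pid, "exit_code": None}  # still running


def wait_for_exit(guest, pid):
    """Poll until the process exits; no error handling, so any fault aborts."""
    polls = 0
    while True:
        polls += 1
        info = guest.list_process(pid)
        if info["exit_code"] is not None:
            return info["exit_code"], polls


# Netlogon stays up for the whole run: the poll loop sees the exit code.
print(wait_for_exit(FakeGuest([True, True, True]), pid=1234))

# Netlogon is stopped during the second poll: the loop dies with the fault,
# even though the service comes back and the module would have finished.
try:
    wait_for_exit(FakeGuest([True, False, True]), pid=1234)
except GuestOpsError as exc:
    print("poll aborted:", exc)
```

In this simulation the second run never learns the exit code, which matches the symptom of an empty stdout and an "Unexpected failure during module execution" message.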

@agowa agowa changed the title win_domain fails on windows server core win_domain fails using vmware_tools connection plugin Jun 10, 2019
@ansibot
Contributor

ansibot commented Jun 10, 2019

@ansibot ansibot added cloud support:community This issue/PR relates to code supported by the Ansible community. vmware VMware community labels Jun 10, 2019
@jborean93 jborean93 reopened this Jun 10, 2019
@jborean93
Contributor

Sorry, I accidentally closed the issue. I’m not sure there is anything we can do here. The AD cmdlet stops Netlogon itself, and we force-start it, even though a reboot should be done straight away, so that the next winrm/psrp task can still connect and actually do the reboot. Potentially you may need to run with async and ultimately ignore the error when the connection is dropped, but that depends on whether the service actually gets started or the script is just killed before it reaches that step.

@agowa
Contributor Author

agowa commented Jun 10, 2019

Can we add some error handling into the connection plugin?
vmware_tools.py, line 398
Something that catches the vim.vmodl.fault.SystemError exception, so that one is able to do an "if failed" check or something like that within the play. I don't know any way to handle the python error directly from within the play, so can we convert it somehow into something that a user can handle within a play if it is expected?

In its simplest form, it would just allow one to use retries: 2 or ignore_errors: true. Apparently this allows the module to complete just fine.
Or maybe something clearer, using a new option such as ignore_python_exception, but that may be a huge change...

The exception:

(vmodl.fault.SystemError) {
   dynamicType = <unset>,
   dynamicProperty = (vmodl.DynamicProperty) [],
   msg = 'A general system error occurred: vix error codes = (1, 0).\n',
   faultCause = <unset>,
   faultMessage = (vmodl.LocalizableMessage) [],
   reason = 'vix error codes = (1, 0).\n'
}

Others are struggling with this error too (but using the PowerShell API): https://www.van-gelderen.eu/invoke-vmscript-a-general-system-error-occurred-vix-error-codes-1-0/
They also just wrap the call in try/catch and ignore the error. That is something we should not do directly within the connector, but it would be desirable within the play/role...

@jborean93
Contributor

> Can we add some error handling into the connection plugin?

Yep, would just need to raise a PR and the maintainer would have to verify that it is the correct thing to do.

> I don't know any way to handle the python error directly from within the play, so can we convert it somehow into something that a user can handle within a play if it is expected?

It probably would have to raise AnsibleConnectionFailure, but then you can use ignore_unreachable to handle this type of failure.
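
A rough sketch of that suggestion in Python follows. To stay self-contained it uses hypothetical stand-ins: `GuestSystemError` plays the role of `vmodl.fault.SystemError`, the local `AnsibleConnectionFailure` class plays the role of the exception of the same name in `ansible.errors`, and `get_pid_info`, `list_processes` and `netlogon_down` are illustrative names rather than the plugin's actual code. It makes no claim about what the eventual fix looks like.

```python
class GuestSystemError(Exception):
    """Hypothetical stand-in for vmodl.fault.SystemError."""


class AnsibleConnectionFailure(Exception):
    """Stand-in for ansible.errors.AnsibleConnectionFailure."""


def get_pid_info(list_processes, pid):
    """Look up one guest process, turning a guest fault into a connection failure.

    list_processes is any callable taking a list of pids; in the real plugin
    this would wrap the ListProcessesInGuest call seen in the traceback.
    """
    try:
        processes = list_processes([pid])
    except GuestSystemError as exc:
        # Re-raise as a connection-level error so the task can be reported as
        # unreachable instead of as an unexpected module failure.
        raise AnsibleConnectionFailure(
            "Guest operations failed while polling pid %s: %s" % (pid, exc)
        ) from exc
    return processes[0]


def netlogon_down(pids):
    # Simulates the guest while Netlogon is stopped.
    raise GuestSystemError("vix error codes = (1, 0).")


try:
    get_pid_info(netlogon_down, 4321)
except AnsibleConnectionFailure as exc:
    print("unreachable:", exc)
```

With the real exception class, a task that hits this path would be expected to surface as unreachable, which is the case `ignore_unreachable` is meant to cover.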

@ansibot ansibot added the has_pr This issue has an associated PR. label Jul 28, 2019
@ansible ansible locked and limited conversation to collaborators Dec 17, 2019
4 participants