providers: add a new "azurestack" platform (client logic for AzureStackHub) #441

Closed
darkmuggle opened this issue Jun 25, 2020 · 10 comments


@darkmuggle
Contributor

Afterburn fails completely due to 500 errors on the metadata source. With coreos/ignition@0c0ec63 I was able to boot on AzureStack.

A completely different issue:

s)...[   32.768570] NetworkManager[568]: <info>  [1593115470.6525] dhcp4 (eth0): option private_245          => 'a8:3f:81:10'

And then:

[   64.908820] afterburn[658]: Jun 25 19:57:52.985 WARN Failed to get fabric address from DHCP: maximum number of retries (60) reached
[   64.988395] afterburn[658]: Jun 25 19:57:52.986 INFO Using fallback address
[   65.033307] afterburn[658]: Jun 25 19:57:52.986 INFO Fetching http://168.63.129.16/?comp=versions: Attempt #1
^M[     *] A start job is running for Afterburn Hostname (52s / no limit)
[   65.566088] afterburn[658]: Jun 25 19:57:53.643 INFO Fetch successful
[   65.621959] afterburn[658]: Jun 25 19:57:53.643 INFO Fetching http://168.63.129.16/machine/?comp=goalstate: Attempt #1
[   65.698749] afterburn[658]: Jun 25 19:57:53.651 INFO Fetch successful
[   65.747770] afterburn[658]: Jun 25 19:57:53.659 INFO Fetching http://169.254.169.254/metadata/instance/compute/name?api-version=2017-08-01&format=text: Attempt #1
[   65.942651] afterburn[658]: Jun 25 19:57:53.674 INFO Failed to fetch: 500 Internal Server Error

And ending with:

Displaying logs from failed units: afterburn-hostname.service
-- Logs begin at Thu 2020-06-25 20:04:16 UTC, end at Thu 2020-06-25 20:05:59 UTC. --
Jun 25 20:05:51 afterburn[655]: Jun 25 20:05:51.338 INFO Failed to fetch: 500 Internal Server Error
Jun 25 20:05:51 afterburn[655]: Error: failed to run
Jun 25 20:05:51 afterburn[655]: Caused by: writing hostname
Jun 25 20:05:51 afterburn[655]: Caused by: failed to get hostname
Jun 25 20:05:51 afterburn[655]: Caused by: maximum number of retries (10) reached
Jun 25 20:05:51 afterburn[655]: Caused by: failed to fetch: 500 Internal Server Error
Jun 25 20:05:51 systemd[1]: afterburn-hostname.service: Main process exited, code=exited, status=1/FAILURE
Jun 25 20:05:51 systemd[1]: afterburn-hostname.service: Failed with result 'exit-code'.
Jun 25 20:05:51 systemd[1]: Failed to start Afterburn Hostname.
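
The logs above show two retry loops with different outcomes: the DHCP lookup gives up after 60 attempts and continues with a fallback address, while the hostname fetch gives up after 10 attempts and fails the unit. A minimal sketch of that pattern follows. It is not Afterburn's actual code: the names `retry`, `fabric_address`, and `RetriesExhausted` are hypothetical, and delays between attempts are left out for brevity.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RetriesExhausted(Exception):
    """Raised when every attempt of an operation has failed (hypothetical name)."""


def retry(operation: Callable[[], T], max_attempts: int) -> T:
    """Call `operation` up to `max_attempts` times and return its first success."""
    last_error: Optional[Exception] = None
    for _attempt in range(max_attempts):
        try:
            return operation()
        except Exception as err:  # a real client would catch narrower errors
            last_error = err
    raise RetriesExhausted(
        f"maximum number of retries ({max_attempts}) reached"
    ) from last_error


def fabric_address(
    lookup_from_dhcp: Callable[[], str],
    max_attempts: int = 60,
    fallback: str = "168.63.129.16",
) -> str:
    """Prefer the address from DHCP; use the fallback once retries run out."""
    try:
        return retry(lookup_from_dhcp, max_attempts)
    except RetriesExhausted:
        return fallback
```

In this sketch the fabric address lookup survives exhaustion because it has a fallback, whereas a caller of `retry` with no fallback (such as a hostname fetch) would see the exception and exit with an error.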
darkmuggle changed the title: AzureStack gives 500 error on the meta-data server → Support on AzureStack (Jun 25, 2020)
@darkmuggle
Contributor Author

darkmuggle commented Jun 25, 2020

Since this is AzureStack and it's not a supported variant (AFAIK), I'm calling this a feature request and NOT a bug. Unless other victims are eager to work on this, I'd like to volunteer myself.

Work on this is tentatively scheduled for the 4.7 OCP cycle.

@darkmuggle
Contributor Author

/cc @cfBrianMiller

darkmuggle self-assigned this (Jun 25, 2020)
@lucab
Contributor

lucab commented Jun 26, 2020

option private_245 => 'a8:3f:81:10'

This is in fact 168.63.129.16. So, bad that we don't have #146 but good that the fallback worked there too.
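
To make the conversion explicit: the DHCP option value is four hex-encoded octets separated by colons, and reading each octet as a decimal number gives the dotted-quad address. A minimal sketch, where `decode_option_245` is a hypothetical helper name and not part of Afterburn:

```python
import ipaddress


def decode_option_245(value: str) -> ipaddress.IPv4Address:
    """Decode a colon-separated hex string such as 'a8:3f:81:10' into an IPv4 address."""
    octets = bytes(int(part, 16) for part in value.split(":"))
    if len(octets) != 4:
        raise ValueError(f"expected 4 octets, got {len(octets)}")
    return ipaddress.IPv4Address(octets)


print(decode_option_245("a8:3f:81:10"))  # prints 168.63.129.16
```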

http://169.254.169.254/metadata/instance/compute/name?api-version=2017-08-01&format=text

According to coreos/fedora-coreos-tracker#476 (comment) the problem is with the API version. Which is weird because the (Azure) platform docs at https://docs.microsoft.com/en-us/azure/virtual-machines/windows/instance-metadata-service#versioning explicitly mention the version we are using. See https://docs.microsoft.com/en-us/azure-stack/user/azure-stack-vm-considerations?view=azs-2002#api-versions on API versions for AzureStack.
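
For reference, the request in question can be split into its parts to see exactly which API version and format are being asked for. This is only an illustrative sketch: `describe_request` is a hypothetical helper, and the URL is the one from the log above.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied from the Afterburn log in this issue.
IMDS_URL = (
    "http://169.254.169.254/metadata/instance/compute/name"
    "?api-version=2017-08-01&format=text"
)


def describe_request(url: str) -> dict:
    """Return the host, path, API version and format of a metadata request URL."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "host": parts.hostname,
        "path": parts.path,
        "api_version": query["api-version"][0],
        "format": query["format"][0],
    }


print(describe_request(IMDS_URL))
```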

Going a bit further, the hostname is the simplest logic on Azure, so it's concerning that even this one fails on AzureStack. How do the SSH key logic and the boot check-in logic behave on such a platform?

@darkmuggle
Contributor Author

Allegedly -- and I've asked for documentation -- AzureStack does not support the instance meta-data service.

@pekramp

pekramp commented Jun 26, 2020

Allegedly -- and I've asked for documentation -- AzureStack does not support the instance meta-data service.

Got the documentation for you: https://docs.microsoft.com/en-us/azure-stack/user/azure-stack-vm-considerations?view=azs-2002

Azure Instance Metadata Service | The Azure Instance Metadata Service provides info about running VM instances that can be used to manage and set up your VM. | The Azure Instance Metadata Service isn't supported on Azure Stack Hub.

@lucab
Contributor

lucab commented Jun 26, 2020

Which then makes me wonder, where does an AzureStack instance get its hostname? Is that in the DHCP options?

@darkmuggle
Contributor Author

darkmuggle commented Aug 13, 2020

Now that Ignition [1] treats Azure Stack as a separate platform, we might have "just enough" to get FCOS/RHCOS booted on Azure{Stack,Hub} [1a, 1b]. The provided OVF from Microsoft looks suspect to me; the XML looks like it's describing a Windows instance. Regardless, the OVF XML given to us for AzureStack deviates substantially from what we know exists on Azure.

I started a stub [3], but after looking at the FCOS [4] packaging and RHCOS's previous failure to boot (caused by Afterburn checking in as if it was on AzureStack), the stub is really superfluous.

The next steps are:

  • Confirm that the OVF provided by MS is, in fact, from a Linux VM
  • Attempt to boot RHCOS 4.6 or FCOS and see what happens

[1] https://github.com/coreos/ignition/blob/master/internal/providers/azurestack/azurestack.go
[1a] caveat emptor: this has not been tested on Azure Stack
[1b] caveat emptor: "just enough" is assumed to mean basic function, that is, booting to a console. No Afterburn support. Remote access would be dependent on SSH keys provided by Ignition. Node is likely unusable beyond a POC.
[2] #463 (comment)
[3] 40f00e8
[4] https://src.fedoraproject.org/rpms/rust-afterburn/blob/master/f/rust-afterburn.spec#_49-53 uses the defaults set in
https://github.com/coreos/afterburn/blob/master/systemd/afterburn-checkin.service which would not apply to AzureStack

@darkmuggle
Contributor Author

/cc @ashcrow @miabbott

@darkmuggle
Contributor Author

As it turns out, we need to implement check-in support. FCOS will boot, but it will NOT check in and get a hostname.

lucab changed the title: Support on AzureStack → providers: add a new "azurestack" platform (agent logic for AzureStack) (Sep 15, 2020)
lucab changed the title: providers: add a new "azurestack" platform (agent logic for AzureStack) → providers: add a new "azurestack" platform (agent logic for AzureStackHub) (Sep 15, 2020)
lucab changed the title: providers: add a new "azurestack" platform (agent logic for AzureStackHub) → providers: add a new "azurestack" platform (client logic for AzureStackHub) (Sep 15, 2020)
@prestist
Contributor

Done in #561

4 participants