Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OpenStack] installer is stuck on "Image import started." #122

Closed
VivianMeally opened this issue Mar 30, 2020 · 8 comments
Closed

[OpenStack] installer is stuck on "Image import started." #122

VivianMeally opened this issue Mar 30, 2020 · 8 comments

Comments

@VivianMeally
Copy link

Installer failing on uploading Fcos image to Glance.

OpenStack Version: Train

OKD Version: release:4.4.0-0.okd-2020-03-28-092308

Log:

DEBUG Generating Terraform Variables...
INFO Obtaining RHCOS image file from 'https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/31.20200310.3.0/x86_64/fedora-coreos-31.20200310.3.0-openstack.x86_64.qcow2.xz?sha256=dc045796f1ed272a1a77fc9fd59b667a4c646fb1e4129b1f8f31578a58e3bf8b'
INFO The file was found in cache: /root/.cache/openshift-installer/image_cache/c3c5779901a590f83626725a3aa0fed5. Reusing...
DEBUG Creating a Glance image for RHCOS...
DEBUG Image dev3-fj2kp-rhcos was created.
DEBUG Checking if the image import mechanism is supported
DEBUG Glance Direct image import plugin was found
DEBUG Using Image Import API to upload RHCOS to the image "dev3-fj2kp-rhcos" with ID "143848fe-785f-4be6-bab8-4e7f02e74ec8"
DEBUG The data was uploaded.
DEBUG Begin image import for the image "dev3-fj2kp-rhcos" with ID "143848fe-785f-4be6-bab8-4e7f02e74ec8"
DEBUG Image import started.

@vrutkovs
Copy link
Member

@mandre any ideas how to debug image upload to Glance?

@mandre
Copy link

mandre commented Mar 31, 2020

Could you let us know what the status of the image is with openstack image show dev3-fj2kp-rhcos.

From what I can tell from the logs, the image was uploaded correctly to your cloud and it is being imported into glance using the direct import plugin. It means that glance will chose the correct image format depending on your backend. It may be that this operation is particularly slow on your cloud?

If the image status is importing, the image import is still in progress. It will move to active once finished and the installer will go on with the deployment.

Do you know what backend you are using for glance?

You can see it in glance configuration file (if you're administering your openstack cloud):
https://docs.openstack.org/glance/latest/configuration/configuring.html#configuring-glance-storage-backends

If you have access to the glance logs, you might also find some information about what is going on.

@VivianMeally
Copy link
Author

VivianMeally commented Mar 31, 2020

Do you know what backend you are using for glance?

ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)

If the image status is importing, the image import is still in progress. It will move to active once finished and the installer will go on with the deployment.

Waited more than 1h, it was still in "importing" status

At the weekend I have tested this version, import worked fine.
https://origin-release.svc.ci.openshift.org/releasestream/4.4.0-0.okd/release/4.4.0-0.okd-2020-03-25-102431

openstack image show dev3-fj2kp-rhcos
+------------------+------------------------------------------------------------------------------+
| Field            | Value                                                                        |
+------------------+------------------------------------------------------------------------------+
| checksum         | None                                                                         |
| container_format | bare                                                                         |
| created_at       | 2020-03-30T20:09:52Z                                                         |
| disk_format      | qcow2                                                                        |
| file             | /v2/images/143848fe-785f-4be6-bab8-4e7f02e74ec8/file                         |
| id               | 143848fe-785f-4be6-bab8-4e7f02e74ec8                                         |
| min_disk         | 0                                                                            |
| min_ram          | 0                                                                            |
| name             | dev3-fj2kp-rhcos                                                             |
| owner            | 0d4d02f870474b16a7132cc8a6bb439e                                             |
| properties       | locations='[]', os_hash_algo='None', os_hash_value='None', os_hidden='False' |
| protected        | False                                                                        |
| schema           | /v2/schemas/image                                                            |
| size             | None                                                                         |
| status           | importing                                                                    |
| tags             | openshiftClusterID=dev3-fj2kp                                                |
| updated_at       | 2020-03-30T20:10:14Z                                                         |
| virtual_size     | None                                                                         |
| visibility       | shared                                                                       |
+------------------+------------------------------------------------------------------------------+

Glance log
Looks like it was not properly posted to glance

2020-03-30 19:39:56.946 30 INFO eventlet.wsgi.server [req-4851cab3-db74-4991-b030-065b31b6ee60 9898a4854b5f40a686131e3e7ca79176 0d4d02f870474b16a7132cc8a6bb439e - default default] 192.168.145.1,192.168.14.72 - - [30/Mar/2020 19:39:56] "POST /v2/images HTTP/1.1" 201 1023 1.134829
2020-03-30 19:40:18.322 29 INFO eventlet.wsgi.server [req-11ad5f89-2655-48d0-b122-7f2a21f3a1a7 9898a4854b5f40a686131e3e7ca79176 0d4d02f870474b16a7132cc8a6bb439e - default default] 192.168.145.1,192.168.14.72 - - [30/Mar/2020 19:40:18] "POST /v2/images/ff05918e-4a12-41ce-9b38-c599876ecea3/import HTTP/1.1" 202 206 0.388214
2020-03-30 19:40:18.342 29 INFO glance.domain [-] Task [c6672fed-56d1-4fc9-912a-66e778c2fc1b] status changing from pending to processing
2020-03-30 19:40:18.407 29 INFO glance_store._drivers.filesystem [-] Directory to write image files does not exist (/tmp/staging/). Creating.
2020-03-30 19:40:18.596 29 INFO glance.common.scripts.image_import.main [-] Task c6672fed-56d1-4fc9-912a-66e778c2fc1b: Got image data uri file:///tmp/staging/ff05918e-4a12-41ce-9b38-c599876ecea3 to be imported
2020-03-30 19:40:18.598 29 WARNING glance.common.scripts.image_import.main [-] Task c6672fed-56d1-4fc9-912a-66e778c2fc1b failed with exception [Errno 2] No such file or directory: '/tmp/staging/ff05918e-4a12-41ce-9b38-c599876ecea3': IOError: [Errno 2] No such file or directory: '/tmp/staging/ff05918e-4a12-41ce-9b38-c599876ecea3'
2020-03-30 19:40:18.598 29 INFO glance.common.scripts.image_import.main [-] Task c6672fed-56d1-4fc9-912a-66e778c2fc1b: Could not import image file file:///tmp/staging/ff05918e-4a12-41ce-9b38-c599876ecea3
2020-03-30 19:40:18.604 29 WARNING glance.async_.taskflow_executor [-] Task 'api_image_import-ImportToStore-c6672fed-56d1-4fc9-912a-66e778c2fc1b' (b4fcffd2-e4f5-4b54-9d2e-03204b0cc956) transitioned into state 'FAILURE' from state 'RUNNING'
3 predecessors (most recent first):
  Atom 'api_image_import-ConfigureStaging-c6672fed-56d1-4fc9-912a-66e778c2fc1b' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {}, 'provides': 'file:///tmp/staging/ff05918e-4a12-41ce-9b38-c599876ecea3'}
  |__Atom 'api_image_import_retry' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {}, 'provides': [(None, {})]}
     |__Flow 'api_image_import': IOError: [Errno 2] No such file or directory: '/tmp/staging/ff05918e-4a12-41ce-9b38-c599876ecea3'
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor Traceback (most recent call last):
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor   File "/var/lib/kolla/venv/lib/python2.7/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor     result = task.execute(**arguments)
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor   File "/var/lib/kolla/venv/lib/python2.7/site-packages/glance/async_/flows/api_image_import.py", line 249, in execute
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor     backend=self.backend)
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor   File "/var/lib/kolla/venv/lib/python2.7/site-packages/glance/common/scripts/image_import/main.py", line 154, in set_image_data
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor     "task_id": task_id})
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor   File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor     self.force_reraise()
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor   File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor     six.reraise(self.type_, self.value, self.tb)
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor   File "/var/lib/kolla/venv/lib/python2.7/site-packages/glance/common/scripts/image_import/main.py", line 145, in set_image_data
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor     data_iter = script_utils.get_image_data_iter(uri)
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor   File "/var/lib/kolla/venv/lib/python2.7/site-packages/glance/common/scripts/utils.py", line 139, in get_image_data_iter
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor     return open(uri, "rb")
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor IOError: [Errno 2] No such file or directory: '/tmp/staging/ff05918e-4a12-41ce-9b38-c599876ecea3'
2020-03-30 19:40:18.604 29 ERROR glance.async_.taskflow_executor 
2020-03-30 19:40:18.616 29 WARNING glance.async_.taskflow_executor [-] Task 'api_image_import-ImportToStore-c6672fed-56d1-4fc9-912a-66e778c2fc1b' (b4fcffd2-e4f5-4b54-9d2e-03204b0cc956) transitioned into state 'REVERTED' from state 'REVERTING'
2020-03-30 19:40:18.623 29 WARNING glance.async_.taskflow_executor [-] Task 'api_image_import-ConfigureStaging-c6672fed-56d1-4fc9-912a-66e778c2fc1b' (fb9869eb-fc6d-4eb2-9dd2-6a3824cee275) transitioned into state 'REVERTED' from state 'REVERTING'
2020-03-30 19:40:18.631 29 WARNING glance.async_.taskflow_executor [-] Flow 'api_image_import' (34b36fa2-79a6-47ef-8529-3f612a896ee2) transitioned into state 'REVERTED' from state 'RUNNING'
2020-03-30 19:40:18.632 29 ERROR glance.async_.taskflow_executor [-] Failed to execute task c6672fed-56d1-4fc9-912a-66e778c2fc1b: [Errno 2] No such file or directory: '/tmp/staging/ff05918e-4a12-41ce-9b38-c599876ecea3': IOError: [Errno 2] No such file or directory: '/tmp/staging/ff05918e-4a12-41ce-9b38-c599876ecea3'

@mandre
Copy link

mandre commented Mar 31, 2020

It seems like the image wasn't staged properly. I checked our code and found we didn't check for failures correctly. This should be fixed with openshift/installer#3385

Now that doesn't explain why your image wasn't stage -- but at least with the above patch the installer should let you know then the image failed to upload.

Perhaps glance logs tell you the reason why it failed to stage the image?

Also, if you're using multiple stores, make sure to read the following doc carefully: https://docs.openstack.org/glance/latest/admin/multistores.html#reserved-stores

@LorbusChris
Copy link
Contributor

LorbusChris commented Mar 31, 2020

This might be unrelated to this specific failure (and you'll hit after finding a way to proceed), but unfortunately the Ignition binary in FCOS does not yet understand HTTPHeaders in Ignition config, so provisioning is likely broken due to this for OpenShift right now. The next release of Ignition will support this (it has already merged into master).

@mandre
Copy link

mandre commented Apr 2, 2020

I got access to another cloud where the glance direct import was enabled, and I got the same result where the installer also blocks on image import. The image was still in uploading status, meaning it has finished uploading but hasn't started the image import (I know... poor naming) after leaving it running overnight. I also tried using the glance image-create-via-import command, same result. Which leads me to think, it's a misconfiguration of glance but also that glance direct import might be broken for A LOT of people.

As a consequence we decided to switch back to the usual image upload in the installer until we're able to detect if the cloud really supports glance direct image import: openshift/installer#3394

@vrutkovs
Copy link
Member

vrutkovs commented Apr 2, 2020

fcos cherrypick for this PR - openshift/installer#3397

@LorbusChris
Copy link
Contributor

Ignition v2.3.0 with stable spec 3.1 was released last Thursday and will likely make its way into the next Fedora CoreOS release images.

We also need to update the import here: https://github.com/openshift/installer/blob/fcos/pkg/tfvars/openstack/bootstrap_ignition.go#L14

binnes pushed a commit to binnes/okd that referenced this issue Jul 11, 2023
* Change bugtracker link to Github issue page.

* Video link for FCOS

* Remove beta text
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants