[WIP] Add Alibaba Cloud platform #5018
Hi @bd233. Thanks for your PR. I'm waiting for an openshift member to verify that this patch is reasonable to test. |
@bd233 can you rebase to fix the conflict? |
/ok-to-test |
I disagree. The image should be determined at the machine-pool level and not the platform level. |
I trust your opinion as this is your realm of expertise. I was basing my recommendations on the other providers who store the image on the platform itself (vsphere, AWS, baremetal, ovirt, kubevirt, openstack). The recent changes have broken the installer by not obeying the If I attempt to add a check for the existing image in the Alibaba section (in
How do you recommend we solve the issue with obeying the environment variable override? |
I don't think it's a bug in the recent changes but rather in the bootstrap image asset. To me it seems like the bootstrap image should have a dependency on the rhcos image asset, rather than calling
I'll take a look at whether we have a bug with this. What is the status with publishing the rhcos images? |
@patrickdillon I spoke with @miabbott and he mentioned we are waiting on getting some accounts worked out and a bucket to store the images. We are or have been close on this for a while. Hopefully the trt team can make progress. |
The other platforms--with the exception of AWS--define the image once at the platform level for the entire cluster and not for each machine pool. AWS is the only platform where the user can set the OS image in either the platform or the machine pool. This was done in #3308 with little context for why the option of setting it in the machine pool was added. My contention is that it is incorrect to provide the option to set it at both the platform level and the machine-pool level. It just makes the code more complicated. If the user wants to set it once for the entire cluster, then the user should use the default machine pool. It looks to me like there is actually a bug in the Bootstrap Image asset. The only platform that actually uses that asset is the baremetal platform. The other platforms should either (1) generate nothing for the bootstrap image or (2) generate the actual image that is going to be used for bootstrapping, even though the asset is not used. If it were the latter, then the asset should respect the environment variable override. |
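To make the proposal above concrete, here is a minimal Go sketch of the lookup order it implies: an environment variable override first, then the machine pool's own image, then the default machine pool, then the published image. Everything in it is hypothetical and for illustration only: the `MachinePool` struct is pared down to one field, `resolveImageID` is not a function in the installer, and `EXAMPLE_OS_IMAGE_OVERRIDE` is a placeholder name, not the variable the installer actually reads.

```go
package main

import (
	"fmt"
	"os"
)

// MachinePool is a hypothetical, pared-down machine pool.
// Only the image field matters for this sketch.
type MachinePool struct {
	ImageID string
}

// exampleOverrideEnv is a placeholder name chosen for this sketch;
// it is NOT the variable name used by the installer.
const exampleOverrideEnv = "EXAMPLE_OS_IMAGE_OVERRIDE"

// resolveImageID picks the image for one machine pool. Assumed precedence:
//  1. the environment variable override, if set and non-empty
//  2. the image set on the specific machine pool
//  3. the image set on the default machine pool (the cluster-wide setting)
//  4. the published image passed in by the caller
//
// lookupEnv is injected (os.LookupEnv in real use) so the function is testable.
func resolveImageID(defaultPool, pool *MachinePool, published string,
	lookupEnv func(string) (string, bool)) string {
	if v, ok := lookupEnv(exampleOverrideEnv); ok && v != "" {
		return v
	}
	if pool != nil && pool.ImageID != "" {
		return pool.ImageID
	}
	if defaultPool != nil && defaultPool.ImageID != "" {
		return defaultPool.ImageID
	}
	return published
}

func main() {
	defaults := &MachinePool{ImageID: "img-cluster-wide"}
	master := &MachinePool{}                      // inherits the default
	worker := &MachinePool{ImageID: "img-worker"} // pool-level setting wins

	fmt.Println("master:", resolveImageID(defaults, master, "img-published", os.LookupEnv))
	fmt.Println("worker:", resolveImageID(defaults, worker, "img-published", os.LookupEnv))
}
```

With this shape there is no platform-level image field at all: a cluster-wide image is expressed through the default machine pool, and any asset that needs an image (including one for bootstrapping) would go through the same function and so would pick up the override.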
I completely agree with you. I don't think it is good to store it on the platform level and storing the image in the machinepool makes much more sense.
The following code calls @patrickdillon @staebler |
@kwoodson would you take a look at #5267 and see if that fixes the problem for you? I believe it should and is one way of fixing the issue. |
I discovered today that the instance type for the master does not meet the required resources. This will need to be updated. I was able to achieve this with the following code. Please update the code so that masters default to have the
I also updated the instance names to match the generated names with an extra
This change will require an update to the private zone terraform module here https://github.com/openshift/installer/blob/2226d559a2010d8564f3697af8cf741e029d453e/data/data/alibabacloud/cluster/privatezone/privatezone.tf#L65 |
@bd233 @dongchen126 |
@patrickdillon @staebler I rebased today and fixed the go.mod dependencies and was able to install a cluster successfully (https://github.com/kwoodson/installer/tree/alibaba_v21). There will be a couple more items but I want to set expectations for merge. Thanks! |
@bd233 @dongchen126 Feel free to rebase and resolve any conflicts. The correct dependencies can be found in the master branch's go.mod file here https://github.com/openshift/installer/blob/master/go.mod or I resolved them in a branch here https://github.com/kwoodson/installer/tree/alibaba_v21. This will simplify the process to build and test. Let me know if you have any questions. |
Thanks, I have updated and squashed all commits. |
Has been fixed. |
I have reorganized the commits from this PR in #5291. I outlined a plan in that PR for how to best proceed with the code review of this very large PR. I believe once we reorganize the commits, it will make the PR easier to review, the installer team can then do a review, Alibaba can address immediate concerns and then we can merge. |
Support the new Alibaba Cloud platform, adding required types and initial assets:
- cluster
- installconfig
- machine

Alibaba: move Alibaba cloud to the 'HiddenPlatformNames' slice
That'll make it hidden for end users while openshift installer is not yet GA

Alibaba: add Terraform Templates
Add Terraform templates to create cluster resource on Alibaba Cloud

Alibaba: not proxy the ECS metadata services
Should not proxy the instance metadata services
The address '100.100.100.200' is used for metadata services in Alibaba Cloud ECS instance

Alibaba: support to generate cloud-creds secret
Add the 'AlibabaCloudCredsSecretData' struct into 'pkg/asset/manifests/template.go', it is used to generate cloud-creds secret

Alibaba: add provider config
Add the Alibaba cloud provider config in manifests

Alibaba: add machine spec
Add the spec of Alibaba Cloud virtual machine

Alibaba: support to generate the 'terraform.tfvars' file
Support to generate the terraform.tfvars file by install-config and machine info

Alibaba: support to generate DNS configuration
Add privatezone ID to generate DNS configuration

Alibaba: update machines asset
update Alibaba Cloud machines asset

Alibaba: support to check creds
use NewClient function to check creds

Alibaba: update vendor
use command 'go mod vendor' to update vendor

Alibaba: fix: rename "ResourceGroupName" to "ResourceGroupID"
On Alibaba Cloud, the resource group ID is used to specify the resource group to which they belong

Alibaba: add SLB listeners health checks
Add SLB listeners ports(80, 443) health check

Alibaba: enable OSS service automatically
Add 'alicloud_oss_service' data source, and this can enable OSS service automatically

Alibaba: fix to generate bootstrap user-data
Add terraform variable 'bootstrap_stub_ignition' as user-data of bootstrap instance, and generate it by function 'generateIgnitionShim'.

Alibaba: update the image section
Update the image section, and added an 'ImageID' for Alibaba 'MachinePool'
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: split terraform templates into stages
Split Terraform into bootstrap and cluster stages.
Signed-off-by: dongchen126 <dc.dc@alibaba-inc.com>

Alibaba: fix: Delete unused outputs
Delete some unused outputs and add newlines at the end of some files
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: add validation for provisioning
Add validation for provisioning. Now just add check instance type.
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: add destroy code
Add destroy code to delete cloud resource
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: move alibabacloud platform out of hiddenplatformnames
Move alibabacloud out of hiddenplatformnames to the supported platforms since
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: update to use cluster-api-provider-alibabacloud package
The cluster-api-provider-alibabacloud repo has been updated and supported, using the library’s AlibabaCloudMachineProviderConfig component. In the future, the code will be synchronized to github.com/openshift/cluster-api-provider-alibaba repo. Using github.com/openshift/cluster-api-provider-alibaba is the more recommended way.
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: update the scheme and version
Update the 'AddToScheme' and 'SchemeGroupVersion' for machines of master and worker
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix APIVersion to alibabacloudmachineproviderconfig.openshift.io
The APIVersion needs to be 'alibabacloudmachineproviderconfig.openshift.io' in 'pkg/asset/machines/alibabacloud/machinesets.go' file
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: specify the SystemDiskCategory and SystemDiskSize for AlibabaCloudMachineProviderConfig
Fix the problem: the system disk information is missing in the generated terraform variables file.
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: store credentials on disk in a config file
After user input credentials, stores them on disk in a config file
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: update terraform provider plugins
Update the Alibaba Cloud terraform provider plugin
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: replace resource alicloud_eip with alicloud_eip_address
The resource alicloud_eip has been deprecated from version 1.126.0. And use new resource alicloud_eip_address.
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: assign a public IP to bootstrap instance
Assign a public IP to bootstrap instance by specifying the internet_max_bandwidth_out parameter
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: add bootstrap instance to the security group of master
The bootstrap instance should be in master instance security group.
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: unify cluster outputs.tf and bootstrap variables.tf file
Unify the output.tf of the cluster and the variables.tf of the bootstrap
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix: alibabacloud destroyer creator is not initialized
Fix the error that the destroyer creator of alibabacloud is not initialized
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: remove redundant NatGatway and EIP resource
One NatGatway can fully meet the needs of the network
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix to 'Config.BaseDomain' is nil
Use ClusterDomain instead of BaseDomain to query DNS resources
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix the problem RAM must use https request
1. RAM request scheme must be https
2. Add automatic retry function
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: add tags for bootstrap security group
Add tags for bootstrap security group
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: extend the expiration time of the bucket URL
Extend the expiration time of the bucket URL to 2 hours
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: update destroy code
1. Updated the function call method, reducing the parameters passed
2. Added some log output to facilitate debugging
3. Detach policies with roles
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix manifests dns DNS config
1. fix public zone not be specified
2. config dns private zone by name
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: update terraform provider version to 1.132.0
This version fixes the NotApplicable error that the user of the international station creates alicloud_pvtz_service resource
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: add hostname to instances
Add hostname to bootstrap and master instances
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix the error that DescribeAvailableInstanceType response is empty
Adjust parameter order and add instanceType parameter for request, to fix the error that DescribeAvailableInstanceType response is empty.
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix the error that the custom image is unavailable
Sets the imageID from `required` to `a`, to fix the error that the custom image is unavailable
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: add ali_nat_gatway_zone_id variable
Add a variable ali_nat_gatway_zone_id to create NAT Gatway to solve the problem that some availability zones do not support NAT gatway creation
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix DependencyViolation error
Fix the DependencyViolation error that occasionally appeared when the cluster was destroyed.
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: Add bootstrap instance to internal SLB
1. Add bootstrap instance to internal SLB.
2. The alicloud_slb_attachment has been deprecated, replace with alicloud_slb_backend_server
In the future, plan to use alicloud_slb_server_group
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: add snat to the master node VSwitchs
The masters require internet access so the instances can pull images during startup, add snat to the master node VSwitchs.
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix the issue that checking VPC release failed
Check resources multiple times through 'Poll' function
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix gatway spelling errors
Fix spelling errors gatway, the correct wording is gateway
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: delete key pair variables
The key pair only supports the configuration of the root user, not valid for core user
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: remove extra spaces
Remove extra spaces to fix CI
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: update infrastructure asset
Update infrastructure asset with 'AlibabaCloudPlatformType' and 'AlibabaCloudPlatformStatus'
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: delete AccessKeyID and AccessKeySecret item in provider config
Fix: the instance has been bound to RAM Role, no need to specify AK and SK.
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix: match the hostname
Tag and instance name match hostname
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: update RAM role policy
Update policy document of master and worker node RAM role
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: add static validation for resourceGroupID
1. Add static validation for resourceGroupID
2. Fix incorrect 'instanceTypePath'
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix: 'Region' should not be obtained from the client
Obtain the 'Region' from the 'Config', not the client
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix: delete unnecessary changes and comments
Delete some unnecessary changes and comments
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix: add some 'optional' and 'omitempty' tag
Add 'optional' and 'omitempty' tags for some optional attribute
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix some variable name problems
Fix some variable name problem
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: fix: add DiskCategory type
Add 'DiskCategory' type to be used for 'SystemDiskCategory'

Alibaba: separate 'GenerateIgnitionShim' as a common part
The logic for generating an ignition file on Alibaba is similar to AWS, separate 'GenerateIgnitionShim' from AWS as a common part, so that the Alibaba platform can also use it.
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: add MachinePool verification
Update 'InatanceType' verification, add 'ClusterName' verification

Alibaba: fix default master machine pool instance type
Update default machine pool instance type to 'xlarge' for the master
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: update format of the master instance name
Change the format of the master instance name, separated by '-'
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: update DNS config
Update the privatezone ID configuration of DNS, replace zone name with zone ID and add a zone type tag
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: refactor destroy code
Now when deleting a resource fails, it will always try again, instead of returning an Error. But now it is used to delete resources synchronously, consider using asynchronous mode later.
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

Alibaba: update cloud-creds-secret.yaml.template
Add Alibaba Cloud creds in cloud-creds-secret.yaml.template file
Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>
This commit was produced by running , , and all modules verified. Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>
@bd233: PR needs rebase. |
/uncc |