[BUG] Slack Terraform CI automation timeouts #1238

smoya · 2024-06-05T11:34:10Z

Describe the bug.

@Shurtu-gal did an excellent job automating the creation and maintenance of AsyncAPI Slack channels and user groups. See #1072.

However, we hit a blocker: the Terraform run fails due to timeouts when calling the Slack API.

The TF provider is not optimized at all. I have the feeling this code is executed once per managed user group whenever TF refreshes its state: https://github.com/pablovarela/terraform-provider-slack/blob/master/slack/resource_usergroup.go#L108-L128, so potentially we are calling the usergroups.list API method once for each user group we have.

Expected behavior

I believe we could do some work on the provider repo (it's written in Go and seems easy to read and understand at a glance) to implement some caching or whatever mechanism we decide on. But in the short term, I can't see how to fix it.
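
To make the caching idea concrete, here is a minimal sketch of what "list once, look up many times" could look like. It is written in Python for brevity rather than in the provider's Go, and every name in it (`UsergroupCache`, `fake_list_usergroups`, the `S000`-style IDs) is hypothetical. The fake function only stands in for a single usergroups.list call and counts how often it is invoked; this is not the provider's actual code.

```python
from typing import Callable, Dict, List, Optional


class UsergroupCache:
    """Fetch the full user group list once and serve lookups from memory."""

    def __init__(self, list_usergroups: Callable[[], List[dict]]) -> None:
        # list_usergroups stands in for one call to Slack's usergroups.list.
        self._list_usergroups = list_usergroups
        self._by_id: Optional[Dict[str, dict]] = None

    def get(self, group_id: str) -> Optional[dict]:
        if self._by_id is None:
            # First lookup: a single API call populates the cache.
            self._by_id = {g["id"]: g for g in self._list_usergroups()}
        return self._by_id.get(group_id)


calls = 0


def fake_list_usergroups() -> List[dict]:
    """Hypothetical stand-in for the real API call; counts invocations."""
    global calls
    calls += 1
    return [{"id": f"S{i:03d}", "handle": f"group_{i}"} for i in range(30)]


cache = UsergroupCache(fake_list_usergroups)
found = [cache.get(f"S{i:03d}") for i in range(30)]
print(calls)  # 1
```

With 30 lookups the stand-in is called once instead of 30 times. A real implementation would also have to decide when the cached list is stale, which this sketch ignores.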

Screenshots

How to Reproduce

Run terraform apply with the proper Slack token configured (ask @derberg or me).

🥦 Browser

None

👀 Have you checked for similar open issues?

I checked and didn't find similar issue

🏢 Have you read the Contributing Guidelines?

I have read the Contributing Guidelines

Are you willing to work on this issue ?

None


derberg · 2024-06-10T12:23:28Z

There are probably more issues: we just merged a PR with a new WG.

https://github.com/asyncapi/community/actions/runs/9446779045/job/26017154335

smoya · 2024-06-10T13:10:21Z

> https://github.com/asyncapi/community/actions/runs/9446779045/job/26017154335

I'm not sure whether this is the issue, but what I see is that both the channel and the group have the same handle, wg_marketing, and that is not allowed, as mentioned in the header comment of the file:

The handle should be unique and not in use by a member, channel, or another group.
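
A check along these lines could flag such a collision before Terraform runs. It is only a sketch: the function name and the sample lists are made up for illustration (only wg_marketing and the two channel names come from this thread), and it compares channel names against group handles only, since member handles are not available in the config files.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_handle_collisions(
    channel_names: Iterable[str], group_handles: Iterable[str]
) -> Dict[str, List[str]]:
    """Map each name claimed more than once to the kinds that claim it."""
    owners: Dict[str, List[str]] = defaultdict(list)
    for name in channel_names:
        owners[name].append("channel")
    for handle in group_handles:
        owners[handle].append("group")
    return {name: kinds for name, kinds in owners.items() if len(kinds) > 1}


# Sample data; "example_group" is a placeholder, not a real group.
channels = ["01_introductions", "02_general", "wg_marketing"]
groups = ["wg_marketing", "example_group"]
print(find_handle_collisions(channels, groups))
# {'wg_marketing': ['channel', 'group']}
```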

Shurtu-gal · 2024-06-10T14:00:40Z

The issue is invalid YAML. If an unquoted string contains a colon followed by a space, it needs to be quoted.
It can be seen here:

community/WORKING_GROUPS.yaml

Line 31 in c9ebbc2

    
           description: The group is dedicated to leveraging marketing strategies to achieve two key objectives: promoting AsyncAPI adoption and highlighting community achievements. By strategically showcasing AsyncAPI capabilities and celebrating community successes, the group drives both user growth and community engagement. It shares a vision of close collaboration between AsyncAPI community and sponsors.

cc: @smoya @derberg
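
For illustration, a rough line-based check for this class of mistake might look like the sketch below. It is a heuristic, not a YAML parser: it only looks at single-line `key: value` pairs, skips values that start with a quote, a block indicator or a flow collection, and the function name and sample text are invented for the example.

```python
import re
from typing import List, Tuple

# "key: value" on one line; the key may follow a "- " list marker.
KEY_VALUE = re.compile(r"^\s*(?:-\s+)?[A-Za-z_][\w-]*:\s+(?P<value>.+)$")


def find_unquoted_colons(text: str) -> List[Tuple[int, str]]:
    """Return (line number, value) for plain values that contain ': '."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = KEY_VALUE.match(line)
        if not match:
            continue
        value = match.group("value").strip()
        # Quoted, block (| or >), and flow ([ or {) values are skipped.
        if not value or value[0] in "\"'|>[{":
            continue
        if ": " in value or value.endswith(":"):
            hits.append((number, value))
    return hits


sample = """\
- name: Marketing
  description: two key objectives: promoting adoption
  note: "quoted: this is fine"
"""
print(find_unquoted_colons(sample))
# [(2, 'two key objectives: promoting adoption')]
```

Running a real YAML parser over the file in CI would be more reliable; the sketch only shows why the description on line 31 is rejected while a quoted value is not.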

smoya · 2024-06-10T14:07:16Z

> The issue is invalid YAML. If an unquoted string contains a colon followed by a space, it needs to be quoted. It can be seen here:
>
> community/WORKING_GROUPS.yaml
>
> Line 31 in c9ebbc2
>
> description: The group is dedicated to leveraging marketing strategies to achieve two key objectives: promoting AsyncAPI adoption and highlighting community achievements. By strategically showcasing AsyncAPI capabilities and celebrating community successes, the group drives both user growth and community engagement. It shares a vision of close collaboration between AsyncAPI community and sponsors.
>
> cc: @smoya @derberg

Yup, fix is here #1251

derberg · 2024-06-11T10:59:56Z

Oh, thanks. I suggest we add a workflow like https://github.com/asyncapi/community/blob/master/.github/workflows/validate-maintainers.yml#L12-L43 with a JSON Schema that we validate against, as these issues will pop up regularly, and with JSON Schema you can cover lots of validation cases, even pattern validation.

derberg · 2024-06-11T11:04:05Z

regarding timeouts

Do workflows have an option to react on failure? Are we able to parse the error in such a step, figure out that it is a timeout, and retry?

Other than that, the minimum we can do is post the error in Slack, saying that someone needs to rerun the job. We support such things already; we can have a custom message that tags certain people, for example.

smoya · 2024-06-11T15:00:47Z

> Do workflows have an option to react on failure? Are we able to parse the error in such a step, figure out that it is a timeout, and retry?

You can retry as many times as you want, but it will keep failing. As stated in the description of the issue:

> so potentially we are calling the usergroups.list API method once for each user group we have.

We have more user groups than the API rate limit allows per minute (20 calls). See

The issue is that the TF provider seems to make one call to that API per group instead of a single call that gets all of them (still to be confirmed, but I'm 95% convinced).
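
As a back-of-the-envelope illustration of why retrying does not help, the sketch below compares the two strategies against the 20 calls per minute quoted above. The group count of 30 is hypothetical, and the single-call case assumes the whole list comes back in one response.

```python
import math

RATE_LIMIT_PER_MINUTE = 20  # figure quoted in this thread


def calls_per_refresh(num_groups: int, per_group_listing: bool) -> int:
    """API calls one state refresh makes under each strategy."""
    return num_groups if per_group_listing else 1


def minutes_needed(calls: int, limit: int = RATE_LIMIT_PER_MINUTE) -> int:
    """Fewest one-minute windows that can hold `calls` without exceeding `limit`."""
    return math.ceil(calls / limit)


num_groups = 30  # hypothetical count, not taken from the real workspace

per_group = calls_per_refresh(num_groups, per_group_listing=True)
shared = calls_per_refresh(num_groups, per_group_listing=False)
print(per_group, minutes_needed(per_group))  # 30 2
print(shared, minutes_needed(shared))        # 1 1
```

Under these assumptions, a refresh that lists once per group can never fit in a single rate-limit window, so every rerun hits the same wall.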

derberg · 2024-06-12T16:37:30Z

> You can retry as many times as you want, but it will keep failing. As stated in the description of the issue:

Sorry, that wasn't clear to me. So basically it means the automation will always fail at the moment?

By the way, it fails for a different reason here: https://github.com/asyncapi/community/actions/runs/9464298763/job/26071337562

And what about the GitHub teams automation?

smoya · 2024-06-13T07:53:38Z

> By the way, it fails for a different reason here: https://github.com/asyncapi/community/actions/runs/9464298763/job/26071337562

I don't understand that error. In fact, I can't reproduce the same state as in our CI even though the tfstate file is the same. That's weird... @Shurtu-gal any idea? I expect terraform plan on the master branch to produce the same plan as in the link @derberg shared, but that's not the case in my local env.

Examples of things my terraform plan says:

Terraform planned the following actions, but then encountered a problem:

  # module.channels.slack_conversation.channels["01_introductions"] will be updated in-place
  ~ resource "slack_conversation" "channels" {
      + action_on_destroy                  = "archive"
      + action_on_update_permanent_members = "none"
      + adopt_existing_channel             = true
        id                                 = "C023GJWH33K"
        name                               = "01_introductions"
        # (10 unchanged attributes hidden)
    }

  # module.channels.slack_conversation.channels["02_general"] will be updated in-place
  ~ resource "slack_conversation" "channels" {
      + action_on_destroy                  = "archive"
      + action_on_update_permanent_members = "none"
      + adopt_existing_channel             = true
        id                                 = "C34F2JV0U"
        name                               = "02_general"
        # (10 unchanged attributes hidden)
    }

smoya · 2024-06-13T08:32:46Z

bounty/candidate

Shurtu-gal · 2024-06-14T19:32:18Z

@smoya I checked a few possible causes:

Bot account may not be in channel anymore: It is there in 02_general so cannot be the reason.
ID may have changed: That's also not the case.
Finally maybe the app has been uninstalled or deactivated. https://stackoverflow.com/questions/75432078/error-an-api-error-occurred-account-inactive

@derberg you may need to check both the bot token stored in the secret and the app itself.

smoya · 2024-09-18T12:18:59Z

It seems there is a fork of the terraform provider that handles timeouts when creating groups. See pablovarela/terraform-provider-slack#223 (comment)

smoya added the bug Something isn't working label Jun 5, 2024

smoya mentioned this issue Jun 5, 2024

Automate Slack user groups + channels creation and management #1072

Closed

thulieblack mentioned this issue Jun 10, 2024

chore: add Marketing WORKING_GROUP #1130

Merged
