Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move config related to FAB auth manager to FAB provider #36232

Merged
merged 10 commits into from
Jan 3, 2024

Conversation

vincbeck
Copy link
Contributor

Some config is now used only by the new fab provider. Thus, we should move this config to the provider

See #32210.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@vincbeck vincbeck added the AIP-56 Extensible user management label Dec 14, 2023
@potiuk
Copy link
Member

potiuk commented Dec 19, 2023

Let's see :)

@vincbeck
Copy link
Contributor Author

vincbeck commented Dec 19, 2023

Same error. I only disable the flag spellcheck for iteration #2 and #3. It should work since the issue is on the same package: apache-airflow-providers-fab. Though, interesting enough, the issue happen on apache-airflow-providers

@potiuk
Copy link
Member

potiuk commented Dec 19, 2023

Same error. I only disable the flag spellcheck for iteration #2 and #3. It should work since the issue is on the same package: apache-airflow-providers-fab. Though, interesting enough, the issue happen on apache-airflow-providers

Yeah. That's the thing.

This is what happens:

  1. How apache-airflow-providers work.

The apache-airflow-providers package extracts information from other providers and performs summary of "core extensions". In case of configuration changes it goes through the list of of all providers and finds out which one has the configuration defined, and based on that creates an "index" page where it links to configuration page of the apache-airflow-providers-fab provider. This is based on the current source code. So Sphinx expects that the configuration page is there.

  1. Fetching the current inventory:

If you unfold the commands running the spellcheck you will find this:

apache-airflow-providers-fab: Fetched inventory: http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/docs/apache-airflow-providers-fab/stable/objects.inv

This means that before spellchecking start, the "current" inventory from s3 is fatched. This is the "inventory" of FAB provider that we pushed to s3 when it was built last time successfully in our canary build. This inventory does NOT contain configurations-ref page because it was not there in main when the inventory was built

  1. When 1st pass spellchecking starts for apache-airflow-providers the local apache-airflow-providers-fab sphix documentation is not built, so inventory is used. Without configuration-ref page

In this case what SHOULD happen - we should rebuild all the packages and only THEN run spellchecking. That will first build all the packages and then, when we got them locally built, they will be used instead of inventory (and in this case, the fab one locally built will have the configuration-ref page.

Looking at the output, I believe the 2nd psss did not happen at all. Possibly (without looking at the code) the 2nd pass is not currently executed at all when we have spellcheck only case ? Which I guess is logical, because you can't fix anything when you run spellcheck-only

@potiuk
Copy link
Member

potiuk commented Dec 19, 2023

I guessed correctly:

    to_retry_packages = [
        package_name
        for package_name, errors in package_build_errors.items()
        if any(any((m in e.message) for m in ERRORS_ELIGIBLE_TO_REBUILD) for e in errors)
    ]

this list of to_retry_packages only checks package_build_errors not package_spelling_errors.

And anyhow it would not work because if there are package spelling errors and the flag spellcheck_only is true, you have to not only retry buillding the failed package (apache-airflow-providers) but you also should ACTUALLY BUILD the apache-airflow-providers-fab package. Only then the problem should be solved.

So I'd say smthing like that should work:

In retry_building_docs_if_needed:

  1. the to_retry_packages should also check package_spelling_errors

  2. if spellcheck_only flag is set to true and there are any to_retry_packages detected, then the full list of built packages should be used. We currently do not have it inside the "retry_building_docs_if_needed" so we need to pass it. the spellcheck_only should be set to false when calling build_docs_for_packages again and all errors seen so far should be cleared (both spelling and build ones) - because we ware rebuilding and spellchecking all those packages from the scratch.

@potiuk
Copy link
Member

potiuk commented Dec 19, 2023

And then of course you should revert passing the "spellcheck_only" flag to use the original flag. We need tht in order to find out what mode the original build was called with.

@vincbeck
Copy link
Contributor Author

It worked locally :) Let's see!

@potiuk
Copy link
Member

potiuk commented Dec 19, 2023

It worked locally :) Let's see!

It does look like it could work :D. And we will see it in the job output as well.

@vincbeck
Copy link
Contributor Author

Arggg ... It is failing :|

@potiuk
Copy link
Member

potiuk commented Dec 20, 2023

OK. I think the problem is that spell-checking does not properly detect "build" errors. I saw it in the past that when spelling fails with build errors, there is no "actual" error printed. It's because it does:

spelling_errors.extend(parse_spelling_warnings(warning_text, self._src_dir))

and it does not do:

 spelling_errors.extend(parse_sphinx_warnings(warning_text, self._src_dir))

So the "unknown document: 'apache-airflow-providers-fab:configurations-ref'" is not added as "SpellingError" because it is not detected by "parse_spelling_warnings".

I thing the

 all_spelling_errors: dict[str, list[SpellingError]] = defaultdict(list)

should be changed to:

 all_spelling_errors: dict[str, list[SpellingError|DocBuildError]] = defaultdict(list)

(plus of course all the consequences) and both spelling and doc build errors should be parsed and added there.

That should not only fix the problem but improve the output of "--spell-check-only" when there is a build error (and not actual spelling error)

@potiuk
Copy link
Member

potiuk commented Dec 20, 2023

actually... I think better and simpler solution will be to add:

build_errors.extend(parse_sphinx_warnings(warning_text, self._src_dir))

in check_spelling.. And then 🤦 .... you can likely even revert checking the spelling errors - only leave the build_errors . Because spelling will then produce the build_errors on its own :)

@potiuk
Copy link
Member

potiuk commented Jan 2, 2024

Ah... I was also wondering WHY it was running so long - and I see you added use public runners - that's why it was running sooooooo long... any problems when using self-hosted ones?

@vincbeck
Copy link
Contributor Author

vincbeck commented Jan 2, 2024

Ah... I was also wondering WHY it was running so long - and I see you added use public runners - that's why it was running sooooooo long... any problems when using self-hosted ones?

Yeah jobs were not getting picked up for at least an hour so I went with the public runners

@potiuk
Copy link
Member

potiuk commented Jan 2, 2024

Strange. I see jobs being picked up now.. Maybe something temporary

@potiuk
Copy link
Member

potiuk commented Jan 3, 2024

🎉 SUCCESS

except the MSSQL failure that is 🙀

@vincbeck vincbeck removed the use public runners Makes sure that Public runners are used even if commiters creates the PR (useful for testing) label Jan 3, 2024
@vincbeck vincbeck closed this Jan 3, 2024
@vincbeck vincbeck reopened this Jan 3, 2024
@vincbeck
Copy link
Contributor Author

vincbeck commented Jan 3, 2024

Spellcheck is failing in self hosted runner but succeeding in public runners ... Interesting ... I'll try again with public runners

@vincbeck vincbeck added the use public runners Makes sure that Public runners are used even if commiters creates the PR (useful for testing) label Jan 3, 2024
@vincbeck vincbeck closed this Jan 3, 2024
@vincbeck vincbeck reopened this Jan 3, 2024
@potiuk
Copy link
Member

potiuk commented Jan 3, 2024

Hmmmm.... That's .... Interesting. Bigger parallelism ?

@potiuk
Copy link
Member

potiuk commented Jan 3, 2024

BTW. It's a bit worrying that it fails on self-hosted - eventually this build should succed in main - because that is how inventories will get pushed to S3, so if it fails on self-hosted, merging it is risky.

@potiuk
Copy link
Member

potiuk commented Jan 3, 2024

Believe it or not - but I think the failure on self-hosted runners was accidental

Error 143 returned
None
Running command: <id>
Running command: <id>
Running command: <docker info '{{println .SecurityOptions}}'>
Running command: <docker run ./scripts/in_container:/opt/airflow/scripts/in_container python:3.8-slim-bookworm>
Error: Process completed with exit code 143.

This is not a "doc-build" failure. It usually indicates that something was wrong on the machine ...

@potiuk
Copy link
Member

potiuk commented Jan 3, 2024

So ... I'd try to remove use public and re-run. Hoping that it will succeed.

@vincbeck
Copy link
Contributor Author

vincbeck commented Jan 3, 2024

BTW. It's a bit worrying that it fails on self-hosted - eventually this build should succed in main - because that is how inventories will get pushed to S3, so if it fails on self-hosted, merging it is risky.

I thought the issue was because we had in-flights doc changes (referencing doc not merged yet). To me, once merged there will be no issue. Am I wrong?

@potiuk
Copy link
Member

potiuk commented Jan 3, 2024

I thought the issue was because we had in-flights doc changes (referencing doc not merged yet).

Nope. This is because there are changes in several packages in the same PR and those packages refer to each other.

To me, once merged there will be no issue. Am I wrong?

Once merged AND doc build succedds AND inventories are pushed to S3 (which only happens in canary main build which runs after merge.

@potiuk
Copy link
Member

potiuk commented Jan 3, 2024

Once merged AND doc build succedds AND inventories are pushed to S3 (which only happens in canary main build which runs after merge.

But yeah. Just realized that in this case doc build always succeds, it's the spellcheck that is wrong - so in fact it does not matter if spellcheck fails or not ...

@vincbeck
Copy link
Contributor Author

vincbeck commented Jan 3, 2024

Greeeeeeennnnnnn!!! 🥳

@eladkal
Copy link
Contributor

eladkal commented Jan 3, 2024

Greeeeeeennnnnnn!!! 🥳

🎉🎉🎉🎉

@vincbeck vincbeck merged commit 28cad70 into apache:main Jan 3, 2024
51 checks passed
@vincbeck vincbeck deleted the vincbeck/fab_config branch January 3, 2024 19:46
@potiuk
Copy link
Member

potiuk commented Jan 3, 2024

:WOOOOHOOO!

@ephraimbuddy ephraimbuddy added this to the Airflow 2.9.0 milestone Jan 10, 2024
@ephraimbuddy ephraimbuddy added the type:misc/internal Changelog: Misc changes that should appear in change log label Jan 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AIP-56 Extensible user management area:providers area:webserver Webserver related Issues kind:documentation type:misc/internal Changelog: Misc changes that should appear in change log use public runners Makes sure that Public runners are used even if commiters creates the PR (useful for testing)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants