
Add migration progress documentation #3333

Merged: 25 commits into `main` on Nov 22, 2024
Conversation

JCZuurmond
Member

@JCZuurmond JCZuurmond commented Nov 19, 2024

Changes

Add migration progress design documentation

Linked issues

Progresses #2074
Related #3067

Functionality

  • added relevant user documentation: docs/migration-progress.md

@JCZuurmond JCZuurmond added the documentation Improvements or additions to documentation label Nov 19, 2024
@JCZuurmond JCZuurmond self-assigned this Nov 19, 2024
@JCZuurmond JCZuurmond marked this pull request as ready for review November 19, 2024 12:39
@JCZuurmond JCZuurmond requested a review from a team as a code owner November 19, 2024 12:39

github-actions bot commented Nov 19, 2024

✅ 6/6 passed, 7m44s total

Running from acceptance #7509

docs/migration-progress.md (outdated, resolved)
docs/migration-progress.md (outdated, resolved)
Contributor

@asnare asnare left a comment


lgtm

progress by updating the following [UCX catalog](#create-ucx-catalog-command) tables:
The manually triggered `migration-progress-experimental` workflow updates a **subset** of
the [inventory tables](#assessment-workflow) to [track Unity Catalog compatibility](docs/migration-progress.md) of Hive
and workspace objects that need to be migrated.

Contributor


Describe the purpose and the business rationale.

Member Author


I've added a reference to the dashboard. I would like to address your comment either in the dashboard documentation (briefly) and/or in the migration progress documentation (more extensively), as this section is only about the workflow and less about the "why".

for Hive data objects, this means that the objects are migrated to Unity Catalog.

### Owner

Contributor


Define ownership. Explain the logic.

Member Author


Separate issue: #3067

docs/migration-progress.md (resolved)
@JCZuurmond JCZuurmond added this pull request to the merge queue Nov 22, 2024
Merged via the queue into main with commit 60718a7 Nov 22, 2024
7 checks passed
@JCZuurmond JCZuurmond deleted the docs/add-migration-progress branch November 22, 2024 14:08
gueniai added a commit that referenced this pull request Dec 2, 2024
* Added `assign-owner-group` command ([#3111](#3111)). The Databricks Labs Unity Catalog Exporter (UCX) tool now includes a new `assign-owner-group` command, allowing users to assign an owner group to the workspace. This group will be designated as the owner for all migrated tables and views, providing better control and organization of resources. The command can be executed in the context of a specific workspace or across multiple workspaces. The implementation includes new classes, methods, and attributes in various files, such as `cli.py`, `config.py`, and `groups.py`, enhancing ownership management functionality. The `assign-owner-group` command replaces the functionality of issue [#3075](#3075) and addresses issue [#2890](#2890), ensuring proper schema ownership and handling of crawled grants. Developers should be aware that running the `migrate-tables` workflow will result in assigning a new owner group for the Hive Metastore instance in the workspace installation.
* Added `opencensus` to known list ([#3052](#3052)). In this release, we have added OpenCensus to the list of known libraries in our configuration file. OpenCensus is a popular set of tools for distributed tracing and monitoring, and its inclusion in our system will enhance support and integration for users who utilize this tool. This change does not affect existing functionality, but instead adds a new entry in the configuration file for OpenCensus. This enhancement will allow our library to better recognize and work with OpenCensus, enabling improved performance and functionality for our users.
* Added default owner group selection to the installer ([#3370](#3370)). A new class, AccountGroupLookup, has been added to the AccountGroupLookup module to select the default owner group during the installer process, addressing previous issue [#3111](#3111). This class uses the workspace_client to determine the owner group, and a pick_owner_group method to prompt the user for a selection if necessary. The ownership selection process has been improved with the addition of a check in the installer's `_static_owner` method to determine if the current user is part of the default owner group. The GroupManager class has been updated to use the new AccountGroupLookup class and its methods, `pick_owner_group` and `validate_owner_group`. A new variable, `default_owner_group`, is introduced in the ConfigureGroups class to configure groups during installation based on user input. The installer now includes a unit test, "test_configure_with_default_owner_group", to demonstrate how it sets expected workspace configuration values when a default owner group is specified during installation.
* Added handling for non UTF-8 encoded notebook error explicitly ([#3376](#3376)). A new enhancement has been implemented to address the issue of non-UTF-8 encoded notebooks failing to load by introducing explicit error handling for this case. A UnicodeDecodeError exception is now caught and logged as a warning, while the notebook is skipped and returned as None. This change is implemented in the load_dependency method in the loaders.py file, which is a part of the assessment workflow. Additionally, a new unit test has been added to verify the behavior of this change, and the assessment workflow has been updated accordingly. The new test function in test_loaders.py checks for different types of exceptions, specifically PermissionError and UnicodeDecodeError, ensuring that the system can handle notebooks with non-UTF-8 encoding gracefully. This enhancement resolves issue [#3374](#3374), thereby improving the overall robustness of the application.
* Added migration progress documentation ([#3333](#3333)). In this release, we have updated the `migration-progress-experimental` workflow to track the migration progress of a subset of inventory tables related to workspace resources being migrated to Unity Catalog (UCX). The workflow updates the inventory tables and tracks the migration progress in the UCX catalog tables. To use this workflow, users must attach a UC metastore to the workspace, create a UCX catalog, and ensure that the assessment job has run successfully. The `Migration Progress` section in the documentation has been updated with a new markdown file that provides details about the migration progress, including a migration progress dashboard and an experimental migration progress workflow that generates historical records of inventory objects relevant to the migration progress. These records are stored in the UCX UC catalog, which contains a historical table with information about the object type, object ID, data, failures, owner, and UCX version. The migration process also tracks dangling Hive or workspace objects that are not referenced by business resources, and the progress is persisted in the UCX UC catalog, allowing for cross-workspace tracking of migration progress.
* Added note about running assessment once ([#3398](#3398)). In this release, we have introduced an update to the UCX assessment workflow, which will now only be executed once and will not update existing results in repeated runs. To accommodate this change, we have updated the README file with a note clarifying that the assessment workflow is a one-time process. Additionally, we have provided instructions on how to update the inventory and findings by uninstalling and reinstalling the UCX. This will ensure that the inventory and findings for a workspace are up-to-date and accurate. We recommend that software engineers take note of this change and follow the updated instructions when using the UCX assessment workflow.
* Allowing skipping TACLs migration during table migration ([#3384](#3384)). A new optional flag, "skip_tacl_migration", has been added to the configuration file, providing users with more flexibility during migration. This flag allows users to control whether or not to skip the Table Access Control Language (TACL) migration during table migrations. It can be set when creating catalogs and schemas, as well as when migrating tables or using the `migrate_grants` method in `application.py`. Additionally, the `install.py` file now includes a new variable, `skip_tacl_migration`, which can be set to `True` during the installation process to skip TACL migration. New test cases have been added to verify the functionality of skipping TACL migration during grants management and table migration. These changes enhance the flexibility of the system for users managing table migrations and TACL operations in their infrastructure, addressing issues [#3384](#3384) and [#3042](#3042).
* Bump `databricks-sdk` and `databricks-labs-lsql` dependencies ([#3332](#3332)). In this update, the `databricks-sdk` and `databricks-labs-lsql` dependencies are upgraded to versions 0.38 and 0.14.0, respectively. The `databricks-sdk` update addresses conflicts, bug fixes, and introduces new API additions and changes, notably impacting methods like `create()`, `execute_message_query()`, and others in workspace-level services. While `databricks-labs-lsql` updates ensure compatibility, its changelog and specific commits are not provided. This pull request also includes ignore conditions for the `databricks-sdk` dependency to prevent future Dependabot requests. It is strongly advised to rigorously test these updates to avoid any compatibility issues or breaking changes with the existing codebase. This pull request mirrors another ([#3329](#3329)), resolving integration CI issues that prevented the original from merging.
* Explain failures when cluster encounters Py4J error ([#3318](#3318)). In this release, we have made significant improvements to the error handling mechanism in our open-source library. Specifically, we have addressed issue [#3318](#3318), which involved handling failures when the cluster encounters Py4J errors in the `databricks/labs/ucx/hive_metastore/tables.py` file. We have added code to raise noisy failures instead of swallowing the error with a warning when a Py4J error occurs. The functions `_all_databases()` and `_list_tables()` have been updated to check if the error message contains "py4j.security.Py4JSecurityException", and if so, log an error message with instructions to update or reinstall UCX. If the error message does not contain "py4j.security.Py4JSecurityException", the functions log a warning message and return an empty list. These changes also resolve the linked issue [#3271](#3271). The functionality has been thoroughly tested and verified on the labs environment. These improvements provide more informative error messages and enhance the overall reliability of our library.
* Rearranged job summary dashboard columns and make job_name clickable ([#3311](#3311)). In this update, the job summary dashboard columns have been improved and the need for the `30_3_job_details.sql` file, which contained a SQL query for selecting job details from the `inventory.jobs` table, has been eliminated. The dashboard columns have been rearranged, and the `job_name` column is now clickable, providing easy access to job details via the corresponding job ID. The changes include modifying the dashboard widget and adding new methods for making the `job_name` column clickable and linking it to the job ID. Additionally, the column titles have been updated to display more relevant information. These improvements have been manually tested and verified in a labs environment.
* Refactor refreshing of migration-status information for tables, eliminate another redundant refresh ([#3270](#3270)). This pull request refactors the way table records are enriched with migration-status information during encoding for the history log in the `migration-progress-experimental` workflow. It ensures that the refresh of migration-status information is explicit and under the control of the workflow, addressing a previously expressed intent. A redundant refresh of migration-status information has been eliminated and additional unit test coverage has been added to the `migration-progress-experimental` workflow. The changes include modifying the existing workflow, adding new methods for refreshing table migration status without updating the history log, and splitting the crawl and update-history-log tasks into three steps. The `TableMigrationStatusRefresher` class has been introduced to obtain the migration status of a table, and new tests have been added to ensure correctness, making the `migration-progress-experimental` workflow more efficient and reliable.
* Safe read files in more places ([#3394](#3394)). This release introduces significant improvements to file handling, addressing issue [#3386](#3386). A new function, `safe_read_text`, has been implemented for safe reading of files, catching and handling exceptions and returning None if reading fails. This function is utilized in the `is_a_notebook` function and replaces the existing `read_text` method in specific locations, enhancing error handling and robustness. The `databricks labs ucx lint-local-code` command and the `assessment` workflow have been updated accordingly. Additionally, new test files and methods have been added under the `tests/integration/source_code` directory to ensure comprehensive testing of file handling, including handling of unsupported file types, encoding checks, and ignorable files.
* Track `DirectFsAccess` on `JobsProgressEncoder` ([#3375](#3375)). In this release, the open-source library has been updated with new features related to tracking Direct File System Access (DirectFsAccess) in the JobsProgressEncoder. This change includes the addition of a new `_direct_fs_accesses` method, which detects direct filesystem access by code used in a job and generates corresponding failure messages. The DirectFsAccessCrawler object is used to crawl and track file system access for directories and queries, providing more detailed tracking and encoding of job progress. Additionally, new methods `make_job` and `make_dashboard` have been added to create instances of Job and Dashboard, respectively, and new unit and integration tests have been added to ensure the proper functionality of the updated code. These changes improve the functionality of JobsProgressEncoder by providing more comprehensive job progress information, making the code more modular and maintainable for easier management of jobs and dashboards. This release resolves issue [#3059](#3059) and enhances the tracking and encoding of job progress in the system, ensuring more comprehensive and accurate reporting of job status and issues.
* Track `UsedTables` on `TableProgressEncoder` ([#3373](#3373)). In this release, the tracking of `UsedTables` has been implemented on the `TableProgressEncoder` in the `tables_progress` function, addressing issue [#3061](#3061). The workflow `migration-progress-experimental` has been updated to incorporate this change. New objects, `self.used_tables_crawler_for_paths` and `self.used_tables_crawler_for_queries`, have been added as instances of a class responsible for crawling used tables. A `full_name` property has been introduced as a read-only attribute for a source code class, providing a more convenient way of accessing and manipulating the full name of the source code object. A new integration test for the `TableProgressEncoder` component has also been added, specifically testing table failure scenarios. The `TableProgressEncoder` class has been updated to track `UsedTables` using the `UsedTablesCrawler` class, and a new class, `UsedTable`, has been introduced to represent the catalog, schema, and table name of a table. Two new unit tests have been added to ensure the correct functionality of this feature.
@gueniai gueniai mentioned this pull request Dec 2, 2024