fix(path): convert to native encoding on Windows #806

Merged
merged 14 commits into from
Feb 6, 2024

lotem
Member

@lotem lotem commented Feb 3, 2024

Following @fxliang's PR, use `u8path` on Windows to convert UTF-8 strings to Windows native paths.
Closes #804

Fixes rime/weasel#576
Fixes rime/weasel#1080

BREAKING CHANGE: installation.yaml should be UTF-8 encoded.

Previously on Windows, the file could be written in the local encoding to allow paths with non-ASCII characters. After this change, it should be converted to UTF-8.
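
For anyone unsure whether an existing `installation.yaml` needs converting, a rough check is to test whether its bytes are already well-formed UTF-8. The helper below is only a sketch: `is_valid_utf8` is a hypothetical name, not part of librime, and it assumes the file contents have already been read into a `std::string`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of librime): returns true if `bytes` is
// well-formed UTF-8. Rejects overlong encodings, surrogate code points,
// and values above U+10FFFF.
inline bool is_valid_utf8(const std::string& bytes) {
  const std::size_t n = bytes.size();
  std::size_t i = 0;
  while (i < n) {
    const unsigned char lead = static_cast<unsigned char>(bytes[i]);
    if (lead < 0x80) {  // plain ASCII byte
      ++i;
      continue;
    }
    std::size_t len = 0;
    unsigned long cp = 0;
    if ((lead & 0xE0) == 0xC0) {
      len = 2;
      cp = lead & 0x1F;
    } else if ((lead & 0xF0) == 0xE0) {
      len = 3;
      cp = lead & 0x0F;
    } else if ((lead & 0xF8) == 0xF0) {
      len = 4;
      cp = lead & 0x07;
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (i + len > n) return false;  // sequence truncated at end of input
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char cont = static_cast<unsigned char>(bytes[i + k]);
      if ((cont & 0xC0) != 0x80) return false;
      cp = (cp << 6) | (cont & 0x3F);
    }
    // Reject overlong forms for each sequence length.
    if ((len == 2 && cp < 0x80) || (len == 3 && cp < 0x800) ||
        (len == 4 && cp < 0x10000)) {
      return false;
    }
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
    i += len;
  }
  return true;
}
```

A file containing only ASCII paths passes this check unchanged, so it needs no conversion. A path written in a legacy code page will usually fail it, although a passing result does not prove the file was meant as UTF-8.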

Pull request

Issue tracker

Fixes will automatically close the related issue

Fixes #

Feature

Describe feature of pull request

Unit test

  • Done

Manual test

  • Done

Code Review

  1. Unit and manual test pass
  2. GitHub Action CI pass
  3. At least one contributor reviews and votes
  4. Can be merged clean without conflicts
  5. PR will be merged by rebase upstream base

Additional Info

@lotem lotem requested a review from fxliang February 3, 2024 12:27
@lotem lotem force-pushed the path-fix branch 2 times, most recently from 78bf12f to 39ee053 Compare February 3, 2024 13:30
@lotem
Member Author

lotem commented Feb 3, 2024

PTAL @fxliang
Subclassed path to conditionally call u8path in the constructor from string.

@lotem lotem force-pushed the path-fix branch 2 times, most recently from 9c7d59e to 6a5d8dc Compare February 3, 2024 13:57
@lotem
Member Author

lotem commented Feb 3, 2024

need MORE work.

Follow @fxliang 's PR, use `u8path` on Windows to convert UTF-8 string
to Windows native path.
Closes rime#804

Fixes rime/weasel#576
Fixes rime/weasel#1080

BREAKING CHANGE: `installation.yaml` should be UTF-8 encoded.

Previously on Windows, the file could be written in the local encoding to
allow paths with non-ASCII characters. It should be updated to UTF-8
after this change.
@lotem lotem force-pushed the path-fix branch 3 times, most recently from f588ef6 to 88de798 Compare February 5, 2024 11:41
@lotem lotem merged commit 6546689 into rime:master Feb 6, 2024
9 checks passed
fxliang added a commit to fxliang/weasel that referenced this pull request Feb 6, 2024
lotem added a commit to rime/weasel that referenced this pull request Feb 8, 2024
graphemecluster pushed a commit to TypeDuck-HK/librime that referenced this pull request Mar 18, 2024
refactor: convert path to native encoding on Windows

feat(rime_api): provide secure version of path getter functions `RimeApi::get_*_dir_s`.
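
One plausible reading of "secure" here is that the `_s` getters copy the directory into a caller-owned buffer instead of returning a pointer into internal storage. The sketch below illustrates that pattern only; `copy_dir_s` and its parameters are hypothetical and are not taken from `rime_api.h`.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not librime's actual code): copy a UTF-8 directory
// string into a caller-owned buffer. The result is always null-terminated
// and is truncated if the buffer is too small. Truncation is byte-wise, so
// it may cut a multi-byte UTF-8 sequence; callers should size the buffer
// generously.
inline void copy_dir_s(const std::string& utf8_dir, char* buffer,
                       std::size_t buffer_size) {
  if (buffer == nullptr || buffer_size == 0) return;
  const std::size_t max_len = buffer_size - 1;  // reserve room for '\0'
  const std::size_t n = utf8_dir.size() < max_len ? utf8_dir.size() : max_len;
  std::memcpy(buffer, utf8_dir.data(), n);
  buffer[n] = '\0';
}
```

With this shape the caller controls the lifetime of the returned bytes, which matters once the directories are stored as path objects and any UTF-8 string has to be produced on demand.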

Follow @fxliang 's PR, use `u8path` on Windows to convert UTF-8 string
to Windows native path.

Closes rime#804
Fixes rime/weasel#576
Fixes rime/weasel#1080

BREAKING CHANGE: Most `string` filenames in APIs are changed to `path`;
`installation.yaml` should be UTF-8 encoded.

Previously on Windows, the file could be written in the local encoding to
allow paths with non-ASCII characters. It should be updated to UTF-8
after this change.

Details of the code refactor

Wrap `std::filesystem::path` in a thin wrapper class `rime::path` which calls `std::filesystem::u8path` in the constructor on Windows.

Operators `/=` and `/` are also overloaded to convert the right operand from a UTF-8 string to a native path.
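
A minimal sketch of what such a wrapper could look like, assuming C++17 (`std::filesystem::u8path` is deprecated in C++20). The namespace `sketch` and the exact set of overloads are illustrative assumptions, not the merged `rime::path` code.

```cpp
#include <filesystem>
#include <string>
#include <utility>

namespace sketch {  // hypothetical namespace; the real class is rime::path

class path : public std::filesystem::path {
  using fs_path = std::filesystem::path;

 public:
  path() = default;
  // An existing native path needs no re-encoding.
  path(const fs_path& p) : fs_path(p) {}
  path(fs_path&& p) : fs_path(std::move(p)) {}

#ifdef _WIN32
  // On Windows, treat the bytes as UTF-8 and convert to the native
  // wide-character representation.
  explicit path(const std::string& utf8)
      : fs_path(std::filesystem::u8path(utf8)) {}
#else
  // Elsewhere the narrow native encoding is assumed to be UTF-8 already.
  explicit path(const std::string& utf8) : fs_path(utf8) {}
#endif
  explicit path(const char* utf8) : path(std::string(utf8)) {}

  // Appending goes through the UTF-8 aware constructors above.
  path& operator/=(const path& rhs) {
    fs_path::operator/=(static_cast<const fs_path&>(rhs));
    return *this;
  }
  path& operator/=(const std::string& utf8) { return *this /= path(utf8); }
  path& operator/=(const char* utf8) { return *this /= path(utf8); }

  friend path operator/(const path& lhs, const path& rhs) {
    return path(lhs) /= rhs;
  }
  friend path operator/(const path& lhs, const std::string& utf8) {
    return path(lhs) /= path(utf8);
  }
  friend path operator/(const path& lhs, const char* utf8) {
    return path(lhs) /= path(utf8);
  }
};

}  // namespace sketch
```

The string constructors are `explicit` so that a bare `std::string` cannot silently become a path; every conversion from UTF-8 is spelled out at the call site.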

Follow these rules to apply correct conversion between `string` and `rime::path`:

- construct `rime::path` with UTF-8 encoded string;
- get the native-encoded string with `path::string`;
- to extract UTF-8 string from `path`, for example to find schema ID from file name, call `path::u8string`;
- avoid implicit conversion from string, which results in `std::filesystem::path` without performing UTF-8 to native conversion;
- explicitly construct `rime::path` from `std::filesystem::path` before append operation, to ensure the overloaded operator with string conversion is used.
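
To illustrate the UTF-8 extraction rule, here is a small sketch that recovers a schema ID from a file name. It uses plain `std::filesystem` so that it stands alone; `to_utf8` and `schema_id_from` are hypothetical helpers, and the `.schema.yaml` suffix is assumed from Rime's usual naming of schema files.

```cpp
#include <filesystem>
#include <string>

// u8string() yields std::string in C++17 and std::u8string in C++20, so
// copy the bytes through iterators to get a std::string in either case.
inline std::string to_utf8(const std::filesystem::path& p) {
  const auto u8 = p.u8string();
  return std::string(u8.begin(), u8.end());
}

// Hypothetical helper: return the part of the file name before
// ".schema.yaml", or an empty string if the name does not match.
inline std::string schema_id_from(const std::filesystem::path& file) {
  const std::string suffix = ".schema.yaml";
  const std::string name = to_utf8(file.filename());
  if (name.size() > suffix.size() &&
      name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0) {
    return name.substr(0, name.size() - suffix.size());
  }
  return std::string();
}
```

Going through `u8string` rather than `string` keeps the extracted ID in UTF-8 regardless of the native encoding, which matches how the ID is compared against UTF-8 strings elsewhere.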
Development

Successfully merging this pull request may close these issues.

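Bug: Dictionary file names do not support Chinese characters
When the sync folder path contains Chinese characters, files are actually synced to an unexpected path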