[Feature Request]: Improve Chinese analyzer #1308
Labels
feature request
New feature or request
Comments
yingfeng pushed a commit that referenced this issue on Jun 11, 2024:

Introduced CutGrain for Chinese analyzer

Issue link: #1308

- [x] New Feature (non-breaking change which adds functionality)
- [x] Test cases
yingfeng added a commit that referenced this issue on Jun 11, 2024:

### What problem does this PR solve?

1. Inherit from CommonLanguageAnalyzer instead of Analyzer
2. Return logical offset through CommonLanguageAnalyzer
3. Stemmer could be generated for Latin tokens

Issue link: #1308

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
yuzhichang added a commit that referenced this issue on Jun 12, 2024:

Fix a bug of Chinese phrase query

Issue link: #1308

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Test cases
yuzhichang pushed a commit that referenced this issue on Jun 21, 2024:

### What problem does this PR solve?

1. The Chinese jieba analyzer outputs " " for Latin tokens.
2. The standard analyzer produces discontinuous output if a delimiter exists between tokens.

Issue link: #1308

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
yingfeng added a commit that referenced this issue on Jun 21, 2024:

### What problem does this PR solve?

Use modified jieba query segmentation for fine grained Chinese analyzer.

Issue link: #1308

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
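For readers unfamiliar with jieba's query (search-mode) segmentation that this commit builds on, the idea can be sketched as follows. This is a minimal Python illustration of the stock jieba approach, not the modified version used in the commit and not the project's own code. The function name `fine_grained` and the toy dictionary `TOY_DICT` are hypothetical, introduced only for this example.

```python
# Toy dictionary standing in for a real word-frequency dictionary (assumption).
TOY_DICT = {"中国", "科学", "学院", "科学院", "中国科学院"}


def fine_grained(coarse_tokens, dictionary):
    """Expand coarse tokens into finer ones, in the style of jieba's search mode.

    For each coarse token, emit every 2-character and then every 3-character
    substring that is a dictionary word, followed by the coarse token itself.
    """
    result = []
    for word in coarse_tokens:
        for n in (2, 3):
            # Only split words that are strictly longer than the gram size.
            if len(word) > n:
                for i in range(len(word) - n + 1):
                    gram = word[i:i + n]
                    if gram in dictionary:
                        result.append(gram)
        result.append(word)
    return result


if __name__ == "__main__":
    print(fine_grained(["中国科学院"], TOY_DICT))
    # ['中国', '科学', '学院', '科学院', '中国科学院']
```

Emitting the sub-words alongside the full word lets a query for a short term such as 科学 match a document that contains only the longer compound, at the cost of a larger index.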
Is there an existing issue for the same feature request?
Describe the feature you'd like
Current Jieba analyzer for Chinese has several problems: