[YouTube] Fix parsing short relative date formats (English only) #1068
59ba106 to 019a6a9
LGTM, but what about the other languages? I'd guess we need to update them, too. Should we ask the community for help here?
extractor/src/main/java/org/schabi/newpipe/extractor/localization/TimeAgoParser.java
Parsing YouTube data in other languages is currently disabled, and enabling it would require more extensive changes to the dictionary and the parser. I have implemented (and tested) a parser that works with all languages. Here is the dictionary for it: https://code.thetadev.de/ThetaDev/rustypipe/src/branch/main/testfiles/dict/dictionary.json

Also note that there are some special cases that have to be handled separately (e.g. in French).
Yes, but clients can also use the timeago parser for something other than YouTube, even if YouTube is its main purpose. Is the data you provided extracted from YouTube? Would you like to add support for other languages here, using an approach similar to yours? I think we should have a parser that separates date units (seconds, hours, days, ...), digits (1, 2, 3, ...) and number units (tens, hundreds, thousands, ...) for each language we want to support.
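To make the proposed separation concrete, here is a rough sketch of what a per-language dictionary with distinct date units, digits and number units could look like. The class name `TimeAgoDictionarySketch`, its fields and the few English entries are all made up for illustration; this is not the existing `TimeAgoParser` API, and it ignores language-specific special cases.

```java
import java.time.temporal.ChronoUnit;
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch only: names and dictionary entries are assumptions.
final class TimeAgoDictionarySketch {
    final Map<String, ChronoUnit> dateUnits = new HashMap<>(); // e.g. "wk" -> WEEKS
    final Map<String, Integer> digits = new HashMap<>();       // e.g. "two" -> 2
    final Map<String, Integer> numberUnits = new HashMap<>();  // e.g. "hundred" -> 100

    // Splits "1wk" into "1" and "wk"; tokens are runs of digits or runs of letters.
    private static final Pattern TOKEN = Pattern.compile("\\d+|\\p{L}+");

    // A tiny illustrative English dictionary, far from complete.
    static TimeAgoDictionarySketch english() {
        final TimeAgoDictionarySketch d = new TimeAgoDictionarySketch();
        d.dateUnits.put("day", ChronoUnit.DAYS);
        d.dateUnits.put("days", ChronoUnit.DAYS);
        d.dateUnits.put("wk", ChronoUnit.WEEKS);
        d.dateUnits.put("week", ChronoUnit.WEEKS);
        d.dateUnits.put("weeks", ChronoUnit.WEEKS);
        d.digits.put("one", 1);
        d.digits.put("two", 2);
        d.numberUnits.put("hundred", 100);
        return d;
    }

    // Resolves a phrase into (amount, unit); unknown tokens such as "ago" are skipped.
    Map.Entry<Long, ChronoUnit> resolve(final String text) {
        long amount = 0;
        ChronoUnit unit = null;
        final Matcher m = TOKEN.matcher(text.toLowerCase(Locale.ROOT));
        while (m.find()) {
            final String token = m.group();
            if (Character.isDigit(token.charAt(0))) {
                amount += Long.parseLong(token);
            } else if (digits.containsKey(token)) {
                amount += digits.get(token);
            } else if (numberUnits.containsKey(token)) {
                // "hundred" alone means 100; "two hundred" means 2 * 100.
                amount = (amount == 0 ? 1 : amount) * numberUnits.get(token);
            } else if (dateUnits.containsKey(token)) {
                unit = dateUnits.get(token);
            }
        }
        if (unit == null) {
            throw new IllegalArgumentException("No date unit found in: " + text);
        }
        return new AbstractMap.SimpleImmutableEntry<>(Long.valueOf(amount), unit);
    }
}
```

Keeping the three maps separate means adding a language would mostly be a matter of supplying data rather than changing parser code, though special cases would still need dedicated handling.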
Did you mean the verb [...]?
This is the French term in question (5 years ago):
I currently have a special case for the French language which checks if the string ends with [...].

The data from the parsing dictionary is a combination of data extracted from YouTube and the CLDR repository.
Merging this PR as is, since supporting short time units in other languages would require overhauling the timeago parser system. Thanks for the fix!
I added support for the new, short date format (e.g. 1wk ago).
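For readers unfamiliar with the format, the following is a minimal, standalone sketch of how short English relative dates such as `1wk ago` could be parsed. It is not the code changed in `TimeAgoParser.java`; the class `ShortTimeAgoSketch` is hypothetical, and only the `wk` abbreviation comes from this PR. The other abbreviations in the map are assumptions about what the short format might contain.

```java
import java.time.OffsetDateTime;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the actual NewPipe Extractor implementation.
final class ShortTimeAgoSketch {
    private static final Map<String, ChronoUnit> UNITS = new HashMap<>();
    static {
        UNITS.put("wk", ChronoUnit.WEEKS); // the example given in this PR
        // The abbreviations below are assumptions made for illustration.
        UNITS.put("sec", ChronoUnit.SECONDS);
        UNITS.put("min", ChronoUnit.MINUTES);
        UNITS.put("hr", ChronoUnit.HOURS);
        UNITS.put("day", ChronoUnit.DAYS);
        UNITS.put("mo", ChronoUnit.MONTHS);
        UNITS.put("yr", ChronoUnit.YEARS);
    }

    // A number, optional whitespace, a unit token, then "ago".
    private static final Pattern SHORT_FORM =
            Pattern.compile("(\\d+)\\s*([a-z]+)\\s+ago");

    // Returns the point in time that lies the parsed amount before `now`.
    static OffsetDateTime parse(final String text, final OffsetDateTime now) {
        final Matcher m = SHORT_FORM.matcher(text.toLowerCase(Locale.ROOT));
        if (!m.find()) {
            throw new IllegalArgumentException("Not a relative date: " + text);
        }
        final long amount = Long.parseLong(m.group(1));
        final String token = m.group(2);
        ChronoUnit unit = UNITS.get(token);
        if (unit == null && token.endsWith("s")) {
            // Accept simple plurals such as "days" or "yrs".
            unit = UNITS.get(token.substring(0, token.length() - 1));
        }
        if (unit == null) {
            throw new IllegalArgumentException("Unknown unit: " + token);
        }
        return now.minus(amount, unit);
    }
}
```

Passing `now` as a parameter rather than reading the clock inside the method keeps the sketch deterministic and easy to test.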
Fixes #1067