Ambiguous TV shows such as "Wilfred.US" wreck title parsing #75
Comments
I've mostly ignored the TV show parsing; if you want to improve it, feel free. I think I fixed a similar issue for movies by looking for the movie year and assuming that everything before it was the title. I'm sure something similar can be done for TV.
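As a rough sketch of how the "everything before the marker is the title" idea could carry over to TV, the snippet below treats the first SxxEyy token as the boundary instead of the year. The function name `titleBeforeEpisodeMarker`, the dot-separated release name and the SxxEyy pattern are assumptions made for illustration; this is not the library's actual implementation.

```javascript
// Hypothetical helper, not part of the library: everything before the first
// SxxEyy token is taken as the show title, mirroring the movie-year approach.
function titleBeforeEpisodeMarker(releaseName) {
  const tokens = releaseName.split('.')
  // Assumed episode marker format: S01E01, s1e2, S01E100, ...
  const markerIndex = tokens.findIndex((token) => /^S\d{1,2}E\d{1,3}$/i.test(token))
  // No marker, or marker in first position: there is no title to extract.
  if (markerIndex <= 0) return null
  return tokens.slice(0, markerIndex).join(' ')
}

// Illustrative release name, not taken from a real release.
console.log(titleBeforeEpisodeMarker('Wilfred.US.S01E01.720p.HDTV.x264-GRP'))
```

Note that this still leaves the country code inside the title, which is exactly the ambiguity this issue is about.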
I guess there is a small possibility of a corner case, something like:
But then, why would a show title end with "the" (given that the default is to strip the US country code at the end)? That is something you could check for if it ever became a thing, which is unlikely, but the chance is never zero. Assuming anything about titles in the first place is flaky at best; there is always the possibility of another weird title. It kind of sucks that the scene does it like this, because there is no way to discern whether the code is actually part of the title, except for small clues such as the case of
I wrote some logic for this that I think makes sense. You should be able to tell from it what I consider the most reasonable way to handle this. If the next-to-last word is not "the", the trailing country code is stripped:
/**
* @param {SceneTags} scenetags
*/
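function stripTVShowCountry(scenetags) {
  const lastElement = -1
  const words = scenetags.title.split(' ')
  if (scenetags.type === 'tvshow' &&
    // Never strip a title that consists of a single word
    words.length > 1 &&
    // Anchored, so only a whole-word, upper-case country code matches
    /^(?:US|UK|NZ|AU|CA)$/u.test(words.at(lastElement)) &&
    // "... the US" is treated as part of a real title and left alone
    words.at(lastElement - 1) !== 'the'
  ) {
    scenetags.title = words.slice(0, lastElement).join(' ')
  }
  return scenetags
}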
// Ends with country
console.log("Ends with country")
console.log(stripTVShowCountry({title: 'Wilfred US', type: 'tvshow'}))
console.log(stripTVShowCountry({title: 'Oy mate Crocodile Hunter AU', type: 'tvshow'}))
console.log()
// Ends with an actual country, next-to-last word is "the". Concludes it's a real title
console.log("Ends with country, next-to-last is 'the'. Concludes it's a real title")
console.log(stripTVShowCountry({title: 'Soldiers in the US', type: 'movie'}))
console.log(stripTVShowCountry({title: 'Food in the US', type: 'tvshow'}))
console.log(stripTVShowCountry({title: 'Queen of the UK', type: 'tvshow'}))
Hello.
Nice lib, but I found one issue that I think needs to be fixed. I'll gladly help as long as we can agree on the issue.
For example, Wilfred exists as both an AU and a US show.
AU (first released, 2007)
https://www.themoviedb.org/tv/3297
US (2011)
https://www.themoviedb.org/tv/39525-wilfred
This means that the title is now parsed as "Wilfred US".
It would be a safe assumption that any tag that is a capitalized country code (US|UK|AU|NZ|CA) marks an ambiguous title and narrows it down to the specific show in the respective country.
Of course, on rare occasions some title could be something like "Toys.R.Us", but it is unlikely that it would be capitalized. If so, that's a real corner case not worth optimizing for!
https://scenerules.org/html/2020_WDX_unformatted.html
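To make the proposal concrete, here is a minimal sketch of how a capitalized country tag could be split off the parsed title while leaving "Toys R Us" alone. The name `splitCountryTag`, the `country` field and the assumption that the tag is the last word of an already space-separated title are all hypothetical and only serve the illustration; none of them come from the library.

```javascript
// Hypothetical helper for illustration; not part of the library's API.
const COUNTRY_CODES = ['US', 'UK', 'AU', 'NZ', 'CA']

function splitCountryTag(title) {
  const words = title.split(' ')
  const last = words[words.length - 1]
  // Case-sensitive comparison: "US" counts as a country tag, "Us" does not.
  if (words.length > 1 && COUNTRY_CODES.includes(last)) {
    return { title: words.slice(0, -1).join(' '), country: last }
  }
  // No country tag found: return the title unchanged.
  return { title, country: null }
}

console.log(splitCountryTag('Wilfred US'))
console.log(splitCountryTag('Toys R Us'))
```

Keeping the country as a separate value, rather than discarding it, would still allow a lookup to pick the right show when the same title exists in several countries.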