import: use filename sorting order information #2738

Closed
Vrihub opened this issue Nov 12, 2017 · 6 comments
Labels
needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature."

Comments

@Vrihub
Contributor

Vrihub commented Nov 12, 2017

Problem

Sometimes when I import an album, beets correctly identifies all tracks except for a couple that get swapped by mistake, presumably because they have similar lengths.

The import scenario where this happens is quite special:

  • I run import on a directory whose files all belong to the same album: they come from ripping one of my CDs
  • The files don't have any tags, so I always "Enter search" first and choose a candidate
  • The files have names that respect the track order in the album (e.g. 01.mp3, 02.mp3, etc)

In this scenario, I guess all the information that beets has is:

  1. The album candidate that I chose from the results of the search
  2. The number of files in the directory and their length
  3. The file names, and hence their alpha/numerical sorting order

The only way I can explain the swapping error is that currently beets only uses 1. and 2. to match files with tracks, and files/tracks with similar length can be swapped by mistake. If this is correct, I'd like beets to also use 3., i.e. the filename sorting, when it can be helpful to avoid swapping tracks.

I'm not sure what the best way to implement this is:

  • A command-line option could be added to import to force beets to use filename sorting order instead of the default logic for matching tracks. I would use that option in a scenario like the one above, and use the default logic when I run import on directories containing "random" files

  • The default matching logic could be extended to also use the filename sorting order "when in doubt" (e.g. in choosing between similar tracks), though I'm not sure how difficult it is
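The suspected failure and the second proposal can be illustrated with a small sketch. This is not beets' actual matching code: `match_by_length` and `order_weight` are hypothetical names, the brute-force search over permutations is only practical for a handful of tracks, and the lengths are made up. It only shows how a length-only cost can prefer a swapped assignment, and how a small penalty for breaking the filename order resolves the tie.

```python
from itertools import permutations


def match_by_length(file_lengths, track_lengths, order_weight=0.0):
    """Return the cheapest assignment as a tuple: file index -> track index.

    Files are assumed to be listed in filename sorting order.
    Cost = sum of absolute length differences, plus `order_weight` for
    every file whose assigned track position differs from its own position.
    """
    n = len(file_lengths)
    best, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        cost = sum(abs(file_lengths[i] - track_lengths[perm[i]]) for i in range(n))
        cost += order_weight * sum(1 for i in range(n) if perm[i] != i)
        if cost < best_cost:
            best, best_cost = perm, cost
    return best


# Lengths in seconds: files 02 and 03 are both close to tracks 2 and 3.
files = [200.0, 240.6, 240.4]
tracks = [200.0, 240.0, 241.0]

length_only = match_by_length(files, tracks)
with_order = match_by_length(files, tracks, order_weight=1.0)
```

With these numbers, the length-only cost is slightly lower for the swapped assignment (0.8 versus 1.2), while the order penalty makes the in-order assignment win.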

@sampsyo sampsyo added the needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." label Nov 12, 2017
@sampsyo
Member

sampsyo commented Nov 12, 2017

That's a great summary of a specific situation!

It might be useful to take a look at the fromfilename plugin. The idea is to use titles and numbers from filenames to aid matching. It's safe to turn on because it only kicks in as a backup. I'm not sure, however, if it works if the filenames only have track numbers—it would be worth experimenting.

@Vrihub
Contributor Author

Vrihub commented Nov 12, 2017

Well, in fact I already use fromfilename, but it doesn't seem to help in this scenario, though I'm not really sure if the reason is:

  • because it fails to extract and use the track numbers from the filenames
  • because it only operates before making a search (i.e. trying to extract keywords from the file names), as its documentation says (I haven't looked at the code)

@sampsyo
Member

sampsyo commented Nov 13, 2017

Aha! I would put my money on the former. It would be worthwhile to do some investigation—it seems like extending that plugin is probably the right way to go.

@Vrihub
Contributor Author

Vrihub commented Dec 11, 2017

OK, I fixed the fromfilename plugin to also extract track numbers from "trivial" filenames like 01.mp3, which were ignored until now. See: #2759

I've tested it in the scenario I described in my first message and it works as intended: I can enter a manual search and choose the best candidate, but beets now remembers the track numbers discovered by the fromfilename plugin, so it no longer makes mistakes by arbitrarily swapping tracks of similar length.
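As a rough illustration of what "trivial" filenames means here, the following sketch extracts a track number from a name that consists only of digits. The pattern and the helper `track_from_filename` are hypothetical and are not copied from the plugin or from #2759.

```python
import os
import re

# Hypothetical pattern (an assumption, not the plugin's regexp):
# the base name, without extension, is made only of digits.
TRIVIAL_TRACK = re.compile(r"^(?P<track>\d+)$")


def track_from_filename(path):
    """Return the track number encoded in a name like '01.mp3', or None."""
    base = os.path.splitext(os.path.basename(path))[0]
    match = TRIVIAL_TRACK.match(base)
    return int(match.group("track")) if match else None
```

Names such as "Track 01.mp3" are deliberately not matched by this sketch, which mirrors the open question in note 1 below about whether they need special treatment.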

Notes:

  1. I can't rule out subtle side effects from the interaction between the final regexp I added to the PATTERNS list and the previous regexps. I'm also not sure whether filenames like "Track 01" etc. still deserve special treatment. Please review!

  2. I suspect the whole list of regexps in PATTERNS could be rewritten and made shorter and more flexible, e.g. also allowing for "_" as a separator between fields, while currently only " " or " - " are supported. If you agree I can try to come up with some suggestions.
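One possible direction for note 2, sketched under assumptions: a single pattern whose separator accepts " ", " - ", or "_". The names `SEP`, `FLEXIBLE`, and `parse_name` are hypothetical, and the sketch covers only the "track, separator, title" shape, not everything the current PATTERNS list handles.

```python
import re

# Hypothetical separator: a dash with optional spaces, underscores, or spaces.
SEP = r"(?:\s*-\s*|_+|\s+)"
# Hypothetical combined pattern: track number, separator, then the title.
FLEXIBLE = re.compile(r"^(?P<track>\d+)" + SEP + r"(?P<title>.+)$")


def parse_name(base):
    """Parse a base name (no extension) into (track, title), or None."""
    match = FLEXIBLE.match(base)
    if match is None:
        return None
    return int(match.group("track")), match.group("title")
```

A bare "01" does not match here because the separator and title are required, so digit-only names would still need their own pattern.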

@Vrihub
Contributor Author

Vrihub commented Dec 21, 2017

For the record: please remember that everything discussed so far only applies to files that lack the "title" tag. The fromfilename plugin only operates if the "title" tag is missing.

If your files contain garbage (title) tags and you'd like to override the garbage with the information inferred from the file name, at the moment the only option is to delete the title tag using some other tool before importing the file (beets' from_scratch, zero and scrub features are not useful in this context because they operate later in the import process).

@sampsyo
Member

sampsyo commented Dec 22, 2017

Yes, good point! It would be interesting to think about alternatives, but it's not 100% clear how this should work. For example, you could imagine a new config option for the fromfilename plugin that overrides the tags, but you'd need to toggle this on and off depending on how poorly ordinary tag-based importing goes…
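A minimal sketch of how such an override could behave, assuming tags are represented as plain dictionaries. The function `apply_filename_info` and its `override` flag are hypothetical; no such option is confirmed to exist in the plugin.

```python
def apply_filename_info(tags, inferred, override=False):
    """Return a copy of `tags`, optionally updated from the filename.

    Default behaviour follows the rule described above: filename-derived
    values are used only when the "title" tag is missing. With
    override=True (a hypothetical option) they replace existing values.
    """
    if tags.get("title") and not override:
        return dict(tags)
    merged = dict(tags)
    merged.update(inferred)
    return merged


garbage = {"title": "AudioTrack 07"}
from_name = {"title": "Intro", "track": 1}

kept = apply_filename_info(garbage, from_name)
replaced = apply_filename_info(garbage, from_name, override=True)
```

The difficulty raised above remains: something still has to decide, per import, whether the existing tags are bad enough to justify turning the override on.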
