Using beets for Video Library Organization? #1935
Comments
Filebot + ACM with some config is what you need |
Thanks for starting the discussion! I've always thought expanding to cover other media types would be an interesting direction. One major roadblock is that many video file types, last I checked, don't have standard metadata fields. And in any case, it's not common to find tagged video files even with non-standard formats, so we'd need to rely much more on filename heuristics. See also #1160. I can see this going one of two ways:
|
Your second point is interesting, @sampsyo. Could we perhaps move |
Yep, that was always the long-term goal with the |
Filebot has everything you are looking for and more - it integrates perfectly for me between flexget and kodi. |
I can't speak to video, but I just wanted to put in a vote for a beets ebook organizer. There really is not a KISS tool for ebooks. Calibre is, as far as I know, the only Linux software that scrapes and organizes ebooks; however, it's a train wreck, modifying files without warning the user, forcing exactly one organizational scheme, etc. Nothing like the elegance of beets! |
I would be way more interested in being able to use beets to rename/organize video files and generate nfo files for Kodi. Solutions like FileBot are great, but anything that requires Java to run and frequently breaks on updates does not interest me in the least. I'd try to integrate https://github.com/guessit-io/guessit into beets. It does a fantastic job of guessing information based on filename, and the output would probably make a good starting search on tmdb or tvdb. I see the workflow as better suited to a plugin (but I'm not a dev). Man.... with the inline plugin.... this would be so awesome. |
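To make the filename-heuristic idea from the comment above concrete, here is a minimal sketch using only the Python standard library. It is not guessit and does not reproduce its API: the function name `guess_from_filename` and both patterns are assumptions made up for illustration, and they only cover two common naming styles. The resulting dictionary is the kind of output one might feed into a tmdb or tvdb search.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration; real releases use many more schemes.
EPISODE_RE = re.compile(
    r"^(?P<title>.+?)[ ._-]+[Ss](?P<season>\d{1,2})[Ee](?P<episode>\d{1,3})"
)
MOVIE_RE = re.compile(
    r"^(?P<title>.+?)[ ._(\[-]+(?P<year>(?:19|20)\d{2})(?:\D|$)"
)


def _clean(title):
    """Turn dot/underscore separators into spaces and trim stray dashes."""
    return re.sub(r"[._]+", " ", title).strip(" -")


def guess_from_filename(path):
    """Guess basic metadata from a video filename (assumed helper, not a beets API)."""
    stem = Path(path).stem
    match = EPISODE_RE.match(stem)
    if match:
        return {
            "type": "episode",
            "title": _clean(match["title"]),
            "season": int(match["season"]),
            "episode": int(match["episode"]),
        }
    match = MOVIE_RE.match(stem)
    if match:
        return {
            "type": "movie",
            "title": _clean(match["title"]),
            "year": int(match["year"]),
        }
    # Nothing recognized: fall back to the cleaned filename as the title.
    return {"type": "unknown", "title": _clean(stem)}
```

For example, `guess_from_filename("The.Princess.Bride.1987.mp4")` yields a movie entry with the title "The Princess Bride" and the year 1987. A dedicated library would handle far more edge cases than these two patterns do.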
Until beets can manage video files, I wrote flinck, whose goal is to create symlinks to your movie files and organize them by year/genre/whatever ... But it's not the Swiss Army knife for video files that beets is for audio: no database, no renaming, just symlinking really. I released a new version yesterday and would appreciate some feedback. |
@sampsyo I think I'd prefer if video (and eBook?) management was kept in beets rather than split off into separate (even if related) projects. I already do use beets for some (music) video management, so it's not impossible for me to envision. Of course, this might make some feel beets has become bloated, so maybe it'd be possible to "modularise" |
@Freso I concur that opting for modularity would be the way to go. The ability to scrape new media types seems like it belongs in a plugin, prodding the community to create the plugins they desire. Sadly, I realize, eBooks will probably not be the community's highest priority... Not sure if others are aware, but MediaElch works quite well for video. Perhaps some of the scraping work can be borrowed from that. |
Well, the upshot is that modularity is a good idea for lots of reasons! Even boring ones like maintainability that have nothing to do with video or ebooks. So I'm all for it, especially if it helps us build tools that feel engineered for different use cases. |
Somewhat related, but MusicBrainz actually includes video tracks (because some albums include bonus promotional videos and the like). As an initial step toward potential video library support, perhaps MusicBrainz video track support could be added to beets? |
Oh my, I'd love it if beets could also take care of TV shows and/or movies. Sickbeard, SickGear, SickRage, MediaElch, FileBot etc. are all way too heavy and complex, especially if you just want to point the program to a directory with a tv series and have it rename all episodes appropriately. |
I was thinking of Beets as I experimented with https://github.com/perkeep/perkeep -- the intended use cases and workflows are quite different, but having a central store with a flexible metadata system is something both of these systems share. Perkeep could serve as an example of how to provide a number of modular "importers" which produce metadata in a single database. @sampsyo -- how modular is the import flow currently, and how hard would it be to extend to arbitrary file types? |
You're right; there is a certain similarity in philosophies there! I'd be interested to explore this more deeply. To answer your direct question, the importer pipeline is reasonably reusable, although there is a fair amount of music-specific logic mixed in there: mostly surrounding albums that group together individual tracks. One thing that is very abstract, however, is the database layer. Take a look at our |
@sampsyo -- would it be helpful to track this effort as a separate bug? Something like "Modularize the importer and support file types without inline metadata"? Or do you feel this is outside of the scope of what should be supported by the Beets project? I did take a look at dbcore! If one wanted to create a separate tool for importing, setting metadata, and querying over arbitrary local files, it seems like this would be a great place to start. Do you have strong feelings on whether that is the best route? |
Sure; a separate thread sounds good! I guess the way I’d put the project is: let’s make the With hard work in place, I can imagine it going one of two ways: either reuse the same components (dbcore + this new importer module) to make a beets-like tool for video, or just extend beets for other media types in place. I have a less strong feeling about which of those is a better idea, but both seem worth exploring. |
@sampsyo, it looks like the majority of the changes would need to go into beets/library.py or beets/mediafile.py -- LibModel and Library are mostly generic enough and beets/importer.py doesn't seem to know too much about the individual models, but Item and Album are very audio specific. Video items might overlap enough with the fields in Item that it makes sense to support them in beets/mediafile.py, but generic files like text documents, binaries, source files, etc. wouldn't fit very well. One approach would be to add distinct model/database types to beets/library.py for file types which don't have the typical music associated metadata. LibModel/FileItem (any file), MediaItem (common media related fields) VideoItem, AudioItem, AudioAlbum, ImageItem, TextItem, etc. However, the ideal outcome might be to allow defining different media types as plugins so that the end user could choose which sorts of files they want to have in their library. Naively, I could imagine something like:
Does this seem like a reasonable approach? |
Yeah, that would be cool! I like the idea of model types provided by plugins. An inconvenient piece to deal with will be creating and destroying SQLite tables that back these models. I’d be interested to look into a more detailed design for how that would work. |
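As a rough illustration of the table-lifecycle problem mentioned above, the sketch below shows one way plugin-provided model types could create and destroy their backing SQLite tables. Everything here is hypothetical: `ModelType`, `sync_tables`, and the `media_` table prefix are names invented for this example and are not part of beets or dbcore.

```python
import sqlite3


class ModelType:
    """Hypothetical description of a plugin-provided model (not a beets class)."""

    def __init__(self, table, fields):
        self.table = table          # table name without the prefix
        self.fields = dict(fields)  # column name -> SQLite type


def sync_tables(conn, models, prefix="media_"):
    """Create tables for the given models and drop tables of removed ones."""
    wanted = {prefix + m.table: m for m in models}
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    existing = {name for (name,) in rows if name.startswith(prefix)}
    with conn:  # commit all schema changes together
        for name, model in wanted.items():
            cols = ["id INTEGER PRIMARY KEY", "path BLOB"]
            cols += [f'"{col}" {sqltype}' for col, sqltype in model.fields.items()]
            column_sql = ", ".join(cols)
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{name}" ({column_sql})')
        # Dropping a table discards its rows; a real design would likely
        # ask the user or archive the data first.
        for name in existing - set(wanted):
            conn.execute(f'DROP TABLE "{name}"')
    return sorted(wanted)
```

A caller would run `sync_tables` once at startup with the model types of all enabled plugins. Whether disabling a plugin should really drop its table is exactly the kind of design question raised in the comment above; this sketch simply picks the destructive option to show the mechanics.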
Howdy, @sampsyo -- to prove to myself whether a tool like beets is the right one for this job, I threw together a prototype using dbcore for crawling non-music files. I've found that adding items to an on-disk (ext4) database is several orders of magnitude slower than an in-memory one. For an import with only 847 records:
Each file has ~10 (non-flexible) attributes. I'm setting them all with a single model.update() call (which, from a quick code perusal, seems to result in an SQL 'UPDATE' query for each attribute). I was initially setting each attribute one per expression which (due to the parenthetical above) seems to have no impact on performance. Am I using the library incorrectly or is this the expected performance? |
Wow; awesome! Except for the performance. I'm not sure what to "expect" for performance, but that's certainly not good—maybe this would be a good lens to use for performance optimization. Would it make sense to do a little profiling? (If so, may I recommend SnakeViz to explore the data?) |
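For readers who want to try the profiling step suggested above, here is a minimal sketch using `cProfile` from the standard library. The workload (`import_records`) is a stand-in invented for this example, not the prototype discussed in the thread; it commits once per record simply to give the profiler something to measure. The output file name is also an assumption.

```python
import cProfile
import pstats
import sqlite3


def import_records(conn, n):
    """Stand-in workload: one INSERT and one commit per record."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, path TEXT)"
    )
    for i in range(n):
        conn.execute("INSERT INTO items (path) VALUES (?)", (f"/media/file{i}",))
        conn.commit()


def profile_import(n=847, out="import.prof"):
    """Profile the stand-in import and write the stats to `out`."""
    conn = sqlite3.connect(":memory:")
    profiler = cProfile.Profile()
    profiler.enable()
    import_records(conn, n)
    profiler.disable()
    profiler.dump_stats(out)  # this file can be opened with: snakeviz import.prof
    conn.close()
    return pstats.Stats(out)
```

Calling `profile_import().sort_stats("cumulative").print_stats(10)` prints the ten most expensive calls in the terminal; running `snakeviz import.prof` afterwards opens the same data in the browser.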
Was about to open an issue, but luckily found out it is already being worked on! |
@sampsyo, I took it for a spin in snakeviz. Unsurprisingly, the majority of the time is being spent in On the beets side, over 95% of the time is spent in Removing an unnecessary Next improvement was setting the values for the entry at model instantiation time rather than a) instantiating with empty values, b) setting the values by attr or bulk Next area for exploration may be supporting a bulk To summarize, the overall control flow now looks something like this:
|
It's worth noting that each |
Awesome work here. That's sort of good news that we can blame our very inefficient database usage rather than anything running "in Python"! Just to help me track this: where is the inner loop that you're referring to? That's in your own client code, right? (Not in beets itself?) To summarize potential changes from the beets side that you mentioned:
|
The |
For updating/writing, the first one could probably be solved by introducing the unit of work pattern (the trade-off would be more memory to keep track of the objects). An example exists in SQLAlchemy (Session). The models would have a reference to the session (which is bad, IMHO), and one would commit after all operations are done (the default would be to always commit the changes, so as not to break the API, at least initially):
# pseudocode
class Model:
    def store(self, mode='now'):
        # Register this object with the session's unit of work.
        self.session.add(self)
        # Default: commit immediately, preserving the current API.
        if mode == 'now':
            self.session.commit()
|
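The pseudocode above can be fleshed out into a small runnable sketch. This is only an illustration under stated assumptions: the `Session` class, the `items` table, and the `"deferred"` mode name are invented here and are neither SQLAlchemy's nor dbcore's API. The point it shows is that deferred stores are written in a single transaction when the session commits.

```python
import sqlite3


class Session:
    """Hypothetical unit of work: collects models, writes them in one transaction."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = []

    def add(self, model):
        # Track each object once, in the order it was first stored.
        if model not in self.pending:
            self.pending.append(model)

    def commit(self):
        with self.conn:  # one transaction for every pending write
            for model in self.pending:
                cur = self.conn.execute(
                    "INSERT OR REPLACE INTO items (id, path, title) VALUES (?, ?, ?)",
                    (model.id, model.path, model.title),
                )
                model.id = cur.lastrowid
        self.pending.clear()


class Model:
    def __init__(self, session, path, title):
        self.session = session
        self.id = None  # assigned on first commit
        self.path = path
        self.title = title

    def store(self, mode="now"):
        self.session.add(self)
        if mode == "now":  # default keeps the commit-per-store behaviour
            self.session.commit()
```

Client code that imports many files would call `store(mode="deferred")` on each model and then `session.commit()` once at the end, while existing callers of `store()` keep their current semantics.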
@sampsyo, I think the right target to shoot for is that the dbcore overhead should be less than the time it takes to crawl files on the target filesystem. Do you think this is achievable with an SQL data store? |
Here is a really nice article on this topic: https://stackoverflow.com/questions/1711631/improve-insert-per-second-performance-of-sqlite -- they are using the C bindings for sqlite, but I suspect many of the lessons could apply for Python. |
Yeah, that seems like a reasonable goal to at least shoot for. Have you checked, for instance, what the proportion of filesystem to database time is in the optimized version of your current crawler? |
@sampsyo, database time is still over 92% of total runtime. I suspect this is also a significant bottleneck when importing large music libraries. |
Got it. It does seem like this should be achievable in the limit—the main impediment is figuring out the right abstractions to allow clients to express a high-performance treatment of the database. |
@sampsyo -- thinking a bit more, we can do something relatively uninvasive by providing a
Caveat emptor: in order to realize the performance improvements of bulk operations, callers would need to explicitly opt into this use. I threw together a quick prototype of this and I'm seeing total runtime down to less than 5s, with less than 10% of total time spent in dbcore/sqlite3. Some notes about the prototype:
Regardless, I think this is really promising and it's now relatively clear to me that pursuing a single transaction will yield the most significant performance increase. |
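The single-transaction idea can be tried in isolation with the standard `sqlite3` module. The sketch below is not the prototype described in the thread and does not use dbcore; the `files` table and the helper names are assumptions for illustration. It contrasts committing after every record with inserting all records inside one transaction on an on-disk database.

```python
import os
import sqlite3
import tempfile
import time


def insert_each(conn, rows):
    """Commit after every record, so each insert is its own transaction."""
    for row in rows:
        conn.execute("INSERT INTO files (path, size) VALUES (?, ?)", row)
        conn.commit()


def insert_bulk(conn, rows):
    """Insert all records inside a single transaction."""
    with conn:  # committed once when the block exits
        conn.executemany("INSERT INTO files (path, size) VALUES (?, ?)", rows)


def timed(insert_fn, rows):
    """Run insert_fn against a fresh on-disk database; return (row count, seconds)."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "library.db"))
        conn.execute("CREATE TABLE files (path TEXT, size INTEGER)")
        conn.commit()
        start = time.perf_counter()
        insert_fn(conn, rows)
        elapsed = time.perf_counter() - start
        count = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
        conn.close()
    return count, elapsed
```

Running `timed(insert_each, rows)` and `timed(insert_bulk, rows)` with the same list of `(path, size)` tuples gives a quick local comparison. The actual numbers depend heavily on the filesystem and disk, so none are quoted here.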
Yes, absolutely! A bulk insert would be a great way to do it. You could even imagine letting the This sounds awesome. Any chance you can put together a PR for closer review? |
I think I'm on the team that considers this outside the scope of beets, but a fork that deals with videos could be interesting. I think there's also a limited use case for this. As mentioned, there isn't really a good standard for tagging, or a whole lot that's worthwhile to tag or update as time goes on. The biggest advantage, I suppose, would be the database querying, but having an application like beets just to provide a CLI query for your videos seems overkill when the common uses for a query could be implemented through other command-line utilities by parsing an organized video directory, or with GUI applications. What I would recommend for the majority of people is:
If there's still interest, I think the best route, as suggested, is to create a fork of beets focused on videos. I'm going to close this since this doesn't seem like something beets should implement, but feel free to continue discussion here or on discourse. |
I was wondering if there was any interest in adding video library support to beets. I really like the workflow of importing my media and playing it with beet play without the overhead of a full-blown media player. I was curious what would be necessary for beets to implement video support. So far, what I think would be required at a minimum is this:
I don't really think this is in scope for the beets project, and a lot of it won't carry over into video organizing, but I can't seem to find a media organizer for videos that doesn't require a server or a GUI (Plex, Kodi, Emby, etc.), and most of those projects require manual editing of filenames in order for the lookup to succeed.
I was wondering what everyone thought about this functionality being in beets or another project similar to beets. I know I'd find it useful but I'm not too sure if it's worthy of being incorporated into the main project.
Desired Workflow
Ideally this would be the workflow I'm looking to achieve:
beet import The\ Princess\ Bride.mp4
A --as-video flag would be needed for an alternate import method.
Then, after this is complete,
beet play Princess Bride
would start playing that file with the configured media player.