[Partial] PostgreSQL support for magneticod #214
Conversation
I might take some time out of my days to implement the magneticow part, as I do need it.
@kescherCode Nice. But you'd better ask @boramalper if he's going to merge it at all :)
@skobkin That's totally up to him :) I just really would like to get rid of this huge SQLite file ;)
@skobkin I have just set up your branch on my machine and got this error after 96 torrents were successfully added to the DB:
Encoding issues, hooray! It looks like some filenames will trip up Postgres. I wonder if we should use an encoding other than UTF-8 in the PostgreSQL database? Edit: upon further investigation, the torrent in question contains Chinese characters. That should not be a problem for UTF-8, however.
@kescherCode I just forgot to add non-Unicode checking for files. I'll try to fix it today.
I think it's better to drop non-Unicode torrents, because we can have problems with them in other applications as well.
Fixed in 27c5446.
I gave it one more thought and... I still don't think that allowing non-Unicode file names is a good idea. It would probably crash a torrent client or cause filesystem problems.
Yep. But it's not good for you to waste your time if it will not be merged and you're not planning to maintain a fork. If so, it's better to ask @boramalper on Gitter yourself.
Oh, of course I'd merge! I'll check the PR this week when I have some time. =)
Magnetico never downloads files, so it never uses a torrent's file paths with the filesystem. If that torrent exists, then somebody uses it, and so there is a filesystem and a client able to handle downloading it.
I disagree. There are a lot of malformed torrents out there that intentionally try to mess up clients, made mainly by anti-piracy activists.
It's logically incorrect: it's probable, but not necessary. Also, according to BEP 3, which describes the torrent metadata format:
I don't think that we must support DHT messages which don't follow the same spec.
I also added
Moved my own magnetico-web project to PostgreSQL with this engine. Compared to SQLite, it's faster and more convenient to work with. I also migrated all of my magnetico-python data to the new database with my migration tool. If you want to test this engine, you can try my fork for now using the Docker image. Usage examples can be found here until this PR is merged.
Thank you for your hard work, sincerely appreciated! I have left some comments, and I would be glad if you could address the first and the third.
Thank you for your comments too. I've removed the unused method as you requested. But I'm not sure that using
See #218 (comment) for an interesting point of view.
@Glandos What's your point?
I was just bringing attention to a comment in favor of this PR. I personally found the arguments in that comment worth reading.
Ok. I did some things here:
If you were using an old version, note the changes above. I'll rebuild my Docker image and migrate my instance to the new code soon.
Thanks to @skobkin and everyone else for their contributions, much appreciated! =) This has now landed in v0.12.0 (also on Docker Hub), so please let us know your experience with it by opening separate issues.
Could you support PostgreSQL in magneticow as well?
If someone is willing to implement it, sure. I rarely have time these days, and even when I do, magnetico is no longer a priority.
Not right now. I did what I had time for and what I needed personally. Some people only use magneticod. You can check my magnetico-web project, which I use as an alternative web interface for the magnetico database (PostgreSQL). It may or may not require some knowledge of PHP to run it.
I am working on improving my magnetico database merger to support PostgreSQL, especially to migrate from SQLite to PostgreSQL. Is UTF-8 part of the BitTorrent protocol? Or can the name be in any encoding?
And I've tested with magneticow: it happily returns results with those invalid UTF-8 characters replaced. See here.
See the linked comment above. BEP-0003 says:
It doesn't explicitly say that it should also be UTF-8 outside of the torrent file (in a magnet link, for example). But I don't see why we should support data which is not compliant with that spec. If you want, you can check an alternative approach.
Sometimes I should try to find the information myself, especially when it's so easy to find. So thanks for the mention. I think I'll use
https://gitlab.com/skobkin/magnetico-go-migrator 😄 You can also build it with the newer version of the magnetico persistence package, though. UPD: just be aware that it uses
I didn't know about this tool 👍 However, it seems that it performs a simple migration from SQLite to PostgreSQL. My tool is designed to merge databases. I'm not really proud of the code… It tries to be as fast as possible given the constraints, but it's far from good code.
There is not much difference in this case, because magnetico's
With the issues I currently encounter migrating my 42 GB SQLite database to Postgres, I think using
My work on it is not over, but basically, when importing a massive dump, you need to:
With only the insert multi-values phase, I still have an import time of about 24-25 hours.
Yes, kind of. I migrated ~12M torrents too, and it took several hours as I remember. The point of such an approach is that it's native to magnetico, uses the same API, and most likely will not cause broken data or anything. If you're sure that you know what you're doing, then making a more performant solution is of course the better way. A second advantage of using magnetico's API is that you can run the daemon while also importing old torrents into the database; it will still check for duplicates. So if you worry that you might miss some very important torrent, you can use it.
Huh. Maybe your machine is somewhat CPU- or IO-constrained. But still, you only need to do that once, so it shouldn't be a problem. BTW, you can also try to run a multi-threaded or multi-process import, reading batches from different
My merger is ready. It can only merge from SQLite into either PostgreSQL or SQLite. It is quite fast at populating a new PostgreSQL database from SQLite (my current use case), but updating a large PostgreSQL database could be improved, e.g. by using a temporary table. Anyway, I have been running magneticod with PostgreSQL for four days, and I don't see any performance improvement when crawling the DHT. Maybe there are some network restrictions out of my control…
That's because you shouldn't see any: SQLite wasn't the bottleneck for the crawler.
In this PR I implemented a simple database engine for PostgreSQL. It implements pkg/persistence/interface.go only partially, for the magneticod part. I'm not sure if I'll implement the magneticow part (I'm not using it) at all. That's why I decided to share my contribution now. Probably someone can implement the other parts of the interface if needed.