Limit re-fetches of the Zotero library #58
Comments
That would be great; the performance of that connection, indeed, leaves room for improvements. ;) Currently, the bib-file exposed by BBT is only read on-demand if the user connects to BBT for the first time or subsequently requests to reload the library (e.g., because of modified or added references). This gif may give you a rough idea (notice the Is it possible that the multiple requests you are seeing target different libraries (main library and group libraries)? Full disclosure, I'm currently exploring whether a different approach to searching Zotero may be better (in short, BBT CAYW search, |
The behavior described here sounds reasonable, and in that case I'd see no reason to change anything, but @jrennstich describes clicking The problem for him is exacerbated by a yet-unfixed problem that full library requests take unreasonably long -- this is on me to fix, but his computer took an unfortunate moment (always unfortunate for @jrennstich of course, but unfortunate in the sense that I don't like having open unsolved problems on my plate) to demand repairs. WRT speeding up bib access -- I've done some recent (5.2.X) performance work that should make fetches substantially less painful, but perhaps not enough for your use case. BBT exports are relatively heavyweight, and even with a fully filled cache, 24k items take 10-15 seconds to lay out on disk. pandoc-zotxt should work, I think. I can't see why I'd object to this -- BBT is good at solving some problems, not others, and I hold no illusions about how speedy it is 🙄 . Another option would be to expose an endpoint where citr can test whether an auto-export has been set up for a specific path, and set one up if not. That would fully decouple the two while keeping the cooperation in place; the potential problem is that you would have to detect when the file on disk changes. The write to the file by BBT is atomic (I write to a temp file, and once done it is renamed to the target), so you would not get partial results, but still. OTOH, in that connect screen it shouldn't be too hard to detect that the file time has been updated since the last check. |
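For readers unfamiliar with the write-to-temp-then-rename pattern mentioned above, here is a minimal Python sketch of the general technique. It is not BBT's actual code (BBT is not written in Python); the function name `write_atomically` and the `.tmp` suffix are assumptions made for illustration only.

```python
import os
import tempfile


def write_atomically(target_path: str, content: str) -> None:
    """Write content so that readers see either the old file or the new one, never a partial file.

    Hypothetical helper illustrating the pattern; not taken from BBT.
    """
    directory = os.path.dirname(os.path.abspath(target_path))
    # The temp file lives in the target's directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the rename
        # os.replace swaps the finished file into place in a single step.
        os.replace(tmp_path, target_path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A reader that opens the target at any moment gets a complete export, which is why a consumer only needs to notice that the file changed, not whether it is finished.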
Hmm, I'll have to dig into this. Unfortunately, I'm completely swamped right now and won't get around to it before April.
This also sounds like a useful way to decouple reloading the bibliography from the addin. Checking when the file changed on disk should be easy enough. Do you think an additional speed-up could be gained from supporting CSL JSON rather than relying on BibTeX, as suggested in #59? |
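The "check when the file changed on disk" idea could look roughly like the Python sketch below. This is only an illustration under stated assumptions: citr itself is an R package, and the class name `BibFileCache`, the `parse` callback, and the `reload_count` counter are all hypothetical. The cache re-reads the exported file only when its modification time or size differs from the last read.

```python
import os
from typing import Callable, Optional, Tuple


class BibFileCache:
    """Re-parse an exported bibliography only when the file on disk has changed.

    Hypothetical sketch; not citr's or BBT's actual implementation.
    """

    def __init__(self, path: str, parse: Callable[[str], object]) -> None:
        self.path = path
        self.parse = parse  # assumed: any function that turns file text into entries
        self._stamp: Optional[Tuple[int, int]] = None
        self._data: object = None
        self.reload_count = 0  # only here to make the behavior observable

    def get(self) -> object:
        stat = os.stat(self.path)
        # Modification time plus size; size helps when mtime resolution is coarse.
        stamp = (stat.st_mtime_ns, stat.st_size)
        if stamp != self._stamp:
            with open(self.path, encoding="utf-8") as handle:
                self._data = self.parse(handle.read())
            self._stamp = stamp
            self.reload_count += 1
        return self._data
```

Because the stamp is taken before the read, a file replaced in between is at worst parsed once more on the next call; it is never silently missed.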
I thought CSL JSON was going to easily beat the TeX export formats on speed, but that turns out to be false at the moment. For context, my CSL exporters do barely anything but re-use the existing Zotero CSL converters, but the combination looks to be slower than BBT TeX, which is strange, because the cold-cache version does a lot less than the TeX formats, and the hot-cache scenario should simply be the same, roughly. In any case, there's still benefits to using CSL:
Simple, non-scientific test: export of 24k items:
I'm going to look into the performance problem with CSL. This should not be the case. |
Thanks for the benchmark, that's interesting. I agree using JSON would avoid lossy conversion between formats. I currently use BibTeX because it works with |
We've been able to implement some substantial speedups in retorquere/zotero-better-bibtex#1389; I'm doing some tidying up, and then I'll cut a new release in the next few days. But I'm still open to creating an endpoint that citr can talk to in order to set up an auto-export automatically. It would also be possible to create an endpoint to query for collections, so that the entire library does not need to be fetched; that would net a performance benefit but would make the UI on the citr side more involved. |
I don't mean to be (too) pedantic about this, but that's a flexibility win at the cost of a quality loss. |
The CSL performance issue has been fixed in 5.2.16. |
Thanks, I'll take a look the next chance I get! |
It looks like the number of times citr requests the full library from BBT can be optimized. For large libraries this should yield a performance improvement.
I'm open to adding an endpoint in BBT that would allow testing whether the library has changed since the last fetch, but to do this effectively, I must understand what triggers a re-read of the BBT-produced bib file, and whether it's cached on the citr end.
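To make the "has the library changed since last fetch" proposal concrete, here is a hedged Python sketch of the client side. No such BBT endpoint is confirmed in this thread, so both calls are injected as plain functions: `fetch_version` stands in for a cheap change check and `fetch_library` for the expensive full export. All names here are hypothetical.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class ConditionalFetcher(Generic[T]):
    """Fetch the full library only when a cheap version check reports a change.

    Hypothetical sketch; the two callables stand in for endpoints that do not
    necessarily exist in BBT.
    """

    def __init__(
        self,
        fetch_version: Callable[[], str],
        fetch_library: Callable[[], T],
    ) -> None:
        self._fetch_version = fetch_version  # assumed cheap call
        self._fetch_library = fetch_library  # assumed expensive call
        self._cached_version: Optional[str] = None
        self._library: Optional[T] = None
        self._has_cache = False

    def get(self) -> T:
        version = self._fetch_version()
        if not self._has_cache or version != self._cached_version:
            # Only now pay for the full export.
            self._library = self._fetch_library()
            self._cached_version = version
            self._has_cache = True
        return self._library  # type: ignore[return-value]
```

With this shape, repeated requests against an unchanged library cost one cheap check each, and the 10-15 second full export is paid only after an actual change.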