-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add "Dreyfus" (Erlang side of Search) to CouchDB #2037
Conversation
This is the first public release of Dreyfus under the Apache Software License, version 2.
When loading design docs with the ID, we want to add ejson_body to the list of options so the body is converted to term from binary. This makes dreyfus compatible with COUCHDB2.0 BugzID:50663
Add ejson_body
Previous if the anonymous opening process died it would cause dreyfus_index_manager to exit and kill all other dreyfus_index processes. This lead to bad things happening in multiple places. This fix just tracks the anonymous opener processs in the ?BY_PID table so we know about it when it dies. The exit status is then reused the same as would have been for an error returned when opening the index. Fixes: apache#2
Handle errors when opening an index
BugzID: 52884
Use couch_crypto
Update to new couch_epi API
This metric is different from what we track in clouseau. This basically will track the overall time it took for the search request, where as the one in Clouseau will only track the search latency at the shard level. This will come in handy for tracking the search latency issues as seen by the end users and also allow us to add pager triggers based on this.
…fus-httpd-search Add new metric to track the "search request" latency.
…ponse Fix http handler response
…-crash Don't crash on invalid ddoc
Rexi receives an error message that does not match any of the tuples for the first argument of dreyfus_fabric:handle_error_message/6. This leads to a function_clause error that bubbles up to chttpd which gets logged because there is a stack trace involved. By adding an extra clause that matches all error messages, we attempt to see if progress can be made by calling handle_error. If progress can succeed, then nothing gets logged. If the progress cannot succeed, an error will be sent back to the user again via chttpd:send_error/2. However, if no stack trace is involved in the error, we won't log the message. Adding an extra couch_log:error here assures we have something in the logs to make note of the error message to help us debug. BugzId: 75421
…ogging Improve Search Logging
If a database is deleted very soon after creation, it's possible for the open_int call to return {not_found, no_db_file}. Expect this and error cleanly out of gen_server initialisation. BugzID: 77650
Tolerate open_int failure
Counters were not stored as proper orddict (in a sorted manner). Because of this when some Workers replied, they could not be found in Counters and updated. dreyfus_fabric_search will keep waiting for these Counters, which would lead to timeout on a request. BugzID: 84146
…cement Fix shard replacement
Currently the only way to get the disk size information for the search index is to use the _search_info end point. But using this point would lead to opening the search index which is non trivial as lucene has an overhead for opening the index. These changes would add a new end point _search_disk_size to get the disk size information without opening the search index. Sarnie can use this new end point and avoid opening the search index. BugzId:87336
…nd-point Add new end point to get disk size information for search index
-1 on landing until this gets cleared up and my previous comment (re:
principle of least astonishment and Search not working if Clouseau isn't
connected) is addressed.
+1 once it is. :)
|
Yep understood and agreed. |
Testing for clouseau is a better approach but be aware that dreyfus intentionally tolerates clouseau not being up, any search request is retried a few times (with a sleep inbetween) if it’s down, in order to mask a clouseau reboot (typically for upgrades, but JVM’s do sometimes OOM).
… On 14 Jun 2019, at 21:50, Adam Kocoloski ***@***.***> wrote:
@nowisesys pointed out an interesting issue with this Pull Request. We have an ancient pattern in Mango which does a module_loaded(dreyfus_index) to check for the presence of Dreyfus before advertising text indexes. In release mode that module is not loaded on startup, so even if Dreyfus is included in a release the check can fail. Oops!
A trivial switch here would be to use dreyfus_sup instead of dreyfus_index as the former will be loaded as part of the supervision tree, but that doesn't actually confirm a working Search system since this PR proposes to always start the Dreyfus app. I think we should come up with a smarter approach -- perhaps testing for a connected Clouseau node instead?
/cc @rnewson @tonysun83 for comment
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
There is a companion open source project to CouchDB called Clouseau that provides Lucene-powered fulltext search to CouchDB. This patch provides a way to include Clouseau as a sidecar in each CouchDB Pod. The latest release of CouchDB does not yet support this sidecar out of the box; see apache/couchdb#2037. This integration is therefore disabled by default. Signed-off-by: Adam Kocoloski <kocolosk@apache.org>
Previously we had been using module_loaded(dreyfus_index) as a check for the presence of the Search system. There are two issues with this approach going forward: 1. Dreyfus is going to be included in every build 2. An Erlang release loads modules lazily and so this check could accidentally fail even on a Search-enabled system. This patch changes the check to one that makes an RPC request to the Clouseau (Java) subsystem. This should be a low-cost operation, but I haven't benchmarked it.
I made a small change to replace the check for the Search
Mango
I think there are three followups:
|
for ref this is what dreyfus does for retries: https://github.com/cloudant-labs/dreyfus/blob/master/src/dreyfus_httpd.erl#L585 but obviously that is only for direct _search http requests, not via mango. |
As a user, I have a question. Why not include Clouseau as well? |
…m#14822) * Bump to latest CouchDB release Signed-off-by: Adam Kocoloski <kocolosk@apache.org> * Add experimental integration with Lucene search There is a companion open source project to CouchDB called Clouseau that provides Lucene-powered fulltext search to CouchDB. This patch provides a way to include Clouseau as a sidecar in each CouchDB Pod. The latest release of CouchDB does not yet support this sidecar out of the box; see apache/couchdb#2037. This integration is therefore disabled by default. Signed-off-by: Adam Kocoloski <kocolosk@apache.org> Signed-off-by: Nigel Foucha <nigel.foucha@gmail.com>
@chintan-mishra good question. The Closeau codebase is a very different thing. It's all Scala/Java, so the release packaging machinery that we use for the main Erlang codebase wouldn't be applicable. It runs as a completely separate process, so the scripts that we use to start/stop/manage CouchDB wouldn't apply. It's also a relatively mature project and not one where the current IBM maintainers foresee a lot of additional contributions. I'd like to hear your opinion as a user. I can imagine you want the Search add-on to be enabled as simply as possible. I'd like to see that too, but personally I think the right approach is to ensure that when the Clouseau service is installed using the convenience binaries CouchDB recognizes it and uses it automatically, rather than pulling the whole codebase "in-tree". |
Hello @kocolosk
The most fantastic thing would be to update Clouseau to Java 8+ and the latest Lucene . I've already opened an issue for this: cloudant-labs/clouseau#22 |
Updating to Java 8 is non-trivial as several dependencies need updating and there is no upstream support for them anymore. a Pull Request for all that work would move things forward, of course, but an Issue reporting what we already know isn't going to. As the author of dreyfus/clouseau, I should point out that those projects apply to CouchDB 2.0 specifically, they are tightly coupled/aligned with that architecture. Future versions of CouchDB that use FoundationDB as the base layer will need an alternative solution, which I, and my team, are working on right now (http://github.com/cloudant-labs/fdblucene is the fundamental piece). This work is pure Java and targets JDK 8, though I personally develop against JDK 11. |
The latest commit preserves the exponential backoff for Search requests, but if the final error message indicates an inability to connect to Clouseau the error message will be
That seems like a reasonable improvement to me. I also did an ad hoc measurement of the time required to do the |
OK, so now the Personally I'm satisfied with the PR at this point. |
@kocolosk yes I would prefer Full-Text Search to be one of the config options in the CouchDB config. Edit: The approach mentioned by you looks easy. I am up for that type of setup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't the foggiest of how to start reviewing this, but with the 30s backoff that fails if you don't have Clouseau running sounds fine to me.
I asked @kocolosk to note the new configurable parameter(s) in rel/overlay/etc/default.ini
, but @rnewson if there are any other dreyfus specific ones, would you mind bringing those over too? Thanks.
Those configurables should also be documented in the documentation PR, if you please. Thanks.
very happy to see this, thanks everyone! |
The clouseau_rpc:version() call actually takes a few milliseconds to complete, so instead we first check for a hidden clouseau node already connected to our node. If we don't find it, we do the version() RPC.
9a31987
to
0d32708
Compare
Doing a traditional merge commit here to be consistent with our past process on |
…m#14822) * Bump to latest CouchDB release Signed-off-by: Adam Kocoloski <kocolosk@apache.org> * Add experimental integration with Lucene search There is a companion open source project to CouchDB called Clouseau that provides Lucene-powered fulltext search to CouchDB. This patch provides a way to include Clouseau as a sidecar in each CouchDB Pod. The latest release of CouchDB does not yet support this sidecar out of the box; see apache/couchdb#2037. This integration is therefore disabled by default. Signed-off-by: Adam Kocoloski <kocolosk@apache.org> Signed-off-by: Andrii Nasinnyk <anasinnyk@macpaw.com>
Overview
This contribution pulls in the "Dreyfus" OTP application, which teaches CouchDB how to communicate with the "Clouseau" Java application for Lucene-powered Search and Mango text indexes. Once this PR is merged it will no longer be necessary to use a custom fork of CouchDB in order to enable Search.
Testing recommendations
In order to fully test this PR you'll need a build of Clouseau. Those can be found at
There's also a public Maven repo:
I've gone ahead and built a Docker image which can be used with Kubernetes:
and updated the upstream Helm chart to use it. If you want a one-liner to install CouchDB+Search on a Kube cluster of your choice you can try:
Related Issues or Pull Requests
apache/couchdb-documentation#418
helm/charts#14822
Checklist
Posting a draft PR for discussion. Will lift the draft flag once documentation and a Clouseau build are available.