Unable to determine mongo collection for indexed document #101

Closed
wesleyarchbell opened this issue Jul 15, 2013 · 22 comments

Comments

@wesleyarchbell

I am unable to determine which MongoDB collection a document indexed in ES belongs to. I have created a separate river for each MongoDB collection, but they all write to the same index. Is there any way to determine, for a given document, which MongoDB collection it belongs to?

@richardwilly98
Owner

Hello,

You have a couple of options with the current release:

  • Create a new index for each river.
  • Use a script filter to add an extra attribute to the document being indexed.

Let me know if either of these options works for you.
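
The first option in the list above can be sketched roughly as follows. This is a minimal illustration, not part of the river plugin: the helper name `river_settings_per_collection` and the `<collection>-index` naming scheme are assumptions made for the example, and the second collection name is hypothetical. The settings keys are the ones that appear in this thread.

```python
import json


def river_settings_per_collection(db, collections, doc_type="book"):
    """Return {river_name: settings}, giving each collection its own index."""
    rivers = {}
    for collection in collections:
        river_name = f"{db}_{collection}"
        rivers[river_name] = {
            "type": "mongodb",
            "mongodb": {"db": db, "collection": collection},
            # Assumed naming scheme: one index per collection, so the
            # index name alone identifies the source collection.
            "index": {"name": f"{collection}-index", "type": doc_type},
        }
    return rivers


if __name__ == "__main__":
    # "other" is a hypothetical second collection.
    rivers = river_settings_per_collection("readcloud", ["wiley", "other"])
    for name, settings in rivers.items():
        print(name, json.dumps(settings))
```

With one index per river, the `_index` of any search hit identifies the collection, at the cost of having to search across several indices.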

I could also add a new setting, options/include_collection, which provides the name of the attribute in which the collection name will be stored.

Thanks,
Richard.

@wesleyarchbell
Author

In the meantime, I just run a MongoDB find by document ID in each collection to work out which collection the document belongs to, as I didn't want to have to update documents. I reckon the settings option is the way to go, though.
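
The interim workaround described above might look something like this. It is only a sketch: `find_owning_collection` is a hypothetical helper, and `InMemoryCollection` is a stand-in so the example runs without a database. The only thing assumed of a real collection object is a pymongo-style `find_one(filter)` method.

```python
def find_owning_collection(collections, doc_id):
    """Return the name of the first collection containing doc_id, else None.

    `collections` maps a collection name to an object exposing a
    pymongo-style find_one(filter) method.
    """
    for name, collection in collections.items():
        if collection.find_one({"_id": doc_id}) is not None:
            return name
    return None


class InMemoryCollection:
    """Stand-in for a MongoDB collection, used only for this demo."""

    def __init__(self, docs):
        self._docs = {doc["_id"]: doc for doc in docs}

    def find_one(self, query):
        return self._docs.get(query["_id"])


if __name__ == "__main__":
    collections = {
        "wiley": InMemoryCollection([{"_id": "a1", "title": "Book A"}]),
        "other": InMemoryCollection([{"_id": "b2", "title": "Book B"}]),
    }
    print(find_owning_collection(collections, "b2"))
```

Against a real database, the mapping could presumably be built with pymongo as `{name: db[name] for name in db.list_collection_names()}`. If the MongoDB `_id` values are ObjectIds, the string ID returned by ES would need converting first. This costs one query per collection per lookup, which is why a field stored in the indexed document is preferable.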

richardwilly98 added a commit that referenced this issue Jul 16, 2013
- New parameter options/include_collection can be used to insert the
collection name in the document indexed by ES
@wesleyarchbell
Author

Thanks, which release will this make it into?

@richardwilly98
Owner

Hi,

That's available in version 1.6.11, released today.

Thanks,
Richard.

@wesleyarchbell
Author

Thanks. If I use this version (1.6.11), will it still work with older versions of MongoDB and Elasticsearch? Specifically MongoDB 2.4.4 and ES 0.90.1?

@richardwilly98
Owner

Hi,

Yes, it should work.

Thanks,
Richard.

@wesleyarchbell
Author

Ok thanks will give it a try

@wesleyarchbell
Author

One more thing: where is 'options/include_collection' set? In which file?

@richardwilly98
Owner

Hi,

It is set in the river settings; the wiki should be updated.

Thanks,
Richard.

@wesleyarchbell
Author

OK Thanks :)

@wesleyarchbell
Author

I've just tried the new option with v1.6.11, but I don't see any new field in the source fields.

@richardwilly98
Owner

Hi,

I won't be able to help for a few days (currently on vacation).

But you can compare against the example available in HEAD (issues/101 folder). There is also a test case available.

Can you also provide your river settings?

Thanks,
Richard.

@wesleyarchbell
Author

No worries, thanks will check it out. Have a good break man :)

@wesleyarchbell
Author

Hi Richard, when you get back, would you be able to have a look at this? I've reviewed the test for fix #101, but my data does not reflect the change in the document source fields, i.e. there is no 'include_collection' field after I create the river with the options/include_collection setting. I can see that the setting is included in one of the rivers by using the head plugin and viewing it in the browser:

_index | _type | _id | _score | type | db | collection | include_collection | name | throttle_size
_river | readcloud_wiley | _meta | 1 | book | rdb | wiley | wiley | book-index | 50

But it is not present in the document source.
The data is inserted via a bulk import (mongoexport and mongorestore after the river is created for each collection).

Thanks in advance.

@richardwilly98
Owner

I have successfully re-executed the test located here [1].
@wesleyarchbell can you please share the river settings used?

[1] - https://github.com/richardwilly98/elasticsearch-river-mongodb/tree/master/resources/issues/101

Thanks,
Richard.

@wesleyarchbell
Author

Hi Richard, sorry for the delay. Here are my settings for a river:

   {
    "type":"mongodb",
    "mongodb":{
        "db":"test",
        "collection":"collection123"
    },
    "options": {
        "include_collection":"collection123"
    },
    "index":{
        "name":"book-index",
        "throttle_size":"50",
        "type":"book"
    }
  }

@richardwilly98
Owner

@wesleyarchbell I have added additional traces. Can you please try the snapshot version available here [1]?

  • Stop ES
  • In $ES_HOME\plugins\river-mongodb, replace elasticsearch-river-mongodb-1.6.11.jar with elasticsearch-river-mongodb-1.6.12-SNAPSHOT.jar
  • Restart ES

Enable logging for the river:
In $ES_HOME\config\logging.yml, add the following to the logger section:

  river.mongodb: TRACE
  org.elasticsearch.river.mongodb.MongoDBRiver$Indexer: TRACE

Please send me the ES log file.

[1] - https://dl.dropboxusercontent.com/u/64847502/elasticsearch-river-mongodb-1.6.12-SNAPSHOT.zip

@wesleyarchbell
Author

https://www.dropbox.com/s/fm9am817rk3x3nr/elasticsearch.log

It seems the include_collection option is empty.

This is the exact curl command I use to create the river:

 curl -XPUT 'http://'"$ES_HOST"':'"$ES_PORT"'/_river/'"$DATABASE"'_'"$collection"'/_meta' -d '
    {
        "type":"mongodb",
        "mongodb":{
            "db":"'$DATABASE'",
            "collection":"'$collection'"
        },
        "options": {
            "include_collection":"'$collection'"
        },
        "index":{
            "name":"book-index",
            "throttle_size":"50",
            "type":"book"
        }
    }'

@richardwilly98
Owner

@wesleyarchbell from the code, it seems the options section is not being recognized.

Can you please run curl -XGET http://localhost:9200/_river/{river-name}/_meta and post the output?
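
For anyone who prefers to script this check, the same request could be issued from Python using only the standard library. This is a sketch under assumptions: the helper names `river_meta_url` and `fetch_river_settings` are made up for illustration, and the host and port defaults simply mirror the curl command above. It relies on the response carrying the stored settings under `_source`.

```python
import json
import urllib.request


def river_meta_url(river_name, host="localhost", port=9200):
    """Build the URL of the river's _meta document."""
    return f"http://{host}:{port}/_river/{river_name}/_meta"


def fetch_river_settings(river_name, host="localhost", port=9200):
    """GET the river's _meta document and return its _source (the settings)."""
    with urllib.request.urlopen(river_meta_url(river_name, host, port)) as resp:
        return json.load(resp)["_source"]


if __name__ == "__main__":
    # Only prints the URL; calling fetch_river_settings needs a running ES node.
    print(river_meta_url("readcloud_wiley"))
```

Comparing the returned settings with what was intended shows exactly what the river received, including where each section ended up.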

@wesleyarchbell
Author

The curl command returns:

{"_index":"_river","_type":"readcloud_wiley","_id":"_meta","_version":1,"exists":true, "_source" : 
    {
        "type":"mongodb",
        "mongodb":{
            "db":"readcloud",
            "collection":"wiley"
        },
        "options": {
            "include_collection":"wiley"
        },
        "index":{
            "name":"book-index",
            "throttle_size":"50",
            "type":"book"
        }
    }}

@richardwilly98
Owner

The options section should be inside the mongodb section:

{
        "type":"mongodb",
        "mongodb":{
            "db":"readcloud",
            "collection":"wiley",
            "options": {
                "include_collection":"wiley"
            }
        },
        "index":{
            "name":"book-index",
            "throttle_size":"50",
            "type":"book"
        }
    }
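
Since a misplaced options section fails silently, a small check before creating the river may help. The sketch below is illustrative only: `misplaced_options` and `nest_options` are hypothetical helpers, not part of the plugin, and they assume (based on this thread alone) that a top-level `options` key is always a mistake.

```python
import copy


def misplaced_options(settings):
    """True if `options` sits at the top level instead of under `mongodb`."""
    return "options" in settings


def nest_options(settings):
    """Return a copy with a top-level `options` moved under `mongodb`."""
    fixed = copy.deepcopy(settings)
    top_level = fixed.pop("options", None)
    if top_level is not None:
        mongodb = fixed.setdefault("mongodb", {})
        # Keys already under mongodb.options win over the misplaced ones.
        mongodb["options"] = {**top_level, **mongodb.get("options", {})}
    return fixed


if __name__ == "__main__":
    broken = {
        "type": "mongodb",
        "mongodb": {"db": "readcloud", "collection": "wiley"},
        "options": {"include_collection": "wiley"},
        "index": {"name": "book-index", "throttle_size": "50", "type": "book"},
    }
    print(misplaced_options(broken))
    print(nest_options(broken))
```

The input is copied rather than modified, so the original settings remain available for comparison.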

@wesleyarchbell
Author

D'oh! Thanks, Richard.
