read operation to server failed on database local #123
updated w/ more log info: the org.elasticsearch.action.NoShardAvailableActionException is also new. Any help would be greatly appreciated.
[2013-09-13 04:06:41,811][INFO ][node ] [Spiral] version[0.90.3], pid[11057], build[5c38d60/2013-08-06T13:18:31Z]
[2013-09-13 04:07:50,074][ERROR][river.mongodb ] [Spiral] [mongodb][vendors_river] Mongo gave an exception |
How big is the "oplog.rs" collection? |
the oplog.rs collection is 9553 (I believe): set1:PRIMARY> db.printReplicationInfo() |
@dblado I was actually asking for the number of documents in "oplog.rs" (sorry for the confusion). The way the river monitors this collection (a tailable cursor on a capped collection) is the approach recommended by Mongo [1]. I have the feeling something is not correctly set up in your MongoDB instance. [1] - http://docs.mongodb.org/manual/tutorial/create-tailable-cursor/ |
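For readers unfamiliar with the pattern: a tailable cursor reads to the end of a capped collection, then keeps returning documents appended after the last read instead of closing. A minimal Python sketch of that loop, with a plain list standing in for the capped oplog (all names here are illustrative, not the river's actual code):

```python
import time

def tail(oplog, poll_interval=0.1, max_polls=3):
    """Simulate tailing a capped collection: read everything available,
    then keep polling for documents appended after the last position."""
    pos = 0
    polls = 0
    while polls < max_polls:
        if pos < len(oplog):
            yield oplog[pos]
            pos += 1
        else:
            # a real tailable cursor (with awaitData) blocks server-side
            # instead of polling client-side like this
            polls += 1
            time.sleep(poll_interval)

oplog = [{"ts": 1, "op": "i"}, {"ts": 2, "op": "u"}]
seen = list(tail(oplog, poll_interval=0, max_polls=1))
```

The river's slurper thread is, conceptually, this loop running against local.oplog.rs.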
no worries -- db.oplog.rs.count() shows 18878152 records On Fri, Sep 13, 2013 at 9:16 AM, Richard Louapre
Hi, there are actually hardcoded values for "connect timeout" and "socket timeout" in the current implementation... That seems to be the issue, as it takes more than 15 seconds in your scenario.
I will make a change so these parameters can be specified in the river settings. |
Yea, but I don't want the indexing through the river to be taking so darn long -- On Fri, Sep 13, 2013 at 9:37 AM, Richard Louapre
the queries that the river sends to mongo are taking 20+ minutes to execute. the load on both m1.large instances are near 0 since i'm the only one using it at the moment. there might be a few other hits to the db and es from the app but not much |
Can you do an .explain() on the query to see if it can be optimized or indexes can be created? |
not sure how to do the explain... I thought I could do:
set1:PRIMARY> db.oplog.rs.find({ $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } }).explain()
I just copied one of the offending queries from the mongo logfile |
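Side note on the `Timestamp 1378589134000|45` in that pasted query: it is mongod's log rendering of a BSON timestamp, which packs 32 bits of epoch seconds and a 32-bit ordinal increment into one 64-bit value (the log prints the seconds part as milliseconds; the shell syntax is `Timestamp(1378589134, 45)`). A quick illustration of the packing:

```python
def pack_bson_timestamp(seconds, increment):
    # BSON timestamps store epoch seconds in the high 32 bits
    # and an ordinal increment in the low 32 bits
    return (seconds << 32) | increment

def unpack_bson_timestamp(value):
    return value >> 32, value & 0xFFFFFFFF

packed = pack_bson_timestamp(1378589134, 45)
print(unpack_bson_timestamp(packed))  # (1378589134, 45)
```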
ya, send me a jar file w/ the timeout hardcode increased (or customizable) and I'll give it a try -- maybe once the river receives the response to this specific query my problem will go away. Although I did set an initial timestamp on the river to force to ignore old oplog entries |
Here is the snapshot [1]. Please give it a try. No need to change the river settings. Just stop ES replace the jar files and restart. [1] - https://dl.dropboxusercontent.com/u/64847502/elasticsearch-river-mongodb-1.7.1-SNAPSHOT.zip |
still seeing the read timed out exception: [2013-09-13 17:31:45,243][INFO ][node ] [Jimmy Woo] version[0.90.3], pid[13379], build[5c38d60/2013-08-06T13:18:31Z] [2013-09-13 17:32:53,383][ERROR][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] Mongo gave an exception |
this explain seems to be doing something: it's been running a couple of minutes already, but I guess that's normal because explain() still runs the query and just outputs additional data? |
I don't believe you have installed the SNAPSHOT version correctly. The log file should show: Did you replace elasticsearch-river-mongodb-1.7.0.jar with elasticsearch-river-mongodb-1.7.1-SNAPSHOT.jar in $ES_HOME/plugins/river-mongodb? |
my bad -- I didn't delete 1.7.0 just copied in the SNAPSHOT -- running again On Fri, Sep 13, 2013 at 11:09 AM, Richard Louapre
in the startup of ES I'm seeing: [2013-09-13 18:14:26,106][TRACE][river.mongodb ] mongoOptionsSettings: {initial_timestamp={script_type=js, script=var date = new Date(); date.setSeconds(date.getSeconds() + 5); new java.lang.Long(date.getTime());}, secondary_read_preference=true} but the query to mongo doesn't seem to respect the initial timestamp: |
here is the explain: set1:PRIMARY> db.oplog.rs.find({ "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}).explain() {
} |
wow, just had LOTS of activity in the ES log: |
things appear to be coming in live now -- so the last_ts in the river was GMT: Sat, 07 Sep 2013 21:25:34 GMT. According to my log files my issue has actually been happening for a while I just never noticed until yesterday. I'm guessing river queries were taking longer and longer each time because it was trying to play catchup?? |
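The GMT date quoted here is just the seconds part of the timestamp in the slow queries (Timestamp 1378589134000|45) converted from the Unix epoch, which confirms the river had been replaying from a week-old position. A quick check:

```python
from datetime import datetime, timezone

# seconds part of the Timestamp 1378589134000|45 seen in the slow queries
last_ts_seconds = 1378589134

dt = datetime.fromtimestamp(last_ts_seconds, tz=timezone.utc)
print(dt.strftime("%a, %d %b %Y %H:%M:%S GMT"))  # Sat, 07 Sep 2013 21:25:34 GMT
```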
The initial sync can definitely take a long time but hopefully it should be fast once all data are in sync. Regarding the initial timestamp can you please provide your river settings so I can take a look? |
here is my river setting: { |
as soon as it caught up, the river was near live, but it's slowing down -- response time from mongo is 400000+ ms |
Did you try to register the river multiple times? Did you set "initial_timestamp" in one of your last versions of the river settings? "initial_timestamp" is used only if _last_ts does not exist, so if you want to use this parameter, make sure the previous river settings have been deleted first:
|
Can you tell which query is slowing down? |
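The precedence described here (a persisted _last_ts always beating initial_timestamp) can be sketched as follows; the function name and shape are illustrative, not the river's actual code:

```python
def resolve_start_timestamp(stored_last_ts, initial_timestamp):
    """_last_ts, persisted by a previously registered river, always wins;
    initial_timestamp only applies to a freshly created river."""
    if stored_last_ts is not None:
        return stored_last_ts
    return initial_timestamp

# A leftover _last_ts from an old river registration shadows the new setting:
print(resolve_start_timestamp(1378589134, 1379278843))  # 1378589134
```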
ahh ok, will remember to delete the river and not use multiple versions. Seems all the queries from the river are still slow:
Fri Sep 13 18:20:51.854 [conn744] query local.oplog.rs query: { $query: {
looks like those are all for the same timestamp On Fri, Sep 13, 2013 at 12:15 PM, Richard Louapre
It looks like there might be an undocumented QUERYOPTION_OPLOGREPLAY option [1] to make queries on oplog.rs more efficient. I will test it and let you know. Thanks, |
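For reference, oplogReplay is one of the OP_QUERY flag bits in the MongoDB wire protocol, set alongside the tailable/await bits a driver already uses for oplog tailing. Composing the bitmask looks like this (flag values are from the wire protocol spec; the variable names are just for illustration):

```python
# OP_QUERY flag bits from the MongoDB wire protocol
TAILABLE_CURSOR   = 1 << 1  # 2: cursor stays open at the end of a capped collection
SLAVE_OK          = 1 << 2  # 4: allow reads from secondaries
OPLOG_REPLAY      = 1 << 3  # 8: let the server skip-scan oplog.rs on the ts filter
NO_CURSOR_TIMEOUT = 1 << 4  # 16
AWAIT_DATA        = 1 << 5  # 32: block briefly at end-of-data instead of
                            #     returning empty batches immediately

flags = TAILABLE_CURSOR | SLAVE_OK | OPLOG_REPLAY | AWAIT_DATA
print(flags)  # 46
```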
Can you please try this version [1]? QUERYOPTION_OPLOGREPLAY has been implemented. [1] - https://dl.dropboxusercontent.com/u/64847502/elasticsearch-river-mongodb-1.7.1-SNAPSHOT.zip |
everything seems to have stabilized w/ the most recent change to use QUERYOPTION_OPLOGREPLAY. Will continue monitoring and let you know if I have other issues. Thanks so much for the quick turnaround. |
@dblado could you please provide the response time of the query using QUERYOPTION_OPLOGREPLAY? |
What is odd is that the oplog is now empty...even when I make changes to the db...so nothing is being indexed?? |
oh nevermind, I was looking at localhost :) here are log entries: Sun Sep 15 21:13:42.290 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:114 nreturned:0 reslen:20 5010ms |
each query is still taking about 5 seconds -- considering that there are no updated records in the db in the last few hours that seems a bit high to me |
Could you please provide the response time for these 3 queries (sorry I don't have a MongoDB instance with a huge oplog.rs):
|
Had to modify them to: db.oplog.rs.find({ $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp(1379278843,1) } }, $orderby: { $natural: 1 } }) db.oplog.rs.find({ $query: { ts: { $gt: Timestamp(1379278843,1) }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } }) db.oplog.rs.find({ $query: { ts: { $gt: Timestamp(1379278843,1) } }, $orderby: { $natural: 1 } }) waiting for the results |
results: Mon Sep 16 16:50:40.649 [conn15] query local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1.0 } } ntoreturn:0 ntoskip:0 nscanned:18878319 keyUpdates:0 numYields: 919 locks(micros) r:613458050 nreturned:0 reslen:20 308179ms Mon Sep 16 16:52:50.190 [conn17] query local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1.0 } } ntoreturn:0 ntoskip:0 nscanned:18878319 keyUpdates:0 numYields: 1062 locks(micros) r:701058589 nreturned:0 reslen:20 351803ms Mon Sep 16 16:52:50.190 [conn16] query local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379278843000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1.0 } } ntoreturn:0 ntoskip:0 nscanned:18878319 keyUpdates:0 numYields: 1164 locks(micros) r:770496332 nreturned:0 reslen:20 386413ms |
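To put those numbers in perspective: all three filter variants scanned the entire 18.8M-document oplog (nscanned equals the collection count), so the 5-to-6-minute times fall straight out of the scan rate:

```python
# figures from the first logged query above
nscanned = 18_878_319   # documents scanned
elapsed_ms = 308_179    # total query time in ms

docs_per_second = nscanned / (elapsed_ms / 1000)
print(round(docs_per_second))
```

At roughly 60k documents per second, any query that cannot seek into the oplog (which has no secondary indexes) is doomed to take minutes; that is exactly what QUERYOPTION_OPLOGREPLAY is meant to avoid.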
the river is producing these queries right now @ 5seconds per query: Mon Sep 16 17:03:09.425 [conn8] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:39500982229712 ntoreturn:0 keyUpdates:0 locks(micros) r:108 nreturned:0 reslen:20 5010ms |
5 seconds still seems like a lot to me. I have changed the order of the fields in the oplog.rs query filter: ts comes first. Can you please give this version [1] a try? [1] - https://dl.dropboxusercontent.com/u/64847502/elasticsearch-river-mongodb-1.7.1-SNAPSHOT.zip |
yea, 5 seconds is still a lot of time. The new jar didn't change it:
Wed Sep 18 21:11:05.118 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:188 nreturned:0 reslen:20 5011ms |
@dblado I have just created a new post in the MongoDB user group [1]. [1] - https://groups.google.com/forum/#!topic/mongodb-user/E7BSv624nBg |
@dblado just a quick note according to Asya Kamsky the log should be interpreted as: r:195 -> it took 195 microseconds, the rest of the time was spent waiting. |
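A small parser makes that distinction visible: pulling the lock-read microseconds and the total duration out of a getmore log line shows almost all of the 5 seconds is await time rather than work (the regex is a rough sketch tuned to the 2.4 log format quoted above):

```python
import re

# getmore line from the Sep 18 log above (middle elided)
line = ('Wed Sep 18 21:11:05.118 [conn68] getmore local.oplog.rs ... '
        'locks(micros) r:188 nreturned:0 reslen:20 5011ms')

read_micros = int(re.search(r"r:(\d+)", line).group(1))
total_ms = int(re.search(r"(\d+)ms$", line).group(1))

print(read_micros, total_ms)  # 188 5011: ~188 microseconds reading, ~5s waiting
```

The ~5000ms figure is the awaitData blocking window on an idle tailable cursor, not query execution time, which is why it shows up even with no new oplog entries.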
- Implement suggested changes - https://groups.google.com/forum/#!topic/mongodb-user/E7BSv624nBg
I've been seeing the following exception in elasticsearch today:
[2013-09-12 22:36:49,823][ERROR][river.mongodb ] [M] [mongodb][vendors_river] Mongo gave an exception
com.mongodb.MongoException$Network: Read operation to server ip-10-181-140-155/10.181.140.155:27017 failed on database local
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._hasNext(DBCursor.java:459)
at com.mongodb.DBCursor.hasNext(DBCursor.java:484)
at org.elasticsearch.river.mongodb.MongoDBRiver$Slurper.run(MongoDBRiver.java:1211)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.bson.io.Bits.readFully(Bits.java:46)
at org.bson.io.Bits.readFully(Bits.java:33)
at org.bson.io.Bits.readFully(Bits.java:28)
at com.mongodb.Response.<init>(Response.java:40)
at com.mongodb.DBPort.go(DBPort.java:142)
at com.mongodb.DBPort.call(DBPort.java:92)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244)
... 8 more
I did add two indexes to my mongo collection last night. In the mongo log I see the following query executing (the client is the elasticsearch river):
}
one of these queries takes a long time to execute:
Thu Sep 12 22:37:21.277 [conn87] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909236201096610 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 3873 locks(micros) r:1835294560 nreturned:92 reslen:15638 932067ms
Thu Sep 12 22:37:21.277 [conn90] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909235441195695 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 3099 locks(micros) r:1480829880 nreturned:92 reslen:15638 751916ms
Thu Sep 12 22:37:21.277 [conn91] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909237130526342 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 2894 locks(micros) r:1362870376 nreturned:92 reslen:15638 691878ms
Thu Sep 12 22:37:21.278 [conn85] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909235755883542 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 4315 locks(micros) r:2072460163 nreturned:92 reslen:15638 1052126ms
I'm on es 0.90.3, river-mongodb 1.7.0, mongodb 2.4.5.
Any ideas what is causing the queries from river-mongodb to mongo to take so long to run?