This issue was moved to a discussion.

Memory leak when saving and indexing (clouseau) a big document (10MiB) #3688

Closed
rmartinez-dasnano opened this issue Aug 1, 2021 · 0 comments

Description

Using the IBM Docker image (CouchDB plus Lucene index) based on CouchDB 3.1.1 (ibmcom/couchdb3:3.1.1), we are storing a big document in CouchDB.
One of the fields in the document contains an array of arrays created from a 10 MiB CSV. When this document is created, container memory usage starts growing until it reaches the memory limit for the container (8 GiB), the host limit (running on a 16 GiB machine), or the process limit:

  • If the machine limit is reached, oom-killer kills beam.smp.
  • If the container limit is reached, the container is restarted.
  • When none of these limits is reached, the issue seems to be in the dreyfus index updater. It seems there is an OOM in the OS process:
out of memory
[info] 2021-08-01T10:09:57.661891Z couchdb@127.0.0.1 <0.236.0> -------- couch_proc_manager <0.18737.1> died normal
[error] 2021-08-01T10:09:57.661891Z couchdb@127.0.0.1 <0.22307.1> -------- OS Process Error <0.18737.1> :: {os_process_error,{exit_status,1}}
[error] 2021-08-01T10:09:57.662507Z couchdb@127.0.0.1 emulator -------- Error in process <0.22307.1> on node 'couchdb@127.0.0.1' with exit value:
{{nocatch,{os_process_error,{exit_status,1}}},[{couch_os_process,prompt,2,[{file,"src/couch_os_process.erl"},{line,59}]},{couch_query_servers,proc_prompt,2,[{file,"src/couch_query_servers.erl"},{line,520}]},{dreyfus_index_updater,update_or_delete_index,4,[{file,"src/dreyfus_index_updater.erl"},{line,141}]},{dreyfus_index_updater,load_docs,2,[{file,"src/dreyfus_index_updater.erl"},{line,80}]},{couch_bt_engine,drop_reductions,4,[{file,"src/couch_bt_engine.erl"},{line,1177}]},{couch_btree,stream_kv_node2,8,[{file,"src/couch_btree.erl"},{line,851}]},{couch_btree,stream_kp_node,7,[{file,"src/couch_btree.erl"},{line,778}]},{couch_btree,fold,4,[{file,"src/couch_btree.erl"},{line,224}]}]}

JavaScript fragment in the design document that indexes this field: it joins position 1 of each inner array into one variable, which we then index with a single call to the index function (the built string is ~11 MiB). The same error occurs if we call the index function once per row of the array instead.

if (doc.content) {
    // Concatenate position 1 of every row into a single space-separated string.
    var stringToIndex = "";
    for (var i = 0; i < doc.content.length; i++) {
        stringToIndex = stringToIndex + doc.content[i][1] + " ";
    }
    index("doc_number", stringToIndex, {"boost": 1, "facet": false, "index": true, "store": false});
}
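
For reference, the string construction above can be reproduced outside CouchDB to see how large the value passed to index() becomes. The sketch below is a standalone Node.js script, not part of the design document; `buildIndexString` and the sample rows are hypothetical names used only for illustration, and it is not claimed to change the memory behaviour described in this report.

```javascript
// Hypothetical helper: mirrors the loop in the index function, but collects
// the pieces in an array and joins them once at the end.
function buildIndexString(content) {
  var parts = [];
  for (var i = 0; i < content.length; i++) {
    // Position 1 of each row, converted to a string as "+" would do.
    parts.push(String(content[i][1]));
  }
  // The original loop appends a space after every value, including the last.
  return parts.length > 0 ? parts.join(" ") + " " : "";
}

// Made-up sample rows standing in for the parsed CSV.
var sample = [["a", "0001"], ["b", "0002"], ["c", "0003"]];
var result = buildIndexString(sample);
console.log(JSON.stringify(result));
console.log(result.length + " characters");
```

Running the helper against the real `content` array would show whether the built string is in the ~11 MiB range mentioned above.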

To avoid the os_process_error, we tried to increase the maximum memory of the couchjs processes with COUCHDB_QUERY_SERVER_JAVASCRIPT:

environment:
  - COUCHDB_USER=****
  - COUCHDB_PASSWORD=****
  - COUCHDB_QUERY_SERVER_JAVASCRIPT="/opt/couchdb/bin/couchjs -S 536870912 /opt/couchdb/share/server/main.js"

But it seems to have no effect: the couchjs processes listed below still run without the -S option.

docker container top 5bb095ba7a85
UID                 PID                 PPID                C                   STIME               TTY                 TIME                CMD
1001                333932              333911              0                   09:21               ?                   00:00:00            runsvdir -P -H /etc/service log: ...........................................................................................................................................................................................................................................................................................................................................................................................................
1001                334124              333932              0                   09:21               ?                   00:00:00            runsv couchdb
1001                334125              333932              0                   09:21               ?                   00:00:00            runsv couchdb-search
1001                334127              334125              0                   09:21               ?                   00:00:56            java -server -Xmx2G -Dsun.net.inetaddr.ttl=30 -Dsun.net.inetaddr.negative.ttl=30 -Dlog4j.configuration=file:/opt/couchdb-search/etc/log4j.properties -XX:OnOutOfMemoryError=kill -9 %p -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -classpath /opt/couchdb-search/lib/* com.cloudant.clouseau.Main /opt/couchdb-search/etc/clouseau.ini
1001                334141              333932              0                   09:21               ?                   00:00:00            /opt/couchdb/bin/../erts-9.3.3.14/bin/epmd -daemon
1001                366563              334124              13                  11:41               ?                   00:09:12            /opt/couchdb/bin/../erts-9.3.3.14/bin/beam.smp -K true -A 16 -Bd -- -root /opt/couchdb/bin/.. -progname couchdb -- -home /opt/couchdb -- -boot /opt/couchdb/bin/../releases/3.1.1/couchdb -kernel inet_dist_listen_min 9100 -kernel inet_dist_listen_max 9100 -crypto fips_mode true -kernel error_logger silent -sasl sasl_error_logger false -noshell -noinput -name couchdb@127.0.0.1 -config /opt/couchdb/bin/../releases/3.1.1/sys.config -setcookie monster
1001                366594              366563              0                   11:41               ?                   00:00:00            erl_child_setup 1048576
1001                366647              366594              0                   11:41               ?                   00:00:00            inet_gethost 4
1001                366648              366647              0                   11:41               ?                   00:00:00            inet_gethost 4
1001                367618              366594              0                   11:45               ?                   00:00:30            ./bin/couchjs ./share/server/main.js
1001                367619              366594              0                   11:45               ?                   00:00:30            ./bin/couchjs ./share/server/main.js
1001                367659              366594              0                   11:45               ?                   00:00:29            ./bin/couchjs ./share/server/main.js
1001                369407              366594              0                   11:55               ?                   00:00:16            ./bin/couchjs ./share/server/main.js
1001                369417              366594              0                   11:55               ?                   00:00:17            ./bin/couchjs ./share/server/main.js
1001                369427              366594              0                   11:55               ?                   00:00:17            ./bin/couchjs ./share/server/main.js
1001                370634              366594              0                   12:07               ?                   00:00:12            ./bin/couchjs ./share/server/main.js
1001                370808              366594              0                   12:09               ?                   00:00:06            ./bin/couchjs ./share/server/main.js

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 334127 1001      20   0 7741732 218544      0 S   0,3   1,4   0:58.75 java
 366563 1001      20   0 5061660 195320  11820 S   4,6   1,2   9:52.21 beam.smp
 367659 1001      20   0 1770568 140796  10688 S   0,0   0,9   0:29.68 couchjs
 369427 1001      20   0 1770548 140676  10632 S   0,0   0,9   0:17.14 couchjs
 369417 1001      20   0 1770504 140052  10368 S   0,0   0,9   0:17.03 couchjs
 367618 1001      20   0 1769084 139404  10608 S   0,0   0,9   0:30.41 couchjs
 370634 1001      20   0 1769480 139092  10464 S   0,0   0,9   0:12.20 couchjs
 367619 1001      20   0 1768648 138664  10252 S   0,0   0,9   0:30.17 couchjs
 369407 1001      20   0 1767160 136556  10536 S   0,0   0,9   0:16.80 couchjs
 370808 1001      20   0 1727328  96888  10536 S   0,0   0,6   0:06.60 couchjs
 366648 1001      20   0   14248   1636   1496 S   0,0   0,0   0:00.00 inet_gethost
 366594 1001      20   0    4368   1316   1232 S   0,0   0,0   0:00.33 erl_child_setup
 366647 1001      20   0   12124   1276   1172 S   0,0   0,0   0:00.00 inet_gethost
 334141 1001      20   0   12128     88      0 S   0,0   0,0   0:00.21 epmd
 334124 1001      20   0    4244     44      0 S   0,0   0,0   0:00.01 runsv
 333932 1001      20   0    4396     28      0 S   0,0   0,0   0:00.46 runsvdir
 334125 1001      20   0    4244      4      0 S   0,0   0,0   0:00.00 runsv
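
One way to check whether the option reached the query servers is to look for -S in the couchjs command lines of the process listing. The following is a minimal sketch that does this over text copied from `docker container top`; `couchjsMemoryOptions` and the `maxBytes` field are hypothetical names, and the parsing assumes the command is the last column, as in the listing above.

```javascript
// Hypothetical helper: given the text printed by `docker container top`,
// return each couchjs command line and the value of its -S option (or null).
function couchjsMemoryOptions(topOutput) {
  return topOutput
    .split("\n")
    .filter(function (line) { return /couchjs/.test(line); })
    .map(function (line) {
      // Keep only the command portion, starting at the couchjs path.
      var cmd = line.slice(line.search(/\S*couchjs/)).trim();
      var match = cmd.match(/\s-S\s+(\d+)/);
      return { cmd: cmd, maxBytes: match ? Number(match[1]) : null };
    });
}

// Two rows in the shape of the listing above (columns shortened).
var listing = [
  "1001  367618  366594  0  11:45  ?  00:00:30  ./bin/couchjs ./share/server/main.js",
  "1001  366594  366563  0  11:41  ?  00:00:00  erl_child_setup 1048576"
].join("\n");

console.log(couchjsMemoryOptions(listing));
```

A `maxBytes` of null for every couchjs row would be consistent with the environment variable not being applied.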

Steps to Reproduce

  • Start docker image ibmcom/couchdb3:3.1.1
  • Create the document
  • Wait 5 seconds and memory will grow from approximately 350 MiB to 6 GiB
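
For the "Create the document" step, a test document of similar shape can be generated with a short Node.js script. This is only a sketch: the real CSV is not included in this report, so the column values, the `_id`, and the `makeBigDoc` name are made up; only the `content` field name and the use of position 1 come from the index function above.

```javascript
// Hypothetical generator: builds a document whose `content` field is an
// array of arrays, roughly `targetBytes` in size once serialized as JSON.
function makeBigDoc(targetBytes) {
  var content = [];
  var size = 0;
  var row = 0;
  while (size < targetBytes) {
    // Three made-up columns; position 1 is the one the index function reads.
    var cols = ["row" + row, String(1000000 + row), "filler-filler-filler"];
    content.push(cols);
    size += JSON.stringify(cols).length + 1; // +1 for the separating comma
    row++;
  }
  return { _id: "big-doc", content: content };
}

var doc = makeBigDoc(10 * 1024 * 1024); // about 10 MiB
var body = JSON.stringify(doc);
console.log(doc.content.length + " rows, " + body.length + " bytes");
```

The serialized body can then be written to a file and saved with a PUT request to a database that has the search index defined.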

Expected Behaviour

  • Memory consumption should not grow so much. A 10 MiB document leading to 6 GiB of memory consumption looks like a memory leak.
  • COUCHDB_QUERY_SERVER_JAVASCRIPT env variable should work as expected
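
To put the figures from this report in proportion, the growth can be expressed as a multiple of the document size. The calculation below uses only the numbers stated above (10 MiB document, roughly 350 MiB before, 6 GiB after); `amplification` is a hypothetical helper name.

```javascript
var MiB = 1024 * 1024;
var GiB = 1024 * MiB;

// Hypothetical helper: memory growth expressed in multiples of the document size.
function amplification(docBytes, beforeBytes, afterBytes) {
  return (afterBytes - beforeBytes) / docBytes;
}

var factor = amplification(10 * MiB, 350 * MiB, 6 * GiB);
console.log(factor.toFixed(1)); // 579.4
```

Under these figures, the container grows by roughly 580 times the size of the stored document.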

Your Environment

  • CouchDB version used: 3.1.1

  • Docker version: Docker version 20.10.7, build f0df350

  • Container limits:

    deploy:
      resources:
        limits:
          cpus: "4"
          memory: 8192M
        reservations:
          cpus: "4"
          memory: 4096M
    
  • Couchdb configuration:

    [attachments] compressible_types="text/*, application/javascript, application/json, application/xml"
    [attachments] compression_level="8"
    [chttpd] backlog="512"
    [chttpd] bind_address="any"
    [chttpd] max_db_number_for_dbs_info_req="100"
    [chttpd] port="5984"
    [chttpd] prefer_minimal="Cache-Control, Content-Length, Content-Range, Content-Type, ETag, Server, Transfer-Encoding, Vary"
    [chttpd] require_valid_user="false"
    [chttpd] server_options="[{backlog, 512}, {acceptor_pool_size, 64}, {max, 4096}]"
    [chttpd] socket_options="[{sndbuf, 262144}, {nodelay, true}]"
    [cluster] n="3"
    [cluster] q="2"
    [cors] credentials="false"
    [couch_httpd_auth] allow_persistent_cookies="true"
    [couch_httpd_auth] auth_cache_size="50"
    [couch_httpd_auth] authentication_db="_users"
    [couch_httpd_auth] authentication_redirect="/_utils/session.html"
    [couch_httpd_auth] iterations="10"
    [couch_httpd_auth] require_valid_user="false"
    [couch_httpd_auth] secret="aaaaaa"
    [couch_httpd_auth] timeout="600"
    [couch_peruser] database_prefix="userdb-"
    [couch_peruser] delete_dbs="false"
    [couch_peruser] enable="false"
    [couchdb] attachment_stream_buffer_size="4096"
    [couchdb] changes_doc_ids_optimization_threshold="100"
    [couchdb] database_dir="./data"
    [couchdb] default_engine="couch"
    [couchdb] default_security="admin_only"
    [couchdb] file_compression="snappy"
    [couchdb] max_dbs_open="10000"
    [couchdb] max_document_size="4294967296"
    [couchdb] os_process_timeout="120000"
    [couchdb] single_node="true"
    [couchdb] users_db_security_editable="false"
    [couchdb] uuid="aaaa"
    [couchdb] view_index_dir="./data"
    [couchdb_engines] couch="couch_bt_engine"
    [csp] enable="true"
    [dreyfus] name="clouseau@127.0.0.1"
    [fabric] request_timeout="infinity"
    [feature_flags] partitioned||*="true"
    [httpd] allow_jsonp="false"
    [httpd] authentication_handlers="{couch_httpd_auth, cookie_authentication_handler}, {couch_httpd_auth, default_authentication_handler}"
    [httpd] bind_address="any"
    [httpd] enable_cors="false"
    [httpd] enable_xframe_options="false"
    [httpd] max_http_request_size="4294967296"
    [httpd] port="5986"
    [httpd] secure_rewrites="true"
    [httpd] socket_options="[{sndbuf, 262144}]"
    [indexers] couch_mrview="true"
    [ioq] concurrency="10"
    [ioq] ratio="0.01"
    [ioq.bypass] compaction="false"
    [ioq.bypass] os_process="true"
    [ioq.bypass] read="true"
    [ioq.bypass] shard_sync="false"
    [ioq.bypass] view_update="true"
    [ioq.bypass] write="true"
    [log] level="debug"
    [log] writer="stderr"
    [query_server_config] os_process_limit="2000"
    [query_server_config] os_process_soft_limit="1000"
    [query_server_config] reduce_limit="true"
    [replicator] connection_timeout="30000"
    [replicator] http_connections="20"
    [replicator] interval="60000"
    [replicator] max_churn="20"
    [replicator] max_jobs="500"
    [replicator] retries_per_request="5"
    [replicator] socket_options="[{keepalive, true}, {nodelay, false}]"
    [replicator] ssl_certificate_max_depth="3"
    [replicator] startup_jitter="5000"
    [replicator] verify_ssl_certificates="false"
    [replicator] worker_batch_size="500"
    [replicator] worker_processes="4"
    [ssl] port="6984"
    [uuids] algorithm="sequential"
    [uuids] max_count="1000"
    [vendor] name="The Apache Software Foundation"
    
  • Operating system and version: Ubuntu 20.04

Additional Context

apache locked and limited conversation to collaborators Aug 1, 2021
wohali closed this as completed Aug 1, 2021
