Error when running parallel queries with a thread pool #2081

Closed
1 task done
jokerCoCo opened this issue Jan 9, 2023 · 6 comments · Fixed by #2199
Labels
bug Something isn't working

Comments

@jokerCoCo

jokerCoCo commented Jan 9, 2023

Problem Type (问题类型)

other exception / error (其他异常报错)

Before submit

  • I have confirmed that the existing Issues and FAQ contain no identical / duplicate question

Environment (环境信息)

  • Server Version: v0.12.0
  • Backend: RocksDB x nodes, HDD or SSD
  • OS: xx CPUs, xx G RAM, Centos 7.x
  • Data Size: 1000 vertices, 500 edges

Your Question (问题描述)

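I am testing parallel queries against the graph database with a thread pool: the upstream and downstream neighbors of each vertex are queried in parallel, and the pool's maximum concurrency is 10. Under higher concurrency the following error is reported:
java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Pending tasks size 10001 has exceeded the max limit 10000
Is there a parameter that needs to be configured for this?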
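
A note on the shape of this error: when a task submitted to an `ExecutorService` throws, `Future.get()` rethrows the failure wrapped in an `ExecutionException`, so the `IllegalArgumentException` is the underlying cause. The following is a minimal JDK-only sketch, assuming the exception surfaced through `Future.get()` on the reporter's pool. The class `UnwrapDemo` and the failing task are hypothetical stand-ins that throw the same message; they are not HugeGraph client code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class UnwrapDemo {

    // Runs the task on a pool of 10 threads and returns the underlying
    // failure, or null if the task completed normally.
    static Throwable rootFailure(Callable<?> task) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(10);
        try {
            Future<?> future = pool.submit(task);
            future.get();
            return null;
        } catch (ExecutionException e) {
            // The exception thrown inside the task is available as the cause.
            return e.getCause();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in task that fails with the message from this issue.
        Throwable cause = rootFailure(() -> {
            throw new IllegalArgumentException(
                      "Pending tasks size 10001 has exceeded the max limit 10000");
        });
        System.out.println(cause.getClass().getName() + ": " + cause.getMessage());
    }
}
```

Inspecting `getCause()` in the calling code makes it easier to tell a limit rejection like this one apart from other query failures.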

Vertex/Edge example (问题点 / 边数据举例)

No response

Schema [VertexLabel, EdgeLabel, IndexLabel] (元数据结构)

No response

@javeme
Contributor

javeme commented Jan 9, 2023

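You can adjust the value of the MAX_PENDING_TASKS parameter (this requires recompiling).

However, I'm not sure why so many async tasks are being created. What is the use case?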
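
To illustrate what a limit of this kind typically guards, here is a minimal sketch of a bounded pending-task queue. It is not HugeGraph's implementation: the class `BoundedTaskQueue` and its methods are hypothetical, and only the constant name `MAX_PENDING_TASKS`, the value 10000, and the wording of the error message are taken from this thread. In the sketch the limit is a hard-coded constant, which is one reason changing such a value would require a rebuild.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class BoundedTaskQueue {

    // Value taken from the error message in this issue (assumed to be hard-coded).
    static final int MAX_PENDING_TASKS = 10000;

    private final Queue<Runnable> pending = new ArrayDeque<>();

    // Rejects a new task once accepting it would exceed the limit.
    synchronized void schedule(Runnable task) {
        int size = this.pending.size() + 1;
        if (size > MAX_PENDING_TASKS) {
            throw new IllegalArgumentException(String.format(
                      "Pending tasks size %d has exceeded the max limit %d",
                      size, MAX_PENDING_TASKS));
        }
        this.pending.add(task);
    }

    synchronized int pendingSize() {
        return this.pending.size();
    }

    public static void main(String[] args) {
        BoundedTaskQueue queue = new BoundedTaskQueue();
        // Fill the queue exactly up to the limit.
        for (int i = 0; i < MAX_PENDING_TASKS; i++) {
            queue.schedule(() -> { });
        }
        try {
            queue.schedule(() -> { });  // task number 10001 is rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Raising the constant only moves the threshold. If tasks are created faster than they complete, the queue still fills up, which is why the question about the use case matters.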

@jokerCoCo
Author

jokerCoCo commented Jan 10, 2023

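That is actually my question too. My thread pool has a maximum concurrency of 10, and at most 8 tasks are ever created. My code works roughly as follows:
1. A utility class manages the connection to the database; the hugeClient inside it stays connected for the whole run.
2. Every query in the DAO layer goes through the hugeClient object to obtain a gremlinManager object, which executes the query.
3. The query is:
g.V().has('age', gte(threshold)).out().path(), run as a precompiled (parameterized) statement via execute()
One query currently returns a little over 1,000 results.
4. Subsequent steps run only after all queries have finished.
I wonder whether keeping the hugeClient connection open the whole time is what causes the number of async tasks to keep accumulating?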
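
The setup described in steps 1 to 4 can be sketched roughly as follows. This uses only the JDK: `GremlinRunner` is a hypothetical stand-in for the shared hugeClient wrapper, not a HugeGraph API, and the stub in `main` returns canned data instead of contacting a server. `invokeAll` blocks until every query has finished, which corresponds to step 4.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelQueryDemo {

    // Hypothetical stand-in for the shared client wrapper (not a HugeGraph API).
    interface GremlinRunner {
        List<Object> run(String gremlin, Map<String, Object> bindings);
    }

    static final String QUERY = "g.V().has('age', gte(threshold)).out().path()";

    // Runs one query per threshold on a pool of at most 10 threads
    // and waits for all of them before returning.
    static List<List<Object>> queryAll(GremlinRunner runner, List<Integer> thresholds)
                                       throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(10);
        try {
            List<Callable<List<Object>>> tasks = new ArrayList<>();
            for (Integer threshold : thresholds) {
                // Each task builds its own bindings map; only the runner is shared.
                tasks.add(() -> runner.run(
                          QUERY, Collections.singletonMap("threshold", threshold)));
            }
            List<List<Object>> results = new ArrayList<>();
            // invokeAll returns futures in task order once all tasks are done.
            for (Future<List<Object>> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stub runner that echoes the bound threshold instead of querying a server.
        GremlinRunner stub = (gremlin, bindings) ->
                             Collections.singletonList(bindings.get("threshold"));
        System.out.println(queryAll(stub, Arrays.asList(10, 20, 30)));
    }
}
```

In this pattern the client side never has more than 10 queries in flight, so a pending-task count in the thousands would have to come from work scheduled elsewhere, not from the pool itself.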

@jokerCoCo
Author

jokerCoCo commented Jan 10, 2023

Update: the backend hugegraph-server.log shows the following:
[gremlin-server-exec-6]2023-01-10 14:35:16,879[INFO ] | Remove left index: v[1000001], query: Query * from VERTEX where [3 >= 10] |
[gremlin-server-exec-8]2023-01-10 14:35:16,879[INFO ] | Remove left index: v[1000002], query: Query * from VERTEX where [3 >= 10] |
[gremlin-server-exec-6]2023-01-10 14:35:16,879[INFO ] | Remove left index: v[1000003], query: Query * from VERTEX where [ 3 >= 10] |
The relevant source code in hugegraph master is:

`    protected Id asyncRemoveIndexLeft(ConditionQuery query,
                                      HugeElement element) {
        LOG.info("Remove left index: {}, query: {}", element, query);
        RemoveLeftIndexJob job = new RemoveLeftIndexJob(query, element);
        HugeTask<?> task = EphemeralJobBuilder.of(this.graph())
                                              .name(element.id().asString())
                                              .job(job)
                                              .schedule();
        return task.id();
    }`

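The query "g.V().has('age', gte(threshold)).out().path()" appears to create an async task for every vertex in the graph, which eventually pushes the number of tasks over the limit.

After changing it to g.V().where(values('age'), is(gte(threshold))).out().path(), everything works. Is this a bug?

I also have two further questions:
1. I created an index on age. Can a query that uses values() use the index?
2. How should precompiled (parameterized) Gremlin be written in Java for concurrent use? Is there anything wrong with the code below?

String gremlin = "g.V().where(values('age'), is(gte(threshold))).out().path()";
hugeClient.gremlin().gremlin(gremlin).binding("threshold", threshold).execute();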

@github-actions

Due to the lack of activity, the current issue is marked as stale and will be closed after 20 days, any update will remove the stale label

javeme added the bug (Something isn't working) label and removed the inactive label on Feb 3, 2023
@javeme
Contributor

javeme commented Feb 3, 2023

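Thanks for the feedback. It looks like the handling of left indexes is incomplete; the scheduling of EphemeralJob will need to be improved.

The two questions:
1. I created an index on age. Can a query that uses values() use the index?
2. How should precompiled (parameterized) Gremlin be written in Java for concurrent use? Is there anything wrong with the code below?

Answers:

  1. No, it will not use the index.
  2. It looks fine.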

@JackyYangPassion
Contributor

What is the storage backend? I'll try to reproduce this issue locally. @jokerCoCo

3 participants