Commit
Memory improvements in FastJsonNode (#5088)
Fixes #5124, DGRAPH-1170

While converting a subgraph to a JSON response, an intermediate data structure called the fastJsonNode tree is formed. We have observed that when the response to be returned is big (especially in recurse queries), this data structure itself can occupy a lot of memory, leading to OOM in some cases. This PR aims to reduce the space occupied by the fastJsonNode tree.

The fastJsonNode tree is a kind of n-ary tree, where each fastJsonNode maintains some metadata and a list of its children. This PR tries to reduce the space occupied by each node in the following way:

- For each response, a separate data structure called encoder is formed, which is responsible for maintaining the metadata for all fastJsonNodes.
- encoder has metaSlice and childrenMap, where the meta and the children list are maintained for all fastJsonNodes. The index at which the meta for a fastJsonNode is present becomes its value, and hence the type of a fastJsonNode is uint32.
- The meta for a fastJsonNode (present at index int(fastJsonNode) in metaSlice) is of type uint64. It stores all the info for a fastJsonNode: the most significant bit stores the value of the List field, bytes 7-6 store the attr id, and bytes 4 to 1 store the arena offset (explained below).
- encoder has attrMap, which maps each predicate to a unique uint16 number.
- encoder also has arena. arena is a large []byte which stores the bytes for each leaf node. Its offsets are stored in the fastJsonNode meta. arena stores the same []byte only once and keeps a map from memhash([]byte) to offset.

With this change, I am able to run some of the queries that were resulting in OOM on current master.
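The layout described above can be illustrated with a minimal sketch. This is not the code from the PR: the names node, put, get, newScalar and the helper methods are hypothetical, FNV-1a stands in for memhash, the varint length prefix and the reserved offset 0 are assumptions made so that values can be read back, and childrenMap is omitted. The bit positions assume bytes are numbered 1 to 8 with byte 8 the most significant.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// Assumed layout of the uint64 meta word (byte 8 is the most significant).
const (
	listBit    = uint64(1) << 63        // most significant bit: List flag
	attrShift  = 40                     // bytes 7-6 start at bit 40
	attrMask   = uint64(0xFFFF) << attrShift
	offsetMask = uint64(0xFFFFFFFF)     // bytes 4-1: arena offset
)

// node stands in for fastJsonNode: an index into encoder.metaSlice.
type node uint32

// arena stores each distinct value once and remembers where it lives.
type arena struct {
	buf     []byte
	offsets map[uint64]uint32 // hash of value -> offset in buf
}

func newArena() *arena {
	// Offset 0 is reserved (assumption) so that 0 can mean "no value".
	return &arena{buf: make([]byte, 1), offsets: map[uint64]uint32{}}
}

// get reads back the length-prefixed value stored at off.
func (a *arena) get(off uint32) []byte {
	l, n := binary.Uvarint(a.buf[off:])
	start := int(off) + n
	return a.buf[start : start+int(l)]
}

// put returns the offset of b, appending it only if it is not already stored.
func (a *arena) put(b []byte) uint32 {
	h := fnv.New64a() // stand-in for memhash
	h.Write(b)
	key := h.Sum64()
	if off, ok := a.offsets[key]; ok && bytes.Equal(a.get(off), b) {
		return off
	}
	off := uint32(len(a.buf))
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], uint64(len(b)))
	a.buf = append(a.buf, tmp[:n]...)
	a.buf = append(a.buf, b...)
	a.offsets[key] = off
	return off
}

type encoder struct {
	metaSlice []uint64          // one meta word per node
	attrMap   map[string]uint16 // predicate -> small id
	arena     *arena
}

func newEncoder() *encoder {
	return &encoder{attrMap: map[string]uint16{}, arena: newArena()}
}

// idForAttr assigns ids in order of first use, starting at 1.
func (e *encoder) idForAttr(attr string) uint16 {
	if id, ok := e.attrMap[attr]; ok {
		return id
	}
	id := uint16(len(e.attrMap) + 1)
	e.attrMap[attr] = id
	return id
}

// newScalar packs list flag, attr id and arena offset into one meta word.
func (e *encoder) newScalar(attr string, val []byte, list bool) node {
	meta := uint64(e.idForAttr(attr))<<attrShift | uint64(e.arena.put(val))
	if list {
		meta |= listBit
	}
	e.metaSlice = append(e.metaSlice, meta)
	return node(len(e.metaSlice) - 1)
}

func (e *encoder) isList(n node) bool   { return e.metaSlice[n]&listBit != 0 }
func (e *encoder) attrID(n node) uint16 { return uint16((e.metaSlice[n] & attrMask) >> attrShift) }
func (e *encoder) value(n node) []byte  { return e.arena.get(uint32(e.metaSlice[n] & offsetMask)) }

func main() {
	e := newEncoder()
	a := e.newScalar("name", []byte(`"Alice"`), false)
	b := e.newScalar("name", []byte(`"Alice"`), true)
	fmt.Println(e.attrID(a) == e.attrID(b), e.isList(a), e.isList(b), string(e.value(b)))
}
```

In this sketch each node costs one uint64 in metaSlice plus its share of the arena, and a value that repeats across many nodes is stored only once, which is the effect the description above is aiming for.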
Profile for a query when RSS usage was around 30GB

master profile on the query:

    File: dgraph
    Build ID: 4009644c7dfb41957358d88f228b977b2fb552c7
    Type: inuse_space
    Time: Apr 11, 2020 at 10:31pm (IST)
    Entering interactive mode (type "help" for commands, "o" for options)
    (pprof) top
    Showing nodes accounting for 26.74GB, 98.87% of 27.05GB total
    Dropped 133 nodes (cum <= 0.14GB)
    Showing top 10 nodes out of 42
          flat  flat%   sum%        cum   cum%
       16.62GB 61.43% 61.43%    16.62GB 61.43%  github.com/dgraph-io/dgraph/query.makeScalarNode
        4.23GB 15.63% 77.06%     4.23GB 15.63%  github.com/dgraph-io/dgraph/query.(*fastJsonNode).New
        2.03GB  7.51% 84.57%     2.03GB  7.51%  github.com/dgraph-io/dgraph/query.stringJsonMarshal
        1.64GB  6.08% 90.65%    15.99GB 59.10%  github.com/dgraph-io/dgraph/query.(*fastJsonNode).AddListValue
        0.88GB  3.25% 93.90%     5.24GB 19.36%  github.com/dgraph-io/dgraph/query.(*fastJsonNode).SetUID
        0.44GB  1.61% 95.52%     0.44GB  1.61%  github.com/dgraph-io/dgraph/query.(*fastJsonNode).AddListChild
        0.36GB  1.32% 96.84%     0.39GB  1.45%  github.com/dgraph-io/dgraph/worker.(*queryState).handleValuePostings.func1
        0.25GB  0.92% 97.76%     0.25GB  0.92%  github.com/dgraph-io/ristretto.newCmRow
        0.16GB   0.6% 98.36%     0.16GB   0.6%  github.com/dgraph-io/badger/v2/skl.newArena
        0.14GB  0.51% 98.87%     0.14GB  0.51%  github.com/dgraph-io/ristretto/z.(*Bloom).Size

This PR profile:

    File: dgraph
    Build ID: 8fd737a95d4edf3ffb305638081766fd0044e99d
    Type: inuse_space
    Time: Apr 15, 2020 at 11:20am (IST)
    Entering interactive mode (type "help" for commands, "o" for options)
    (pprof) top
    Showing nodes accounting for 15557.02MB, 98.53% of 15789.61MB total
    Dropped 168 nodes (cum <= 78.95MB)
    Showing top 10 nodes out of 60
          flat  flat%   sum%        cum   cum%
     6341.11MB 40.16% 40.16%  6341.11MB 40.16%  github.com/dgraph-io/dgraph/query.(*encoder).appendAttrs
     4598.84MB 29.13% 69.29%  4598.84MB 29.13%  bytes.makeSlice
     3591.16MB 22.74% 92.03%  3591.16MB 22.74%  github.com/dgraph-io/dgraph/query.(*encoder).newNode
      365.52MB  2.31% 94.34%   408.54MB  2.59%  github.com/dgraph-io/dgraph/worker.(*queryState).handleValuePostings.func1
         256MB  1.62% 95.97%      256MB  1.62%  github.com/dgraph-io/ristretto.newCmRow
      166.41MB  1.05% 97.02%   166.41MB  1.05%  github.com/dgraph-io/badger/v2/skl.newArena
      140.25MB  0.89% 97.91%   140.25MB  0.89%  github.com/dgraph-io/ristretto/z.(*Bloom).Size
       91.12MB  0.58% 98.48%    98.05MB  0.62%  github.com/dgraph-io/dgraph/posting.(*List).Uids
           6MB 0.038% 98.52%   122.23MB  0.77%  github.com/dgraph-io/dgraph/worker.(*queryState).handleUidPostings.func1
        0.64MB 0.004% 98.53%   387.17MB  2.45%  github.com/dgraph-io/ristretto.NewCache