Reduce memory usage of DartScanner #634
Comments
This comment was originally written by zundel@google.com We looked a bit at #1, and it is tricky because the scanner supports rollback. We need to keep all parsed tokens in memory in case of a rollback (or cleverly rewind and re-tokenize from a particular place in that case). As an alternative way to save some memory, we noticed that the entire source is kept around. We first tried to eliminate all references to the source in DartScanner by making new String() instances for each TokenData.value field. That helped there, but the DartSource object also keeps a full copy of the source.
Removed Area-Compiler label.
Added this to the Later milestone.
Removed this from the M3 milestone.
This comment was originally written by amouravski@google.com Added Editor-AnalysisEngine label.
dartc has been deprecated in favor of the new analysis engine. Removed AnalysisEngine, Editor-AnalysisEngine labels.
This issue was originally filed by zundel@google.com
When running some unit tests with the heap space constrained to 32M, two tests run out of memory:
=== debugia32 dartc co19/LibTest/core/List/sort/List/sort/A01/t06 ===
=== debugia32 dartc co19/LibTest/core/List/sort/List/sort/A01/t05 ===
At the top of the heap histogram are DartScanner.Location, DartScanner.Position and DartScanner.TokenData, with over 200K instances each, adding up to 75M of the 32M heap.
The scanner currently tokenizes the entire file into memory, which may be overly aggressive. If we tokenize the file in chunks and do not keep references to these objects throughout the parse, the GC may be able to reclaim them.
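One way the chunked approach could coexist with rollback is to retain tokens only from the oldest outstanding rollback point onward. The following is a minimal sketch of that idea, not the DartScanner implementation: `TokenWindow`, `mark`, `rollback`, `commit` and `retained` are hypothetical names, and it assumes rollback points nest like a stack and that tokens are produced lazily by an `Iterator`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch: a token buffer that only retains tokens a rollback could still need. */
final class TokenWindow<T> {
    private final Iterator<T> scanner;            // assumed lazy token producer
    private final List<T> buffer = new ArrayList<>();
    private final ArrayDeque<Integer> marks = new ArrayDeque<>();
    private int base = 0;    // absolute index of buffer.get(0)
    private int cursor = 0;  // absolute index of the next token to hand out

    TokenWindow(Iterator<T> scanner) {
        this.scanner = scanner;
    }

    /** Returns the next token, or null at end of input. */
    T next() {
        if (cursor - base >= buffer.size()) {
            if (!scanner.hasNext()) {
                return null;
            }
            buffer.add(scanner.next());
        }
        T token = buffer.get(cursor - base);
        cursor++;
        trim();
        return token;
    }

    /** Records a rollback point at the current position. */
    void mark() {
        marks.push(cursor);
    }

    /** Rewinds to the most recent rollback point and discards that point. */
    void rollback() {
        cursor = marks.pop();
    }

    /** Discards the most recent rollback point without rewinding. */
    void commit() {
        marks.pop();
        trim();
    }

    /** Number of tokens currently held in memory. */
    int retained() {
        return buffer.size();
    }

    // Drops every token that lies before the oldest outstanding mark
    // (or before the cursor when no mark is outstanding).
    private void trim() {
        int keepFrom = marks.isEmpty() ? cursor : marks.peekLast();
        if (keepFrom > base) {
            buffer.subList(0, keepFrom - base).clear();
            base = keepFrom;
        }
    }
}
```

Under these assumptions, memory is bounded by the longest speculative region rather than by the file size. The saving only materializes if the parser does not hold on to consumed tokens elsewhere.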
In DartScanner.Location, the code currently stores Position objects for the start and end of each token. Each Position object contains 3 integers: line number, column number, and offset.
We could reduce memory usage by storing only 2 integers in Location, as byte offsets for start/end from the start of the file. Then, for each source file, keep an index that records the offset at which each line starts. Since the column and line position are rarely accessed, we could use something as simple as an array and use binary search to find the right line number for a given character offset.
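The proposal above can be sketched as follows. This is an illustration only: `LineIndex`, `lineOf` and `columnOf` are hypothetical names, not part of DartScanner, and the sketch assumes offsets are `char` indices into the source, that lines are terminated by `\n`, and that lines and columns are 1-based.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: recovers line/column from an offset on demand. */
final class LineIndex {
    // lineStarts[i] is the offset of the first character of line i + 1.
    private final int[] lineStarts;

    LineIndex(CharSequence source) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) == '\n') {
                starts.add(i + 1);
            }
        }
        lineStarts = starts.stream().mapToInt(Integer::intValue).toArray();
    }

    /** 1-based line number containing the given offset (offset must be >= 0). */
    int lineOf(int offset) {
        int i = Arrays.binarySearch(lineStarts, offset);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the line
        // containing the offset is the one just before the insertion point.
        int index = i >= 0 ? i : -i - 2;
        return index + 1;
    }

    /** 1-based column of the given offset within its line. */
    int columnOf(int offset) {
        return offset - lineStarts[lineOf(offset) - 1] + 1;
    }
}
```

With such an index, a Location needs only two `int` fields (start and end offset), and each lookup costs O(log n) in the number of lines, which seems acceptable if line and column are needed mainly for error reporting.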