Boxed tokens #34

CAD97 · 2019-11-02T19:54:18Z

This PR applies the technique of boxing and deduplicating GreenToken to shrink GreenElement and, as a result, the allocation for a GreenNode's children. It also applies a minimal flexible-array-member technique to GreenNode, allocating its children inline.
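To illustrate the boxing/deduplication idea, here is a minimal sketch. It is not the code from this PR: `Token`, `TokenCache`, and `intern` are hypothetical names, and a `HashSet` is just one possible way to share identical tokens.

```rust
use std::collections::HashSet;
use std::sync::Arc;

// Hypothetical token payload; the real GreenToken layout is not shown here.
#[derive(PartialEq, Eq, Hash, Debug)]
pub struct Token {
    pub kind: u16,
    pub text: String,
}

/// Hands out one shared `Arc<Token>` per distinct (kind, text) pair.
#[derive(Default)]
pub struct TokenCache {
    seen: HashSet<Arc<Token>>,
}

impl TokenCache {
    pub fn intern(&mut self, kind: u16, text: &str) -> Arc<Token> {
        let candidate = Token { kind, text: text.to_string() };
        // `Arc<Token>: Borrow<Token>`, so the set can be queried with a plain `&Token`.
        if let Some(existing) = self.seen.get(&candidate) {
            return Arc::clone(existing);
        }
        let fresh = Arc::new(candidate);
        self.seen.insert(Arc::clone(&fresh));
        fresh
    }
}
```

With a cache like this, every element that refers to a token stores only a one-word `Arc` pointer, and repeated tokens (keywords, punctuation, whitespace) share a single allocation.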

This is as far as the technique can go while using the standard Arc and NodeOrToken. GreenElement = NodeOrToken<Arc<GreenNode>, Arc<GreenToken>> is 3xusize in size, but can theoretically be niched down to 2xusize. (NB: the same niche optimization can be applied to SyntaxElement.)
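The pointer shapes behind these numbers can be checked with `std::mem::size_of`. The sketch below uses hypothetical stand-in types (`Node`, `Token`, `Element`), not the actual rowan definitions. Whether the enum comes out as two or three words depends on the compiler's layout choices, so the code only reports the value and does not assume one.

```rust
use std::mem::size_of;
use std::sync::Arc;

// Hypothetical stand-ins that only reproduce the pointer shapes discussed above.
#[allow(dead_code)]
pub struct Token {
    kind: u16,
    text: String,
}

// The trailing slice makes `Node` unsized, so `Arc<Node>` is a fat pointer
// (address + length), while `Arc<Token>` stays a thin pointer.
#[allow(dead_code)]
pub struct Node {
    kind: u16,
    children: [Element],
}

#[allow(dead_code)]
pub enum NodeOrToken<N, T> {
    Node(N),
    Token(T),
}

pub type Element = NodeOrToken<Arc<Node>, Arc<Token>>;

/// Sizes of (Arc<Node>, Arc<Token>, Element), measured in units of `usize`.
pub fn sizes_in_words() -> (usize, usize, usize) {
    let word = size_of::<usize>();
    (
        size_of::<Arc<Node>>() / word,
        size_of::<Arc<Token>>() / word,
        size_of::<Element>() / word,
    )
}
```

If the compiler stores the discriminant separately, the enum takes the fat pointer's two words plus one more; if it can reuse the pointer's null niche, two words are enough.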

I've split this PR up into commits each representing a logical step.

Type sizes in bytes (64-bit target):

GreenToken      32
Arc<GreenNode>  16
Arc<GreenToken> 8
GreenElement    24

SyntaxNode      8
SyntaxToken     16
SyntaxElement   24

It is possible to optimize size further, but I'd like to propose that in a second PR on top of this one. Remaining potential optimization steps along these lines:

- Implement a specific GreenElement to manually niche Arc<GreenNode> and Arc<GreenToken>.
  - Could be done by the compiler. sizeof(GreenElement) = 2xusize.
- Implement a specific SyntaxElement to manually niche SyntaxNode and SyntaxToken.
  - Could be done by the compiler. sizeof(SyntaxElement) = 2xusize.
- Implement a specific Arc<GreenNode> to make ptr-to-GreenNode a thin ptr instead of a fat ptr.
  - Standard GreenElement = NodeOrToken would be 2xusize. Manually non-zero-cost niched GreenElement (tag in alignment bits) would be 1xusize.
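As a rough illustration of the "tag in alignment bits" idea from the last step, here is a minimal sketch. It assumes thin pointers to both payloads and uses `Box` instead of a reference-counted pointer to stay short. `PackedElement`, `ElementRef`, and the payload fields are hypothetical names, not the planned implementation.

```rust
// Hypothetical payloads. Both have alignment >= 2, so bit 0 of a valid address is always zero.
pub struct Node {
    pub child_count: u32,
}
pub struct Token {
    pub text_len: u32,
}

pub enum ElementRef<'a> {
    Node(&'a Node),
    Token(&'a Token),
}

const TOKEN_TAG: usize = 1;

/// One machine word: a heap address with bit 0 set when it points to a `Token`.
pub struct PackedElement(usize);

impl PackedElement {
    pub fn node(node: Node) -> Self {
        let addr = Box::into_raw(Box::new(node)) as usize;
        debug_assert_eq!(addr & TOKEN_TAG, 0);
        PackedElement(addr)
    }

    pub fn token(token: Token) -> Self {
        let addr = Box::into_raw(Box::new(token)) as usize;
        debug_assert_eq!(addr & TOKEN_TAG, 0);
        PackedElement(addr | TOKEN_TAG)
    }

    /// Decoding masks the tag off first; this is the "non-zero-cost" part.
    pub fn get(&self) -> ElementRef<'_> {
        let addr = self.0 & !TOKEN_TAG;
        // SAFETY: `addr` came from `Box::into_raw` in a constructor, the tag
        // records which type it points to, and the box lives until `drop`.
        unsafe {
            if self.0 & TOKEN_TAG == 0 {
                ElementRef::Node(&*(addr as *const Node))
            } else {
                ElementRef::Token(&*(addr as *const Token))
            }
        }
    }
}

impl Drop for PackedElement {
    fn drop(&mut self) {
        let addr = self.0 & !TOKEN_TAG;
        // SAFETY: same invariant as in `get`; each box is freed exactly once.
        unsafe {
            if self.0 & TOKEN_TAG == 0 {
                drop(Box::from_raw(addr as *mut Node));
            } else {
                drop(Box::from_raw(addr as *mut Token));
            }
        }
    }
}
```

The trade-off is that every access pays for a mask and a branch, and the scheme only works once both variants are thin pointers, which is why it depends on the thin-pointer step.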

Results in rust-analyzer:

With this branch

PS D:\usr\Documents\Code\Rust\rust-analyzer> cargo run --bin ra_cli --release -- analysis-stats .
    Finished release [optimized + debuginfo] target(s) in 0.46s
     Running `target\release\ra_cli.exe analysis-stats .`
Database loaded, 221 roots, 1.0601376s
Crates in this dir: 27
Total modules found: 331
Total declarations: 11135
Total functions: 3839
Item Collection: 11.8499283s, 0b allocated 0b resident
Total expressions: 89244
Expressions of unknown type: 6960 (7%)
Expressions of partially unknown type: 3522 (3%)
Type mismatches: 3568
Inference: 36.3460289s, 0b allocated 0b resident
Total: 48.1964408s, 0b allocated 0b resident
PS D:\usr\Documents\Code\Rust\rust-analyzer> Measure-Command { type "D:\rust-lang\src\libcore\unicode\tables.rs" | cargo run --bin ra_cli --release -- symbols }
    Finished release [optimized + debuginfo] target(s) in 0.45s


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 738
Ticks             : 7380228
TotalDays         : 8.54193055555555E-06
TotalHours        : 0.000205006333333333
TotalMinutes      : 0.01230038
TotalSeconds      : 0.7380228
TotalMilliseconds : 738.0228



PS D:\usr\Documents\Code\Rust\rust-analyzer> Measure-Command { type "D:\rust-lang\src\libcore\unicode\tables.rs" | cargo run --bin ra_cli --release -- parse --no-dump } 
    Finished release [optimized + debuginfo] target(s) in 0.44s
     Running `target\release\ra_cli.exe parse --no-dump`


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 603
Ticks             : 6035426
TotalDays         : 6.98544675925926E-06
TotalHours        : 0.000167650722222222
TotalMinutes      : 0.0100590433333333
TotalSeconds      : 0.6035426
TotalMilliseconds : 603.5426

Without this branch (5451bfb9)

PS D:\usr\Documents\Code\Rust\rust-analyzer> cargo run --bin ra_cli --release -- analysis-stats .
    Finished release [optimized + debuginfo] target(s) in 0.45s
     Running `target\release\ra_cli.exe analysis-stats .`
Database loaded, 220 roots, 1.0174838s
Crates in this dir: 27
Total modules found: 331
Total declarations: 11135
Total functions: 3839
Item Collection: 10.509602s, 0b allocated 0b resident
Total expressions: 89241                                                                                                                                                 
Expressions of unknown type: 6959 (7%)
Expressions of partially unknown type: 3522 (3%)
Type mismatches: 3569
Inference: 34.963529s, 0b allocated 0b resident
Total: 45.4737377s, 0b allocated 0b resident
PS D:\usr\Documents\Code\Rust\rust-analyzer> Measure-Command { type "D:\rust-lang\src\libcore\unicode\tables.rs" | cargo run --bin ra_cli --release -- symbols }         
    Finished release [optimized + debuginfo] target(s) in 0.44s


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 587
Ticks             : 5875475
TotalDays         : 6.80031828703704E-06
TotalHours        : 0.000163207638888889
TotalMinutes      : 0.00979245833333333
TotalSeconds      : 0.5875475
TotalMilliseconds : 587.5475



PS D:\usr\Documents\Code\Rust\rust-analyzer> Measure-Command { type "D:\rust-lang\src\libcore\unicode\tables.rs" | cargo run --bin ra_cli --release -- parse --no-dump } 
    Finished release [optimized + debuginfo] target(s) in 0.44s
     Running `target\release\ra_cli.exe parse --no-dump`


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 573
Ticks             : 5737453
TotalDays         : 6.64057060185185E-06
TotalHours        : 0.000159373694444444
TotalMinutes      : 0.00956242166666667
TotalSeconds      : 0.5737453
TotalMilliseconds : 573.7453

I can't test allocation pressure on Windows. As it stands, the timings above show this branch to be consistently slower.

CAD97 · 2019-11-05T00:56:43Z

Oh and by the way: I ran cargo miri test and everything passes here. Unfortunately, it does appear that miri does not like flexible array members (for further optimization).

matklad · 2019-11-09T09:28:09Z

src/syntax_text.rs

@@ -97,6 +96,7 @@ impl SyntaxText {

    pub fn for_each_chunk<F: FnMut(&str)>(&self, mut f: F) {
        enum Void {}
+        #[allow(clippy::unit_arg)]


I personally don't like clippy-related attributes in the source. Could it be removed?

Moreover, in this case, this is clearly a false-positive, as () is spelled explicitly :)

matklad · 2019-11-09T09:29:02Z

src/syntax_text.rs

@@ -191,7 +191,6 @@ fn zip_texts<I: Iterator<Item = (SyntaxToken, TextRange)>>(xs: &mut I, ys: &mut
        x.1 = TextRange::from_to(x.1.start(), x.1.len() - advance);
        y.1 = TextRange::from_to(y.1.start(), y.1.len() - advance);
    }
-    None


Good catch!

matklad · 2019-11-09T09:35:48Z

src/syntax_text.rs

@@ -59,7 +58,7 @@ impl SyntaxText {

    pub fn slice<R: private::SyntaxTextRange>(&self, range: R) -> SyntaxText {
        let start = range.start().unwrap_or_default();
-        let end = range.end().unwrap_or(self.len());
+        let end = range.end().unwrap_or_else(|| self.len());


This also seems like a false-positive to me: everything here should be inlinable, side-effect free, and allocation free, so I'd be surprised if there are any significant perf differences. OTOH, readability suffers because of a lambda.
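For reference, the semantic difference the lint is about is eager versus lazy evaluation of the fallback. A minimal sketch, with a hypothetical `len` helper that counts its calls standing in for `self.len()`:

```rust
use std::cell::Cell;

// Hypothetical stand-in for `self.len()`: cheap, but it records each call.
fn len(counter: &Cell<u32>) -> u32 {
    counter.set(counter.get() + 1);
    10
}

/// Returns how often the fallback ran for (`unwrap_or`, `unwrap_or_else`)
/// when the option already holds a value.
pub fn fallback_calls() -> (u32, u32) {
    let eager = Cell::new(0);
    let lazy = Cell::new(0);
    let end: Option<u32> = Some(4);

    // The argument is evaluated before `unwrap_or` is called, even though it is unused.
    let a = end.unwrap_or(len(&eager));
    // The closure only runs when the option is `None`.
    let b = end.unwrap_or_else(|| len(&lazy));

    assert_eq!(a, b);
    (eager.get(), lazy.get())
}
```

When the fallback is a trivial, inlinable getter as described above, the two forms should compile to much the same thing, so the choice comes down to readability.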

On a more general note, I do appreciate fixing clippy lints (even if I don't agree with some particular ones), but it might be prudent to separate "hey, I've run clippy" and "hey, I've completely rewritten the core of this library, it's super fast now. Oh, and I've added a dozen unsafe blocks" pull requests :D

matklad · 2019-11-09T09:44:27Z

src/api.rs

@@ -146,10 +146,10 @@ impl<L: Language> fmt::Display for SyntaxElement<L> {
 }

 impl<L: Language> SyntaxNode<L> {
-    pub fn new_root(green: GreenNode) -> SyntaxNode<L> {
+    pub fn new_root(green: Arc<GreenNode>) -> SyntaxNode<L> {


API-wise, it might be a good idea to hide the fact that we use Arc internally, and just document that GreenNode is cheap to clone.
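One way to read this suggestion is a newtype that owns the `Arc` privately. The sketch below is an assumption about what that could look like, not the API that was eventually adopted: `GreenNodeData` and the accessor names are hypothetical, and the real GreenNode in this PR is an unsized type with inline children, which this simplification ignores.

```rust
use std::sync::Arc;

// Hypothetical payload; the real node layout is not shown here.
struct GreenNodeData {
    kind: u16,
    text_len: u32,
}

/// Cheap to clone: cloning bumps a reference count and does not copy the tree.
#[derive(Clone)]
pub struct GreenNode {
    data: Arc<GreenNodeData>, // private, so `Arc` never appears in signatures
}

impl GreenNode {
    pub fn new(kind: u16, text_len: u32) -> GreenNode {
        GreenNode { data: Arc::new(GreenNodeData { kind, text_len }) }
    }

    pub fn kind(&self) -> u16 {
        self.data.kind
    }

    pub fn text_len(&self) -> u32 {
        self.data.text_len
    }

    /// True when both handles share one allocation.
    pub fn ptr_eq(a: &GreenNode, b: &GreenNode) -> bool {
        Arc::ptr_eq(&a.data, &b.data)
    }
}

pub struct SyntaxNode {
    green: GreenNode,
}

impl SyntaxNode {
    // The signature keeps the pre-PR shape: callers pass a `GreenNode`, not an `Arc<GreenNode>`.
    pub fn new_root(green: GreenNode) -> SyntaxNode {
        SyntaxNode { green }
    }

    pub fn green(&self) -> &GreenNode {
        &self.green
    }
}
```

Keeping the pointer type private leaves room to swap `Arc` for a custom thin pointer later without another breaking change.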

matklad · 2019-11-09T10:13:02Z

src/cursor.rs

@@ -125,7 +124,7 @@ impl FreeList {
        for _ in 0..FREE_LIST_LEN {
            res.try_push(&mut Rc::new(NodeData {
                kind: Kind::Free { next_free: None },
-                green: ptr::NonNull::dangling(),


Hm, ptr::NonNull::dangling seems much more straightforward to me than the dummy node shenanigans. Why do we need this change?

CAD97 · 2019-11-09

ptr::NonNull::dangling (as well as ptr::dangling) requires T: Sized, and GreenNode is now unsized.

The next PR changes this to provide a dangling fn.
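For a plain slice, a dangling fat pointer can be built by hand from a dangling element pointer and a length of zero. The sketch below shows only that general idea. `dangling_slice` is a hypothetical helper, and how the follow-up PR constructs its dangling GreenNode pointer is not shown in this thread.

```rust
use std::ptr::{self, NonNull};

/// A non-null, well-aligned placeholder pointer to an empty `[T]`.
/// It refers to no allocation and must not be used to access elements.
pub fn dangling_slice<T>() -> NonNull<[T]> {
    // `NonNull::<T>::dangling()` is fine here because the element type is `Sized`.
    let data: *mut T = NonNull::<T>::dangling().as_ptr();
    // Attach a length of zero to form the fat pointer.
    let fat: *mut [T] = ptr::slice_from_raw_parts_mut(data, 0);
    NonNull::new(fat).expect("a dangling pointer is never null")
}
```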

CAD97 · 2019-11-14T22:37:32Z

See #35 now.

CAD97 added 4 commits October 31, 2019 20:58

- a774bef Split green into submodules
- 6c25862 Arc-ify GreenElement
- ed98c88 Trailing objects in GreenNode
- c03a3bb impl ToOwned for GreenNode

CAD97 mentioned this pull request Nov 4, 2019

Trailing objects in GreenNode #32

Closed

CAD97 mentioned this pull request Nov 5, 2019

Custom GreenElement #35

Closed

matklad reviewed Nov 9, 2019


CAD97 closed this Nov 14, 2019

CAD97 deleted the boxed-tokens branch November 14, 2019 22:37
