Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fewer string allocations (cont'd) #40

Merged
merged 3 commits into from
Dec 22, 2016
Merged

Conversation

janester
Copy link
Contributor

@janester janester commented Nov 1, 2016

Context

I made a previous PR replacing the strings with frozen strings or symbols wherever possible (#37)

@vmg gave it a 👀 and said

I think the next step would be to switch over to bang methods and incremental string composition, like @aroben did for our fork of statsd. :)

So, using github/statsd-ruby#16 as inspiration, I took a stab at replacing the string modification methods with their ! equivalents and incrementally creating the string being sent in the send_stats method

This Change

  • updates the send_stats method to incrementally generate the string instead of one big string interpolation
  • update all of the gsubs to gsub! and update the tr to tr!
  • add a few more tests to test the string formatting

All of the tests are still passing! 💃 🎉

NOTE: I only updated the send_stats method because that's the one we'd be using at the highest volume. I can make follow-up PRs for format_event and format_service_check if we think those will also be used at a high volume.

--

@vmg @aroben @jnunemaker can you review this please? 🙏 💖

@degemer can you also review this since you looked at the last one? 😊

…erpolating many strings and change the gsub's to gsub! to modify the same string in memory instead of making a new one
@janester
Copy link
Contributor Author

janester commented Nov 1, 2016

Benchmarking

I used the benchmark-memory gem to benchmark the difference between this branch and the master branch (gist) and here are the results:

Calculating -------------------------------------
              master   128.801M memsize (     0.000  retained)
                         2.300M objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
      current branch    67.900M memsize (     0.000  retained)
                         1.100M objects (     0.000  retained)
                        50.000  strings (     0.000  retained)

Comparison:
      current branch:   67900050 allocated
              master:  128801180 allocated - 1.90x more

almost 2x fewer allocations! 🎉

Copy link
Contributor

@jnunemaker jnunemaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thanks @janester. Left one note. I'll let @vmg and @aroben review this more thoroughly since they know more about the changes in github/statsd-ruby. If either of you do not have time, let me know and I'll go through in more detail.

end

def escape_tag_content(tag)
remove_pipes(tag).gsub COMMA, BLANK
def escape_tag_content!(tag)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We often freeze tags. I believe this will cause an exception to raise for frozen strings. Should we use gsub for frozen strings or something else?

irb(main):001:0> RPC_STORE = "rpc_store:memcached".freeze
=> "memcached"
irb(main):002:0> RPC_STORE.gsub!("|", "")
RuntimeError: can't modify frozen String
    from (irb):2:in `gsub!'
    from (irb):2
    from /Users/jnunemaker/.rbenv/versions/2.2.4/bin/irb:11:in `<main>'
irb(main):003:0> RPC_STORE.gsub("|", "")
=> "rpc_store:memcached"

Copy link

@vmg vmg Nov 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I can see other paths where this could be an issue. We cannot assume that anything passed in by the user is not frozen. Or to put it other way: it's bad form to modify the arguments that the user is passing to the library: even if they're not frozen, the user won't expect those arguments to change. The ! methods can only be used on strings we've allocated inside the library.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We often freeze tags. I believe this will cause an exception to raise for frozen strings. Should we use gsub for frozen strings or something else?

Ah, yes. I'll add a test for that and handle frozen strings differently.

Or to put it other way: it's bad form to modify the arguments that the user is passing to the library: even if they're not frozen, the user won't expect those arguments to change. The ! methods can only be used on strings we've allocated inside the library.

Yeah, I definitely felt weird about that and noticed that @aroben dup-ed
the stat in here, but it also seemed counterproductive to dup the arguments when the point is to reduce allocations. I'll add the dups and run the benchmark again to see what the difference is 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I addressed everything in b2d1ca6

I did another benchmark after adding the dups and it's still an improvement over what we had:

Calculating -------------------------------------
              master   128.800M memsize (     0.000  retained)
                         2.300M objects (     0.000  retained)
                        50.000  strings (     0.000  retained)
      current branch    79.900M memsize (     0.000  retained)
                         1.400M objects (     0.000  retained)
                        50.000  strings (     0.000  retained)

Comparison:
      current branch:   79900050 allocated
              master:  128800050 allocated - 1.61x more

Can you both give this another look? 🙏 💖

- update the send_stats to dup the args that are passed in and then modify them instead of modifying the original ones
- undo the additional tests for the string modifications because they were already existing in a different part of the file
- add tests for the cases when frozen strings are passed in
- undo adding the ! to all of the string mod methods
- add a escape_tag_content! method that modifies the original string instead of returning a new one
Copy link
Contributor

@jnunemaker jnunemaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling ❄️ strings. Looks good. Left one question, but nothing worth stopping this. @vmg knows more about this stuff than me, so I'd definitely wait for his review. :)

end

describe "tag names" do
it "replaces reserved chars for tags" do
@statsd.increment('stat', tags: ["name:foo,bar|foo"])
@statsd.socket.recv.must_equal ['stat:1|c|#name:foobarfoo']
end

it "handles the cases when some tags are frozen strings" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❄️ 👍

Is the only goal to prove that using frozen strings doesn't raise or should this have a socket recv like some of the other metrics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the only goal was to make sure that ❄️ strings don't raise errors. I can change the description of the test if it's unclear.

@degemer
Copy link
Member

degemer commented Dec 22, 2016

Thank you very much for this improvement! I'm really sorry for the late review, don't hesitate to ping when I'm taking too much time.
This will be part of the next release (beginning of January, sooner if possible).

@degemer degemer merged commit 3b78391 into DataDog:master Dec 22, 2016
@degemer degemer added this to the 2.2.0 milestone Dec 22, 2016
@janester
Copy link
Contributor Author

No worries @degemer. Thanks for reviewing and merging 💖


@vmg how do those last few changes look to you? If you see places that need to be improved, I can open up a follow up PR 😊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants