-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fewer string allocations (cont'd) #40
Conversation
…erpolating many strings and change the gsub's to gsub! to modify the same string in memory instead of making a new one
BenchmarkingI used the benchmark-memory gem to benchmark the difference between this branch and the
almost 2x fewer allocations! 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
end | ||
|
||
def escape_tag_content(tag) | ||
remove_pipes(tag).gsub COMMA, BLANK | ||
def escape_tag_content!(tag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We often freeze tags. I believe this will cause an exception to raise for frozen strings. Should we use gsub for frozen strings or something else?
irb(main):001:0> RPC_STORE = "rpc_store:memcached".freeze
=> "memcached"
irb(main):002:0> RPC_STORE.gsub!("|", "")
RuntimeError: can't modify frozen String
from (irb):2:in `gsub!'
from (irb):2
from /Users/jnunemaker/.rbenv/versions/2.2.4/bin/irb:11:in `<main>'
irb(main):003:0> RPC_STORE.gsub("|", "")
=> "rpc_store:memcached"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I can see other paths where this could be an issue. We cannot assume that anything passed in by the user is not frozen. Or to put it other way: it's bad form to modify the arguments that the user is passing to the library: even if they're not frozen, the user won't expect those arguments to change. The !
methods can only be used on strings we've allocated inside the library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We often freeze tags. I believe this will cause an exception to raise for frozen strings. Should we use gsub for frozen strings or something else?
Ah, yes. I'll add a test for that and handle frozen strings differently.
Or to put it other way: it's bad form to modify the arguments that the user is passing to the library: even if they're not frozen, the user won't expect those arguments to change. The ! methods can only be used on strings we've allocated inside the library.
Yeah, I definitely felt weird about that and noticed that @aroben dup
-ed
the stat in here, but it also seemed counterproductive to dup
the arguments when the point is to reduce allocations. I'll add the dup
s and run the benchmark again to see what the difference is 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I addressed everything in b2d1ca6
I did another benchmark after adding the dup
s and it's still an improvement over what we had:
Calculating -------------------------------------
master 128.800M memsize ( 0.000 retained)
2.300M objects ( 0.000 retained)
50.000 strings ( 0.000 retained)
current branch 79.900M memsize ( 0.000 retained)
1.400M objects ( 0.000 retained)
50.000 strings ( 0.000 retained)
Comparison:
current branch: 79900050 allocated
master: 128800050 allocated - 1.61x more
Can you both give this another look? 🙏 💖
- update the send_stats to dup the args that are passed in and then modify them instead of modifying the original ones - undo the additional tests for the string modifications because they were already existing in a different part of the file - add tests for the cases when frozen strings are passed in - undo adding the ! to all of the string mod methods - add a escape_tag_content! method that modifies the original string instead of returning a new one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for tackling ❄️ strings. Looks good. Left one question, but nothing worth stopping this. @vmg knows more about this stuff than me, so I'd definitely wait for his review. :)
end | ||
|
||
describe "tag names" do | ||
it "replaces reserved chars for tags" do | ||
@statsd.increment('stat', tags: ["name:foo,bar|foo"]) | ||
@statsd.socket.recv.must_equal ['stat:1|c|#name:foobarfoo'] | ||
end | ||
|
||
it "handles the cases when some tags are frozen strings" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❄️ 👍
Is the only goal to prove that using frozen strings doesn't raise or should this have a socket recv like some of the other metrics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the only goal was to make sure that ❄️ strings don't raise errors. I can change the description of the test if it's unclear.
Thank you very much for this improvement! I'm really sorry for the late review, don't hesitate to ping when I'm taking too much time. |
Context
I made a previous PR replacing the strings with frozen strings or symbols wherever possible (#37)
@vmg gave it a 👀 and said
So, using github/statsd-ruby#16 as inspiration, I took a stab at replacing the string modification methods with their
!
equivalents and incrementally creating the string being sent in thesend_stats
methodThis Change
send_stats
method to incrementally generate the string instead of one big string interpolationgsub
s togsub!
and update thetr
totr!
All of the tests are still passing! 💃 🎉
NOTE: I only updated the
send_stats
method because that's the one we'd be using at the highest volume. I can make follow-up PRs forformat_event
andformat_service_check
if we think those will also be used at a high volume.--
@vmg @aroben @jnunemaker can you review this please? 🙏 💖
@degemer can you also review this since you looked at the last one? 😊