-
Notifications
You must be signed in to change notification settings - Fork 122
textproto: Reduce amount of allocations where possible #57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Expect typical header to contain no more than 32 fields. Expect typical field to be not more than 256 octets long.
Codecov Report
@@ Coverage Diff @@
## master #57 +/- ##
==========================================
- Coverage 61.32% 61.08% -0.25%
==========================================
Files 16 16
Lines 843 848 +5
==========================================
+ Hits 517 518 +1
- Misses 280 284 +4
Partials 46 46
Continue to review full report at Codecov.
|
|
Oh, I think it might be a good idea to take a look at how net/textproto makes approximation of fields count in header. Since textproto.ReadHeader is also used for multipart headers, it might be a good idea to not preallocate a lot of fields for them. Upd: Not sure if it is worth it. It saves around 1 KiB of memory per message at cost of making ReadHeader 20% slower (memory saving is measured using maddy benchmarks, speed is from textproto bench in this PR) |
textproto/header.go
Outdated
| // headerFields.Del for details. | ||
| m map[string]*headerField | ||
| // Second and further values. | ||
| mDups map[string][]*headerField |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I'm not sure how I feel about this. Header is already quite complicated, maybe we could extract the logic into MultiStringMap or something? Maybe we could simplify the patch by either using m if there's a single value, either using mDups if there are multiple but avoid having to split values across both?
Maybe we can merge the other commits, and discuss this in a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could simplify the patch by either using m if there's a single value, either using mDups if there are multiple but avoid having to split values across both?
I guess that's probably the better way. I will take a look. For now, I removed that commit from this PR.
emersion
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Several preallocations and one clever trick.
Here is comparison with old code using benchmarks added in d569c5e:
Header 1 is "the most typical", header 2 contains several very big fields, header 3 contains a lot of small fields.
For reference, here are results using net/textproto parser:
What I think is worth exploring is possibility of deferring trimAroundNewlines to Header.Get/Fields.Value time. This could avoid extra allocation for fields we are not actually interested in. With header fields littered with server tracing information, this can be useful (side note: headers I used in benchmarks are from actual messages l received).