Datapoints overwhelm the metrics queue and blow up ram usage. (bp #8272) #8282

mergify · 2020-02-14T19:13:02Z

This is an automated backport of pull request #8272 done by Mergify.io

Cherry-pick of 17fb825 has failed:

On branch mergify/bp/v0.22/pr-8272
Your branch is up to date with 'origin/v0.22'.

You are currently cherry-picking commit 17fb8258e.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:

	modified:   core/src/banking_stage.rs
	modified:   core/src/replay_stage.rs
	modified:   ledger/src/blockstore_meta.rs
	modified:   metrics/src/datapoint.rs

Unmerged paths:
  (use "git add <file>..." to mark resolution)

	both modified:   core/src/cluster_info.rs
	both modified:   ledger/src/blockstore_processor.rs

To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

Mergify commands and options

More conditions and actions can be found in the [documentation](https://doc.mergify.io/).

You can also trigger Mergify actions by commenting on this pull request:

@Mergifyio refresh will re-evaluate the rules
@Mergifyio rebase will rebase this PR
@Mergifyio backports <destination> will backport this PR on <destination> branch

Additionally, on Mergify dashboard you can:

look at your merge queues
generate the Mergify configuration with the simulator.

Finally, you can contact us on https://mergify.io/

solana-grimes · 2020-02-14T19:14:43Z

😱 New commits were pushed while the automerge label was present.

codecov · 2020-02-14T22:32:37Z

Codecov Report

Merging #8282 into v0.22 will increase coverage by 21.5%.
The diff coverage is 63.9%.

@@           Coverage Diff            @@
##           v0.22   #8282      +/-   ##
========================================
+ Coverage     59%   80.5%   +21.5%     
========================================
  Files        244     254      +10     
  Lines      71618   55359   -16259     
========================================
+ Hits       42282   44607    +2325     
+ Misses     29336   10752   -18584
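The percentages in the report follow from the Hits and Lines rows: coverage is hits divided by tracked lines, and the "+21.5%" is a difference in percentage points. A minimal sketch of that arithmetic, assuming the plain hits/lines definition (the helper name `coverage_percent` is hypothetical, and Codecov's exact rounding is not modeled):

```rust
/// Coverage as a percentage: covered lines ("hits") over tracked lines.
/// Hypothetical helper for illustration; not part of Codecov or this PR.
fn coverage_percent(hits: u64, lines: u64) -> f64 {
    if lines == 0 {
        // No tracked lines: report 0 rather than dividing by zero.
        return 0.0;
    }
    hits as f64 * 100.0 / lines as f64
}

/// Change in coverage between two reports, in percentage points.
fn coverage_delta(base: (u64, u64), head: (u64, u64)) -> f64 {
    coverage_percent(head.0, head.1) - coverage_percent(base.0, base.1)
}
```

With the figures above, `coverage_percent(42282, 71618)` is about 59.04 and `coverage_percent(44607, 55359)` is about 80.58. The jump comes from the number of tracked lines dropping by 16259 far more than from the 2325 additional hits.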

automerge (cherry picked from commit 17fb825)

…olana-labs#8065)

Processing a new epoch (`Bank::process_new_epoch`) involves collecting stake delegations twice:

1) In `Bank::compute_new_epoch_caches_and_rewards`, to create a stake history entry and refresh vote accounts.
2) In `Bank::get_epoch_reward_calculate_param_info`, which is then used in `Bank::calculate_stake_vote_rewards` to calculate rewards for stakers and voters.

The overall time of crossing the epoch boundary is ~519ms:

```
update_epoch_us=519953i
```

The two heaviest operations are the `collect()` calls on stake delegations, each taking ~200-220ms.

Reduce that to a single collect by passing the vector from 1), together with the freshly computed stake history and vote accounts, to `Bank::begin_partitioned_rewards`. This way, we can avoid calling `Bank::get_epoch_reward_calculate_param_info`.

The new time of crossing the epoch boundary is ~337ms:

```
update_epoch_us=337371i
```

Making that change possible required several refactors:

* Take `&PointValue` in `Bank::create_epoch_rewards_sysvar`. That makes it easier to operate on references of `PartitionedRewardsCalculation`. Copying integers from `PointValue` is cheap and has no visible performance impact.
* Split `Stakes::activate_epoch`, which performed calculations and mutated the cache at the same time. The calculations were split out into `Stakes::calculate_activated_stake`, which takes `&self`.
* Add `Stakes::stake_delegations_ves` method. Stake delegations are stored as a hash array mapped trie (HAMT)[0], which means that inserts, deletions and lookups are average-case O(1) and worst-case O(log n). However, iteration performance is poor due to depth-first traversal and jumps. Currently it is also impossible to iterate over it with rayon. That issue is known and handled by converting the HAMT to a vector with `stakes.stake_delegations.iter().collect()`. Move that trick to a dedicated method that describes the performance consequences.
* Add `FilteredStakeDelegation` wrapper type, which wraps a vector of stake delegations and acts as a lazy iterator that filters out the ones with insufficient stake.
* Split the code dealing with rewards calculation and vote rewards distribution into separate methods:
  * `Bank::calculate_rewards`, which takes `&self` and does not acquire any locks.
  * `Bank::begin_partitioned_rewards`, which takes `&mut self`, sets the calculation status and creates a sysvar.
  * `Bank::distribute_vote_rewards`, which stores partitioned rewards and increases capitalization.

[0] https://en.wikipedia.org/wiki/Hash_array_mapped_trie

Fixes: solana-labs#8282
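The "collect once, then filter lazily" idea from the commit message above can be sketched as follows. This is an illustration under stated assumptions, not the code from that commit: `Delegation`, `collect_once`, `FilteredDelegations`, `total_stake` and `epoch_boundary_sketch` are hypothetical names, a std `HashMap` stands in for the HAMT, and entries are cloned to keep lifetimes simple.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a stake delegation; only the field needed here.
#[derive(Debug, Clone, PartialEq)]
struct Delegation {
    stake: u64,
}

/// Converts the map into a vector in a single pass. This is O(n) and
/// allocates, so callers should reuse the result instead of repeating it.
/// Assumption: a `HashMap` replaces the HAMT used by the real stakes cache.
fn collect_once(map: &HashMap<u32, Delegation>) -> Vec<(u32, Delegation)> {
    map.iter().map(|(k, d)| (*k, d.clone())).collect()
}

/// Wraps the collected vector and filters lazily: nothing is copied or
/// dropped up front, low-stake entries are skipped during iteration.
struct FilteredDelegations {
    delegations: Vec<(u32, Delegation)>,
    min_stake: u64,
}

impl FilteredDelegations {
    fn iter(&self) -> impl Iterator<Item = &(u32, Delegation)> + '_ {
        self.delegations
            .iter()
            .filter(move |(_, d)| d.stake >= self.min_stake)
    }
}

/// First consumer: needs every delegation (e.g. for a stake history entry).
fn total_stake(delegations: &[(u32, Delegation)]) -> u64 {
    delegations.iter().map(|(_, d)| d.stake).sum()
}

/// Both consumers share one collected vector instead of collecting twice.
/// Returns the total stake and the number of delegations above `min_stake`.
fn epoch_boundary_sketch(map: &HashMap<u32, Delegation>, min_stake: u64) -> (u64, usize) {
    let delegations = collect_once(map);
    let total = total_stake(&delegations);
    let filtered = FilteredDelegations { delegations, min_stake };
    let eligible = filtered.iter().count();
    (total, eligible)
}
```

Calling `epoch_boundary_sketch` on a map with stakes 5, 50 and 500 and a `min_stake` of 10 yields a total of 555 with two eligible delegations. The vector is built once and then handed from the first consumer to the second.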

mergify bot added the automerge label Feb 14, 2020

solana-grimes removed the automerge label Feb 14, 2020

Datapoints overwhelm the metrics queue and blow up ram usage. (#8272)

599958c

automerge (cherry picked from commit 17fb825)

mvines force-pushed the mergify/bp/v0.22/pr-8272 branch from c54eda3 to 599958c February 15, 2020 04:21

mergify bot added the automerge label Feb 15, 2020

solana-grimes merged commit 3534a7c into v0.22 Feb 15, 2020

mergify bot deleted the mergify/bp/v0.22/pr-8272 branch February 15, 2020 05:50
