
Conversation

Contributor

@ddvlad ddvlad commented Oct 28, 2025

No description provided.

@ddvlad ddvlad force-pushed the parse-improvements branch from cbb36cb to faa256c Compare October 28, 2025 08:54
@ddvlad ddvlad force-pushed the parse-improvements branch from faa256c to 0f5d4ca Compare October 28, 2025 13:31
@ddvlad ddvlad force-pushed the parse-improvements branch from 0f5d4ca to c83f74e Compare October 29, 2025 15:17
ddvlad and others added 2 commits November 14, 2025 12:03
We now parse STEs on multiple cores, in 64k chunks. To keep output
deterministic, we buffer the resulting output before reassembling it in
order, but if the user values speed over determinism we expose the
--no_ordered_output flag.
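
The ordered versus unordered trade-off described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `chunked`, `parse_chunk`, and `parse_all` are hypothetical names, "64k" is assumed to mean records per chunk, and a `ThreadPool` stands in for a process pool only to keep the demo self-contained (`multiprocessing.Pool` offers the same `imap`/`imap_unordered` interface and is what would actually spread work across cores).

```python
from itertools import islice
from multiprocessing.pool import ThreadPool

CHUNK_SIZE = 64 * 1024  # assumption: "64k chunks" means records per chunk


def chunked(records, size):
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while chunk := list(islice(it, size)):
        yield chunk


def parse_chunk(chunk):
    """Hypothetical stand-in for the real STE parser: one line per record."""
    return "".join(f"STE {rec}\n" for rec in chunk)


def parse_all(records, out, pool, ordered=True, chunk_size=CHUNK_SIZE):
    """Parse chunks on the pool's workers and write each result to `out`."""
    chunks = chunked(records, chunk_size)
    if ordered:
        # imap yields results in submission order, so a chunk that finishes
        # early is held back until all earlier chunks have been written.
        results = pool.imap(parse_chunk, chunks)
    else:
        # imap_unordered yields each result as soon as it is ready.
        results = pool.imap_unordered(parse_chunk, chunks)
    for text in results:
        out.write(text)
```

With `ordered=True` the output is identical from run to run; with `ordered=False` the same lines are written, but chunks may appear in a different order.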

This patch, together with writing output directly to the file, improves
STE parse speed by 7.5x. The numbers below do not reflect that because
the tool runtime also includes loading data from the disk, which is
something we have not optimized.

Measurements taken on a 4M STE dump, excluding FW dump time (so, with
--skip_dump):

                          Wall time    Max RSS
Before                    3:24         12.0G
This patch                1:00          8.3G
This patch, no_ordered    0:57          8.5G

Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com>
Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Instead of building one large string and printing it once all data has
been parsed, split the writes so that they are done at each level. This
significantly reduces the memory footprint of the tool.
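
The difference between the two write strategies can be sketched as below. This is an assumed illustration, not the patch itself: `render_level`, `dump_buffered`, and `dump_streaming` are hypothetical names, and a "level" is modelled simply as a name plus a list of items.

```python
def render_level(level, items):
    """Hypothetical formatter for one level of the parsed dump."""
    return "".join(f"{level}: {item}\n" for item in items)


def dump_buffered(levels, out):
    """Accumulate the whole output in one string, then write it once."""
    text = ""
    for level, items in levels:
        text += render_level(level, items)
    out.write(text)


def dump_streaming(levels, out):
    """Write each level as soon as it is rendered, so only one level's
    text needs to be held in memory at a time."""
    for level, items in levels:
        out.write(render_level(level, items))
```

Both functions produce the same bytes; the streaming variant just avoids keeping the full output resident until the end.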

This patch, together with STE parse parallelization, improves STE parse
speed by 7.5x. The numbers below do not reflect that because the tool
runtime also includes loading data from the disk, which is something we
have not optimized.

Measurements taken on a 4M STE dump, excluding FW dump time (so, with
--skip_dump):

                      Wall time    Max RSS
Before                0:57         8.5G
Before, no_ordered    0:57         8.3G
After                 0:48         6.0G
After, no_ordered     0:45         3.8G

Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com>
Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
@ddvlad ddvlad force-pushed the parse-improvements branch from c83f74e to 49cd1ae Compare November 14, 2025 10:04

2 participants