Add build time optimization documentation #3608
Add documentation for:
- sequence_map_inverse: O(N) to O(1) via pack expansion (95% time reduction)
- calculate_element_space_size: fold expression (73% time reduction)

Update case studies section with these optimizations.
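To illustrate the fold-expression item, here is a minimal sketch that contrasts a recursive template with a C++17 fold expression. It assumes the element space size is one plus the sum of (length - 1) * stride over all dimensions. `Seq`, `SpaceSizeRecursive`, and `space_size_fold` are hypothetical names for illustration, not the library's actual code.

```cpp
#include <cstddef>

// Hypothetical compile-time list of values (stand-in for the library's sequence type).
template <std::size_t... Vs>
struct Seq {};

// Recursive version: one class template specialization is instantiated per dimension.
template <typename Lengths, typename Strides>
struct SpaceSizeRecursive;

template <>
struct SpaceSizeRecursive<Seq<>, Seq<>>
{
    static constexpr std::size_t value = 1;
};

template <std::size_t L, std::size_t... Ls, std::size_t S, std::size_t... Ss>
struct SpaceSizeRecursive<Seq<L, Ls...>, Seq<S, Ss...>>
{
    static constexpr std::size_t value =
        (L - 1) * S + SpaceSizeRecursive<Seq<Ls...>, Seq<Ss...>>::value;
};

// Fold-expression version: a single function template instantiation,
// with both packs expanded in lockstep inside one expression.
template <std::size_t... Ls, std::size_t... Ss>
constexpr std::size_t space_size_fold(Seq<Ls...>, Seq<Ss...>)
{
    static_assert(sizeof...(Ls) == sizeof...(Ss), "lengths and strides must have the same rank");
    return (std::size_t{1} + ... + ((Ls - 1) * Ss));
}
```

Both versions compute the same value. The difference is only in how many template specializations the compiler has to create to get there.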
{
    return Sequence<F{}(Number<Is>{})...>{};
}
using type = decltype(make(__make_integer_seq<std::integer_sequence, index_t, N>{}));
There are two things here: the compiler intrinsic __make_integer_seq and the removal of the recursive generation. How much do you gain from pure standard C++17 (removing the recursion only) versus the full optimization with the compiler intrinsic? Is this fixed in C++2x?
As far as I can tell, there's no way to move away from recursive type generation without using the intrinsic. You can check out the related LLVM and GCC issues, such as this
__make_integer_seq is actually wrapped in std::make_integer_sequence, but this is one more layer of template instantiation.

Never mind, the main problem was drift between the documentation and the implementation. I updated the docs with the rationale for using the intrinsic.
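For readers following along, here is a portable sketch of the same generator pattern, written with std::make_integer_sequence instead of the intrinsic. `Sequence`, `Number`, `make_sequence`, and `Square` are simplified stand-ins defined here for illustration, and the free function `make_sequence` is an assumption, not the code under review. Standard libraries typically implement std::make_integer_sequence with a compiler builtin where one is available, so the portable spelling costs roughly one extra alias layer.

```cpp
#include <type_traits>
#include <utility>

using index_t = int;

// Simplified stand-ins for the library types (assumptions for this sketch).
template <index_t... Is>
struct Sequence {};

template <index_t I>
struct Number
{
    static constexpr index_t value = I;
};

// Pack expansion: every element F{}(Number<Is>{}) is produced in one expansion,
// with no recursive helper template.
template <typename F, index_t... Is>
constexpr auto make_sequence(std::integer_sequence<index_t, Is...>)
{
    return Sequence<F{}(Number<Is>{})...>{};
}

template <index_t N, typename F>
struct sequence_gen
{
    // Portable spelling; the code under review names the intrinsic directly.
    using type = decltype(make_sequence<F>(std::make_integer_sequence<index_t, N>{}));
};

// Example functor (hypothetical): maps index I to I * I at compile time.
struct Square
{
    template <index_t I>
    constexpr index_t operator()(Number<I>) const
    {
        return I * I;
    }
};
```

With these definitions, `sequence_gen<4, Square>::type` is `Sequence<0, 1, 4, 9>`.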
BUILD_TIME_OPTIMIZATION.md
Outdated
## Optimization Techniques

### 1. Replace O(N) Recursion with O(1) Pack Expansion
This example is for replacing recursion with pack expansion, but it also has a compiler intrinsic. Can we split that into two different examples? I think everyone would agree with the parameter pack expansion, but considering the compiler intrinsic is a separate issue.
The compiler intrinsic doesn't exist without the parameter packs, so it's impossible to separate.
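To make the two pieces in this thread concrete, the sketch below separates producing an index pack from consuming it. `make_seq_recursive` is a hypothetical hand-rolled generator that shows what linear recursion looks like. `sum_of_squares` is a hypothetical consumer that only uses pack expansion and does not care how the pack was produced.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Producer, hand-rolled: building the sequence for N walks through
// N + 1 specializations of make_seq_recursive (linear recursion depth).
template <std::size_t N, std::size_t... Is>
struct make_seq_recursive : make_seq_recursive<N - 1, N - 1, Is...>
{
};

template <std::size_t... Is>
struct make_seq_recursive<0, Is...>
{
    using type = std::index_sequence<Is...>;
};

// Consumer: a single instantiation that expands the whole pack at once.
template <std::size_t... Is>
constexpr std::size_t sum_of_squares(std::index_sequence<Is...>)
{
    return (std::size_t{0} + ... + (Is * Is));
}
```

The consumer accepts either `make_seq_recursive<N>::type` or `std::make_index_sequence<N>`, since both name the same type. Only the cost of producing the pack differs.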
@@ -0,0 +1,327 @@
# Build Time Optimization
Let's move this down into the source, where we are making the code changes. It's not customer documentation, it's aimed at the developers.
Can we align on the goal of this doc? This is kind of all over the place. If it's general info, it should probably go in the tracking bug. In fact, the cleanest way is some comments in the tracking bug that link to documented changes in the source. Then the only need for a markdown file is to track files we need to work on and what has been optimized. The scripts should be documented, so that's not needed here.
Yep, let's move it to, say, include/ck.
The high-level goal is to document the optimization attempts for the metaprogramming constructs we have, and to collect the techniques in an accessible way.
If we rely on the tracking issue, we need to keep in mind that the source code and the GitHub infrastructure are different sources of information. From recent discussions I had the impression that we wanted to start storing design documentation in the source, and this file would be a start.
Changes:
- Move to include/ck/ (developer documentation, not customer-facing)
- Add tracking issue link at top
- Fix section structure (sequential numbering 1-5)
- Remove mismatched transform_tensor_descriptor example
- Clarify O(N) constexpr loop vs template recursion distinction
- Remove "Case Studies" section (redundant with tracking issue)
- Simplify examples for clarity
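The distinction between a constexpr loop and template recursion can be sketched with an inverse-permutation example. This is an illustration under assumptions, not the implementation in the PR: `Sequence`, `invert`, `inverse_helper`, and `inverse_of` are hypothetical names. The loop in `invert` still does O(N) work, but it runs inside the constant evaluator and adds no template instantiations. The result is then turned into a type with a single pack expansion.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the library's sequence type.
template <std::size_t... Is>
struct Sequence {};

// O(N) constexpr loop: ordinary iteration at compile time, one instantiation.
// The input is assumed to be a permutation of 0..N-1; anything else indexes
// out of range and is rejected during constant evaluation.
template <std::size_t... Is>
constexpr std::array<std::size_t, sizeof...(Is)> invert(Sequence<Is...>)
{
    constexpr std::size_t n = sizeof...(Is);
    const std::array<std::size_t, n> map{Is...};
    std::array<std::size_t, n> inv{};
    for(std::size_t i = 0; i < n; ++i)
    {
        inv[map[i]] = i;
    }
    return inv;
}

// Pack expansion: read all N results back into a type in one step.
template <std::size_t... Ms, std::size_t... Is>
constexpr auto inverse_helper(Sequence<Ms...>, std::index_sequence<Is...>)
{
    constexpr auto inv = invert(Sequence<Ms...>{});
    return Sequence<inv[Is]...>{};
}

template <std::size_t... Ms>
constexpr auto inverse_of(Sequence<Ms...> map)
{
    return inverse_helper(map, std::make_index_sequence<sizeof...(Ms)>{});
}
```

For example, `decltype(inverse_of(Sequence<2, 0, 1>{}))` is `Sequence<1, 2, 0>`. A recursive formulation would instead instantiate a helper template once per element.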
Summary
BUILD_TIME_OPTIMIZATION.md documenting techniques for reducing C++ template instantiation overhead
Tracking issue: #3575