⚠ This page is served via a proxy. Original site: https://github.com
This service does not collect credentials or authentication data.
Skip to content

Conversation

@ConvolutedDog
Copy link
Contributor

@ConvolutedDog ConvolutedDog commented Dec 9, 2025

Support multiple metrics and improve data aggregation

Intro

This commit introduces significant improvements to the nsight-python API, particularly around metrics collection and data handling.

Users can now collect multiple related metrics in a single profiling run, which is essential for performance analysis. For example, comparing both load and store operations (smsp__sass_inst_executed_op_shared_ld.sum and smsp__sass_inst_executed_op_shared_st.sum) in one run is much more convenient than running separate profiling sessions for each metric.

Instead of needing separate profiling sessions for each metric:

@nsight.analyze.kernel(runs=1, metric="smsp__sass_inst_executed_op_shared_ld.sum")
def profile1(...):
    ...
@nsight.analyze.kernel(runs=1, metric="smsp__sass_inst_executed_op_shared_st.sum")
def profile2(...):
    ...

All metrics can now be specified with a sequence of metric names:

@nsight.analyze.kernel(
    runs=1,
    metrics=[
        "smsp__sass_inst_executed_op_shared_ld.sum",
        "smsp__sass_inst_executed_op_shared_st.sum"
    ]
) 
def profile(...):
    ...

Key Changes:

  1. Multiple Metrics Support:

    • Changed metric parameter to metrics (now accepts a sequence of strings)
    • Updated all examples/tests to use metrics=["metric1", "metric2"] format
  2. Enhanced Data Aggregation:

    • Improved _value_aggregator factory function with better data aggragation
    • Added support for np.array for multiple metrics in aggregation
    • Exploded dataframe for multiple metrics in one row
  3. Plotting Improvements:

    • Enhanced _validate_metric() to check for complex data structures
    • Better error messages when plotting multiple metrics
    • Support for scalar-only visualization
  4. New Examples:

    • Added 09_advanced_metric_custom.py: Custom metrics from multiple metrics
    • Added 10_combine_kernel_metrics.py: Combining metrics from multiple kernels
    • Added 11_output_csv.py: CSV output control example
  5. API Consistency:

    • Updated documentation and examples to reflect new parameter names
    • Better handling of edge cases in data transformation

Breaking Changes:

  • metric parameter renamed to metrics (now accepts list)
  • Multiple metrics now stored as tuples in DataFrame

Files Updated:

  • Modified: .gitignore, docs, examples, README
  • Added: examples/09*, examples/10*, examples/11*
  • Modified core modules: analyze.py, collection/core.py, collection/ncu.py,
    exceptions.py, extraction.py, transformation.py, utils.py
  • Updated tests to reflect API changes

This update provides more flexible metrics collection while maintaining backward compatibility for single-metric use cases.

Signed-off-by: ConvolutedDog [email protected]

@ConvolutedDog
Copy link
Contributor Author

@Alok-Joshi Hi Alok, could we set up basic CI (lint/build documentation) to run automatically on push? Since fixing lint errors often requires multiple attempts.

@Alok-Joshi
Copy link
Collaborator

@Alok-Joshi Hi Alok, could we set up basic CI (lint/build documentation) to run automatically on push? Since fixing lint errors often requires multiple attempts.

That is unfortunately not possible due to organisation policy. You can run the lint, doc-build locally to verify the fixing of issues -

   hatch run lint:check   # To run lint
   hatch run docs:build  # To build docs

@ConvolutedDog ConvolutedDog force-pushed the enhance-multiple-metrics branch from f08074f to 893166d Compare December 12, 2025 02:28
@ConvolutedDog
Copy link
Contributor Author

Added missing Signed-off-by signatures for all commits.

@Alok-Joshi
Copy link
Collaborator

/ok to test 0fcf91c

@Alok-Joshi
Copy link
Collaborator

@ConvolutedDog the dataframe with multiple metrics looks like this -

image

For every (annotation, config_values), the "Metric" is a list. Also, the corresponding values calculated per metric are also a list. Such dataframes may difficult to query, or convert to csv. Instead of this, you can follow approach where for every (annotation, config_values), we have k rows, where is the number of metrics. So the above dataframe would look something like -

image

@Alok-Joshi
Copy link
Collaborator

@ConvolutedDog As you have filled #12 and #11, the TODOs you have added in the code can be removed.

@ConvolutedDog
Copy link
Contributor Author

OK, I'll add a transformation it and delete the TODOs.

@Alok-Joshi
Copy link
Collaborator

Alok-Joshi commented Dec 12, 2025

@ConvolutedDog Can you please merge the main branch of upstream into this feature branch? We have added tests in the CI

@ConvolutedDog
Copy link
Contributor Author

@ConvolutedDog Can you please merge the main branch of upstream into this feature branch? We have added CI

OK.

This commit fixes the data aggregation issue when specifying comma-separated
metrics in the @nsight.analyze.kernel decorator.

Users can now collect multiple related metrics in a single profiling run,
which is essential for performance analysis. For example, comparing both
load and store operations (smsp__sass_inst_executed_op_shared_ld.sum and
smsp__sass_inst_executed_op_shared_st.sum) in one run is much more convenient
than running separate profiling sessions for each metric.

Instead of needing separate profiling sessions for each metric:

@nsight.analyze.kernel(runs=1, metric=smsp__sass_inst_executed_op_shared_ld.sum)
def profile1(...): ...
@nsight.analyze.kernel(runs=1, metric=smsp__sass_inst_executed_op_shared_st.sum)
def profile2(...): ...

All metrics can now be specified with comma separation:

@nsight.analyze.kernel(runs=1, metric=smsp__sass_inst_executed_op_shared_ld.sum,smsp__sass_inst_executed_op_shared_st.sum)
def profile1(...): ...

Key changes:
 - Support comma-separated metric specification in @nsight.analyze.kernel
 - Merge results into unified DataFrame with 'Metric' column
 - Add example demonstrating shared memory load/store profiling
 - Add validation to prevent @plot usage with multiple metrics
 - Extend data aggregation to handle multiple metric DataFrames

Signed-off-by: ConvolutedDog <[email protected]>
This commit introduces significant improvements to the nsight-python API,
particularly around metrics collection and data handling:

1. Multiple Metrics Support:
   - Changed `metric` parameter to `metrics` (now accepts sequence of strings)
   - Updated all examples/tests to use `metrics=["metric1", "metric2"]` format
   - Modified `derive_metric` to `derive_metrics` for consistency

2. Enhanced Data Aggregation:
   - Improved `_value_aggregator` factory function with better data aggragation
   - Added support for np.array for multiple metrics in aggregation

3. Plotting Improvements:
   - Enhanced `_validate_metric()` to check for complex data structures
   - Better error messages when plotting multiple metrics
   - Support for scalar-only visualization

4. New Examples:
   - Added `09_advanced_metric_custom.py`: Custom metrics from multiple metrics
   - Added `10_combine_kernel_metrics.py`: Combining metrics from multiple kernels
   - Added `11_output_csv.py`: CSV output control example

5. API Consistency:
   - Updated documentation and examples to reflect new parameter names
   - Better handling of edge cases in data transformation

- `metric` parameter renamed to `metrics` (now accepts list)
- `derive_metric` parameter renamed to `derive_metrics`
- Multiple metrics now stored as tuples in DataFrame

- Modified: `.gitignore`, docs, examples, README
- Added: examples/09*, examples/10*, examples/11*
- Modified core modules: `analyze.py`, `collection/core.py`, `collection/ncu.py`,
  `exceptions.py`, `extraction.py`, `transformation.py`, `utils.py`
- Updated tests to reflect API changes

This update provides more flexible metrics collection while maintaining backward
compatibility for single-metric use cases.

Signed-off-by: ConvolutedDog <[email protected]>
Signed-off-by: ConvolutedDog <[email protected]>
Signed-off-by: ConvolutedDog <[email protected]>
Signed-off-by: ConvolutedDog <[email protected]>
- Add _explode_dataframe() to flatten metric columns into rows
- Fix metric error message formatting

Signed-off-by: ConvolutedDog <[email protected]>
@ConvolutedDog ConvolutedDog force-pushed the enhance-multiple-metrics branch from 0fcf91c to 7c778cb Compare December 12, 2025 14:16
@copy-pr-bot
Copy link

copy-pr-bot bot commented Dec 12, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@ConvolutedDog
Copy link
Contributor Author

@Alok-Joshi @bastianhagedorn Could you please review the latest commit?


return wrapper

def _explode_dataframe(self, df: pd.DataFrame) -> pd.DataFrame:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function should be part of the nsight/extraction.py module, and should be called in the extract_df_from_report function. If the raw_df is exploded once, we will not have to explode the processed df. The nsight/transformation.py module can work with the exploded raw df instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've refactored the code.

And also fix a bug:

# Normalize the AvgValue by the values of the normalization annotation
agg_df["AvgValue"] = agg_df["NormalizationValue"] / agg_df["AvgValue"]

Changed to:

agg_df["AvgValue"] = agg_df["AvgValue"] / agg_df["NormalizationValue"]

@Alok-Joshi
Copy link
Collaborator

Can you please extend the test_normalize_against for multiple metrics? The check we have is sufficient, you just need to extend the test for multiple metrics.

…ization

From agg_df["AvgValue"] = agg_df["AvgValue"] / agg_df["NormalizationValue"]
To agg_df["AvgValue"] = agg_df["NormalizationValue"] / agg_df["AvgValue"]

Signed-off-by: ConvolutedDog <[email protected]>
@ConvolutedDog
Copy link
Contributor Author

Can you please extend the test_normalize_against for multiple metrics? The check we have is sufficient, you just need to extend the test for multiple metrics.

Done. Add the test_parameter_normalize_against_multiple_metrics to handle multiple metrics scenarios.

@Alok-Joshi
Copy link
Collaborator

Overall this change looks good to me. Only thing remaining is -

  1. Extend test_parameter_metric to take in multiple_metrics and check the following in the dataframe -
    • Validate the behaviour of nsight.analyze.plot when multiple metrics in dataframe
    • Validate the dataframe itself to ensure the requested metrics are present in the dataframe

Signed-off-by: ConvolutedDog <[email protected]>
Signed-off-by: ConvolutedDog <[email protected]>
@Alok-Joshi
Copy link
Collaborator

/ok to test 481ca2b

Signed-off-by: ConvolutedDog <[email protected]>
@Alok-Joshi
Copy link
Collaborator

/ok to test f32cc7b

@ConvolutedDog
Copy link
Contributor Author

@Alok-Joshi Please rerun the CI.

@ConvolutedDog
Copy link
Contributor Author

@Alok-Joshi I'm a bit confused about this CI error (https://github.com/NVIDIA/nsight-python/actions/runs/20237826873/job/58097206975#step:5:1698). Could it be because many print(result.to_dataframe()) statements?

@Alok-Joshi
Copy link
Collaborator

@Alok-Joshi I'm a bit confused about this CI error (https://github.com/NVIDIA/nsight-python/actions/runs/20237826873/job/58097206975#step:5:1698). Could it be because many print(result.to_dataframe()) statements?

The triton example is failing because of the following error -
malloc(): unaligned tcache chunk detected

Not sure about the error; do not really have a root cause, but it can be ignored as it seems unrelated to your change. I re-ran the CI job and now it passes.

Also the intermediate AttributeError: 'NoneType' object has no attribute 'to_dataframe' errors are because of #12 .

Signed-off-by: ConvolutedDog <[email protected]>
Copy link
Collaborator

@bastianhagedorn bastianhagedorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for your efforts

@bastianhagedorn
Copy link
Collaborator

/ok to test 64df1e6

@Alok-Joshi
Copy link
Collaborator

Merging. Thanks @ConvolutedDog !

@Alok-Joshi Alok-Joshi merged commit 65a73f6 into NVIDIA:main Dec 16, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants