
Conversation

@JasMehta08
Contributor

This Pull request:

improves the existing Performance Metrics in RNTuple.

Changes or fixes:

This PR addresses the issue raised about improving the metrics. It implements the following (a short sketch of these ratios follows the list):

  • GetSparseness(): Calculates the ratio of explicit payload bytes read to the total file size.
  • GetRandomness(): Calculates the ratio of seek distance to bytes read (identifies inefficient access patterns).
  • GetTransactions(): Tracks the total number of physical I/O operations (now using a dedicated fTransactions atomic counter).
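
A plain-C++ sketch of how these ratios can be derived from raw counters. The struct and function names are illustrative only (the member names mirror the counters described under Implementation Details below), not the actual ROOT implementation:

#include <atomic>
#include <cstdint>

// Raw counters, updated on every physical read request.
struct RRawIOCounters {
   std::atomic<std::uint64_t> fTransactions{0};      // number of physical I/O operations
   std::atomic<std::uint64_t> fExplicitBytesRead{0}; // payload bytes explicitly requested
   std::atomic<std::uint64_t> fSumSkip{0};           // total seek distance between requests
};

// Sparseness: fraction of the file that was actually read as payload.
double GetSparseness(const RRawIOCounters &c, std::uint64_t totalFileSize)
{
   return totalFileSize == 0 ? 0.0 : static_cast<double>(c.fExplicitBytesRead.load()) / totalFileSize;
}

// Randomness: seek distance per byte read; large values indicate scattered access.
double GetRandomness(const RRawIOCounters &c)
{
   const auto bytes = c.fExplicitBytesRead.load();
   return bytes == 0 ? 0.0 : static_cast<double>(c.fSumSkip.load()) / bytes;
}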

Implementation Details:

These are implemented by introducing transient, thread-safe members in RNTupleMetrics:

  • std::atomic<std::uint64_t> fSumSkip: To track total seek distance.

  • std::atomic<std::uint64_t> fExplicitBytesRead: To track payload bytes.

  • std::atomic<std::uint64_t> fTransactions: To track I/O operations.

I have changed RPageStorageFile to update these counters during ReadBuffer, ReadV, and LoadClusters operations. The use of std::atomic ensures thread safety with negligible performance overhead on the I/O path.
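
A minimal sketch of this accounting on a hypothetical read path, reusing the RRawIOCounters struct from the sketch above (the real hook points are ReadBuffer, ReadV, and LoadClusters; the function name here is illustrative):

// Called for every physical read of `nbytes` at absolute file offset `offset`.
// `lastOffset` tracks the end of the previous request so the seek distance can be derived.
void OnPhysicalRead(RRawIOCounters &c, std::uint64_t &lastOffset, std::uint64_t offset, std::uint64_t nbytes)
{
   c.fTransactions.fetch_add(1, std::memory_order_relaxed);
   c.fExplicitBytesRead.fetch_add(nbytes, std::memory_order_relaxed);
   // Seek distance as an unsigned absolute difference (no signed overflow possible).
   const std::uint64_t skip = offset >= lastOffset ? offset - lastOffset : lastOffset - offset;
   c.fSumSkip.fetch_add(skip, std::memory_order_relaxed);
   lastOffset = offset + nbytes;
}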

Reset Functionality & Accuracy:
I added a Reset() method to RNTupleMetrics. This is needed to obtain accurate Randomness metrics for the analysis loop (a usage sketch follows the list below).

  • Without Reset(): The metric is dominated by the initial file seek (Header → Footer → Header), resulting in a Randomness score > 2.0.

  • With Reset(): Users can clear the counters after initialization, isolating the steady-state performance of their event loop.
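
A rough sketch of the intended workflow, assuming the accessors described in this PR (Reset(), GetRandomness()) are available on the object returned by GetMetrics(); illustrative only, not verbatim production code:

#include <ROOT/RNTupleReader.hxx>
#include <iostream>

int main()
{
   using namespace ROOT::Experimental;
   using namespace ROOT;

   // Assumes an RNTuple named "ntuple" exists in data.root.
   auto reader = RNTupleReader::Open("ntuple", "data.root");
   reader->EnableMetrics();
   reader->GetMetrics().Reset(); // discard the header/footer seeks recorded during Open()

   for (auto i : reader->GetEntryRange())
      reader->LoadEntry(i); // steady-state event loop

   // Now reflects only the event loop, not the initialization phase.
   std::cout << "Randomness: " << reader->GetMetrics().GetRandomness() << std::endl;
   return 0;
}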

Verification:

Added a new unit test TEST(Metrics, IOMetrics) in tree/ntuple/test/ntuple_metrics.cxx to verify the logic and the Reset() behavior.

Also tested locally using the code below.

Manual Verification Code & Output:
#include <ROOT/RNTupleModel.hxx>
#include <ROOT/RNTupleWriter.hxx>
#include <ROOT/RNTupleReader.hxx>
#include <ROOT/RNTupleMetrics.hxx>
#include <iostream>

int main()
{
   using namespace ROOT::Experimental;
   using namespace ROOT;


   std::string ntupleName = "ntuple";
   std::string fileName = "verify_metrics.root";

   // Create a dummy RNTuple
   std::cout << "[INFO] Creating dummy RNTuple..." << std::endl;
   {
      auto model = RNTupleModel::Create();
      auto field = model->MakeField<int>("val");
      auto writer = RNTupleWriter::Recreate(std::move(model), ntupleName, fileName);
      for (int i = 0; i < 1000; ++i) {
         *field = i;
         writer->Fill();
      }
   }

   // Read it back
   std::cout << "[INFO] Reading RNTuple and enabling metrics..." << std::endl;
   auto reader = RNTupleReader::Open(ntupleName, fileName);
   reader->EnableMetrics();

   // Read everything to force I/O
   for (auto entry : reader->GetEntryRange()) {
      (void)entry;
   }

   // Print Metrics
   auto& metrics = reader->GetMetrics();
   
   std::cout << "\n===== RNTuple Performance Metrics Report =====" << std::endl;
   std::cout << "Transactions: " << metrics.GetTransactions() << " (Expected > 0)" << std::endl;
   std::cout << "Sparseness:   " << metrics.GetSparseness() << " (Expected > 0.0)" << std::endl;
   std::cout << "Randomness:   " << metrics.GetRandomness() << " (Expected > 0.0)" << std::endl;
   std::cout << "==============================================\n" << std::endl;

   // Verification Logic
   bool passed = true;
   if (metrics.GetTransactions() <= 0) {
       std::cerr << "[FAIL] Transactions should be > 0" << std::endl;
       passed = false;
   }
   if (metrics.GetSparseness() <= 0.0 || metrics.GetSparseness() > 1.0) {
       std::cerr << "[FAIL] Sparseness should be in (0.0, 1.0]" << std::endl;
       passed = false;
   }
   if (metrics.GetRandomness() <= 0.0) {
       std::cerr << "[FAIL] Randomness should be > 0.0" << std::endl;
       passed = false;
   }

   // Verify Reset functionality
   std::cout << "[INFO] Testing Reset()..." << std::endl;
   reader->GetMetrics().Reset();
   
   // We expect Randomness to go from ~2.13 down to 0.0
   if (metrics.GetSparseness() == 0.0 && metrics.GetRandomness() == 0.0) {
       std::cout << "[SUCCESS] Reset() worked! Randomness dropped to 0.0" << std::endl;
   } else {
       std::cerr << "[FAIL] Reset() failed! Counters are not zero." << std::endl;
       passed = false;
   }
   
   if (passed) {
       std::cout << "[SUCCESS] All metrics logic verified!" << std::endl;
       return 0;
   } else {
       std::cout << "[FAILURE] Some metrics checks failed." << std::endl;
       return 1;
   }
}
[INFO] Reading RNTuple and enabling metrics...

===== RNTuple Performance Metrics Report =====
Transactions: 2 (Expected > 0)
Sparseness:   0.181648 (Expected > 0.0)
Randomness:   2.13746 (Expected > 0.0)
==============================================

[INFO] Testing Reset()...
[SUCCESS] Reset() worked! Randomness dropped to 0.0
[SUCCESS] All metrics logic verified!
Second verification: a larger, analysis-style dataset, using Reset() to isolate the event loop (code & output below).

#include <ROOT/RNTupleModel.hxx>
#include <ROOT/RNTupleWriter.hxx>
#include <ROOT/RNTupleWriteOptions.hxx>
#include <ROOT/RNTupleReader.hxx>
#include <ROOT/RNTupleMetrics.hxx>
#include <iostream>
#include <cmath>   // std::abs(float)
#include <cstdlib> // std::rand

int main()
{
   using namespace ROOT::Experimental;
   using namespace ROOT;

   std::string ntupleName = "ntuple";
   std::string fileName = "verify_metrics_v2.root";

   // Create Data (50MB analysis dataset)
   {
      auto model = RNTupleModel::Create();
      auto fieldPt = model->MakeField<float>("pt");
      auto fieldEta = model->MakeField<float>("eta");
      auto fieldPhi = model->MakeField<float>("phi");

      // Realistic cluster size for analysis (1MB)
      RNTupleWriteOptions options;
      options.SetApproxZippedClusterSize(1024 * 1024); 
      
      auto writer = RNTupleWriter::Recreate(std::move(model), ntupleName, fileName, options);
      
      std::cout << "[INFO] Creating 50MB analysis dataset..." << std::endl;
      // ~4.2M events × 12 bytes/event = ~50MB
      for (int i = 0; i < 4200000; ++i) {
         *fieldPt = 20.0f + static_cast<float>(std::rand() % 100);
         *fieldEta = -2.5f + static_cast<float>(std::rand() % 500) / 100.0f;
         *fieldPhi = -3.14f + static_cast<float>(std::rand() % 628) / 100.0f;
         writer->Fill();
         
         if (i % 500000 == 0) {
            std::cout << "  Written " << (i / 1000) << "k events..." << std::endl;
         }
      }
      std::cout << "[INFO] Dataset created successfully." << std::endl;
   }

   // Open and IMMEDIATE RESET
   std::cout << "[INFO] Opening RNTuple..." << std::endl;
   auto reader = RNTupleReader::Open(ntupleName, fileName);
   reader->EnableMetrics();
   
   // FORCE the first seek (Tail -> Head) so we can clear it
   try {
        // LoadEntry is void but might throw if out of range, 0 is safe here
        reader->LoadEntry(0); 
   } catch (...) {}

   auto& m_pre = reader->GetMetrics();
   std::cout << "\n[DEBUG] Pre-Reset Metrics:" << std::endl;
   std::cout << "  Transactions: " << m_pre.GetTransactions() << std::endl;
   std::cout << "  Randomness:   " << m_pre.GetRandomness() << std::endl;
   std::cout << "  Total File Size: " << m_pre.GetTotalFileSize() << " bytes" << std::endl;
   std::cout << "  N Entries: " << reader->GetNEntries() << std::endl;

   std::cout << "[INFO] Resetting Metrics to isolate analysis phase..." << std::endl;
   reader->GetMetrics().Reset();

   // Analysis Loop (actually access the data)
   std::cout << "[INFO] Running analysis on " << reader->GetNEntries() << " events..." << std::endl;
   
   auto viewPt = reader->GetView<float>("pt");
   auto viewEta = reader->GetView<float>("eta");
   auto viewPhi = reader->GetView<float>("phi");
   
   long selectedEvents = 0;
   double sumPt = 0.0;
   
   for (auto i : reader->GetEntryRange()) {
      float pt = viewPt(i);
      float eta = viewEta(i);
      float phi = viewPhi(i);
      
      if (pt > 25.0 && std::abs(eta) < 2.4) {
         selectedEvents++;
         sumPt += pt;
      }
      
      if (i % 500000 == 0 && i > 0) {
         std::cout << "  Processed " << (i / 1000) << "k events..." << std::endl;
      }
   }
   
   std::cout << "[INFO] Analysis complete. Selected " << selectedEvents 
             << " events, <pT> = " << (sumPt / selectedEvents) << " GeV" << std::endl;

   auto& metrics = reader->GetMetrics();

   // Report Analysis Metrics
   std::cout << "\n===== Analysis Performance Report =====" << std::endl;
   std::cout << "Transactions: " << metrics.GetTransactions() << " (Expected > 0)" << std::endl;
   std::cout << "Sparseness:   " << metrics.GetSparseness() << std::endl;
   std::cout << "Randomness:   " << metrics.GetRandomness() << " (Expected low for sequential)" << std::endl;
   std::cout << "======================================\n" << std::endl;

   if (metrics.GetTransactions() > 0 && metrics.GetRandomness() < 1.0) {
       std::cout << "[SUCCESS] Analysis metrics look good!" << std::endl;
       std::cout << "          Sequential reading achieved (low randomness)." << std::endl;
       return 0;
   } else {
       std::cerr << "[FAIL] Unexpected metrics pattern." << std::endl;
       std::cerr << "       Transactions: " << metrics.GetTransactions() << std::endl;
       std::cerr << "       Randomness: " << metrics.GetRandomness() << std::endl;
       return 1;
   }
}
[INFO] Creating 50MB analysis dataset...
  Written 0k events...
  Written 500k events...
  Written 1000k events...
  Written 1500k events...
  Written 2000k events...
  Written 2500k events...
  Written 3000k events...
  Written 3500k events...
  Written 4000k events...
[INFO] Dataset created successfully.
[INFO] Opening RNTuple...

[DEBUG] Pre-Reset Metrics:
  Transactions: 4
  Randomness:   24.7884
  Total File Size: 0 bytes
  N Entries: 4200000
[INFO] Resetting Metrics to isolate analysis phase...
[INFO] Running analysis on 4200000 events...
  Processed 500k events...
  Processed 1000k events...
  Processed 1500k events...
  Processed 2000k events...
  Processed 2500k events...
  Processed 3000k events...
  Processed 3500k events...
  Processed 4000k events...
[INFO] Analysis complete. Selected 3782202 events, <pT> = 72.4999 GeV

===== Analysis Performance Report =====
Transactions: 24 (Expected > 0)
Sparseness:   0.919237
Randomness:   4.08388e-05 (Expected low for sequential)
======================================

[SUCCESS] Analysis metrics look good!
          Sequential reading achieved (low randomness).
Checklist:
  • tested changes locally
  • updated the docs (if necessary)

This PR fixes #20853

Contributor

@jblomer jblomer left a comment


Thank you!

Before going into a detailed review, I see a few larger items that need to be addressed.

  • The metrics should not be added directly as atomic integers to the RNTupleMetrics object. We have RNTuplePerfCounter for this.
  • Nice observation on the randomness metric. However, I would prefer doing this internally, e.g. excluding header and footer or having two metrics (with and without).
  • I think it makes more sense to do the accounting in the fReader; that should save us from chasing all the invocations of ReadBuffer.

@JasMehta08
Contributor Author

@jblomer Thanks for the review. I have updated the PR to address your three main points:

  • Counters: I replaced the std::atomic members with RNTupleAtomicCounter (using the standard RNTuplePerfCounter).

  • Accounting: I moved all I/O accounting into RMiniFileReader (specifically ReadBuffer and ReadV). RPageStorageFile now just passes the metrics pointer to the reader.

  • Randomness Logic: As suggested, I handled the exclusion internally rather than via Reset():

    • I added an internal fSumSkipAnalysis counter.

    • The reader automatically switches to the "Analysis Phase" at the end of Open().

    • GetRandomness() calculates the ratio using only the analysis-phase seeks, effectively ignoring header/footer overhead without user intervention (see the sketch below).
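
A plain-C++ illustration of the phase-aware seek accounting described above. The class and member names are illustrative (the actual implementation registers RNTupleAtomicCounter instances on RNTupleMetrics):

#include <atomic>
#include <cstdint>

// Seek distances are split into an "open" phase (header/footer reads) and an
// "analysis" phase; GetRandomness() only uses the latter.
struct RPhaseAwareSeekStats {
   std::atomic<std::uint64_t> fSumSkip{0};           // all seeks, including Open()
   std::atomic<std::uint64_t> fSumSkipAnalysis{0};   // seeks recorded after Open() finished
   std::atomic<std::uint64_t> fBytesReadAnalysis{0}; // payload bytes read after Open() finished
   bool fAnalysisPhase = false;                      // flipped to true at the end of Open()

   void RecordSeek(std::uint64_t skip, std::uint64_t nbytes)
   {
      fSumSkip += skip;
      if (fAnalysisPhase) {
         fSumSkipAnalysis += skip;
         fBytesReadAnalysis += nbytes;
      }
   }

   double GetRandomness() const
   {
      const auto bytes = fBytesReadAnalysis.load();
      return bytes == 0 ? 0.0 : static_cast<double>(fSumSkipAnalysis.load()) / bytes;
   }
};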

@JasMehta08
Contributor Author

While verifying the changes, I noticed a pre-existing issue with RPageSourceFile. When the source object is moved (e.g. into the Reader), the internal fReader still points to the old fMetrics address.

This results in 0 metrics during verification tests that snapshot the reader across a move. It likely doesn't affect production (where the reader owns the source), but it breaks the "Snapshot & Subtract" verification logic.

I have not added a custom Move Constructor to fix this yet, as it felt like a separate architectural fix outside the scope of this metrics refactor. Let me know if you would prefer I include that fix here.

@JasMehta08
Contributor Author

@jblomer

I pushed a polish commit addressing three specific things that I noticed while reviewing:

  • Object Lifetimes (CloneImpl): Explicitly documented the raw pointer dependency with a // SAFETY comment. I kept the raw pointer pattern to match the existing RNTuple design (Reader owns Views).

    • Question: Is this "Contract-Based Safety" acceptable here, or would std::weak_ptr be better for strict lifetime enforcement?
  • Concurrency Docs: Added comments clarifying that fLastOffset is thread-local (per Clone), while fMetrics uses shared atomics.

  • UB Safety: Reworked the seek-distance logic to use purely unsigned arithmetic (offset >= fLastOffset ? offset - fLastOffset : fLastOffset - offset) to eliminate theoretical signed integer overflow (even though unlikely).

@JasMehta08 JasMehta08 requested a review from jblomer February 2, 2026 16:44
@jblomer
Contributor

jblomer commented Feb 2, 2026

I'm afraid this isn't going quite in the right direction. Take a look at ROOT::Internal::RPageSource::EnableDefaultMetrics, we already have some computed metrics that you can use as a blueprint.
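
For context, the computed-metric pattern referred to here keeps raw counters and derives ratios from them only when queried. A plain-C++ sketch of that idea (the counter names are loose assumptions; the real blueprint uses ROOT's own counter classes registered in EnableDefaultMetrics):

#include <cstdint>

// Raw counters as maintained on the I/O path (names are assumptions,
// loosely modelled on the existing default metrics).
struct RRawReadCounters {
   std::uint64_t fSzReadPayload = 0;  // payload bytes read
   std::uint64_t fSzReadOverhead = 0; // extra bytes read beyond the payload
};

// A "computed" metric: derived on demand from the raw counters instead of
// being maintained as a counter itself.
inline double ReadEfficiency(const RRawReadCounters &c)
{
   const std::uint64_t total = c.fSzReadPayload + c.fSzReadOverhead;
   return total == 0 ? 0.0 : static_cast<double>(c.fSzReadPayload) / total;
}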

@JasMehta08 JasMehta08 force-pushed the performance-metrics-RNTuple branch from fc79786 to c92fc14 Compare February 3, 2026 16:01
@JasMehta08
Copy link
Contributor Author

Hey @jblomer, thanks for the review, and sorry about misunderstanding the previous one.

I moved the implementation to follow the EnableDefaultMetrics blueprint.

Please let me know if these changes are going in the right direction, or whether I should look a bit deeper into the library.

On a side note, RNTuple seems very interesting to me. I'm interested in the low-level systems engineering side of this and in learning about columnar data structures in general. Could you let me know if there is any way to dive deeper and understand it even better? I went through the ROOT reference guide but could only find an introduction page, and the video on the HSF YouTube channel did not have a lot of detail.

Please take a look and let me know if these changes are as desired.

Thanks!

@JasMehta08 JasMehta08 force-pushed the performance-metrics-RNTuple branch from c92fc14 to 5a8845d Compare February 3, 2026 17:28


Development

Successfully merging this pull request may close these issues.

[ntuple] add several useful performance metrics
