Add HIP 0001 ring buffer proposal for Hyperlight I/O #1112

andreiltd · 2025-12-12T14:08:09Z

jsturtevant · 2025-12-16T20:58:39Z

proposals/0001-rng-buf/README.md

+
+### Risks and Mitigations
+
+**Risk**: Malicious guest corrupting the queue **Mitigation**: Do not expose low level queue API to


I don't know that not exposing it to the guest does anything. The guest is fully untrusted and so we need to do the mitigation suggested after this, that we serialize all data as known types and do assertions as suggested.

proposals/0001-rng-buf/README.md

jsturtevant · 2025-12-16T21:06:17Z

proposals/0001-rng-buf/README.md

+example `ioeventfd` for kvm). This is especially useful for streaming scenarios where the guest can
+continue processing while the host consumes data asynchronously.
+
+**3. Inline Descriptors**


Is this proposed to be implemented now or as possible improvement in the future

I would like to give it a try in the first draft, but really if we're gonna keep it or not depends on two factors: 1) can we actually fit any meaningful data into inline slot, e.g. can serialized function call with single argument fit? 2) the performance improvement for such calls is substantial. Both are hard to assess without actual implementation.

jsturtevant · 2025-12-16T21:08:33Z

proposals/0001-rng-buf/README.md

+2. Device writes up to buffer length
+3. Device sets actual written length in descriptor
+4. If `actual_length > buffer_length`, device sets a `TRUNCATED` flag,
+5. Driver can re-submit with larger buffer if needed


I can image this would be possible performance issue? Can we provide a metric for when this occurs and a way to improve initial buffer size?

That's a good point. I would imagine that this only happens when you try to transport a "big" chunk of data for which the base case should be using stream rather then pushing everything at once. That being said the protocol should support full round trip but really the mitigation of this problem should be using streams.

proposals/0001-rng-buf/README.md

jsturtevant · 2025-12-16T21:10:12Z

proposals/0001-rng-buf/README.md

+### Snapshotting
+
+Snapshotting requires that the descriptor table has no in-flight guest-to-host requests and any
+attempt to snapshot a sandbox with such pending requests will result in a snapshot failure.


Can we ensure this?

Yes, we can query the queue for number of inflight messages during snapshotting and fail if it reports positive number. Because we have to track the number of free slots in the queue getting number of inflight messages is just length of the queue minus free slots.

proposals/0001-rng-buf/README.md

jsturtevant

This looks great, thanks for such a detailed write up. Left a few minor comments but otherwise this seems like it would set up the project to work well for the future

Signed-off-by: Tomasz Andrzejak <[email protected]>

Copilot

Pull request overview

This pull request adds HIP 0001, a comprehensive proposal document for implementing a virtio-inspired ring buffer mechanism for Hyperlight I/O. The proposal aims to replace the current stack-based communication model with a more efficient ring buffer approach to reduce VM exits and enable streaming communication patterns.

Changes:

Adds a detailed technical proposal document describing the ring buffer design, API, and implementation plan
Updates typos.toml to allow "readables" and "writables" as valid terms used in the proposal

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.

File	Description
proposals/0001-rng-buf/README.md	Comprehensive HIP document describing virtio-inspired ring buffer design with memory layout, API examples, and implementation plan
typos.toml	Adds exceptions for "readables" and "writables" terms used in the proposal document

Copilot · 2026-01-14T01:38:17Z

proposals/0001-rng-buf/README.md

+        T: FlatbufferSerializable;
+
+    /// Try to send without blocking
+    pub fn try_send<T>(&mut self, message: T) -> Result<Token, RingError>;


There's a syntax error in the code example. The try_send method has a semicolon after the parameter list on line 674, but it should be followed by the where clause on line 675 without a semicolon.

Suggested change

pub fn try_send<T>(&mut self, message: T) -> Result<Token, RingError>;

pub fn try_send<T>(&mut self, message: T) -> Result<Token, RingError>

Copilot · 2026-01-14T01:38:18Z

proposals/0001-rng-buf/README.md

+    pub fn try_recv<T>(&mut self) -> Result<T, RingError>
+    where


There's a syntax error in the code example. The try_recv method has a semicolon after the parameter list on line 687, but it should be followed by the where clause on line 688 without a semicolon.

Copilot · 2026-01-14T01:38:18Z

proposals/0001-rng-buf/README.md

+impl<A, N> Ring<A, N>
+where
+    A: Allocator,
+    N: NotificationStrategy,
+{
+    /// Split into separate sender and receiver
+    pub fn split(self) -> (RingSender<A, N>, RingReceiver);
+}


The code example shows Ring<A, N> as the type being implemented, but this type is not defined anywhere in the API. Based on the context, it appears this should be implementing methods for either RingSender<A, N> or a combined Ring<A, N> type that should be defined above.

Copilot · 2026-01-14T01:38:18Z

proposals/0001-rng-buf/README.md

+    N: NotificationStrategy,
+{
+    /// Split into separate sender and receiver
+    pub fn split(self) -> (RingSender<A, N>, RingReceiver);


The method signature is missing a return type. The split method on line 660 should specify what it returns, likely -> (RingSender<A, N>, RingReceiver) to match the comment on line 659.

Suggested change

pub fn split(self) -> (RingSender<A, N>, RingReceiver);

pub fn split(self) -> (RingSender<A, N>, RingReceiver) {

unimplemented!("split is not implemented in this example");

}

Copilot · 2026-01-14T01:38:19Z

proposals/0001-rng-buf/README.md

+    pub fn send<T>(&mut self, message: T) -> Result<Token, RingError>
+    where
+        T: FlatbufferSerializable;


The method signature is missing a semicolon. The send method declaration on line 669 should end with a semicolon since it's a trait method signature without an implementation body.

Copilot · 2026-01-14T01:38:19Z

proposals/0001-rng-buf/README.md

+  the inverse. When the device publishes a used descriptor, it sets both `AVAIL` and `USED` to its
+  wrap bit.
+- The reader of a descriptor then compares the flags it reads to its own current wrap to decide if
+  the descriptr is newly available/used now, or is it lagging behind.


The word "descriptr" is misspelled. It should be "descriptor".

Suggested change

the descriptr is newly available/used now, or is it lagging behind.

the descriptor is newly available/used now, or is it lagging behind.

Copilot · 2026-01-14T01:38:19Z

proposals/0001-rng-buf/README.md

+the next free byte in the stack.
+
+When pushing, the payload which is flatbuffer-serialized message is written at the current stack
+pointer, followed by the 8-byte footer that containins just written payload's starting offset.


The word "containins" is misspelled. It should be "containing".

Suggested change

pointer, followed by the 8-byte footer that containins just written payload's starting offset.

pointer, followed by the 8-byte footer that containing just written payload's starting offset.

Copilot · 2026-01-14T01:38:20Z

proposals/0001-rng-buf/README.md

+
+When the `INLINE` flag is set, the `addr` is unused. This optimization, inspired by io_uring,
+eliminates memory indirection for common small messages, improving both latency and cache behavior.
+The tradeof is the increased size of descriptor table. Alternatively we could repurpose the `addr`


The word "tradeof" is misspelled. It should be "tradeoff" or "trade-off".

Suggested change

The tradeof is the increased size of descriptor table. Alternatively we could repurpose the `addr`

The tradeoff is the increased size of descriptor table. Alternatively we could repurpose the `addr`

Copilot · 2026-01-14T01:38:20Z

proposals/0001-rng-buf/README.md

+    pub fn recv<T>(&mut self) -> Result<T, RingError>
+    where


There's a syntax error in the code example. The recv method has a semicolon after the parameter list on line 682, but it should be followed by the where clause on line 683 without a semicolon.

Copilot · 2026-01-14T01:38:20Z

proposals/0001-rng-buf/README.md

+**2. Event based notifications**
+
+In the single threaded application the notification involve VM exit but in multi-thread environment
+where host and guest are running in separate threads we can laverage event-based notifications (for


The word "laverage" is misspelled. It should be "leverage".

Suggested change

where host and guest are running in separate threads we can laverage event-based notifications (for

where host and guest are running in separate threads we can leverage event-based notifications (for

danbugs · 2026-01-14T02:05:09Z

Really great work on this HIP, @andreiltd! Loved seeing the diagrams, extensive testing plan, code, and whatnot.

A few thoughts:

1. Side-by-side comparison suggestion

I think the HIP could benefit from a more explicit side-by-side comparison in the "Comparison with current implementation" section. Something like a table showing:

Current: push_shared_output_data()-> outb -> VM Exit -> pop_shared_input_data() (per call)
Proposed: submit_available() × N -> notify() → VM Exit → poll_used() × N (batched)

I feel like this would help readers quickly grasp the key difference in the two approaches.

2. Related work in Nanvix fork

I've done some similar work in the Nanvix hyperlight fork that might be relevant: 76c9e7c

There, I added a credits_value field to HyperlightPEB that the guest can check before performing a VMExit. This way, the guest avoids unnecessary exits when the host signals no resources are available. This is similar to ring buffer notification suppression / checking num_free() before submitting, so I feel like Hyperlight-Nanvix could really benefit from this work!

3. Single-threaded vs multi-threaded phasing

For the multi-threaded case (guest and host on separate threads), the host would need a way to interrupt the running guest when responses are ready--yes? For that, I've also done related work for that in Nanvix's Hyperlight fork: a897cad for KVM.

My hardware interrupts work enables PIC/APIC interrupt injection which could support the ioeventfd-style notifications mentioned in the HIP. Happy to coordinate on this when the time comes :)

The backward compatibility goal (current function call model portable without public API changes) is really cool to see--great for downstream consumers like Nanvix.

Looking forward to seeing this land!

andreiltd requested review from danbugs, dblnz, devigned, jprendes, ludfjig, marosset, simongdavies and syntactically as code owners December 12, 2025 14:08

andreiltd added area/documentation Related to documentation kind/enhancement For PRs adding features, improving functionality, docs, tests, etc. labels Dec 12, 2025

jsturtevant reviewed Dec 16, 2025

View reviewed changes

proposals/0001-rng-buf/README.md Show resolved Hide resolved

jsturtevant reviewed Dec 16, 2025

View reviewed changes

proposals/0001-rng-buf/README.md Outdated Show resolved Hide resolved

jsturtevant reviewed Dec 16, 2025

View reviewed changes

proposals/0001-rng-buf/README.md Outdated Show resolved Hide resolved

jsturtevant reviewed Dec 16, 2025

View reviewed changes

proposals/0001-rng-buf/README.md Show resolved Hide resolved

jsturtevant reviewed Dec 16, 2025

View reviewed changes

proposals/0001-rng-buf/README.md Show resolved Hide resolved

jsturtevant previously approved these changes Dec 16, 2025

View reviewed changes

andreiltd dismissed jsturtevant’s stale review via 2318367 December 18, 2025 09:04

Add HIP 0001 ring buffer proposal for Hyperlight I/O

25f5d22

Signed-off-by: Tomasz Andrzejak <[email protected]>

andreiltd force-pushed the tandr/rng-rfc branch from 2318367 to 25f5d22 Compare December 18, 2025 09:09

Merge branch 'main' into tandr/rng-rfc

6f5c265

danbugs requested a review from Copilot January 14, 2026 01:34

Copilot started reviewing on behalf of danbugs January 14, 2026 01:35 View session

Copilot AI reviewed Jan 14, 2026

View reviewed changes


		### Risks and Mitigations

		Risk: Malicious guest corrupting the queue Mitigation: Do not expose low level queue API to

	pub fn try_send<T>(&mut self, message: T) -> Result<Token, RingError>;
	pub fn try_send<T>(&mut self, message: T) -> Result<Token, RingError>

-    pub fn split(self) -> (RingSender<A, N>, RingReceiver);
+    pub fn split(self) -> (RingSender<A, N>, RingReceiver) {
+        unimplemented!("split is not implemented in this example");
+    }

	the descriptr is newly available/used now, or is it lagging behind.
	the descriptor is newly available/used now, or is it lagging behind.

	pointer, followed by the 8-byte footer that containins just written payload's starting offset.
	pointer, followed by the 8-byte footer that containing just written payload's starting offset.

	The tradeof is the increased size of descriptor table. Alternatively we could repurpose the `addr`
	The tradeoff is the increased size of descriptor table. Alternatively we could repurpose the `addr`

	where host and guest are running in separate threads we can laverage event-based notifications (for
	where host and guest are running in separate threads we can leverage event-based notifications (for

Add HIP 0001 ring buffer proposal for Hyperlight I/O #1112

Are you sure you want to change the base?

Add HIP 0001 ring buffer proposal for Hyperlight I/O #1112

Uh oh!

Conversation

andreiltd commented Dec 12, 2025

Uh oh!

jsturtevant Dec 16, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jsturtevant Dec 16, 2025

Choose a reason for hiding this comment

Uh oh!

andreiltd Dec 18, 2025

Choose a reason for hiding this comment

Uh oh!

jsturtevant Dec 16, 2025

Choose a reason for hiding this comment

Uh oh!

andreiltd Dec 18, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jsturtevant Dec 16, 2025

Choose a reason for hiding this comment

Uh oh!

andreiltd Dec 18, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jsturtevant left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

danbugs commented Jan 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!