-
Notifications
You must be signed in to change notification settings - Fork 1
Feature-complete RR #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Helps get some binaries to test with
Publish dev artifacts from CI
Test out the merge queue
…hooks into a separate module
Things left to complete: * Core wasm support * Improved error handling * Wasmtime CLI integration
… completed core wasm rr
cfallin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is generally pretty good -- thanks a bunch for all of your hard work here!
I have a ton of nits below but I don't think there is anything fundamentally wrong with any of the design. This is more a lot of polish and getting details right.
My eyes did glaze over at several parts around macro usage, and innards of the component ABI. Definitely will need careful review by Alex and/or others when this lands for real. But this should be a start at least. Happy to discuss any of the below.
| impl<T: Read + Seek + Send + Sync> ReplayReader for T {} | ||
| /// In `no_std`, types must provide explicit read/seek capabilities | ||
| /// to a underlying byte slice through these methods. | ||
| pub trait ReplayReader: AsRef<[u8]> + Send + Sync { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than AsRef<[u8]>, which I think means that the reader must materialize the rest of the trace into memory, could we have a separate fn read(&mut self, len: usize) -> &[u8] that reads len bytes and returns a slice? If EOF then the slice may be shorter than len. This would allow an implementation with a buffer fed by underlying storage on demand (e.g. from a file).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought with no_std, the point is that you don't have a file-like storage? Isn't the implication that everything is materialized in memory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, no, the literal implication is that there isn't a standard library. The user's custom environment may very well still have storage abstractions. Consider e.g. dropping Wasmtime into a big embedded system with a custom OS, or something like that.
|
Thanks for the quick response! I skimmed over most of the new commits; the one that I want to understand more closely (with non late-night brain) is the change to permit reentrancy (522bb60). I'll take a look at that one separately in a bit. |
|
Btw, I have a commented out line of code in the repo now to throw a compile error if component-model-async and rr are enabled together (currently disabled since CI needs to pass). We should figure out how to add that into the CI as well |
Hmm, looking through the CI config, it's more than one Perhaps we can put something in |
|
I left one comment in the reentrancy commit re: lifetimes (can we try harder to avoid the unsafe, basically; happy to help think through this). More generally though I'm not sure I understand the nested event loop thing in the return hook -- I mean, I understand how architecturally one could be forced into it, but ideally I'd want replay to be one loop at one level -- that also makes eventual trace seeking (which we'll need for reversible debugging) feasible, versus having to have a nested series of stack frames in certain states. |
|
Yeah, this is challenging, I thought quite a bit on how to get re-entrancy to work and had settled on this. The unsafe doesn't really have to do with lifetimes. It's that to accomplish re-entrancy, the replay stub for the host functions need access to the As for replay being one loop at one level, is it possible to do that? I'm not sure how that's even possible with re-entrancy, because there is no getting around actually invoking the wasm function from the return hook right? The "loop" in the return hook is just because we can have re-entrancy arbitrary levels deep. The only thing it expects is a balanced set of entry/return events. |
|
FYI, mentioned this PR to Alex today in the Wasmtime biweekly and he will take a look sometime in the next few weeks -- cc @alexcrichton. |
Thinking more about this, it seems like a very core challenge for reversible debugging. We fundamentally need to snapshot and restore to earlier execution states. Our plan-of-record for the active stack frames has been to actually memcpy out the fiber stack (the active parts of it, down to saved SP) and copy it back in; if no pointers move, then everything is still valid. But if we have native frames between Wasm activations, all that goes right out. There's no safe way at all to restore a set of native frames that core Wasm called out to, that called back into Wasm. Maybe we don't support reentrancy during record or replay; maybe the component model got this right. (Side-eyeing upcoming changes to support reentrancy that I've heard rumblings about -- I don't know any details.) In this case, we'll want to (I think) trap on attempted re-entry into Wasm when recording... |
|
Actually yeah, I just realized, it's more than native frames, we'd actually need native snapshots between activations (which ends up being just like RR!) since Store/Instance state can be modified too. |
Notes
Func,TypedFunc, andComponentFuncasyncStructure
rrfeature. The only things exposed without the feature arerr::hooks, which offers convenient methods to hook recording/replaying mechanisms into the rest of the engine.As a result,
rrwithouthooksshould be able to land with no impact on existing Wasmtime whatsoever.Observed overheads
Between 4-8%