-
Notifications
You must be signed in to change notification settings - Fork 14
Support in-memory I/O APIs #493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9ffbf5a to
9d797ac
Compare
98dee32 to
2ea94e3
Compare
2ea94e3 to
fa3bd87
Compare
fcaa3bb to
0b10f1c
Compare
0b10f1c to
993e658
Compare
src/engine.rs
Outdated
| // Need to track with deterministic order so don't use a hash | ||
| let mut imports = vec![]; | ||
| for import in module.imports().map(|i| i.module().to_string()) { | ||
| if !imports.contains(&import) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does Rust not have an ordered set data structure we could collect to instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't one in the standard library that preserves insertion order. There is a BTreeSet that will iterate over the keys in sorted order. FWIW, the valid cases are 0, 1, or 2 distinct modules that are being importing from, so the contains operation will be faster than the equivalent for any kind of set. Even in the invalid case, I'm not sure if the number of distinct modules would realistically go over a point where a set would offer a performance profile.
| let log_len2 = u32::from_le_bytes(buf[20..24].try_into().unwrap()) as usize; | ||
|
|
||
| let mut output = vec![0; output_len]; | ||
| memory.read(store.as_context(), output_offset, &mut output)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The memory advances an internal offset after every read I presume?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. Conceptually you can think of the memory as a Vec<u8> in the store. The read operation just says start reading from this offset into the mutable buffer passed to it and takes the length of the mutable buffer as how much to read. There's no internal offset.
otheriault
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never worked in this repo, but the changes and tests make sense to me. Maybe we should wait for another reviewer to approve.
src/io.rs
Outdated
| let provider_path = format!("{module_name}.wasm"); | ||
| let imported_module_bytes = StandardProviders::get(&provider_path); | ||
|
|
||
| if let Some(bytes) = imported_module_bytes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would imported_module_bytes be None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the Function module only imports from wasi_snapshot_preview1 and has no other imports.
src/io.rs
Outdated
| if is_mem_io_provider(module_name) { | ||
| mem_io_instance = Some(imported_module_instance); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we error in the case that we hit this condition more than once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. But ideally I'd want that state to be unrepresentable since it's invalid. I'm thinking I might need to revisit the validation logic I added and maybe create a ValidatedModule wrapper or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a ValidatedModule abstraction to operate on. That said, it does rely on instantiation failing with unmet imports to handle some cases. Realistically, it's pretty hard to accidentally have multiple standard imports and code with multiple standard imports wouldn't work properly even if there were all linked. So I'm not too concerned about making that a nice experience given how unlikely it is to happen in practice.
| logs.append(&mut logs2); | ||
|
|
||
| if logs.len() > FUNCTION_LOG_LIMIT { | ||
| logs.splice(0..1, b"[TRUNCATED]...".iter().copied()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I understanding correctly that we aren't truncating anything here, just adding the [TRUNCATED]... at the start?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function-runner doesn't truncate but the provider does and the capacity for the provider's log limit is 1 byte larger than our log limit so we can detect that truncation happened.
| .ok_or_else(|| anyhow!("Missing memory export named memory"))? | ||
| .write(store.as_context_mut(), input_offset as _, &self.input.raw)?; | ||
| } | ||
| Ok(()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we decide whether fuel counts during initialization? If not, do we need to set the fuel to u64::MAX again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We decided it doesn't and we set it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, I can move the fuel setting logic to this method if it would be easier. I generally prefer to avoid performing unrelated side-effects though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to avoid side-effects we could store the fuel value at the start and re-set it at the end of this method like we proposed for finalize
| IOStrategy::Memory(instance) => { | ||
| let instance = instance.expect("Should have been defined in initialize"); | ||
| store.set_epoch_deadline(1); // Make sure we don't timeout while finalizing. | ||
| store.set_fuel(u64::MAX)?; // Make sure we don't run out of fuel finalizing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that we check the fuel before calling this but I think it would be an easy enough mistake to move it after. Do we have a test catching that, or should we add some logic here to get the fuel at the start of this method and re-set it at the end of this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd probably be easier to reset it if we want to be defensive.
src/validated_module.rs
Outdated
| let mut std_import = None; | ||
| for import in imports { | ||
| if let Some(bytes) = StandardProviders::get(&format!("{import}.wasm")) { | ||
| std_import = Some(Provider { | ||
| bytes: bytes.data, | ||
| name: import, | ||
| }); | ||
| break; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would Iterator::find_map work here?
Support providers that use in-memory buffers instead of WASI hostcalls for I/O.