Skip to content

Conversation

@jeffcharles
Copy link
Contributor

@jeffcharles jeffcharles commented Aug 8, 2025

Support providers that use in-memory buffers instead of WASI hostcalls for I/O.

@jeffcharles jeffcharles force-pushed the jc.in-memory-io branch 7 times, most recently from 98dee32 to 2ea94e3 Compare August 18, 2025 15:29
@jeffcharles jeffcharles force-pushed the jc.in-memory-io branch 2 times, most recently from fcaa3bb to 0b10f1c Compare September 11, 2025 17:55
@jeffcharles jeffcharles marked this pull request as ready for review September 17, 2025 14:44
src/engine.rs Outdated
// Need to track with deterministic order so don't use a hash
let mut imports = vec![];
for import in module.imports().map(|i| i.module().to_string()) {
if !imports.contains(&import) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Rust not have an ordered set data structure we could collect to instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't one in the standard library that preserves insertion order. There is a BTreeSet that will iterate over the keys in sorted order. FWIW, the valid cases are 0, 1, or 2 distinct modules that are being importing from, so the contains operation will be faster than the equivalent for any kind of set. Even in the invalid case, I'm not sure if the number of distinct modules would realistically go over a point where a set would offer a performance profile.

let log_len2 = u32::from_le_bytes(buf[20..24].try_into().unwrap()) as usize;

let mut output = vec![0; output_len];
memory.read(store.as_context(), output_offset, &mut output)?;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The memory advances an internal offset after every read I presume?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. Conceptually you can think of the memory as a Vec<u8> in the store. The read operation just says start reading from this offset into the mutable buffer passed to it and takes the length of the mutable buffer as how much to read. There's no internal offset.

Copy link

@otheriault otheriault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never worked in this repo, but the changes and tests make sense to me. Maybe we should wait for another reviewer to approve.

src/io.rs Outdated
let provider_path = format!("{module_name}.wasm");
let imported_module_bytes = StandardProviders::get(&provider_path);

if let Some(bytes) = imported_module_bytes {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would imported_module_bytes be None?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the Function module only imports from wasi_snapshot_preview1 and has no other imports.

src/io.rs Outdated
Comment on lines 208 to 210
if is_mem_io_provider(module_name) {
mem_io_instance = Some(imported_module_instance);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we error in the case that we hit this condition more than once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But ideally I'd want that state to be unrepresentable since it's invalid. I'm thinking I might need to revisit the validation logic I added and maybe create a ValidatedModule wrapper or something like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a ValidatedModule abstraction to operate on. That said, it does rely on instantiation failing with unmet imports to handle some cases. Realistically, it's pretty hard to accidentally have multiple standard imports and code with multiple standard imports wouldn't work properly even if there were all linked. So I'm not too concerned about making that a nice experience given how unlikely it is to happen in practice.

logs.append(&mut logs2);

if logs.len() > FUNCTION_LOG_LIMIT {
logs.splice(0..1, b"[TRUNCATED]...".iter().copied());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I understanding correctly that we aren't truncating anything here, just adding the [TRUNCATED]... at the start?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

function-runner doesn't truncate but the provider does and the capacity for the provider's log limit is 1 byte larger than our log limit so we can detect that truncation happened.

.ok_or_else(|| anyhow!("Missing memory export named memory"))?
.write(store.as_context_mut(), input_offset as _, &self.input.raw)?;
}
Ok(())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we decide whether fuel counts during initialization? If not, do we need to set the fuel to u64::MAX again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided it doesn't and we set it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, I can move the fuel setting logic to this method if it would be easier. I generally prefer to avoid performing unrelated side-effects though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to avoid side-effects we could store the fuel value at the start and re-set it at the end of this method like we proposed for finalize

IOStrategy::Memory(instance) => {
let instance = instance.expect("Should have been defined in initialize");
store.set_epoch_deadline(1); // Make sure we don't timeout while finalizing.
store.set_fuel(u64::MAX)?; // Make sure we don't run out of fuel finalizing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that we check the fuel before calling this but I think it would be an easy enough mistake to move it after. Do we have a test catching that, or should we add some logic here to get the fuel at the start of this method and re-set it at the end of this method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd probably be easier to reset it if we want to be defensive.

Comment on lines 59 to 68
let mut std_import = None;
for import in imports {
if let Some(bytes) = StandardProviders::get(&format!("{import}.wasm")) {
std_import = Some(Provider {
bytes: bytes.data,
name: import,
});
break;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would Iterator::find_map work here?

@jeffcharles jeffcharles merged commit f719fe2 into main Sep 18, 2025
10 checks passed
@jeffcharles jeffcharles deleted the jc.in-memory-io branch September 18, 2025 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants