Skip to content

optimize time wasted in passing argument and fetching return #137

@muayyad-alsadi

Description

@muayyad-alsadi

While discussing #96 I was able to identify a bottleneck
there was needlessly large time wasted in passing the arguments and retrieving the return

basically I made a simple 32-bit hash function cdb_djp_hash.c

in pure python the time is proportional to string length
in wasm it was almost constant time ~40ms regardless of string length 13 bytes to 1300 bytes (100x)
which means that most of those 40ms are in passing the arguments not in the actual loop (if it was the loop it would be proportional to string length or a large fraction of that).

This was confirmed using

pr=Profile()
pr.enable()
for i in range(10000):
        cdb_djp_hash_wasm(large_a)
pr.disable()
#pr.print_stats()
#pr.print_stats('cumulative')
pr.print_stats('calls')

the first note that the actual WASM call is fast (169ms out of 800ms)
which means that we might able to squeeze ~650ms of 800ms making it 4x faster

and I was able to identify bottleneck

  • things that take too much cumulative time
  • things that are called more than 10k (number of iterations)

examples of such things

  • too many calls to isinstance which is 170k (in a 10k iteration benchmark), that is 17 times per call
  • converting the argument _value.py:129(_convert) too ~200ms out of 800ms
  • we got dynamically sized lists operations like {method 'append' of 'list' objects}

Are we sure that passing arguments / converting can be made faster?
Yes

I can confirm _value.py:129(_convert) that is too slow

I've moved the conversion outside the loop and confirmed it was the bottleneck

    i32_start = Val.i32(start)
    i32_size = Val.i32(size)
    for i in range(10000):
        np_mem[start:stop]=input
        hash_wasm = ctypes.c_uint32(_cdb_djp_hash_wasm(i32_start, i32_size)).value

also I can confirm that this time is just waste because passing parameter should do the following

to proof that we can convert in almost no time

in_struct=Struct('<LL')
in_pack=in_struct.pack
for i in range(10000):
    b=pack(start, size)

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions