Skip to content
This repository was archived by the owner on Nov 17, 2025. It is now read-only.
Draft
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions aesara/tensor/rewriting/elemwise.py
Original file line number Diff line number Diff line change
Expand Up @@ -896,16 +896,16 @@ def print_profile(cls, stream, prof, level=0):
print(blanc, " time_toposort", prof[7], file=stream)


fuse_seqopt = SequenceDB()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was potentially buggy, as fuse_seqopt was always being imported by the other module, even though it's creation was conditional on a config flag.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should we get rid of it if we have only a single rewrite now?

if config.tensor__local_elemwise_fusion:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't thought enough about this just yet, but it really looks like we need to consider removing this option entirely, because, unless we redesign the whole approach to fusion/FusionOptimizer, I don't know whether this idea of applying a graph (i.e. "global") rewriter on each node (i.e. "locally") makes any sense.

I could see a place for a distinct node-based fusion rewrite, but I really don't think it would/should operate anything like a graph-based rewrite (e.g. it seems like there would be way too much unnecessary graph walking).

In general, the idea of a node-based fusion rewrite sounds appealing, because it could be a lot simpler than the/a graph-based one, but I don't know which types of graphs the former would miss that the latter wouldn't.

# Must be after gpu(48.5) and before AddDestroyHandler(49.5)
fuse_seqopt = SequenceDB()
fuse_seqopt.register(
"composite_elemwise_fusion",
FusionOptimizer(local_elemwise_fusion),
"fast_run",
"fusion",
position=1,
)
# Position before AddDestroyHandler(49.5)
compile.optdb.register( # type: ignore
"elemwise_fusion",
fuse_seqopt,
Expand Down