Skip to content

Conversation

@serhiy-katsyuba-intel
Copy link
Contributor

fast_get() for the same data can be called by multiple clients from the same or other cores. Since sram_ptr points to cached memory, writeback is needed after memcpy() to also copy data to uncached memory. When fast_get() is called a second time for the same data, invalidate() is used to bring shared data into another core's cache.

This fixes tests which use two SRC modules.

fast_get() for the same data can be called by multiple clients from
the same or other cores. Since sram_ptr points to cached memory, writeback
is needed after memcpy() to also copy data to uncached memory. When
fast_get() is called a second time for the same data, invalidate() is used
to bring shared data into another core's cache.

This fixes tests which use two SRC modules.

Signed-off-by: Serhiy Katsyuba <serhiy.katsyuba@intel.com>
@serhiy-katsyuba-intel
Copy link
Contributor Author

@lgirdwood, Zephyr / build-linux (zmain, imx8 imx8x imx8m imx8ulp) CI failes with final link failed: No space left on device.

@kv2019i kv2019i requested review from Copilot and jsarha January 15, 2026 16:43
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a cache coherency issue in the fast_get() function when the same data is accessed by multiple callers across different cores. The changes ensure proper cache alignment and synchronization for shared memory access.

Changes:

  • Modified memory allocation to use cache-line alignment instead of no alignment
  • Added cache writeback operation after copying data to ensure visibility across cores

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +126 to 128
dcache_writeback_region((__sparse_force void __sparse_cache *)entry->sram_ptr, size);
entry->dram_ptr = dram_ptr;
entry->refcount = 1;
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache writeback occurs after the refcount is initialized, but before the dram_ptr is set. If another core accesses this entry between lines 126-127, it could observe an incomplete state. Consider moving the dcache_writeback_region() call after line 127 to ensure all entry fields are fully initialized before the data becomes visible to other cores.

Suggested change
dcache_writeback_region((__sparse_force void __sparse_cache *)entry->sram_ptr, size);
entry->dram_ptr = dram_ptr;
entry->refcount = 1;
entry->dram_ptr = dram_ptr;
entry->refcount = 1;
dcache_writeback_region((__sparse_force void __sparse_cache *)entry->sram_ptr, size);

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments inline but looking at the modified code alone, seems good.

entry->size = size;
entry->sram_ptr = ret;
memcpy_s(entry->sram_ptr, entry->size, dram_ptr, size);
dcache_writeback_region((__sparse_force void __sparse_cache *)entry->sram_ptr, size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you @jsarha and @lyakh comment on the semantics? This seems correct in the sense to ensure the memcpy actually reaches sram, but I'm not sure of the fast_get API semantics whether this is correct usage (iow are there other places that would need modifications if used by multiple cores).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants