@pedrohenriquerls pedrohenriquerls commented Dec 3, 2025

Description

Small improvement to bulk_history_create that reduces the number of Python calls by memoizing values shared across entries and avoiding duplicated lookups.
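To illustrate the general idea (this is not the code from this branch), here is a minimal sketch of resolving the history user and date for a batch. It assumes the timestamp can be computed once per batch, and that the fallback user lookup is only needed when an instance carries no user of its own. The function `resolve_history_fields` and the `lookup_default_user` callback are hypothetical names used only for this example.

```python
from datetime import datetime, timezone

# Sentinel so that an explicit `None` user on an instance is not confused
# with "attribute not set".
_MISSING = object()


def resolve_history_fields(
    instances, default_user=None, default_date=None, lookup_default_user=None
):
    """Hypothetical helper: return one (user, date) tuple per instance.

    `lookup_default_user` stands in for a potentially expensive per-instance
    fallback; it is an assumption made for this sketch.
    """
    # Compute the timestamp once for the whole batch instead of per instance.
    batch_date = default_date or datetime.now(timezone.utc)

    resolved = []
    for instance in instances:
        # A single getattr with a default replaces a hasattr + getattr pair.
        history_date = getattr(instance, "_history_date", batch_date)
        history_user = getattr(instance, "_history_user", _MISSING)

        if history_user is _MISSING:
            history_user = default_user
            # Only pay for the fallback lookup when nothing else supplied a user.
            if history_user is None and lookup_default_user is not None:
                history_user = lookup_default_user(instance)

        resolved.append((history_user, history_date))
    return resolved
```

With this shape, instances that already carry a user never trigger the fallback lookup, and every instance without its own date shares a single timestamp.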

Related Issue

#1563

Motivation and Context

When updating a large number of records, a considerable portion of the execution time was spent in calls related to get_default_history_user, timezone.now(), and hasattr.

How Has This Been Tested?

The code in this branch was tested against a dataset used in production, using the pyinstrument library to profile execution.

Testing Environment

  • Django 5.2.8, Python 3.14, PostgreSQL
  • Tested with TestModel model having ~20 tracked fields
  • Batch size: 500 records per batch

Profiling script

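import time

from pyinstrument import Profiler
from simple_history.utils import bulk_update_with_history

# Excerpt from a management command's handle(); transactions, LocalModel,
# and version_name are assumed to be defined elsewhere in the command.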

profiler = Profiler()
profiler.start()
start_time = time.time()

bulk_update_with_history(
    transactions,
    LocalModel,
    fields=["merchant_name", "note"],
    batch_size=500,
    default_user=None,
    default_date=None,
)

profiler.stop()
elapsed_time = time.time() - start_time

output_file = f"/app/profile_bulk_update_{version_name.lower()}.html"
with open(output_file, "w") as f:
    f.write(profiler.output_html())
self.stdout.write(f"  Profile saved to: {output_file}")

summary = profiler.output_text(unicode=True, color=False)
self.stdout.write("\n  Top functions:")
for line in summary.split('\n')[:10]:
    if line.strip() and '%' in line:
        self.stdout.write(f"    {line}")

Profiling results

Code from the main branch:

Recorded: 3.68s; Samples: 3680
 100.0% <module>  profile_bulk_update_comparison.py:12  handle
  44.8% <builtin>  bulk_create
  31.2% <builtin>  bulk_update
  15.6% <module>  manager.py:233  bulk_history_create
  12.3% <builtin>  getattr
   8.1% <builtin>  get_default_history_user (called 5000 times)
   3.2% <builtin>  timezone.now (called 5000 times)
   2.1% <module>  utils.py:181  bulk_update_with_history
   1.7% <builtin>  hasattr (called 5000 times)
  • Execution time: 3.68 seconds
  • Throughput: 1,359 transactions/second
  • Function calls: ~15,000

Code from this branch:

Recorded: 2.42s; Samples: 2420
 100.0% <module>  profile_bulk_update_comparison.py:12  handle
  45.1% <builtin>  bulk_create
  32.3% <builtin>  bulk_update
  11.8% <module>  manager.py:242  bulk_history_create
   7.2% <builtin>  getattr
   2.1% <module>  utils.py:181  bulk_update_with_history
   1.5% <builtin>  get_default_history_user (called only when needed)
  • Execution time: 2.42 seconds
  • Throughput: 2,066 transactions/second
  • Function calls: ~5,500
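
As a quick sanity check on the figures above, the throughput and relative improvement can be recomputed from the two recorded execution times. The record count of 5,000 is an assumption inferred from the "called 5000 times" annotations in the main-branch profile; it is not stated explicitly elsewhere.

```python
# Assumption: 5,000 records were processed, inferred from the
# "called 5000 times" annotations in the main-branch profile.
RECORDS = 5000

main_seconds = 3.68    # recorded time on the main branch
branch_seconds = 2.42  # recorded time on this branch


def throughput(records, seconds):
    """Records processed per second."""
    return records / seconds


speedup = main_seconds / branch_seconds
time_saved = 1 - branch_seconds / main_seconds

print(f"main branch: {throughput(RECORDS, main_seconds):,.0f} records/s")
print(f"this branch: {throughput(RECORDS, branch_seconds):,.0f} records/s")
print(f"speedup:     {speedup:.2f}x")
print(f"time saved:  {time_saved:.1%}")
```

Under that assumption the recomputed throughputs match the reported 1,359 and 2,066 transactions/second, which corresponds to roughly a 1.52x speedup, or about 34% less wall-clock time.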

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • I have run the pre-commit run command to format and lint.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • I have added my name and/or github handle to AUTHORS.rst
  • I have added my change to CHANGES.rst
  • All new and existing tests passed.

@pedrohenriquerls pedrohenriquerls marked this pull request as ready for review December 4, 2025 14:40