Clarify dual-mode codecs in builtin_codecs docstring #1334
Merged
dimitri-yatsenko merged 3 commits into pre/v2.0 on Jan 16, 2026
Replace deprecated 'external storage' terminology with canonical terms:
- 'object storage' for general concept
- 'in-store storage' for @ modifier specifics
- 'in-table storage' for database storage

Changes:
- builtin_codecs.py: Update BlobCodec, AttachCodec, HashCodec docstrings
  * 'internal/external' → 'in-table/in-store'
  * Update examples and get_dtype() docstrings
- settings.py: Update StoresSettings docstrings
- gc.py: Update module docstring and format_stats()
- expression.py: Update to_dicts() docstring
- heading.py, codecs.py, declare.py: Update internal comments
- migrate.py: Add note explaining use of legacy terminology

Ref: TERMINOLOGY.md, DOCSTRING_TERMINOLOGY_REPORT.md
Replace deprecated SQL-derived terms with accurate DataJoint terminology:
- 'semijoin/antijoin' → 'restriction/anti-restriction'
- Clarify that A & B restricts A (does not join attributes)

Changes in source code comments:
- expression.py:1081: 'antijoin' → 'anti-restriction'
- condition.py:296: '(semijoin/antijoin)' → 'for restriction'
- condition.py:401: '(aka semijoin and antijoin)' → removed

Rationale: In relational algebra, joins combine attributes from both operands. DataJoint's A & B restricts A to matching entities; no attributes from B appear in the result. This is fundamentally restriction, not a join operation.
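The rationale above can be illustrated with a minimal plain-Python sketch over lists of dicts. This is not DataJoint's implementation: the helper names (restrict, anti_restrict, join), the explicit keys argument, and the sample rows are hypothetical and exist only to show that restriction keeps A's attributes unchanged, while a join merges attributes from both operands.

```python
def _key(row, keys):
    """Project a row onto the matching attributes."""
    return tuple(row[k] for k in keys)


def restrict(a, b, keys):
    """Rows of a that match some row of b; no attributes of b are added."""
    b_keys = {_key(row, keys) for row in b}
    return [row for row in a if _key(row, keys) in b_keys]


def anti_restrict(a, b, keys):
    """Rows of a that match no row of b."""
    b_keys = {_key(row, keys) for row in b}
    return [row for row in a if _key(row, keys) not in b_keys]


def join(a, b, keys):
    """Matching row pairs, merged so the result carries attributes of both."""
    return [{**ra, **rb} for ra in a for rb in b
            if _key(ra, keys) == _key(rb, keys)]


# Hypothetical sample data
subjects = [
    {"subject_id": 1, "species": "mouse"},
    {"subject_id": 2, "species": "rat"},
]
sessions = [{"subject_id": 1, "session": 1}]

restricted = restrict(subjects, sessions, ["subject_id"])
excluded = anti_restrict(subjects, sessions, ["subject_id"])
joined = join(subjects, sessions, ["subject_id"])
```

Here restricted and excluded contain only the subject_id and species attributes, whereas joined also carries session.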
- List <blob> and <blob@> separately to show both inline and external modes
- List <attach> and <attach@> separately to show both modes
- Change <hash> to <hash@> (external only)
- Change <object> to <object@> (external only)
- Clarify storage mode for each codec variant
- Also corrected hash algorithm from SHA256 to MD5

This makes it clear which codecs support dual modes vs external-only.
Summary
Improve clarity of the builtin_codecs.py module docstring by explicitly listing dual-mode codecs with both their inline and external forms.
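The dual-mode versus external-only distinction described in this PR can be sketched as a small lookup. This is an illustrative helper only, not part of DataJoint: the name storage_mode and the regular expression are assumptions, and the codec sets are copied from the lists in this description. It handles only the bare forms shown here (for example <blob> and <blob@>).

```python
import re

# Codec names grouped as described in this PR (assumed complete only for this sketch)
DUAL_MODE = {"blob", "attach"}
EXTERNAL_ONLY = {"hash", "object", "npy", "filepath"}

# Matches "<name>" or "<name@>"; other spellings are out of scope here
_SPEC = re.compile(r"^<(\w+)(@?)>$")


def storage_mode(spec):
    """Return 'inline' or 'external' for a codec spec, or raise ValueError."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"unrecognized codec spec: {spec!r}")
    name, at = match.groups()
    if name in DUAL_MODE:
        return "external" if at else "inline"
    if name in EXTERNAL_ONLY:
        if not at:
            raise ValueError(f"<{name}> has no inline mode; use <{name}@>")
        return "external"
    raise ValueError(f"unknown codec: {name!r}")
```

With this sketch, storage_mode("<blob>") yields "inline" and storage_mode("<blob@>") yields "external", while storage_mode("<object>") raises ValueError because the codec is external-only.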
Changes
File: src/datajoint/builtin_codecs.py (lines 8-16)
Updated the module-level docstring to:
- List <blob> and <blob@> separately (was: combined as one entry)
- List <attach> and <attach@> separately (was: combined as one entry)
- Change <object> to <object@> (external-only, no inline mode)
- Change <hash> to <hash@> (external-only, no inline mode)
Before:
After:
Motivation
The original docstring was ambiguous about which codecs support both inline and external storage modes. This caused confusion when:
- <object> was used without @ (not supported)
- ObjectCodec appeared to be meant as dual-mode
By explicitly listing both forms, it's now immediately clear that:
- <blob> and <attach> support both inline and external storage
- <hash@>, <object@>, <npy@>, <filepath@> are external-only
Related