Commit b251e86

feat: Implement thread-safe mode with Instance class
- Add Instance class that encapsulates config + connection
- Add ThreadSafetyError exception for global state access
- Add _ConfigProxy to delegate dj.config to global config
- Add _get_singleton_connection for lazy connection creation
- Update dj.conn(), dj.Schema(), dj.FreeTable() to use singleton
- Connection now stores _config reference for instance isolation
- Add DJ_THREAD_SAFE environment variable support
- Add comprehensive tests for thread-safe mode

When DJ_THREAD_SAFE=true:
- dj.config raises ThreadSafetyError
- dj.conn() raises ThreadSafetyError
- dj.Schema() raises ThreadSafetyError (without explicit connection)
- dj.FreeTable() raises ThreadSafetyError (without explicit connection)
- dj.Instance() always works for isolated contexts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 32b5235 commit b251e86

6 files changed, +707 -62 lines changed


docs/design/thread-safe-mode.md

Lines changed: 70 additions & 49 deletions
@@ -6,30 +6,34 @@ DataJoint uses global state (`dj.config`, `dj.conn()`) that is not thread-safe.

 ## Solution

-Introduce **Instance** objects that encapsulate config and connection. The `dj` module provides access to a lazily-loaded singleton instance. New isolated instances are created with `dj.Instance()`.
+Introduce **Instance** objects that encapsulate config and connection. The `dj` module provides a global config that can be modified before connecting, and a lazily-loaded singleton connection. New isolated instances are created with `dj.Instance()`.

 ## API

-### Legacy API (singleton instance)
+### Legacy API (global config + singleton connection)

 ```python
 import datajoint as dj

+# Configure credentials (no connection yet)
+dj.config.database.user = "user"
+dj.config.database.password = "password"
 dj.config.safemode = False
-dj.conn() # Triggers singleton creation, returns connection
+
+# First call to conn() or Schema() creates the singleton connection
+dj.conn() # Creates connection using dj.config credentials
 schema = dj.Schema("my_schema")

 @schema
 class Mouse(dj.Manual):
     definition = "..."
 ```

-Internally, `dj.config`, `dj.conn()`, and `dj.Schema()` are aliases to the singleton instance:
-- `dj.config` → `dj._singleton_instance.config`
-- `dj.conn()` → `dj._singleton_instance.connection`
-- `dj.Schema()` → `dj._singleton_instance.Schema()`
-
-The singleton is created lazily on first access to any of these.
+Internally:
+- `dj.config` → delegates to `_global_config` (with thread-safety check)
+- `dj.conn()` → returns `_singleton_connection` (created lazily)
+- `dj.Schema()` → uses `_singleton_connection`
+- `dj.FreeTable()` → uses `_singleton_connection`

 ### New API (isolated instance)

@@ -86,12 +90,13 @@ table = inst.FreeTable("db.table") # Uses inst.connection
 export DJ_THREAD_SAFE=true
 ```

-`thread_safe` is read from environment/config file at module import time.
+`thread_safe` is checked dynamically on each access to global state.

-When `thread_safe=True`, accessing the singleton raises `ThreadSafetyError`:
+When `thread_safe=True`, accessing global state raises `ThreadSafetyError`:
 - `dj.config` raises `ThreadSafetyError`
 - `dj.conn()` raises `ThreadSafetyError`
-- `dj.Schema()` raises `ThreadSafetyError`
+- `dj.Schema()` raises `ThreadSafetyError` (without explicit connection)
+- `dj.FreeTable()` raises `ThreadSafetyError` (without explicit connection)
 - `dj.Instance()` works - isolated instances are always allowed

 ```python
@@ -110,26 +115,26 @@ inst.Schema("name") # OK

 | Operation | `thread_safe=False` | `thread_safe=True` |
 |-----------|--------------------|--------------------|
-| `dj.config` | `_singleton.config` | `ThreadSafetyError` |
-| `dj.conn()` | `_singleton.connection` | `ThreadSafetyError` |
-| `dj.Schema()` | `_singleton.Schema()` | `ThreadSafetyError` |
+| `dj.config` | `_global_config` | `ThreadSafetyError` |
+| `dj.conn()` | `_singleton_connection` | `ThreadSafetyError` |
+| `dj.Schema()` | Uses singleton | `ThreadSafetyError` |
+| `dj.FreeTable()` | Uses singleton | `ThreadSafetyError` |
 | `dj.Instance()` | Works | Works |
 | `inst.config` | Works | Works |
 | `inst.connection` | Works | Works |
 | `inst.Schema()` | Works | Works |

-## Singleton Lazy Loading
+## Lazy Loading

-The singleton instance is created lazily on first access:
+The global config is created at module import time. The singleton connection is created lazily on first access:

 ```python
-dj.config # Creates singleton, returns _singleton.config
-dj.conn() # Creates singleton, returns _singleton.connection
-dj.Schema("name") # Creates singleton, returns _singleton.Schema("name")
+dj.config.database.user = "user" # Modifies global config (no connection yet)
+dj.config.database.password = "pw"
+dj.conn() # Creates singleton connection using global config
+dj.Schema("name") # Uses existing singleton connection
 ```

-All three trigger creation of the same singleton instance.
-
 ## Usage Example

 ```python
@@ -167,7 +172,7 @@ Mouse().delete() # Uses inst.config.safemode
 ```python
 class Instance:
     def __init__(self, host, user, password, port=3306, **kwargs):
-        self.config = Config() # Fresh config with defaults
+        self.config = _create_config() # Fresh config with defaults
         # Apply any config overrides from kwargs
         self.connection = Connection(host, user, password, port, ...)
         self.connection._config = self.config
@@ -179,58 +184,74 @@ class Instance:
         return FreeTable(self.connection, full_table_name)
 ```

-### 2. Singleton with lazy loading
+### 2. Global config and singleton connection

 ```python
 # Module level
-_thread_safe = _load_thread_safe_from_env_or_config()
-_singleton_instance = None
+_global_config = _create_config() # Created at import time
+_singleton_connection = None # Created lazily

-def _get_singleton():
-    if _thread_safe:
+def _check_thread_safe():
+    if _load_thread_safe():
         raise ThreadSafetyError(
             "Global DataJoint state is disabled in thread-safe mode. "
             "Use dj.Instance() to create an isolated instance."
         )
-    global _singleton_instance
-    if _singleton_instance is None:
-        _singleton_instance = Instance(
-            host=_load_from_env_or_config("database.host"),
-            user=_load_from_env_or_config("database.user"),
-            password=_load_from_env_or_config("database.password"),
+
+def _get_singleton_connection():
+    _check_thread_safe()
+    global _singleton_connection
+    if _singleton_connection is None:
+        _singleton_connection = Connection(
+            host=_global_config.database.host,
+            user=_global_config.database.user,
+            password=_global_config.database.password,
             ...
         )
-    return _singleton_instance
+        _singleton_connection._config = _global_config
+    return _singleton_connection
 ```

-### 3. Legacy API as aliases
+### 3. Legacy API with thread-safety checks

 ```python
-# dj.config -> singleton.config
+# dj.config -> global config with thread-safety check
 class _ConfigProxy:
     def __getattr__(self, name):
-        return getattr(_get_singleton().config, name)
+        _check_thread_safe()
+        return getattr(_global_config, name)
     def __setattr__(self, name, value):
-        setattr(_get_singleton().config, name, value)
+        _check_thread_safe()
+        setattr(_global_config, name, value)

 config = _ConfigProxy()

-# dj.conn() -> singleton.connection
+# dj.conn() -> singleton connection
 def conn():
-    return _get_singleton().connection
-
-# dj.Schema() -> singleton.Schema()
-def Schema(name, **kwargs):
-    return _get_singleton().Schema(name, **kwargs)
-
-# dj.FreeTable() -> singleton.FreeTable()
-def FreeTable(full_table_name):
-    return _get_singleton().FreeTable(full_table_name)
+    return _get_singleton_connection()
+
+# dj.Schema() -> uses singleton connection
+def Schema(name, connection=None, **kwargs):
+    if connection is None:
+        _check_thread_safe()
+        connection = _get_singleton_connection()
+    return _Schema(name, connection=connection, **kwargs)
+
+# dj.FreeTable() -> uses singleton connection
+def FreeTable(conn_or_name, full_table_name=None):
+    if full_table_name is None:
+        # Called as FreeTable("db.table")
+        _check_thread_safe()
+        return _FreeTable(_get_singleton_connection(), conn_or_name)
+    else:
+        # Called as FreeTable(conn, "db.table")
+        return _FreeTable(conn_or_name, full_table_name)
 ```

 ### 4. Refactor internal code

 All internal code uses `self.connection._config` instead of global `config`:
+- Connection stores reference to its config as `self._config`
 - Tables access config via `self.connection._config`
 - This works uniformly for both singleton and isolated instances

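The point of step 4, that tables read settings through their connection rather than from a module-level global, can be illustrated with stand-in classes. This is a sketch only: `Connection` and `Table` below are hypothetical simplifications that share nothing with the real classes except the `_config` attribute named in the design document, `needs_confirmation` is an invented method name, and `SimpleNamespace` stands in for a config object.

```python
from types import SimpleNamespace


class Connection:
    """Hypothetical stand-in: holds only a reference to its config."""

    def __init__(self, config):
        self._config = config


class Table:
    """Hypothetical stand-in for a table bound to one connection."""

    def __init__(self, connection):
        self.connection = connection

    def needs_confirmation(self) -> bool:
        # Read the setting via the connection, never from a global.
        return self.connection._config.safemode


# One config plays the role of the global config, the other belongs
# to an isolated instance.
global_config = SimpleNamespace(safemode=True)
isolated_config = SimpleNamespace(safemode=False)

legacy_table = Table(Connection(global_config))
isolated_table = Table(Connection(isolated_config))

print(legacy_table.needs_confirmation())    # follows global_config
print(isolated_table.needs_confirmation())  # follows isolated_config

# Changing one config leaves tables bound to the other untouched.
global_config.safemode = False
print(isolated_config.safemode)
```

Because each table resolves its config through `self.connection`, the same table code serves both the legacy path and isolated instances, which is the uniformity the design document describes.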
src/datajoint/__init__.py

Lines changed: 148 additions & 5 deletions
@@ -23,6 +23,7 @@
     "config",
     "conn",
     "Connection",
+    "Instance",
     "Schema",
     "VirtualModule",
     "virtual_schema",
@@ -52,6 +53,7 @@
     "errors",
     "migrate",
     "DataJointError",
+    "ThreadSafetyError",
     "logger",
     "cli",
     "ValidationResult",
@@ -72,17 +74,158 @@
     NpyRef,
 )
 from .blob import MatCell, MatStruct
-from .connection import Connection, conn
-from .errors import DataJointError
+from .connection import Connection
+from .errors import DataJointError, ThreadSafetyError
 from .expression import AndList, Not, Top, U
+from .instance import Instance, _ConfigProxy, _get_singleton_connection, _global_config, _check_thread_safe
 from .logging import logger
 from .objectref import ObjectRef
-from .schemas import Schema, VirtualModule, list_schemas, virtual_schema
-from .settings import config
-from .table import FreeTable, Table, ValidationResult
+from .schemas import Schema as _Schema, VirtualModule, list_schemas, virtual_schema
+from .table import FreeTable as _FreeTable, Table, ValidationResult
 from .user_tables import Computed, Imported, Lookup, Manual, Part
 from .version import __version__

+# =============================================================================
+# Singleton-aware API
+# =============================================================================
+# config is a proxy that delegates to the singleton instance's config
+config = _ConfigProxy()
+
+
+def conn(
+    host: str | None = None,
+    user: str | None = None,
+    password: str | None = None,
+    *,
+    reset: bool = False,
+    use_tls: bool | dict | None = None,
+) -> Connection:
+    """
+    Return a persistent connection object.
+
+    When called without arguments, returns the singleton connection.
+    When connection parameters are provided, creates a new Connection.
+
+    Parameters
+    ----------
+    host : str, optional
+        Database hostname.
+    user : str, optional
+        Database username.
+    password : str, optional
+        Database password.
+    reset : bool, optional
+        If True, reset existing connection. Default False.
+    use_tls : bool or dict, optional
+        TLS encryption option.
+
+    Returns
+    -------
+    Connection
+        Database connection.
+
+    Raises
+    ------
+    ThreadSafetyError
+        If thread_safe mode is enabled and using singleton.
+    """
+    # If any connection params provided, use legacy behavior
+    if host is not None or user is not None or password is not None or reset:
+        from .connection import conn as _legacy_conn
+
+        return _legacy_conn(host, user, password, reset=reset, use_tls=use_tls)
+
+    # Otherwise use singleton connection
+    return _get_singleton_connection()
+
+
+def Schema(
+    schema_name: str | None = None,
+    context: dict | None = None,
+    *,
+    connection: Connection | None = None,
+    create_schema: bool = True,
+    create_tables: bool | None = None,
+    add_objects: dict | None = None,
+) -> _Schema:
+    """
+    Create a Schema for binding table classes to a database schema.
+
+    When connection is not provided, uses the singleton connection.
+
+    Parameters
+    ----------
+    schema_name : str, optional
+        Database schema name.
+    context : dict, optional
+        Namespace for foreign key lookup.
+    connection : Connection, optional
+        Database connection. Defaults to singleton connection.
+    create_schema : bool, optional
+        If False, raise error if schema doesn't exist. Default True.
+    create_tables : bool, optional
+        If False, raise error when accessing missing tables.
+    add_objects : dict, optional
+        Additional objects for declaration context.
+
+    Returns
+    -------
+    Schema
+        A Schema bound to the specified connection.
+
+    Raises
+    ------
+    ThreadSafetyError
+        If thread_safe mode is enabled and using singleton.
+    """
+    if connection is None:
+        # Use singleton connection - will raise ThreadSafetyError if thread_safe=True
+        _check_thread_safe()
+        connection = _get_singleton_connection()
+
+    return _Schema(
+        schema_name,
+        context=context,
+        connection=connection,
+        create_schema=create_schema,
+        create_tables=create_tables,
+        add_objects=add_objects,
+    )
+
+
+def FreeTable(conn_or_name, full_table_name: str | None = None) -> _FreeTable:
+    """
+    Create a FreeTable for accessing a table without a dedicated class.
+
+    Can be called in two ways:
+    - ``FreeTable("schema.table")`` - uses singleton connection
+    - ``FreeTable(connection, "schema.table")`` - uses provided connection
+
+    Parameters
+    ----------
+    conn_or_name : Connection or str
+        Either a Connection object, or the full table name if using singleton.
+    full_table_name : str, optional
+        Full table name when first argument is a connection.
+
+    Returns
+    -------
+    FreeTable
+        A FreeTable instance for the specified table.
+
+    Raises
+    ------
+    ThreadSafetyError
+        If thread_safe mode is enabled and using singleton.
+    """
+    if full_table_name is None:
+        # Called as FreeTable("db.table") - use singleton connection
+        _check_thread_safe()
+        return _FreeTable(_get_singleton_connection(), conn_or_name)
+    else:
+        # Called as FreeTable(conn, "db.table") - use provided connection
+        return _FreeTable(conn_or_name, full_table_name)
+
 # =============================================================================
 # Lazy imports — heavy dependencies loaded on first access
 # =============================================================================
