Skip to content

Conversation

@MarekSuchanek
Copy link
Member

No description provided.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the handling of deleted documents in the document worker by introducing a DocumentNotFoundException exception class and adding logic to check if a document exists before attempting to set its state to FAILED.

Changes:

  • Adds a new DocumentNotFoundException exception that inherits from JobException
  • Introduces a check_document_exists() method to verify document presence before operations
  • Updates exception handling to gracefully handle cases where documents are deleted during processing

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
packages/dsw-document-worker/dsw/document_worker/exceptions.py Adds DocumentNotFoundException class and updates create_job_exception to return it when document is not found
packages/dsw-document-worker/dsw/document_worker/worker.py Adds document existence check in _set_failed, handles DocumentNotFoundException in run method, and uses new exception in get_document

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +334 to +336
LOG.warning('Document %s does not exist, cannot set to FAILED',
self.doc_uuid)
return
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check_document_exists() method returns False on any exception (line 98-101), which could hide database connectivity or permission issues. When database errors occur, the code will incorrectly assume the document doesn't exist and silently return without setting the FAILED state or reporting the actual error. Consider distinguishing between 'document not found' and 'error checking document existence' to avoid masking genuine database problems.

Suggested change
LOG.warning('Document %s does not exist, cannot set to FAILED',
self.doc_uuid)
return
LOG.warning(
'Document %s may not exist or could not be checked; '
'attempting to set state to FAILED anyway',
self.doc_uuid,
)

Copilot uses AI. Check for mistakes.
try:
self._run()
except DocumentNotFoundException as e:
LOG.warning('Document not found: %s', e.log_message())
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DocumentNotFoundException is caught and logged, but no error reporting to Sentry occurs. This creates inconsistent error reporting compared to the JobException handler (lines 350-353) which captures the exception in Sentry. Add SentryReporter.capture_exception(e) to ensure these errors are properly tracked.

Suggested change
LOG.warning('Document not found: %s', e.log_message())
LOG.warning('Document not found: %s', e.log_message())
SentryReporter.capture_exception(e)

Copilot uses AI. Check for mistakes.
Comment on lines +88 to +101
def check_document_exists(self) -> bool:
LOG.debug('Checking if document "%s" exists in DB', self.doc_uuid)
try:
doc = self.ctx.app.db.fetch_document(
document_uuid=self.doc_uuid,
tenant_uuid=self.tenant_uuid,
)
exists = doc is not None
LOG.debug('Document "%s" exists: %s', self.doc_uuid, exists)
return exists
except Exception as e:
LOG.error('Failed to check if document "%s" exists: %s',
self.doc_uuid, str(e))
return False
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check_document_exists() method duplicates the document fetching logic from get_document() (lines 113-116). Both methods call fetch_document with the same parameters. This duplication could lead to maintenance issues if the fetching logic needs to change. Consider refactoring to have a single method that performs the fetch, or have this method leverage the existing get_document() method's fetch logic.

Copilot uses AI. Check for mistakes.
@MarekSuchanek MarekSuchanek merged commit d86e917 into develop Jan 15, 2026
64 checks passed
@MarekSuchanek MarekSuchanek deleted the fix/doc-deleted branch January 15, 2026 11:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants