@bolinocroustibat bolinocroustibat commented Nov 12, 2025

Closes #360 (sub-issue of datagouv/data.gouv.fr#1810)

Send a preflight OPTIONS request to collect the CORS headers of the remote resource, then send them to udata so they are stored in the resource extras.
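For readers unfamiliar with preflight probing, here is a minimal sketch of the idea, not the code from this PR. The function names, the example origin, and the choice of headers to keep are all assumptions made for illustration. It only shows which request headers a preflight carries and how the CORS response headers could be filtered out of whatever the remote server returns.

```python
from typing import Mapping

# CORS response headers a server may return to a preflight request.
CORS_RESPONSE_HEADERS = (
    "access-control-allow-origin",
    "access-control-allow-methods",
    "access-control-allow-headers",
    "access-control-allow-credentials",
    "access-control-expose-headers",
    "access-control-max-age",
)


def preflight_request_headers(origin: str, method: str = "GET") -> dict[str, str]:
    """Headers to send with the OPTIONS request (hypothetical helper)."""
    return {"Origin": origin, "Access-Control-Request-Method": method}


def extract_cors_headers(response_headers: Mapping[str, str]) -> dict[str, str]:
    """Keep only the CORS headers, with lower-cased names (hypothetical helper)."""
    lowered = {name.lower(): value for name, value in response_headers.items()}
    return {name: lowered[name] for name in CORS_RESPONSE_HEADERS if name in lowered}


# The origin below is an assumption; use whichever front end will fetch the resource.
request_headers = preflight_request_headers("https://www.data.gouv.fr")

# Stand-in for the headers of a real OPTIONS response.
sample_response = {
    "Content-Type": "text/csv",
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET, OPTIONS",
}
cors = extract_cors_headers(sample_response)
```

An empty result would mean the server returned no CORS headers at all, which is itself worth recording.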

@Pierlou Pierlou left a comment

Clean and concise 👏 just a couple of NIT notes

@Pierlou Pierlou left a comment

🚀

@maudetes maudetes left a comment

I'm always unsure about the proper way to get the needed CORS values.

)

if cors_probe := await probe_cors(session, url):
    cors_payload = build_cors_payload(cors_probe)

Should we store this cors_payload in the checks table directly? Every time we want to debug something, it's useful to have the state at check time (the same way we store headers).

bolinocroustibat and others added 6 commits November 20, 2025 14:52
commit 64b1aae
Author: Adrien Carpentier <me@adriencarpentier.com>
Date:   Tue Nov 18 16:50:45 2025 +0100

    fix: add sync/async wrapper so it passes tests

commit c57f525
Author: Adrien Carpentier <me@adriencarpentier.com>
Date:   Tue Nov 18 15:34:06 2025 +0100

    fix: fix async typer

commit 866d1d7
Author: Adrien Carpentier <adrien.carpentier@numerique.gouv.fr>
Date:   Fri Nov 21 10:36:44 2025 +0100

    fix: resolve async/await compatibility issues with Typer CLI (#361)

    This PR fixes CLI commands that were broken after migrating from minicli to Typer, which doesn't handle async functions directly.

    All async CLI commands are now wrapped with synchronous functions that use `_make_async_wrapper()` to detect the execution context and run async code appropriately. The wrapper automatically detects if an event loop is already running (for tests) or creates one using `asyncio.run()` (for CLI execution), ensuring compatibility with both direct CLI usage and async test environments.

    The unused `cleanup()` function has been removed, and redundant exception handling (`except Exception as e: raise e`) has been cleaned up. All CLI commands now work correctly without RuntimeWarnings about unawaited coroutines.

    ...

    ...but it's VERY ugly and verbose.
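
To illustrate the wrapper behaviour described in the commit message above, here is a minimal sketch. It is an assumption about how such a wrapper could work, not the project's `_make_async_wrapper()`: the name `make_sync_wrapper`, the choice to return the coroutine when a loop is already running, and the `_check` command are all hypothetical.

```python
import asyncio
import functools


def make_sync_wrapper(async_fn):
    """Wrap an async function so it can be called from sync or async code."""

    @functools.wraps(async_fn)
    def wrapper(*args, **kwargs):
        coro = async_fn(*args, **kwargs)
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No loop is running (plain CLI call): run the coroutine to completion.
            return asyncio.run(coro)
        # A loop is already running (e.g. an async test): let the caller await it.
        return coro

    return wrapper


async def _check(url: str) -> str:
    # Hypothetical command body standing in for real async work.
    await asyncio.sleep(0)
    return f"checked {url}"


check = make_sync_wrapper(_check)

# Called synchronously, as a CLI framework would call it.
cli_result = check("https://example.org/data.csv")


async def _from_async_context() -> str:
    # Called from inside a running loop, as an async test would.
    return await check("https://example.org/data.csv")


test_result = asyncio.run(_from_async_context())
```

The trade-off in this sketch is that the return type depends on the calling context, which may be part of why the commit author calls the approach verbose.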
commit a43d2d4
Author: Adrien Carpentier <adrien.carpentier@numerique.gouv.fr>
Date:   Fri Nov 21 15:44:14 2025 +0100

    feat: optimize cleanup of stuck status resources (#363)

    Replaces the iterative cleanup approach with a single SQL UPDATE query
    that processes all stuck resources at once, improving performance in
    `load_catalog`.

    **Performance improvement:**
    - Before: 1 SELECT query + N UPDATE queries (one per stuck resource)
    - After: 1 single UPDATE query with subquery
    - Reduces database round-trips and allows PostgreSQL to optimize the
    operation atomically
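
The before/after difference can be sketched as follows. This is an illustration under stated assumptions, not the query from the commit: the table and column names are hypothetical, "stuck" is assumed to mean a status left non-NULL, and an in-memory SQLite database stands in for PostgreSQL so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE catalog (
        id INTEGER PRIMARY KEY,
        url TEXT,
        status TEXT,
        status_since TEXT
    );
    INSERT INTO catalog (url, status, status_since) VALUES
        ('https://example.org/a', 'CRAWLING', '2025-11-01'),
        ('https://example.org/b', NULL, NULL),
        ('https://example.org/c', 'ANALYSING', '2025-11-02');
    """
)


def reset_stuck_resources(connection: sqlite3.Connection) -> int:
    """Reset every stuck row with one UPDATE instead of one UPDATE per row."""
    cursor = connection.execute(
        """
        UPDATE catalog
        SET status = NULL, status_since = NULL
        WHERE id IN (SELECT id FROM catalog WHERE status IS NOT NULL)
        """
    )
    connection.commit()
    # rowcount is the number of rows the UPDATE modified.
    return cursor.rowcount


reset_count = reset_stuck_resources(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM catalog WHERE status IS NOT NULL"
).fetchone()[0]
```

The selection and the update happen in one statement, so the application makes a single round-trip regardless of how many rows are stuck.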

commit 80eb0fd
Author: Adrien Carpentier <adrien.carpentier@numerique.gouv.fr>
Date:   Fri Nov 21 15:25:54 2025 +0100

    feat: add total resources and deleted resources in crawler status health check (#362)

    - Add total resources and deleted resources in crawler status health
    check (to be later used in Munin dashboards)
    - Reorganize crawler health check output to be more explicit

Development

Successfully merging this pull request may close these issues.

Make hydra or udata test if there is a CORS issue on the remote ressources
