MDF Agent: v2 client with auth, curation, streaming, and agentic handlers #43

Open
blaiszik wants to merge 11 commits into master from mdf-agent
@blaiszik
Contributor

Summary

  • New mdf_agent package: full Python client + CLI for the MDF Connect v2 backend
  • BackendClient.authenticated() with multi-path auth resolution: explicit token → confidential client credentials → dev user → interactive Globus OAuth
  • CLI commands: mdf login/logout/whoami, mdf publish, mdf backend *, mdf stream *, mdf search
  • Curation CLI: curation-pending, curation-detail, curation-approve, curation-reject
  • Dataset discovery: preview, files, sample, dataset cards, citations
  • Streaming: create, append, close (with DOI minting), snapshot, clone
  • Agent-safe skill handlers for all operations (structured error handling)
  • Data source URL normalization, domains, and external import metadata
  • Comprehensive test suites: auth routing, client sync, CLI, models, extractors, repository, submission normalization
  • Legacy mdf_forge / mdf_connect_client code preserved in legacy/
  • Example scripts including E2E staging tests and Globus Search verification
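The auth resolution order listed above (explicit token → confidential client credentials → dev user → interactive Globus OAuth) can be pictured as a simple priority chain. The sketch below is illustrative only: `AuthInputs`, `resolve_auth`, and the returned labels are hypothetical names, not the actual `BackendClient.authenticated()` signature, and the interactive step is stubbed out as a callback.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class AuthInputs:
    """Hypothetical container for the credentials a caller might supply."""
    token: Optional[str] = None
    client_id: Optional[str] = None
    client_secret: Optional[str] = None
    dev_user: Optional[str] = None


def resolve_auth(inputs: AuthInputs,
                 interactive_login: Callable[[], str]) -> Tuple[str, str]:
    """Return (method, credential) using the first path that applies."""
    # 1. An explicit token always wins.
    if inputs.token:
        return ("explicit_token", inputs.token)
    # 2. Confidential client credentials need both the id and the secret.
    if inputs.client_id and inputs.client_secret:
        return ("client_credentials", inputs.client_id)
    # 3. Dev user shortcut, e.g. for local development.
    if inputs.dev_user:
        return ("dev_user", inputs.dev_user)
    # 4. Fall back to the interactive flow (stubbed as a callback here).
    return ("interactive", interactive_login())


print(resolve_auth(AuthInputs(dev_user="alice"), lambda: "oauth-token"))
```

Passing the interactive step in as a callback keeps the ordering logic testable without opening a browser.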

Key files

| Path | Description |
| --- | --- |
| src/mdf_agent/core/backend_client.py | HTTP client with auth, all v2 endpoints |
| src/mdf_agent/cli/main.py | CLI entry point (login, publish, search) |
| src/mdf_agent/cli/backend.py | Backend subcommands (curation, preview) |
| src/mdf_agent/cli/stream.py | Stream subcommands |
| src/mdf_agent/skill/handlers.py | Agent-safe handlers for MCP/agentic use |
| src/mdf_agent/auth/globus.py | Globus OAuth with token caching |
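
For readers unfamiliar with the "agent-safe" pattern, one common way to get structured error handling is to wrap each handler so that exceptions become data rather than tracebacks. This is a minimal sketch of that idea, not the code in `skill/handlers.py`: the `agent_safe` decorator, the `get_status` handler, and the result shape are all assumptions made for illustration.

```python
import functools
from typing import Any, Callable, Dict


def agent_safe(func: Callable[..., Any]) -> Callable[..., Dict[str, Any]]:
    """Wrap a handler so it always returns a dict and never raises."""
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Dict[str, Any]:
        try:
            return {"ok": True, "result": func(*args, **kwargs)}
        except Exception as exc:  # broad on purpose: the agent must get a reply
            return {
                "ok": False,
                "error": {"type": type(exc).__name__, "message": str(exc)},
            }
    return wrapper


@agent_safe
def get_status(source_id: str) -> Dict[str, str]:
    """Hypothetical handler; a real one would call the backend client."""
    if not source_id:
        raise ValueError("source_id is required")
    return {"source_id": source_id, "status": "pending"}


print(get_status("demo_dataset"))
print(get_status(""))
```

An agent calling such a handler can branch on `ok` instead of parsing exception text.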

Test plan

  • python -m pytest tests/ -v — all client tests pass
  • mdf login --service prod — interactive Globus auth works
  • mdf backend health --service staging — staging health check
  • E2E: submit → approve → verify DOI + search via examples/test_staging_e2e.py

🤖 Generated with Claude Code

blaiszik and others added 11 commits January 31, 2026 10:13
- add authenticated BackendClient with service URL resolution and auth header injection
- add login/logout/whoami commands and unify publish/search auth flows
- add backend/stream subcommand auth callbacks and authenticated client routing
- migrate agent/skill handlers and standalone publish CLI to authenticated backend path
- deprecate legacy submit_submission path while keeping backward compatibility
- add focused tests for backend auth routing and auth lifecycle commands
- update v2-review.md with phase completion and verification
… submission model

- Add normalize_data_source() to convert Globus File Manager URLs and
  data.materialsdatafacility.org URLs to canonical globus:// URIs
- Update resolve_data_sources() to normalize URLs and handle stream:// pass-through
- Add standalone DataCite test minting script (examples/test_datacite_mint.py)
- Add demo scripts for full lifecycle and staging Globus integration
- Extend submission model and backend client for v2 API compatibility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
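
As a rough illustration of the URL normalization described in this commit, the sketch below converts a Globus File Manager URL into a `globus://` URI and passes anything else through unchanged. It is an assumption-laden approximation, not the PR's `normalize_data_source()`: the function name is hypothetical, the `globus://<endpoint_id><path>` layout is assumed, and the data.materialsdatafacility.org case is left out because the endpoint mapping is not given here.

```python
from urllib.parse import parse_qs, urlparse


def normalize_data_source_sketch(url: str) -> str:
    """Map a Globus File Manager URL to globus://<origin_id><origin_path>.

    Anything that is not a File Manager URL (globus://, stream://, plain
    HTTPS, ...) is returned unchanged.
    """
    parsed = urlparse(url)
    is_file_manager = (
        parsed.netloc == "app.globus.org"
        and parsed.path.rstrip("/") == "/file-manager"
    )
    if is_file_manager:
        query = parse_qs(parsed.query)  # parse_qs percent-decodes the values
        origin_id = query.get("origin_id", [None])[0]
        origin_path = query.get("origin_path", ["/"])[0]
        if origin_id:
            if not origin_path.startswith("/"):
                origin_path = "/" + origin_path
            return f"globus://{origin_id}{origin_path}"
    return url


example = "https://app.globus.org/file-manager?origin_id=abc-123&origin_path=%2Fdata%2Frun1%2F"
print(normalize_data_source_sketch(example))
```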
…refix 10.23677)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add test_submission_normalize.py: 14 tests for normalize_data_source()
  and resolve_data_sources() covering Globus File Manager URLs, MDF data
  domain, passthrough cases, and encoded paths
- Add test_staging_e2e.py: full submit → approve → publish E2E test
  against deployed staging backend

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests the full version lifecycle against deployed staging:
- v1.0 with mint_doi=True (dataset DOI)
- v1.1 with mint_doi=False (inherit, update metadata)
- v1.2 with mint_doi=True (version-specific DOI)
- Queries DataCite test API for relationship metadata
- Prints pass/fail assertions for post-implementation verification

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add `domains` (List[str]) for scientific domain categorization and
external import provenance fields (external_doi, external_url,
external_source) for datasets imported from other repositories.

Wired through ManifestConfig → to_metadata_payload() → Submission →
to_payload(). Includes 12 unit tests and an E2E staging test script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
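
To make the field wiring concrete, here is a hedged sketch of how `domains` and the external import fields might flow from a manifest object into a metadata payload. `ManifestConfigSketch` is a stand-in, not the PR's `ManifestConfig`; the `title` field and the choice to omit unset fields are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ManifestConfigSketch:
    """Stand-in for ManifestConfig with only the fields discussed here."""
    title: str
    domains: List[str] = field(default_factory=list)
    external_doi: Optional[str] = None
    external_url: Optional[str] = None
    external_source: Optional[str] = None

    def to_metadata_payload(self) -> Dict[str, Any]:
        """Build a payload dict, leaving out fields that were never set."""
        payload: Dict[str, Any] = {"title": self.title}
        if self.domains:
            payload["domains"] = list(self.domains)
        for key in ("external_doi", "external_url", "external_source"):
            value = getattr(self, key)
            if value is not None:
                payload[key] = value
        return payload


cfg = ManifestConfigSketch(
    title="Example dataset",
    domains=["materials science"],
    external_source="example-repository",  # placeholder value
)
print(cfg.to_metadata_payload())
```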
Fix test_domains_external_e2e.py to read fields from dataset_mdata
(not top-level submission). Add domains/external import coverage to
test_staging_search_e2e.py with full pipeline verification (submit →
approve → publish → search index).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update v2-review.md with Part II+III implementation details (BackendClient
sync, curation/preview methods, confidential client auth, skill handlers).
Add check_search_index.py example. Add .DS_Store to .gitignore.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cover install, CLI commands (auth, publish, stream, backend, search),
Python SDK usage, auth resolution, service targeting, and connection
to the backend.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>