
Conversation

@kmckiern (Contributor)

Add Mandoline's MCP server to the community servers section.

Description

This server provides AI assistants with access to Mandoline's evaluation tools, allowing them to reflect on, critique, and continuously improve their own performance through self-evaluation.

Motivation and Context

Twofold:

  1. Direct, real-world contexts: Embed eval tools directly in AI clients where users work, so they can measure model performance on their real day-to-day tasks. This helps users be more rigorous about which model works best in a given context, and lets them build up customized test sets over time.

  2. AI self-improvement: AI assistants can check and improve their own work based on what users care about. This can happen proactively (get task → generate multiple solutions → eval each → send back the best) or reactively (get task → generate solution → get user feedback → run eval → try to do better); see the sketch below for the proactive variant.
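
To make the proactive flow concrete, here is a minimal sketch in TypeScript. `generateSolution` and `evaluate` are hypothetical stand-ins for a model call and a Mandoline evaluation call; they are not actual Mandoline APIs.

```typescript
// Sketch of the proactive loop: generate several candidate solutions,
// score each with an evaluation tool, and keep the best one.
// `generateSolution` and `evaluate` are hypothetical placeholders.
async function bestOfN(
  task: string,
  n: number,
  generateSolution: (task: string) => Promise<string>,
  evaluate: (task: string, solution: string) => Promise<number>,
): Promise<string> {
  let best = { solution: "", score: -Infinity };
  for (let i = 0; i < n; i++) {
    const solution = await generateSolution(task);
    const score = await evaluate(task, solution); // e.g. a Mandoline metric score
    if (score > best.score) best = { solution, score };
  }
  return best.solution;
}
```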

How Has This Been Tested?

Tested with multiple LLM clients: Claude Code, Claude Desktop, and Cursor. All successfully connected to the Mandoline MCP server and can access the evaluation framework tools.

Example interaction in Claude Code:

> In fact, you can run it right now. Get one metric.

⏺ mandoline - get_metrics (MCP)(limit: 1)
  ⎿  [                                                
       {
         "id": "4cb434d4-c012-48ac-9a40-19b92d73450e",
     … +14 lines (ctrl+r to expand)

⏺ Perfect! The Mandoline MCP server is working right now in Claude Code. I just retrieved an "Architecture Consistency" metric that evaluates whether code changes align with existing codebase patterns - exactly the kind of evaluation tool that would help AI assistants improve their development work.
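
For reference, this is roughly what that same tool call looks like when issued programmatically with the MCP TypeScript SDK. The launch command and package name below are placeholders, not Mandoline's actual distribution details.

```typescript
// Minimal sketch: call the server's get_metrics tool via the MCP TypeScript SDK.
// The command/args used to spawn the server are hypothetical placeholders.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "mandoline-mcp-server"], // hypothetical package name
});

const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(transport);

// Mirrors the Claude Code interaction above: fetch a single metric.
const result = await client.callTool({
  name: "get_metrics",
  arguments: { limit: 1 },
});
console.log(result.content);
```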

Breaking Changes

None - this is a documentation addition only.

Types of changes

  • Documentation update

Checklist

  • I have read the MCP Protocol Documentation
  • My changes follow MCP security best practices
  • I have updated the server's README accordingly
  • I have tested this with an LLM client
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have documented all environment variables and configuration options

Add Mandoline AI evaluation framework MCP server to the community servers section.
Enables AI assistants to reflect on and continuously improve their performance.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@olaservo merged commit b920f3c into modelcontextprotocol:main Aug 29, 2025
19 checks passed
@kmckiern deleted the add-mandoline-mcp-server branch August 29, 2025 17:15