Improve documentation for the `read_multiple_files` action #2602

sebastien-rosset · 2025-08-22T18:02:04Z

Description

This PR fixes the read_multiple_files function to properly handle empty arrays and improves documentation to make the parameter structure self-documenting for AI assistants.

Server Details

Server: filesystem
Changes to: tools (read_multiple_files function schema and validation)

Motivation and Context

Addresses issue #2032 where the read_multiple_files function lacked clear documentation about its parameter structure and behavior, causing AI assistants to use it incorrectly without explicit instruction. The function would accept empty arrays and return empty results without helpful feedback, making it difficult for AI assistants to understand proper usage.

How Has This Been Tested?

Schema validation tested with empty arrays (now properly rejected with clear error message)
Verified that non-empty arrays continue to work as expected
Confirmed that the JSON schema generation includes the new descriptive text
Tested backward compatibility with existing valid use cases

Breaking Changes

No breaking changes for valid usage. The only change is that empty arrays (which previously returned empty results) now return a clear error message explaining that at least one file path is required.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update

Checklist

I have read the MCP Protocol Documentation
My changes follows MCP security best practices
I have updated the server's README accordingly
I have tested this with an LLM client
My code follows the repository's style guidelines
New and existing tests pass locally
I have added appropriate error handling
I have documented all environment variables and configuration options

Additional context

Changes made:

Schema validation: Added .min(1, "At least one file path must be provided") to reject empty arrays with a clear error message
Parameter documentation: Added .describe() to the paths parameter explaining the expected format and requirements
Code formatting: Reformatted the schema definition for better readability

Before this fix:

Empty arrays would be processed and return empty results
AI assistants required explicit instruction about parameter structure
No clear feedback about minimum requirements

After this fix:

Empty arrays are rejected with descriptive error: "At least one file path must be provided"
Schema is self-documenting with clear parameter descriptions
AI assistants can better understand correct usage without explicit instruction

The fix maintains full backward compatibility for all valid use cases while providing better guidance for incorrect usage.

olaservo

Thanks for the PR! In order to fully address this issue, should the tool description also be updated? #2032

sebastien-rosset · 2025-08-23T15:49:36Z

#2032

I'm assuming you are referring to enhancing this description:

      {
        name: "read_multiple_files",
        description:
          "Read the contents of multiple files simultaneously. This is more " +
          "efficient than reading files one by one when you need to analyze " +
          "or compare multiple files. Each file's content is returned with its " +
          "path as a reference. Failed reads for individual files won't stop " +
          "the entire operation. Only works within allowed directories. " +
          "Requires at least one file path in the 'paths' parameter array.",

Are you suggesting the input parameters should be documented in the tool description field?

The MCP tool specification defines three distinct documentation layers for the description fields:

Tool description: "Human-readable description of functionality"
Input inputSchema: "JSON Schema defining expected parameters" (with property-level descriptions)
Output outputSchema: "Optional JSON Schema defining expected output structure" (with property-level descriptions)

Established Patterns for description fields

Looking at established patterns for the description field, I see inconsistencies. Sometimes the tool description is high-level without parameter descriptions. And sometimes the tool description includes parameter descriptions.

Filesystem Server:

Most tools: high-level, e.g. "Create a new file or completely overwrite an existing file"
Inconsistent: read_text_file mentions 'head' and 'tail' parameters in description

Memory Server: Tool descriptions are consistently high-level:

"Create multiple new entities in the knowledge graph"
"Add new observations to existing entities"

NO parameter names or structures mentioned in tool descriptions

Git Server: Tool descriptions are brief and functional:

"Shows the working tree status"
"Records changes to the repository"

NO parameter details in tool descriptions

How LLMs Use Tool Documentation in Practice

A key consideration is how LLMs Use Tool Documentation in Practice. Looking at VS Code, Claude desktop and Claude code, I see the LLMs use the following pattern:

Read Tool descriptions → Help LLM understand when and why to use a tool
Read Schema descriptions → Help LLM understand how to structure the parameters

Both description fields are read before invoking the tool, as they serve complementary, not redundant purposes. Note this is just anecdotal evidence, I'm not sure if this pattern is widespread across all LLMs.

I've also noticed the following behavior:

When the input schema is not documented or poorly documented, the LLM creates a wrong input payload for the tool action.
The above problem is resolved after writing proper descriptions to the input schema. In that case, there is no need to document the input schema in the high-level tool description.

Problems with Including Parameters in Tool Descriptions

IMHO, it makes sense to keep the tool description high-level otherwise it becomes redundant with the input schema description, potentially leading to maintenance issues and contradictions/ambiguities. Including Parameters in Tool Descriptions can cause problems:

Redundancy: Parameter details are already captured in schema descriptions where they belong
Maintenance burden: Changes to parameters require updates in multiple places
Inconsistency: Creates different documentation patterns within the same codebase
Semantic confusion: Mixes high-level purpose with implementation details
Verbosity: Makes tool descriptions unnecessarily long and harder to scan (i.e. more tokens consumed if the parameters are described both in the input schema and tool description)

Benefits of Separation

The separation of the tool description and input schema descriptions allows for:

Quick tool selection based on high-level descriptions
Detailed parameter construction using schema specifications
Clear separation of concerns in documentation

Potential Enhancements to the MCP Spec

I think there is a genuine ambiguity in the MCP spec that could benefit from clarification. For example, there could be documentation best practices. I have created modelcontextprotocol/modelcontextprotocol#1382

olaservo · 2025-08-23T19:21:50Z

Thanks @sebastien-rosset , looking at the tool description again, I think you're correct that there isn't a need to specify anything differently there. I agree that we shouldn't be repeating the information given by tool params given that the model should already see those (and tbh the name of the function already implies that a list or array of files should be passed in, so not sure why models are struggling with using it correctly in the first place).

sebastien-rosset force-pushed the read_multiple_files branch from b11c9e0 to fbec84b Compare August 22, 2025 18:02

olaservo reviewed Aug 23, 2025

View reviewed changes

olaservo added the server-filesystem Reference implementation for the Filesystem MCP server - src/filesystem label Aug 23, 2025

sebastien-rosset mentioned this pull request Aug 23, 2025

SEP-1382: Documentation Best Practices for MCP Tools modelcontextprotocol/modelcontextprotocol#1382

Open

Add documentation for the read_multiple_files action.

b647cb3

sebastien-rosset force-pushed the read_multiple_files branch from fbec84b to b647cb3 Compare August 23, 2025 17:51

olaservo approved these changes Aug 23, 2025

View reviewed changes

olaservo merged commit 338d8af into modelcontextprotocol:main Aug 23, 2025
19 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Improve documentation for the `read_multiple_files` action #2602

Improve documentation for the `read_multiple_files` action #2602

Uh oh!

sebastien-rosset commented Aug 22, 2025

Uh oh!

olaservo left a comment

Uh oh!

sebastien-rosset commented Aug 23, 2025 •

edited

Loading

Uh oh!

olaservo commented Aug 23, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Improve documentation for the read_multiple_files action #2602

Improve documentation for the read_multiple_files action #2602

Uh oh!

Conversation

sebastien-rosset commented Aug 22, 2025

Description

Server Details

Motivation and Context

How Has This Been Tested?

Breaking Changes

Types of changes

Checklist

Additional context

Uh oh!

olaservo left a comment

Choose a reason for hiding this comment

Uh oh!

sebastien-rosset commented Aug 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Established Patterns for description fields

How LLMs Use Tool Documentation in Practice

Problems with Including Parameters in Tool Descriptions

Benefits of Separation

Potential Enhancements to the MCP Spec

Uh oh!

olaservo commented Aug 23, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Improve documentation for the `read_multiple_files` action #2602

Improve documentation for the `read_multiple_files` action #2602

sebastien-rosset commented Aug 23, 2025 •

edited

Loading