Description
Bug description
The MarkdownDocumentReader in Spring AI appears to ignore, or fail to read, hyperlinks in Markdown documents. When a Markdown file contains links, the extracted Document content does not include the expected URL or link text, so link information is dropped from the processed output. Image links inside tables are ignored by the reader as well.
#Syntax Issue 1: "[](link-url)"
#Syntax Issue 2: "["
This affects applications where preserving link references is important (e.g., ingestion for RAG, documentation analysis, link-aware agents, etc.).
Environment
- Spring AI version: (e.g., 1.1.x or 1.0.x)
- MarkdownDocumentReader: org.springframework.ai.reader.markdown.MarkdownDocumentReader (from spring-ai-markdown-document-reader)
- Java version: (e.g., 17, 21)
- Build tool: (Maven)
- Operating System: (MacBook Pro M2)
Steps to reproduce
Create a Markdown file with content that includes one or more links, e.g.:
# Welcome to the Docling Java Project!

This is the repository for Docling Java, a Java API for using [Docling](https://github.com/docling-project).
Instantiate MarkdownDocumentReader with the Markdown as a resource:
Resource resource = new ClassPathResource("example.md");
MarkdownDocumentReader reader = new MarkdownDocumentReader(resource, MarkdownDocumentReaderConfig.defaultConfig());
List<Document> docs = reader.get();
Inspect the resulting Document contents for the link. The link URL/text is missing or the link is stripped out.
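To make the inspection step less manual, a small check along the following lines could compare the Markdown source with the extracted text. This is only a sketch: `MarkdownLinkCheck`, `linkUrls`, and `missingUrls` are hypothetical names that are not part of Spring AI, and the regular expression covers only simple inline links and images (`[text](url)`, `![alt](url)`), not reference-style links or nested brackets. The `extracted` string stands in for whatever `document.getText()` returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for this report; not part of Spring AI. */
final class MarkdownLinkCheck {

    // Simplified pattern for inline links and images: [text](url) or ![alt](url).
    // The link text may be empty, as in "[](link-url)".
    private static final Pattern INLINE_LINK =
            Pattern.compile("!?\\[([^\\]]*)\\]\\(\\s*([^)\\s]+)[^)]*\\)");

    /** Returns every inline link URL found in the Markdown source, in order. */
    static List<String> linkUrls(String markdown) {
        List<String> urls = new ArrayList<>();
        Matcher m = INLINE_LINK.matcher(markdown);
        while (m.find()) {
            urls.add(m.group(2));
        }
        return urls;
    }

    /** Returns the URLs present in the source but absent from the extracted text. */
    static List<String> missingUrls(String markdown, String extractedText) {
        List<String> missing = new ArrayList<>();
        for (String url : linkUrls(markdown)) {
            if (!extractedText.contains(url)) {
                missing.add(url);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String source = "a Java API for using [Docling](https://github.com/docling-project).";
        // Assumed stand-in for document.getText() when the URL has been dropped.
        String extracted = "a Java API for using Docling.";
        System.out.println(missingUrls(source, extracted));
    }
}
```

In a real reproduction, `source` would be the raw file content and `extracted` the concatenated text of the returned documents; a non-empty result indicates that at least one URL did not survive parsing.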
Expected behavior
Links in Markdown content (both link text and URL) should either be preserved in the parsed Document output or be included in a structured property/metadata field (if configurable).
The current behavior should be clarified (documented), or a change should be made so that links are not silently dropped.
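The two outcomes described above could look roughly like the following. This is an illustration of the requested output shapes only, not how MarkdownDocumentReader is implemented: the class `LinkPreservationSketch`, its methods, and the `"links"` metadata key are all assumptions made for this example, and the regular expression handles only simple inline links and images.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only; names and the "links" key are hypothetical. */
final class LinkPreservationSketch {

    // Simplified pattern: group 1 = link text (may be empty), group 2 = URL.
    private static final Pattern INLINE_LINK =
            Pattern.compile("!?\\[([^\\]]*)\\]\\(\\s*([^)\\s]+)[^)]*\\)");

    /** Option 1: keep the link in the text, rendered as "text (url)". */
    static String inlineLinks(String markdown) {
        Matcher m = INLINE_LINK.matcher(markdown);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String text = m.group(1);
            String url = m.group(2);
            String rendered = text.isEmpty() ? url : text + " (" + url + ")";
            m.appendReplacement(out, Matcher.quoteReplacement(rendered));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Option 2: collect the URLs into a metadata map under an assumed "links" key. */
    static Map<String, Object> linkMetadata(String markdown) {
        List<String> urls = new ArrayList<>();
        Matcher m = INLINE_LINK.matcher(markdown);
        while (m.find()) {
            urls.add(m.group(2));
        }
        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("links", urls);
        return metadata;
    }

    public static void main(String[] args) {
        String source = "a Java API for using [Docling](https://github.com/docling-project).";
        System.out.println(inlineLinks(source));
        System.out.println(linkMetadata(source));
    }
}
```

Either shape would satisfy the expectation; the inline form keeps the URL next to its text for embedding, while the metadata form keeps the text clean and leaves the URLs queryable.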
Minimal Complete Reproducible example
@Order(4)
public CommandLineRunner springAIMarkdownReader() {
    return args -> {
        MarkdownDocumentReaderConfig config = MarkdownDocumentReaderConfig.builder()
                .withHorizontalRuleCreateDocument(true)
                .withIncludeCodeBlock(true)
                .withIncludeBlockquote(true)
                .withAdditionalMetadata("filename", "Test.md")
                .build();
        MarkdownDocumentReader reader = new MarkdownDocumentReader("classpath:mark-down/Test.md", config);
        List<Document> documents = reader.get();
        log.info("The size of the documents list is {}", documents.size());
        System.out.println("");
        // Print each document's text and metadata to check whether links survived parsing.
        documents.forEach(document -> {
            log.info("{} {}", document.getText(), document.getMetadata());
            System.out.println("");
        });
    };
}