Add streaming/pagination support for large result sets #25

@sgaunet

Description

Problem

The processRows() method in internal/app/client.go (lines 361-389) allocates the full result set in memory before returning, which can cause memory spikes with large queries.

Current implementation:

func (c *PostgreSQLClientImpl) processRows(rows *sql.Rows) ([]map[string]any, error) {
    // ... column processing ...
    
    results := make([]map[string]any, 0)
    for rows.Next() {
        // ... scan row ...
        results = append(results, row)
    }
    return results, nil
}

Performance Impact

Memory usage example:

  • Query returning 1 million rows with 10 columns averaging 100 bytes each
  • Memory consumption: ~1 GB held live at once in the results slice of row maps
  • Causes: Memory spikes, GC pressure, potential OOM errors
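
For reference, the arithmetic behind the estimate: 1,000,000 rows × 10 columns × 100 bytes ≈ 10^9 bytes ≈ 1 GB of raw values, before counting the per-row map and interface-value overhead that Go adds on top.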

Issues

  1. No streaming: Must load entire result set before processing
  2. Memory spikes: Large queries cause unpredictable memory usage
  3. Slow first response: User waits for all rows to load
  4. No pagination: Can't fetch results in batches

Proposed Solutions

Option 1: Add Streaming Interface

type RowCallback func(row map[string]any) error

func (c *PostgreSQLClientImpl) ExecuteQueryStream(ctx context.Context, query string, callback RowCallback) error {
    rows, err := c.db.QueryContext(ctx, query)
    if err != nil {
        return err
    }
    defer rows.Close()
    
    columns, err := rows.Columns()
    if err != nil {
        return err
    }
    for rows.Next() {
        row := make(map[string]any, len(columns))
        // ... scan row into row using columns ...
        if err := callback(row); err != nil {
            return err
        }
    }
    return rows.Err()
}
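
For illustration, a minimal sketch of how a caller could consume such an interface; the streamToJSON wrapper and the json.Encoder destination are assumptions for the example, not existing code:

// streamToJSON is a hypothetical caller: it encodes each row to the writer as
// soon as it is scanned, so only one row needs to be held in memory at a time.
func streamToJSON(ctx context.Context, c *PostgreSQLClientImpl, w io.Writer, query string) error {
    enc := json.NewEncoder(w)
    return c.ExecuteQueryStream(ctx, query, func(row map[string]any) error {
        return enc.Encode(row) // write the row out immediately instead of buffering it
    })
}

The trade-off is that the callback runs while the connection is still open, so slow callbacks hold the connection for longer.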

Option 2: Add Pagination Support

type PaginationOptions struct {
    Limit  int
    Offset int
}

func (c *PostgreSQLClientImpl) ExecuteQueryPaginated(ctx context.Context, query string, opts PaginationOptions) ([]map[string]any, error) {
    // Append LIMIT/OFFSET to query
    paginatedQuery := fmt.Sprintf("%s LIMIT %d OFFSET %d", query, opts.Limit, opts.Offset)
    return c.ExecuteQuery(ctx, paginatedQuery)
}
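
As a usage sketch, a caller could walk a large result set page by page; the fetchAllPages helper and its stop-on-short-page condition are illustrative assumptions:

// fetchAllPages is a hypothetical helper: it requests one page at a time and
// stops when a page comes back smaller than the requested limit.
func fetchAllPages(ctx context.Context, c *PostgreSQLClientImpl, query string, pageSize int) error {
    for offset := 0; ; offset += pageSize {
        page, err := c.ExecuteQueryPaginated(ctx, query, PaginationOptions{Limit: pageSize, Offset: offset})
        if err != nil {
            return err
        }
        // ... process this page (at most pageSize rows in memory) ...
        if len(page) < pageSize {
            return nil // last page reached
        }
    }
}

Note that naively appending LIMIT/OFFSET only works for queries that don't already end with their own LIMIT clause, and OFFSET-based pagination needs a stable ORDER BY to return consistent pages.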

Option 3: Add Result Size Limits

const (
    MaxResultRows = 10000 // Configurable maximum
)

func (c *PostgreSQLClientImpl) processRows(rows *sql.Rows) ([]map[string]any, error) {
    results := make([]map[string]any, 0)
    rowCount := 0
    
    for rows.Next() {
        rowCount++
        if rowCount > MaxResultRows {
            return nil, fmt.Errorf("result set exceeds maximum of %d rows", MaxResultRows)
        }
        // ... process row ...
    }
    return results, nil
}
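
To connect this with the POSTGRES_MCP_MAX_ROWS item in the checklist below, here is a sketch of reading the cap from the environment; the maxResultRows helper is an assumption, not existing code:

const defaultMaxResultRows = 10000

// maxResultRows is a hypothetical helper: it reads POSTGRES_MCP_MAX_ROWS and
// falls back to the default when the variable is unset or not a positive integer.
func maxResultRows() int {
    if v := os.Getenv("POSTGRES_MCP_MAX_ROWS"); v != "" {
        if n, err := strconv.Atoi(v); err == nil && n > 0 {
            return n
        }
    }
    return defaultMaxResultRows
}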

Recommended Approach

Implement Option 3 (result limits) immediately for safety, then add Option 2 (pagination) for usability.

Impact

  • Severity: MEDIUM
  • Type: Performance, Enhancement
  • Location: internal/app/client.go:361-389
  • Affects: All query execution methods

Checklist

  • Add configurable maximum result size limit
  • Implement pagination support for large result sets
  • Add environment variable for max result rows (e.g., POSTGRES_MCP_MAX_ROWS)
  • Update MCP tools to expose pagination parameters
  • Add tests with large result sets
  • Document memory limitations and pagination usage in README
  • Consider adding streaming interface for advanced use cases
