
Optimize PostgreSQL Storage for Contract Data to Resolve Slow Queries #95

@bennyhodl

Description

We're experiencing slow queries (>1s) when working with contract data in our PostgreSQL database. The root cause is the large row size (approximately 2MB per row), driven primarily by the contract_data BYTEA field being stored alongside all of the other contract metadata.
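
To confirm where the space is going before changing anything, the existing contracts table can be inspected directly (a sketch to run in psql; the table and column names are the ones used in the migration below):

-- How much of the table lives in TOAST, where large BYTEA values end up
SELECT pg_size_pretty(pg_total_relation_size('contracts')) AS table_with_toast_and_indexes,
       pg_size_pretty(pg_relation_size('contracts'))       AS heap_only;

-- Average stored size of the BYTEA payload vs. the whole row, in bytes
SELECT count(*)                           AS row_count,
       avg(pg_column_size(contract_data)) AS avg_contract_data_bytes,
       avg(pg_column_size(contracts.*))   AS avg_row_bytes
FROM contracts;
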
This issue outlines a two-phase approach to resolve these performance problems:

Add indexes to improve query performance (immediate fix)
Restructure the database to separate large BYTEA data from metadata

Phase 1: Add Indexes and Restructure (Immediate Performance Improvement)

Step 1: Create the database migration

/ddk/src/storage/postgres/migrations/[new_migration_file].sql

-- Create new tables with separated metadata and binary data
CREATE TABLE contract_metadata (
    id TEXT PRIMARY KEY,
    state SMALLINT NOT NULL CHECK (state >= 0),
    is_offer_party BOOLEAN NOT NULL,
    counter_party TEXT NOT NULL,
    offer_collateral BIGINT NOT NULL CHECK (offer_collateral >= 0),
    accept_collateral BIGINT NOT NULL CHECK (accept_collateral >= 0),
    total_collateral BIGINT NOT NULL CHECK (total_collateral >= 0),
    fee_rate_per_vb BIGINT NOT NULL CHECK (fee_rate_per_vb >= 0),
    cet_locktime INTEGER NOT NULL CHECK (cet_locktime >= 0),
    refund_locktime INTEGER NOT NULL CHECK (refund_locktime >= 0),
    pnl BIGINT
);

-- Add index on the state column which is used in frequent queries
CREATE INDEX idx_contract_metadata_state ON contract_metadata(state);

-- Create separate table for large binary data
CREATE TABLE contract_data (
    id TEXT PRIMARY KEY REFERENCES contract_metadata(id) ON DELETE CASCADE,
    contract_data BYTEA NOT NULL,
    is_compressed BOOLEAN NOT NULL DEFAULT false
);

-- Migrate existing data
INSERT INTO contract_metadata (
    id, state, is_offer_party, counter_party,
    offer_collateral, accept_collateral, total_collateral, fee_rate_per_vb,
    cet_locktime, refund_locktime, pnl
)
SELECT 
    id, state, is_offer_party, counter_party,
    offer_collateral, accept_collateral, total_collateral, fee_rate_per_vb,
    cet_locktime, refund_locktime, pnl
FROM contracts;

-- Copy binary data to the new table
INSERT INTO contract_data (id, contract_data, is_compressed)
SELECT id, contract_data, false FROM contracts;

-- We'll keep the original contracts table until the new schema is fully tested,
-- then we can remove it in a future migration
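
A quick sanity check after the copy (run manually, not part of the migration itself): the new tables should account for every existing contract before the old table is eventually dropped.

-- Row counts should match across the old and new tables
SELECT (SELECT count(*) FROM contracts)         AS old_rows,
       (SELECT count(*) FROM contract_metadata) AS metadata_rows,
       (SELECT count(*) FROM contract_data)     AS data_rows;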

Step 2: Update the PostgreSQL storage code

/ddk/src/storage/postgres/mod.rs

Add new structs to represent the split tables:

#[derive(sqlx::FromRow, Debug)]
struct ContractMetadataRow {
    pub id: String,
    pub state: i16,
    pub is_offer_party: bool,
    pub counter_party: String,
    pub offer_collateral: i64,
    pub accept_collateral: i64,
    pub total_collateral: i64,
    pub fee_rate_per_vb: i64,
    pub cet_locktime: i32,
    pub refund_locktime: i32,
    pub pnl: Option<i64>,
}

#[derive(sqlx::FromRow, Debug)]
struct ContractDataRow {
    pub id: String,
    pub contract_data: Vec<u8>,
    pub is_compressed: bool,
}

Step 3: Update the contract operations in the PostgresStore implementation

Then modify each function in the #[async_trait::async_trait] impl ManagerStorage for PostgresStore block to use the new table structure. Here's an example for get_contract:

async fn get_contract(
    &self,
    id: &ddk_manager::ContractId,
) -> Result<Option<ddk_manager::contract::Contract>, ddk_manager::error::Error> {
    let id_hex = hex::encode(id);
    
    // First, check if the contract exists in metadata
    let metadata = sqlx::query_as::<Postgres, ContractMetadataRow>(
        "SELECT * FROM contract_metadata WHERE id = $1"
    )
    .bind(&id_hex)
    .fetch_optional(&self.pool)
    .await
    .map_err(to_storage_error)?;
    
    if let Some(metadata) = metadata {
        // If it exists, fetch the contract data
        let data = sqlx::query_as::<Postgres, ContractDataRow>(
            "SELECT * FROM contract_data WHERE id = $1"
        )
        .bind(&id_hex)
        .fetch_one(&self.pool)
        .await
        .map_err(to_storage_error)?;
        
        // Decompress if needed
        let contract_bytes = decompress_contract_data(
            &data.contract_data, 
            data.is_compressed
        )?;
        
        Ok(Some(deserialize_contract(&contract_bytes)?))
    } else {
        Ok(None)
    }
}
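
The decompress_contract_data helper referenced above does not exist yet. A minimal sketch, assuming gzip via the flate2 crate (the compression format is an open choice here, and mapping the IO error into ddk_manager::error::Error is left to the caller):

use std::io::Read;
use flate2::read::GzDecoder;

// Sketch only: pass the bytes through untouched unless the row was flagged as
// compressed, in which case gunzip them before deserialization.
fn decompress_contract_data(data: &[u8], is_compressed: bool) -> std::io::Result<Vec<u8>> {
    if !is_compressed {
        return Ok(data.to_vec());
    }
    let mut decoder = GzDecoder::new(data);
    let mut decompressed = Vec::new();
    decoder.read_to_end(&mut decompressed)?;
    Ok(decompressed)
}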

Similar changes would be needed for other methods such as create_contract, update_contract, get_contracts, etc.
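
For the write path, the main new requirement is keeping the two tables in sync. A hypothetical helper showing the transactional pattern create_contract and update_contract could share (the name, signature, and plain INSERTs are assumptions, not the actual ManagerStorage method signatures):

// Both inserts commit or roll back together, so contract_metadata and
// contract_data never diverge. ContractMetadataRow is the struct added in Step 2.
async fn insert_contract_rows(
    pool: &sqlx::PgPool,
    meta: &ContractMetadataRow,
    contract_bytes: &[u8],
    is_compressed: bool,
) -> Result<(), sqlx::Error> {
    let mut tx = pool.begin().await?;

    sqlx::query(
        "INSERT INTO contract_metadata (
            id, state, is_offer_party, counter_party,
            offer_collateral, accept_collateral, total_collateral, fee_rate_per_vb,
            cet_locktime, refund_locktime, pnl
        ) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11)",
    )
    .bind(&meta.id)
    .bind(meta.state)
    .bind(meta.is_offer_party)
    .bind(&meta.counter_party)
    .bind(meta.offer_collateral)
    .bind(meta.accept_collateral)
    .bind(meta.total_collateral)
    .bind(meta.fee_rate_per_vb)
    .bind(meta.cet_locktime)
    .bind(meta.refund_locktime)
    .bind(meta.pnl)
    .execute(&mut *tx)
    .await?;

    sqlx::query(
        "INSERT INTO contract_data (id, contract_data, is_compressed) VALUES ($1, $2, $3)",
    )
    .bind(&meta.id)
    .bind(contract_bytes)
    .bind(is_compressed)
    .execute(&mut *tx)
    .await?;

    tx.commit().await
}

update_contract would follow the same shape with UPDATE (or INSERT ... ON CONFLICT DO UPDATE) statements.
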
Testing Plan

Test in development environment first:

Verify all operations (create, update, fetch contracts) work with the new structure
Measure performance improvement compared to the old schema (see the EXPLAIN ANALYZE sketch after this list)
Ensure data integrity is maintained
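
While the old contracts table is still present, the two schemas can be compared head to head (a sketch; the state value is arbitrary):

-- Old schema: every matching row carries the full contract_data payload
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM contracts WHERE state = 1;

-- New schema: metadata-only scan, served by idx_contract_metadata_state
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM contract_metadata WHERE state = 1;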

Before deploying to production:

Add temporary logging of query times (a sketch follows this list)
Prepare rollback procedure if issues are encountered
Consider feature flag for enabling the new storage implementation
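
For the temporary query-time logging, a small wrapper is probably enough (a sketch; timed and the log field names are made up here, and tracing is assumed to already be a dependency):

use std::time::Instant;

// Hypothetical helper: wraps any storage future and logs how long it took.
// Meant to be deleted once the new schema is validated in production.
async fn timed<T, F: std::future::Future<Output = T>>(label: &str, fut: F) -> T {
    let start = Instant::now();
    let result = fut.await;
    tracing::debug!(
        query = label,
        elapsed_ms = start.elapsed().as_millis() as u64,
        "storage query timing"
    );
    result
}

// At a call site:
// let contract = timed("get_contract", store.get_contract(&id)).await?;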

Expected Outcomes

Query times should improve from >1s to <100ms for most operations
Database size should be reduced, especially once the is_compressed flag is used to actually compress contract_data
Network transfer between the database and the application should drop significantly, since metadata queries no longer pull the ~2MB payload

Priority

High - These slow queries are impacting API response times and application performance
