Skip to content

This project provides a Python framework for building conversational AI applications that can interact with your data through CData Connect AI (https://www.cdata.com/ai/). It uses the Model Context Protocol (MCP) to enable OpenAI's GPT models to discover and query your connected data sources.

License

Notifications You must be signed in to change notification settings

CDataSoftware/connectai-openai-agent

Repository files navigation

CData Connect AI OpenAI Agent

Build AI-powered data assistants using OpenAI's GPT models and CData Connect AI. Query your live data sources using natural language conversations.

Overview

This project provides a Python framework for building conversational AI applications that can interact with your data through CData Connect AI. It uses the Model Context Protocol (MCP) to enable OpenAI's GPT models to discover and query your connected data sources.

Key Features:

  • Natural language queries against 300+ data sources (Google Sheets, Salesforce, Snowflake, etc.)
  • Automatic tool discovery via MCP protocol
  • Multi-turn conversation support
  • Streaming responses
  • Easy-to-use Python API

Interested in embedding connectivity into your product?

Learn more about Embedded Cloud for AI:

Architecture

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│                 │     │                  │     │                 │
│  Your Python    │────▶│  CData Connect   │────▶│  Data Sources   │
│  Application    │     │  AI MCP Server   │     │  (300+ types)   │
│                 │◀────│                  │◀────│                 │
└─────────────────┘     └──────────────────┘     └─────────────────┘
        │                        │
        │    Tool Discovery      │
        │    & Execution         │
        ▼                        │
┌─────────────────┐              │
│                 │              │
│  OpenAI API     │──────────────┘
│  (GPT-4, etc.)  │   Natural Language
│                 │   to SQL Translation
└─────────────────┘

Quick Start

Prerequisites

  • Python 3.9+ - Download from python.org or install via your package manager
  • pip - Python's package installer (included with Python 3.4+). Verify with pip --version or pip3 --version
  • An OpenAI API key
  • A CData Connect AI account (free trial available)

Installation

  1. Clone the repository:
git clone https://github.com/CDataSoftware/connectai-openai-agent.git
cd connectai-openai-agent
  1. Install dependencies:
pip install -r requirements.txt
  1. Configure your environment:
cp .env.example .env

Edit .env with your credentials:

OPENAI_API_KEY=your_openai_api_key
CDATA_EMAIL=your_email@example.com
CDATA_PAT=your_personal_access_token

Get Your CData Connect AI Credentials

  1. Sign up at CData Connect AI
  2. Add a data source connection (e.g., Google Sheets)
  3. Go to Settings > Access Tokens > Create PAT
  4. Copy the token (it's only shown once!)

Basic Usage

from dotenv import load_dotenv
from src.connectai_openai import Config, MCPAgent

load_dotenv()

# Create agent from environment variables
config = Config.from_env()
agent = MCPAgent(config)

# Ask questions about your data
response = agent.chat("What data sources do I have connected?")
print(response)

response = agent.chat("Show me the tables in my Google Sheets connection")
print(response)

response = agent.chat("Query the top 5 accounts by revenue")
print(response)

Interactive Chat

Run the example chat application:

python examples/basic_chat.py

Examples

Query Google Sheets Data

from dotenv import load_dotenv
from src.connectai_openai import Config, MCPAgent

load_dotenv()

config = Config.from_env()
agent = MCPAgent(config)

# Explore available data
agent.chat("List all my data connections")
agent.chat("What tables are in my Google Sheets?")
agent.chat("Show me the columns in the account table")

# Query the data
response = agent.chat("""
    Show me all accounts with revenue over $1 million,
    sorted by revenue descending
""")
print(response)

Streaming Responses

from dotenv import load_dotenv
from src.connectai_openai import Config, MCPAgent

load_dotenv()

config = Config.from_env()
agent = MCPAgent(config)

# Stream the response
for chunk in agent.chat_stream("Analyze the health of my top 5 customers"):
    print(chunk, end="", flush=True)

Multi-Source Analysis

# Query across multiple connected sources
agent.chat("Compare sales data from Salesforce with usage data from Google Sheets")

API Reference

Config

Configuration class for credentials and settings.

# From environment variables
config = Config.from_env()

# Or explicit values
config = Config(
    openai_api_key="sk-...",
    cdata_email="user@example.com",
    cdata_pat="your-pat-token",
    openai_model="gpt-4o",  # optional
    mcp_server_url="https://mcp.cloud.cdata.com/mcp"  # optional
)

MCPAgent

AI agent with tool calling capabilities.

agent = MCPAgent(
    config,
    instructions="Custom system prompt...",  # optional
    max_tool_iterations=10  # optional
)

# Methods
response = agent.chat("Your question")
for chunk in agent.chat_stream("Your question"):
    print(chunk)
agent.clear_history()
tools = agent.get_available_tools()

MCPClient

Low-level MCP client for direct tool access.

from src.connectai_openai import Config, MCPClient

config = Config.from_env()
client = MCPClient(config)

# Discover tools
tools = client.list_tools()

# Execute tools directly
catalogs = client.get_catalogs()
schemas = client.get_schemas("MyConnection")
tables = client.get_tables("MyConnection", "GoogleSheets")
columns = client.get_columns("MyConnection", "GoogleSheets", "account")
results = client.query_data("SELECT * FROM [MyConnection].[GoogleSheets].[account]")

Available MCP Tools

The agent automatically has access to these CData Connect AI tools:

Tool Description
getCatalogs List available data source connections
getSchemas Get schemas for a specific catalog
getTables Get tables in a schema
getColumns Get column metadata for a table
queryData Execute SQL queries
getProcedures List stored procedures
getProcedureParameters Get procedure parameter details
executeProcedure Execute stored procedures
getInstructions Get driver-specific instructions and best practices for a data source

SQL Query Format

When querying data, use fully qualified table names:

SELECT * FROM [CatalogName].[SchemaName].[TableName]

Example:

SELECT [Name], [Revenue]
FROM [demo_organization].[GoogleSheets].[account]
WHERE [Revenue] > 1000000
ORDER BY [Revenue] DESC

Sample Data

To get started quickly, copy our sample Google Sheet with customer data:

  • account: Company information (name, industry, revenue)
  • opportunity: Sales pipeline data
  • tickets: Support ticket information
  • usage: Product usage metrics

Troubleshooting

Authentication Errors

  • Verify your CData email and PAT are correct in .env
  • Ensure the PAT hasn't expired
  • Check that your Connect AI account is active

No Tools Available

  • Confirm you have at least one data source connected in Connect AI
  • Check that your user has permissions to access the connection

Query Errors

  • Use fully qualified table names: [Catalog].[Schema].[Table]
  • Verify column names exist using getColumns
  • Check SQL syntax (Connect AI uses SQL-92 standard)

Resources

License

MIT License - see LICENSE for details.

Support

About

This project provides a Python framework for building conversational AI applications that can interact with your data through CData Connect AI (https://www.cdata.com/ai/). It uses the Model Context Protocol (MCP) to enable OpenAI's GPT models to discover and query your connected data sources.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages