CData Connect AI OpenAI Agent

Build AI-powered data assistants using OpenAI's GPT models and CData Connect AI. Query your live data sources using natural language conversations.

Overview

This project provides a Python framework for building conversational AI applications that can interact with your data through CData Connect AI. It uses the Model Context Protocol (MCP) to enable OpenAI's GPT models to discover and query your connected data sources.

Key Features:

Natural language queries against 300+ data sources (Google Sheets, Salesforce, Snowflake, etc.)
Automatic tool discovery via MCP protocol
Multi-turn conversation support
Streaming responses
Easy-to-use Python API

Interested in embedding connectivity into your product?

Learn more about Embedded Cloud for AI:

Visit the Embedded Cloud website
Watch our introductory video

Architecture

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│                 │     │                  │     │                 │
│  Your Python    │────▶│  CData Connect   │────▶│  Data Sources   │
│  Application    │     │  AI MCP Server   │     │  (300+ types)   │
│                 │◀────│                  │◀────│                 │
└─────────────────┘     └──────────────────┘     └─────────────────┘
        │                        │
        │    Tool Discovery      │
        │    & Execution         │
        ▼                        │
┌─────────────────┐              │
│                 │              │
│  OpenAI API     │──────────────┘
│  (GPT-4, etc.)  │   Natural Language
│                 │   to SQL Translation
└─────────────────┘

Quick Start

Prerequisites

Python 3.9+ - Download from python.org or install via your package manager
pip - Python's package installer (included with Python 3.4+). Verify with pip --version or pip3 --version
An OpenAI API key
A CData Connect AI account (free trial available)

Installation

Clone the repository:

git clone https://github.com/CDataSoftware/connectai-openai-agent.git
cd connectai-openai-agent

Install dependencies:

pip install -r requirements.txt

Configure your environment:

cp .env.example .env

Edit .env with your credentials:

OPENAI_API_KEY=your_openai_api_key
CDATA_EMAIL=your_email@example.com
CDATA_PAT=your_personal_access_token

Get Your CData Connect AI Credentials

Sign up at CData Connect AI
Add a data source connection (e.g., Google Sheets)
Go to Settings > Access Tokens > Create PAT
Copy the token (it's only shown once!)

Basic Usage

from dotenv import load_dotenv
from src.connectai_openai import Config, MCPAgent

load_dotenv()

# Create agent from environment variables
config = Config.from_env()
agent = MCPAgent(config)

# Ask questions about your data
response = agent.chat("What data sources do I have connected?")
print(response)

response = agent.chat("Show me the tables in my Google Sheets connection")
print(response)

response = agent.chat("Query the top 5 accounts by revenue")
print(response)

Interactive Chat

Run the example chat application:

python examples/basic_chat.py

Examples

Query Google Sheets Data

from dotenv import load_dotenv
from src.connectai_openai import Config, MCPAgent

load_dotenv()

config = Config.from_env()
agent = MCPAgent(config)

# Explore available data
agent.chat("List all my data connections")
agent.chat("What tables are in my Google Sheets?")
agent.chat("Show me the columns in the account table")

# Query the data
response = agent.chat("""
    Show me all accounts with revenue over $1 million,
    sorted by revenue descending
""")
print(response)

Streaming Responses

from dotenv import load_dotenv
from src.connectai_openai import Config, MCPAgent

load_dotenv()

config = Config.from_env()
agent = MCPAgent(config)

# Stream the response
for chunk in agent.chat_stream("Analyze the health of my top 5 customers"):
    print(chunk, end="", flush=True)

Multi-Source Analysis

# Query across multiple connected sources
agent.chat("Compare sales data from Salesforce with usage data from Google Sheets")

API Reference

Config

Configuration class for credentials and settings.

# From environment variables
config = Config.from_env()

# Or explicit values
config = Config(
    openai_api_key="sk-...",
    cdata_email="user@example.com",
    cdata_pat="your-pat-token",
    openai_model="gpt-4o",  # optional
    mcp_server_url="https://mcp.cloud.cdata.com/mcp"  # optional
)

MCPAgent

AI agent with tool calling capabilities.

agent = MCPAgent(
    config,
    instructions="Custom system prompt...",  # optional
    max_tool_iterations=10  # optional
)

# Methods
response = agent.chat("Your question")
for chunk in agent.chat_stream("Your question"):
    print(chunk)
agent.clear_history()
tools = agent.get_available_tools()

MCPClient

Low-level MCP client for direct tool access.

from src.connectai_openai import Config, MCPClient

config = Config.from_env()
client = MCPClient(config)

# Discover tools
tools = client.list_tools()

# Execute tools directly
catalogs = client.get_catalogs()
schemas = client.get_schemas("MyConnection")
tables = client.get_tables("MyConnection", "GoogleSheets")
columns = client.get_columns("MyConnection", "GoogleSheets", "account")
results = client.query_data("SELECT * FROM [MyConnection].[GoogleSheets].[account]")

Available MCP Tools

The agent automatically has access to these CData Connect AI tools:

Tool	Description
`getCatalogs`	List available data source connections
`getSchemas`	Get schemas for a specific catalog
`getTables`	Get tables in a schema
`getColumns`	Get column metadata for a table
`queryData`	Execute SQL queries
`getProcedures`	List stored procedures
`getProcedureParameters`	Get procedure parameter details
`executeProcedure`	Execute stored procedures
`getInstructions`	Get driver-specific instructions and best practices for a data source

SQL Query Format

When querying data, use fully qualified table names:

SELECT * FROM [CatalogName].[SchemaName].[TableName]

Example:

SELECT [Name], [Revenue]
FROM [demo_organization].[GoogleSheets].[account]
WHERE [Revenue] > 1000000
ORDER BY [Revenue] DESC

Sample Data

To get started quickly, copy our sample Google Sheet with customer data:

account: Company information (name, industry, revenue)
opportunity: Sales pipeline data
tickets: Support ticket information
usage: Product usage metrics

Troubleshooting

Authentication Errors

Verify your CData email and PAT are correct in .env
Ensure the PAT hasn't expired
Check that your Connect AI account is active

No Tools Available

Confirm you have at least one data source connected in Connect AI
Check that your user has permissions to access the connection

Query Errors

Use fully qualified table names: [Catalog].[Schema].[Table]
Verify column names exist using getColumns
Check SQL syntax (Connect AI uses SQL-92 standard)

Resources

License

MIT License - see LICENSE for details.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
docs		docs
examples		examples
src/connectai_openai		src/connectai_openai
tests		tests
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

CData Connect AI OpenAI Agent

Overview

Interested in embedding connectivity into your product?

Architecture

Quick Start

Prerequisites

Installation

Get Your CData Connect AI Credentials

Basic Usage

Interactive Chat

Examples

Query Google Sheets Data

Streaming Responses

Multi-Source Analysis

API Reference

Config

MCPAgent

MCPClient

Available MCP Tools

SQL Query Format

Sample Data

Troubleshooting

Authentication Errors

No Tools Available

Query Errors

Resources

License

Support

About

Uh oh!

Releases

Packages

Languages

License

CDataSoftware/connectai-openai-agent

Folders and files

Latest commit

History

Repository files navigation

CData Connect AI OpenAI Agent

Overview

Interested in embedding connectivity into your product?

Architecture

Quick Start

Prerequisites

Installation

Get Your CData Connect AI Credentials

Basic Usage

Interactive Chat

Examples

Query Google Sheets Data

Streaming Responses

Multi-Source Analysis

API Reference

Config

MCPAgent

MCPClient

Available MCP Tools

SQL Query Format

Sample Data

Troubleshooting

Authentication Errors

No Tools Available

Query Errors

Resources

License

Support

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages