This project deploys a Telegram chatbot on AWS using Infrastructure as Code (Terraform). It provisions S3 buckets for storing archived chat history, DynamoDB tables for managing user sessions, Lambda for serverless processing, and API Gateway for real-time webhook integration with Telegram.
Status: Fully functional. Ollama AI integration is live on EC2 with API key authentication. Session management, commands, archive features, and AI chat are all working.
- Overview
- Architecture
- Features
- Prerequisites
- AWS Academy Setup
- Remote State Setup
- Deployment
- Telegram Webhook Setup
- Bot Commands
- Project Structure
- Module Structure
- External API Integration
- Data Storage
- Observability
- Verification
- Troubleshooting
- Cleanup
- License
This project creates a serverless Telegram bot running on AWS. When users send messages to the bot, Telegram forwards them to an API Gateway endpoint, which triggers a Lambda function to process the message and respond.
Key Features:
- ✅ Real-time message handling via API Gateway webhook
- ✅ User session creation and management
- ✅ Command handling (`/help`, `/newsession`, `/listsessions`, `/switch`, `/history`, `/echo`)
- ✅ Archive system (`/archive`, `/listarchives`, `/export`, file import)
- ✅ DynamoDB for live session storage
- ✅ S3 for archived session storage
- ✅ Ollama AI integration with API key authentication
```
┌─────────────┐      ┌─────────────────┐      ┌────────────────┐      ┌─────────────────┐
│  Telegram   │─────▶│   API Gateway   │─────▶│     Lambda     │─────▶│  EC2 (Ollama)   │
│    User     │◀─────│   (webhook)     │◀─────│  (handler.py)  │◀─────│  AI inference   │
└─────────────┘      └─────────────────┘      └────────────────┘      └─────────────────┘
                                                 │            │
                              ┌──────────────────┘            │
                              ▼                               ▼
                     ┌─────────────────┐      ┌─────────────────────────────┐
                     │    DynamoDB     │      │             S3              │
                     │(active sessions)│      │ (archived chats + AI models)│
                     └─────────────────┘      └─────────────────────────────┘
```
Flow:
- User sends a message to the Telegram bot
- Telegram POSTs the update to API Gateway webhook URL
- API Gateway triggers Lambda function
- Lambda processes the message (command or chat)
- Active session data is stored/retrieved from DynamoDB
- Archived sessions are stored in S3
- Lambda sends response back to Telegram
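The flow above can be sketched as a minimal dispatch function. This is a hypothetical illustration, not the project's actual `handler.py`; the name `route_update` and the routing categories are assumptions for the sketch:

```python
import json

def route_update(update: dict) -> dict:
    """Decide how an incoming Telegram update should be handled."""
    message = update.get("message", {})
    text = message.get("text", "")
    chat_id = message.get("chat", {}).get("id")

    if text.startswith("/"):
        # Commands like /help or /newsession go to the command handler
        kind = "command"
    elif "document" in message:
        # An attached JSON file triggers the archive import path
        kind = "import"
    else:
        # Plain text is forwarded to the AI model for a chat reply
        kind = "chat"
    return {"chat_id": chat_id, "kind": kind, "text": text}

def lambda_handler(event, context):
    """API Gateway proxy integration: the Telegram update arrives in event['body']."""
    update = json.loads(event["body"])
    routed = route_update(update)
    # ...process routed update: DynamoDB session lookup, S3 archives, Ollama call,
    # then reply to the user via the Telegram Bot API...
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```

Returning `200` promptly matters: Telegram retries the webhook delivery if it does not receive a success response.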
| Command | Purpose | Status |
|---|---|---|
| `/start` or `/hello` | Initialize and greet user | ✅ Working |
| `/help` | Show available commands | ✅ Working |
| `/newsession` | Show available models | ✅ Working |
| `/newsession <number>` | Create session with chosen model | ✅ Working |
| `/listsessions` | List all user sessions | ✅ Working |
| `/switch <number>` | Switch to a different session | ✅ Working |
| `/history` | Show recent messages in session | ✅ Working |
| `/archive` | List sessions available to archive | ✅ Working |
| `/archive <number>` | Archive a specific session to S3 | ✅ Working |
| `/listarchives` | List archived sessions | ✅ Working |
| `/export <number>` | Export archive as JSON file | ✅ Working |
| Send JSON file | Import archive from file | ✅ Working |
| `/status` | Check bot status | ✅ Working |
| `/echo <text>` | Echo back text (test command) | ✅ Working |
| Chat messages | Send to AI model | ✅ Working |
- AWS Academy Learner Lab access (or AWS account)
- Terraform >= 1.0.0
- AWS CLI configured with credentials
- Python 3.9+ with pip
- Telegram Bot Token (from @BotFather)
- Open AWS Academy and navigate to "Launch AWS Academy Learner Lab"
- Click Start Lab and wait for the status to turn green
- Click AWS Details to view credentials
Click Show next to "AWS CLI" in AWS Details and copy the credentials:

```bash
# Edit credentials file
nano ~/.aws/credentials
```

Paste the credentials:

```ini
[default]
aws_access_key_id=ASIA...
aws_secret_access_key=...
aws_session_token=FwoGZX...
```

Verify the credentials are active:

```bash
aws sts get-caller-identity
```

Look up the LabRole ARN:

```bash
aws iam get-role --role-name LabRole --query 'Role.Arn' --output text
```

Output: `arn:aws:iam::ACCOUNT_ID:role/LabRole`
Remote state stores your Terraform state in S3 with DynamoDB locking, enabling team collaboration and state protection.
A management script handles everything automatically — no manual file editing or hardcoding needed.
```bash
./scripts/manage-state.sh remote
```

This will:
- Verify AWS credentials
- Create the S3 bucket and DynamoDB lock table if they don't exist
- Uncomment the backend block in `provider.tf`
- Auto-detect your account ID from active credentials
- Migrate existing local state to S3
```bash
./scripts/manage-state.sh local
```

This will:
- Comment out the backend block in `provider.tf`
- Migrate remote state back to a local `terraform.tfstate` file
```bash
./scripts/manage-state.sh status
```

In AWS Academy environments, the backend S3 bucket and DynamoDB table may need to be recreated each session, since resources are deleted when labs end. For persistent setups, consider using a personal AWS account.
git clone https://github.com/Man2Dev/cloud-Ai.git
cd cloud-Ai
```bash
# Create configuration file
cp terraform.tfvars.example terraform.tfvars
```

Edit `terraform.tfvars`:

```hcl
telegram_token = "YOUR_TELEGRAM_BOT_TOKEN"
lab_role_arn   = "arn:aws:iam::YOUR_ACCOUNT_ID:role/LabRole"
```

Build the Lambda deployment package:

```bash
# Clean previous builds
rm -rf package/ lambda_function.zip

# Create package directory
mkdir -p package

# Install dependencies
pip install -r requirements.txt -t ./package

# Copy handler
cp handler.py package/
```

Deploy with Terraform:

```bash
# Initialize Terraform
terraform init

# Preview changes
terraform plan

# Deploy
terraform apply -auto-approve
```

After deployment, Terraform will output:

- `api_gateway_url` - Your webhook URL
- `s3_bucket_name` - S3 bucket for archives
- `dynamodb_table_name` - DynamoDB table name
- `lambda_function_name` - Lambda function name
Run the webhook setup script - it reads your token from terraform.tfvars and configures everything automatically:
```bash
./scripts/setup-webhook.sh
```

The script will:
- Read your bot token from `terraform.tfvars` (keeps it private)
- Get the API Gateway URL from Terraform outputs
- Register the webhook with Telegram
- Verify the configuration
- Test bot connectivity
If you prefer to set up manually, replace YOUR_BOT_TOKEN and use the api_gateway_url from Terraform output:
```bash
# Get your API Gateway URL
terraform output api_gateway_url

# Set the webhook
curl "https://api.telegram.org/botYOUR_BOT_TOKEN/setWebhook?url=YOUR_API_GATEWAY_URL"
```

Example:

```bash
curl "https://api.telegram.org/bot123456:ABC-DEF/setWebhook?url=https://abc123.execute-api.us-east-1.amazonaws.com/prod/webhook"
```

Verify the webhook:

```bash
curl "https://api.telegram.org/botYOUR_BOT_TOKEN/getWebhookInfo"
```

Expected response:

```json
{
  "ok": true,
  "result": {
    "url": "https://abc123.execute-api.us-east-1.amazonaws.com/prod/webhook",
    "has_custom_certificate": false,
    "pending_update_count": 0
  }
}
```

Test the bot:

- Open Telegram and find your bot
- Send `/start` or `/help`
- The bot should respond instantly!
If the bot stops responding after redeployment:
- The API Gateway URL may have changed
- Run `./scripts/setup-webhook.sh` to update the webhook
```
.
├── provider.tf                  # AWS provider + backend configuration
├── variables.tf                 # Variable definitions with validation
├── locals.tf                    # Local values for naming/tags
├── main.tf                      # Module calls and data sources
├── outputs.tf                   # Terraform outputs (from modules)
├── modules/                     # Reusable Terraform modules
│   ├── s3/                      # S3 bucket module
│   ├── dynamodb/                # DynamoDB table module
│   ├── lambda/                  # Lambda function module
│   ├── api_gateway/             # API Gateway module
│   ├── monitoring/              # CloudWatch metric filter + alarm
│   └── ec2/                     # EC2 Ollama inference server
├── backend-setup/               # Remote state infrastructure
│   └── main.tf                  # S3 bucket + DynamoDB for state
├── terraform.tfvars.example     # Example configuration
├── terraform.tfvars             # Your configuration (gitignored)
├── requirements.txt             # Python dependencies
├── handler.py                   # Lambda function code
├── package/                     # Lambda deployment package (generated)
├── scripts/
│   ├── setup-webhook.sh         # Telegram webhook setup
│   ├── view-data.sh             # View S3/DynamoDB contents
│   ├── test-observability.sh    # Verify logging, metrics, alarms
│   ├── manage-state.sh          # Switch between local/remote state
│   └── manage-ollama.sh         # Start/stop Ollama EC2 instance
├── docs/
│   ├── GAP_ANALYSIS.md          # Best practices analysis
│   └── DEMO_CHEATSHEET.md       # Demo commands reference
├── .github/workflows/
│   ├── terraform-validate.yml   # CI: Terraform validation
│   ├── pr-check.yml             # CI: PR validation
│   └── deploy.yml               # CD: AWS deployment
├── CONTRIBUTING.md              # Branch strategy & guidelines
├── CHANGELOG.md                 # Project changelog
├── .gitignore                   # Git ignore rules
├── LICENSE                      # GPL v3 License
└── README.md                    # This documentation
```
The infrastructure is organized into reusable modules:
Creates an S3 bucket with versioning and lifecycle rules.
| Variable | Description | Default |
|---|---|---|
| `bucket_name` | Name of the bucket | Required |
| `versioning_enabled` | Enable versioning | `true` |
| `enable_lifecycle_rules` | Enable archival rules | `true` |
| `transition_to_ia_days` | Days before IA transition | `90` |
| `transition_to_glacier_days` | Days before Glacier | `180` |
Creates a DynamoDB table with GSIs and TTL support.
| Variable | Description | Default |
|---|---|---|
| `table_name` | Name of the table | Required |
| `billing_mode` | `PAY_PER_REQUEST` or `PROVISIONED` | `PAY_PER_REQUEST` |
| `hash_key` | Partition key name | Required |
| `global_secondary_indexes` | List of GSI configurations | `[]` |
| `ttl_enabled` | Enable TTL | `false` |
Creates a Lambda function with CloudWatch log group.
| Variable | Description | Default |
|---|---|---|
| `function_name` | Name of the function | Required |
| `filename` | Path to deployment package | Required |
| `handler` | Function handler | `handler.lambda_handler` |
| `runtime` | Lambda runtime | `python3.9` |
| `role_arn` | IAM role ARN | Required |
Creates a REST API with Lambda integration.
| Variable | Description | Default |
|---|---|---|
| `api_name` | Name of the API | Required |
| `resource_path` | API path (e.g., `webhook`) | `webhook` |
| `lambda_invoke_arn` | Lambda invoke ARN | Required |
| `stage_name` | Deployment stage | `dev` |
Creates a CloudWatch metric filter and alarm for error detection.
| Variable | Description | Default |
|---|---|---|
| `function_name` | Lambda function name | Required |
| `log_group_name` | CloudWatch log group name | Required |
| `metric_namespace` | CloudWatch metric namespace | `TelegramBot` |
| `error_threshold` | Error count to trigger alarm | `1` |
| `evaluation_period_minutes` | Alarm evaluation window | `5` |
Creates an EC2 instance running Ollama for AI inference.
| Variable | Description | Default |
|---|---|---|
| `instance_name` | Name tag for the instance | Required |
| `instance_type` | EC2 instance type | `t3.large` |
| `ollama_model` | Model to pull on first boot | `llama3.2:1b` |
| `models_s3_bucket` | S3 bucket for model persistence | Required |
| `ssh_allowed_cidr` | CIDR for SSH access | `0.0.0.0/0` |
The bot integrates with Ollama, a self-hosted large language model inference server running on an EC2 instance. When users send chat messages, Lambda calls the Ollama API over HTTP to generate AI responses.
API Details:
| Property | Value |
|---|---|
| Service | Ollama (self-hosted) |
| Endpoint | `POST http://<EC2_EIP>:11434/api/chat` |
| Protocol | HTTP (REST) |
| Authentication | API key via `X-API-Key` header (nginx reverse proxy) |
| Request format | JSON: `{"model": "llama3.2:1b", "messages": [...], "stream": false}` |
| Response format | JSON: `{"message": {"content": "..."}}` |
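The request shape in the table can be built with nothing beyond the standard library. This is a minimal sketch under the assumptions in the table (endpoint path, header name, payload fields); the real handler may assemble the call differently:

```python
import json
import urllib.request

def build_ollama_request(base_url: str, api_key: str, model: str, messages: list):
    """Construct the POST /api/chat request described in the API details table."""
    payload = {"model": model, "messages": messages, "stream": False}
    return urllib.request.Request(
        url=f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,  # checked by the nginx reverse proxy on the EC2 host
        },
        method="POST",
    )
```

The request object can then be executed with `urllib.request.urlopen(req, timeout=22)` and the reply text read from `response["message"]["content"]`.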
Available Models:
| # | Model | Description | Size |
|---|---|---|---|
| 1 | llama3.2:1b |
Meta Llama 3.2, fast general-purpose | 1.3 GB |
| 2 | qwen2.5:1.5b-instruct-q4_K_M |
Alibaba Qwen 2.5, instruction-tuned | 986 MB |
Users select a model when creating a session via `/newsession <number>`. Sessions using removed models show a warning and prompt the user to create a new session.
Error Handling:
- Connection timeouts (22s) with structured JSON error logging
- Smart retry: retries once only on fast connection errors (< 5s), not on timeouts
- HTTP status code validation (non-200 responses return user-friendly error)
- Exception handling with stack traces logged to CloudWatch
- Graceful fallback: bot remains functional even if Ollama is unreachable
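The "smart retry" rule above can be sketched as a small predicate plus a retry loop. This is a hypothetical reconstruction of the behavior described, not the code in `handler.py`; the names and the 5-second cutoff mirror the bullet points:

```python
import time
import urllib.error
import urllib.request

TIMEOUT_SECONDS = 22       # per-attempt connection/read timeout
FAST_FAILURE_CUTOFF = 5.0  # only failures faster than this are retried

def should_retry(attempt: int, elapsed_seconds: float) -> bool:
    """Retry only the first attempt, and only if it failed fast.

    A full 22 s timeout means the server was reachable but slow; retrying
    would just double the user's wait, so only fast connection errors
    (e.g. connection refused) get a second chance."""
    return attempt == 1 and elapsed_seconds < FAST_FAILURE_CUTOFF

def call_with_smart_retry(request: urllib.request.Request) -> bytes:
    """Issue the request, retrying once according to should_retry."""
    for attempt in (1, 2):
        started = time.monotonic()
        try:
            with urllib.request.urlopen(request, timeout=TIMEOUT_SECONDS) as resp:
                return resp.read()
        except urllib.error.URLError:
            if should_retry(attempt, time.monotonic() - started):
                continue  # fast connection error on the first try: retry once
            raise         # timeout, or second failure: give up
```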
Secrets Management:
- `OLLAMA_URL` passed as Lambda environment variable via Terraform (not hardcoded)
- `OLLAMA_API_KEY` auto-generated (32-char random password) and passed to both Lambda and EC2 via Terraform
- API key validated by nginx reverse proxy on EC2 (returns 401 without valid key)
Security:
- Nginx reverse proxy validates the `X-API-Key` header on all requests to port 11434
- Ollama binds to `127.0.0.1:11435` (localhost only, not externally accessible)
- SSH restricted to a configurable CIDR (`ssh_allowed_cidr` variable)
- S3 bucket: public access blocked, AES256 server-side encryption
- DynamoDB: server-side encryption enabled, point-in-time recovery enabled
- Sensitive Terraform outputs marked with `sensitive = true`
Lifecycle Management:
```bash
./scripts/manage-ollama.sh start    # Start instance, wait for Ollama API
./scripts/manage-ollama.sh stop     # Stop instance (syncs models to S3)
./scripts/manage-ollama.sh status   # Check instance and API health
./scripts/manage-ollama.sh ssh      # SSH into the instance
```

Managing Models:
To add a new model, SSH into the EC2 instance and pull it:
```bash
# SSH into the instance
./scripts/manage-ollama.sh ssh

# Pull a model (must set OLLAMA_HOST since Ollama binds to port 11435)
OLLAMA_HOST=http://127.0.0.1:11435 ollama pull <model_name>

# List installed models
OLLAMA_HOST=http://127.0.0.1:11435 ollama list

# Remove a model
OLLAMA_HOST=http://127.0.0.1:11435 ollama rm <model_name>
```

After pulling a new model, add it to the `AVAILABLE_MODELS` list in `handler.py` and redeploy the Lambda:

```bash
# Rebuild and deploy
cp handler.py /tmp/lambda-build/handler.py
cd /tmp/lambda-build && zip -r /path/to/lambda.zip . -x '__pycache__/*' '*.pyc'
aws lambda update-function-code --function-name telegram-bot --zip-file fileb://lambda.zip
```

Note: Ollama binds to `127.0.0.1:11435` (not the default 11434) because the nginx reverse proxy occupies port 11434 for API key authentication. Always set `OLLAMA_HOST=http://127.0.0.1:11435` when using the `ollama` CLI on the instance.
Table: chatbot-sessions
| Attribute | Type | Purpose |
|---|---|---|
| `pk` | Number | Telegram user ID (partition key) |
| `sk` | String | Session identifier (sort key) |
| `model_name` | String | Selected AI model |
| `session_id` | String | UUID for the session |
| `conversation` | List | Array of messages |
| `is_active` | Number | 1 = active, 0 = inactive |
| `last_message_ts` | Number | Unix timestamp |
Global Secondary Indexes:
- `model_index` - Query by model across users
- `active_sessions_index` - Query active sessions
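The attribute table above translates into a fresh session item like the following. This is an illustrative sketch only; the `sk` format (`session#<uuid>`) is an assumption, not the handler's actual key scheme:

```python
import time
import uuid

def new_session_item(user_id: int, model_name: str) -> dict:
    """Shape of a freshly created session item, per the attribute table."""
    return {
        "pk": user_id,                    # Telegram user ID (partition key)
        "sk": f"session#{uuid.uuid4()}",  # session identifier (sort key) -- format assumed
        "model_name": model_name,         # selected AI model, e.g. "llama3.2:1b"
        "session_id": str(uuid.uuid4()),  # UUID for the session
        "conversation": [],               # array of message dicts
        "is_active": 1,                   # 1 = active, 0 = inactive
        "last_message_ts": int(time.time()),  # Unix timestamp
    }
```

Writing it with boto3 would be a single `table.put_item(Item=new_session_item(...))` call.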
Bucket: chatbot-conversations-{ACCOUNT_ID}
Structure:
```
chatbot-conversations-123456789/
└── archives/
    └── {user_id}/
        ├── {session_id_1}.json
        └── {session_id_2}.json
```
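The layout above maps to a simple object-key convention. A one-line sketch (the helper name is illustrative):

```python
def archive_key(user_id: int, session_id: str) -> str:
    """Object key for an archived session, following the bucket layout above."""
    return f"archives/{user_id}/{session_id}.json"
```

Listing a user's archives then becomes an S3 `list_objects_v2` call with `Prefix=f"archives/{user_id}/"`.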
All Lambda logs use structured JSON with consistent fields:
```json
{
  "level": "INFO|WARNING|ERROR",
  "timestamp": "2025-01-20T12:00:00.000000+00:00",
  "action": "handle_command",
  "outcome": "success|failure|warning",
  "message": "Human-readable description",
  "request_id": "lambda-request-id",
  "user_id": 123456789,
  "message_id": 100,
  "chat_id": 123456789
}
```

On errors, `error` and `stack_trace` fields are included automatically.
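A logger producing this shape can be sketched as below. This is a hypothetical helper, not the project's actual logging code; it shows why one `json.dumps` per line is enough for the metric filter to match `{ $.level = "ERROR" }`:

```python
import datetime
import json
import sys
import traceback

def log_event(level: str, action: str, outcome: str, message: str, **fields):
    """Emit one structured JSON log line in the shape shown above."""
    entry = {
        "level": level,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "action": action,
        "outcome": outcome,
        "message": message,
        **fields,  # request_id, user_id, message_id, chat_id, ...
    }
    # If called from an except block, attach error details automatically
    if level == "ERROR" and sys.exc_info()[0] is not None:
        entry["error"] = str(sys.exc_info()[1])
        entry["stack_trace"] = traceback.format_exc()
    print(json.dumps(entry))  # Lambda forwards stdout lines to CloudWatch Logs
```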
- CloudWatch log group: `/aws/lambda/telegram-bot`
- Retention: 14 days (configurable via the `log_retention_days` variable)
- Managed by Terraform in the Lambda module
- Metric filter: captures `{ $.level = "ERROR" }` from structured JSON logs
- Metric namespace: `TelegramBot`
- Alarm: triggers when 1 or more errors occur within a 5-minute window
- Alarm auto-resolves (returns to OK) when no errors occur in the next period
```bash
# Tail live logs
aws logs tail /aws/lambda/telegram-bot --follow

# Filter for errors only
aws logs filter-log-events \
  --log-group-name /aws/lambda/telegram-bot \
  --filter-pattern '{ $.level = "ERROR" }'

# Check alarm state
aws cloudwatch describe-alarms \
  --alarm-names "telegram-bot-error-alarm" \
  --query 'MetricAlarms[0].{State:StateValue,Reason:StateReason}'

# View metric datapoints (last hour)
aws cloudwatch get-metric-statistics \
  --namespace TelegramBot \
  --metric-name telegram-bot-error-count \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 --statistics Sum
```

Use the data viewer script to inspect what's stored in S3 and DynamoDB:
```bash
# Show summary of all data
./scripts/view-data.sh

# Show full content (verbose mode)
./scripts/view-data.sh -v

# Show only DynamoDB sessions
./scripts/view-data.sh dynamodb

# Show only S3 archives
./scripts/view-data.sh s3
```

Run the automated observability test to verify logging, metrics, and alarms are working:

```bash
./scripts/test-observability.sh
```

This checks AWS credentials, log group retention, the metric filter, and the alarm; it then triggers success and error events and verifies the structured log output.
```bash
# Check Lambda
aws lambda get-function --function-name telegram-bot

# Check DynamoDB
aws dynamodb describe-table --table-name chatbot-sessions

# Check S3
aws s3 ls

# Check API Gateway
aws apigateway get-rest-apis

# View Lambda logs
aws logs tail /aws/lambda/telegram-bot --follow
```

Access the console through AWS Academy:
- Click AWS button (green dot) in Vocareum
- Navigate to Lambda, DynamoDB, S3, API Gateway services
- Session tokens expire every few hours
- Refresh credentials from AWS Academy → AWS Details
- AWS Academy restricts IAM role creation
- Use the pre-existing `LabRole` via the `lab_role_arn` variable
- Check CloudWatch logs: `aws logs tail /aws/lambda/telegram-bot --follow`
- Verify environment variables are set
- Verify the API Gateway URL is correct
- Test Lambda directly: `aws lambda invoke --function-name telegram-bot out.json`
- Check webhook info: `curl "https://api.telegram.org/bot<TOKEN>/getWebhookInfo"`
- Bucket names must be globally unique
- The template uses `chatbot-conversations-{ACCOUNT_ID}` for uniqueness
```bash
terraform destroy -auto-approve
```

Remove the Telegram webhook:

```bash
curl "https://api.telegram.org/botYOUR_BOT_TOKEN/deleteWebhook"
```

Quick reference:

```bash
# Deploy
terraform init && terraform apply -auto-approve

# Setup webhook (automated - recommended)
./scripts/setup-webhook.sh

# Or manually:
# Get webhook URL
terraform output api_gateway_url

# Set webhook
curl "https://api.telegram.org/bot<TOKEN>/setWebhook?url=<URL>"

# Check webhook
curl "https://api.telegram.org/bot<TOKEN>/getWebhookInfo"

# View stored data (S3 + DynamoDB)
./scripts/view-data.sh
./scripts/view-data.sh -v   # verbose

# Verify observability (logs, metrics, alarm)
./scripts/test-observability.sh

# Switch state backend
./scripts/manage-state.sh remote   # use S3 remote state
./scripts/manage-state.sh local    # use local state
./scripts/manage-state.sh status   # check current backend

# View logs
aws logs tail /aws/lambda/telegram-bot --follow

# Destroy
terraform destroy -auto-approve
```

This project is licensed under the GNU General Public License v3.0 or later (GPLv3+). See LICENSE for details.