Skip to content

πŸ”§ Database Migration: Neon β†’ AKS PostgreSQL (Jan 1st)Β #35

@mjunaidca

Description

@mjunaidca

Summary

Neon free tier quota exceeded - need to migrate databases to self-hosted PostgreSQL in AKS.

Current Status

  • SSO Platform: Down (ImagePullBackOff fixed, but DB connection fails)
  • Taskflow API: Likely affected
  • Root Cause: Neon compute quota exceeded (402 error)
  • Quota Resets: January 1st, 2026

Affected Databases (Old Neon Project)

Database Connection String
sso-v1 postgresql://neondb_owner:***@ep-young-cherry-adgf4fam-pooler.c-2.us-east-1.aws.neon.tech/sso-v1
api-v1 postgresql://neondb_owner:***@ep-young-cherry-adgf4fam-pooler.c-2.us-east-1.aws.neon.tech/api-v1
chatkit-v1 postgresql://neondb_owner:***@ep-young-cherry-adgf4fam-pooler.c-2.us-east-1.aws.neon.tech/chatkit-v1

Old Neon Project: taskflow-cloud (ID: soft-breeze-10557120)

New Resources Created (Not in use yet)

  • New Neon Project: taskflow-prod (ID: morning-hat-00626434)
  • Databases created: sso-v1, api-v1, chatkit-v1
  • Connection: postgresql://neondb_owner:npg_w8NCzWx1DePg@ep-damp-bar-aer5yllj-pooler.c-2.us-east-2.aws.neon.tech/

Migration Plan (Jan 1st)

Phase 1: Dump from Old Neon (when quota resets)

# Dump all 3 databases
pg_dump "postgresql://neondb_owner:npg_MenPdZg1aUV8@ep-young-cherry-adgf4fam-pooler.c-2.us-east-1.aws.neon.tech/sso-v1?sslmode=require" --no-owner --no-acl > sso-v1.sql

pg_dump "postgresql://neondb_owner:npg_MenPdZg1aUV8@ep-young-cherry-adgf4fam-pooler.c-2.us-east-1.aws.neon.tech/api-v1?sslmode=require" --no-owner --no-acl > api-v1.sql

pg_dump "postgresql://neondb_owner:npg_MenPdZg1aUV8@ep-young-cherry-adgf4fam-pooler.c-2.us-east-1.aws.neon.tech/chatkit-v1?sslmode=require" --no-owner --no-acl > chatkit-v1.sql

Phase 2: Install PostgreSQL in AKS

# Add Bitnami repo
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

# Install PostgreSQL with persistence
helm install postgres bitnami/postgresql \
  --namespace taskflow \
  --set auth.postgresPassword=<GENERATE_SECURE_PASSWORD> \
  --set auth.database=postgres \
  --set primary.persistence.size=10Gi \
  --set primary.persistence.storageClass=managed-csi

Phase 3: Create Databases & Restore

# Port forward to access postgres
kubectl port-forward svc/postgres-postgresql 5432:5432 -n taskflow

# Create databases
psql -h localhost -U postgres -c "CREATE DATABASE \"sso-v1\";"
psql -h localhost -U postgres -c "CREATE DATABASE \"api-v1\";"
psql -h localhost -U postgres -c "CREATE DATABASE \"chatkit-v1\";"

# Restore data
psql -h localhost -U postgres -d sso-v1 < sso-v1.sql
psql -h localhost -U postgres -d api-v1 < api-v1.sql
psql -h localhost -U postgres -d chatkit-v1 < chatkit-v1.sql

Phase 4: Update K8s Secrets

# New connection string format (in-cluster):
# postgresql://postgres:<PASSWORD>@postgres-postgresql.taskflow.svc.cluster.local:5432/<DB_NAME>

# Update SSO secret
kubectl delete secret sso-platform-postgres-secret -n taskflow
kubectl create secret generic sso-platform-postgres-secret \
  --from-literal=DATABASE_URL="postgresql://postgres:<PASSWORD>@postgres-postgresql.taskflow.svc.cluster.local:5432/sso-v1" \
  -n taskflow

# Update Taskflow API secret
kubectl delete secret taskflow-api-postgres-secret -n taskflow
kubectl create secret generic taskflow-api-postgres-secret \
  --from-literal=DATABASE_URL="postgresql://postgres:<PASSWORD>@postgres-postgresql.taskflow.svc.cluster.local:5432/api-v1" \
  -n taskflow

# Update ChatKit in taskflow-api-secret
kubectl get secret taskflow-api-secret -n taskflow -o yaml > taskflow-api-secret-backup.yaml
# Edit TASKFLOW_CHATKIT_DATABASE_URL to point to in-cluster postgres

Phase 5: Restart Deployments

kubectl rollout restart deployment sso-platform -n taskflow
kubectl rollout restart deployment taskflow-api -n taskflow
# Restart any other affected deployments

Phase 6: Set Up Backup CronJob

apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
  namespace: taskflow
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: postgres:17
            command:
            - /bin/sh
            - -c
            - |
              pg_dump -h postgres-postgresql -U postgres sso-v1 > /backup/sso-v1-$(date +%Y%m%d).sql
              pg_dump -h postgres-postgresql -U postgres api-v1 > /backup/api-v1-$(date +%Y%m%d).sql
              pg_dump -h postgres-postgresql -U postgres chatkit-v1 > /backup/chatkit-v1-$(date +%Y%m%d).sql
            env:
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-postgresql
                  key: postgres-password
            volumeMounts:
            - name: backup
              mountPath: /backup
          volumes:
          - name: backup
            persistentVolumeClaim:
              claimName: postgres-backup-pvc
          restartPolicy: OnFailure

Cleanup (After Migration Verified)

  • Delete old Neon project taskflow-cloud
  • Delete unused Neon project taskflow-prod
  • Update GitHub secrets if any reference Neon
  • Update any CI/CD pipelines

Why Self-Hosted PostgreSQL?

  • No external quota limits
  • Uses existing AKS resources (no extra cost)
  • Full control over backups and scaling
  • No vendor lock-in

K8s Secrets to Update

Secret Key Status
sso-platform-postgres-secret DATABASE_URL Pending
taskflow-api-postgres-secret DATABASE_URL Pending
taskflow-api-secret TASKFLOW_CHATKIT_DATABASE_URL Pending
sso-platform-secret DATABASE_PASSWORD Pending

Fixed During This Session

  • βœ… GHCR image pull secret updated (was using wrong token ghs_* instead of ghp_*)

Created: December 29, 2024
Target Date: January 1, 2025
Priority: High

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions