
fix: merge main into blog feature branch #1091

Closed
whitep4nth3r wants to merge 297 commits into feat/atproto-blog-fe from main

Conversation

@whitep4nth3r
Contributor

No description provided.

iiio2 and others added 30 commits February 2, 2026 09:23
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: Yevhen Husak <gusa4grr@users.noreply.github.com>
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Roe <daniel@roe.dev>
IdrisGit and others added 26 commits February 5, 2026 20:20
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: Daniel Roe <daniel@roe.dev>
Co-authored-by: Salma Alam-Naylor <52798353+whitep4nth3r@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Roe <daniel@roe.dev>
}

console.warn(
`User ${loggedInUsersDid} tried to unlike a package ${body.packageName} but it was not liked by them.`,

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

This logs sensitive data returned by an access to oAuthSession as clear text.

Copilot Autofix

AI 1 day ago

In general, to fix clear-text logging of sensitive information, you should either remove the sensitive fields from logs entirely, or anonymize/obfuscate them (for example, hashing or truncating) so that they cannot easily be mapped back to a specific user, while still providing enough information for debugging. You should also avoid logging tokens, passwords, or full session objects.

For this specific case, the simplest fix that preserves functionality is to stop logging the full loggedInUsersDid and instead log either a redacted form (e.g., truncated DID) or no user information at all. Since the main purpose of the warning seems to be to record that an unlike was attempted for a package that is not liked, the package name alone is likely sufficient. If some correlation is still needed for operations, we can log a non-reversible hash of the DID. This requires a hashing function but does not change any control flow or business logic.

Concretely, in server/api/social/like.delete.ts, we will replace the console.warn call to avoid embedding loggedInUsersDid directly. A minimal change is to log only the package:

console.warn(`A user tried to unlike package ${body.packageName} but it was not liked by them.`);

This removes the sensitive identifier from the log while preserving the information about unexpected behavior. No new imports or external libraries are required for this change.
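
If correlation between log lines is still wanted, the hashing option mentioned above could look roughly like the sketch below. `redactDid` and the example DID are hypothetical and not part of this PR; it assumes Node's built-in `node:crypto` module is available.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper (not in the PR): derive a short, non-reversible tag
// from a DID so log lines can be correlated without printing the DID itself.
function redactDid(did: string): string {
  const digest = createHash("sha256").update(did).digest("hex");
  // A short prefix is enough to group log lines for one user.
  return `did:redacted:${digest.slice(0, 12)}`;
}

const exampleDid = "did:plc:example123"; // made-up value for illustration
console.warn(
  `User ${redactDid(exampleDid)} tried to unlike a package some-package but it was not liked by them.`,
);
```

Because DIDs are public identifiers, an unsalted hash can be matched against a list of known DIDs; a keyed hash (for example `createHmac` with a server-side secret) would be the more cautious choice if this route is taken.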

Suggested changeset 1
server/api/social/like.delete.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/server/api/social/like.delete.ts b/server/api/social/like.delete.ts
--- a/server/api/social/like.delete.ts
+++ b/server/api/social/like.delete.ts
@@ -37,7 +37,7 @@
   }
 
   console.warn(
-    `User ${loggedInUsersDid} tried to unlike a package ${body.packageName} but it was not liked by them.`,
+    `A user tried to unlike package ${body.packageName} but it was not liked by them.`,
   )
 
   return await likesUtil.getLikes(body.packageName, loggedInUsersDid)
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
function getProviderInfo(builderId: string): { provider: string; providerLabel: string } {
const exact = PROVIDER_IDS[builderId]
if (exact) return exact
if (builderId.includes('gitlab.com') && builderId.includes('/runners/'))

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High

'gitlab.com' can be anywhere in the URL, and arbitrary hosts may come before or after it.

Copilot Autofix

AI 1 day ago

In general, the fix is to stop using substring checks on the full URL string to infer the host or origin, and instead parse the URL and validate its hostname and path structure explicitly. For GitLab builder IDs, we should confirm that the URL is actually on gitlab.com (or a well-defined set of allowed GitLab hosts) and that its path contains the expected /-/runners/ segment, rather than just checking for those substrings anywhere.

Concretely, in server/utils/provenance.ts, within getProviderInfo(builderId: string), replace the condition:

if (builderId.includes('gitlab.com') && builderId.includes('/runners/'))
  return { provider: 'gitlab', providerLabel: 'GitLab CI' }

with logic that:

  1. Attempts to construct a URL object from builderId.
  2. Checks that url.hostname is exactly gitlab.com (or potentially extendable to a small allowlist if needed in the future).
  3. Checks that url.pathname contains the expected /-/runners/ segment (for example url.pathname.includes('/-/runners/')), since GitLab’s documented format is https://gitlab.com/<path>/-/runners/<id>.

If URL parsing fails (throws), we fall through to the 'unknown' provider case, preserving existing behavior for malformed IDs. We can use the built-in URL class available in modern Node/JS runtimes, so no extra imports are needed. This keeps existing functionality (legitimate GitLab builder IDs are still recognized) while closing the substring-sanitization issue.
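
To make the difference concrete, here is a small comparison of the substring check and the hostname check on a few crafted builder IDs. The function names and the sample URLs are made up for illustration; only the two conditions are taken from the code above.

```typescript
// Illustrative only: both function names are hypothetical.
function isGitLabRunnerLoose(builderId: string): boolean {
  // Original condition: substrings may appear anywhere in the string.
  return builderId.includes("gitlab.com") && builderId.includes("/runners/");
}

function isGitLabRunnerStrict(builderId: string): boolean {
  try {
    const url = new URL(builderId);
    return url.hostname === "gitlab.com" && url.pathname.includes("/-/runners/");
  } catch {
    return false; // not a parseable absolute URL
  }
}

const samples: string[] = [
  "https://gitlab.com/group/project/-/runners/12345", // genuine shape
  "https://evil.example/gitlab.com/-/runners/1", // gitlab.com in the path
  "https://gitlab.com.evil.example/group/-/runners/1", // gitlab.com as a subdomain label
];

for (const id of samples) {
  console.log(id, { loose: isGitLabRunnerLoose(id), strict: isGitLabRunnerStrict(id) });
}
```

The loose check accepts all three samples, while the strict check accepts only the first.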

Suggested changeset 1
server/utils/provenance.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/server/utils/provenance.ts b/server/utils/provenance.ts
--- a/server/utils/provenance.ts
+++ b/server/utils/provenance.ts
@@ -15,8 +15,13 @@
 function getProviderInfo(builderId: string): { provider: string; providerLabel: string } {
   const exact = PROVIDER_IDS[builderId]
   if (exact) return exact
-  if (builderId.includes('gitlab.com') && builderId.includes('/runners/'))
-    return { provider: 'gitlab', providerLabel: 'GitLab CI' }
+  try {
+    const url = new URL(builderId)
+    if (url.hostname === 'gitlab.com' && url.pathname.includes('/-/runners/'))
+      return { provider: 'gitlab', providerLabel: 'GitLab CI' }
+  } catch {
+    // If builderId is not a valid URL, fall through to the unknown provider case
+  }
   return { provider: 'unknown', providerLabel: builderId ? 'CI' : 'Unknown' }
 }
 
EOF

function repoUrlToCommitUrl(repository: string, sha: string): string {
const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
if (normalized.includes('github.com')) return `${normalized}/commit/${sha}`

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High

'github.com' can be anywhere in the URL, and arbitrary hosts may come before or after it.

Copilot Autofix

AI 1 day ago

In general, to fix incomplete URL substring sanitization, you should parse the URL and inspect its hostname instead of using string.includes on the full URL. This ensures that checks like “is this GitHub?” only succeed when the actual host is github.com (or a specific set of allowed hosts), not when github.com appears in the path, query string, or as part of another domain name.

For this code, the best fix is to change repoUrlToCommitUrl and repoUrlToBlobUrl so that they normalize the repository string as before, then try to parse it as a URL using Node’s built‑in URL class. If parsing succeeds, use url.hostname to decide whether it is github.com or gitlab.com. If parsing fails (e.g., repository is in some non‑URL format), fall back to the existing generic behavior that appends /commit/ or /blob/. This preserves behavior for well‑formed URLs while eliminating the substring host check. Concretely:

  • Leave the initial normalization (replace(/\/$/, '').replace(/\.git$/, '')) in place.
  • Introduce a small helper that safely parses a URL and returns its hostname, or inline that logic in each function using try { const u = new URL(normalized); ... } catch { ... }.
  • Replace normalized.includes('github.com') with hostname === 'github.com'.
  • Replace normalized.includes('gitlab.com') with hostname === 'gitlab.com'.
  • Keep the default return `${normalized}/commit/${sha}` and return `${normalized}/blob/${ref}/${path}` if the host is neither GitHub nor GitLab, or URL parsing fails.

No new external libraries are needed because URL is part of the standard library in modern Node.js and browsers.
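
The "small helper" variant mentioned in the list above might be sketched as follows. `safeHostname` and `commitUrl` are hypothetical names; the patch below inlines the try/catch instead, so treat this as one possible shape rather than the suggested change itself.

```typescript
// Hypothetical helper: return the hostname of an absolute URL, or null if it does not parse.
function safeHostname(value: string): string | null {
  try {
    return new URL(value).hostname;
  } catch {
    return null;
  }
}

// Sketch of the commit-URL builder using the helper.
function commitUrl(repository: string, sha: string): string {
  const normalized = repository.replace(/\/$/, "").replace(/\.git$/, "");
  const hostname = safeHostname(normalized);
  if (hostname === "gitlab.com") return `${normalized}/-/commit/${sha}`;
  // GitHub and every other (or unparseable) host share the generic pattern.
  return `${normalized}/commit/${sha}`;
}

console.log(commitUrl("https://gitlab.com/group/project.git", "abc123"));
console.log(commitUrl("https://example.com/gitlab.com/project", "abc123"));
```

In the second call the host is example.com, so the GitLab-specific `/-/commit/` pattern is not applied even though `gitlab.com` appears in the path.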

Suggested changeset 1
server/utils/provenance.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/server/utils/provenance.ts b/server/utils/provenance.ts
--- a/server/utils/provenance.ts
+++ b/server/utils/provenance.ts
@@ -75,15 +75,25 @@
 
 function repoUrlToCommitUrl(repository: string, sha: string): string {
   const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
-  if (normalized.includes('github.com')) return `${normalized}/commit/${sha}`
-  if (normalized.includes('gitlab.com')) return `${normalized}/-/commit/${sha}`
+  try {
+    const url = new URL(normalized)
+    if (url.hostname === 'github.com') return `${normalized}/commit/${sha}`
+    if (url.hostname === 'gitlab.com') return `${normalized}/-/commit/${sha}`
+  } catch {
+    // Fall through to generic handling if parsing fails
+  }
   return `${normalized}/commit/${sha}`
 }
 
 function repoUrlToBlobUrl(repository: string, path: string, ref = 'main'): string {
   const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
-  if (normalized.includes('github.com')) return `${normalized}/blob/${ref}/${path}`
-  if (normalized.includes('gitlab.com')) return `${normalized}/-/blob/${ref}/${path}`
+  try {
+    const url = new URL(normalized)
+    if (url.hostname === 'github.com') return `${normalized}/blob/${ref}/${path}`
+    if (url.hostname === 'gitlab.com') return `${normalized}/-/blob/${ref}/${path}`
+  } catch {
+    // Fall through to generic handling if parsing fails
+  }
   return `${normalized}/blob/${ref}/${path}`
 }
 
EOF
function repoUrlToCommitUrl(repository: string, sha: string): string {
const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
if (normalized.includes('github.com')) return `${normalized}/commit/${sha}`
if (normalized.includes('gitlab.com')) return `${normalized}/-/commit/${sha}`

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High

'gitlab.com' can be anywhere in the URL, and arbitrary hosts may come before or after it.

Copilot Autofix

AI 1 day ago

In general terms, the fix is to stop using String.prototype.includes to infer whether a repository URL is hosted on GitHub or GitLab. Instead, parse the repository string as a URL (or, if that fails, fall back to a simple heuristic for non-URL forms like git@github.com:org/repo.git) and base the decision on the host/hostname (and maybe protocol) field. This ensures that github.com or gitlab.com must appear as the actual host (or an expected subdomain) rather than somewhere else in the string.

Concretely, within server/utils/provenance.ts, the changes are localized to repoUrlToCommitUrl and repoUrlToBlobUrl:

  1. Introduce a small helper function, e.g. getRepoHost(repository: string): string | null, that:
    • Trims trailing / and .git as currently done.
    • Tries new URL(normalized) to parse web URLs.
    • If parsing fails, handles common git-style URLs like git@github.com:org/repo.git by detecting @ and : and mapping them to a host (e.g. for git@github.com:foo/bar.git, host is github.com).
    • Returns the host in a normalized form (e.g. lowercase), or null if it cannot be determined.
  2. Update repoUrlToCommitUrl:
    • Compute normalized as before.
    • Compute const host = getRepoHost(normalized).
    • If host === 'github.com', return the GitHub commit URL pattern.
    • Else if host === 'gitlab.com', return the GitLab commit URL pattern.
    • Else fall back to the generic ${normalized}/commit/${sha}.
  3. Update repoUrlToBlobUrl analogously, using the same getRepoHost helper and selecting the appropriate pattern based on the host.

No new imports are needed; the URL class is available as a global in Node.js. This approach keeps existing behavior for valid GitHub/GitLab URLs, including a .git suffix or trailing slash, while preventing misclassification based on mere substring matches.
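
For the SSH-style fallback described in step 1, a stricter alternative to the index-based slicing is an anchored pattern that requires the whole string to look like `user@host:path`. `scpLikeHost` is a hypothetical name and the regular expression is an assumption about which inputs should count, not something taken from the patch.

```typescript
// Hypothetical helper: extract the host from an SCP-like git remote ("user@host:path").
// Anchoring at the start means "@host:" cannot be smuggled in after a slash.
function scpLikeHost(remote: string): string | null {
  const match = /^[^@\s/:]+@([^:\s/]+):\S+$/.exec(remote);
  return match?.[1]?.toLowerCase() ?? null;
}

console.log(scpLikeHost("git@github.com:org/repo.git"));
console.log(scpLikeHost("git@GitLab.com:group/project"));
console.log(scpLikeHost("https://github.com/org/repo"));
console.log(scpLikeHost("evil.example/git@github.com:org/repo"));
```

The last two inputs are not SCP-like remotes under this pattern, so the helper returns null for them and the caller would fall through to its generic handling.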

Suggested changeset 1
server/utils/provenance.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/server/utils/provenance.ts b/server/utils/provenance.ts
--- a/server/utils/provenance.ts
+++ b/server/utils/provenance.ts
@@ -73,17 +73,36 @@
   }
 }
 
+function getRepoHost(repository: string): string | null {
+  const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
+  try {
+    const url = new URL(normalized)
+    return url.hostname.toLowerCase()
+  } catch {
+    // Handle common SSH-style git URLs, e.g. git@github.com:org/repo.git
+    const atIndex = normalized.indexOf('@')
+    const colonIndex = normalized.indexOf(':', atIndex + 1)
+    if (atIndex !== -1 && colonIndex !== -1) {
+      const hostPart = normalized.slice(atIndex + 1, colonIndex).toLowerCase()
+      return hostPart || null
+    }
+  }
+  return null
+}
+
 function repoUrlToCommitUrl(repository: string, sha: string): string {
   const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
-  if (normalized.includes('github.com')) return `${normalized}/commit/${sha}`
-  if (normalized.includes('gitlab.com')) return `${normalized}/-/commit/${sha}`
+  const host = getRepoHost(repository)
+  if (host === 'github.com') return `${normalized}/commit/${sha}`
+  if (host === 'gitlab.com') return `${normalized}/-/commit/${sha}`
   return `${normalized}/commit/${sha}`
 }
 
 function repoUrlToBlobUrl(repository: string, path: string, ref = 'main'): string {
   const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
-  if (normalized.includes('github.com')) return `${normalized}/blob/${ref}/${path}`
-  if (normalized.includes('gitlab.com')) return `${normalized}/-/blob/${ref}/${path}`
+  const host = getRepoHost(repository)
+  if (host === 'github.com') return `${normalized}/blob/${ref}/${path}`
+  if (host === 'gitlab.com') return `${normalized}/-/blob/${ref}/${path}`
   return `${normalized}/blob/${ref}/${path}`
 }
 
EOF

function repoUrlToBlobUrl(repository: string, path: string, ref = 'main'): string {
const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
if (normalized.includes('github.com')) return `${normalized}/blob/${ref}/${path}`

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High

'github.com' can be anywhere in the URL, and arbitrary hosts may come before or after it.

Copilot Autofix

AI 1 day ago

In general, the fix is to stop using substring matches on the whole URL string to infer the hosting provider, and instead parse the URL and inspect its host (or hostname) explicitly. This ensures that github.com or gitlab.com only match when they are the actual host (and, if desired, its subdomains), and not when they appear in the path, query, or as part of an unrelated domain name.

Concretely, in server/utils/provenance.ts, we should change repoUrlToCommitUrl and repoUrlToBlobUrl so that:

  1. They parse repository using the standard URL class.
  2. They derive normalized from the parsed URL (origin + pathname with trailing slash and .git stripped) instead of blindly manipulating the raw string.
  3. They check url.hostname for equality with github.com or gitlab.com (or allow appropriate subdomains, if needed).
  4. On parse failure (invalid URL) they fall back to the existing behavior, to avoid breaking current functionality.

No new import is required, because URL is globally available in modern Node and can be used directly. The behavior for valid GitHub/GitLab URLs remains identical: GitHub uses /commit/ and /blob/, GitLab uses /-/commit/ and /-/blob/, and any other host falls back to the generic /commit/ and /blob/ pattern. For malformed or non-absolute repository strings, new URL throws, and the catch block preserves the old substring-based fallback path.
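
Step 2 (deriving the base from the parsed URL) changes what ends up in the generated link, which the sketch below tries to show. `repoBase` is a hypothetical name and the sample inputs are invented.

```typescript
// Hypothetical helper mirroring step 2: rebuild the base from origin + pathname.
function repoBase(repository: string): string | null {
  try {
    const url = new URL(repository);
    return url.origin + url.pathname.replace(/\/$/, "").replace(/\.git$/, "");
  } catch {
    return null; // caller would use the string-based fallback here
  }
}

// Query string and fragment are dropped, and the host is lowercased.
console.log(repoBase("https://GitHub.com/org/repo.git?tab=readme#top"));
// Embedded credentials are not part of the origin, so they are dropped too.
console.log(repoBase("https://user:token@gitlab.com/group/project/"));
// Non-special schemes have an opaque origin, which serializes as the string "null".
console.log(repoBase("git+https://github.com/org/repo.git"));
```

The last case is worth checking before adopting this shape: if repository strings can carry a `git+https://` prefix, `url.origin` is the literal string "null" and the resulting link would be broken, so the prefix would need to be stripped before parsing.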

Suggested changeset 1
server/utils/provenance.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/server/utils/provenance.ts b/server/utils/provenance.ts
--- a/server/utils/provenance.ts
+++ b/server/utils/provenance.ts
@@ -74,17 +74,37 @@
 }
 
 function repoUrlToCommitUrl(repository: string, sha: string): string {
-  const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
-  if (normalized.includes('github.com')) return `${normalized}/commit/${sha}`
-  if (normalized.includes('gitlab.com')) return `${normalized}/-/commit/${sha}`
-  return `${normalized}/commit/${sha}`
+  // Try to parse as a URL to safely inspect the hostname; fall back to string ops on failure.
+  try {
+    const url = new URL(repository)
+    const base = url.origin + url.pathname.replace(/\/$/, '').replace(/\.git$/, '')
+    const hostname = url.hostname.toLowerCase()
+    if (hostname === 'github.com') return `${base}/commit/${sha}`
+    if (hostname === 'gitlab.com') return `${base}/-/commit/${sha}`
+    return `${base}/commit/${sha}`
+  } catch {
+    const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
+    if (normalized.includes('github.com')) return `${normalized}/commit/${sha}`
+    if (normalized.includes('gitlab.com')) return `${normalized}/-/commit/${sha}`
+    return `${normalized}/commit/${sha}`
+  }
 }
 
 function repoUrlToBlobUrl(repository: string, path: string, ref = 'main'): string {
-  const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
-  if (normalized.includes('github.com')) return `${normalized}/blob/${ref}/${path}`
-  if (normalized.includes('gitlab.com')) return `${normalized}/-/blob/${ref}/${path}`
-  return `${normalized}/blob/${ref}/${path}`
+  // Try to parse as a URL to safely inspect the hostname; fall back to string ops on failure.
+  try {
+    const url = new URL(repository)
+    const base = url.origin + url.pathname.replace(/\/$/, '').replace(/\.git$/, '')
+    const hostname = url.hostname.toLowerCase()
+    if (hostname === 'github.com') return `${base}/blob/${ref}/${path}`
+    if (hostname === 'gitlab.com') return `${base}/-/blob/${ref}/${path}`
+    return `${base}/blob/${ref}/${path}`
+  } catch {
+    const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
+    if (normalized.includes('github.com')) return `${normalized}/blob/${ref}/${path}`
+    if (normalized.includes('gitlab.com')) return `${normalized}/-/blob/${ref}/${path}`
+    return `${normalized}/blob/${ref}/${path}`
+  }
 }
 
 /**
EOF
function repoUrlToBlobUrl(repository: string, path: string, ref = 'main'): string {
const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
if (normalized.includes('github.com')) return `${normalized}/blob/${ref}/${path}`
if (normalized.includes('gitlab.com')) return `${normalized}/-/blob/${ref}/${path}`

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High

'gitlab.com' can be anywhere in the URL, and arbitrary hosts may come before or after it.

Copilot Autofix

AI 1 day ago

In general, instead of using string.includes('gitlab.com') on an entire URL, parse the URL and compare the actual host (and optionally scheme) against a whitelist or well-defined patterns (e.g., exact match or checking hostname === 'gitlab.com' or ends-with .gitlab.com). This avoids cases where the substring appears in the path, query, or as part of another domain.

For this specific file, we should update repoUrlToCommitUrl and repoUrlToBlobUrl so that they determine whether a URL is GitHub or GitLab based on the parsed hostname, not a substring search. The behavior we want to preserve is: if the repository is hosted on GitHub, use /commit/ or /blob/; if on GitLab, use /-/commit/ or /-/blob/; otherwise, fall back to the GitHub-style paths. We can safely use the built‑in URL class available in Node.js/modern JS to parse the URL; if parsing fails (e.g., repository is just user/repo), we can fall back to the original substring behavior to avoid breaking existing functionality.

Concretely:

  • Add a small helper, e.g. getHostFromUrl, that tries new URL(...) and returns the hostname or null if parsing fails.
  • In both repoUrlToCommitUrl and repoUrlToBlobUrl, after normalizing, call this helper. If we get a hostname:
    • If it is exactly github.com or ends with .github.com, treat as GitHub.
    • If it is exactly gitlab.com or ends with .gitlab.com, treat as GitLab.
  • If we cannot parse a hostname, retain the existing includes logic as a non‑URL fallback (handles SCP-like git@github.com:user/repo.git or bare github.com/user/repo strings).

This keeps existing behavior for non-URL repo identifiers while making URL-based checks safe.

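
The subdomain rule above depends on the leading dot in the suffix check. A minimal sketch, with `isHostOrSubdomain` as a hypothetical helper name:

```typescript
// Hypothetical helper: true when host is the domain itself or one of its subdomains.
function isHostOrSubdomain(host: string, domain: string): boolean {
  const h = host.toLowerCase();
  // The leading dot matters: "evilgithub.com" ends with "github.com"
  // but not with ".github.com".
  return h === domain || h.endsWith(`.${domain}`);
}

console.log(isHostOrSubdomain("github.com", "github.com"));
console.log(isHostOrSubdomain("gist.github.com", "github.com"));
console.log(isHostOrSubdomain("evilgithub.com", "github.com"));
console.log(isHostOrSubdomain("github.com.evil.example", "github.com"));
```

Only the first two hosts match. Whether subdomains of github.com or gitlab.com should be accepted at all is a separate decision, since they do not necessarily serve repositories under the same URL layout.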
Suggested changeset 1
server/utils/provenance.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/server/utils/provenance.ts b/server/utils/provenance.ts
--- a/server/utils/provenance.ts
+++ b/server/utils/provenance.ts
@@ -73,17 +73,49 @@
   }
 }
 
+function getHostFromUrl(url: string): string | null {
+  try {
+    // Ensure we have a scheme so that bare hosts like "github.com/foo" still parse
+    const value = url.match(/^[a-zA-Z][a-zA-Z0-9+.-]*:\/\//) ? url : `https://${url}`
+    return new URL(value).hostname
+  } catch {
+    return null
+  }
+}
+
 function repoUrlToCommitUrl(repository: string, sha: string): string {
   const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
-  if (normalized.includes('github.com')) return `${normalized}/commit/${sha}`
-  if (normalized.includes('gitlab.com')) return `${normalized}/-/commit/${sha}`
+  const host = getHostFromUrl(normalized)
+  if (host) {
+    if (host === 'github.com' || host.endsWith('.github.com')) {
+      return `${normalized}/commit/${sha}`
+    }
+    if (host === 'gitlab.com' || host.endsWith('.gitlab.com')) {
+      return `${normalized}/-/commit/${sha}`
+    }
+  } else {
+    // Fallback for non-URL repository formats
+    if (normalized.includes('github.com')) return `${normalized}/commit/${sha}`
+    if (normalized.includes('gitlab.com')) return `${normalized}/-/commit/${sha}`
+  }
   return `${normalized}/commit/${sha}`
 }
 
 function repoUrlToBlobUrl(repository: string, path: string, ref = 'main'): string {
   const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
-  if (normalized.includes('github.com')) return `${normalized}/blob/${ref}/${path}`
-  if (normalized.includes('gitlab.com')) return `${normalized}/-/blob/${ref}/${path}`
+  const host = getHostFromUrl(normalized)
+  if (host) {
+    if (host === 'github.com' || host.endsWith('.github.com')) {
+      return `${normalized}/blob/${ref}/${path}`
+    }
+    if (host === 'gitlab.com' || host.endsWith('.gitlab.com')) {
+      return `${normalized}/-/blob/${ref}/${path}`
+    }
+  } else {
+    // Fallback for non-URL repository formats
+    if (normalized.includes('github.com')) return `${normalized}/blob/${ref}/${path}`
+    if (normalized.includes('gitlab.com')) return `${normalized}/-/blob/${ref}/${path}`
+  }
   return `${normalized}/blob/${ref}/${path}`
 }
 
EOF
@@ -73,17 +73,49 @@
}
}

+function getHostFromUrl(url: string): string | null {
+  try {
+    // Ensure we have a scheme so that bare hosts like "github.com/foo" still parse
+    const value = url.match(/^[a-zA-Z][a-zA-Z0-9+.-]*:\/\//) ? url : `https://${url}`
+    return new URL(value).hostname
+  } catch {
+    return null
+  }
+}
+
 function repoUrlToCommitUrl(repository: string, sha: string): string {
   const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
-  if (normalized.includes('github.com')) return `${normalized}/commit/${sha}`
-  if (normalized.includes('gitlab.com')) return `${normalized}/-/commit/${sha}`
+  const host = getHostFromUrl(normalized)
+  if (host) {
+    if (host === 'github.com' || host.endsWith('.github.com')) {
+      return `${normalized}/commit/${sha}`
+    }
+    if (host === 'gitlab.com' || host.endsWith('.gitlab.com')) {
+      return `${normalized}/-/commit/${sha}`
+    }
+  } else {
+    // Fallback for non-URL repository formats
+    if (normalized.includes('github.com')) return `${normalized}/commit/${sha}`
+    if (normalized.includes('gitlab.com')) return `${normalized}/-/commit/${sha}`
+  }
   return `${normalized}/commit/${sha}`
 }

 function repoUrlToBlobUrl(repository: string, path: string, ref = 'main'): string {
   const normalized = repository.replace(/\/$/, '').replace(/\.git$/, '')
-  if (normalized.includes('github.com')) return `${normalized}/blob/${ref}/${path}`
-  if (normalized.includes('gitlab.com')) return `${normalized}/-/blob/${ref}/${path}`
+  const host = getHostFromUrl(normalized)
+  if (host) {
+    if (host === 'github.com' || host.endsWith('.github.com')) {
+      return `${normalized}/blob/${ref}/${path}`
+    }
+    if (host === 'gitlab.com' || host.endsWith('.gitlab.com')) {
+      return `${normalized}/-/blob/${ref}/${path}`
+    }
+  } else {
+    // Fallback for non-URL repository formats
+    if (normalized.includes('github.com')) return `${normalized}/blob/${ref}/${path}`
+    if (normalized.includes('gitlab.com')) return `${normalized}/-/blob/${ref}/${path}`
+  }
   return `${normalized}/blob/${ref}/${path}`
 }

Copilot is powered by AI and may make mistakes. Always verify output.
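
To illustrate why the patch compares the parsed host instead of searching the whole string, here is a minimal sketch. `getHostFromUrl` mirrors the helper in the patch; `isHostOrSubdomain` and the sample URLs are hypothetical and exist only for this illustration.

```typescript
// Mirrors the helper from the patch: parse the host, adding a scheme for bare hosts.
function getHostFromUrl(url: string): string | null {
  try {
    const value = /^[a-zA-Z][a-zA-Z0-9+.-]*:\/\//.test(url) ? url : `https://${url}`
    return new URL(value).hostname
  } catch {
    return null
  }
}

// Hypothetical helper (not part of the patch): exact host or any subdomain of it.
function isHostOrSubdomain(host: string | null, domain: string): boolean {
  return host !== null && (host === domain || host.endsWith(`.${domain}`))
}

// A lookalike host that merely contains "github.com" as a substring.
const lookalike = 'https://github.com.example.org/owner/repo'
const substringMatch = lookalike.includes('github.com')
const hostMatch = isHostOrSubdomain(getHostFromUrl(lookalike), 'github.com')

// Bare hosts parse once a scheme is prepended; real subdomains still match.
const bareHost = getHostFromUrl('github.com/owner/repo')
const subdomainMatch = isHostOrSubdomain(getHostFromUrl('https://gist.github.com/owner/id'), 'github.com')

// SSH-style shorthand is not a valid URL, so the helper returns null and
// the patch falls back to the substring check for such inputs.
const sshStyle = getHostFromUrl('git@github.com:owner/repo')

console.log({ substringMatch, hostMatch, bareHost, subdomainMatch, sshStyle })
```

The substring check accepts the lookalike URL while the host comparison rejects it, which is the case the patch is meant to close. The `else` branch in the patch still uses `includes`, so inputs that fail to parse keep the older, looser behavior.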
@@ -300,12 +328,26 @@
// (e.g., #install, #dependencies, #versions are used by the package page)
const id = `user-content-${uniqueSlug}`

// Collect TOC item with plain text (HTML stripped)
const plainText = text.replace(/<[^>]*>/g, '').trim()

Check failure

Code scanning / CodeQL

Incomplete multi-character sanitization (High)

This string may still contain <script, which may cause an HTML element injection vulnerability.

Copilot Autofix

AI 1 day ago

In general, the problem is that a hand-written regex, /<[^>]*>/g, is used to remove HTML tags from a string derived from user-controlled Markdown. This is brittle and an instance of incomplete multi-character sanitization: input such as an unterminated <script fragment never matches the pattern and passes through unchanged. The best fix is to drop the regex and either (a) use the already-imported sanitize-html library to strip all tags, or (b) use the marked token tree to build the plain-text heading directly. Because sanitize-html is already imported in this file, the simplest fix with minimal functional change is to pass text to sanitizeHtml with a configuration that removes all tags and returns only text.

Concretely, in server/utils/readme.ts, in the renderer.heading implementation around line 332, we should replace:

const plainText = text.replace(/<[^>]*>/g, '').trim()

with a call to sanitizeHtml configured to disallow all tags and attributes:

const plainText = sanitizeHtml(text, {
  allowedTags: [],
  allowedAttributes: {},
}).trim()

This uses a well-tested library instead of a fragile regex, eliminates multi-character sanitization issues, and should preserve existing behavior of “HTML stripped, text kept” for TOC entries. No new imports are needed, because sanitizeHtml is already imported at line 2. All other logic (slug creation, TOC pushing, heading rendering) can remain unchanged.
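
As a dependency-free illustration of the alert, the sketch below reproduces the regex-based stripping and shows an input that survives it. `stripTagsWithRegex`, `stripTagsAndEscape`, and the sample strings are hypothetical names for this example only. The escaping step is a stand-in chosen so the snippet runs without packages; it is not the suggested fix, which uses sanitizeHtml.

```typescript
// The regex-based stripping flagged by the alert.
function stripTagsWithRegex(text: string): string {
  return text.replace(/<[^>]*>/g, '').trim()
}

// Hypothetical hardening for illustration: strip complete tags, then escape
// any angle brackets the regex left behind so no tag opener survives.
function stripTagsAndEscape(text: string): string {
  return stripTagsWithRegex(text).replace(/</g, '&lt;').replace(/>/g, '&gt;')
}

// Well-formed tags are removed as intended.
const wellFormed = stripTagsWithRegex('Install <code>npmx</code>')

// An unterminated tag has no closing ">", so the pattern never matches it.
const unterminated = stripTagsWithRegex('Usage <script src=x')

// After escaping, the leftover "<" can no longer open an element.
const escaped = stripTagsAndEscape('Usage <script src=x')

console.log({ wellFormed, unterminated, escaped })
```

The second case is the one the scanner warns about: the result still contains `<script`, which becomes an element opener if it is later concatenated into markup that supplies a closing `>`.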

Suggested changeset 1
server/utils/readme.ts

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/server/utils/readme.ts b/server/utils/readme.ts
--- a/server/utils/readme.ts
+++ b/server/utils/readme.ts
@@ -329,7 +329,10 @@
     const id = `user-content-${uniqueSlug}`
 
     // Collect TOC item with plain text (HTML stripped)
-    const plainText = text.replace(/<[^>]*>/g, '').trim()
+    const plainText = sanitizeHtml(text, {
+      allowedTags: [],
+      allowedAttributes: {},
+    }).trim()
     if (plainText) {
       toc.push({ text: plainText, id, depth })
     }
EOF
Unable to commit as this autofix suggestion is now outdated
@vercel

vercel bot commented Feb 6, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
npmx.dev Ignored Ignored Feb 6, 2026 0:49am

Request Review

@github-actions

github-actions bot commented Feb 6, 2026

Lunaria Status Overview

🌕 This pull request will trigger status changes.

Learn more

By default, every PR changing files present in the Lunaria configuration's files property will be considered and trigger status changes accordingly.

You can change this by adding one of the keywords present in the ignoreKeywords property in your Lunaria configuration file in the PR's title (ignoring all files) or by including a tracker directive in the merged commit's description.

Tracked Files

| File | Note |
| --- | --- |
| lunaria/files/ar-EG.json | Localization changed, will be marked as complete. |
| lunaria/files/az-AZ.json | Localization added, will be marked as complete. |
| lunaria/files/cs-CZ.json | Localization added, will be marked as complete. |
| lunaria/files/de-DE.json | Localization changed, will be marked as complete. |
| lunaria/files/en-GB.json | Localization added, will be marked as complete. |
| lunaria/files/en-US.json | Source changed, localizations will be marked as outdated. |
| lunaria/files/es-419.json | Localization changed, will be marked as complete. |
| lunaria/files/es-ES.json | Localization changed, will be marked as complete. |
| lunaria/files/fr-FR.json | Localization changed, will be marked as complete. |
| lunaria/files/hi-IN.json | Localization added, will be marked as complete. |
| lunaria/files/hu-HU.json | Localization changed, will be marked as complete. |
| lunaria/files/id-ID.json | Localization changed, will be marked as complete. |
| lunaria/files/it-IT.json | Localization changed, will be marked as complete. |
| lunaria/files/ja-JP.json | Localization changed, will be marked as complete. |
| lunaria/files/mr-IN.json | Localization added, will be marked as complete. |
| lunaria/files/ne-NP.json | Localization changed, will be marked as complete. |
| lunaria/files/pl-PL.json | Localization changed, will be marked as complete. |
| lunaria/files/pt-BR.json | Localization changed, will be marked as complete. |
| lunaria/files/ru-RU.json | Localization changed, will be marked as complete. |
| lunaria/files/uk-UA.json | Localization changed, will be marked as complete. |
| lunaria/files/zh-CN.json | Localization changed, will be marked as complete. |
| lunaria/files/zh-TW.json | Localization added, will be marked as complete. |

Warnings reference

| Icon | Description |
| --- | --- |
| 🔄️ | The source for this localization has been updated since the creation of this pull request, make sure all changes in the source have been applied. |
