41 changes: 41 additions & 0 deletions README.md
@@ -312,3 +312,44 @@ You can inject CodeQL into the build tool as such as custom command; for compile
It will be necessary to keep track of the CodeQL category assigned to each build target, to avoid clashes.

See the GitHub documentation on [Using code scanning with your existing CI system](https://docs.github.com/en/enterprise-cloud@latest/code-security/code-scanning/integrating-with-code-scanning/using-code-scanning-with-your-existing-ci-system)

### Alternative SARIF Republishing: `republish-filtered-sarif` Action

The `republish-filtered-sarif` action streamlines how Code Scanning results are presented on pull requests in a monorepo.

**Problem Addressed:**
In monorepo PR workflows, Code Scanning checks can appear incomplete because projects that were not scanned in the PR have no results. The existing `republish-sarif` action addresses this, but it typically requires a `projects.json` file that lists every project to republish, and that file has to be kept up to date.

**Solution (`republish-filtered-sarif` Action):**
This composite action provides a quick, easy, and **`projects.json`-agnostic** way to ensure a complete Code Scanning overview on PRs. It works by:

1. **Dynamically Discovering Analyses:** It queries the GitHub Code Scanning API for the recent CodeQL analyses published to the repository's default branch (for example, `main`).
2. **Intelligent Exclusion:** It takes an `excluded-category` input, which should be the exact category string used in the CodeQL `analyze` step for the project currently being scanned in the PR. This allows the action to *exclude* that specific project's SARIF from being downloaded and re-uploaded.
3. **Latest SARIF Selection:** For every *other* category found on the default branch, it selects and downloads only the *most recent* SARIF.
4. **Direct API Republishing:** The selected SARIFs are then uploaded through the GitHub Code Scanning API against the pull request's merge ref (`refs/pull/<number>/merge`) and the workflow's commit SHA.
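
The exclusion and latest-per-category selection in steps 2 and 3 can be sketched as a small standalone function. This is a minimal illustration rather than the action's exact code: `selectLatestPerCategory` is a hypothetical name, and the sample objects are made-up data that carry only the `id`, `category`, and `created_at` fields the selection relies on.

```javascript
// Hypothetical helper: keep the newest analysis per category, skipping one category.
function selectLatestPerCategory(analyses, excludedCategory) {
  // Sort a copy newest-first so the first hit for each category is the latest one.
  const newestFirst = [...analyses].sort(
    (a, b) => new Date(b.created_at).getTime() - new Date(a.created_at).getTime()
  );

  const seen = new Set();
  const selected = [];
  for (const analysis of newestFirst) {
    if (analysis.category === excludedCategory) continue; // project scanned in this PR
    if (seen.has(analysis.category)) continue; // an older analysis for a category already kept
    seen.add(analysis.category);
    selected.push(analysis);
  }
  return selected;
}

// Illustrative data only; real entries come from the Code Scanning analyses API.
const sample = [
  { id: 1, category: 'backend', created_at: '2024-05-01T10:00:00Z' },
  { id: 2, category: 'frontend', created_at: '2024-05-01T09:00:00Z' },
  { id: 3, category: 'frontend', created_at: '2024-05-02T09:00:00Z' },
  { id: 4, category: 'docs', created_at: '2024-04-30T09:00:00Z' },
];

console.log(selectLatestPerCategory(sample, 'backend').map((a) => a.id)); // → [ 3, 4 ]
```

Because the match is an exact string comparison, the `excluded-category` value has to be identical to the category stored with the analysis.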

**Key Benefit:**
This action **simplifies the republishing process by removing the need for a `projects.json` file** for this step. Users provide the CodeQL `category` value of the currently scanned project, and the action republishes the latest results for every other category automatically.
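
Part of what the action handles on the user's behalf is packaging each SARIF for the Code Scanning upload endpoint, which expects the file gzip-compressed and then Base64-encoded, alongside the target `ref` and `commit_sha`. The sketch below shows that encoding in Node.js. `buildSarifUploadPayload` is a hypothetical helper, and the SARIF content, PR number, and SHA are placeholders; the action itself performs this step in Bash with `gzip`, `base64`, and `jq`.

```javascript
const zlib = require('node:zlib');

// Hypothetical helper: build the JSON body for POST /repos/{owner}/{repo}/code-scanning/sarifs.
function buildSarifUploadPayload(sarifText, ref, commitSha) {
  // gzip first, then Base64-encode the compressed bytes.
  const sarif = zlib.gzipSync(Buffer.from(sarifText, 'utf8')).toString('base64');
  return { sarif, ref, commit_sha: commitSha };
}

// Placeholder values for illustration only.
const sarifText = JSON.stringify({ version: '2.1.0', runs: [] });
const payload = buildSarifUploadPayload(
  sarifText,
  'refs/pull/123/merge',
  '0123456789abcdef0123456789abcdef01234567'
);

// Round-trip check: decoding and decompressing restores the original SARIF text.
const decoded = zlib.gunzipSync(Buffer.from(payload.sarif, 'base64')).toString('utf8');
console.log(decoded === sarifText); // → true
```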

**Example Usage:**

```yaml
# In your monorepo workflow (e.g., for a 'backend' project)

jobs:
  analyze:
    steps:
      # ... (checkout, setup, CodeQL init, build steps) ...

      - name: Perform CodeQL Analysis (Backend)
        uses: github/codeql-action/analyze@v3
        with:
          category: 'backend' # <--- This unique category string is key!

      # ... (other steps, like dependency submission) ...

      - name: Republish other SARIFs for PR
        if: github.event_name == 'pull_request'
        uses: your-org/your-repo/.github/actions/republish-filtered-sarif@main # Update this path
        with:
          excluded-category: 'backend' # Pass the category of the project just scanned
```
111 changes: 111 additions & 0 deletions republish-filtered-sarifs/action.yml
@@ -0,0 +1,111 @@
# republish-filtered-sarifs/action.yml

name: Download & Republish Filtered SARIFs
description: 'Downloads Code Scanning SARIF files from the default branch, excludes a specified category, uploads the rest as an artifact, and republishes them to the pull request.'
inputs:
  excluded-category:
    description: 'The single CodeQL category string to exclude from the download and republish (e.g., /language:java;project:backend-service).'
    required: true

runs:
  using: 'composite'
  steps:
    - name: Checkout repository
      uses: actions/checkout@v4

    - name: Run SARIF Download Script
      uses: actions/github-script@v7
      env:
        EXCLUDED_CATEGORY_INPUT: ${{ inputs.excluded-category }}
      with:
        script: |
          const scriptPath = `${process.env.GITHUB_ACTION_PATH}/download_filtered_sarifs.js`;
          const script = require(scriptPath);
          // Await the result so that a returned promise settles before the step ends.
          await script(github, context, core);
        github-token: ${{ github.token }}

    - name: Upload Downloaded SARIFs as Artifact
      uses: actions/upload-artifact@v4
      with:
        name: filtered-sarif-downloads
        path: sarif_downloads/
        if-no-files-found: ignore

    # Scenario 1: PR mode - upload SARIF files via GitHub API to the PR itself
    - name: Upload Filtered SARIF files to PR Code Scanning
      if: github.event_name == 'pull_request' && github.event.pull_request.merged != true && hashFiles('sarif_downloads/*.sarif') != ''
      shell: bash
      env:
        GH_TOKEN: ${{ github.token }}
      run: |
        echo "Uploading filtered SARIF files to PR via GitHub Code Scanning API"

        SARIF_COUNT=$(find sarif_downloads -name "*.sarif" | wc -l)
        echo "Found $SARIF_COUNT SARIF files to upload"

        REPO_OWNER="${GITHUB_REPOSITORY%/*}"
        REPO_NAME="${GITHUB_REPOSITORY#*/}"

        for SARIF_FILE in sarif_downloads/*.sarif; do
          echo "Processing $SARIF_FILE"

          # 1. Gzip and base64 encode the SARIF file content into a temporary file.
          #    This is the safest way to handle potentially very large data.
          #    `tr -d '\n'` strips line wrapping portably (GNU-only `base64 -w0` is avoided).
          TEMP_BASE64_FILE=$(mktemp)
          gzip -c "$SARIF_FILE" | base64 | tr -d '\n' > "$TEMP_BASE64_FILE"

          # 2. Reconstruct the category name from the filename (for logging only, not payload).
          #    This is a best-effort guess: the file name is a lossy encoding of the category.
          CATEGORY_NAME=$(basename "$SARIF_FILE" .sarif | sed 's/_/\//g' | sed 's/\.yml:/yml:/g' | sed 's/^analyze\//\/analyze\//g')

          echo "Uploading $(basename "$SARIF_FILE") with category '$CATEGORY_NAME' to PR"

          if [ ! -s "$TEMP_BASE64_FILE" ]; then # -s checks if file exists and is not empty
            echo "Error: SARIF content for $SARIF_FILE is empty after base64 encoding or file creation failed."
            rm -f "$TEMP_BASE64_FILE"
            exit 1
          fi

          # 3. Construct the full JSON payload using jq
          #    We use `--rawfile sarif_data "$TEMP_BASE64_FILE"` to read the file content
          #    directly into the 'sarif_data' variable as a raw string.
          JSON_PAYLOAD=$(jq -n \
            --arg ref_val "refs/pull/${{ github.event.pull_request.number }}/merge" \
            --arg commit_val "${{ github.sha }}" \
            --rawfile sarif_data "$TEMP_BASE64_FILE" \
            '{sarif: $sarif_data, ref: $ref_val, commit_sha: $commit_val}')

          # Clean up the temporary file immediately after jq has used it
          rm -f "$TEMP_BASE64_FILE"

          echo "DEBUG: JSON_PAYLOAD (sarif content redacted):"
          # Pretty-print the JSON_PAYLOAD but replace the 'sarif' field's value with "REDACTED"
          echo "$JSON_PAYLOAD" | jq 'if .sarif then .sarif = "REDACTED" else . end'

          if [ -z "$JSON_PAYLOAD" ]; then
            echo "Error: JSON_PAYLOAD is empty. jq command failed to produce output."
            exit 1
          fi

          # 4. Pipe the JSON payload to gh api --input -
          #    Test the pipeline directly in the `if`: the step shell runs with `-e`, so a
          #    failing command would end the step before a separate `$?` check could run.
          if printf "%s" "$JSON_PAYLOAD" | gh api \
            --method POST \
            -H "Accept: application/vnd.github+json" \
            -H "X-GitHub-Api-Version: 2022-11-28" \
            "/repos/$REPO_OWNER/$REPO_NAME/code-scanning/sarifs" \
            --input - \
            --jq '.id'; then
            echo "✓ Successfully uploaded $(basename "$SARIF_FILE")"
          else
            echo "✗ Failed to upload $(basename "$SARIF_FILE")"
            exit 1
          fi

          sleep 1
        done

156 changes: 156 additions & 0 deletions republish-filtered-sarifs/download_filtered_sarifs.js
@@ -0,0 +1,156 @@
// download_filtered_sarifs.js

const fs = require('node:fs');
const path = require('node:path');

async function run(github, context, core) {
  const repo = context.repo;

  // --- Input: Get the single excluded category from an environment variable ---
  // This value is passed in by the composite action step that runs this script (see action.yml).
  const EXCLUDED_CATEGORY = process.env.EXCLUDED_CATEGORY_INPUT;

  if (!EXCLUDED_CATEGORY || EXCLUDED_CATEGORY.trim() === '') {
    core.setFailed("Error: 'EXCLUDED_CATEGORY_INPUT' environment variable is required and cannot be empty.");
    return;
  }

  // --- Dynamically determine the default branch for downloading analyses ---
  let targetRef;
  try {
    const { data: repoData } = await github.rest.repos.get({
      owner: repo.owner,
      repo: repo.repo,
    });
    targetRef = `refs/heads/${repoData.default_branch}`;
    core.info(`Dynamically determined default branch: '${repoData.default_branch}'. Will download analyses from '${targetRef}'.`);
  } catch (error) {
    core.setFailed(`Failed to get default branch for ${repo.owner}/${repo.repo}: ${error.message}`);
    return;
  }

  core.info(`EXCLUDING SARIFs with category: '${EXCLUDED_CATEGORY}'`);

  const DOWNLOAD_DIR = 'sarif_downloads';
  if (!fs.existsSync(DOWNLOAD_DIR)) {
    fs.mkdirSync(DOWNLOAD_DIR, { recursive: true }); // Ensure parent directories are created
    core.info(`Created directory: ${DOWNLOAD_DIR}`);
  } else {
    core.info(`Directory already exists: ${DOWNLOAD_DIR}`);
  }

  let allAnalyses = [];
  let page = 1;
  let hasNextPage = true;

  // --- 1. Fetch all recent Code Scanning analyses for the targetRef ---
  try {
    while (hasNextPage) {
      const response = await github.rest.codeScanning.listRecentAnalyses({
        owner: repo.owner,
        repo: repo.repo,
        ref: targetRef,
        per_page: 100, // Fetch 100 analyses per page
        page: page,
        tool_name: 'CodeQL' // Optionally filter by tool if you only want CodeQL SARIFs
      });

      allAnalyses = allAnalyses.concat(response.data);

      // Check if there are more pages to fetch
      if (response.data.length < 100) {
        hasNextPage = false;
      } else {
        page++;
      }
    }
    core.info(`Found ${allAnalyses.length} total recent analyses for ref: '${targetRef}'`);
  } catch (error) {
    core.setFailed(`Failed to list recent analyses for '${targetRef}': ${error.message}`);
    // Provide more detail for common errors like 404 (no analyses found)
    if (error.status === 404) {
      core.warning(`No CodeQL analyses found for ref: '${targetRef}'. Ensure analyses exist for this branch.`);
    }
    return;
  }

  // --- 2. Filter out analyses based on the single EXCLUDED_CATEGORY ---
  const analysesToDownload = [];
  const categoriesSeen = new Set(); // To ensure we only download the most recent analysis for each unique category

  // Sort by created_at (most recent first) to ensure we get the latest if multiple analyses exist for a category
  allAnalyses.sort((a, b) => new Date(b.created_at).getTime() - new Date(a.created_at).getTime());

  for (const analysis of allAnalyses) {
    // Check if the category matches our single excluded category
    if (analysis.category === EXCLUDED_CATEGORY) {
      core.info(`Skipping analysis for excluded category: '${analysis.category}' (ID: ${analysis.id})`);
      continue; // Skip to the next analysis if it's the excluded one
    }

    // Ensure we only download the most recent analysis for each unique category found after exclusion
    if (!categoriesSeen.has(analysis.category)) {
      categoriesSeen.add(analysis.category);
      analysesToDownload.push(analysis);
    } else {
      core.debug(`Skipping older analysis for category '${analysis.category}' (ID: ${analysis.id})`);
    }
  }

  if (analysesToDownload.length === 0) {
    core.info("No analyses found to download after filtering. Exiting.");
    return;
  }

  core.info(`Attempting to download ${analysesToDownload.length} SARIF files after filtering.`);

  // --- 3. Download the filtered SARIF files ---
  // Use Promise.all to download files concurrently
  await Promise.all(analysesToDownload.map(async (analysis) => {
    try {
      const sarifId = analysis.id;
      const category = analysis.category;
      // Generate a clean filename from the category.
      // Note: categories that differ only in punctuation or case map to the same file name.
      const fileName = category.replace(/[^a-z0-9_]/gi, '_').toLowerCase() + '.sarif';
      const filePath = path.join(DOWNLOAD_DIR, fileName);

      core.info(`Downloading SARIF for category '${category}' (ID: ${sarifId})`);

      // Make a direct API call to get the SARIF content
      const sarifResponse = await github.rest.codeScanning.getAnalysis({
        owner: repo.owner,
        repo: repo.repo,
        analysis_id: sarifId,
        headers: {
          Accept: "application/sarif+json", // Crucial: Request SARIF JSON directly
        },
      });

      // Depending on how the client decodes the `application/sarif+json` response,
      // the body may arrive as a parsed object, a string, or raw bytes.
      // Normalize to text so a string body is not JSON-encoded a second time.
      const sarifContent = sarifResponse.data;
      let sarifText = '';
      if (typeof sarifContent === 'string') {
        sarifText = sarifContent;
      } else if (sarifContent instanceof ArrayBuffer) {
        sarifText = Buffer.from(sarifContent).toString('utf8');
      } else if (sarifContent && Object.keys(sarifContent).length > 0) {
        sarifText = JSON.stringify(sarifContent, null, 2); // Pretty print JSON for readability
      }

      // Basic validation for received content
      if (sarifText.trim().length === 0) {
        core.warning(`SARIF content received is empty or malformed for analysis ID ${analysis.id} (category: '${category}'). Skipping write.`);
        return;
      }

      // Write the SARIF content to a file
      fs.writeFileSync(filePath, sarifText);
      core.info(`Successfully downloaded SARIF for '${category}' to '${filePath}'`);

    } catch (error) {
      core.error(`Failed to download SARIF for analysis ID ${analysis.id} (category: '${analysis.category}'): ${error.message}`);
      // Log the error but don't fail the whole job so other downloads can proceed
      // If you want the job to fail on any download error, you'd re-throw or setFailed here.
    }
  }));

  core.info("Finished processing the filtered SARIF files.");
}

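// This is the entry point for actions/github-script.
// Return the promise so the caller can await completion, and report unexpected errors as a failed step.
module.exports = (github, context, core) =>
  run(github, context, core).catch((error) => {
    core.setFailed(`Unexpected error while downloading SARIFs: ${error.message}`);
  });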