# How to Convert Incoming Spreadsheets to the Evaluation Format

## Overview

This guide explains how to handle incoming spreadsheet files (for example, from the Highside team) and convert or validate them against the expected evaluation (eval) format used by our team.

## Prerequisites

- Access to the shared location or channel where incoming files are uploaded (e.g., Slack or shared drive).
- Ability to download and open spreadsheet files (e.g., Microsoft Excel, Google Sheets) and comma-separated value (CSV) files.
- Access to the reference evaluation file:
  - `10.9_Gold_Sets_renamed.csv` (this file has the correct structure for evals).

## Instructions

1. **Obtain the Incoming File**
   - Download the spreadsheet sent by the external team (e.g., Highside team).
   - If multiple versions were uploaded, use the most recent one (in the thread, the re-upload was posted with the message “Uploading again here”).

2. **Open and Inspect the Spreadsheet**
   - Open the file in your spreadsheet application.
   - Check how many tabs (worksheets) the file contains.
   - Focus on **Tab 3**:
     - Team members wrote “See only Tab 3” and “90 rows on mine - see 3rd tab only.”
     - In other words, **only the third tab is relevant for the evaluation format**.

3. **Verify the Structure Against the Eval Format**
   - Locate and open the reference file:
     - `10.9_Gold_Sets_renamed.csv`
   - Compare the structure of **Tab 3** in the incoming spreadsheet to `10.9_Gold_Sets_renamed.csv`:
     - Column names
     - Column order
     - Data types (e.g., text vs. numeric)
   - The goal is for the incoming data (from Tab 3) to match the structure of `10.9_Gold_Sets_renamed.csv`.

4. **Convert or Align the Incoming Data**
   - If Tab 3 already matches the structure of `10.9_Gold_Sets_renamed.csv`, export Tab 3 as a CSV file.
   - If it does not match:
     - Rename columns to match the reference file.
     - Reorder columns to match the reference file.
     - Remove any extra columns not present in `10.9_Gold_Sets_renamed.csv`.
     - Save or export the cleaned Tab 3 as a CSV file in the evaluation format.

5. **Confirm the Result**
   - Once the CSV is prepared, validate that:
     - The file has the same headers as `10.9_Gold_Sets_renamed.csv`.
     - The number of columns matches.
   - If an automated eval pipeline or script is available, run it against the new CSV to confirm it works as expected.
   - If you are unable to run the pipeline (e.g., you are “on two other branches” or otherwise occupied), ask another team member to test it.
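
Once Tab 3 has been exported to CSV, the rename, reorder, and drop-extra-columns work in step 4 can be scripted. The following is a minimal Python sketch using only the standard library, not the team's actual tooling. The function names, file paths, and the `rename` mapping are hypothetical, because the thread does not list the real column names.

```python
import csv


def read_header(csv_path):
    """Return the first row of a CSV file as a list of column names."""
    # utf-8-sig strips the byte-order mark that Excel often adds to CSV exports.
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        return next(csv.reader(f))


def align_to_reference(incoming_csv, reference_csv, output_csv, rename=None):
    """Write output_csv with exactly the reference file's columns, in its order.

    rename maps incoming column names to reference column names. It is a
    placeholder: fill it in once the real column names are known.
    Returns the number of data rows written.
    """
    rename = rename or {}
    ref_cols = read_header(reference_csv)

    with open(incoming_csv, newline="", encoding="utf-8-sig") as f:
        reader = csv.DictReader(f)
        incoming_cols = [rename.get(c, c) for c in (reader.fieldnames or [])]
        missing = [c for c in ref_cols if c not in incoming_cols]
        if missing:
            raise ValueError(f"Incoming file lacks required columns: {missing}")
        # Apply the rename mapping to every row's keys.
        rows = [{rename.get(k, k): v for k, v in row.items()} for row in reader]

    with open(output_csv, "w", newline="", encoding="utf-8") as f:
        # fieldnames fixes the column order; extrasaction="ignore" drops
        # any incoming columns that the reference file does not have.
        writer = csv.DictWriter(f, fieldnames=ref_cols, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Example call (all paths and the mapping are assumptions):
# align_to_reference("tab3_export.csv", "10.9_Gold_Sets_renamed.csv",
#                    "tab3_eval_format.csv", rename={"Old Name": "new_name"})
```

The sketch fails loudly when a required column is missing instead of writing an empty column, since a silently incomplete file would be harder to catch later in the eval pipeline. It does not check data types, which still need a manual review against the reference file.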

## Important Notes and Caveats

- **Only Tab 3 is relevant**:
  Ignore other tabs unless explicitly instructed otherwise. The conversation indicates that the evaluation data is contained solely in the third tab.
- **Reference file is authoritative**:
  `10.9_Gold_Sets_renamed.csv` is confirmed by the team as having “the right structure for evals.” Use it as the canonical template.
- **External source issues**:
  If the spreadsheet appears “malformatted” or significantly different from the expected structure, the issue may originate from the external sender (e.g., Highside team).

## Troubleshooting

### Issue: Spreadsheet looks “malformatted” or very different

**Symptoms:**
- The spreadsheet layout looks unexpected.
- You see “25 pages of Excel” or an unusually large or complex layout.

**Actions:**
1. Confirm you are viewing **only Tab 3**.
2. Check whether Tab 3 has approximately the expected number of rows (one team member reported “90 rows on mine”).
3. If the structure still looks incorrect:
   - Ask the sending team (e.g., Highside) to verify and resend the file, as was done in the thread (“Can you ask them to check?”).
   - Request that they align their export to match the structure of `10.9_Gold_Sets_renamed.csv`.
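
The first two checks can also be done without opening the workbook in a spreadsheet application. The sketch below is an illustration under assumptions, not an established team script: `list_sheet_names` reads the tab names straight from a standard `.xlsx` file (which is a ZIP archive of XML parts), and `count_data_rows` counts rows in a CSV export of Tab 3. Both function names and the example paths are hypothetical.

```python
import csv
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by workbook.xml in standard (transitional) .xlsx files.
_NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def list_sheet_names(xlsx_path):
    """Return worksheet names in tab order, including hidden sheets."""
    with zipfile.ZipFile(xlsx_path) as zf:
        root = ET.fromstring(zf.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.findall("m:sheets/m:sheet", _NS)]


def count_data_rows(csv_path):
    """Count non-empty rows after the header row in a CSV file."""
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        return sum(1 for row in reader if any(cell.strip() for cell in row))


# Example calls (paths are assumptions):
# names = list_sheet_names("incoming.xlsx")
# print(len(names), "tabs; third tab:", names[2] if len(names) > 2 else None)
# print(count_data_rows("tab3_export.csv"), "data rows")
```

Because hidden sheets are included in the list, the third entry may differ from the third visible tab. A row count far from the roughly 90 rows mentioned in the thread suggests that the wrong tab was exported or that the file itself is malformed.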

### Issue: Unclear whether the conversion is working

**Symptoms:**
- Team members ask “is this working?” or are unsure if the new file is in the correct format.

**Actions:**
1. Re-compare the new CSV against `10.9_Gold_Sets_renamed.csv`:
   - Headers
   - Column order
   - Number of columns
2. If an evaluation script or pipeline exists, run it on the new file.
3. Ask a colleague to confirm that the structure is correct.
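
The header comparison in action 1 lends itself to a small script whose output can be pasted into the thread when asking a colleague to confirm. This is a minimal sketch using Python's standard `csv` module. The function names and example paths are hypothetical, and it checks only the header row, not data types or values.

```python
import csv


def read_header(csv_path):
    """Return the first row of a CSV file as a list of column names."""
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        return next(csv.reader(f))


def compare_headers(new_csv, reference_csv):
    """Summarize how the new file's header differs from the reference header."""
    new_cols = read_header(new_csv)
    ref_cols = read_header(reference_csv)
    return {
        "missing": [c for c in ref_cols if c not in new_cols],  # in reference only
        "extra": [c for c in new_cols if c not in ref_cols],    # in new file only
        "same_count": len(new_cols) == len(ref_cols),
        "same_order": new_cols == ref_cols,  # True only for identical headers
    }


# Example call (paths are assumptions):
# report = compare_headers("tab3_eval_format.csv", "10.9_Gold_Sets_renamed.csv")
# print(report)
```

If `same_order` is true, the other three fields are necessarily clean, so that single flag answers the “is this working?” question at the header level.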

## Gaps and Additional Information Needed

The Slack conversation does not specify:

- The exact column names and data types required for the evaluation format.
- The location (path or URL) where `10.9_Gold_Sets_renamed.csv` is stored.
- The exact tool or script used to run evaluations on the CSV.

To fully complete this documentation, you may need to add:

- A table listing the required columns and their descriptions.
- The repository path or shared drive location of `10.9_Gold_Sets_renamed.csv`.
- Any command-line instructions or application steps used to run the evaluation on the prepared CSV.

---
*Source: [Original Slack thread](https://distylai.slack.com/archives/impl-tower-billing/p1746831418355589)*