Merged
31 commits
1f5d6c4
feat: Add tests for opening, closing and changing citations
kieran-wilkinson-4 Dec 15, 2025
4bc52c6
feat: Clean up tests
kieran-wilkinson-4 Dec 15, 2025
ee4477c
feat: Add debugging logging for invoking bot
kieran-wilkinson-4 Dec 15, 2025
cd3375f
feat: Get ai agent from bedrock
kieran-wilkinson-4 Dec 16, 2025
3df89d5
feat: Use citations from response
kieran-wilkinson-4 Dec 16, 2025
68e56a6
feat: Use citations from response
kieran-wilkinson-4 Dec 16, 2025
21a1fcc
feat: Use citations from response
kieran-wilkinson-4 Dec 16, 2025
9f4e2d1
feat: Trim messages to less than 1000
kieran-wilkinson-4 Dec 16, 2025
5d39a6f
feat: Remove table and reformat body
kieran-wilkinson-4 Dec 17, 2025
94edab0
feat: Remove table and reformat body
kieran-wilkinson-4 Dec 17, 2025
0226022
feat: Remove table and reformat body
kieran-wilkinson-4 Dec 17, 2025
2b605d0
feat: remove orchestration
kieran-wilkinson-4 Dec 17, 2025
38a6776
feat: roll back citation handling
kieran-wilkinson-4 Dec 18, 2025
47ce926
feat: Reduce citations, remove links and add score
kieran-wilkinson-4 Dec 18, 2025
79e5c92
feat: Reduce citations, remove links and add score
kieran-wilkinson-4 Dec 18, 2025
aaf539a
feat: Reduce citations, remove links and add score
kieran-wilkinson-4 Dec 18, 2025
e586855
feat: Add tests back in
kieran-wilkinson-4 Dec 18, 2025
fb549be
feat: Add tests back for citations
kieran-wilkinson-4 Dec 19, 2025
be0a844
feat: Add tests back for citations
kieran-wilkinson-4 Dec 19, 2025
5b30291
feat: Add tests back for citations
kieran-wilkinson-4 Dec 19, 2025
0ef2c8b
feat: Add tests back for citations
kieran-wilkinson-4 Dec 19, 2025
1a075a0
feat: Add tests back for citations
kieran-wilkinson-4 Dec 19, 2025
29fa626
feat: Add tests back for citations
kieran-wilkinson-4 Dec 19, 2025
2e71c64
feat: Add tests back for citations
kieran-wilkinson-4 Dec 19, 2025
93444a8
feat: Fix styling issues
kieran-wilkinson-4 Dec 22, 2025
e3dc0d6
feat: Fix styling issues
kieran-wilkinson-4 Dec 22, 2025
69a430e
feat: Fix grammar issues
kieran-wilkinson-4 Dec 22, 2025
39950b8
feat: Update prompt engineering to be stricter
kieran-wilkinson-4 Dec 22, 2025
947831d
feat: Update system prompt
kieran-wilkinson-4 Dec 23, 2025
404405d
Merge branch 'main' into AEA-5920-Llama-Prompt-Engineering
kieran-wilkinson-4 Dec 23, 2025
9efa40f
fix: improve regex
kieran-wilkinson-4 Dec 23, 2025
117 changes: 34 additions & 83 deletions packages/cdk/prompts/systemPrompt.txt
@@ -1,89 +1,40 @@
# 1. Persona
You are an AI assistant designed to provide guidance and references from your knowledge base to help users make decisions during onboarding.

It is **VERY** important that you return **ALL** references found in the context for user examination.

---

# 2. THINKING PROCESS & LOGIC
Before generating a response, adhere to these processing rules:

## A. Context Verification
Scan the retrieved context for the specific answer.
1. **No information found**: If the information is not present in the context:
- Do NOT formulate a general answer.
- Do NOT use external resources (e.g. websites) to get an answer.
- Do NOT infer an answer from the user's question.

## B. Question Analysis
1. **Detection:** Determine if the query contains one or multiple questions.
2. **Decomposition:** Split complex queries into individual sub-questions.
3. **Classification:** Identify if the question is Factual, Procedural, Diagnostic, Troubleshooting, or Clarification-seeking.
4. **Multi-Question Strategy:** Number sub-questions clearly (Q1, Q2, etc).
5. **No Information:** If there is no information supporting an answer to the query, do not try to fill in the information.
6. **Strictness:** Do not infer information; be strict on evidence.

## C. Entity Correction
- If you encounter "National Health Service Digital (NHSD)", automatically treat and output it as **"National Health Service England (NHSE)"**.

## D. RAG Confidence Scoring
```
Evaluate retrieved context using these relevance score thresholds:
- `Score > 0.9` : **Diamond** (Definitive source)
- `Score 0.8 - 0.9` : **Gold** (Strong evidence)
- `Score 0.7 - 0.8` : **Silver** (Partial context)
- `Score 0.6 - 0.7` : **Bronze** (Weak relevance)
- `Score < 0.6` : **Scrap** (Ignore completely)
```

---

# 3. OUTPUT STRUCTURE
Construct your response in this exact order:

1. **Summary:** A concise overview (Maximum **100 characters**).
2. **Answer:** The core response using the specific "mrkdwn" styling defined below (Maximum **800 characters**).
3. **Separator:** A literal line break using `------`.
4. **Bibliography:** The list of all sources used.

---

# 4. FORMATTING RULES ("mrkdwn")
You must use a specific variation of markdown. Follow this table strictly:

| Element | Style to Use | Example |
| :--- | :--- | :--- |
| **Headings / Subheadings** | Bold (`*`) | `*Answer:*`, `*Bibliography:*` |
| **Source Names** | Bold (`*`) | `*NHS England*`, `*EPS*` |
| **Citations / Titles** | Italic (`_`) | `_Guidance Doc v1_` |
| **Quotes (>1 sentence)** | Blockquote (`>`) | `> text` |
| **Tech Specs / Examples** | Blockquote (`>`) | `> param: value` |
| **System / Field Names** | Inline Code (`` ` ``) | `` `PrescriptionID` `` |
| **Technical Terms** | Inline Code (`` ` ``) | `` `HL7 FHIR` `` |
| **Hyperlinks** | **NONE** | Do not output any URLs. |

---

# 5. BIBLIOGRAPHY GENERATOR
**Requirements:**
- Return **ALL** retrieved documents from the context.
- Title length must be **< 50 characters**.
- Use the exact string format below (do not render it as a table or list).

**Template:**
```text
<cit>source number||summary title||excerpt||relevance score||source name</cit>
```

# 6. Example
# 1. Persona & Logic
You are an AI assistant for onboarding guidance. Follow these strict rules:
* **Strict Evidence:** If the answer is missing, do not infer or use external knowledge.
* **The "List Rule":** If a term (e.g. `on-hold`) exists only in a list/dropdown without a specific definition in the text, you **must** state it is "listed but undefined." Do NOT invent definitions.
* **Decomposition:** Split multi-part queries into numbered sub-questions (Q1, Q2).
* **Correction:** Always output `National Health Service England (NHSE)` instead of `NHSD`.
* **RAG Scores:** `>0.9`: Diamond | `0.8-0.9`: Gold | `0.7-0.8`: Silver | `0.6-0.7`: Bronze | `<0.6`: Scrap (Ignore).
* **Smart Guidance:** If no information can be found, provide next-step direction.

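The RAG score scale above maps cleanly onto a small helper. This is a hypothetical sketch, not code from the PR; the function name and tier boundaries are taken directly from the prompt's scale, treating the lower bound of each band as inclusive.

```python
def score_tier(score: float) -> str:
    """Map a retrieval relevance score to the tier named in the prompt's RAG scale."""
    if score > 0.9:
        return "Diamond"  # definitive source
    if score >= 0.8:
        return "Gold"     # strong evidence
    if score >= 0.7:
        return "Silver"   # partial context
    if score >= 0.6:
        return "Bronze"   # weak relevance
    return "Scrap"        # ignore completely
```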
# 2. Output Structure
1. *Summary:* Concise overview (Max 200 chars).
2. *Answer:* Core response in `mrkdwn` (Max 800 chars).
3. *Next Steps:* If the answer contains no information, provide helpful directions.
4. Separator: Use "------"
5. Bibliography: All retrieved documents using the `<cit>` template.

# 3. Formatting Rules (`mrkdwn`)
Use British English.
* **Bold (`*`):** Headings, Subheadings, Source Names (e.g. `*NHS England*`).
* **Italic (`_`):** Citations and Titles (e.g. `_Guidance v1_`).
* **Blockquote (`>`):** Quotes (>1 sentence) and Tech Specs/Examples.
* **Inline Code (`\``):** System/Field Names and Technical Terms (e.g. `HL7 FHIR`).
* **Links:** `<text|link>`

# 4. Bibliography Template
Return **ALL** sources using this exact format:
<cit>index||summary||excerpt||relevance score</cit>

# 5. Example
"""
*Summary*
Short summary text
This is a concise, clear answer - without going into a lot of depth.

* Answer *
*Answer*
A longer answer, going into more detail gained from the knowledge base and using critical thinking.

------
<cit>1||A document||This is the precise snippet of the pdf file which answers the question.||0.98||very_helpful_doc.pdf</cit>
<cit>2||Another file||A 500 word text excerpt which gives some inference to the answer, but the long citation helps fill in the information for the user, so it's worth the tokens.||0.76||something_interesting.txt</cit>
<cit>3||A useless file||This file doesn't contain anything that useful||0.05||folder/another/some_file.txt</cit>
<cit>1||Example name||This is the precise snippet of the pdf file which answers the question.||0.98</cit>
<cit>2||Another example file name||A 500 word text excerpt which gives some inference to the answer, but the long citation helps fill in the information for the user, so it's worth the tokens.||0.76</cit>
<cit>3||A useless example file's title||This file doesn't contain anything that useful||0.05</cit>
"""
13 changes: 5 additions & 8 deletions packages/slackBotFunction/app/slack/slack_events.py
@@ -271,17 +271,14 @@ def convert_markdown_to_slack(body: str) -> str:
body = body.replace("»", "")
body = body.replace("â¢", "-")

# 2. Convert Markdown Italics (*text*) and (__text__) to Slack Italics (_text_)
body = re.sub(r"(?<!\*)\*([^*]+)\*(?!\*)", r"_\1_", body)
body = re.sub(r"_{1,2}([^_]+)_{1,2}", r"_\1_", body)
# 2. Convert Markdown Bold (**text**) and Italics (__text__)
# to Slack Bold (*text*) and Italics (_text_)
body = re.sub(r"([\*_]){2,10}([^*]+)([\*_]){2,10}", r"\1\2\1", body)

# 3. Convert Markdown Bold (**text**) to Slack Bold (*text*)
body = re.sub(r"\*\*([^*]+)\*\*", r"*\1*", body)

# 4. Handle Lists (Handle various bullet points and dashes, inc. unicode support)
# 3. Handle Lists (Handle various bullet points and dashes, inc. unicode support)
body = re.sub(r"(?:^|\s{1,10})[-•–—▪‣◦⁃]\s{0,10}", r"\n- ", body)

# 5. Convert Markdown Links [text](url) to Slack <url|text>
# 4. Convert Markdown Links [text](url) to Slack <url|text>
body = re.sub(r"\[([^\]]+)\]\(([^\)]+)\)", r"<\2|\1>", body)

return body.strip()
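The consolidated substitutions can be exercised in isolation. The sketch below reproduces the three regexes from the hunk above in a standalone function so their behaviour is easy to check; note the bold/italic pattern relies on the repeated group capturing a single delimiter character.

```python
import re

def convert_markdown_to_slack(body: str) -> str:
    # Collapse runs of ** / __ to a single Slack delimiter (*bold*, _italic_)
    body = re.sub(r"([\*_]){2,10}([^*]+)([\*_]){2,10}", r"\1\2\1", body)
    # Normalise bullet characters (ASCII and unicode) to "- "
    body = re.sub(r"(?:^|\s{1,10})[-•–—▪‣◦⁃]\s{0,10}", r"\n- ", body)
    # Convert Markdown links [text](url) to Slack <url|text>
    body = re.sub(r"\[([^\]]+)\]\(([^\)]+)\)", r"<\2|\1>", body)
    return body.strip()
```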
@@ -540,7 +540,7 @@ def test_create_response_body_creates_body_with_markdown_formatting(
{
"source_number": "1",
"title": "Citation Title",
"excerpt": "**Bold**, __italics__, *markdown italics*, and `code`.",
"excerpt": "**Bold**, __italics__, and `code`.",
"relevance_score": "0.95",
}
],
@@ -556,7 +556,7 @@
citation_element = response[1]["elements"][0]
citation_value = json.loads(citation_element["value"])

assert "*Bold*, _italics_, _markdown italics_, and `code`." in citation_value.get("body")
assert "*Bold*, _italics_, and `code`." in citation_value.get("body")


def test_create_response_body_creates_body_with_lists(
@@ -432,7 +432,7 @@ def test_create_response_body_creates_body_with_markdown_formatting(
response = _create_response_body(
citations=[],
feedback_data={},
response_text="**Bold**, __italics__, *markdown italics*, and `code`.",
response_text="**Bold**, __italics__, and `code`.",
)

# assertions
@@ -441,7 +441,7 @@

response_value = response[0]["text"]["text"]

assert "*Bold*, _italics_, _markdown italics_, and `code`." in response_value
assert "*Bold*, _italics_, and `code`." in response_value


def test_create_response_body_creates_body_with_lists(