You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: 03_Overview_AI_Concepts/README.md
+82-36Lines changed: 82 additions & 36 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -5,6 +5,8 @@
5
5
6
6
A Large Language Models (LLM) is a type of AI that can process and produce natural language text. LLMs are "general purpose" AI models trained using massive amounts of data gathered from various sources; like books, articles, webpages, and images to discover patterns and rules of language.
7
7
8
+
LLMs are complex and built using a neural network architecture. They are trained using large amounts of information, and calculate millions of parameters. From a developer perspective, the APIs expose by Azure OpenAI Service enable the LLMs to be easily integrated into enterprise solutions without requiring knowledge of how to build to train the models.
9
+
8
10
Understanding the capabilities of what an LLM can do is important when deciding to use it for a solution:
9
11
10
12
-**Understand language** - An LLM is a predictive engine that pulls patterns together based on pre-existing text to produce more text. It doesn't understand language or math.
@@ -29,77 +31,121 @@ Here are a few things that separate NLPs from LLMs:
29
31
| Provides a set of labeled data to train the ML model. | Uses many terabytes of unlabeled data in the foundation model. |
30
32
| Describes in natural language what you want the model to do. | Highly optimized for specific use cases. |
A prompt is an input or instruction provided to an Artificial Intelligence (AI) model to direct its behavior and produce the desired results. The quality and specificity of the prompt are crucial in obtaining precise and relevant outputs. A well-designed prompt can ensure that the AI model generates the desired information or completes the intended task effectively. Some typical prompts include summarization, question answering, text classification, and code generation.
35
39
40
+
#### Guidelines for creating robust prompts
36
41
37
-
## Standard Patterns
42
+
While it can be quick to write basic prompts, it can also be difficult to write more complex prompts to ge the AI to generate the responses necessary. When writing prompts, there are three basic guidelines to follow for creating useful prompts:
38
43
39
-
### Retrieval Augmentation Generation (RAG)
44
+
-**Show and tell** - Make it clear what you want either through instructions, examples, or a combination of the two. If you want the model to rank a list of items in alphabetical order or to classify a paragraph by sentiment, include these details in your prompt to show the model.
45
+
-**Provide quality data** - If you're trying to build a classifier or get the model to follow a pattern, make sure there are enough examples. Be sure to proofread your examples. The model is smart enough to resolve basic spelling mistakes and give you a meaningful response. Conversely, the model might assume the mistakes are intentional, which can affect the response.
46
+
-**Check your settings** - Probability settings, such as `Temperature` and `Top P`, control how deterministic the model is in generating a response. If you're asking for a response where there's only one right answer, you should specify lower values for these settings. If you're looking for a response that's not obvious, you might want to use higher values. The most common mistake users make with these settings is assuming they control "cleverness" or "creativity" in the model response.
40
47
41
-
Retrieval Augmentation Generation (RAG) is an architecture that augments the capabilities of a Large Language Model (LLM) like ChatGPT by adding an information retrieval system that provides grounding data. Adding an information retrieval system gives you control over grounding data used by an LLM when it formulates a response. For an enterprise solution, RAG architecture means that you can constrain generative AI to your enterprise content sourced from vectorized documents, images, audio, and video.
48
+
### What is prompt engineering
42
49
43
-
GPT language models can be fine-tuned to achieve several common tasks such as sentiment analysis and named entity recognition. These tasks generally don't require additional background knowledge.
50
+
[Prompt engineering](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/prompt-engineering) is the iterative process of designing, evaluating, and optimizing prompts to produce consistently accurate responses from language models for a particular problem domain. It involves designing and refining the prompts given to an AI model to achieve the desired outputs. Prompt engineers experiment with various prompts, test their effectiveness, and refine them to improve performance. Performance is measured using predefined metrics such as accuracy, relevance, and user satisfaction to assess the impact of prompt engineering.
44
51
45
-
The RAG pattern facilitates bringing private proprietary knowledge to the model so that it can perform Question Answering over this content. Remember that Large Language Models are indexed only on public information.
46
-
Because the RAG technique accesses external knowledge sources to complete tasks, it enables more factual consistency, improves the reliability of the generated responses, and helps to mitigate the problem of "hallucination".
52
+
### General anatomy of a prompt
47
53
48
-
In some cases, the RAG process involves a technique called vectorization on the proprietary data. The user prompt is compared to the vector store and only the most relevant/matching pieces of information are returned and stuffed into prompt for the LLM to reason over and provide an answer. The next set of demos will go into this further.
54
+
context, input, output indicator
55
+
There are a several components that are used in the anatomy of constructing AI prompts:
Instead of splitting a task into smaller steps, with Chain of Though (CoT) the model response is instructed to proceed step-by-step and present all the steps involved. Doing so reduces the possibility of inaccuracy of outcomes and makes assessing the model response easier.
Reasoning and Acting (ReAct) leverages LLMs to generate both reasoning traces and task-specific actions. ReAct combines these to allow for greater synergy between the two. Reasoning traces help the model induce, track, and update action plans as well as handle exceptions, while actions allow it to interface with external data sources to gather additional information. When using LLMs to answer questions and verify facts, ReAct is used to overcome issues of hallucination and error propagation by interacting with the additional usage of external data sources.
69
+
#### Instructions
61
70
62
-
### Others
71
+
Instructions are likely the most commonly used prompt component. Instructions are simply that: instructions to the model on what to do. While simple in concept, they can be complex in practice.
63
72
64
-
#### Zero-shot prompting
73
+
The following are examples of both simple and more complex instructions:
65
74
66
-
Zero-shot prompting is what we would consider the “default”. This is when we provide no examples of inputs/expected outputs to the model to work with. We’re leaving it up to the model to decipher what is needed and how to output it from the instructions.
75
+
| Instruction Type | Prompt | Completion |
76
+
| :--- | :--- | :--- |
77
+
| Simple | Write an introduction for a weekly newsletter. | Welcome to our weekly newsletter. This week we have a special focus on productivity. We'll be sharing tips and tricks on how to be productive in all areas of your life. We hope you enjoy and please share with your friends! |
78
+
| Complex | Write an introduction for a weekly newsletter to my company named Contoso. It should mention the great all hands meeting we had last week and thank the team for all their hard work over the tough past few months. | Welcome to Contoso's weekly newsletter! We had a great all hands meeting last week and I want to thank the team for all their hard work over the tough past few months. We're looking forward to a bright future and I'm confident that with our team's dedication and commitment, we will achieve great things. Thank you for being a part of our team and I look forward to working with you all in the coming weeks and months. Sincerely, [Your name]. |
79
+
| Complex | Write an introduction for a weekly newsletter, including the following: <br/> - My company is named contoso <br/> - An exuberant greeting <br/> - Thanking the team for their hard work over a couple tough months <br/> - Positive outlook for the coming quarter <br/> - Signed by the SLT | Hello Contoso team! I wanted to take a moment to thank you for all your hard work over the past few months. It's been a tough couple of months, but you've all pulled together and done an amazing job. I'm really proud of you all. Looking ahead to the next quarter, I'm feeling really positive. I think we're going to achieve some great things. Thanks again, The Senior Leadership Team. |
67
80
68
-
#### Few-shot prompting
81
+
#### Primary Content
69
82
70
-
Few-shot prompting is a technique that enables in-context learning for the LLM by providing examples or further information within the prompt to steer the model to generate a better response. Providing additional information in the prompt helps particularly in areas where the data used to train the model isn't enough to generate the desired output from the model.
83
+
Primary content refers to some sort of text that is being processed or transformed by the model. Primary content is typically used in conjunction with instructions. A simple example would be language translation.
| Can you please tell me how to get to the museum? <br/> Translate to French: |`Pouvez-vous s'il vous plaît me dire comment aller au musée?`|
73
88
89
+
Primary content can also be much longer. For example, the primary content could pass the introduction section of text content that could be hundreds of words long. Additionally, the primary content could be int he form of structured data as well, such as in JSON or TSV format.
74
90
75
-
##Vectorization and Vector Search
91
+
#### System message
76
92
77
-
What are you trying to solve with finding relevant data through vector?
A prompt is an input or instruction provided to an Artificial Intelligence (AI) model to direct its behavior and produce the desired results. The quality and specificity of the prompt are crucial in obtaining precise and relevant outputs. A well-designed prompt can ensure that the AI model generates the desired information or completes the intended task effectively. Some typical prompts include summarization, question answering, text classification, and code generation.
84
97
85
-
Simple examples of prompts:
98
+
## Standard Patterns
86
99
87
-
-_""_
88
-
-_""_
100
+
### Retrieval Augmentation Generation (RAG)
89
101
90
-
### What is prompt engineering
102
+
[Retrieval Augmentation Generation (RAG)](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview)is an architecture that augments the capabilities of a Large Language Model (LLM) like ChatGPT by adding an information retrieval system that provides grounding data. Adding an information retrieval system gives you control over grounding data used by an LLM when it formulates a response. For an enterprise solution, RAG architecture means that you can constrain generative AI to your enterprise content sourced from vectorized documents, images, audio, and video.
91
103
92
-
Prompt engineering is the iterative process of designing, evaluating, and optimizing prompts to produce consistently accurate responses from language models for a particular problem domain. It involves designing and refining the prompts given to an AI model to achieve the desired outputs. Prompt engineers experiment with various prompts, test their effectiveness, and refine them to improve performance. Performance is measured using predefined metrics such as accuracy, relevance, and user satisfaction to assess the impact of prompt engineering.
104
+
GPT language models can be fine-tuned to achieve several common tasks such as sentiment analysis and named entity recognition. These tasks generally don't require additional background knowledge.
93
105
94
-
### General anatomy of a prompt
106
+
The RAG pattern facilitates bringing private proprietary knowledge to the model so that it can perform Question Answering over this content. Remember that Large Language Models are indexed only on public information. For an enterprise solution, RAG architecture means you can constrain the generative AI to your enterprise content sourced from vectorized documents, images, audio, and video.
95
107
96
-
Instruction, context, input, output indicator
108
+
Because the RAG technique accesses external knowledge sources to complete tasks, it enables more factual consistency, improves the reliability of the generated responses, and helps to mitigate the problem of "_hallucination_".
In some cases, the RAG process involves a technique called vectorization on the proprietary data. The user prompt is compared to the vector store and only the most relevant/matching pieces of information are returned and stuffed into prompt for the LLM to reason over and provide an answer. The next set of demos will go into this further.
Instead of splitting a task into smaller steps, with [Chain of Thought (CoT)](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/advanced-prompt-engineering?pivots=programming-language-chat-completions#chain-of-thought-prompting) the model response is instructed to proceed step-by-step and present all the steps involved. Doing so reduces the possibility of inaccuracy of outcomes and makes assessing the model response easier.
| You are an AI assistant that helps people find information. | Who was the most decorated (maximum medals) individual athlete in the Olympic games that were held at Sydney? Take a step-by-step approach in your response, cite sources and give reasoning before sharing final answer in the below format: ANSWER is: `<name>` | Step 1: Determine the year of the Sydney Olympic games. The Sydney Olympic games were held in the year 2000. <br/> <br/> Step 2: Find a list of individual athletes who won multiple medals in the 2000 Sydney Olympics. <br/> <br/> According to the official website of the Olympic Games, there were several individual athletes who won multiple medals in the 2000 Sydney Olympics. Some of the notable names include: <br/> - Ian Thorpe (Australia) - 3 gold, 2 silver, 0 bronze <br/> - Pieter van den Hoogenband (Netherlands) - 2 gold, 0 silver, 2 bronze <br/> - Lenny Krayzelburg (USA) - 3 gold, 0 silver, 0 bronze <br/> - Inge de Bruijn (Netherlands) - 3 gold, 1 silver, 0 bronze <br/> <br/> Step 3: Determine the athlete with the most medals. After reviewing the list of individual athletes who won multiple medals in the 2000 Sydney Olympics, we can see that the athlete with the most medals is Ian Thorpe from Australia, who won a total of 5 medals (3 gold and 2 silver). <br/> <br/> Therefore, the ANSWER is: Ian Thorpe. |
119
+
120
+
### ReAct
121
+
122
+
Reasoning and Acting (ReAct) leverages LLMs to generate both reasoning traces and task-specific actions. ReAct combines these to allow for greater synergy between the two. Reasoning traces help the model induce, track, and update action plans as well as handle exceptions, while actions allow it to interface with external data sources to gather additional information. When using LLMs to answer questions and verify facts, ReAct is used to overcome issues of hallucination and error propagation by interacting with the additional usage of external data sources.
123
+
124
+
### Others
125
+
126
+
#### Zero-shot prompting
127
+
128
+
Zero-shot prompting is what we would consider the “default”. This is when we provide no examples of inputs/expected outputs to the model to work with. We’re leaving it up to the model to decipher what is needed and how to output it from the instructions.
105
129
130
+
Example:
131
+
132
+
| Prompt | Completion |
133
+
| :--- | :--- |
134
+
| Headline: Coach confident injury won't derail Warriors <br/> Topic: | The coach is confident that the injury won't derail the Warriors' season. The team is still focused on their goals and that they will continue to work hard to achieve them. |
135
+
136
+
#### Few-shot prompting
137
+
138
+
Few-shot prompting is a technique that enables in-context learning for the LLM by providing examples or further information within the prompt to steer the model to generate a better response. Providing additional information in the prompt helps particularly in areas where the data used to train the model isn't enough to generate the desired output from the model.
139
+
140
+
Example:
141
+
142
+
| Prompt | Completion |
143
+
| :--- | :--- |
144
+
| Headline: Twins' Correa to use opt-out, test free agency <br/> Topic: Baseball <br/> Headline: Qatar World Cup to have zones for sobering up <br/> Topic: Soccer <br/> Headline: Yates: Fantasy football intel for Week 6 <br/> Topic: Football <br/> Headline: Coach confident injury won't derail Warriors <br/> Topic: | Basketball |
0 commit comments