@vladimirivic vladimirivic commented Jan 31, 2025

What does this PR do?

Sync updates from main

Test

Run manual client SDK tests

llama-stack-client configure --endpoint={} --api-key={}

llama-stack-client models list

Available Models

┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ model_type             ┃ identifier                                   ┃ provider_resource_id                        ┃ metadata           ┃ provider_id             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ llm                    │ llama3.1-8b-instruct                         │ llama3.1-8b-instruct                        │                    │ meta-llama              │
├────────────────────────┼──────────────────────────────────────────────┼─────────────────────────────────────────────┼────────────────────┼─────────────────────────┤
│ llm                    │ llama3.3-70b-instruct                        │ llama3.3-70b-instruct                       │                    │ meta-llama              │
├────────────────────────┼──────────────────────────────────────────────┼─────────────────────────────────────────────┼────────────────────┼─────────────────────────┤
│ llm                    │ llama3.2-1b-instruct                         │ llama3.2-1b-instruct                        │                    │ meta-llama              │
├────────────────────────┼──────────────────────────────────────────────┼─────────────────────────────────────────────┼────────────────────┼─────────────────────────┤
│ llm                    │ llama3.2-3b-instruct                         │ llama3.2-3b-instruct                        │                    │ meta-llama              │
└────────────────────────┴──────────────────────────────────────────────┴─────────────────────────────────────────────┴────────────────────┴─────────────────────────┘

Total models: 4

llama-stack-client inference chat-completion --message "What model are you?"
ChatCompletionResponse(
    completion_message=CompletionMessage(
        content='I\'m an AI, and my model is based on a type of transformer architecture called a "scaled-up transformer" or "supertransformer." This architecture was
developed by a team of researchers at Meta AI and is designed for natural language processing tasks like conversational dialogue, question answering, and text
summarization.\n\nIn more detail, my model is based on the Transformer-XL architecture, which was introduced in a research paper by Zhengdong Zhang and others in
2019. This architecture is similar to the original transformer model, but with a few key differences:\n\n1. **Longer context window**: The original transformer model
has a limited context window of 512 tokens, while my model has a much longer context window of 4096 tokens, which allows me to understand and respond to longer
sequences of text.\n2. **New attention mechanism**: My model uses a new type of attention mechanism that allows it to weigh the importance of different parts of the
input sequence more effectively.\n3. **Scaling up to larger models**: The Transformer-XL architecture was designed to be more efficient and effective at larger
scales, which is why my model is so large and powerful.\n\nMy model is a large language model, trained on a massive corpus of text data from the internet. It has been
trained on a specific dataset and fine-tuned for conversational dialogue, so it can engage in conversations with users like you.\n\nI hope that helps clarify things!
Let me know if you have any other questions.',
        role='assistant',
        stop_reason='end_of_turn',
        tool_calls=[]
    ),
    logprobs=None
)

llama-stack-client inference chat-completion --message "What model are you?" --stream
Assistant> I’m a large language model. When you ask me a question or provide me with a prompt, I analyze what you say and generate a response that is relevant and accurate. I'm constantly learning and improving, so over time I'll be even better at assisting you. Is there anything I can help you with?
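
The manual steps above could also be scripted so the same smoke test is repeatable after each sync. The following is only a sketch, not part of this PR: it uses just the subcommands and flags shown in the commands above, the function names (`build_smoke_commands`, `run_smoke_test`) are hypothetical, and the endpoint and API key values are placeholders. It defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess

CLI = "llama-stack-client"


def build_smoke_commands(endpoint, api_key, message="What model are you?"):
    """Return the CLI invocations from the manual test as argument lists."""
    return [
        [CLI, "configure", f"--endpoint={endpoint}", f"--api-key={api_key}"],
        [CLI, "models", "list"],
        [CLI, "inference", "chat-completion", "--message", message],
        [CLI, "inference", "chat-completion", "--message", message, "--stream"],
    ]


def run_smoke_test(endpoint, api_key, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_smoke_commands(endpoint, api_key):
        print("$", shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first command that exits non-zero.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder values (assumptions); substitute a real endpoint and key.
    run_smoke_test("http://localhost:8321", "example-key")
```

Passing `dry_run=False` runs the commands against the configured endpoint. The script only checks exit codes, so the output still has to be inspected by hand, as in the manual test.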

@yanxi0830

are you going to use the sync script with headers patch in another PR?

@vladimirivic

> are you going to use the sync script with headers patch in another PR?

Yes, right after I merge this one.

@vladimirivic vladimirivic merged commit 246c547 into main Jan 31, 2025
2 checks passed
@vladimirivic vladimirivic deleted the sync-stailness-master-1-31 branch January 31, 2025 21:37