Update cost-tracking for OAI chatcompletions and response API #260
Conversation
Review by Korbit AI
Korbit automatically attempts to detect when you fix issues in new commits.
| Issue | Status |
|---|---|
| Incomplete API type detection logic | ✅ Fix detected |
| Inconsistent dictionary key names causing AttributeError | ✅ Fix detected |
| Missing Response Context in Log | |
| Redundant Token Extraction Logic | |
Files scanned
| File Path | Reviewed |
|---|---|
| src/agentlab/llm/tracking.py | ✅ |
src/agentlab/llm/tracking.py (Outdated)

```python
if 'prompt_tokens_details' in usage:
    usage['cached_tokens'] = usage['prompt_token_details'].cached_tokens
```
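This is the diff behind the "Inconsistent dictionary key names" issue: the membership check uses `prompt_tokens_details`, but the lookup uses `prompt_token_details`, so the access fails at runtime. A minimal sketch of the consistent-key version, assuming `usage` is a plain dict and the details object exposes a `cached_tokens` attribute:

```python
# Hedged sketch, not the PR's final code: read and write the same key.
if "prompt_tokens_details" in usage:
    usage["cached_tokens"] = usage["prompt_tokens_details"].cached_tokens
```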
src/agentlab/llm/tracking.py (Outdated)

```python
if usage is None:
    logging.warning("No usage information found in the response. Defaulting cost to 0.0.")
    return 0.0
api_type = 'chatcompletion' if hasattr(usage, "prompt_tokens_details") else 'response'
```
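The last line above drives the "Incomplete API type detection logic" issue: any usage object lacking `prompt_tokens_details` is silently classified as `response`, even if it matches neither API. A hedged sketch of a stricter check (`detect_api_type` is a hypothetical helper, not code from this PR):

```python
def detect_api_type(usage):
    """Classify a usage object by its shape; return None when unknown."""
    if hasattr(usage, "prompt_tokens_details"):
        return "chatcompletion"  # Chat Completions-style usage
    if hasattr(usage, "input_tokens_details"):
        return "response"  # Responses API-style usage
    return None  # unrecognized shape; the caller can warn and default to 0.0
```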
src/agentlab/llm/tracking.py (Outdated)

```python
if api_type == 'chatcompletion':
    total_input_tokens = usage.prompt_tokens
    output_tokens = usage.completion_tokens
    cached_input_tokens = usage.prompt_tokens_details.cached_tokens
    non_cached_input_tokens = total_input_tokens - cached_input_tokens
elif api_type == 'response':
    total_input_tokens = usage.input_tokens
    output_tokens = usage.output_tokens
    cached_input_tokens = usage.input_tokens_details.cached_tokens
    non_cached_input_tokens = total_input_tokens - cached_input_tokens
```
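Both branches produce the same four quantities, which then feed the effective-cost computation. To make the role of the cached/non-cached split concrete, here is a hedged arithmetic sketch; the per-token prices and the cached-token discount are illustrative assumptions, not values from this repository:

```python
# Illustrative prices only: cached input tokens are typically billed at a
# discount relative to non-cached input tokens.
input_price = 2.5e-06           # assumed $ per non-cached input token
cached_price = input_price / 4  # assumed cached-token discount
output_price = 1.0e-05          # assumed $ per output token

effective_cost = (
    non_cached_input_tokens * input_price
    + cached_input_tokens * cached_price
    + output_tokens * output_price
)
```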
Redundant Token Extraction Logic
What is the issue?
Duplicated token extraction logic with only attribute names differing between API types.
Why this matters
The duplicated structure makes the code harder to maintain and obscures the fact that both branches follow the same pattern.
Suggested change
```python
TOKEN_MAPPINGS = {
    'chatcompletion': {
        'total_tokens': 'prompt_tokens',
        'output_tokens': 'completion_tokens',
        'details_attr': 'prompt_tokens_details',
    },
    'response': {
        'total_tokens': 'input_tokens',
        'output_tokens': 'output_tokens',
        'details_attr': 'input_tokens_details',
    },
}

mapping = TOKEN_MAPPINGS.get(api_type)
if mapping:
    total_input_tokens = getattr(usage, mapping['total_tokens'])
    output_tokens = getattr(usage, mapping['output_tokens'])
    details = getattr(usage, mapping['details_attr'])
    cached_input_tokens = details.cached_tokens
    non_cached_input_tokens = total_input_tokens - cached_input_tokens
```
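A quick usage example for the suggested mapping, with `types.SimpleNamespace` standing in for the SDK's usage object (the token counts below are made up):

```python
from types import SimpleNamespace

# Hypothetical Chat Completions-style usage object.
usage = SimpleNamespace(
    prompt_tokens=1200,
    completion_tokens=300,
    prompt_tokens_details=SimpleNamespace(cached_tokens=800),
)
api_type = 'chatcompletion'

# Running the mapping logic above on this object yields:
#   total_input_tokens = 1200, output_tokens = 300,
#   cached_input_tokens = 800, non_cached_input_tokens = 400
```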
This pull request refines token usage tracking and cost calculation logic in the `src/agentlab/llm/tracking.py` file. The changes improve handling of cached tokens and introduce better differentiation between API types (`chatcompletion` vs. `response`) for more accurate cost computation.

Enhancements to token usage tracking:

- `src/agentlab/llm/tracking.py`: Updated the `__call__` method to extract and include `cached_tokens` from `prompt_tokens_details` or `input_tokens_details` in the usage dictionary, ensuring more precise tracking of cached token usage.

Improvements to cost calculation logic:

- `src/agentlab/llm/tracking.py`: Refactored the `get_effective_cost_from_openai_api` method to handle two distinct API types (`chatcompletion` and `response`). Added logic to compute `cached_input_tokens` and `non_cached_input_tokens` separately for each type, improving the accuracy of effective cost calculations. Introduced warnings for missing usage information or unsupported API types.

Description by Korbit AI
What change is being made?
Update cost-tracking logic for OpenAI chatcompletion and response APIs to account for token caching and improve cost calculation accuracy.
Why are these changes being made?
The change introduces handling for `prompt_tokens_details` and `input_tokens_details` to accurately assess cached tokens, addressing previous shortcomings in cost estimation by accounting for the difference between cached and non-cached tokens. This ensures more precise cost tracking, adds fallback handling for missing API usage information, and logs unsupported API types to prevent miscalculations.
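Putting the description together, here is a hedged end-to-end sketch of the flow it outlines. This is a reconstruction under the PR's stated behavior, not the repository's actual code, and the `prices` dict is an assumed parameter:

```python
import logging

def get_effective_cost_from_openai_api(usage, prices):
    """Hedged reconstruction of the described cost flow; not the repo's code."""
    if usage is None:
        logging.warning("No usage information found in the response. Defaulting cost to 0.0.")
        return 0.0
    if hasattr(usage, "prompt_tokens_details"):  # Chat Completions shape
        total, output = usage.prompt_tokens, usage.completion_tokens
        cached = usage.prompt_tokens_details.cached_tokens
    elif hasattr(usage, "input_tokens_details"):  # Responses API shape
        total, output = usage.input_tokens, usage.output_tokens
        cached = usage.input_tokens_details.cached_tokens
    else:
        logging.warning("Unsupported API type; defaulting cost to 0.0.")
        return 0.0
    non_cached = total - cached
    # 'prices' is assumed to hold per-token rates, e.g.
    # {'input': ..., 'cached_input': ..., 'output': ...}
    return (
        non_cached * prices["input"]
        + cached * prices["cached_input"]
        + output * prices["output"]
    )
```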