Add Vision Transformer demo for image classification (Fixes #13372) #13538
Open
kdt523 wants to merge 7 commits into TheAlgorithms:master
Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.
algorithms-keeper commands and options
algorithms-keeper actions can be triggered by commenting on this PR:
@algorithms-keeper review: trigger the checks for only the added pull request files.
@algorithms-keeper review-all: trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.
NOTE: Commands are in beta, so this feature is restricted to members or owners of the organization.
    return result


def main() -> None:
As there is no test file in this pull request nor any test function or class in the file computer_vision/vision_transformer_demo.py, please provide doctest for the function main
Author
@algorithms-keeper
Thank you for the review! I've added a doctest to the main() function. The test verifies the function is callable. I used +SKIP for the execution test since main() requires network access and model downloads.
Let me know if you need any other changes.
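For readers unfamiliar with the directive mentioned in the reply, here is a minimal sketch of how a doctest can check that a function is callable while skipping the call itself. This is not the code from the pull request: the body of `main` is a hypothetical stand-in that raises instead of downloading a model, and `run_examples` is a helper name invented for this illustration.

```python
import doctest


def main() -> None:
    """
    Hypothetical stand-in for the demo's entry point (not the PR's code).

    >>> callable(main)
    True
    >>> main()  # doctest: +SKIP
    """
    # Stand-in for work that would need network access and a model download.
    raise RuntimeError("network access and model download required")


def run_examples() -> doctest.TestResults:
    """Run the examples in main's docstring and return the summary counts."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Pass globs explicitly so the examples can resolve the name `main`.
    for test in finder.find(main, name="main", globs={"main": main}):
        runner.run(test)
    return runner.summarize(verbose=False)
```

Because the second example carries `+SKIP`, the doctest runner never calls `main()`, so the stand-in's exception is not raised and only the `callable` check is executed.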
This pull request adds a Vision Transformer (ViT) demo for image classification to the computer_vision module. The new script demonstrates how to use a pre-trained ViT model from Hugging Face Transformers to classify images from either a URL or a local file. The implementation includes full type hints, comprehensive docstrings, and working doctests that pass automated testing. The code follows all repository conventions for naming, structure, and documentation, and includes references to the original ViT paper and Hugging Face documentation.
References:
https://arxiv.org/abs/2010.11929
https://huggingface.co/docs/transformers/model_doc/vit
Fixes: #13372
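The workflow summarized in the description above (load a pre-trained ViT from Hugging Face Transformers, then classify an image given as a URL or a local path) might look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of computer_vision/vision_transformer_demo.py: the function names `is_url` and `classify_image` are hypothetical, the checkpoint `google/vit-base-patch16-224` is assumed, and `requests`, `Pillow`, `torch`, and `transformers` are assumed to be installed.

```python
from io import BytesIO
from urllib.parse import urlparse


def is_url(source: str) -> bool:
    """Return True when source looks like an http(s) URL rather than a local path."""
    return urlparse(source).scheme in ("http", "https")


def classify_image(source: str, model_name: str = "google/vit-base-patch16-224") -> str:
    """Return the predicted label for an image URL or local file (hypothetical helper)."""
    # Third-party imports live inside the function so that importing this
    # module does not require the heavy dependencies.
    import requests
    import torch
    from PIL import Image
    from transformers import ViTForImageClassification, ViTImageProcessor

    if is_url(source):
        response = requests.get(source, timeout=10)
        response.raise_for_status()
        image = Image.open(BytesIO(response.content))
    else:
        image = Image.open(source)
    image = image.convert("RGB")

    # Assumption: the checkpoint is downloaded from the Hugging Face Hub on first use.
    processor = ViTImageProcessor.from_pretrained(model_name)
    model = ViTForImageClassification.from_pretrained(model_name)

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Map the highest-scoring class index to its human-readable label.
    return model.config.id2label[logits.argmax(-1).item()]
```

Calling `classify_image` with either an image URL or a local path should return one of the label strings defined by the checkpoint; the first call needs network access to fetch the weights.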