LP ONNX on Android #2689
Conversation
GemmaParis left a comment
The first chapter, the "theory" chapter, is perhaps too long, but I like your style of writing and the content it conveys. Let's see what the LP team thinks of this. The rest looks great! I have picked up the need to upgrade from "Arm Compute Library" to "Arm Kleidi Kernels". See comments.
| 1. Cross-platform support. ORT runs on Windows, Linux, macOS, and mobile operating systems like Android and iOS. It has first-class support for both x86 and Arm64 architectures, making it ideal for deployment on devices ranging from cloud servers to Raspberry Pi boards and smartphones.
| 2. Hardware acceleration. ORT integrates with a wide range of execution providers (EPs) that tap into hardware capabilities:
| * Arm NEON / Arm Compute Library for efficient CPU execution on Arm64.
I would say "Arm Kleidi kernels accelerated with Arm Neon, SVE2 and SME2, for efficient CPU execution on Arm64"
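For reference, whichever CPU kernels back it, the execution providers compiled into a given ONNX Runtime build can be listed from Python. A minimal sketch (nothing here is specific to this Learning Path):

```python
import onnxruntime as ort

# Execution providers available in this ONNX Runtime build.
# A standard Arm64 CPU wheel typically reports at least 'CPUExecutionProvider';
# accelerated EPs only appear in builds that include them.
print(ort.get_available_providers())
```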
| A typical ONNX workflow looks like this:
| 1. Train the model. You first use your preferred framework (e.g., PyTorch, TensorFlow, or scikit-learn) to design and train a model. At this stage, you benefit from the flexibility and ecosystem of the framework of your choice.
| 2. Export to ONNX. Once trained, the model is exported into the ONNX format using built-in converters (such as torch.onnx.export for PyTorch). This produces a portable .onnx file describing the network architecture, weights, and metadata.
| 3. Run inference with ONNX Runtime. The ONNX model can now be executed on different devices using ONNX Runtime. On Arm64 hardware, ONNX Runtime takes advantage of the Arm Compute Library and NEON instructions, while on Android devices it can leverage NNAPI for mobile accelerators.
"Arm Kleidi kernels accelerated with NEON, SVE2 and SME2 instructions"
Added draft status to ONNX topic and updated metadata.
merging into main for tech review (merged commit 8001c6e into ArmDeveloperEcosystem:main)
| ## Choosing the hardware
| You can choose a variety of hardware, including:
| * Edge boards (Linux/Arm64) - Raspberry Pi 4/5 (64-bit OS), Jetson (Arm64 CPU; GPU via CUDA if using NVIDIA stack), Arm servers (e.g., AWS Graviton).
| * Apple Silicon (macOS/Arm64) - Great for development, deploy to Arm64 Linux later.
Hi @dawidborycki - I'm modifying the hardware selection for the host development machine to be an Arm Linux based machine. I tested this on macOS and got:

(venv) parver01@KWJY1XP2MT ~ % pip3 install onnx onnxruntime onnxscript netron numpy
Collecting onnx
  Using cached onnx-1.20.1-cp312-abi3-macosx_12_0_universal2.whl.metadata (8.4 kB)
ERROR: Could not find a version that satisfies the requirement onnxruntime (from versions: none)
ERROR: No matching distribution found for onnxruntime
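Once the packages do install on an Arm Linux host, a quick sanity check along these lines could confirm the wheels import and report the CPU execution provider (just a sketch, assuming the install succeeded):

```python
# Run after `pip3 install onnx onnxruntime` succeeds on the Arm64 Linux host.
import onnx
import onnxruntime as ort

print("onnx", onnx.__version__)
print("onnxruntime", ort.__version__)
print("providers:", ort.get_available_providers())
```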
| Our end goal is a camera-to-solution Sudoku app that runs efficiently on Arm64 devices (e.g., Raspberry Pi or Android phones). ONNX is the glue: we’ll train the digit recognizer in PyTorch, export it to ONNX, and run it anywhere with ONNX Runtime (CPU EP on edge devices, NNAPI EP on Android). Everything around the model—grid detection, perspective rectification, and solving—stays deterministic and lightweight.
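A hedged sketch of the CPU EP / NNAPI EP split mentioned in the quoted paragraph: prefer NNAPI when the running ONNX Runtime build ships it (Android builds do), otherwise fall back to the CPU EP. The model file name is a placeholder.

```python
import onnxruntime as ort

# Prefer NNAPI when this ONNX Runtime build includes it (Android builds);
# otherwise fall back to the default CPU execution provider.
preferred = ["NnapiExecutionProvider", "CPUExecutionProvider"]
available = ort.get_available_providers()
providers = [p for p in preferred if p in available]

# "sudoku_digits.onnx" is a placeholder for the exported digit recognizer.
session = ort.InferenceSession("sudoku_digits.onnx", providers=providers)
print("Running with:", session.get_providers())
```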
| ## Objective
| In this step, we will generate a custom dataset of Sudoku puzzles and their digit crops, which we’ll use to train a digit recognition model. Starting from a Hugging Face parquet dataset that provides paired puzzle/solution strings, we transform raw boards into realistic, book-style Sudoku pages, apply camera-like augmentations to mimic mobile captures, and automatically slice each page into 81 labeled cell images. This yields a large, diverse, perfectly labeled set of digits (0–9 with 0 = blank) without manual annotation. By the end, you’ll have a structured dataset ready to train a lightweight model in the next section.
Hi @dawidborycki can you please point me to the Hugging Face parquet dataset that you used. I'll modify the instructions to point the developer to the exact dataset that needs to be downloaded before running prepare data.
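On the quoted step that slices each rendered page into 81 labeled cell images, here is a minimal sketch of the idea, assuming the page has already been rectified to a square grid (the Learning Path's actual script may differ):

```python
import numpy as np

def slice_into_cells(page: np.ndarray, grid_size: int = 9) -> list:
    """Split a rectified Sudoku page image into grid_size * grid_size cell crops."""
    h, w = page.shape[:2]
    cell_h, cell_w = h // grid_size, w // grid_size
    cells = []
    for row in range(grid_size):
        for col in range(grid_size):
            cells.append(page[row * cell_h:(row + 1) * cell_h,
                              col * cell_w:(col + 1) * cell_w])
    return cells

# Example: an 81-element list of 50x50 crops from a 450x450 page.
cells = slice_into_cells(np.zeros((450, 450), dtype=np.uint8))
print(len(cells), cells[0].shape)
```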
Before submitting a pull request for a new Learning Path, please review Create a Learning Path
Please do not include any confidential information in your contribution. This includes confidential microarchitecture details and unannounced product information.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the Creative Commons Attribution 4.0 International License.