Invotet SDK
A unified Python SDK that ingests PyTorch, ONNX, and HuggingFace checkpoints, quantizes them for Invotet modules, and ships a deterministic runtime to the device. No CUDA in the loop.
Frameworks
Compile any of the three; ship one runtime
PyTorch
Trace or torch.export checkpoints compile directly with no rewrite (sketched after this list).
ONNX
Standards-based interchange — compile any ONNX-exported model.
HuggingFace
transformers checkpoints land on Invotet through a one-line loader.
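For illustration, here is how the PyTorch and ONNX paths might look. Only load_hf is confirmed by the example below; load_torch and load_onnx are assumed loader names, sketched here to mirror it, and the checkpoint paths are placeholders.

from invotet import Module, load_torch, load_onnx

# Assumed loaders: load_torch / load_onnx are hypothetical names
# mirroring the load_hf call shown in the example further down.

# PyTorch path: a traced or torch.export-ed checkpoint compiles directly
model = load_torch("checkpoints/llama3-8b.pt2", quantization="int8")

# ONNX path: any ONNX-exported model goes through the same compiler
# model = load_onnx("checkpoints/llama3-8b.onnx", quantization="int8")

with Module.connect() as device:
    device.deploy(model)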
Five lines, one runtime
Whether it comes from PyTorch, ONNX, or HuggingFace, your checkpoint is quantized for the target module and runs through a deterministic on-device runtime over USB or PCIe.
from invotet import Module, load_hf

# Compile a HuggingFace checkpoint for Invotet modules
model = load_hf("meta-llama/Llama-3-8B-Instruct", quantization="int8")

# Open the on-device runtime over USB or PCIe
with Module.connect() as device:
    device.deploy(model)
    response = device.generate(
        "Plan a four-waypoint patrol around this perimeter.",
        max_tokens=256,
    )
    print(response)

Models supported
Family 01
Large Language Models
Llama, Mistral, Qwen, Phi — instruction-tuned and base checkpoints up to ~13 B parameters running at conversational latency.
Family 02
Vision-Language Models
LLaVA, Qwen-VL, and Gemma-Vision class models — image + text reasoning on-platform without a downlink.
Family 03
Vision Models
YOLO, DETR, SAM, and custom detectors. Run side-by-side with an LLM in the same module; a sketch follows this list.
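A minimal sketch of the two claims above: image + text reasoning on-platform, and a detector co-resident with a language model. Only load_hf, Module.connect, deploy, and generate are confirmed by the example above; the second deploy, the detect call, the images argument, and the checkpoint names are assumptions made purely for illustration.

from invotet import Module, load_hf

# Assumed APIs: detect() and the images= argument are hypothetical;
# the checkpoint names are illustrative, not a supported-models list.
vlm = load_hf("llava-hf/llava-1.5-7b-hf", quantization="int8")
detector = load_hf("ultralytics/yolov8n", quantization="int8")

with Module.connect() as device:
    device.deploy(vlm)
    device.deploy(detector)  # assumed: a second deploy co-locates the model

    # Detector output stays on-platform; no downlink required
    boxes = device.detect("frames/perimeter_cam.jpg")
    print(len(boxes), "detections")

    # Hypothetical multimodal call: image and text in one prompt
    response = device.generate(
        "Summarize any intrusions visible in this frame.",
        images=["frames/perimeter_cam.jpg"],
        max_tokens=128,
    )
    print(response)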
The SDK is in early access while the toolchain matures. Tell us your framework, the checkpoint family, and the target module — we will get you provisioned.