Model Compression & Adaptation Support Grid
The tables below summarize support for compression and adaptation techniques across models. Similar models beyond those listed may also work but have not been tested.
Vision Compression
Note that in the table below, CPU and GPU indicate the target deployment device. PTQ stands for Post-Training Quantization and QAT stands for Quantization-Aware Training.
Model | CPU PTQ - Torch | CPU PTQ - OpenVINO | CPU PTQ - ONNX | GPU PTQ - TensorRT | CPU QAT - Torch | Knowledge Distillation | Structured Pruning | CPU QAT - OpenVINO |
---|---|---|---|---|---|---|---|---|
Resnet (timm) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Convnextv2 (huggingface) | - | ✓ | ✓ | ✓ | - | ✓ | ✓ | ✓ |
Mobilenetv3 (timm) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
DeiT (huggingface) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
VanillaNet (timm) | - | - | ✓ | ✓ | - | ✓ | ✓ | ✓ |
Swin (huggingface) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
YoloX (mmyolo/mmdet) | - | ✓ | - | ✓ | ✓ | ✓ | ✓ | - |
RTMDet (mmyolo/mmdet) | - | ✓ | - | ✓ | ✓ | ✓ | - | - |
Yolov8 (mmyolo) | - | - | - | ✓ | - | ✓ | - | - |
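For readers new to the PTQ columns above, the following is a minimal pure-Python sketch of what post-training quantization does to a tensor: it maps trained float weights to int8 with a single scale factor, with no retraining. The function names (`quantize_symmetric_int8`, `dequantize`) are hypothetical and are not part of this library's API; the Torch, OpenVINO, ONNX, and TensorRT backends use their own calibration procedures and kernels.

```python
def quantize_symmetric_int8(weights):
    """Symmetric per-tensor int8 quantization (illustrative only)."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map int8 values back to approximate floats."""
    return [v * scale for v in q]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale = quantize_symmetric_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

QAT differs in that this rounding is simulated during training so the weights can adapt to it, which is why it appears as a separate set of columns.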
LLM Compression
Model | LLM Quantization | LLM Engine TensorRT | LLM Engine Exllama | LLM Engine MLC-LLM | LLM Structured Pruning |
---|---|---|---|---|---|
LLaMA | ✓ | ✓ | ✓ | ✓ | ✓ |
LLaMA-2 | ✓ | ✓ | ✓ | ✓ | ✓ |
Vicuna | ✓ | ✓ | ✓ | ✓ | ✓ |
Mistral | ✓ | ✓ | ✓ | ✓ | - |
Mixtral | ✓ | ✓ | ✓ | ✓ | - |
Gemma | ✓ | ✓ | - | - | - |
Adapt
LLM Tasks
- Text Generation
- Summarization
- Question Answering
- Text Classification
- Translation
All major Hugging Face models are supported for these tasks.
Image Classification
All major image models on Hugging Face and timm are supported in Adapt.
Object Detection
Model | LoRA | SSF | DoRA | Full Fine Tuning |
---|---|---|---|---|
YoloX | ✓ | ✓ | ✓ | ✓ |
RTMDet | ✓ | ✓ | ✓ | ✓ |
Instance Segmentation
Model | LoRA | SSF | DoRA | Full Fine Tuning |
---|---|---|---|---|
SegNeXT | ✓ | ✓ | ✓ | ✓ |
Pose Detection
Model | LoRA | SSF | DoRA | Full Fine Tuning |
---|---|---|---|---|
RTMO | ✓ | ✓ | ✓ | ✓ |
Note that quantized adaptation (QLoRA/QSSF) of vision models is not currently supported.
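As background for the LoRA column in the tables above, here is a small pure-Python sketch of the low-rank update that LoRA applies to a frozen weight matrix: the effective weight is W + (alpha / r) · B · A, where only A and B are trained. The helper names (`matmul`, `lora_effective_weight`) and the toy values are assumptions made for illustration; they do not reflect how Adapt implements or configures LoRA.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) without modifying W."""
    delta = matmul(b, a)
    s = alpha / rank
    return [[wij + s * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Frozen 3x3 weight (identity, chosen only to keep the numbers readable).
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
a = [[0.1, 0.2, 0.3]]            # r x d_in, rank r = 1
b_init = [[0.0], [0.0], [0.0]]   # d_out x r, zero-initialized

# With B at zero the adapter is a no-op, so fine-tuning starts from W itself.
at_init = lora_effective_weight(w, a, b_init, alpha=2.0, rank=1)

# A hypothetical B after some training steps shifts only the first output row.
b_trained = [[1.0], [0.0], [0.0]]
adapted = lora_effective_weight(w, a, b_trained, alpha=2.0, rank=1)

trainable = len(a) * len(a[0]) + len(b_trained) * len(b_trained[0])
full = len(w) * len(w[0])
print(adapted[0], trainable, full)
```

In this toy case the adapter trains 6 values instead of the 9 in the full matrix; the saving grows quickly with layer size, which is the main reason to choose LoRA over full fine-tuning.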