Model Compression & Adaptation Support Grid

The tables below summarize support for compression and adaptation techniques across models. Similar models that are not listed may also work but have not been tested.

Vision Compression

In the table below, CPU and GPU indicate the target deployment device. PTQ stands for Post-Training Quantization and QAT for Quantization-Aware Training.

Model	CPU PTQ - Torch	CPU PTQ - OpenVINO	CPU PTQ - ONNX	GPU PTQ - TensorRT	CPU QAT - Torch	Knowledge Distillation	Structured Pruning	CPU QAT - OpenVINO
Resnet (timm)	✓	✓	✓	✓	✓	✓	✓	✓
Convnextv2 (huggingface)	-	✓	✓	✓	-	✓	✓	✓
Mobilenetv3 (timm)	✓	✓	✓	✓	✓	✓	✓	✓
DeiT (huggingface)	✓	✓	✓	✓	✓	✓	✓	✓
VanillaNet (timm)	-	-	✓	✓	-	✓	✓	✓
Swin (huggingface)	✓	✓	✓	✓	✓	✓	✓	✓
YoloX (mmyolo/mmdet)	-	✓	-	✓	✓	✓	✓	-
RTMDet (mmyolo/mmdet)	-	✓	-	✓	✓	✓	-	-
Yolov8 (mmyolo)	-	-	-	✓	-	✓	-	-
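
To check combinations programmatically, the vision table above can be transcribed into a small lookup structure. The following is only an illustrative sketch: `COLUMNS`, `ROWS`, `supported_techniques`, and `models_supporting` are hypothetical names invented for this example and are not part of the toolkit's API. Each row string is copied from the table, with "y" standing for ✓ and "-" for a "-" cell.

```python
# Hypothetical lookup helper transcribed from the Vision Compression table.
# None of these names come from the toolkit itself.

COLUMNS = [
    "CPU PTQ - Torch",
    "CPU PTQ - OpenVINO",
    "CPU PTQ - ONNX",
    "GPU PTQ - TensorRT",
    "CPU QAT - Torch",
    "Knowledge Distillation",
    "Structured Pruning",
    "CPU QAT - OpenVINO",
]

# One character per column, in COLUMNS order: "y" = ✓ in the table, "-" = "-".
ROWS = {
    "Resnet (timm)": "yyyyyyyy",
    "Convnextv2 (huggingface)": "-yyy-yyy",
    "Mobilenetv3 (timm)": "yyyyyyyy",
    "DeiT (huggingface)": "yyyyyyyy",
    "VanillaNet (timm)": "--yy-yyy",
    "Swin (huggingface)": "yyyyyyyy",
    "YoloX (mmyolo/mmdet)": "-y-yyyy-",
    "RTMDet (mmyolo/mmdet)": "-y-yyy--",
    "Yolov8 (mmyolo)": "---y-y--",
}


def supported_techniques(model):
    """Return the techniques marked ✓ for the given model row."""
    marks = ROWS[model]
    return [col for col, mark in zip(COLUMNS, marks) if mark == "y"]


def models_supporting(technique):
    """Return the models marked ✓ in the given technique column."""
    idx = COLUMNS.index(technique)
    return [model for model, marks in ROWS.items() if marks[idx] == "y"]


if __name__ == "__main__":
    print(supported_techniques("Yolov8 (mmyolo)"))
    print(models_supporting("CPU PTQ - ONNX"))
```

Keeping the rows as compact strings makes it easy to compare the sketch against the table by eye. If the table changes, the strings need to be updated by hand.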

LLM Compression

Model	LLM Quantization	LLM Engine TensorRT	LLM Engine Exllama	LLM Engine MLC-LLM	LLM Structured Pruning
LLaMA	✓	✓	✓	✓	✓
LLaMA-2	✓	✓	✓	✓	✓
Vicuna	✓	✓	✓	✓	✓
Mistral	✓	✓	✓	✓	-
Mixtral	✓	✓	✓	✓	-
Gemma	✓	✓	-	-	-

Adapt

LLM Tasks

Text Generation
Summarization
Question Answering
Text Classification
Translation

All major Hugging Face models are supported for these tasks.

Image Classification

All major image models on Hugging Face and timm are supported in Adapt.

Object Detection

Model	LoRA	SSF	DoRA	Full Fine Tuning
YoloX	✓	✓	✓	✓
RTMDet	✓	✓	✓	✓

Instance Segmentation

Model	LoRA	SSF	DoRA	Full Fine Tuning
SegNeXT	✓	✓	✓	✓

Pose Detection

Model	LoRA	SSF	DoRA	Full Fine Tuning
RTMO	✓	✓	✓	✓

Note that quantized adaptation (QLoRA/QSSF) of vision models is not currently supported.