Table 8. Comparison of performance and model specifics, including the number of parameters (# Params), FLOPS, and inference time in seconds per iteration (s/it), across different backbones and methods
From: Matching Compound Prototypes for Few-Shot Action Recognition
| Method | Backbone | Object | # Params | FLOPS-b | FLOPS-m | FLOPS-o | Inference time (s/it) | SSv2-Small | SSv2-Full | Kinetics |
|---|---|---|---|---|---|---|---|---|---|---|
| MatchNet | ResNet-50 | / | 24.6M | 33.0G | 0 | 0 | 0.4 | 34.9 | 35.1 | 54.6 |
| TRX (Perrett et al., 2021) | ResNet-50 | / | 27.2M | 33.0G | 10.57G | 0 | 0.8 | 37.1 | 41.5 | 64.6 |
| ITA-Net (Zhang et al., 2021b) | ResNet-50 | / | 30.9M | 33.0G | 11.3G | 0 | 0.9 | 38.4 | 46.1 | 72.6 |
| Ours- | ResNet-50 | / | 32.0M | 33.0G | 2.2G | 0 | 0.6 | 38.9 | 49.3 | 73.3 |
| Ours-ms | ResNet-50 | / | 39.8M | 33.0G | 8.82G | 0 | 0.8 | 42.6 | 52.3 | 74.0 |
| Ours-ms | ResNet-18 | / | 26.7M | 15.6G | 8.82G | 0 | 0.6 | 40.8 | 50.2 | 71.4 |
| Ours-ms | DenseNet | / | 23.2M | 26.0G | 8.82G | 0 | 0.7 | 41.0 | 50.7 | 71.7 |
| Ours-obj | ResNet-50 | 41.8M | 37.2M | 33.0G | 8.06G | 3T | 3.3 | 57.1 | 59.6 | 81.0 |
| Ours-obj | ResNet-18 | 41.8M | 24.1M | 15.6G | 8.06G | 3T | 3.2 | 53.4 | 56.2 | 77.3 |
| Ours-obj | DenseNet | 41.8M | 20.6M | 26.0G | 8.06G | 3T | 3.2 | 53.7 | 56.5 | 77.6 |