Comparative analysis on few-shot models performance for improving object detection in the military Domain

Authors

  • Junsub Kim, Funzin
  • Dongnyeok Choi, Funzin

DOI:

https://doi.org/10.37944/jams.v8i1.277

Keywords:

object detection, military domain, images of military vehicles, few-shot learning, model performance evaluation

Abstract

The application of Object Detection (OD) techniques in the military and defense domain is often restricted by stringent security requirements and limited data availability. To overcome these challenges, the present study investigates the potential of Few-Shot Object Detection (FSOD) for military applications. A military vehicle image dataset, composed of real-world defense imagery, was constructed for this purpose. Four representative object detection models—YOLO, DETR, GLIP, and CD-ViTO—were fine-tuned under 1-shot, 5-shot, and 10-shot conditions, and model performance was evaluated using mean Average Precision (mAP). Notably, the CD-ViTO model's cross-domain generalization capability was further examined by comparing its performance on this military dataset against public benchmarks previously used in FSOD studies. Experimental results demonstrate that CD-ViTO achieved superior mAP scores, highlighting the viability of FSOD for efficient and accurate object detection in military and defense applications.
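The 1-, 5-, and 10-shot conditions above refer to fine-tuning on k labeled images per object class. The study's own sampling code is not published; as a rough, hypothetical illustration of how such a k-shot training subset is typically drawn (all image ids and class names below are invented for the example):

```python
import random
from collections import defaultdict

def build_k_shot_subset(annotations, k, seed=0):
    """Sample a k-shot training subset: for each class, keep k images
    that contain at least one instance of that class.

    `annotations` maps image_id -> set of class labels present in the image.
    Returns the selected image ids (one image may cover several classes).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for img, labels in annotations.items():
        for c in labels:
            by_class[c].append(img)
    chosen = set()
    for c, imgs in sorted(by_class.items()):
        pool = sorted(imgs)          # sort before shuffling for reproducibility
        rng.shuffle(pool)
        chosen.update(pool[:k])      # k images per class
    return chosen

# Toy example: 3 vehicle classes over 6 images (hypothetical labels).
anns = {
    "img1": {"tank"}, "img2": {"tank", "truck"},
    "img3": {"truck"}, "img4": {"apc"},
    "img5": {"apc", "tank"}, "img6": {"truck"},
}
subset = build_k_shot_subset(anns, k=1)  # a 1-shot episode
```

Each of the four detectors would then be fine-tuned on such a subset and scored with mAP on a held-out test split; repeating the sampling with different seeds is the usual way to reduce variance in few-shot results.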


Author Biographies

Junsub Kim, Funzin

* (First author) Funzin, Associate Researcher, [email protected], https://orcid.org/0009-0008-3223-4529.

Dongnyeok Choi, Funzin

** (Corresponding author) Funzin, Principal Researcher, [email protected], https://orcid.org/0000-0006-3383-1179.

References

Alawi, A. E. B., & Mohammed, H. M. (2024, August). The Role of YOLOv8 in Enhancing Strategic Military Equipment Detection. In 2024 4th International Conference on Emerging Smart Technologies and Applications (eSmarTA) (pp. 1-5). IEEE. https://doi.org/10.1109/eSmarTA62850.2024.10638856

Baek, J. Y., Park, D. H., Shin, H. J., Yoo, Y. S., Kim, D. W., Hur, D. H., Bae, S. H., Cheon, J. H., & Bae, S. H. (2024). Research on Local and Global Infrared Image Pre-Processing Methods for Deep Learning Based Guided Weapon Target Detection. Journal of The Korea Society of Computer and Information, 29(7), 41-51. https://doi.org/10.9708/jksci.2024.29.07.041

Bulat, A., Guerrero, R., Martinez, B., & Tzimiropoulos, G. (2023). Fs-detr: Few-shot detection transformer with prompting and without re-training. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 11793-11802). https://openaccess.thecvf.com/content/ICCV2023/html/Bulat_FS-DETR_Few-Shot_DEtection_TRansformer_with_Prompting_and_without_Re-Training_ICCV_2023_paper.html

Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., & Zagoruyko, S. (2020, August). End-to-end object detection with transformers. In European conference on computer vision (pp. 213-229). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-58452-8_13

Fu, Y., Wang, Y., Pan, Y., Huai, L., Qiu, X., Shangguan, Z., ... & Jiang, X. (2024, September). Cross-domain few-shot object detection via enhanced open-set object detector. In European Conference on Computer Vision (pp. 247-264). Cham: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-73636-0_15

Jocher, G., Chaurasia, A., & Qiu, J. (2023). YOLO by Ultralytics. Retrieved from https://github.com/ultralytics/ultralytics

Kang, S. H. (2022). Research of Unidentified Tank Classification Using Few-Shot Learning. Summer conference of Korean Institute of Communications and Information Sciences. 392-393. https://www.dbpia.co.kr/journal/articleDetail?dbid=edspia&text=Full+Text+%28DBPIA%29&nodeId=NODE11107752&an=edspia.NODE11107752

Li, L. H., Zhang, P., Zhang, H., Yang, J., Li, C., Zhong, Y., ... & Gao, J. (2022). Grounded language-image pre-training. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 10965-10975). https://openaccess.thecvf.com/content/CVPR2022/html/Li_Grounded_Language-Image_Pre-Training_CVPR_2022_paper.html

Liu, S., Zeng, Z., Ren, T., Li, F., Zhang, H., Yang, J., ... & Zhang, L. (2024, September). Grounding dino: Marrying dino with grounded pre-training for open-set object detection. In European Conference on Computer Vision (pp. 38-55). Cham: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-72970-6_3

Madan, A., Peri, N., Kong, S., & Ramanan, D. (2024). Revisiting few-shot object detection with vision-language models. Advances in Neural Information Processing Systems, 37, 19547-19560. https://doi.org/10.48550/arXiv.2312.14494

Park, C., Lee, S., Choi, H., Kim, D., Jeong, Y., & Paik, J. (2024, January). Enhancing defense surveillance: Few-shot object detection with synthetically generated military data. In 2024 International Conference on Electronics, Information, and Communication (ICEIC) (pp. 1-2). IEEE. https://doi.org/10.1109/ICEIC61013.2024.10457124

Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 779-788). https://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Redmon_You_Only_Look_CVPR_2016_paper.html

Ryu, H., Park, M., & Kim, D. Y. (2024). Object prediction and detection of ground-based weapon with an improved YOLO11 approach: Focusing on assumptions underlying operational environments and UAV-captured features related to PLZ-05 Self-Propelled Howitzer. Journal of Advances in Military Studies, 7(3), 13-30. https://doi.org/10.37944/jams.v7i3.256

Wang, X., Huang, T. E., Darrell, T., Gonzalez, J. E., & Yu, F. (2020). Frustratingly simple few-shot object detection. arXiv preprint arXiv:2003.06957.

Wu, G., Cong, L., Huang, C., Ju, Y., Jiang, J., & Chen, C. (2025, January). Meta-Learning Framework for Effective Few Shot Time Series Prediction. In 2025 IEEE 5th International Conference on Power, Electronics and Computer Applications (ICPECA) (pp. 18-22). IEEE. https://doi.org/10.1109/ICPECA63937.2025.10928851

Yuk, T. K., Oh, S. H., Hwang, S. I., & Jeong, K. (2024). Effective Few-Shot Learning for Military Vehicles Image Classification Using Prompt-Based Learning. Journal of Convergence Security, 24(5), 189-194. https://doi.org/10.33778/kcsa.2024.24.5.189

Figure: Example images of the training dataset.

Published

2025-05-12

How to Cite

Kim, J., & Choi, D. (2025). Comparative analysis on few-shot models performance for improving object detection in the military Domain. Journal of Advances in Military Studies, 8(1), 1-13. https://doi.org/10.37944/jams.v8i1.277