
Multi-level consistency regularization for domain adaptive object detection

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

To improve the adaptability of detectors, most existing domain adaptation algorithms adopt adversarial learning to align feature distributions between the source and target datasets. Unlike previous methods, this work explores the possibility of transferring detectors using only source-domain data and the style information of the target domain. Specifically, we propose three consistency regularizations to enhance the adaptation performance of the detector. First, since the source domain and the synthetic domain share the same image content, the supervision regularization fully exploits the source annotations, which narrows the domain gap and saves labeling costs. Second, the prediction regularization improves the robustness of the detector's category semantics and location awareness across domains. Third, the self-discovering feature regularization directs the detector's attention to object-related regions, which are more discriminative than background noise. In addition, our method can cooperate with classic domain adaptation algorithms to further improve the generalization of the detector, which shows that both the content and the style information of target-domain images are crucial for the transfer process. Extensive experiments have been conducted on multiple detection benchmarks, including the Foggy Cityscapes, Sim10k, KITTI, Clipart, and Watercolor datasets. The favorable performance compared with existing state-of-the-art methods confirms the effectiveness of the proposed consistency regularizations.
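The three regularizations described above can be illustrated as toy loss terms. The following is a minimal pure-Python sketch in which every function name, input shape, and weighting factor is an illustrative assumption, not the paper's actual formulation:

```python
import math

def supervision_loss(probs_synthetic, labels):
    """Cross-entropy on the style-transferred (synthetic) domain, reusing
    the source-domain annotations, since both domains share image content.
    probs_synthetic: list of per-class probability lists; labels: int indices."""
    eps = 1e-8
    return -sum(math.log(p[y] + eps)
                for p, y in zip(probs_synthetic, labels)) / len(labels)

def prediction_consistency(cls_src, cls_syn, box_src, box_syn):
    """Penalize disagreement in class scores and box regressions between a
    source image and its style-transferred counterpart."""
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return mse(cls_src, cls_syn) + mse(box_src, box_syn)

def feature_consistency(feat_src, feat_syn, object_mask):
    """Align features only inside object-related regions (mask is 1 on
    objects, 0 on background), so alignment ignores background noise."""
    num = sum(m * (a - b) ** 2
              for a, b, m in zip(feat_src, feat_syn, object_mask))
    return num / (sum(object_mask) + 1e-8)

def total_loss(det_loss, sup, pred, feat, lam_pred=1.0, lam_feat=1.0):
    """Combine the base detection loss with the three regularizers;
    the weighting factors are hypothetical."""
    return det_loss + sup + lam_pred * pred + lam_feat * feat
```

In practice such terms would operate on real detector outputs and feature maps in a framework like PyTorch, with the weights tuned per benchmark; this sketch only shows how the three consistency signals could be combined into one objective.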


Data Availability

The datasets analyzed during the current study are available at the following locations: Cityscapes and Foggy Cityscapes at https://www.cityscapes-dataset.com/; KITTI at http://www.cvlibs.net/datasets/kitti/index.php; Sim10k at https://fcav.engin.umich.edu/projects/driving-in-the-matrix; PASCAL VOC at http://host.robots.ox.ac.uk/pascal/VOC/; and Clipart and Watercolor at http://www.hal.t.u-tokyo.ac.jp/~inoue/projects/cross_domain_detection/datasets/.


Funding

This research was supported by the National Key Research and Development Program of China under Grant No. 2018AAA0100400, and the National Natural Science Foundation of China under Grants 91646207, 62076242, 62071466, and 61976208.

Author information


Corresponding author

Correspondence to Ying Wang.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Informed consent

All authors gave informed consent.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Tian, K., Zhang, C., Wang, Y. et al. Multi-level consistency regularization for domain adaptive object detection. Neural Comput & Applic 35, 18003–18018 (2023). https://doi.org/10.1007/s00521-023-08677-9

