Audio-Visual Transformer Based Crowd Counting | IEEE Conference Publication | IEEE Xplore