A Simple Baseline for Audio-Visual Scene-Aware Dialog | IEEE Conference Publication | IEEE Xplore