Next entry: Jia Li, Yin Chen, Xuesong Zhang, et al. Multimodal feature extraction and fusion for emotional reaction intensity estimation and expression classification in videos with transformers[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. 2023: 5837-5843.