Transformer-Based Auxiliary Loss for Face Recognition Across Age Variations

2412.02198v3

Title

Transformer-Based Auxiliary Loss for Face Recognition Across Age Variations

Abstract

Aging poses a significant challenge for face recognition: changes in skin texture and tone alter facial features over time, which makes it particularly difficult to match images of the same individual taken years apart, as in long-term identification scenarios. Transformer networks are well suited to preserving the sequential spatial relationships that are altered by aging. This paper presents a loss evaluation technique that uses a transformer network as an additive loss in the face recognition domain. A standard metric loss function typically takes the final embedding of the main CNN backbone as its input. Here, we employ a transformer-metric loss, a combined approach that integrates a transformer loss with a metric loss. We analyze how the transformer behaves on the convolution output when the CNN output is arranged as a sequence of vectors. These sequential vectors have the potential to cope with the texture and regional structures affected by aging, such as wrinkles and sagging skin. The transformer encoder takes as input the contextual vectors obtained from the final convolution layer of the network. The learned features can be more age-invariant, complementing the discriminative power of the standard metric loss embedding. Using this technique, we pair the transformer loss with various base metric loss functions to evaluate the effect of the combined loss functions. We observe that this configuration allows the network to achieve state-of-the-art (SoTA) results on LFW and on the age-variant datasets CA-LFW and AgeDB. This research expands the role of transformers in the machine vision domain and opens new possibilities for exploring transformers as a loss function.
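
The data flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. Every name here (`feature_map_to_sequence`, `self_attention`, `combined_loss`, the weight `lam`) is hypothetical, the attention step uses identity projections instead of a learned transformer encoder, and plain softmax cross-entropy stands in for whichever base metric loss is used. The abstract only says the transformer loss is additive, so the weighting factor is an assumption as well.

```python
import math

def feature_map_to_sequence(fmap):
    """Flatten a C x H x W feature map into H*W tokens, each a C-dim vector."""
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[fmap[c][i][j] for c in range(channels)]
            for i in range(height) for j in range(width)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity query/key/value projections (no learned weights),
    used only to show how spatial tokens attend to one another.
    """
    dim = len(tokens[0])
    attended = []
    for query in tokens:
        weights = softmax([sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                           for key in tokens])
        attended.append([sum(w * value[d] for w, value in zip(weights, tokens))
                         for d in range(dim)])
    return attended

def mean_pool(tokens):
    """Average the token sequence into a single embedding."""
    return [sum(t[d] for t in tokens) / len(tokens) for d in range(len(tokens[0]))]

def linear(vec, weight):
    """Class logits from an embedding; `weight` holds one row per identity."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]

def cross_entropy(logits, label):
    """Softmax cross-entropy, a stand-in for the base metric loss."""
    return -math.log(softmax(logits)[label])

def combined_loss(fmap, cnn_embedding, label, w_metric, w_transformer, lam=1.0):
    """Return (total, metric_loss, transformer_loss).

    `lam` is a hypothetical weighting factor; lam=1.0 gives a plain sum.
    """
    # Branch 1: loss on the final CNN embedding.
    metric_loss = cross_entropy(linear(cnn_embedding, w_metric), label)
    # Branch 2: loss on the final conv feature map, read as a token sequence.
    tokens = self_attention(feature_map_to_sequence(fmap))
    transformer_loss = cross_entropy(linear(mean_pool(tokens), w_transformer), label)
    return metric_loss + lam * transformer_loss, metric_loss, transformer_loss

# Toy usage: a 2-channel 2x2 feature map, a 3-dim CNN embedding, 3 identities.
fmap = [[[0.1, 0.2], [0.3, 0.4]], [[0.5, 0.6], [0.7, 0.8]]]
cnn_embedding = [0.2, -0.1, 0.4]
w_metric = [[0.5, 0.0, 0.1], [0.0, 0.3, 0.0], [0.2, 0.1, 0.4]]
w_transformer = [[0.3, 0.1], [0.0, 0.2], [0.1, 0.4]]
total, metric_part, transformer_part = combined_loss(
    fmap, cnn_embedding, label=2, w_metric=w_metric, w_transformer=w_transformer)
```

In a real training setup both branches would share the CNN backbone, so the gradient of the transformer term also shapes the convolutional features. The sketch only shows how the two loss terms are computed and summed.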
