Weak Links in LinkedIn: Enhancing Fake Profile Detection in the Age of LLMs

2507.16860v1

Title#

Weak Links in LinkedIn: Enhancing Fake Profile Detection in the Age of LLMs

Abstract#

Large Language Models (LLMs) have made it easier to create realistic fake profiles on platforms like LinkedIn. This poses a significant risk for text-based fake profile detectors. In this study, we evaluate the robustness of existing detectors against LLM-generated profiles. While highly effective in detecting manually created fake profiles (False Accept Rate: 6-7%), the existing detectors fail to identify GPT-generated profiles (False Accept Rate: 42-52%). We propose GPT-assisted adversarial training as a countermeasure, restoring the False Accept Rate to 1-7% without impacting the False Reject Rate (0.5-2%). Ablation studies reveal that detectors trained on combined numerical and textual embeddings exhibit the highest robustness, followed by those using numerical-only embeddings, and lastly those using textual-only embeddings. Complementary analysis of the detection ability of prompt-based GPT-4Turbo and human evaluators further affirms the need for robust automated detectors such as the one proposed in this study.
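As a rough illustration of the approach the abstract describes, the sketch below trains a detector on combined numerical and textual profile embeddings and applies GPT-assisted adversarial training by adding LLM-generated fake profiles to the training set. The toy profiles, the two numeric features, and the TF-IDF text representation are illustrative assumptions standing in for the paper's actual features and text encoder; this is not the authors' implementation.

```python
# Minimal sketch: fake-profile detector over combined numerical + textual embeddings,
# with GPT-assisted adversarial training (LLM-generated fakes added to the training set).
# All data, feature names, and the TF-IDF encoder are illustrative assumptions.

import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Toy profiles: ([connections, n_skills], "about" text, label) with 1 = fake.
real = [
    ([480, 12], "Data engineer building streaming pipelines for retail analytics.", 0),
    ([150, 7], "Nurse practitioner focused on community health programs.", 0),
]
manual_fakes = [
    ([3, 1], "CEO investor entrepreneur, contact me for an amazing opportunity!!!", 1),
]
gpt_fakes = [  # LLM-generated fakes appended for adversarial training
    ([320, 15], "Seasoned professional passionate about leveraging synergies to drive "
                "innovative, scalable solutions across dynamic industries.", 1),
]

rows = real + manual_fakes + gpt_fakes
num = np.array([r[0] for r in rows], dtype=float)
text = [r[1] for r in rows]
y = np.array([r[2] for r in rows])

scaler = StandardScaler().fit(num)
tfidf = TfidfVectorizer().fit(text)

# Combined numerical + textual embedding (the most robust variant in the ablation);
# dropping either block yields the numerical-only or textual-only baseline.
X = hstack([csr_matrix(scaler.transform(num)), tfidf.transform(text)])

clf = LogisticRegression(max_iter=1000).fit(X, y)

def fake_probability(profile_num, profile_text):
    """Score a single profile; returns the predicted probability that it is fake."""
    x = hstack([csr_matrix(scaler.transform([profile_num])),
                tfidf.transform([profile_text])])
    return clf.predict_proba(x)[0, 1]

print(fake_probability([300, 14], "Visionary thought leader driving transformative growth."))
```

In practice a stronger text encoder (e.g., transformer sentence embeddings) would likely replace TF-IDF, but the structure is the same: numeric and textual blocks are concatenated into one feature vector, and robustness to LLM-generated profiles comes from including such profiles in the training data.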

PDF Access#

View Chinese PDF - 2507.16860v1
