TypoSwype: An Imaging Approach to Detect Typo-Squatting

Paper Title

TypoSwype: An Imaging Approach to Detect Typo-Squatting

Authors

Lee, Joon Sern; David, Yam Gui Peng

Abstract

Typo-squatting is a common cyber-attack technique. It involves using domain names that exploit likely typographical errors in commonly visited domains in order to carry out malicious activities such as phishing and malware installation. Current detection approaches typically revolve around string comparison algorithms such as the Damerau-Levenshtein Distance (DLD) algorithm. Such techniques do not take into account keyboard distance, which researchers have found to correlate strongly with typical typographical errors and are now trying to incorporate. In this paper, we present the TypoSwype framework, which converts strings to images that inherently take keyboard location into account. We also show how modern state-of-the-art image recognition techniques involving Convolutional Neural Networks, trained via either Triplet Loss or NT-Xent Loss, can be applied to learn a mapping to a lower-dimensional space where distances correspond to image similarity and, equivalently, textual similarity. Finally, we demonstrate our method's ability to improve typo-squatting detection over the widely used DLD algorithm, while maintaining classification accuracy as to which domain the input domain was attempting to typo-squat.
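
To make the string-to-image idea concrete, the following is a minimal sketch of how a domain label might be rasterised as a path across a keyboard. It is not the authors' implementation: the QWERTY row offsets, the grid resolution, the line-drawing scheme, and the function names `key_positions`, `string_to_image`, and `pixel_overlap` are all assumptions made for illustration. The pixel-overlap score at the end is a crude stand-in for the learned CNN embedding distance that the abstract describes.

```python
# Hypothetical sketch: draw a string as a path over a QWERTY layout.
# Layout offsets, grid size and rasterisation are illustrative assumptions,
# not the settings used in the paper.

QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
ROW_OFFSETS = [0.0, 0.5, 1.0]  # assumed horizontal stagger, in key widths


def key_positions():
    """Map each letter to an (x, y) key centre measured in key widths."""
    positions = {}
    for row_index, (row, offset) in enumerate(zip(QWERTY_ROWS, ROW_OFFSETS)):
        for col_index, char in enumerate(row):
            positions[char] = (col_index + offset, float(row_index))
    return positions


def string_to_image(text, scale=4):
    """Return a 2D list of 0/1 pixels tracing the path between successive keys."""
    positions = key_positions()
    width, height = 10 * scale + 1, 2 * scale + 1
    image = [[0] * width for _ in range(height)]
    # Characters without a key in the layout (digits, dots, hyphens) are skipped.
    points = [positions[c] for c in text.lower() if c in positions]
    if not points:
        return image
    # Light the first key so single-letter strings still produce a pixel.
    image[round(points[0][1] * scale)][round(points[0][0] * scale)] = 1
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Sample the segment densely enough that no pixel along it is skipped.
        steps = max(1, int(2 * scale * max(abs(x1 - x0), abs(y1 - y0))))
        for step in range(steps + 1):
            t = step / steps
            px = round((x0 + t * (x1 - x0)) * scale)
            py = round((y0 + t * (y1 - y0)) * scale)
            image[py][px] = 1
    return image


def pixel_overlap(image_a, image_b):
    """Jaccard overlap of lit pixels; a simple proxy, not a learned embedding."""
    lit_a = {(r, c) for r, row in enumerate(image_a) for c, v in enumerate(row) if v}
    lit_b = {(r, c) for r, row in enumerate(image_b) for c, v in enumerate(row) if v}
    union = lit_a | lit_b
    return len(lit_a & lit_b) / len(union) if union else 1.0


if __name__ == "__main__":
    target = string_to_image("google")
    for candidate in ("googlr", "googlp"):
        score = pixel_overlap(target, string_to_image(candidate))
        print(candidate, round(score, 3))
```

Both `googlr` and `googlp` are a single substitution away from `google`, so an edit-distance measure such as DLD cannot tell them apart. In this sketch the path for `googlr` overlaps the target more, because `r` sits next to `e` on the keyboard while `p` does not, which is the kind of keyboard-aware signal the framework is designed to capture.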
