Chinese Journal of Electronics, 2022, 31(4): 604-611.
doi: 10.1049/cje.2021.00.139
Abstract:
Similarity detection between cross-platform binary functions has been applied in many fields, such as vulnerability detection, software copyright protection, and malware classification. Current advanced methods for binary function similarity detection usually rely on semantic features, but they have certain limitations. For example, practical applications may encounter instructions that were not seen during training, which easily causes the out-of-vocabulary (OOV) problem. In addition, the extracted binary semantic features may generalize poorly, resulting in lower accuracy of the trained model in practical applications. To overcome these limitations, we propose a double-layer positional encoding based transformer model (DP-Transformer). The DP-Transformer's encoder extracts the semantic features of the source instruction set architecture (ISA) and is therefore called the source ISA encoder. The source ISA encoder is then fine-tuned with a triplet loss while the target ISA encoder is trained; this process is called DP-MIRROR. For basic blocks with the same semantics, the embedding vectors produced by the source and target ISA encoders are similar. Unlike the traditional transformer, which uses single-layer positional encoding, double-layer positional encoding solves the OOV problem while preserving the separation between instructions, making it better suited to embedding assembly instructions. Our comparative experiments show that DP-MIRROR outperforms the state-of-the-art approach, MIRROR, by about 35% in terms of precision at 1.
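Below is a minimal sketch of the two ideas named in the abstract, written in PyTorch. The function names, dimensions, sinusoidal tables, and random stand-in embeddings are illustrative assumptions for exposition, not the paper's exact construction.

import math
import torch
import torch.nn.functional as F

def sinusoidal_table(max_len, d_model):
    # Standard sinusoidal positional-encoding table (max_len x d_model).
    pos = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)
    div = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                    * (-math.log(10000.0) / d_model))
    table = torch.zeros(max_len, d_model)
    table[:, 0::2] = torch.sin(pos * div)
    table[:, 1::2] = torch.cos(pos * div)
    return table

def double_layer_pe(instr_pos, token_pos, table):
    # Layer 1 encodes the instruction's position in the basic block;
    # layer 2 encodes the token's position inside its instruction.
    # Tokenizing instructions into opcode/operand tokens keeps the
    # vocabulary small (mitigating OOV), while the instruction-level
    # layer keeps tokens of different instructions separated.
    return table[instr_pos] + table[token_pos]

# Triplet fine-tuning step in the spirit of DP-MIRROR: random tensors
# stand in for the source/target ISA encoder outputs on basic blocks.
d_model = 128
table = sinusoidal_table(max_len=512, d_model=d_model)
anchor   = torch.randn(8, d_model)  # source encoder, block b
positive = torch.randn(8, d_model)  # target encoder, same-semantics block
negative = torch.randn(8, d_model)  # target encoder, unrelated block
loss = F.triplet_margin_loss(anchor, positive, negative, margin=1.0)

The triplet objective pulls source and target embeddings of semantically equivalent basic blocks together while pushing unrelated blocks apart, which matches the abstract's claim that the two encoders produce similar vectors for the same-semantics basic block.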