Categorizing Malware via A Word2Vec-based Temporal Convolutional Network Scheme

Media type: E-article

Title: Categorizing Malware via A Word2Vec-based Temporal Convolutional Network Scheme

Contributors: Sun, Jiankun; Luo, Xiong; Gao, Honghao; Wang, Weiping; Gao, Yang; Yang, Xi

Published: Springer Science and Business Media LLC, 2020

Language: English

DOI: 10.1186/s13677-020-00200-y

ISSN: 2192-113X

Description: Abstract: As the edge computing paradigm has gained great popularity in recent years, technical challenges remain that must be addressed to guarantee smart device security in Internet of Things (IoT) environments. Smart devices now routinely transmit individual data across the IoT for various purposes, and malware that steals or damages these data causes losses and poses a serious threat to users. To improve malware detection performance on IoT smart devices, we conduct a malware categorization analysis on the dataset of the Kaggle Microsoft Malware Classification Challenge (BIG 2015). Motivated by the temporal convolutional network (TCN) structure, we propose a malware categorization scheme built mainly on a pre-trained Word2Vec model. The popular one-hot encoding converts input names from malicious files into high-dimensional vectors, because each name occupies one dimension of the one-hot vector space; the Word2Vec pre-training strategy instead yields more compact vectors with fewer dimensions, which leads to fewer parameters and a stronger malware feature representation. Moreover, compared with long short-term memory (LSTM), TCN demonstrates better performance in sequence modeling tasks, with longer effective memory and faster training. Experimental comparisons on this malware dataset show better categorization performance with less memory usage and training time. In particular, compared with the state-of-the-art Word2Vec-based LSTM approach, our scheme achieves approximately 1.3% higher prediction accuracy on this malware categorization task, while using about 90 thousand fewer parameters and more than 1 hour less model training time.
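
The abstract's two technical claims (dense vectors are far smaller than one-hot vectors, and a TCN has a long effective memory) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the vocabulary size, the embedding width, and the kernel and dilation values are all hypothetical, and the receptive-field formula assumes one causal convolution per layer, whereas a full TCN residual block typically stacks more.

```python
def input_width(vocab_size, embedding_dim=None):
    """Per-token input width: one dimension per name for one-hot,
    or a fixed embedding width for a Word2Vec-style dense vector."""
    return vocab_size if embedding_dim is None else embedding_dim


def causal_dilated_conv1d(sequence, kernel, dilation):
    """Causal dilated 1-D convolution over a list of numbers.

    Output step t combines sequence[t], sequence[t - dilation],
    sequence[t - 2 * dilation], ... and never looks at future steps.
    """
    out = []
    for t in range(len(sequence)):
        total = 0.0
        for j, weight in enumerate(kernel):
            idx = t - j * dilation
            if idx >= 0:  # steps before the start behave like zero padding
                total += weight * sequence[idx]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Input steps visible to one output step after stacking the layers
    (assumption: a single causal convolution per layer)."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    # Hypothetical sizes: 1000 distinct names versus a 64-dimensional embedding.
    print(input_width(1000), input_width(1000, embedding_dim=64))
    print(causal_dilated_conv1d([1, 2, 3, 4], kernel=[1, 1], dilation=2))
    print(receptive_field(kernel_size=2, dilations=[1, 2, 4, 8]))
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth, which is one common explanation for the longer effective memory the abstract attributes to TCN.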

Access status: Open access
