• Media type: Doctoral Thesis; Electronic Thesis; E-Book
  • Title: Techniques For Efficient Binary-Level Coverage Analysis
  • Contributor: Ben Khadra, Mohamed Ammar [Author]
  • Published: KLUEDO - Publication Server of University of Kaiserslautern-Landau (RPTU), 2021
  • Language: English
  • DOI: https://doi.org/10.26204/KLUEDO/6410
  • Keywords: jump table analysis ; analysis ; reverse ; software ; engineering ; binary analysis ; probe pruning ; static instrumentation ; machine code analysis ; code coverage analysis
  • Origination:
  • Footnote: Diese Datenquelle enthält auch Bestandsnachweise, die nicht zu einem Volltext führen.
  • Description: Code coverage analysis plays an important role in the software testing process. More recently, the remarkable effectiveness of coverage feedback has triggered a broad interest in feedback-guided fuzzing. In this work, we discuss static instrumentation techniques for binary-level coverage analysis without compiler support. We show that the proposed techniques are precise, efficient, and transparent significantly beyond the state of the art. We implement these techniques into two tools, namely, Spedi and bcov. Both tools are open source and publicly available. Spedi shows that the disassembly and function identification of stripped binaries can be highly accurate without resort to any external information. We build on these important results in bcov where we statically instrument x86-64 ELF binaries to track code coverage. However, improving efficiency and scaling to large real-world software required an orchestrated effort combining several techniques. First, we bring a well-known probe pruning technique, for the first time, to binary-level instrumentation and effectively leverage its notion of superblocks to reduce overhead. Second, we introduce sliced microexecution, a robust technique for jump table analysis which improves CFG precision and enables us to instrument jump table entries. Additionally, smaller instructions in x86-64 pose a challenge for inserting detours. To address this challenge, we aggressively exploit padding bytes. Also, we introduce a greedy scheme to systematically host detours in neighboring basic blocks. We evaluate bcov on a corpus of 95 binaries compiled from eight popular and well-tested packages like FFmpeg and LLVM. Two instrumentation policies, with different edge-level precision, are used to patch all functions in this corpus - over 1.6 million functions. Our precise policy has average performance and memory overheads of 14% and 22%, respectively. Instrumented binaries do not introduce any test regressions. The reported coverage is highly accurate with an average F-score of 99.86%. ...
  • Access State: Open Access