与图中心性分析进行模糊的有效种子调度

论文标题

与图中心性分析进行模糊的有效种子调度

Effective Seed Scheduling for Fuzzing with Graph Centrality Analysis

论文作者

She, Dongdong, Shah, Abhishek, Jana, Suman

论文摘要

种子调度（选择种子的顺序）可能会极大地影响模糊器的性能。现有方法根据种子基于其历史突变数据来安排种子，但忽略了基础控制流图（CFG）的结构。检查CFG可以通过揭示突变种子的潜在边缘覆盖率增益来帮助种子调度。理想的策略将基于从种子到突变的所有可覆盖和可行边缘的数量来安排种子，但是沿所有边缘的计算可行性非常昂贵。因此，种子调度策略必须近似此计数。我们观察到，大概计数应该具有3个属性 - （i）如果从种子到达更多边缘，则应该增加；（ii）如果突变历史信息表明难以达到边缘或远离目前访问的边缘，则应减少它；（iii）在大型CFG上计算应该有效。我们观察到，图形分析的中心度度量自然提供了这三个特性，因此可以有效地近似通过突变种子来达到未访问边缘的可能性。因此，我们构建了一个称为边缘图的图形，该图将种子连接到其最接近的未访问节点，并计算种子节点的中心性，以测量突变种子的潜在边缘覆盖率增益。我们在K-Sweduler中实施方法，并与许多流行的种子调度策略进行比较。我们发现，与次要的基于AFL的种子调度程序相比，与熵和边缘覆盖范围相比，K-Scheduler的特征覆盖范围增加了4.21％，而在12个Google Fuzzbench计划中，算术平均值。它还发现与下一基于AFL的种子调度程序相比，以前未知的错误要多。

Seed scheduling, the order in which seeds are selected, can greatly affect the performance of a fuzzer. Existing approaches schedule seeds based on their historical mutation data, but ignore the structure of the underlying Control Flow Graph (CFG). Examining the CFG can help seed scheduling by revealing the potential edge coverage gain from mutating a seed. An ideal strategy will schedule seeds based on a count of all reachable and feasible edges from a seed through mutations, but computing feasibility along all edges is prohibitively expensive. Therefore, a seed scheduling strategy must approximate this count. We observe that an approximate count should have 3 properties -- (i) it should increase if there are more edges reachable from a seed; (ii) it should decrease if mutation history information suggests an edge is hard to reach or is located far away from currently visited edges; and (iii) it should be efficient to compute over large CFGs. We observe that centrality measures from graph analysis naturally provide these three properties and therefore can efficiently approximate the likelihood of reaching unvisited edges by mutating a seed. We therefore build a graph called the edge horizon graph that connects seeds to their closest unvisited nodes and compute the seed node's centrality to measure the potential edge coverage gain from mutating a seed. We implement our approach in K-scheduler and compare with many popular seed scheduling strategies. We find that K-scheduler increases feature coverage by 25.89% compared to Entropic and edge coverage by 4.21% compared to the next-best AFL-based seed scheduler, in arithmetic mean on 12 Google FuzzBench programs. It also finds 3 more previously-unknown bugs than the next-best AFL-based seed scheduler.

下载PDF全文

下载文献需遵守相关版权规定

论文标题