Zack Newsham (University of Waterloo) Modern CDCL SAT solvers routinely solve very large in- dustrial SAT instances in relatively short periods of time. It is clear that these solvers somehow exploit the structure of real-world instances. How- ever, to-date there have been few results that precisely characterise this structure. In this paper, we provide evidence that the community struc- ture of real-world SAT instances is correlated with the running time of CDCL SAT solvers. It has been known for some time that real-world SAT instances, viewed as graphs, have natural communities in them. A community is a sub-graph of the graph of a SAT instance, such that this sub-graph has more internal edges than outgoing to the rest of the graph. The community structure of a graph is often characterised by a quality metric called Q. Intuitively, a graph with high-quality community struc- ture (high Q) is easily separableinto smaller communities, while the one with low Q is not. We provide three results based on empirical data which show that community structure of real-world industrial instances is a bet- ter predictor of the running time of CDCL solvers than other commonly considered factors such as variables and clauses. First, we show that there is a strong correlation between the Q value and Literal Block Distance metric of quality of conflict clauses used in clause-deletion policies in Glucose-like solvers. Second, using regression analysis, we show that the the number of communities and the Q value of the graph of real-world SAT instances is more predictive of the running time of CDCL solvers than traditional metrics like number of variables or clauses. Finally, we show that randomly-generated SAT instances with 0.05 ≤ Q ≤ 0.13 are dramatically harder to solve for CDCL solvers than otherwise.
Get notified about new features and conference additions.