Title | Brunker, Jess_MCS_2022 |
Alternative Title | Evaluating P VS NP Using Graph Neural Networks |
Creator | Brunker, Jess |
Collection Name | Master of Computer Science |
Description | The following Master of Computer Science thesis explores P VS NP using graph neural networks as an evaluation tool. |
Abstract | The question of P versus NP asks if all problems that have polynomial-time algorithms for verifying that a solution is correct also have polynomial-time algorithms for finding a solution. Many NP problems are graph-related and have variants in P. Graph neural networks (GNNs) are used for representation learning on graph-structured data by aggregating information from local structures in the graph at a node level, then pooling that information together to learn information about the graph itself. Using GNN models, we seek to determine how the differences in problem complexity (P versus NP) translate to machine learning model performance and complexity, using the graph coloring and Boolean satisfiability (SAT) problems for the datasets. Multiple datasets are generated for each problem, varying in both problem size and size of the training set. As expected, models trained on the P datasets had significantly higher performance when compared to models trained on similar NP datasets, and were less dependent on the size of the training set. Both graph representations of the SAT problem failed to achieve decent performance on the NP variants, so an analysis of how the difficulty increases with problem size is limited. However, the results on the graph coloring problem show that the performance decreased at a similar rate as the problem grows in size for both the P and NP variants. We also showed that increasing the power of the GNN model increased its performance on the NP datasets, but it was not enough to overcome the lack of training data. |
Subject | Computer science; Neural networks (Computer science); Computational complexity |
Keywords | P vs NP; datasets; graph neural networks; computer science; Boolean satisfiability |
Digital Publisher | Stewart Library, Weber State University, Ogden, Utah, United States of America |
Date | 2022 |
Medium | Thesis |
Type | Text |
Access Extent | 35-page PDF; 1.36 MB |
Language | eng |
Rights | The author has granted Weber State University Archives a limited, non-exclusive, royalty-free license to reproduce their theses, in whole or in part, in electronic or paper form and to make it available to the general public at no charge. The author retains all other rights. |
Source | University Archives Electronic Records; Master of Computer Science. Stewart Library, Weber State University |
OCR Text | EVALUATING P VS NP USING GRAPH NEURAL NETWORKS
By Jess Ryan Brunker
A thesis submitted to the faculty of the MSCS Graduate Program of Weber State University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE in Computer Science. July 28, 2022, Ogden, Utah.
Approved: Committee Chair: Abdulmalek Al-Gahmi, Ph.D.; Committee Member: Hugo Valle, Ph.D.; Committee Member: Yong Zhang, Ph.D. Dates: 07/28/2022, 07/29/2022, 7/31/2022.

ABSTRACT
The question of P versus NP asks if all problems that have polynomial-time algorithms for verifying that a solution is correct also have polynomial-time algorithms for finding a solution. Many NP problems are graph-related and have variants in P. Graph neural networks (GNNs) are used for representation learning on graph-structured data by aggregating information from local structures in the graph at a node level, then pooling that information together to learn information about the graph itself. Using GNN models, we seek to determine how the differences in problem complexity (P versus NP) translate to machine learning model performance and complexity, using the graph coloring and Boolean satisfiability (SAT) problems for the datasets. Multiple datasets are generated for each problem, varying in both problem size and size of the training set. As expected, models trained on the P datasets had significantly higher performance when compared to models trained on similar NP datasets, and were less dependent on the size of the training set. Both graph representations of the SAT problem failed to achieve decent performance on the NP variants, so an analysis of how the difficulty increases with problem size is limited. However, the results on the graph coloring problem show that the performance decreased at a similar rate as the problem grows in size for both the P and NP variants. We also showed that increasing the power of the GNN model increased its performance on the NP datasets, but it was not enough to overcome the lack of training data.

TABLE OF CONTENTS
Abstract
List of Tables
List of Figures
1 Introduction
2 Related Work
2.1 Graph Neural Networks
2.2 GNNS and Combinatorial Optimization
3 Background
3.1 Graph Neural Networks
3.2 Graph Coloring
3.3 Boolean Satisfiability
4 Methodology
4.1 Datasets
4.1.1 Graph Coloring Problem Generation
4.1.2 Boolean Satisfiability Problem Generation
4.2 GNN Models
4.3 Training
5 Results
5.1 P vs NP
5.2 Stronger Model Performance
6 Conclusion
6.1 Limitations
6.2 Future Work

LIST OF TABLES
3.1 Information about graph representations of SAT problems
4.1 Generated datasets
4.2 Total trained models across all combinations of problem variants, representations, etc.
5.1 Number of Nodes in SAT Problems Using Clause-Variable Representation

LIST OF FIGURES
3.1 An illustration of the neighborhood aggregation for a given node after two iterations of message-passing. Source: Graph Representation Learning by William Hamilton [1]
3.2 The SAT formula (x1 ∨ ¬x2) ∧ (¬x1 ∨ ¬x3) ∧ (x2 ∨ x3) represented in graph format
4.1 The model architectures used for training. The input graphs are passed through a series of GNN layers, during which message-passing occurs to learn node embeddings. Then, the nodes are pooled into a single graph-level embedding. Finally, the graph embedding is passed into a densely-connected neural network to attain the final output.
5.1 The validation accuracy over time for every combination of problem variant and dataset size for the coloring problem.
5.2 The results on the test set for every combination of problem variant and dataset size for the coloring problem.
5.3 The validation accuracy over time for every combination of problem variant and dataset size for the SAT problem (using the variable-variable representation).
5.4 The results on the test set for every combination of problem variant and dataset size for the SAT problem (using the variable-variable representation).
5.5 The validation accuracy over time for every combination of problem variant and dataset size for the SAT problem (using the clause-variable representation).
5.6 The results on the test set for every combination of problem variant and dataset size for the SAT problem (using the clause-variable representation).
5.7 The validation loss over time for the smallest dataset of the coloring problem.

CHAPTER 1
Introduction

The question of P versus NP is one of the biggest open problems in mathematics and computer science. In fact, it is one of the Clay Institute's seven "Millennium Prize" problems [2].
Informally, the question of P versus NP asks if all problems that have polynomial-time algorithms for verifying that a solution is correct also have polynomial-time algorithms for finding a solution. Problems in P meet this requirement, but problems in NP are not known to have such a polynomial-time algorithm for finding a solution. It is widely believed that P ≠ NP, but as of this writing there is no proof.

Combinatorial optimization problems, such as the travelling salesman problem or the knapsack problem, are often used as examples of problems in NP. In these problems, the solution space is so large that it is not tractable to evaluate all possible solutions to find the optimal solution. As the problem grows, the solution space grows exponentially, so most algorithms employ search algorithms to limit the number of solutions to evaluate or simply approximate the optimal solution. Many NP problems are graph-based. A subset of combinatorial optimization problems are NP-complete, meaning that all NP-complete problems can be reduced to one another. This implies that if a polynomial-time algorithm were to be found to solve one NP-complete problem, it could be used to solve all NP problems, and would therefore prove that P = NP.

Graph neural networks, or GNNs, are machine learning models built to learn from graph-based data. Because many combinatorial optimization problems are based directly on graph data, or can be mapped onto graphs, GNNs have been used to find solutions to these problems. GNNs work by learning information about local structures in a graph, then using that learned information to perform desired tasks, such as node classification or edge prediction. When used for combinatorial optimization problems, the task may be to solve the decision version of the problem (i.e., a yes or no question) or to find a full solution to the problem.

In this work, we evaluate the P versus NP problem through the lens of GNN performance on two known graph-related NP problems. When training a GNN model on a combinatorial optimization problem, how much of a difference does P versus NP make for the final model performance? How is the performance affected by the size of the problem and the size of the training set? Two problems were chosen from NP that have a more restrictive variant in P: graph coloring and Boolean satisfiability (SAT). Details for these variants can be found in Chapter 3. For each problem variant, a number of datasets are generated with increasing problem size. For each of these datasets, models are trained on subsets of the training data, then tested against a single holdout set. We expect to see the GNN models perform better on the variants in P and that the performance on the NP variants will decrease rapidly as the problem grows. In addition, we expect the P variants to require less training data to reach their best performance. Finally, an additional set of models with increased size and power are trained on datasets for the more difficult problem variants to see how performance on these datasets is affected.

This thesis is organized into several chapters. Chapter 2 summarizes the related work, especially regarding the fields of GNNs and their uses in combinatorial optimization problems. Chapter 3 provides the background necessary for this work. It contains a description of graphs and the mathematical basis for GNNs. It also describes the graph coloring and SAT problems used in this work, as well as their different variants.
An additional section describes methods for translating SAT problems to graph format. Chapter 4 discusses the approach used in this work, including details about the datasets, the models used, and the methodology. Chapter 5 presents the obtained results, and Chapter 6 discusses future possible work and final concluding remarks. 2 CHAPTER 2 Related Work 2.1 Graph Neural Networks A book by Hamilton builds intuition to the inner workings of GNNs by starting with fundamental approaches to machine learning using graph data, expanding into graph representation learning, and finally covering graph neural networks [1]. The first attempts assembled heuristic functions or domain knowledge related to the graph, which were then used as inputs into standard machine learning models. Information about an individual node could contain its degree, centrality, or clustering coefficient. For the graph as a whole, the Weisfeiler-Lehman kernel or path-based methods such as random walks could be used to gain information about the nodes of a neighborhood. Eventually, more research was done to encode nodes and graphs into embeddings. Examples of work in this area involve matrix factorization and random walk methods such as DeepWalk [3] and node2vec [4]. The innovation of GNNs is the neural message-passing approach, described in more detail in chapter 3. Graph neural networks are a relatively new development. Much work has been done to determine how powerful they actually are. Xu et al. [5] show that the popular GNN variants of graph convolutional networks (GCN) and GraphSAGE are not as powerful as the Weisfeiler-Leman test, used for graph isomorphism test-ing. They show that many GNN frameworks are unable to distinguish between some topologically identical graphs. They propose a new GNN framework, Graph Isomorphism Network (GIN), which they show to be an improvement over other models. The GIN model is used as the GNN layer for this work, along with an extended variant that can incorporate edge features [6]. A number of survey studies provide the background necessary for understanding current research in the field of GNNs. One such survey by Bacciu et al. focuses on the architectural structures of GNNs [7]. They discuss the mathematics and current work involved in GNN models: neighborhood aggregation, pooling, and node representation aggregation for graph embedding. A listing of tasks using GNNs is also given, such as link prediction, graph clustering, node and graph classification, and auto-encoders. Other surveys, such as the work ofWu et al., seek to categorize GNN architectures [8]. They present a tax-onomy of GNNs, organizing them into four categories, along with an introduction for each. The first category, recurrent graph neural networks (RecGNNs) represent the early work in the field, using message-passing al-gorithms until an equilibrium is reached. Convolutional graph neural networks (ConvGNNs) are defined as generalizing convolutions to graph data by adapting the message-passing algorithms from RecGNNs to incor- 3 porate information from a node’s neighbors. Graph autoencoders (GAEs) seek to map a node or a graph into a low-dimensional space, then attempt to rebuild it. These models can be used to learn graph embeddings or generate new graphs. The final category of spatial-temporal GNNs (STGNNs) are models built specifically for spatial-temporal graphs, or graphs that change over time. 
2.2 GNNS and Combinatorial Optimization Many combinatorial optimization problems involve graphs, such as the graph coloring problem, or can be modeled in graph format, such as Boolean satisfiability (also known as the SAT problem). Because of this, GNNs are thought to be a natural tool for working on these types of problems. B¨unz and Lamm evaluate whether SAT problems can be solved by GNN models at all, showing that GNNs can learn satisfiability beyond a random baseline [9]. They also evaluate GNN performance on graphs around the phase shift, a point at which graphs become more difficult to solve. They discuss two methods of mapping a SAT problem to a graph, the clause-variable and variable-variable approach. For this work, they settle on the variable-variable representation due to its simplicity. Selsam et al. present Neurosat, a GNN model to assist in finding a solution to SAT problems [10]. GNN is used to first determine whether or not the formula is satisfiable, then the node embeddings are clustered to decode the actual solution to the formula. The success rate was lower than conventional SAT solvers, but this approach was shown to generalize to different graph optimization problems, namely graph coloring, clique detection, dominating set, and vertex cover. Rather than explicitly using a GNN to find solutions, Kurin et al. [11] use a GNN to assist in learning Q-values for each node. For each variable node, the agent decides whether to negate the variable in the original formula. Their model shows better results than the Variable State Independent Decaying Sum (VSIDS) heuristic, and generalizes well to larger problems, finding unsatisfiable SAT problems, and graph coloring problems. Tabucol is an early heuristic algorithm for finding a valid k-coloring for a graph, but it is still often used to benchmark performance on graph coloring problems [12]. Given k colors, each of the nodes is assigned a random initial color, which often results in a large number of adjacent nodes with invalid color assignments. A single iteration of the algorithm finds the node that decreases the number of conflicts when changing its color. This process results in a valid k-coloring after a finite number of steps. Lemos et al. use GNNs to predict the chromatic number χ for a graph, where χ ∈ [2,7] by adding color embeddings in addition to node embeddings [13]. Their results show an improvement over other GNN solvers, including the GNN model Neurosat [10] and Tabucol heuristic [12], and can generalize to different graph distributions and larger graphs. The authors also use clustering on the vertex embeddings in an attempt 4 to predict a valid color assignment for the graph. Li et al. provide an analysis on GNN approaches to graph coloring, showing that GNNs tend to generate similar embeddings for adjacent nodes, which creates contradictions for the graph coloring problem [14]. They propose a model they call Graph Discrimination Network (GDN), which alleviates the issues they found. Their model improves on the results from [10] and has similar performance to the Tabucol heuristic [12]. In a different approach, Shuetz et al. use the Potts model from statistical physics as a loss function for the node embeddings in an unsupervised setting [15]. For each node, the GNN is trained to predict the likelihood each node should be assigned each color. Their model outperforms the AC-GNN model from Li et al. [14] in all cases, and outperforms the Tabucol heuristic [12] on larger graphs. 
The only research I found directly relating the size of the training set to the final performance of the model (at least in the field of combinatorial optimization) is the work of Joshi et al. [16] on the traveling salesman problem. Their strategy uses a GNN model to first output an adjacency matrix corresponding to a tour. Then, the model uses a beam search to convert the matrix to a valid tour. A beam search is used to build a search tree, using some defined heuristic to determine which branches to evaluate. During training, they evaluate the difference in optimality between the ground-truth best tour and the current model output, finding that their approach produces a solution within 1% of optimal, but more training samples are required for the same performance on datasets made from larger graphs.

The work of Khalil et al. involves learning the algorithms behind approximating solutions to NP-hard combinatorial optimization problems using Q-learning [17]. The optimization problems they used were the minimum vertex cover, maximum cut, and the traveling salesman problem. For their datasets, they generated Erdős–Rényi [18] and Barabási–Albert [19] graphs, with the intent of more closely modeling real-world networks. For the minimum vertex cover problem, a number of different datasets were created, ranging from small graphs (15-20 vertices) to large ones (400-500 vertices).

CHAPTER 3
Background

3.1 Graph Neural Networks
Graph neural networks, or GNNs, aim to apply neural networks to graph-structured data. Formally, a graph is defined as G = (V, E), where V is a set of vertices or nodes, and E is a set of edges connecting vertices to one another. Let v_i ∈ V represent a single node, and e_ij = (v_i, v_j) ∈ E represent an edge connecting the nodes v_i and v_j. The neighborhood of a node v_i, which consists of all nodes connected to it by an edge, is defined as N(v_i) = {u ∈ V | (v_i, u) ∈ E}. Often, the nodes and edges are represented by an adjacency matrix A, an n × n matrix with a_ij = 1 if an edge e_ij exists, and 0 otherwise. A graph may also contain a set of node attributes X ∈ R^(n×d), or a set of edge attributes X^e ∈ R^(m×p), where each x_i ∈ X or x^e_j ∈ X^e represents a feature vector for the node v_i or edge e_j, respectively.

Graph data has distinct properties that make it difficult for standard neural networks or other machine learning algorithms to handle. First, graph data is permutation invariant, meaning that the nodes and edges in the graph can be given in any order, but the resulting graphs are identical. Similarly, a node may be added to or removed from the graph with minimal changes to its total structure. A simple approach may be to use the adjacency matrix A directly as input to some model, but the model then depends directly on a concrete node ordering and graph size, neither of which are necessarily guaranteed.

Graph neural networks (GNNs) generalize the approach of convolutional neural networks (CNNs) to work on graph-structured data. A CNN learns information about the image by using convolutional filters to consider information in a localized area of the image. The same filters are themselves used across all regions in the image to both generalize learned patterns and reduce computation costs. The neighborhood of a filter considers a static number of pixels determined by the size of the filter, but that is not possible with a graph. In a single graph, a node can have no neighbors, one neighbor, or all other nodes in the graph.
To deal with this issue, GNNs use a neural message-passing algorithm to aggregate information about a node's neighborhood. Each node v ∈ V begins with an initial node embedding h_v^(0), which may contain input features for the nodes. For the kth iteration, the message-passing algorithm can be described as

h_v^(k) = UPDATE^(k)( h_v^(k−1), AGGREGATE^(k)({ h_u^(k−1) : u ∈ N(v) }) )    (3.1)

where UPDATE is a differentiable function, often represented by a neural network. This function contains the non-linearity function, such as ReLU or Tanh. AGGREGATE is some permutation-invariant function for combining a set of embeddings into a single message, such as a simple sum. The value calculated by AGGREGATE can be considered the actual "message" from the node's neighborhood. Note that these functions may be different for each message-passing iteration. After k iterations of message-passing, a node will have incorporated information from all nodes within k "hops".

Figure 3.1: An illustration of the neighborhood aggregation for a given node after two iterations of message-passing. Source: Graph Representation Learning by William Hamilton [1]

Figure 3.1 shows a pictorial representation of the process for node A after the second iteration. Reading right to left, the embeddings for A's neighbors are constructed from each of their neighbors for the first iteration. For example, N(C) = {A, B, E, F}, and N(D) = {A}. For iteration two, the process repeats. The embeddings from the previous iteration for N(A) are combined to create h_A^(2).

At this point, every node v ∈ V has a learned embedding h_v^(k) that can be used for node-level tasks like node classification. However, many tasks require graph-level knowledge, so we need to generate a graph embedding h_G based on all the learned node embeddings. This is achieved using a "pooling" function, another permutation-invariant function that takes the node embeddings as input. A common method is to use a simple sum: h_G = Σ_{v ∈ V} h_v, but the max or average functions can also be used. More sophisticated methods treat pooling as part of the learning process. This embedding can then be used for the required task, such as input for a neural network for classification. The following two sections describe the two optimization problems to which the GNN models of this thesis will be applied.

3.2 Graph Coloring
The graph coloring problem involves assigning labels to every node in a graph such that every node is assigned a different label from each of its neighbors. This is an optimization problem, so the goal is to minimize the number of labels used. While the labels could represent anything, they are most often referred to as "colors" due to the problem's use case in map coloring. In a map, a country can be represented as a node, and all of its adjacent countries would be connected to that node via edges. It is undesirable to color adjacent countries with the same color, so a valid color assignment involves assigning a color to each country such that no adjacent countries have the same color.

Given a graph G with a set of vertices V, a label l_v is assigned to every v ∈ V such that l_v ≠ l_u for all u ∈ N(v), where N(v) represents the vertices adjacent to v, or its neighborhood. A graph is considered k-colorable if there is a valid assignment using k labels. The minimum number of labels required for a graph is called its chromatic number χ.
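To make the verification-versus-search asymmetry behind these definitions concrete, the following is a small illustrative sketch (not taken from the thesis; the function names and the adjacency-list input format are assumptions made here): checking that a proposed labeling is a valid coloring takes polynomial time, while the naive way to decide k-colorability enumerates exponentially many candidate labelings.

from itertools import product

def is_valid_coloring(neighbors, labels):
    # Polynomial-time verification: no edge joins two nodes with the same label.
    return all(labels[v] != labels[u] for v in neighbors for u in neighbors[v])

def is_k_colorable(neighbors, k):
    # Naive decision procedure: try all k**n label assignments, so the cost
    # grows exponentially with the number of nodes n.
    nodes = list(neighbors)
    for assignment in product(range(k), repeat=len(nodes)):
        labels = dict(zip(nodes, assignment))
        if is_valid_coloring(neighbors, labels):
            return True
    return False

# Example: a triangle is 3-colorable but not 2-colorable.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(is_k_colorable(triangle, 2), is_k_colorable(triangle, 3))  # False True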
Determining whether a graph is k-colorable is one of Karp's initial NP-complete problems, where he proves that the problem can be reduced to the Boolean satisfiability problem [20]. The problem of exactly determining χ for a graph, however, is NP-hard [21]. Much work has been done finding bounds on this problem, including the famous computer-assisted Four Color Theorem, proving that for all planar graphs, χ ≤ 4 [22]. For graphs where χ = 2, the coloring is equivalent to assigning the nodes into two disjoint sets, so the graph must be bipartite. This problem variant has a polynomial algorithm, which is not the case for graphs where χ > 2.

3.3 Boolean Satisfiability
The Boolean satisfiability (SAT) problem involves finding an assignment to variables in a Boolean formula that makes it evaluate to true. The formula is said to be satisfiable if such an assignment exists, and unsatisfiable otherwise. A formula is constructed from a set of variables X = {x1, x2, . . . , xn}, each of which takes the value of either true or false. A literal is defined as either xi or ¬xi. A Boolean formula is a conjunction of clauses C = {c1, c2, . . . , cm}, each containing a disjunction of a subset of literals taken from X. For example, the formula (x1 ∨ x2 ∨ ¬x3) ∧ (x1 ∨ ¬x2 ∨ x3) is satisfiable with x1 = true and any assignment to x2 and x3. The formula (x1 ∨ x2) ∧ (¬x1 ∨ x2) ∧ (x1 ∨ ¬x2) ∧ (¬x1 ∨ ¬x2) is not satisfiable, as there is no assignment to x1 and x2 that makes the formula evaluate to true.

While there are many different variants of SAT problems, this work focuses on k-SAT. In k-SAT, the formula is presented in conjunctive normal form (CNF) where each clause has exactly k literals. An expression is in CNF if each clause is joined with a conjunction (Boolean and), and the literals in each clause are joined with disjunctions (Boolean or). Even with this restriction, 3-SAT is still NP-complete, and is used as the definition for NP-completeness [20]. However, 2-SAT has a polynomial-time algorithm. In this algorithm, the formula is first turned into an implication graph, a directed graph with vertices representing both possible values of a variable, and edges implying a logical connection between the literals. For example, given the clause (x1 ∨ x2), if x1 is false, x2 must be true, and vice versa. Therefore, the implication graph would have an edge from ¬x1 to x2, and one from ¬x2 to x1. Given the implication graph, if a variable is in a strongly connected component with its negation, the formula is unsatisfiable [23]. Converting a formula to its implication graph and finding strongly connected components for a graph are linear-time algorithms.

Figure 3.2: The SAT formula (x1 ∨ ¬x2) ∧ (¬x1 ∨ ¬x3) ∧ (x2 ∨ x3) represented in graph format

Because SAT problems are not inherently structured as graphs, extra consideration must be given when converting them to graphs. Such a conversion must encode all the information present in the original formula, especially if a GNN or other machine learning algorithm is to learn the underlying patterns defining satisfiability. The previously mentioned implication graph can sufficiently encode a 2-SAT problem, but is infeasible for 3-SAT due to more complex implication relationships. The two most common representations in the literature for SAT problems are the variable-variable and clause-variable representations.
Both of these require adding an edge attributes matrix X^e ∈ R^(m×p) for storing information in addition to the traditional adjacency matrix, where m is the number of edges and p is the number of edge attributes. These representations are discussed in [9]. Table 3.1 compares these two representations.

The variable-variable representation represents every variable and its negation as individual nodes. Two types of undirected edges are used in this representation for connecting nodes. First, an edge connects two literals if they appear in a clause together. Second, an edge connects each variable and its negation. Rather than handling separate edge types, these two types are one-hot encoded in the edge attributes matrix. This representation is relatively simple and remains compact as the number of clauses varies. This representation was only found in the work of [9].

The other common method for encoding SAT problems as graphs is the clause-variable, or factor graph, representation. This method builds an undirected bipartite graph with one set of nodes for the clauses and one for the variables. An edge connecting a variable to a clause indicates that the node appears in that particular clause. A single positive or negative value can encode the negation in the edge attributes. A major downside of this representation relevant to this work is that the size of the graph scales with the number of clauses in the original formula in addition to the number of variables, so the graphs tend to be significantly larger than in the variable-variable representation. This seems to be the most popular method for representing SAT problems as graphs, and was found in many works [10] [24] [11].

Table 3.1: Information about graph representations of SAT problems
Representation | Vertices | Edges | Edge Attributes
Variable-Variable | 2X | X ≤ |E| ≤ X² | X^e ∈ R^(m×2)
Clause-Variable | X + C | 0 ≤ |E| ≤ CV | X^e ∈ R^(m×1)

CHAPTER 4
Methodology

In this work, we train graph neural network (GNN) models on two known graph-related NP problems, graph coloring and Boolean satisfiability, to evaluate how the P versus NP problem translates to model performance. We need to determine if models trained on datasets of P problems differ in performance when compared to those trained on datasets of NP problems. This comparison is made across multiple datasets of increasing problem size, and for increasing amounts of training data.

4.1 Datasets
While there are existing datasets for both the graph coloring problem and the SAT problem, they do not meet the requirements for this work. This work requires datasets for both the P and NP variants of each problem, which is a rarity in existing datasets. In addition, a different dataset must exist for varying problem size. Because of these issues, it was deemed easier to generate synthetic datasets rather than use an existing dataset. A dataset is generated for each problem variant and size within each variant. Because of their combinatorial nature, NP problems grow exponentially more difficult as the problem grows, so the datasets range in size from small to relatively large. In this section, the methods for generating the datasets are described.

Although the methods for generating instances for both problems differ significantly, both make use of "dual" examples when generating the data. Every generated problem is on the verge of transitioning from the positive label to the negative. After making a small change to the graph such that it has the negative label, both graphs are added to the dataset. This has two advantages.
First, the dataset is automatically balanced with the same number of positive and negative instances. Second, the instances are more difficult because of the natural similarities between the dual problems. This forces the model to learn stronger differences between the target labels.

A total of twelve datasets were generated for this work. For each of the four problem variants (2-coloring, 3-coloring, 2-SAT, and 3-SAT), three individual datasets were created using problems of different sizes. Each dataset for the graph coloring problem consists of graphs with node counts in the following ranges: n ∼ U(10,30), n ∼ U(40,60), and n ∼ U(70,90), where U is the uniform distribution. The datasets for the SAT problem increase in difficulty by adding more variables, which tends to increase the number of clauses per problem. The three datasets were generated with |X| ∈ {10, 25, 40}. The models are trained on both the variable-variable and clause-variable representations of the SAT problems, but the same generated datasets were used for both, instead of generating a different dataset for each representation. In essence, a single SAT dataset is turned into two, one for each representation. In total, each dataset contains 40,000 training examples, 4,000 examples in a validation set, and 4,000 examples in a test set. Because of the dual nature of the generation methods, each dataset contains half these numbers of generated problems. Table 4.1 summarizes these datasets.

Table 4.1: Generated datasets
Problem Variant | Problem Size | Training | Validation | Test
Coloring (χ = 2) | n ∼ U(10,30) | 40K | 4K | 4K
Coloring (χ = 2) | n ∼ U(40,60) | 40K | 4K | 4K
Coloring (χ = 2) | n ∼ U(70,90) | 40K | 4K | 4K
Coloring (χ = 3) | n ∼ U(10,30) | 40K | 4K | 4K
Coloring (χ = 3) | n ∼ U(40,60) | 40K | 4K | 4K
Coloring (χ = 3) | n ∼ U(70,90) | 40K | 4K | 4K
2-SAT | |X| = 10 | 40K | 4K | 4K
2-SAT | |X| = 25 | 40K | 4K | 4K
2-SAT | |X| = 40 | 40K | 4K | 4K
3-SAT | |X| = 10 | 40K | 4K | 4K
3-SAT | |X| = 25 | 40K | 4K | 4K
3-SAT | |X| = 40 | 40K | 4K | 4K

4.1.1 Graph Coloring Problem Generation
For the decision variant of the graph coloring problem, which remains an NP problem, the question can be formulated as "Is this graph colorable with k colors?" for some k ≥ 2. Therefore, the chromatic number χ for a graph G can be calculated by finding the minimum k such that G is k-colorable. For the P variant, the problem is to determine if χ = 2 for a graph G. The dataset consists of two types of graphs: ones with χ = 2, which are assigned the positive label, and ones with χ = 3, which are assigned the negative label. The NP dataset follows a similar pattern for graphs with χ = 3 and χ = 4, respectively.

The method for generating graph coloring problems was modified from the work of Lemos et al. [13]. Each dataset is generated for a given χ and a number of vertices n ∼ U(a,b). For a graph G, edges are added between vertices randomly. After an edge e is added, a constraint satisfaction problem (CSP) solver [25] is used to determine if the graph is still χ-colorable. If the graph is no longer χ-colorable after adding e, the graph is added to the dataset along with a copy of the graph without e. The graph with e is given a negative label and the graph without e is given a positive label. Each node is given an initial random one-dimensional embedding in [0,1].

4.1.2 Boolean Satisfiability Problem Generation
The decision variant of the SAT problem is simply "Is there an assignment to each variable xi ∈ X that makes the formula evaluate to true?" Each problem instance in the SAT datasets is either solvable or not, with the former receiving a positive target label and the latter receiving a negative label.
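As an illustration of the kind of satisfiability check used to label such instances, the following is a minimal sketch using the PySAT library that the thesis cites for verification [26]; the specific solver class (Glucose3) and the example formula (the one shown in Figure 3.2) are choices made here for illustration, not necessarily what the author used.

from pysat.solvers import Glucose3

# Clauses for (x1 ∨ ¬x2) ∧ (¬x1 ∨ ¬x3) ∧ (x2 ∨ x3), in DIMACS-style integers:
# positive i means x_i, negative i means ¬x_i.
clauses = [[1, -2], [-1, -3], [2, 3]]

with Glucose3(bootstrap_with=clauses) as solver:
    if solver.solve():                                  # decision variant: satisfiable?
        print("satisfiable, e.g.:", solver.get_model())  # one satisfying assignment
    else:
        print("unsatisfiable")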
The only difference between the 2-SAT and 3-SAT datasets is their definitions: 2-SAT is restricted to only two literals per clause, and 3-SAT is restricted to three. Generating SAT problems follows the same methods as used by Selsam et al. [10]. Each dataset is created using the same number of variables n. A random clause c_i is generated by randomly sampling from the n variables without replacement, then negating each literal with 50% probability. Each clause is restricted to either 2 or 3 literals, depending on the required dataset. Clauses are added one by one until the clause c_m makes the problem no longer satisfiable, as verified by the PySAT solver library [26]. Negating a single literal in c_m to get c′_m makes the problem solvable. The two problems with clauses C_unsat = {c_1, c_2, . . . , c_m} and C_sat = {c_1, c_2, . . . , c′_m} are added to the dataset.

The embedding methods for SAT instances depend on how the instance is represented as a graph. For the variable-variable representation, a node's initial embedding is initialized to 1 if it represents the positive literal, and -1 for the negative (e.g., x and ¬x in the original SAT problem, respectively). Similarly, the edge property one-hot encodes the relationship between the nodes it connects, whether it connects variables appearing in the same clause or the positive and negative literals for a given variable. For the clause-variable representation, the initial node embedding is one-hot encoded to indicate whether it represents a clause or a variable. Every edge connects clause nodes to variable nodes, so the edge property is set to -1 if the variable is negated in the clause, and 1 otherwise.

4.2 GNN Models
Most research on combinatorial optimization problems uses custom models that are constructed specifically to solve the problem at hand. Instead, this work uses general-purpose GNN layers for simplicity and ease of use. The GNN portions of the model are implemented by the PyTorch Geometric library [27], and the rest is implemented in PyTorch [28]. The GNN layers used in the model are based on the Graph Isomorphism Network, or GIN, implemented following the work of Xu et al. [5]. The implementation of the node embedding UPDATE function in Equation 3.1 is as follows:

h_v^(k) = MLP^(k)( h_v^(k−1) + Σ_{u ∈ N(v)} h_u^(k−1) )    (4.1)

Here MLP represents a multi-layer perceptron, but any neural network model could be used. Every layer has its own MLP, so the learned embedding at one layer may differ substantially from the embedding at the next layer. GIN also includes a trainable weight for the previous node embedding h_v^(k−1), but this was not used in this work for simplicity.

By itself, GIN does not take into account edge features. PyTorch Geometric provides a modified implementation of the GIN layer based on the work of Hu et al. that also incorporates edge features [6]:

h_v^(k) = MLP^(k)( h_v^(k−1) + Σ_{u ∈ N(v)} ReLU( h_u^(k−1) + e_{v,u} ) )    (4.2)

The major addition here is that the edge features e_{v,u} for the edge connecting nodes u and v are combined with the embedding for the neighboring node, then passed through the ReLU nonlinearity function. Otherwise, the UPDATE function is identical. Notice that no edge embedding is learned. Instead, the same initial edge values are used for every layer.

Except for the GIN implementation, the models for both the graph coloring problem and the SAT problem have identical architectures. Three GNN layers are used, each using an MLP with two hidden layers.
For the graph coloring problem, these are regular GIN layers, and the SAT problem uses the modified GIN layer that incorporates edge features. After the GNN layers, the node embeddings are pooled using the mean function to get a single embedding for the graph. This embedding is passed into a two-layer neural network, which uses the sigmoid function for its final output.

A second, more powerful model is used exclusively on the more difficult problem variants, in addition to the model above. This model adds an additional two GNN layers, bringing the total number to five. Another layer was also added to the final densely connected neural network layers before the final output sigmoid function. Everything else in the model is structured similarly. No extra processing logic was added to the model other than the naive approach of increasing its depth.

Other GNN layers were considered for this model. A layer based on SAGE was attempted, but it did not perform well during initial testing [29]. The next attempts used an implementation from the work of Morris et al. [30]. This layer performed well on the graph coloring problem, but at the time of testing it did not have an option to include edge embeddings. The GIN layer did have a corresponding layer that incorporates edge embeddings, making it a good choice that simplifies the final comparison of model performance. Without this similarity between the two models, the performance of the individual layers would have to be considered along with that of the model itself.

4.3 Training
One of the questions for this work is how much the amount of training data affects the performance on combinatorial optimization problems. To answer this question, additional models were trained using smaller subsets of the total training dataset in addition to a model trained on the entire training dataset. For every dataset detailed in Section 4.1, four models were trained using 25%, 50%, 75%, and 100% of the training dataset. Each of these models used the same validation and testing sets to ensure that their performance was easily comparable.

Figure 4.1: The model architectures used for training. The input graphs are passed through a series of GNN layers, during which message-passing occurs to learn node embeddings. Then, the nodes are pooled into a single graph-level embedding. Finally, the graph embedding is passed into a densely-connected neural network to attain the final output.

Due to the many combinations of problem variants and sizes, a large number of models were trained for this work. Table 4.2 shows the full breakdown. In total, there are 12 generated datasets, one for each combination of problem variant and size (see Table 4.1 for details). The SAT problems are mapped onto graphs in two different ways: variable-variable and clause-variable representations. During initial tests some models experienced some instability (especially those with smaller datasets), so three models were trained on each dataset portion, with the best-performing model used for the final results. Finally, models using the "strong model" architecture are trained on the NP variants. This comes to a total of 324 trained models.

Table 4.2: Total trained models across all combinations of problem variants, representations, etc.
Variant | #Representations | #Problem Sizes | #Training Subsets | #Models Per Subset | Total Subset Models | Strong Model? | Total Trained Models
2-Coloring | 1 | 3 | 4 | 3 | 36 | No | 36
3-Coloring | 1 | 3 | 4 | 3 | 36 | Yes | 72
2-SAT | 2 | 3 | 4 | 3 | 72 | No | 72
3-SAT | 2 | 3 | 4 | 3 | 72 | Yes | 144
Total | | | | | | | 324
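To give a concrete picture of the general-purpose architecture described in Section 4.2 (GIN layers, mean pooling over nodes, and a small densely connected head with a sigmoid output), here is a minimal sketch using PyTorch Geometric; the hidden width, the exact shape of the per-layer MLPs, and other details are assumptions made for illustration rather than the thesis's exact configuration.

import torch
from torch import nn
from torch_geometric.nn import GINConv, global_mean_pool

class GINGraphClassifier(nn.Module):
    def __init__(self, in_dim=1, hidden=64):
        super().__init__()
        def mlp(i):
            # Per-layer MLP with two hidden layers (width is an assumption).
            return nn.Sequential(nn.Linear(i, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.convs = nn.ModuleList([GINConv(mlp(in_dim)),
                                    GINConv(mlp(hidden)),
                                    GINConv(mlp(hidden))])
        self.head = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                  nn.Dropout(p=0.25),
                                  nn.Linear(hidden, 1))

    def forward(self, x, edge_index, batch):
        for conv in self.convs:             # message passing over three GIN layers
            x = conv(x, edge_index).relu()
        g = global_mean_pool(x, batch)      # mean-pool node embeddings per graph
        return torch.sigmoid(self.head(g))  # probability of the positive label

For the SAT representations, PyTorch Geometric's GINEConv layer (the Hu et al. variant that accepts edge attributes) could be substituted for GINConv so that edge features enter the aggregation as in Equation 4.2; that substitution is sketched here as an assumption about how the described setup maps onto the library, not as the author's exact code.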
Performing hyperparameter tuning for each of these models was deemed not feasible for this work. Instead, more general-purpose settings were used for each high-level problem variant to achieve "good enough" model performance. The intent here is to show performance relative to similar models, not absolute performance, so it is not essential to find peak performance for each model.

During training, the graph instances can be combined together in a batch such that the model can learn representations for a number of graphs simultaneously. This is achieved by combining the adjacency matrix for each graph on the diagonal of a new batch adjacency matrix. Equation 4.3 illustrates how the batch adjacency matrix G_B is constructed from the batch of graphs {G_1, G_2, . . . , G_{b−1}, G_b}:

G_B = | G_1   0     ...   0       0   |
      | 0     G_2   ...   0       0   |
      | ...   ...   ...   ...     ... |
      | 0     0     ...   G_{b−1} 0   |
      | 0     0     ...   0       G_b |    (4.3)

Because each graph in the batch is not connected to any other graph in the batch, the message-passing algorithm works as normal. The graph pooling function works on this batch matrix, pooling nodes together into their original graphs.

All models were trained using similar methods, with minor parameter differences. Stochastic gradient descent with Nesterov momentum was used for optimization [31]. Each dataset uses the decision variant of the problem, so each instance is one of two classes: either true or false. Therefore, binary cross entropy was used for the loss function. To help deal with the issue of overfitting on the training set, a dropout layer with a 25% dropout rate is used in the final densely connected layers, along with early stopping during the training loop. During training, the loss and accuracy on the training and validation sets are tracked for every epoch. After training completes, the loss and accuracy on the test set are recorded, along with the total number of epochs.

CHAPTER 5
Results

This chapter discusses the performance of GNN models trained on P and NP variants of the graph coloring and Boolean satisfiability problems. The first section includes an interpretation of the results as they apply to the P vs NP problem, and the second section analyzes the differences between the stronger model architecture and the regular model on the NP variants.

The results are shown in a set of charts below, grouped by problem. In the case of the SAT problem, the two representations have separate results. Figures 5.1, 5.3, and 5.5 show how each model performs epoch by epoch on the validation data for each problem variant and size, and for each subset of the training data. Reading left to right, the columns show the P variant, the NP variant, then the NP variant using the strong model, respectively. Each row shows data for a single problem size (e.g., SAT with |X| = 10). Therefore, reading top to bottom shows how the performance changes as the problem variant grows in size. Each line in an individual plot is the performance on the holdout validation set during the training loop for a model trained on a certain subset of the training data. Figures 5.2, 5.4, and 5.6 show the performance for each model on the test set for each dataset. Similar to the previous set of figures, a row shows information for a given problem size. In these charts, the different variants are grouped together by the size of the training set to simplify comparisons. Validation loss calculations were stored during training in addition to the accuracy scores, but for brevity they are not shown in full here.
The loss over time largely mirrors the accuracy, falling as accuracy rises, so including this data adds little value. A sample from the smallest graph coloring dataset can be seen in Figure 5.7. All models were trained on a machine with two Intel Xeon Gold 6230 CPUs, four Nvidia Tesla V100S GPUs each with 32GB of memory, and 384 GB of RAM, running Ubuntu Linux as the operating system.

5.1 P vs NP
One of the most obvious takeaways from these results is that the models trained on the P datasets outperform the NP models in every case. Even with a minimal amount of training data, the P models outperformed the NP models using the full dataset. For example, in the first row of Figure 5.1 showing the n ∈ U(10,30) datasets, the leftmost column containing the χ = 2 results shows a higher accuracy score than the χ = 3 results in the center column for all sizes of the training set. As the size increases up to n ∈ U(40,60) and n ∈ U(70,90) for the second and third rows respectively, the accuracy decreases for both variants, but the χ = 2 models consistently outperform the χ = 3 models. This pattern is even more extreme for the SAT problem results.

Figure 5.1: The validation accuracy over time for every combination of problem variant and dataset size for the coloring problem.
Figure 5.2: The results on the test set for every combination of problem variant and dataset size for the coloring problem.
Figure 5.3: The validation accuracy over time for every combination of problem variant and dataset size for the SAT problem (using the variable-variable representation).
Figure 5.4: The results on the test set for every combination of problem variant and dataset size for the SAT problem (using the variable-variable representation).
Figure 5.5: The validation accuracy over time for every combination of problem variant and dataset size for the SAT problem (using the clause-variable representation).
Figure 5.6: The results on the test set for every combination of problem variant and dataset size for the SAT problem (using the clause-variable representation).
Figure 5.7: The validation loss over time for the smallest dataset of the coloring problem.

Table 5.1: Number of Nodes in SAT Problems Using Clause-Variable Representation
Problem | Variables | Min | Max | μ | σ
2-SAT | 10 | 15 | 72 | 31.72 | 6.11
2-SAT | 25 | 36 | 115 | 67.99 | 9.6
2-SAT | 40 | 56 | 160 | 102.88 | 12.30
3-SAT | 10 | 35 | 142 | 61.82 | 11.27
3-SAT | 25 | 95 | 225 | 139.98 | 14.89
3-SAT | 40 | 165 | 311 | 217.96 | 17.15

Regardless of the representation used, models trained on the 3-SAT datasets were unable to learn anything significant about the data, even with the full dataset. This is likely caused by a combination of factors. The clause-variable representation for 3-SAT problems created significantly larger graphs, with greater variation in size, than for 2-SAT. The 3-SAT datasets with 25 and 40 variables contain by far the largest graphs of all the datasets. The variable-variable representation used may not fully encode the underlying structure of the problem by not including which specific clauses two variables share, only that they share a clause. However, because models were able to achieve some amount of success on the 2-SAT datasets, it is less likely that the specific representation plays a large role. Overall, this suggests that the NP datasets are more intrinsically difficult than the P datasets and that the models trained on the NP datasets need significantly more data to reach equivalent performance to the P models.
With these results, it is difficult to determine exactly how much the size of the problem affects the fi-nal performance. Because of the poor performance on the 3-SAT datasets, we must focus on the coloring problem. As NP problems increase in size, the difficulty in finding a solution increases exponentially, so the expectation is that as the problem increases in size, we should see that the NP variants decrease in accuracy at a greater rate than the P variants. Instead, we see that the accuracy for both the P and NP variants decrease 25 by similar amounts in Figure 5.2. Using the full training datasets, the accuracy decreases by 21% from the simplest dataset to the middle dataset for both variants, then 4% to the biggest dataset. This behavior is similar regardless of the size of the training set, across all problem sizes. This seems to indicate that the difficulty of P problems scales with size at about the same rate as that of NP problems. The final comparison to make between the P and NP variants is how the amount of training data affects model performance. It is obvious that having more training data allows the model to generalize better to unseen inputs. These results show that this is even more important for the NP models than it is for the P models. Given more data, it is not inconceivable that a model trained on the simplest graph coloring problem could reach accuracies much closer to the P variant. There does not seem to be much correlation between the size of the training set and the rate of learning or the number of epochs taken. What the results show most strongly is the need for large amounts of training data for NP problems when compared to P problems. 5.2 Stronger Model Performance In an attempt to show that the NP problems are not unsolvable we can show that, in addition to increasing the size of the training set, performance can improve by increasing the power of the model itself. Just by increasing the depth of the model, a noticeable improvement on the validation and test sets was observed for every dataset, most notably in the second and third columns of Figure 5.1. The stronger architecture was still unable to learn anything significant on the 3-SAT datasets, but a small increase in performance was observed. Its performance on the coloring problem lends further credence to the idea that, given even more training data, the model could reach similar performance to the P variant. Even though the strong model used in this work has added complexity compared to the regular model, it is clear that naively adding depth alone is not enough to make up the difference between the P and NP problems. While the models here use simple GNN architectures, other works using custom models have achieved greater results than those found here, and on harder problems. For the graph coloring problem, Lemos et al. build a model to calculate the exact chromatic number for graphs with n ∼ U(40,60) and χ ∈ [3,7] with 82% accuracy [13]. Selsam et al. create a model to predict satisfiability for 3-SAT problems with |X| = 40 with 85% accuracy, training on millions of instances [10]. Clearly NP problems are solvable with GNNs, but the strength of these models come from an architecture custom built for the problem at hand instead of an off-the-shelf model, in addition to a much larger training set. 
26 CHAPTER 6 Conclusion In this work, we have shown that the inherent difference in difficulty between P and NP problems translates to a difference in performance for graph neural network models trained on P and NP datasets. The datasets used were for the graph coloring and Boolean satisfiability problems, both of which are known NP problems that also have variants in P. We showed that models trained on P datasets outperform those trained on NP datasets with the same amount of data, implying that NP datasets require far more training data to achieve similar performance. Our results indicate that increasing the size of the problem decreases the performance significantly for both P and NP problems. We show that as the size increases for the coloring problem, the performance degrades at a similar rate for both the P and NP variants. Because of the poor performance on the 3-SAT datasets, we cannot generalize these results to all problems. Finally, we show that increasing the power of the GNN model can add a marginal increase in performance on the NP datasets. 6.1 Limitations It is clear that the datasets generated for this work were far too small, especially for the 3-SAT problems. This issue prevents a proper analysis of the questions that this work seeks to answer. Ideally, the datasets would be significantly larger such that the models training on the NP variants could learn something substantial. Because of time constraints, hyperparameter tuning was not performed during training for any model. While finding the best performance for each model was not a priority, this does somewhat weaken the con-clusions made in this work. In addition, it is possible that better hyperparameters could have improved the performance of the NP models, particularly the 3-SAT models. Similarly, the model architecture was only subject to minimal tuning. The GIN GNN layers were selected because of its variant that handles edge at-tributes, rather than being chosen because it was the best performing. Other model architectures may have shown improved performance. 6.2 Future Work As mentioned previously, the biggest weakness of this work is that the datasets were not large enough. A better analysis could be made if the models were trained on larger datasets, possibly in the range of hundreds of thousands or more. Repeating the work done here with more training data would hopefully allow the NP models to achieve higher performance, but also to possibly show how much data is ”enough” for each variant. It would be interesting to repeat this work using a model custom-built to solve a particular NP problem, 27 such as the models referenced in Section 5.2, in place of the strong model used here. Evaluating the perfor-mance of these models on NP datasets in comparison to a standard model would likely reinforce the findings in this work, and perhaps provide more insights on how problem complexity relates to training sizes, model complexity, and model performance. 28 References [1] W. L. Hamilton, “Graph representation learning,” Synthesis Lectures on Artifical Intelligence and Ma-chine Learning, vol. 14, no. 3, pp. 1–159, 2020. [2] S. Cook, “The p versus np problem,” Clay Mathematics Institute, vol. 2, 2000. [3] B. Perozzi, R. Al-Rfou, and S. Skiena, “Deepwalk: Online learning of social representations,” in Pro-ceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp. 701–710. [4] A. Grover and J. 
Leskovec, “node2vec: Scalable feature learning for networks,” in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 2016, pp. 855–864. [5] K. Xu, W. Hu, J. Leskovec, and S. Jegelka, “How powerful are graph neural networks?” arXiv preprint arXiv:1810.00826, 2018. [6] W. Hu, B. Liu, J. Gomes, M. Zitnik, P. Liang, V. Pande, and J. Leskovec, “Strategies for pre-training graph neural networks,” arXiv preprint arXiv:1905.12265, 2019. [7] D. Bacciu, F. Errica, A. Micheli, and M. Podda, “A gentle introduction to deep learning for graphs,” Neural Networks, vol. 129, pp. 203–221, 2020. [8] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, and S. Y. Philip, “A comprehensive survey on graph neural networks,” IEEE transactions on neural networks and learning systems, vol. 32, no. 1, pp. 4–24, 2020. [9] B. B¨unz and M. Lamm, “Graph neural networks and boolean satisfiability,” arXiv preprint arXiv:1702.03592, 2017. [10] D. Selsam, M. Lamm, B. B¨unz, P. Liang, L. de Moura, and D. L. Dill, “Learning a sat solver from single-bit supervision,” arXiv preprint arXiv:1802.03685, 2018. [11] V. Kurin, S. Godil, S. Whiteson, and B. Catanzaro, “Can q-learning with graph networks learn a gen-eralizable branching heuristic for a sat solver?” Advances in Neural Information Processing Systems, vol. 33, pp. 9608–9621, 2020. [12] A. Hertz and D. de Werra, “Using tabu search techniques for graph coloring,” Computing, vol. 39, no. 4, pp. 345–351, 1987. [13] H. Lemos, M. Prates, P. Avelar, and L. Lamb, “Graph colouring meets deep learning: Effective graph neural network models for combinatorial problems,” in 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, 2019, pp. 879–885. [14] W. Li, R. Li, Y. Ma, S. O. Chan, and B. Yu, “Rethinking graph neural networks for graph coloring,” 2020. [15] M. J. Schuetz, J. K. Brubaker, Z. Zhu, and H. G. Katzgraber, “Graph coloring with physics-inspired graph neural networks,” arXiv preprint arXiv:2202.01606, 2022. [16] C. K. Joshi, T. Laurent, and X. Bresson, “An efficient graph convolutional network technique for the travelling salesman problem,” arXiv preprint arXiv:1906.01227, 2019. [17] E. Khalil, H. Dai, Y. Zhang, B. Dilkina, and L. Song, “Learning combinatorial optimization algorithms over graphs,” Advances in neural information processing systems, vol. 30, 2017. [18] P. Erd¨os and A. R´enyi, “On random graphs i,” Publicationes Mathematicae Debrecen, vol. 6, p. 290, 1959. 29 [19] A.-L. Barab´asi and R. Albert, “Emergence of scaling in random networks,” science, vol. 286, no. 5439, pp. 509–512, 1999. [20] R. M. Karp, “Reducibility among combinatorial problems,” in Complexity of computer computations. Springer, 1972, pp. 85–103. [21] A. S´anchez-Arroyo, “Determining the total colouring number is np-hard,” Discrete Mathematics, vol. 78, no. 3, pp. 315–319, 1989. [22] K. Appel and W. Haken, “Every planar map is four colorable,” Bulletin of the American mathematical Society, vol. 82, no. 5, pp. 711–712, 1976. [23] B. Aspvall, M. F. Plass, and R. E. Tarjan, “A linear-time algorithm for testing the truth of certain quan-tified boolean formulas,” Information processing letters, vol. 8, no. 3, pp. 121–123, 1979. [24] E. Yolcu and B. P´oczos, “Learning local search heuristics for boolean satisfiability,” Advances in Neural Information Processing Systems, vol. 32, 2019. [25] L. Perron and V. Furnon, “Or-tools,” Google. [Online]. Available: https://developers.google.com/ optimization/ [26] A. Ignatiev, A. 
Morgado, and J. Marques-Silva, “PySAT: A Python toolkit for prototyping with SAT oracles,” in SAT, 2018, pp. 428–437. [Online]. Available: https://doi.org/10.1007/ 978-3-319-94144-8 26 [27] M. Fey and J. E. Lenssen, “Fast graph representation learning with pytorch geometric,” arXiv preprint arXiv:1903.02428, 2019. [28] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, “Pytorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch´e-Buc, E. Fox, and R. Garnett, Eds. Curran Associates, Inc., 2019, pp. 8024–8035. [Online]. Available: http://papers.neurips.cc/paper/ 9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf [29] W. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” Advances in neural information processing systems, vol. 30, 2017. [30] C. Morris, M. Ritzert, M. Fey, W. L. Hamilton, J. E. Lenssen, G. Rattan, and M. Grohe, “Weisfeiler and leman go neural: Higher-order graph neural networks,” in Proceedings of the AAAI conference on artificial intelligence, vol. 33, no. 01, 2019, pp. 4602–4609. [31] Y. Nesterov, “A method for unconstrained convex minimization problem with the rate of convergence o(1/kˆ2),” 1983. 30 |
Format | application/pdf |
ARK | ark:/87278/s6knfxg6 |
Setname | wsu_smt |
ID | 96875 |
Reference URL | https://digital.weber.edu/ark:/87278/s6knfxg6 |