SDP Gaps from Pairwise Independence

We consider the problem of approximating fixed-predicate constraint satisfaction problems (MAX $k$-CSP$_q(P)$), where the variables take values from $[q] = \{0, 1, \ldots, q-1\}$, and each constraint is on $k$ variables and is defined by a fixed $k$-ary predicate $P$. Familiar problems like MAX 3-SAT and MAX-CUT belong to this category. Austrin and Mossel recently identified a general class of predicates $P$ for which MAX $k$-CSP$_q(P)$ is hard to approximate. They study predicates $P : [q]^k \to \{0, 1\}$ such that the set of assignments accepted by $P$ contains the support of a balanced pairwise independent distribution over the domain of the inputs. We refer to such predicates as promising. Austrin and Mossel show that for any promising predicate $P$, the problem MAX $k$-CSP$_q(P)$ is Unique-Games-hard to approximate better than the trivial approximation obtained by a random assignment. We give an unconditional analogue of this result in a restricted model of computation. We consider the hierarchy of semidefinite relaxations of MAX $k$-CSP$_q(P)$ obtained by augmenting the canonical semidefinite relaxation with the Sherali-Adams hierarchy. We show that for any promising predicate $P$, the integrality gap remains the same as the approximation ratio achieved by a random assignment, even after $\Omega(n)$ levels of this hierarchy.


Introduction
A constraint satisfaction problem (CSP) is defined by a set of constraints on a set of $q$-valued variables. In the maximization version (MAX-CSP) one tries to maximize the number of constraints that can be simultaneously satisfied. In this paper, we consider the most commonly studied families of CSPs, denoted by MAX $k$-CSP$_q(P)$, defined as follows. The variables take values over the fixed alphabet $[q] = \{0, 1, \ldots, q-1\}$. All constraints are defined by a single predicate $P : [q]^k \to \{0, 1\}$ and have the form $P(x_1 + b_1 \bmod q, \ldots, x_k + b_k \bmod q)$ for $b_i \in [q]$. For instance, MAX 3-SAT is the same as MAX 3-CSP$_2(\mathrm{OR}_3)$, where $\mathrm{OR}_3$ denotes the 3-variable OR function. We refer to the terms $x_i + b_i \bmod q$ as literals, generalizing from the Boolean case, where the literals are $x_i$ and $\neg x_i = x_i + 1 \bmod 2$.
Given a predicate $P$, an instance of the MAX $k$-CSP$_q(P)$ problem is a collection of constraints as above and the objective is to maximize the number of constraints that can be satisfied simultaneously. As special cases, we obtain many well-studied MAX-CSP problems, e.g., MAX $k$-SAT, MAX $k$-XOR, MAX $k$-LIN$_q$ (systems of mod $q$ linear equations with $k$ variables per equation), etc. Note that MAX 2-LIN$_2$ includes MAX-CUT. We use MAX $k$-CSP$_q$ to denote the union of the classes MAX $k$-CSP$_q(P)$ over all $P : [q]^k \to \{0, 1\}$.
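To make the definition concrete, the following is a minimal sketch of how an instance of MAX $k$-CSP$_q(P)$ can be represented and evaluated. The representation (a constraint as a pair of variable indices and shifts) and the names `count_satisfied`, `brute_force_opt` and `or3` are illustrative assumptions, not notation from the paper. The example instance consists of all eight sign patterns of a 3-clause over the same three variables, so every assignment violates exactly one clause.

```python
from itertools import product

def count_satisfied(assignment, constraints, predicate, q):
    """Number of constraints P(x_{i_1}+b_1 mod q, ..., x_{i_k}+b_k mod q) satisfied."""
    total = 0
    for variables, shifts in constraints:
        literals = tuple((assignment[v] + b) % q for v, b in zip(variables, shifts))
        total += 1 if predicate(literals) else 0
    return total

def brute_force_opt(n, constraints, predicate, q):
    """Integral optimum by exhaustive search; only feasible for tiny n."""
    return max(count_satisfied(a, constraints, predicate, q)
               for a in product(range(q), repeat=n))

def or3(bits):
    """The 3-variable OR predicate, so MAX 3-CSP_2(or3) is MAX 3-SAT."""
    return any(bits)

# All 8 clauses over variables 0, 1, 2; a shift of 1 negates the literal.
all_clauses = [((0, 1, 2), shifts) for shifts in product(range(2), repeat=3)]
print(brute_force_opt(3, all_clauses, or3, 2))
```

A uniformly random assignment satisfies each clause with probability $7/8 = |P^{-1}(1)|/q^k$, which matches the optimum of this particular instance.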
The MAX $k$-CSP$_q$ problem is NP-hard for any $k \ge 2$, $q \ge 2$. A lot of effort has been devoted to determining the true approximability threshold of the problem as a function of $k$ and $q$. For the Boolean case ($q = 2$), Samorodnitsky and Trevisan [25] proved that the problem is hard to approximate within a factor less than $2^k/2^{2\sqrt{k}}$, which was improved to $2^k/2^{\sqrt{2k}}$ by Engebretsen and Holmerin [12]. Later Samorodnitsky and Trevisan [24] showed that it is Unique-Games-hard (UG-hard) to approximate the same problem within a factor less than $2^k/2^{\lceil \log(k+1) \rceil}$. For the general case of $q$-valued variables (MAX $k$-CSP$_q$), Guruswami and Raghavendra [16] proved UG-hardness of approximation within a factor less than $q^k/(kq^2)$ when $q$ is a prime.
Austrin and Mossel [5] studied the complexity of MAX $k$-CSP$_q(P)$ for predicates $P : [q]^k \to \{0, 1\}$ such that the set of accepted inputs $P^{-1}(1)$ contains the support of a balanced pairwise independent distribution on $[q]^k$. We shall refer to such predicates as promising. In a very general result, which (assuming the Unique Games Conjecture) subsumes all the above, Austrin and Mossel [5] showed that for a promising predicate $P$, the MAX $k$-CSP$_q(P)$ problem is UG-hard to approximate within a factor less than $q^k/|P^{-1}(1)|$. Considering that a random assignment satisfies a $|P^{-1}(1)|/q^k$ fraction of all the constraints, this is the strongest result one can get for such $P$. Using appropriate choices for the predicate $P$, this then implies a hardness ratio of $q^k/((1 + o(1))kq^2)$ for MAX $k$-CSP$_q$ for any $q \ge 2$, a ratio of $q^k/(kq(q-1))$ when $q$ is a prime power, and $2^k/(k + O(k^{0.525}))$ for $q = 2$.
In this paper, we study the inapproximability of MAX $k$-CSP$_q(P)$ for promising predicates $P$, in a restricted model of computation. The model we consider is the hierarchy of semidefinite relaxations of MAX $k$-CSP$_q(P)$, given by an application of the Sherali-Adams [29] strengthenings of the canonical semidefinite relaxation of MAX $k$-CSP$_q(P)$. We give an unconditional analogue of the result of Austrin and Mossel in this hierarchy, demonstrating that even after $\Omega(n)$ levels of this hierarchy, the integrality gap remains at least $q^k/|P^{-1}(1)|$, which is the approximation ratio achieved by a random assignment. (The implied constant in the $\Omega(n)$ notation depends on $k$ and $q$.)

Hierarchies of linear and semidefinite programs
A standard approach in approximating NP-hard problems, and therefore MAX $k$-CSP$_q$, is to formulate the problem as a 0-1 integer program and then relax the integrality condition to get a linear (or semidefinite) program which can be solved efficiently. The quality of such an approach is intimately related to the integrality gap of the relaxation, namely, the ratio between the optimum of the relaxation and that of the integer program.
Several methods (or procedures) were developed in order to obtain tightenings of relaxations in a systematic manner. These procedures give a sequence or a hierarchy of increasingly tighter relaxations of the starting program. The commonly studied ones include the hierarchies defined by Lovász-Schrijver [19], Sherali-Adams [29], and Lasserre [17] (see [18] for a comparison). Stronger relaxations in the sequence are referred to as higher levels of the hierarchy. It is known for all these hierarchies that for a starting program with $n$ variables, the program at level $n$ has integrality gap 1, and that it is possible to optimize over the program at the $r$th level in time $n^{O(r)}$.
Many known linear (semidefinite) programs can be captured by a constant number of levels of the Sherali-Adams (Lasserre) hierarchy. In fact, these semidefinite programs can also be captured by a "mixed" hierarchy, first studied by Raghavendra [22], where we augment the basic semidefinite relaxation by adding new real variables and imposing linear constraints according to the Sherali-Adams hierarchy. We will refer to this hierarchy of programs as the Sherali-Adams SDP hierarchy.
Fernandez de la Vega and Kenyon-Mathieu [14] have provided a PTAS for MAX-CUT in dense graphs using the Sherali-Adams LP hierarchy. In [20] it is shown how to get a Sherali-Adams based PTAS for Vertex-Cover and Max-Independent-Set in minor-free graphs, while recently Mathieu and Sinclair [21] showed that the integrality gap for the matching polytope is asymptotically $1 + 1/r$, and Bateni, Charikar and Guruswami [6] showed that the integrality gap for a natural LP formulation of the MaxMin allocation problem is at most $n^{1/r}$, both after $r$ Sherali-Adams tightenings. Chlamtac [10] and Chlamtac and Singh [11] gave an approximation algorithm for Max-Independent-Set in hypergraphs based on the Lasserre hierarchy, with the performance depending on the number of levels. Recently, an $O(n^{1/4})$ approximation for Densest $k$-Subgraph was shown by Bhaskara et al. [7], using linear programs implied by $O(\log n)$ levels of the Lovász-Schrijver hierarchy.
Lower bounds in these hierarchies amount to showing that the integrality gap remains large even after many levels of the hierarchy. Integrality gaps for $\Omega(n)$ levels can be seen as unconditional lower bounds (as they rule out even exponential time algorithms obtained by the hierarchy) in a restricted (but still fairly interesting) model of computation. Considerable effort has been invested in proving such lower bounds (see [3,31,30,28,8,13,1,2,27,15,14]). For some CSPs in particular, strong lower bounds ($\Omega(n)$ levels) have recently been proved for the Lasserre hierarchy (which is the strongest) by [26] and [32], who showed a factor 2 integrality gap for MAX $k$-XOR and a factor $2^k/(2k)$ integrality gap for MAX $k$-CSP respectively.
In a beautiful result, Raghavendra [22] showed a general connection between integrality gaps and UG-hardness results. His result essentially shows that for MAX $k$-CSP$_q(P)$, if the integrality gap of a program obtained by $k$ levels of the Sherali-Adams SDP hierarchy is $I$, then MAX $k$-CSP$_q(P)$ is UG-hard to approximate better than a factor of $I$. However, in our case the hardness is already known (by the work of Austrin and Mossel), and we are interested in finding the integrality gap for programs obtained by $\Omega(n)$ levels.

Our result and techniques
Both the known results in the Lasserre hierarchy (and previous analogues in the Lovász-Schrijver hierarchy) seemed to heavily rely on the structure of the predicate for which the integrality gap was proved; in particular, the predicate is always some system of linear equations. It was not clear if the techniques could be extended using only the fact that the predicate is promising (which is a much weaker condition). In this paper, we try to explore this issue, proving $\Omega(n)$ level gaps for the Sherali-Adams SDP hierarchy for any promising predicate.
Theorem 1.1. Let $P : [q]^k \to \{0, 1\}$ be a predicate such that $P^{-1}(1)$ contains the support of a balanced pairwise independent distribution $\mu$. Then for every constant $\zeta > 0$, there exists $c = c(q, k, \zeta) > 0$ such that for large enough $n$, the integrality gap of MAX $k$-CSP$_q(P)$ for the tightening obtained by $cn$ levels of the Sherali-Adams SDP hierarchy is at least
$$\frac{q^k}{|P^{-1}(1)|} - \zeta.$$

Remark 1.2. We note that weaker integrality gaps for these predicates also follow, via reductions, from the corresponding integrality gap results for Unique Games. In particular, a $(\log \log n)^{\Omega(1)}$-level gap for the SDP hierarchy discussed above follows from the recent results of Raghavendra and Steurer [23]. Also, $\Omega(n^{\delta})$-level gaps (where $\delta \to 0$ as $\zeta \to 0$) for the Sherali-Adams LP hierarchy can be deduced from the results of Charikar, Makarychev, and Makarychev [9].
A first step in achieving our result is to reduce the problem of a level-$t$ gap to a question about a family of distributions over assignments associated with sets of variables of size at most $t$. These distributions should be (a) supported only on satisfying (partial) assignments, (b) consistent among themselves, in the sense that for subsets of variables $S_1 \subseteq S_2$, the distributions over $S_1$ and $S_2$ should be equal on $S_1$, and (c) balanced and pairwise independent. The first requirement guarantees that the solution achieves an objective value that corresponds to satisfying all the constraints of the instance. The second requirement implies feasibility for the Sherali-Adams LP constraints, while the last one makes it easy to produce vectors satisfying the semidefinite constraints.
The second step is to come up with these distributions. We explain why the simple method of picking a uniform distribution (or a reweighting of it according to the pairwise independent distribution that is supported by $P$) over the satisfying assignments cannot work. Instead we introduce the notion of "advice sets." These are sets on which it is "safe" to define such simple distributions. The actual distribution we use for a set $S$ is then the one induced on $S$ by a simple distribution defined on the advice-set of $S$. Getting such advice sets heavily relies on notions of expansion of the constraint graph. In particular, we use the fact that random instances have inherently good expansion properties. At the same time, such instances are highly unsatisfiable, ensuring that the resulting integrality gap is large.
Arguing that it is indeed "safe" to use simple distributions over the advice sets relies on the fact that the predicate $P$ in question is promising, namely $P^{-1}(1)$ contains the support of a balanced pairwise independent distribution. We find it interesting and somewhat curious that the condition of pairwise independence comes up in this context for a reason very different than in the case of UG-hardness. Here, it represents the limit to which the expansion properties of a random CSP instance can be pushed to define such distributions.

Preliminaries and notation

Constraint satisfaction problems
For an instance $\Phi$ of MAX $k$-CSP$_q$, we denote the variables by $\{x_1, \ldots, x_n\}$, their domain $\{0, \ldots, q-1\}$ by $[q]$, and the constraints by $C_1, \ldots, C_m$. Each constraint is a function of the form $C_i : [q]^{T_i} \to \{0, 1\}$ depending only on the values of the variables in the ordered tuple $T_i$.
For a set $S \subseteq [n]$ of variables, we denote by $[q]^S$ the set of all mappings from the set $S$ to $[q]$. In the context of variables, these mappings can be understood as partial assignments to a given subset of variables. For $\alpha \in [q]^S$, we denote its projection to $S' \subseteq S$ by $\alpha(S')$. Also, for $\alpha_1 \in [q]^{S_1}$, $\alpha_2 \in [q]^{S_2}$ such that $S_1 \cap S_2 = \emptyset$, we denote by $\alpha_1 \circ \alpha_2$ the assignment over $S_1 \cup S_2$ defined by $\alpha_1$ and $\alpha_2$.
We shall prove results for constraint satisfaction problems where every constraint is specified by the same Boolean predicate $P : [q]^k \to \{0, 1\}$. We denote the set of assignments for which the predicate evaluates to 1 by $P^{-1}(1)$. A CSP instance for such a problem is a collection of constraints of the form of $P$ applied to $k$-tuples of literals. For a variable $x$ with domain $[q]$, we take a literal to be $(x + a) \bmod q$ for any $a \in [q]$. We record this more formally in the following definition.
Definition 2.1. For a given $P : [q]^k \to \{0, 1\}$, an instance $\Phi$ of MAX $k$-CSP$_q(P)$ is a set of constraints $C_1, \ldots, C_m$, where each constraint $C_i$ is over a $k$-tuple of variables $T_i = (x_{i_1}, \ldots, x_{i_k})$ and is of the form $P(x_{i_1} + a_{i_1}, \ldots, x_{i_k} + a_{i_k})$ for some $a_{i_1}, \ldots, a_{i_k} \in [q]$. We denote the maximum number of constraints that can be simultaneously satisfied by OPT($\Phi$).

Expanding CSP instances
For an instance $\Phi$ of MAX $k$-CSP$_q$, define its constraint graph $G_\Phi$ as the following bipartite graph from $L$ to $R$. The left hand side $L$ consists of a vertex for each constraint $C_i$. The right hand side $R$ consists of a vertex for each variable $x_j$. There is an edge between a constraint-vertex $i$ and a variable-vertex $j$ whenever variable $x_j$ appears in constraint $C_i$. When $\Phi$ is clear from the context, we will abbreviate $G_\Phi$ by $G$.
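For a set of constraints $\mathcal{C} \subseteq L$, we denote by $\Gamma(\mathcal{C})$ its set of neighbors in $R$. For a set of variables $S \subseteq R$, we denote by $C(S)$ the set of constraints dominated by $S$, i.e., the constraints $C_i$ with $T_i \subseteq S$, and by $G|_{-S}$ the graph obtained from $G$ by removing $S$ and $C(S)$. Our result relies on the expansion of the support sets of the constraints. We make this notion formal below.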

The Sherali-Adams SDP hierarchy
Below we present a relaxation for the MAX $k$-CSP$_q$ problem as it is obtained by applying a level-$t$ Sherali-Adams tightening to the basic SDP formulation of some instance $\Phi$ of MAX $k$-CSP$_q$. A well known fact states that the level-$n$ Sherali-Adams tightening (even starting from a linear program) provides a perfect formulation, i.e., the integrality gap is 1 (see [29] or [18] for a proof).
The intuition behind the level-$t$ Sherali-Adams tightening is the following. Note that an integer solution to the problem can be given by a single mapping $\alpha_0 \in [q]^{[n]}$, which is an assignment to all the variables. Using this, we can define 0/1 variables $X_{(S,\alpha)}$ for each $S \subseteq [n]$ such that $|S| \le t$ and $\alpha \in [q]^S$. The intended solution is $X_{(S,\alpha)} = 1$ if $\alpha_0(S) = \alpha$ and 0 otherwise. We introduce $X_{(\emptyset,\emptyset)}$, which is intended to be 1. By relaxing the integrality constraint on the variables, we obtain the LP conditions given by level $t$ of the Sherali-Adams hierarchy.
We can further strengthen the integer program by adding the quadratic constraints 4 As solving quadratic programs is NP-hard we then relax these quadratic constraints to the existence of vectors v (i, j) and a unit vector v 0 , for which we require that The complete relaxation can be seen in Figure 1.
For an SDP formulation of MAX $k$-CSP$_q$, and for a given instance $\Phi$ of the problem, we denote by FRAC($\Phi$) the SDP (fractional) optimum, and by OPT($\Phi$) the integral optimum. For the particular instance $\Phi$, the integrality gap is then defined as FRAC($\Phi$)/OPT($\Phi$). The integrality gap of the formulation is the supremum of integrality gaps over all instances.
Next we give a sufficient condition for the existence of a solution to the program obtained by level $t$ of the Sherali-Adams SDP hierarchy for a MAX $k$-CSP$_q$ instance $\Phi$.
Lemma 2.3.Consider a family {D(S)} S⊆[n]:|S|≤t of distributions, where each D(S) is defined over [q] S .Suppose that for every S ⊆ T ⊆ [n] with |T | ≤ t, the distributions D(S), D(T ) are equal on S and there exists a set v (i, j) i∈[n], j∈[q] of vectors and a unit vector v 0 satisfying: Then the vectors together with the LP variables X (S,α) = Pr D(S) [α] form a feasible solution to the program in Figure 1.
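Proof. Consider some $S \subseteq [n]$ with $|S| < t$, some $i \notin S$, and some $\alpha \in [q]^S$. Note that the distributions $D(S)$ and $D(S \cup \{i\})$ are equal on $S$, and therefore we have
$$\sum_{j \in [q]} X_{(S \cup \{i\},\, \alpha \circ j)} = \sum_{j \in [q]} \Pr_{D(S \cup \{i\})}[\alpha \circ j] = \Pr_{D(S)}[\alpha] = X_{(S,\alpha)},$$
where $\alpha \circ j$ denotes the extension of $\alpha$ that assigns the value $j$ to variable $i$. The same argument for $S = \emptyset$ shows that $X_{(\emptyset,\emptyset)} = 1$. It is clear that the solution satisfies all the other required conditions by definition, which proves the lemma.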

Pairwise independence and approximation resistant predicates
We say that a distribution $\mu$ over variables $x_1, \ldots, x_k$ is a balanced pairwise independent distribution over $[q]^k$ if we have
$$\forall i\ \forall j \in [q]:\ \Pr_\mu[x_i = j] = \frac{1}{q} \qquad \text{and} \qquad \forall i_1 \ne i_2\ \forall j_1, j_2 \in [q]:\ \Pr_\mu[(x_{i_1} = j_1) \wedge (x_{i_2} = j_2)] = \frac{1}{q^2}.$$
A predicate $P$ is called approximation resistant if it is hard to approximate the MAX $k$-CSP$_q(P)$ problem better than using a random assignment. Assuming the Unique Games Conjecture, Austrin and Mossel [5] show that a predicate is approximation resistant if it is possible to define a balanced pairwise independent distribution $\mu$ such that $P$ is always 1 on the support of $\mu$.
Definition 2.4. A predicate $P : [q]^k \to \{0, 1\}$ is called promising if there exists a distribution supported over a subset of $P^{-1}(1)$ that is pairwise independent and balanced. If $\mu$ is such a distribution we say that $P$ is promising supported by $\mu$.
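As a small illustration of Definition 2.4, the sketch below checks by direct enumeration whether a distribution on $[q]^k$ is balanced and pairwise independent, and applies the check to the uniform distribution on the accepting inputs of a predicate. The function names are hypothetical, and the check is only a sufficient test for promisingness: a predicate may be promising via a non-uniform distribution that this test does not try.

```python
from fractions import Fraction
from itertools import combinations, product

def is_balanced_pairwise_independent(mu, k, q):
    """mu maps tuples in [q]^k to exact probabilities (Fractions)."""
    if sum(mu.values()) != 1:
        return False
    # Balance: every coordinate is uniform on [q].
    for i in range(k):
        for a in range(q):
            if sum(p for x, p in mu.items() if x[i] == a) != Fraction(1, q):
                return False
    # Pairwise independence: every pair of coordinates is uniform on [q]^2.
    for i, j in combinations(range(k), 2):
        for a, b in product(range(q), repeat=2):
            mass = sum(p for x, p in mu.items() if x[i] == a and x[j] == b)
            if mass != Fraction(1, q * q):
                return False
    return True

def uniform_on_accepting(predicate, k, q):
    """Uniform distribution over P^{-1}(1)."""
    accepting = [x for x in product(range(q), repeat=k) if predicate(x)]
    return {x: Fraction(1, len(accepting)) for x in accepting}

def xor3(x):
    return sum(x) % 2 == 1

def or3(x):
    return any(x)

print(is_balanced_pairwise_independent(uniform_on_accepting(xor3, 3, 2), 3, 2))
print(is_balanced_pairwise_independent(uniform_on_accepting(or3, 3, 2), 3, 2))
```

For 3-XOR the uniform distribution on accepting inputs passes the check, whereas for $\mathrm{OR}_3$ it fails already on balance, since each coordinate is 0 with probability $3/7$.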

Towards defining consistent distributions
To construct valid solutions for the Sherali-Adams SDP hierarchy, we need to define distributions over every set $S$ of bounded size, as required by Lemma 2.3. Since we will deal with promising predicates supported by some distribution $\mu$, in order to satisfy consistency between distributions we will heavily rely on the fact that $\mu$ is a balanced pairwise independent distribution. Assume for simplicity that $\mu$ is uniform over $P^{-1}(1)$ (the intuition for the general case is not significantly different). It is instructive to think of $q = 2$ and the predicate $P$ being $k$-XOR, $k \ge 3$. Observe that the uniform distribution over $P^{-1}(1)$ is pairwise independent and balanced. A first attempt would be to define, for every $S$, the distribution $D(S)$ as the uniform distribution over all consistent assignments of $S$. We argue that such distributions are in general problematic. This follows from the fact that satisfying assignments are not always extendible. Indeed, consider two constraints $C_{i_1}, C_{i_2} \in L$ that share a common variable $j \in R$. Set $S_2 = T_{i_1} \cup T_{i_2}$ and $S_1 = S_2 \setminus \{j\}$. Assuming that the support of no other constraint is contained in $S_2$, we get that the distribution $D(S_1)$ assigns each variable in $S_1$ a value in $\{0, 1\}$ uniformly and independently, but some of these assignments are not even extendible to $S_2$, meaning that $D(S_2)$ assigns them probability zero.
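The obstruction described above can be seen on the smallest possible example. The following sketch takes two 3-XOR constraints sharing one variable and enumerates which assignments to $S_1$ extend to a satisfying assignment on $S_2$. The variable numbering and names are illustrative assumptions only.

```python
from itertools import product

# Two 3-XOR constraints (odd parity) sharing variable 2.
T1, T2 = (0, 1, 2), (2, 3, 4)
S2 = (0, 1, 2, 3, 4)
S1 = (0, 1, 3, 4)  # S2 with the shared variable removed

def satisfies_both(assignment):
    """True if the full assignment on S2 satisfies both XOR constraints."""
    return all(sum(assignment[v] for v in T) % 2 == 1 for T in (T1, T2))

# Projections to S1 of all satisfying assignments on S2.
extendible = {tuple(a[v] for v in S1)
              for a in product(range(2), repeat=len(S2))
              if satisfies_both(a)}

# No constraint is contained in S1, so all 16 assignments to S1 are "consistent".
print(len(extendible), "of", 2 ** len(S1), "assignments to S1 extend to S2")
```

Only the assignments with $x_0 \oplus x_1 = x_3 \oplus x_4$ extend, so the uniform distribution on $[q]^{S_1}$ cannot be the marginal of any distribution supported on satisfying assignments of $S_2$.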
Thus, to define $D(S)$, we cannot simply sample assignments satisfying all constraints in $C(S)$ with probabilities given by $\mu$. In fact the above example shows that any attempt to blindly assign a set $S$ a distribution that is supported on all satisfying assignments for $S$ is bound to fail. At the same time it seems hard to reason about a distribution that uses a totally different concept. To overcome this obstacle, we take a two step approach:

1. For a set $S$ we define a superset $\bar{S}$ such that $\bar{S}$ is "global enough" to contain sufficient information, while it is also "local enough" so that $C(\bar{S})$ is not too large. We require of such sets that if we remove $\bar{S}$ and $C(\bar{S})$, then the remaining graph $G|_{-\bar{S}}$ still has good expansion. We deal with this in Section 3.1.
2. When $\mu$ is the uniform distribution over $P^{-1}(1)$, the distribution $D(S)$ is going to be induced by the uniform distribution over satisfying assignments of $\bar{S}$. In the case that $\mu$ is not uniform over $P^{-1}(1)$, we give a natural generalization of the above uniformity. We show how to define distributions, which we denote by $P_\mu(S)$, such that for $S_1 \subseteq S_2$, the distributions $P_\mu(S_1)$ and $P_\mu(S_2)$ are guaranteed to be consistent if $G|_{-S_1}$ has good expansion. This appears in Section 3.2.

We then combine the two techniques and define $D(S)$ according to $P_\mu(\bar{S})$. This is done in Section 4.

Finding advice-sets
We now give an algorithm below to obtain a superset $\bar{S}$ for a given set $S$, which we call the advice-set of $S$. It is inspired by the "expansion correction" procedure in [8].
Initially set $\bar{S} \leftarrow \emptyset$ and $\xi \leftarrow r$
Proof. Let $\xi_S$ be the value of $\xi$ when the loop terminates. From the bounded size of $M_j$ and how $\xi$ changes at each iteration we know that $\xi$ remains non-negative throughout the execution of the while loop, and in particular $\xi_S \ge 0$. Note that at step $j$, all the neighbors of $M_j$ are added to the set $\bar{S}$, so no member of $M_j$ will be in $G|_{-\bar{S}}$ after the $j$th step. In particular, all the sets $M_j$ will be disjoint.
In order to prove (a) we will prove the following loop invariant: G| −S is (ξ , e 2 )-boundary expanding.Indeed, note that the input graph G is (ξ , e 1 )-boundary expanding so the invariant holds at the beginning.At step j consider the set S ∪ {x j }, and suppose that G −(S∪{x j }) is not (ξ , e 2 )-boundary expanding.We To show (b) we consider the set M = t j=1 M j and upperbound and lowerbound its boundary expansion in G in two different ways.Notice that as we mentioned M j 's are disjoint, so |M| = ∑ t j=1 |M j | = r − ξ S .Since G is (r, e 1 )-boundary expanding, the set M has at least e 1 (r − ξ S ) boundary neighbors in G.But each member of ∂ M\S is in the boundary of exactly one of the M j 's, so it will be counted towards the boundary expansion of M j in G −S in the j iteration of the loop (for some j).Given that M j 's have boundary expansion at most e 2 we have which readily implies (b).
Finally note that S consists of S union the neighbors of all M j 's.But given that M j 's have boundary at most expansion e 2 they can not have a big neighbor set.In particular which proves (c).

Defining the Distributions P µ (S)
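We now define for every set $S$ a distribution $P_\mu(S)$ such that for any $\alpha \in [q]^S$, $\Pr_{P_\mu(S)}[\alpha] > 0$ only if $\alpha$ satisfies all the constraints in $C(S)$. For a constraint $C_i$ with set of inputs $T_i$, defined as $C_i(x_{i_1}, \ldots, x_{i_k}) \equiv P(x_{i_1} + a_{i_1}, \ldots, x_{i_k} + a_{i_k})$, let $\mu_i : [q]^{T_i} \to [0, 1]$ denote the distribution
$$\mu_i(x_{i_1}, \ldots, x_{i_k}) = \mu(x_{i_1} + a_{i_1}, \ldots, x_{i_k} + a_{i_k}),$$
so that the support of $\mu_i$ is contained in $C_i^{-1}(1)$. We then define the distribution $P_\mu(S)$ by picking each assignment $\alpha \in [q]^S$ with probability proportional to $\prod_{C_i \in C(S)} \mu_i(\alpha(T_i))$. Formally,
$$\Pr_{P_\mu(S)}[\alpha] = \frac{1}{Z_S} \prod_{C_i \in C(S)} \mu_i(\alpha(T_i)), \tag{3.1}$$
where $\alpha(T_i)$ is the restriction of $\alpha$ to $T_i$ and $Z_S$ is a normalization factor given by
$$Z_S = \sum_{\alpha \in [q]^S} \prod_{C_i \in C(S)} \mu_i(\alpha(T_i)).$$
To understand the distribution, it is easier to think of the special case when $\mu$ is just the uniform distribution on $P^{-1}(1)$ (like in the case of MAX $k$-XOR). Then $P_\mu(S)$ is simply the uniform distribution on assignments satisfying all the constraints in $C(S)$. When $\mu$ is not uniform, the probabilities are weighted by the product of the values $\mu_i(\alpha(T_i))$ for all the constraints. However, we still have the property that if $\Pr_{P_\mu(S)}[\alpha] > 0$, then $\alpha$ satisfies all the constraints in $C(S)$.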
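For small instances the distribution $P_\mu(S)$ can be computed by brute force, which is a convenient way to sanity-check the definition. The sketch below is an illustration under stated assumptions: a constraint is represented as a pair (variables, shifts), $\mu_i$ is taken to be $\mu$ evaluated on the shifted inputs, and the function name `p_mu` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def p_mu(S, constraints, mu, q):
    """Return (Z_S, distribution) where each alpha in [q]^S has probability
    proportional to the product of mu_i(alpha(T_i)) over constraints with T_i inside S."""
    S = list(S)
    position = {v: t for t, v in enumerate(S)}
    dominated = [(T, a) for T, a in constraints if set(T) <= set(S)]
    weights = {}
    for alpha in product(range(q), repeat=len(S)):
        w = Fraction(1)
        for T, a in dominated:
            inputs = tuple((alpha[position[v]] + b) % q for v, b in zip(T, a))
            w *= mu.get(inputs, Fraction(0))
        weights[alpha] = w
    Z = sum(weights.values())
    if Z == 0:
        raise ValueError("Z_S = 0, so the distribution is not well defined")
    return Z, {alpha: w / Z for alpha, w in weights.items() if w > 0}

# mu: uniform over odd-parity inputs of 3-XOR (q = 2, k = 3).
mu_xor = {x: Fraction(1, 4) for x in product(range(2), repeat=3) if sum(x) % 2 == 1}
# Two unshifted 3-XOR constraints sharing variable 2.
two_xors = [((0, 1, 2), (0, 0, 0)), ((2, 3, 4), (0, 0, 0))]
Z, dist = p_mu(range(5), two_xors, mu_xor, 2)
print(Z, len(dist))
```

On this example the normalization factor equals $q^{|S|}/q^{k|C(S)|} = 2^5/2^6$, and the distribution is uniform over the eight assignments satisfying both constraints.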
In order for the distribution $P_\mu(S)$ to be well defined, we need to ensure that $Z_S > 0$. The following lemma shows how to calculate $Z_S$ if $G$ is sufficiently expanding, and simultaneously proves that if $S_1 \subseteq S_2$, and if $G|_{-S_1}$ is sufficiently expanding, then $P_\mu(S_1)$ is consistent with $P_\mu(S_2)$ over $S_1$.
Lemma 3.2. Let $\Phi$ be a MAX $k$-CSP$_q(P)$ instance as above and $S_1 \subseteq S_2$ be two sets of variables such that both $G$ and $G|_{-S_1}$ are $(r, k - 3 + \delta)$-boundary expanding for some $\delta > 0$ and $|C(S_2)| \le r$. Then $Z_{S_2} = q^{|S_2|}/q^{k|C(S_2)|}$, and for any $\alpha_1 \in [q]^{S_1}$,
$$\sum_{\substack{\alpha_2 \in [q]^{S_2} \\ \alpha_2(S_1) = \alpha_1}} \Pr_{P_\mu(S_2)}[\alpha_2] = \Pr_{P_\mu(S_1)}[\alpha_1]. \tag{3.2}$$
Proof. Let $\mathcal{C} = C(S_2) \setminus C(S_1)$ be the set of $t$ constraints dominated by $S_2$ but not $S_1$. Without loss of generality let $\mathcal{C} = \{C_1, \ldots, C_t\}$ with $C_i$ being on the set of variables $T_i$, some of which might be set by $\alpha_1$.
Note that any α 2 consistent with α 1 can be written as α 1 • α for some α ∈ [q] S 2 \S 1 .We will show a way to calculate a sum similar to the left hand side of (3.2).Note that these calculations are meaningful even if Z S 1 or Z S 2 are zero, in which case both sides are simply zero.Taking S 1 = / 0 will then give us the value of Z S 2 .
The following claim lets us calculate this last sum conveniently using the expansion of G| −S 1 .
Claim 3.3.Let C be as above.Then there exists an ordering C i 1 , . . .,C i t of constraints in C and a partition of S 2 \ S 1 into sets of variables F 1 , . . ., F t and F t+1 such that for all j ≤ t, F j ⊆ T i j , |F j | ≥ k − 2, and Proof of Claim 3.3.We build the sets F j inductively using the fact that G| −S 1 is (r, k − 3 + δ )-boundary expanding.
Start with the set of constraints and continue in the same way.What is left from S 2 \ S 1 after the last step will be F t+1 .It is not hard to check that F t+1 is the portion of S 2 \ S 1 that is not in any C i although we will not use this.
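Since at every step we have $F_j \subseteq \partial(\mathcal{C}_j) \setminus S_1$ and, for all $\ell > j$, $\mathcal{C}_\ell \subseteq \mathcal{C}_j$, $F_j$ shares no variables with $\Gamma(\mathcal{C}_\ell)$ for $\ell > j$. Hence, we get $F_j \cap \bigcup_{\ell > j} T_{i_\ell} = \emptyset$ as claimed.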
Using this decomposition, we can reorder the sum and split it as where the input to each µ i j depends on α 1 and β j , . . ., β t+1 but not on β 1 , . . ., β j−1 .
We now reduce the expression from right to left.Since F 1 contains at least k − 2 variables and µ i 1 is a balanced pairwise independent distribution, ∑ irrespective of the values assigned by Continuing in this fashion from right to left, we get that Hence, we get that Now, since we know that G is (r, k − 3 + δ )-boundary expanding, we can replace S 1 by / 0 in the above calculation to get as we wanted.Plugging in the value of Z S 2 and Z S 1 into (3.3) will show that which proves the lemma.
We are now almost ready to describe our Sherali-Adams SDP solution. The only other property of $P_\mu(S)$ that we need to show is that it is pairwise independent and balanced.
Claim 3.4. Let $G$ be an $(r, k - 3 + \varepsilon)$-boundary expanding constraint graph, with $\varepsilon > 2/3$, in which no two constraints share more than one variable. Then for any $S \subset R$, $|S| \le r$, the distribution $P_\mu(S)$ is a pairwise independent and balanced distribution. That is, for all $i \ne j \in S$ and $a, b \in [q]$, $\Pr_{\alpha \sim P_\mu(S)}[\alpha(i) = a \wedge \alpha(j) = b] = 1/q^2$.
Proof. Let $S$ be a subset of the variables, $i, j \in S$ and $\varepsilon' = \min(1, \varepsilon - 2/3)$. We will first show that $G|_{-\{i,j\}}$ is $(r, k - 3 + \varepsilon')$-boundary expanding. It follows from applying Lemma 3.2 with $S_2 = S$ and $S_1 = \{i, j\}$ that $P_\mu(S)$ agrees with $P_\mu(\{i, j\})$ on the set $\{i, j\}$. Now note that for $k = 2$ the only promising predicate is the one accepting everything, for which the theorem is trivial, so one can assume that $k > 2$ and $C(\{i, j\})$ is empty. It follows that $P_\mu(\{i, j\})$ is the uniform distribution on $[q]^{\{i,j\}}$, which completes the proof.

Constructing the integrality gap
We now show how to construct integrality gaps using the ideas in the previous section.For a given promising predicate P, our integrality gap instance will be a random instance Φ of the MAX k-CSP q (P) problem, conditioned on no two constraints sharing more than one variable.To generate a random instance with m constraints, for every constraint C i , we randomly select a k-tuple T i = {x i 1 , . . ., x i k } of distinct variables and a i 1 , . . ., a i k ∈ [q], and put C i ≡ P(x i 1 + a i 1 , . . ., x i k + a i k ).It is well known and used in various works on integrality gaps and proof complexity (e. g., [8], [1,2], [27] and [26]), that random instances of CSPs are both highly unsatisfiable and highly expanding.We capture the properties we need in the lemma below.The proof uses standard arguments; we defer it to the next section.Lemma 4.1.Let ε, δ > 0 and a predicate P : [q] k → {0, 1} be given.Then there exist γ = O(q k log q/ε 2 ), η = Ω((1/γ) 10/δ ) and N ∈ N, such that if n ≥ N and Φ is a random instance of MAX k-CSP q (P) with m = γn constraints, then with probability exp(−O(k 4 γ 2 )) Let Φ be an instance of MAX k-CSP q on n variables for which G Φ is (ηn, k − 2 − δ )-boundary expanding for some δ < 1/4, as in Lemma 4.1.For such a Φ, we now define the distributions D(S).In the rest of this chapter we assume k ≥ 3 as for k = 2 the only promising predicate is the one satisfying all assignments for which the theorem is trivial.
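The random model above, including the conditioning on no two constraints sharing more than one variable, can be sketched with simple rejection sampling. The function names, the seed handling and the retry limit are illustrative assumptions; the parameters in the usage example are tiny and are not the regime $m = \gamma n$ of Lemma 4.1.

```python
import random
from itertools import combinations

def random_instance(n, m, k, q, rng):
    """m constraints, each a k-tuple of distinct variables with random shifts in [q]."""
    instance = []
    for _ in range(m):
        variables = tuple(rng.sample(range(n), k))
        shifts = tuple(rng.randrange(q) for _ in range(k))
        instance.append((variables, shifts))
    return instance

def share_at_most_one_variable(instance):
    """True if no two constraints have two or more variables in common."""
    return all(len(set(T1) & set(T2)) <= 1
               for (T1, _), (T2, _) in combinations(instance, 2))

def conditioned_random_instance(n, m, k, q, seed=0, max_tries=10000):
    """Rejection sampling: redraw until the sharing condition holds."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        instance = random_instance(n, m, k, q, rng)
        if share_at_most_one_variable(instance):
            return instance
    raise RuntimeError("no valid instance found; try larger n or smaller m")

example = conditioned_random_instance(n=60, m=5, k=3, q=3, seed=1)
print(example)
```

Rejection sampling draws exactly from the conditioned distribution, at the cost of a running time that grows as the conditioning event becomes less likely.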
For a set S of size at most t = ηn/6k, let S be subset of variables output by the algorithm Advice when run with input S and parameters r = ηn, e and, also, We then use (3.1) to define the distribution D(S) for sets S of size at most ηn/6k as Pr P µ (S) [β ] .
Using the properties of the distributions P µ (S), we can now prove that the distributions D(S) are consistent.
Claim 4.2.Let the distributions D(S) be defined as above.Then for any two sets S which would contradict boundary expansion of G. Now, by Theorem 3.1, we know that both G| −S 1 and G| −S 2 are (3ηn/4, k − 8/3 − δ )-boundary expanding.Thus, using Lemma 3.2 for the pairs S 1 ⊆ S 3 and S 2 ⊆ S 3 , we get that which shows that D(S 1 ) and D(S 2 ) are equal on S 1 .
It is now easy to prove the main result.
Theorem 4.3. Let $P : [q]^k \to \{0, 1\}$ be a promising predicate. Then for every constant $\zeta > 0$, there exists $c = c(q, k, \zeta)$ such that for large enough $n$, the integrality gap of MAX $k$-CSP$_q(P)$ for the program obtained by $cn$ levels of the Sherali-Adams SDP hierarchy is at least
$$\frac{q^k}{|P^{-1}(1)|} - \zeta.$$
Proof. We take $\varepsilon = \zeta/q^k$, $\delta = 1/4$ and consider a random instance $\Phi$ of MAX $k$-CSP$_q(P)$ with $m = \gamma n$ as given by Lemma 4.1. Thus, $\mathrm{OPT}(\Phi) \le \frac{|P^{-1}(1)|}{q^k}(1 + \varepsilon) \cdot m$.
On the other hand, by Claim 4.2 we can define distributions $D(S)$ over every set of at most $\eta n/6k$ variables such that for $S_1 \subseteq S_2$, $D(S_1)$ and $D(S_2)$ are consistent over $S_1$. Also, Claim 3.4 implies that $P_\mu(\bar{S})$, and hence $D(S)$, is pairwise independent and balanced for any $S$. We can now construct the SDP vectors with inner products agreeing with the probabilities according to the distributions $D(S)$, for an instance satisfying the above property, using the following simple fact.
Claim 4.4. There exist vectors $\{v_{(i,j)}\}_{i \in [n], j \in [q]}$ and $v_0$ satisfying $\langle v_0, v_{(i_1,j_1)} \rangle = 1/q$ and $\langle v_{(i_1,j_1)}, v_{(i_2,j_2)} \rangle = 1/q^2$ for all $i_1 \ne i_2 \in [n]$ and $j_1, j_2 \in [q]$.
Proof. Let $e_1, \ldots, e_n$ be an orthonormal basis for $\mathbb{R}^n$ and $u_0, \ldots, u_{q-1}$ be the vertices of a $(q-1)$-dimensional simplex satisfying $\langle u_{j_1}, u_{j_2} \rangle = -1/(q-1)$ when $j_1 \ne j_2$ and 1 otherwise. Let $v_0 \in \mathbb{R}^{n(q-1)+1}$ be a unit vector such that $\langle v_0, e_i \otimes u_j \rangle = 0$ for all $i \in [n]$ and $j \in [q]$. We then define $v_{(i,j)}$ as
$$v_{(i,j)} = \frac{1}{q} v_0 + \frac{\sqrt{q-1}}{q}\, e_i \otimes u_j.$$
It is easy to check that $\langle v_0, v_{(i_1,j_1)} \rangle = 1/q$ and $\langle v_{(i_1,j_1)}, v_{(i_2,j_2)} \rangle = 1/q^2$ for all $i_1 \ne i_2 \in [n]$ and $j_1, j_2 \in [q]$, which proves the claim.
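The inner products required in Claim 4.4 can be checked numerically. The sketch below is one construction consistent with the stated inner products, under two assumptions that are not taken from the text: the vectors are $v_{(i,j)} = \frac{1}{q} v_0 + \frac{\sqrt{q-1}}{q}\, e_i \otimes u_j$, and the simplex vertices $u_j$ are embedded in $\mathbb{R}^q$ (rather than $\mathbb{R}^{q-1}$) for convenience. All function names are hypothetical, and $q \ge 2$ is assumed.

```python
import math

def simplex_vectors(q):
    """Unit vectors u_0..u_{q-1} in R^q with pairwise inner product -1/(q-1)."""
    scale = math.sqrt(q / (q - 1))
    return [[scale * ((1.0 if a == j else 0.0) - 1.0 / q) for a in range(q)]
            for j in range(q)]

def build_vectors(n, q):
    """Return v0 and a dict of v_(i,j) in R^(1 + n*q); coordinate 0 carries v0."""
    u = simplex_vectors(q)
    v0 = [1.0] + [0.0] * (n * q)
    vectors = {}
    for i in range(n):
        for j in range(q):
            vec = [1.0 / q] + [0.0] * (n * q)
            for a in range(q):
                # Block i of the vector holds the scaled copy of u_j (e_i tensor u_j).
                vec[1 + i * q + a] = math.sqrt(q - 1) / q * u[j][a]
            vectors[(i, j)] = vec
    return v0, vectors

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

v0, v = build_vectors(n=3, q=3)
print(dot(v0, v[(0, 1)]), dot(v[(0, 1)], v[(2, 0)]), dot(v[(0, 1)], v[(0, 2)]))
```

With this choice of coefficients, two vectors for the same variable and different values are orthogonal, and $\|v_{(i,j)}\|^2 = 1/q$, which is what a balanced distribution on a single variable would require.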
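By Lemma 2.3 this gives a feasible solution to the SDP obtained by $\eta n/(6k)$ levels. Also, by definition of $D(S)$, we have that $\Pr_{D(S)}[\alpha] > 0$ only if $\alpha$ satisfies all constraints in $C(\bar{S}) \supseteq C(S)$. Hence every constraint is satisfied with probability 1 under its local distribution, and the value of FRAC($\Phi$) is $m$. Thus, the integrality gap after $\eta n/(6k)$ levels is at least
$$\frac{\mathrm{FRAC}(\Phi)}{\mathrm{OPT}(\Phi)} \ge \frac{q^k}{|P^{-1}(1)|} - \zeta.$$

Proof of Lemma 4.1
The proof is a simple modification of the proof of [27, Lemma 5]. We include it for convenience.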
To show the boundary expansion, we note that it suffices to show that the constraints have large expansion, i.e., each set of $s$ constraints (for $s \le \eta n$) contains at least $(k - 1 - \delta/2)s$ variables. Since each non-boundary variable occurs in at least two constraints, we get that the number of boundary variables must be at least $2(k - 1 - \delta/2)s - ks = (k - 2 - \delta)s$.
To show this, consider the probability that a set of s constraints contains at most cs variables, where c = k − 1 − δ .This is upper bounded by Here n cs is the number of ways of choosing the cs variables involved, ( cs k ) s is the number of ways of picking s tuples out of all possible k-tuples on cs variables and s! γn s is the number of ways of selecting the s constraints.The number n k s is simply the number of ways of picking s of these k-tuples in an unconstrained way.Using .
We need to show that the probability that a set of s constraints contains less than cs variables for any s ≤ ηn is o(1).Thus, we sum this probability over all s ≤ ηn to get The first term is o(1) and is small for large n.The second term is also o(1) for η = 1/(100γ 10/δ ).Thus, we get the first two properties with probability 1 − o(1).Finally, the probability that there are no two constraints sharing two variables must be at least ∏ i=1,...,m (1 − O(i • k 4 /n 2 )) because when we choose the ith constraint, by wanting it to not share two variables with another previously chosen constraint, we are forbidding any of the k 2 pairs of variables in the ith constraint from being equal to any of the (i − 1) • k 2 pairs in the previously chosen ones.Now using that for small enough x, 1 − x > exp(−2x), the probability is at least

Figure 1: SDP for MAX k-CSP q augmented with level-t Sherali-Adams constraints

Definition 2.2. Consider a bipartite graph $G = (V, E)$ with partition $L, R$. The boundary expansion of $X \subset L$ is the value $|\partial X|/|X|$, where $\partial X = \{u \in R : |\Gamma(u) \cap X| = 1\}$. $G$ is $(r, e)$-boundary expanding if the boundary expansion for all (nonempty) subsets of $L$ of size at most $r$ is at least $e$.
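For intuition, boundary expansion can be checked exhaustively on very small constraint graphs. The sketch below is an exponential-time illustration with hypothetical function names; it represents the graph as a list giving the variables of each constraint, which is an assumed encoding rather than one used in the paper.

```python
from itertools import combinations

def boundary(constraint_vars, X):
    """Variables adjacent to exactly one constraint in the set X (indices into L)."""
    counts = {}
    for c in X:
        for v in set(constraint_vars[c]):
            counts[v] = counts.get(v, 0) + 1
    return {v for v, c in counts.items() if c == 1}

def is_boundary_expanding(constraint_vars, r, e):
    """Check |boundary(X)| >= e * |X| for every nonempty X with |X| <= r."""
    m = len(constraint_vars)
    for size in range(1, min(r, m) + 1):
        for X in combinations(range(m), size):
            if len(boundary(constraint_vars, X)) < e * size:
                return False
    return True

# Three 3-ary constraints arranged in a cycle, each pair sharing one variable.
cycle = [(0, 1, 2), (2, 3, 4), (4, 5, 0)]
print(sorted(boundary(cycle, (0, 1))))
print(is_boundary_expanding(cycle, 3, 1), is_boundary_expanding(cycle, 3, 1.5))
```

Here single constraints have boundary expansion 3, pairs have expansion 2, and the full set has expansion 1, so the graph is $(2, 2)$- and $(3, 1)$-boundary expanding but not $(3, 1.5)$-boundary expanding.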
$e_2$)-boundary expanding. Assuming the contrary, there must be $M' \subset L$ such that $|M'| \le \xi - |M_j|$ and $|\partial M'| \le e_2 |M'|$. As we mentioned above, $M_j$ will be disjoint from the left vertices of $G|_{-(\bar{S} \cup \{x_j\} \cup \Gamma(M_j))}$, and in particular it will be disjoint from $M'$. Consider then $M_j \cup M'$ and note that $|M_j \cup M'| \le \xi$. More importantly (right before we added $\Gamma(M_j)$ to $\bar{S}$), $|\partial(M_j \cup M')| \le e_2 |M_j| + e_2 |M'| = e_2 |M_j \cup M'|$, contradicting the maximality of $M_j$; (a) follows.
$(r, k - 3 + \varepsilon')$-boundary expanding. Let $\mathcal{C}$ be a set of constraints with $|\mathcal{C}| \le r$. When $|\mathcal{C}| = 1$, $\mathcal{C}$ has $k$ boundary neighbors in $G$ and hence at least $k - 2 \ge (k - 3 + \varepsilon')|\mathcal{C}|$ boundary neighbors in $G|_{-\{i,j\}}$. When $|\mathcal{C}| = 2$, the number of boundary neighbors in $G$ must be at least $2k - 2$ since the two constraints in $\mathcal{C}$ can share at most one variable. It follows that $|\partial \mathcal{C}|$ is at least $2k - 2 - 2 \ge (k - 3 + \varepsilon')|\mathcal{C}|$ in $G|_{-\{i,j\}}$. Finally, when $3 \le |\mathcal{C}| \le r$, we get that $|\partial \mathcal{C}|$ in $G|_{-\{i,j\}}$ is at least $(k - 3 + \varepsilon)|\mathcal{C}| - 2$, as $G$ is $(r, k - 3 + \varepsilon)$-boundary expanding. This proves the claim as