We are given a bipartite graph *G=(F,C;E)*
where *F* is the set of *facilities* and *C* is the set of
*cities* (aka customers). Let *f(i)* be a cost of
opening the facility *i* and *c(i,j)* be the cost of connecting
facility *i* to city *j*. The connection costs satisfy the
triangle inequality. The problem is to find a subset *I ⊆
F* of facilities that should be opened, and a function φ:C
-> I assigning (connecting) cities to open facilities in such a way that
a total cost of opening the facilities and connecting the cities is
minimized.

Let *y(i)* be an indicator variable
equal to 1 iff facility *i* is open and 0 otherwise. Let *x(i,j)* be an
indicator variable equal to 1 iff facility *i* is connected to city *j* and 0 otherwise.
Consider the following integer program for the problem

minimize ∑_{i,j} c(i,j)x(i,j) + ∑_{i} f(i)y(i)

∑_{i ∈ F} x(i,j) ≥ 1 j∈ C

y(i)-x(i,j) ≥ 0 i ∈ F, j∈ C

x(i,j),y(i) ∈ {0,1} i ∈ F, j∈ C

The LP-relaxation of the program is

minimize ∑_{i,j} c(i,j)x(i,j) + ∑_{i} f(i)y(i)

∑_{i ∈ F} x(i,j) ≥ 1 j∈ C

y(i)-x(i,j) ≥ 0 i ∈ F, j∈ C

x(i,j),y(i) ≥ 0 i ∈ F, j∈ C

The dual program is

maxizmize ∑_{j ∈ C} α_{j}

α_{j} - β_{ij} ≤ c(i,j) i ∈ F, j∈ C

∑_{j ∈ C} β_{ij} ≤ f(i) i ∈ F

α_{j}, β_{ij} ≥ 0 j∈ C, i ∈ F

Algorithm runs in 2 phases. In the first phase we open some facilities temporarily; in the second phase we shut some of them down.

We want the dual solution to be as large as
possible. This motivates the following approach. We start from
α_{i}, β_{ij} = 0 for all *i,j*, and raise the variables α_{j} gradually.
When α_{j} = c(i,j) for some edge *(i,j)*, the algorithm will
declare this edge to be *tight*. From now on to keep the
solution feasible we will have to raise variable β_{ij}. We
will say that an edge *(i,j)* is *special* if β_{ij} >
0.

Facility *i* is paid for if ∑_{j} β_{ij}=f(i). If a facilities *i* is paid for, it is opened temporarily,
and unconnected cities
with tight edges are connected (assigned) to *i*. The facility *i*
is a *connecting witness* for those cities. In the future if
an edge connecting a temporarily open facility *i* and a city *j*
gets tight we connect facility *i* to *j*, and *i* becomes the
connecting witness (note that in this case β_{ij} = 0 and
the edge is not special). The first phase terminates when all
cities get connected.

Note that in the first phase, a city might have paid for opening a facility it is not connected to. In the next phase we ensure that a city only pays for opening facility it is eventually connected to. (Can you find an example that shows that the second phase is really needed - that is - the algorithm is not a 3-approximation algorithm w/o the second phase. How bad can the approximation ratio get without the second phase?)

Let F_{t} denote the set of facilities that are temporarily opened and T
denote the subgraph of *G* with special edges.
Let H be the subgraph of T^{2} induced on F_{t}. Find
maximal independent set in *H*, say *I*. The facilities in *I* are opened.

Now we need to connect cities to the opened facilities. For a city
*j*, define M_{j}={i ∈ F_{t}: (i,j) is special}. For
a city *j* if there is an opened facility in M_{j} we connect
*j* to it. We call *j* directly connected. Otherwise,
consider a city *j* and a facility *i* such that *i* was a
connecting witness for *j*. If i ∈ I we connect *i* to *j*
(direct connection) again (note that in this case the edge *(i,j)*
is not special). In the remaining case let *i'* be any neighbor of
*i* in graph *H* such that i' ∈ I. Connect *i'* to *j* and
declare *j* indirectly connected.

The algorithm constructs a feasible solution (x,y) to the primal, and a feasible solution (α, β) to the dual. It is enough to show that cost(x,y) ≤ 3cost(α, β) , this will imply that the approximation ration of the algorithm is 3.

If a city j is connected directly to facilty i, then set α^{f}_{j} = β_{ij} and α^{e}_{j} = c(i,j) . If a city j is connected indirectly then set
α^{f}_{j} = 0 and α^{e}_{j} = α_{j} . Clearly, for all cities α_{j} = α^{f}_{j} + α^{e}_{j} .

First we show that if i ∈ I, then ∑_{j:φ(j)=i} α^{f}_{j} =f_{i}. Indeed, since the facility i must be fully paid for, we have
∑_{j:(i,j) is special} β_{j} =f_{i}. Since only cities connected directly to j pay for it, and for these cities α^{f}_{j} = β_{j} , the claim follows. The claim immediatelly implies that ∑_{j ∈ C} α^{f}_{j} =∑_{i ∈ I} f_{i}.

If a city is connected directly then c(i,j) = α^{e}_{j} . We will now show that for an indirectly connected city j to i c(i,j) ≤ 3 α^{e}_{j} . Let j be an indirectly connected city, and let i' be the facility j is connected to (we use the notation from the algorithm). Let j' <> j be any city connected to i' (j' must exists, since j does not pay for the facility) and let i be the facility j was connected to in the first phase. It is enought to show that α^{e}_{j} ≥ c(i, j), c(i, j'), c(i', j') . First observe that α^{e}_{j} > c(i, j) and that α^{e}_{j'} > c(i, j'), c(i',j) since all these edges are special. It remains to show that α^{e}_{j} > α^{e}_{j'} . This inequality holds because city j' was connected to i' before j was connected to i. (Since otherwise j' would be connected to i, not i'.)

i' *---* j' / / / / i *---* j

Using the results from the two last paragraphs, we have

∑_{i ∈ F, j ∈ C} c(i,j)x(i,j) + ∑_{i ∈ F} f(i)y(i) ≤ ∑_{i ∈ F, j ∈ C} c(i,j)x(i,j) + 3∑_{i ∈ F} f(i)y(i) ≤ ∑_{i,j: j = φ(i)} c(i,j) + 3 ∑_{i ∈ I} f(i) ≤ 3 ∑_{i,j: j = φ(i)} α^{e}_{j} + 3 ∑_{j ∈ C} α^{f}_{j} ≤ 3 ∑_{j} α_{j}

References:

- Primal-Dual Approximation Algorithms for Metric Facility Location and k-Median Problems. http://citeseer.nj.nec.com/3752.html