Linear Assignment Problem (or Weighted Bipartite Matching Problem)


In this topic I would like to discuss everything about Linear Assignment Problems (LAPs). Another important class of assignment problems, namely Quadratic Assignment Problems, will be discussed later in another topic.

Let's begin with the classic LAP, defined as follows (Wikipedia):


The problem instance has a number of agents and a number of tasks. Any agent can be assigned to perform any task, incurring some cost that may vary depending on the agent-task assignment. It is required to perform all tasks by assigning exactly one agent to each task and exactly one task to each agent in such a way that the total cost of the assignment is minimized.

This problem is also called the Linear Sum Assignment Problem (LSAP), since its objective function is a sum; this distinguishes it from other types of assignment problems such as the Linear Bottleneck Assignment Problem (LBAP), where one minimizes the maximum cost instead. These problems will also be discussed later in this topic. When we say LAP without further specification, we refer to the LSAP.

The formal mathematical definition of the LAP is the following.

Problem (LAP). Given a cost matrix $C = (c_{ij})_{1\le i,j \le n} \in \mathbb{R}^{n\times n}$, where $n$ is a given positive integer, solve
\begin{align} \mbox{minimize}\quad & \sum_{i=1}^n \sum_{j=1}^n c_{ij}x_{ij} \\ \mbox{subject to}\quad & \sum_{i=1}^n x_{ij} = 1 \quad \forall j, \\ & \sum_{j=1}^n x_{ij} = 1 \quad \forall i,\\ & x_{ij} \in \set{0,1} \quad\forall i,j. \end{align}
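To make this concrete, here is a small toy instance (my own example, reused throughout this topic):
$$C=\begin{pmatrix} 4 & 1 & 3\\ 2 & 0 & 5\\ 3 & 2 & 2 \end{pmatrix}.$$
The optimal assignment is $x_{12}=x_{21}=x_{33}=1$, i.e. agent $1$ gets task $2$, agent $2$ gets task $1$ and agent $3$ gets task $3$, with total cost $1+2+2=5$; each of the other five permutations costs at least $6$.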

It is straightforward to see that this problem is equivalent to finding a minimum cost perfect matching in a complete bipartite graph, thus it is also known as the bipartite minimum cost perfect matching problem.

It is well-known that the LP relaxation of the above integer program (i.e. the integral constraints $x_{ij} \in \set{0,1}$ are relaxed to $x_{ij} \ge 0$) attains its optimum at an extreme point of the feasible polytope. This polytope is the set of doubly stochastic matrices (the Birkhoff polytope), whose extreme points are exactly the permutation matrices by the Birkhoff–von Neumann theorem. Thus the LP relaxed problem and the original integer program are equivalent.
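For readers who want to experiment, SciPy ships a ready-made LSAP solver, scipy.optimize.linear_sum_assignment (it is not the method described in the next post, but it solves the same problem). A quick sanity check of the toy instance above (note that the code is 0-indexed):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# The toy cost matrix from above (0-indexed here).
C = np.array([[4, 1, 3],
              [2, 0, 5],
              [3, 2, 2]])

row_ind, col_ind = linear_sum_assignment(C)  # minimizes by default
print([(int(i), int(j)) for i, j in zip(row_ind, col_ind)])
# [(0, 1), (1, 0), (2, 2)]
print(C[row_ind, col_ind].sum())             # 5
```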


The Hungarian Method

This famous method, published by Kuhn in 1955, was the first to solve the LAP in polynomial time. It is recognized as the predecessor of the primal-dual method for linear programming. Primal-dual methods are also a very important topic, which I will discuss later in a separate forum topic (as they deserve it).

It is easy to derive the following dual of the LP relaxation of the LAP: \begin{align} \mbox{maximize}\quad & \sum_{i=1}^n u_i+ \sum_{j=1}^n v_j \\ \mbox{subject to}\quad & u_i+v_j \le c_{ij} \quad\forall i,j. \end{align}
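A dual feasible solution is easy to obtain by the classical row/column reduction: $u_i = \min_j c_{ij}$ and $v_j = \min_i (c_{ij}-u_i)$; feasibility follows directly from the definition of $v_j$. A minimal sketch on the toy instance (the function name initial_duals is mine, for illustration only):

```python
import numpy as np

def initial_duals(C):
    """Classical row/column reduction: returns (u, v) with
    u_i + v_j <= c_ij for all i, j."""
    C = np.asarray(C, dtype=float)
    u = C.min(axis=1)                  # u_i = min_j c_ij
    v = (C - u[:, None]).min(axis=0)   # v_j = min_i (c_ij - u_i)
    return u, v

C = np.array([[4, 1, 3],
              [2, 0, 5],
              [3, 2, 2]])
u, v = initial_duals(C)
print(u, v)  # [1. 0. 2.] [1. 0. 0.]
# Edges with u_i + v_j = c_ij (the set J defined below):
print(sorted((i, j) for i in range(3) for j in range(3)
             if np.isclose(u[i] + v[j], C[i, j])))
# [(0, 1), (1, 1), (2, 0), (2, 1), (2, 2)]
```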

If a primal feasible solution $x^*$ and a dual feasible solution $(u^*,v^*)$ satisfy the complementary slackness conditions:
$$x_{ij}^*(c_{ij} -u_i^*-v_j^*) = 0 \quad \forall i,j,$$
then $x^*$ is an optimal solution to the primal and $(u^*,v^*)$ is an optimal solution to the dual.

For a given dual feasible solution $(u,v)$, let $J=\set{ij\in E: u_i+v_j=c_{ij}}$ be the set of tight edges, where $E$ denotes the edge set of the underlying complete bipartite graph $G=(A\cup B,E)$, with $A$ the set of agents and $B$ the set of tasks. The restricted primal (RP) is:
\begin{align*} \mbox{minimize}\quad & \sum_{i} s_i + \sum_{j}s_j \\
\mbox{subject to}\quad & \sum_{i=1}^n x_{ij} +s_j= 1 \quad \forall j, \\
& \sum_{j=1}^n x_{ij} +s_i = 1 \quad \forall i,\\
& x_{ij}=0\quad\forall ij\not\in J, \\
& x_{ij}\ge 0\quad\forall ij\in J, \\
&s\ge 0. \end{align*}

Since $$\sum_{i} s_i +  \sum_{j}s_j =  \sum_{i} \left(1-\sum_{j}x_{ij}\right) + \sum_{j} \left(1-\sum_{i}x_{ij}\right) = 2n -2\sum_{i}\sum_{j}x_{ij},$$ it is straightforward to see that the above RP is equivalent to the following problem:
\begin{align*} \mbox{maximize}\quad & \sum_{i}\sum_{j}x_{ij} \\
\mbox{subject to}\quad & \sum_{i=1}^n x_{ij} \le 1 \quad \forall j, \\
& \sum_{j=1}^n x_{ij} \le 1 \quad \forall i,\\
& x_{ij}=0\quad\forall ij\not\in J, \\
& x_{ij}\ge 0\quad\forall ij\in J. \end{align*}
Clearly, this is (the LP relaxation of) the Maximum Bipartite Matching Problem for the graph $G'=(A\cup B,J)$; by the same integrality argument as before, one can solve it efficiently using one of the methods presented in the topic on the Maximum Matching Problem, e.g. the augmenting-path search sketched below.
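For self-containedness, here is a minimal augmenting-path implementation (Kuhn's algorithm); the representation (adjacency lists of $J$, indexed from $0$) is my own choice:

```python
def max_bipartite_matching(n, adj):
    """Maximum matching of G' = (A u B, J) by repeated augmenting-path
    search. adj[i] lists the B-neighbors of A-node i along tight edges.
    Returns match_of_b, where match_of_b[j] is the A-mate of j or None."""
    match_of_b = [None] * n

    def try_augment(i, visited):
        # DFS for an augmenting path starting at A-node i.
        for j in adj[i]:
            if j in visited:
                continue
            visited.add(j)
            # j is exposed, or its current mate can be re-matched elsewhere:
            if match_of_b[j] is None or try_augment(match_of_b[j], visited):
                match_of_b[j] = i
                return True
        return False

    for i in range(n):
        try_augment(i, set())
    return match_of_b

# Tight edges J of the toy instance, from the initial duals above:
adj = {0: [1], 1: [1], 2: [0, 1, 2]}
print(max_bipartite_matching(3, adj))  # [2, 0, None]: cardinality 2 < 3
```

On the toy instance the maximum matching in $J$ has cardinality $2 < 3$, so the current duals are not optimal and we proceed as follows.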

If the maximum cardinality of this matching is $n=|A|=|B|$, then the obtained matching is also an optimal solution to the original assignment problem: it is primal feasible and, being supported on tight edges, it satisfies the complementary slackness conditions together with the current $(u,v)$. Otherwise, we can use this matching to find an optimal solution to the following dual of the RP, called the DRP:
\begin{align*} \mbox{maximize}\quad & \sum_{i}u_i' + \sum_{j}v_j' \\
\mbox{subject to}\quad & u_i' \le 1 \quad \forall i, \\
&v_j' \le 1 \quad \forall j,\\
& u_i'+v_j'\le 0 \quad\forall ij\in J. \end{align*}

Deriving an optimal solution to the DRP from an optimal solution to the RP

This is an important step. Let $x$ be an optimal solution to the RP, i.e. a maximum matching. The corresponding optimal $s$ can be obtained trivially: $s_i=0$ if $i$ is matched and $s_i=1$ otherwise, for any $i\in V=A\cup B$. Now, a solution $(u',v')$ to the DRP is optimal if and only if it is dual feasible and satisfies the complementary slackness conditions, i.e. it has to satisfy all of the following conditions:
 \begin{align}
u_i' &\le 1 \quad \forall i\in A, \\
v_j' &\le 1 \quad \forall j\in B,\\
u_i'+v_j' &\le 0 \quad\forall ij\in J,\\
s_i(1-u_i') &= 0\quad\forall i\in A, \\
s_j(1-v_j') &= 0\quad\forall j\in B, \\
x_{ij}(u_i'+v_j') &= 0 \quad\forall ij\in J.
\end{align}
(The first three conditions are dual feasibility and the last three conditions are complementary slackness.) The last condition reads: if $i$ and $j$ are matched, then $u_i'+v_j'=0$.

If a node is exposed (i.e. its slack equals $1$), then the fourth and fifth conditions force its label to be $1$: $u_i'=1$ for an exposed $i\in A$, and $v_j'=1$ for an exposed $j\in B$.
Let $i_0$ be an exposed node, say $i_0\in A$ (the case of an exposed node in $B$ is symmetric), and consider any of its neighbors $j_0$ in $J$. Clearly, $j_0$ must be matched (otherwise we could add the edge $(i_0,j_0)$ to the current matching to obtain a larger matching, which is impossible since the current matching is already maximum). Since $u_{i_0}'+v_{j_0}' \le 0$ and $u_{i_0}'=1$, we have $v_{j_0}' \le -1$. Let $i_1$ be the mate of $j_0$; since $u_{i_1}'+v_{j_0}' = 0$, $v_{j_0}' \le -1$ and $u_{i_1}' \le 1$, we must have $v_{j_0}' = -1$ and $u_{i_1}' = 1$. Thus, we have labeled the path $i_0,j_0,i_1$ with the values $1,-1,1$, where $j_0$ is any possible neighbor (in $J$) of $i_0$ and $i_1$ is the mate of $j_0$. Continuing the same labeling strategy with $i_1$, and so on, we label all alternating paths starting from $i_0$. Note that these alternating paths are not augmenting, because the given matching is maximum (cf. the topic on the Maximum Matching Problem).

Moreover, the labels are consistent. Two alternating paths generated from two exposed nodes in different partitions cannot have any node in common: otherwise, concatenating them would yield an augmenting path starting at one exposed node and terminating at the other, contradicting the maximality of the matching. Two alternating paths generated from two exposed nodes in the same partition may share nodes, but they assign the same labels to those nodes, so no conflict arises.

For any exposed node $i_0\in V$, we thus label the nodes on all maximal alternating paths starting from $i_0$ (including the path consisting of $i_0$ alone) by alternating between $1$ and $-1$. Note: an alternating path is maximal if we cannot add any edge to it to create a longer alternating path.

Does the above labeling scheme already cover all the nodes? Not always. If some nodes remain unlabeled, then obviously they are not connected (in $J$) to any of the already labeled nodes, and all of them are matched. For this subset of nodes, we assign the label $1$ to every node on one side (e.g. $A$) and $-1$ to every node on the other side ($B$). It is easy to check that this labeling satisfies dual feasibility and the complementary slackness conditions.
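The whole labeling scheme can be summarized in a short sketch (reusing the adj / match_of_b representation from the matching code above; drp_labels is my own illustrative name):

```python
def drp_labels(n, adj, match_of_b):
    """Labels (u', v') for the DRP, derived from a *maximum* matching as
    described above: alternate +1/-1 along maximal alternating paths from
    every exposed node; remaining nodes get +1 on side A and -1 on side B."""
    match_of_a = [None] * n
    for j, i in enumerate(match_of_b):
        if i is not None:
            match_of_a[i] = j
    # A-neighbors of each B-node along tight edges
    radj = {j: [i for i in range(n) if j in adj[i]] for j in range(n)}
    u_prime, v_prime = [None] * n, [None] * n

    def from_a(i):                       # label A-node i with +1, spread out
        u_prime[i] = 1
        for j in adj[i]:
            if v_prime[j] is None:
                v_prime[j] = -1
                mate = match_of_b[j]     # j is matched, see the argument above
                if mate is not None and u_prime[mate] is None:
                    from_a(mate)

    def from_b(j):                       # symmetric, starting on side B
        v_prime[j] = 1
        for i in radj[j]:
            if u_prime[i] is None:
                u_prime[i] = -1
                mate = match_of_a[i]
                if mate is not None and v_prime[mate] is None:
                    from_b(mate)

    for i in range(n):
        if match_of_a[i] is None:        # exposed A-nodes
            from_a(i)
    for j in range(n):
        if match_of_b[j] is None:        # exposed B-nodes
            from_b(j)
    # matched nodes unreachable from any exposed node
    u_prime = [1 if x is None else x for x in u_prime]
    v_prime = [-1 if x is None else x for x in v_prime]
    return u_prime, v_prime

adj = {0: [1], 1: [1], 2: [0, 1, 2]}
print(drp_labels(3, adj, [2, 0, None]))  # ([1, 1, -1], [1, -1, 1])
```

On the toy instance this gives $u'=(1,1,-1)$ and $v'=(1,-1,1)$, which is DRP-feasible, satisfies the complementary slackness conditions, and has objective value $2>0$; a positive DRP value is exactly what allows the current duals $(u,v)$ to be improved in the next iteration of the primal-dual scheme.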




Bipartite Minimum Cost (Non-perfect) Matching

The bipartite perfect matching problem is nice, but in practice (i.e. applications) we very often encounter non-perfect matching situations. I'll show how to reduce such problems to the perfect matching one.

Problem. Given a cost matrix $C = (c_{ij})_{1\le i \le m,1\le j \le n} \in \mathbb{R}^{m\times n}$, where $m$ and $n$ are given positive integers, solve
\begin{align} \mbox{minimize}\quad & \sum_{i=1}^m \sum_{j=1}^n c_{ij}x_{ij} \\ \mbox{subject to}\quad & \sum_{i=1}^m x_{ij} \le 1 \quad \forall j=\overline{1,n}, \\ & \sum_{j=1}^n x_{ij} \le 1 \quad \forall i=\overline{1,m},\\ & x_{ij} \in \set{0,1} \quad\forall i,j. \end{align}

First, if the two parts of the input bipartite graph are not of equal size, then we add dummy nodes to the smaller part to make them equal. 

Suppose that $m < n$. Define new cost values $c_{ij} = 0$ for all $i=\overline{m+1,n}$ and $j=\overline{1,n}$, i.e. append $n-m$ zero rows to $C$. It is straightforward to show that the above problem is equivalent to the following problem: \begin{align} \mbox{minimize}\quad & \sum_{i=1}^n \sum_{j=1}^n c_{ij}x_{ij} \\ \mbox{subject to}\quad & \sum_{i=1}^n x_{ij} \le 1 \quad \forall j=\overline{1,n}, \\ & \sum_{j=1}^n x_{ij} \le 1 \quad \forall i=\overline{1,n},\\ & x_{ij} \in \set{0,1} \quad\forall i,j. \end{align}

Next, we convert the inequality constraints to equality ones. Note that a single dummy node per side would not suffice, since any number of nodes on each side may be left unmatched; instead, we give every node its own dummy partner. Extend the cost matrix to $\bar C = (\bar c_{ij})_{1\le i,j\le 2n}$ defined by
\begin{align*} \bar c_{ij} = \begin{cases} c_{ij} & \text{if } 1\le i,j \le n, \\ 0 & \text{otherwise,} \end{cases} \end{align*}
and consider the LAP with cost matrix $\bar C$:
\begin{align} \mbox{minimize}\quad & \sum_{i=1}^{2n} \sum_{j=1}^{2n} \bar c_{ij}x_{ij} \\ \mbox{subject to}\quad & \sum_{i=1}^{2n} x_{ij} = 1 \quad \forall j=\overline{1,2n}, \\ & \sum_{j=1}^{2n} x_{ij} = 1 \quad \forall i=\overline{1,2n},\\ & x_{ij} \in \set{0,1} \quad\forall i,j. \end{align}
The two problems are equivalent. Indeed, a matching with $k$ matched pairs in the previous problem extends to a perfect matching of $\bar C$ with the same cost: match each unmatched row $i$ to its dummy column $n+i$, each unmatched column $j$ to its dummy row $n+j$, and pair the remaining $k$ dummy rows with the remaining $k$ dummy columns arbitrarily, all at zero cost. Conversely, restricting a perfect matching of $\bar C$ to the entries $1\le i,j\le n$ yields a matching of the previous problem with the same cost. Hence we obtain an instance of the bipartite perfect matching problem.
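To make the whole pipeline concrete, here is a sketch that performs both reductions (padding to a square matrix, then doubling) and reuses SciPy's LSAP solver from the first post; min_cost_matching is my own name for this hypothetical helper, not a library function:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def min_cost_matching(C):
    """Reduce min-cost (non-perfect) bipartite matching to a perfect
    matching LAP: pad the m x n cost matrix to n' x n' (n' = max(m, n)),
    embed it as the top-left block of a 2n' x 2n' matrix whose remaining
    entries are all 0, solve the LAP, and keep only the pairs between
    real rows and real columns."""
    C = np.asarray(C, dtype=float)
    m, n = C.shape
    size = 2 * max(m, n)
    big = np.zeros((size, size))
    big[:m, :n] = C                     # real-real block; the rest is 0
    rows, cols = linear_sum_assignment(big)
    return [(int(i), int(j)) for i, j in zip(rows, cols) if i < m and j < n]

# All costs positive: the empty matching (cost 0) is optimal.
print(min_cost_matching([[1.0, 2.0], [3.0, 4.0]]))    # []
# A negative entry makes that pair worth taking.
print(min_cost_matching([[1.0, -2.0], [3.0, 4.0]]))   # [(0, 1)]
```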

Therefore, any bipartite minimum cost (non-perfect) matching problem can be converted to a perfect matching problem.

Question: Is the converse true? I.e. can any bipartite minimum cost perfect matching problem be converted to a non-perfect one?
