A tutorial on using Column Generation to solve MIP problems (Part 2)
(This article is a sequel to Part 1 of the tutorial.)
In this part, we will look at the Column Generation (CG) algorithm and then discuss some aspects of the subproblem. As there will be a lot to cover in the subproblem, we will only look at some intuition of solving the subproblem. The next article on this series will be dedicated to the subproblem’s algorithm and more details. As usual, we will keep the mathematical jargon limited to only where absolutely required.
Figure 2.1 shows the primal and the dual of the Master Problem (MP) introduced in Part 1. To recap, each constraint in the primal problem will have a corresponding dual variable. For our MP, each points to be visited will have a corresponding dual variable (u_j), and the constraint for the upper bound of vehicles will have 1 dual variable (v). With this much information (and forgetting more details of duality property for now), we can now utilize the dual values in the subproblem to find new columns to enter the MP.
The subproblem for our VRP problem (and also for most of the other problems like scheduling, routing, etc) will be some variants of the shortest path problem. When I say shortest path problem, you may probably think of nodes and arcs and application of Dijkstra’s algorithm or Bellman-Ford’s algorithm. We shall come back to that later. But for now, the arc cost (or cost of traveling from point A to B) needs to be altered by the dual information from the Restricted Master Problem (RMP). More specifically, the cost of all the arcs is reduced as follows: c_ab -u*_a. Where c_ab is the original cost of traversing the arc from a to b, and u*_a is the dual value of point a. The * denotes that the value of u_a is not just any value but the optimal dual value from solving the RMP. (The assumption is that the RMP will have an optimal solution and will not be infeasible or unbounded). Once the arc cost is updated, the subproblem is solved to find the shortest path. If the shortest path found has a cost less than v, it is added back to RMP and resolved but if the shortest path is greater than v, then we have found the optimal solution to MP.
Let us now formally write the CG algorithm.
1. Find the initial set of solutions.2. Solve the RMP with the initial set of solutions.3. Get the dual variables.4. Calculate the arc costs of the graph based on dual variables.5. Find the shortest path in the graph with the new arc costs.6. If the shortest path has a cost less than v, add it to RMP as a new column and repeat steps 2 to 6 else terminate.7. At termination you have the optimal solution to the linearized MP.
If it still doesn’t make sense, don't worry we will take a look at an example and walk through each step in the up coming article of this series.
Subproblem
As mentioned earlier, solving subproblem is an art in CG algorithm. Unlike master problems, subproblems a highly dependent on the specificity of the problem making it harder to practice “build once deploy many”. Before we delve into sub-problems let's take a look at why Dijkstra’s and Bellman-Ford's algorithm fail even for a simple VRP subproblem.
If you recall, one of the most important requirements for Dijkstra’s algorithm is that the arc cost should be non-negative. You may be compelled to think that the arcs in your graph are distances between two points and will always be positive. However, based on the CG algorithm, the arc costs are modified using the dual variables from the MP solution. Therefore, non-negative arcs are true only in the first iteration. From the second iteration, your arcs may and will become negative.
Bellman-Ford overcomes the drawback of Dijkstra in that it can solve for the shortest path even when there are negative cost arcs. However, Bellman-Ford also fails when the graph contains a negative cost cycle. For VRP where one can traverse from any node to another, having negative arc costs can lead to a negative cost cycle, therefore failing the algorithm to find the shortest path. One type of problem I would like to mention where the Bellman-Ford algorithm will work for solving the sub-problem is the aircraft routing problem. Since the underlying network for aircraft routing is a time-expanded graph network, having a cycle is not possible. In simple terms, when you have the flight schedules, you cannot assign an aircraft to fly backwards, therefore, eliminating any possibility of creating cycles. In mathematical terms, the VRP network can be undirected and cyclic and the flight schedule graph can be directed and acyclic.
Note that until now we only said that all customers must be visited by one of the vehicles and the number of vehicles should not exceed an upper bound. We didn’t exclusively mention any other requirements such as the capacity of the vehicle, time window requirement of the customers, whether it is pick up or delivery or a mix, and so on. The reason is that all these requirements are handled in the subproblem.
Going back to Dijkstra’s algorithm or Bellman Ford’s algorithm, the shortest path is shortest with respect to only one resource. It could be the shortest distance or the shortest time to reach the sink from source. As our problem demands, we need to consider more resources in our shortest path problem. For example, we may want to minimize the travel distance while not violating the time window at which a customer can be served (case of VRPTW). This may require visiting the furthest customer first whose time window is about to end and come back to other customers even though traveling in that way increases the travel distance significantly.
I hope this gives readers the intuition as to why the subproblem is important in CG and how off-the-shelf algorithms are not able to handle the subproblem efficiently.
In the next part of the series (Part 3), we will be delving into a problem type called Shortest Path Problem with Resource Constraint (SPPRC) to solve our subproblem. We will be looking in detail at how to exactly solve the SPPRC problems and the complexity of such problems. Also, we will discuss what speed-up techniques and heuristics we can employ to solve the SPPRC problem. Once we finish the SPPRC problems we can then wrap up the CG algorithm and then start our numerical example and begin to write code for the problem.
As I started to write this series, I had not guessed that this would gain so much attention in a short time. I will try my best to regularly update this series in a timely fashion. My goal is to publish at least one per week as time allows. Please leave your feedback or comment on what can be done to improve any part of these articles.