算法 | 动态规划 - Dynamic Programming

Dynamic Programming

Dynamic programming: Break up a problem into a series of overlapping sub-problems, and build up solutions to larger and larger sub-problems.

Weighted Interval Scheduling

Weighted interval scheduling problem.

  • Job j starts at $s_j$, finishes at $f_j$, and has weight or value $v_j$.
  • Two jobs compatible if they don't overlap.
  • Goal: find maximum weight subset of mutually compatible jobs.

Greedy algorithm works when v = 1

Notation. Label jobs by finishing time: $f_1 < f_2 < \dots < f_n$ .

Def. p(j) = largest index i < j such that job i is compatible with j.

能够在 j 之前做的,DDL 最晚的任务

Notation. OPT(j) = value of optimal solution to the problem consisting of job requests 1, 2, ..., j

Case 1: OPT selects job j.

  • collect profit $v_j$
  • can't use incompatible jobs ${ p(j) + 1, p(j) + 2, ..., j - 1 }$
  • must include optimal solution to problem consisting of remaining compatible jobs $1, 2, ..., p(j)$

Case 2: OPT does not select job j.

  • must include optimal solution to problem consisting of remaining compatible jobs $1, 2, ..., j-1$

由此可知,其状态转移方程为:

$$\begin{equation}OPT( j)=\left{\begin{array}{lr}\text{0,}&\quad\text{if j = 0,}\max{v_j + OPT(p( j)), OPT( j -1)} &\quad\text{otherwise.}\end{array}\right.\end{equation}$$

Bottom-up DP

Running Time: $O(n)$

Segmented Least Squares

Intro.

Notation

  • OPT(j) = minimum cost for points $p_i, p_{i+1} , \dots , p_j$
  • e(i, j) = minimum sum of squares for points $p_i, p_{i+1} , \dots , p_j$

To compute OPT(j):

  • Last segment uses points $p_i, p_{i+1} , \dots , p_j$ for some i.
  • Cost = e(i, j) + c + OPT(i-1)

$$\begin{equation}OPT( j)=\left{\begin{array}{lr}\text{0,}&\quad\text{if j = 0,}\min_{1 \leq i \leq j}{e(i, j) + c + OPT(i-1)} &\quad\text{otherwise.}\end{array}\right.\end{equation}$$

Running time. $O(n^3)$.

can be improved to $O(n^2)$ by pre-computing various statistics

Bottleneck = computing e(i, j) for O(n2) pairs, O(n) per pair using previous formula

Knapsack Problem

Knapsack problem.

  • Given n objects and a "knapsack."
  • Item i weighs wi > 0 kilograms and has value vi > 0.
  • Knapsack has capacity of W kilograms.
  • Goal: fill knapsack so as to maximize total value.

Def. OPT(i, w) = max profit subset of items 1, …, i with weight limit w.

  • Case 1: OPT does not select item i.– OPT selects best of { 1, 2, …, i-1 } using weight limit w
  • Case 2: OPT selects item i.– new weight limit = w – $w_i$– OPT selects best of { 1, 2, …, i–1 } using this new weight limit

必须考虑重量的限制

$$\begin{equation}OPT(i, w)=\left{\begin{array}{lr}\text{0,}&\quad\text{if j = 0,}\text{OPT(i-1, w)}&\quad if w_i > w,\max{OPT(i-1, w), v_i + OPT(i-1, w- w_i)} &\quad\text{otherwise.}\end{array}\right.\end{equation}$$

Running time. $O(n W)$.

  • Not polynomial in input size!
  • "Pseudo-polynomial."
  • Decision version of Knapsack is NP-complete. [Chapter 8]

RNA Secondary Structure

Sequence Alignment

Sequence Alignment in Linear Space

分治

Shortest Paths

Dijkstra 要求每条边的权重均为正整数,若出现负数,则会陷入负数环中

Def. OPT(i, v) = length of shortest v-t path P using at most i edges

  • Case 1: P uses at most i-1 edges.
  • OPT(i, v) = OPT(i-1, v)
  • Case 2: P uses exactly i edges.– if (v, w) is first edge, then OPT uses (v, w), and then selects bestw-t path using at most i-1 edges

$$\begin{equation}OPT(i, v)=\left{\begin{array}{lr}\text{0,}&\quad\text{if i = 0,}\min{OPT(i-1, v), min_{v, w ∈ E}{OPT(i-1, w) + c_{vm}}} &\quad\text{otherwise.}\end{array}\right.\end{equation}$$

该算法在不断地更新起点

Remark. By previous observation, if no negative cycles, then OPT(n-1, v) = length of shortest v-t path.

Implementation:

Analysis. $\Theta(mn)$ time, $\Theta(n^2)$ space.

Finding the shortest paths. Maintain a "successor" for each table entry.

Practical Improvements

  • Maintain only one array M[v] = shortest v-t path that we have found so far (i : 1, 2, … n-1).
  • No need to check edges of the form (v, w) unless M[w] changed in previous iteration.

Theorem. Throughout the algorithm, M[v] is length of some v-t path,and after i rounds of updates, the value M[v] is no larger than the lengthof shortest v-t path using $\leq$ i edges.Overall impact.

  • Memory: $O(m + n)$.
  • Running time: $O(mn)$ worst case, but substantially faster in practice.