|
51 | 51 | "BicliqueCover": [Biclique Cover], |
52 | 52 | "BinPacking": [Bin Packing], |
53 | 53 | "ClosestVectorProblem": [Closest Vector Problem], |
| 54 | + "LongestCommonSubsequence": [Longest Common Subsequence], |
54 | 55 | ) |
55 | 56 |
|
56 | 57 | // Definition label: "def:<ProblemName>" — each definition block must have a matching label |
@@ -886,6 +887,46 @@ Biclique Cover is equivalent to factoring the biadjacency matrix $M$ of the bipa |
886 | 887 | ) <fig:binpacking-example> |
887 | 888 | ] |
888 | 889 |
|
| 890 | +#problem-def("LongestCommonSubsequence")[ |
| 891 | + Given a finite alphabet $Sigma$ and $k$ strings $s_1, dots, s_k$ over $Sigma$, find a longest string $w$ that is a subsequence of every $s_i$. A string $w$ is a _subsequence_ of $s$ if $w$ can be obtained by deleting zero or more characters from $s$ without changing the order of the remaining characters. |
| 892 | +][ |
| 893 | + The Longest Common Subsequence (LCS) problem is one of the fundamental string problems in computer science, listed as SR10 in @garey1979. For $k = 2$ strings of lengths $m$ and $n$, it is solvable in $O(m n)$ time via dynamic programming @wagner1974. However, Maier @maier1978 proved that the problem is NP-hard when $k$ is part of the input, even over a binary alphabet. LCS is central to diff and version control (e.g., `git diff`), bioinformatics (DNA/protein alignment), and data compression. The best known exact algorithm for the general $k$-string case runs in $O^*(2^n)$ by brute-force enumeration over subsequences of the shortest string#footnote[No algorithm improving on brute-force enumeration is known for the general $k$-string LCS.], where $n$ here denotes the total length of all strings. This bound is loose: the enumeration itself ranges over at most $2^(min_i |s_i|)$ candidate subsequences, each checked against all $k$ strings in linear time. |
| 894 | + |
| 895 | + *Example.* Consider $k = 3$ strings over $Sigma = {A, B, C, D}$: $s_1 = mono("ABCDAB")$, $s_2 = mono("BDCABA")$, $s_3 = mono("BCADBA")$. An optimal common subsequence is $w = mono("BCAB")$ with length 4. We verify that $w$ is a subsequence of each string by identifying matching positions: |
| 896 | + |
| 897 | + #figure({ |
| 898 | + canvas(length: 1cm, { |
| 899 | + let strings = ( |
| 900 | + ("A", "B", "C", "D", "A", "B"), |
| 901 | + ("B", "D", "C", "A", "B", "A"), |
| 902 | + ("B", "C", "A", "D", "B", "A"), |
| 903 | + ) |
| 904 | + let labels = ($s_1$, $s_2$, $s_3$) |
| 905 | + // Positions matched in each string for BCAB |
| 906 | + let matched = ((1, 2, 4, 5), (0, 2, 3, 4), (0, 1, 2, 4)) |
| 907 | + let dx = 0.7 |
| 908 | + let dy = -1.2 |
| 909 | + for si in range(3) { |
| 910 | + let y = si * dy |
| 911 | + draw.content((-0.6, y), labels.at(si)) |
| 912 | + for ci in range(strings.at(si).len()) { |
| 913 | + let x = ci * dx |
| 914 | + let is-matched = ci in matched.at(si) |
| 915 | + let fill-color = if is-matched { graph-colors.at(0) } else { luma(230) } |
| 916 | + let text-color = if is-matched { white } else { black } |
| 917 | + draw.rect((x - 0.25, y - 0.25), (x + 0.25, y + 0.25), |
| 918 | + fill: fill-color, stroke: 0.4pt + luma(120), radius: 2pt) |
| 919 | + draw.content((x, y), text(9pt, fill: text-color, font: "DejaVu Sans Mono")[#strings.at(si).at(ci)]) |
| 920 | + } |
| 921 | + } |
| 922 | + draw.content((strings.at(0).len() * dx / 2 - 0.15, 3 * dy + 0.1), |
| 923 | + text(8pt)[$w = mono("BCAB"), |w| = 4$]) |
| 924 | + }) |
| 925 | + }, |
| 926 | + caption: [Longest Common Subsequence of three strings. Blue cells mark the positions forming the common subsequence $w = mono("BCAB")$ of length 4.], |
| 927 | + ) <fig:lcs-example> |
| 928 | +] |
| 929 | + |
889 | 930 | // Completeness check: warn about problem types in JSON but missing from paper |
890 | 931 | #{ |
891 | 932 | let json-models = { |
|
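The brute-force enumeration described in the new definition block, and the three-string example, can be checked with a short Python sketch. This is an illustration only and is not part of the diff: the function names `is_subsequence` and `lcs_bruteforce` are hypothetical, and the code assumes plain Python strings as input.

```python
from itertools import combinations
from typing import Sequence


def is_subsequence(w: str, s: str) -> bool:
    """Return True if w can be obtained from s by deleting characters."""
    it = iter(s)
    # `ch in it` consumes the iterator up to the first match,
    # so later characters of w must appear later in s.
    return all(ch in it for ch in w)


def lcs_bruteforce(strings: Sequence[str]) -> str:
    """Return one longest common subsequence by exhaustive enumeration.

    Enumerates subsequences of the shortest string, longest first,
    so the first common subsequence found is optimal.
    """
    if not strings:
        return ""
    shortest = min(strings, key=len)
    for length in range(len(shortest), 0, -1):
        for idx in combinations(range(len(shortest)), length):
            w = "".join(shortest[i] for i in idx)
            if all(is_subsequence(w, s) for s in strings):
                return w
    return ""  # only the empty string is common to all inputs


# The example from the definition block.
example = ["ABCDAB", "BDCABA", "BCADBA"]
best = lcs_bruteforce(example)
print(best, len(best))
```

On the example strings this finds a common subsequence of length 4, matching the value stated in the figure caption. The search is exponential in the length of the shortest string, so it is only suitable for small sanity checks like this one.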