Commit c9cc8f1

zazabap and claude committed
docs: add LongestCommonSubsequence to paper and update JSON exports
- Add problem-def entry with formal definition, background, and example
- Add display-name entry
- Add bibliography entries for Maier 1978 and Wagner & Fischer 1974
- Update reduction_graph.json and problem_schemas.json

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 7f5e79d commit c9cc8f1

4 files changed

Lines changed: 161 additions & 82 deletions

docs/paper/reductions.typ

Lines changed: 41 additions & 0 deletions
@@ -51,6 +51,7 @@
   "BicliqueCover": [Biclique Cover],
   "BinPacking": [Bin Packing],
   "ClosestVectorProblem": [Closest Vector Problem],
+  "LongestCommonSubsequence": [Longest Common Subsequence],
 )
 
 // Definition label: "def:<ProblemName>" — each definition block must have a matching label
@@ -886,6 +887,46 @@ Biclique Cover is equivalent to factoring the biadjacency matrix $M$ of the bipa
 ) <fig:binpacking-example>
 ]
 
+#problem-def("LongestCommonSubsequence")[
+Given a finite alphabet $Sigma$ and $k$ strings $s_1, dots, s_k$ over $Sigma$, find a longest string $w$ that is a subsequence of every $s_i$. A string $w$ is a _subsequence_ of $s$ if $w$ can be obtained by deleting zero or more characters from $s$ without changing the order of the remaining characters.
+][
+The Longest Common Subsequence (LCS) problem is one of the fundamental string problems in computer science, listed as SR10 in @garey1979. For $k = 2$ strings of lengths $m$ and $n$, it is solvable in $O(m n)$ time via dynamic programming @wagner1974. However, Maier @maier1978 proved that the problem is NP-hard when $k$ is part of the input, even over a binary alphabet. LCS is central to diff and version control (e.g., `git diff`), bioinformatics (DNA/protein alignment), and data compression. The best known exact algorithm for the general $k$-string case runs in $O^*(2^ell)$ time by brute-force enumeration over subsequences of the shortest string#footnote[No algorithm improving on brute-force enumeration is known for the general $k$-string LCS.], where $ell$ is the length of the shortest string.
+
+*Example.* Consider $k = 3$ strings over $Sigma = {A, B, C, D}$: $s_1 = mono("ABCDAB")$, $s_2 = mono("BDCABA")$, $s_3 = mono("BCADBA")$. An optimal common subsequence is $w = mono("BCAB")$ with length 4. We verify that $w$ is a subsequence of each string by identifying matching positions:
+
+#figure({
+  canvas(length: 1cm, {
+    let strings = (
+      ("A", "B", "C", "D", "A", "B"),
+      ("B", "D", "C", "A", "B", "A"),
+      ("B", "C", "A", "D", "B", "A"),
+    )
+    let labels = ($s_1$, $s_2$, $s_3$)
+    // Positions matched in each string for BCAB
+    let matched = ((1, 2, 4, 5), (0, 2, 3, 4), (0, 1, 2, 4))
+    let dx = 0.7
+    let dy = -1.2
+    for si in range(3) {
+      let y = si * dy
+      draw.content((-0.6, y), labels.at(si))
+      for ci in range(strings.at(si).len()) {
+        let x = ci * dx
+        let is-matched = ci in matched.at(si)
+        let fill-color = if is-matched { graph-colors.at(0) } else { luma(230) }
+        let text-color = if is-matched { white } else { black }
+        draw.rect((x - 0.25, y - 0.25), (x + 0.25, y + 0.25),
+          fill: fill-color, stroke: 0.4pt + luma(120), radius: 2pt)
+        draw.content((x, y), text(9pt, fill: text-color, font: "DejaVu Sans Mono")[#strings.at(si).at(ci)])
+      }
+    }
+    draw.content((strings.at(0).len() * dx / 2 - 0.15, 3 * dy + 0.1),
+      text(8pt)[$w = mono("BCAB"), |w| = 4$])
+  })
+},
+caption: [Longest Common Subsequence of three strings. Blue cells mark the positions forming the common subsequence $w = mono("BCAB")$ of length 4.],
+) <fig:lcs-example>
+]
+
 // Completeness check: warn about problem types in JSON but missing from paper
 #{
   let json-models = {
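
The definition and example added to `reductions.typ` can be checked independently with a short Python sketch. It is not part of this commit and not the repository's implementation; the function names `is_subsequence`, `lcs_length_two`, and `lcs_brute_force` are hypothetical. It covers the subsequence test from the definition, the two-string dynamic program, and the brute-force enumeration over subsequences of the shortest string described in the background paragraph.

```python
from itertools import combinations


def is_subsequence(w, s):
    """Return True if w can be obtained from s by deleting characters."""
    it = iter(s)
    # `ch in it` consumes the iterator, so matches must occur in order.
    return all(ch in it for ch in w)


def lcs_length_two(a, b):
    """Length of an LCS of two strings, using the O(len(a) * len(b)) DP."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_brute_force(strings):
    """Return one LCS of a non-empty list of strings.

    Enumerates subsequences of the shortest string, longest first,
    so the running time is exponential in that string's length.
    """
    shortest = min(strings, key=len)
    for length in range(len(shortest), -1, -1):
        for idx in combinations(range(len(shortest)), length):
            w = "".join(shortest[i] for i in idx)
            if all(is_subsequence(w, s) for s in strings):
                return w


# The three strings from the example in the paper.
strings = ["ABCDAB", "BDCABA", "BCADBA"]
w = lcs_brute_force(strings)
print(w, len(w))
```

On the example strings, the sketch should report a common subsequence of length 4, in agreement with the figure. The brute-force search is only practical for short inputs; it is here to mirror the enumeration argument, not to suggest an efficient solver.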

docs/paper/references.bib

Lines changed: 20 additions & 0 deletions
@@ -1,3 +1,23 @@
+@article{maier1978,
+  author = {David Maier},
+  title = {The Complexity of Some Problems on Subsequences and Supersequences},
+  journal = {Journal of the ACM},
+  volume = {25},
+  number = {2},
+  pages = {322--336},
+  year = {1978}
+}
+
+@article{wagner1974,
+  author = {Robert A. Wagner and Michael J. Fischer},
+  title = {The String-to-String Correction Problem},
+  journal = {Journal of the ACM},
+  volume = {21},
+  number = {1},
+  pages = {168--173},
+  year = {1974}
+}
+
 @inproceedings{karp1972,
   author = {Richard M. Karp},
   title = {Reducibility among Combinatorial Problems},

docs/src/reductions/problem_schemas.json

Lines changed: 11 additions & 0 deletions
@@ -183,6 +183,17 @@
       }
     ]
   },
+  {
+    "name": "LongestCommonSubsequence",
+    "description": "Find the longest string that is a subsequence of every input string",
+    "fields": [
+      {
+        "name": "strings",
+        "type_name": "Vec<Vec<u8>>",
+        "description": "The input strings"
+      }
+    ]
+  },
   {
     "name": "MaxCut",
     "description": "Find maximum weight cut in a graph",
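
The schema entry declares a single field, `strings`, of type `Vec<Vec<u8>>`. The Python sketch below shows one way an instance could be laid out under that schema. It assumes each string is stored as a list of byte values and serialized as JSON arrays of integers; the repository's actual serialization is not shown in this commit and may differ. The helpers `encode_instance` and `decode_instance` are hypothetical.

```python
import json


def encode_instance(strings):
    """Map ASCII strings to the assumed Vec<Vec<u8>> layout (lists of byte values)."""
    return {"strings": [list(s.encode("ascii")) for s in strings]}


def decode_instance(instance):
    """Inverse of encode_instance: turn lists of byte values back into strings."""
    return [bytes(row).decode("ascii") for row in instance["strings"]]


# The three strings from the paper's example, in the assumed layout.
instance = encode_instance(["ABCDAB", "BDCABA", "BCADBA"])
payload = json.dumps(instance)
print(payload)
```

Storing raw bytes keeps the instance independent of any particular alphabet. An implementation could equally map symbols to indices `0..|Sigma|-1`, which the schema alone does not rule out.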
