
Commit ac4791f

zazabap and claude committed
fix: tighten LCS complexity to 2^min_string_length, add edge-case tests
- Change declare_variants! from 2^total_length to 2^min_string_length (actual brute-force enumerates subsequences of shortest string)
- Make min_string_length() public getter (was private shortest_len)
- Add #[should_panic] test for empty input
- Add edge-case test for empty string in input
- Fix paper text to reference shortest string length, not total

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent c9cc8f1 commit ac4791f

3 files changed

Lines changed: 21 additions & 5 deletions


docs/paper/reductions.typ

Lines changed: 1 addition & 1 deletion
@@ -890,7 +890,7 @@ Biclique Cover is equivalent to factoring the biadjacency matrix $M$ of the bipa
 #problem-def("LongestCommonSubsequence")[
 Given a finite alphabet $Sigma$ and $k$ strings $s_1, dots, s_k$ over $Sigma$, find a longest string $w$ that is a subsequence of every $s_i$. A string $w$ is a _subsequence_ of $s$ if $w$ can be obtained by deleting zero or more characters from $s$ without changing the order of the remaining characters.
 ][
-The Longest Common Subsequence (LCS) problem is one of the fundamental string problems in computer science, listed as SR10 in @garey1979. For $k = 2$ strings, it is solvable in $O(m n)$ time via dynamic programming @wagner1974. However, Maier @maier1978 proved that the problem is NP-hard when $k$ is part of the input, even over a binary alphabet. LCS is central to diff and version control (e.g., `git diff`), bioinformatics (DNA/protein alignment), and data compression. The best known exact algorithm for the general $k$-string case runs in $O^*(2^n)$ by brute-force enumeration over subsequences of the shortest string#footnote[No algorithm improving on brute-force enumeration is known for the general $k$-string LCS.], where $n$ is the total length of all strings.
+The Longest Common Subsequence (LCS) problem is one of the fundamental string problems in computer science, listed as SR10 in @garey1979. For $k = 2$ strings, it is solvable in $O(m n)$ time via dynamic programming @wagner1974. However, Maier @maier1978 proved that the problem is NP-hard when $k$ is part of the input, even over a binary alphabet. LCS is central to diff and version control (e.g., `git diff`), bioinformatics (DNA/protein alignment), and data compression. The best known exact algorithm for the general $k$-string case runs in $O^*(2^m)$ by brute-force enumeration over subsequences of the shortest string#footnote[No algorithm improving on brute-force enumeration is known for the general $k$-string LCS.], where $m = min_i |s_i|$ is the length of the shortest string.

 *Example.* Consider $k = 3$ strings over $Sigma = {A, B, C, D}$: $s_1 = mono("ABCDAB")$, $s_2 = mono("BDCABA")$, $s_3 = mono("BCADBA")$. An optimal common subsequence is $w = mono("BCAB")$ with length 4. We verify that $w$ is a subsequence of each string by identifying matching positions:

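The brute-force enumeration described in the paper text can be sketched as follows. This is a minimal illustration, not the repository's implementation: `brute_force_lcs` is a hypothetical name, the bitmask encoding of subsequences is an assumption, and the sketch only handles a shortest string of fewer than 32 bytes.

```rust
/// Returns true if `sub` can be obtained from `full` by deleting characters.
fn is_subsequence(sub: &[u8], full: &[u8]) -> bool {
    let mut it = full.iter();
    // For each byte of `sub`, advance through `full` until a match is found.
    sub.iter().all(|c| it.any(|f| f == c))
}

/// Hypothetical sketch: enumerate all 2^m subsequences of the shortest
/// string (m = its length) and keep the longest one common to every input.
fn brute_force_lcs(strings: &[Vec<u8>]) -> Vec<u8> {
    let shortest = match strings.iter().min_by_key(|s| s.len()) {
        Some(s) => s,
        None => return Vec::new(),
    };
    let m = shortest.len();
    assert!(m < 32, "sketch only supports a shortest string under 32 bytes");

    let mut best: Vec<u8> = Vec::new();
    for mask in 0u32..(1u32 << m) {
        // Skip candidates that cannot improve on the current best.
        if (mask.count_ones() as usize) <= best.len() {
            continue;
        }
        // Bit i of `mask` selects byte i of the shortest string.
        let candidate: Vec<u8> = (0..m)
            .filter(|&i| (mask >> i) & 1 == 1)
            .map(|i| shortest[i])
            .collect();
        if strings.iter().all(|s| is_subsequence(&candidate, s)) {
            best = candidate;
        }
    }
    best
}
```

On the paper's example strings `ABCDAB`, `BDCABA`, `BCADBA`, this sketch finds a common subsequence of length 4, matching the stated optimum. The loop runs at most $2^m$ times regardless of the lengths of the other strings, which is why the bound depends on the shortest string only.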

src/models/misc/longest_common_subsequence.rs

Lines changed: 4 additions & 4 deletions
@@ -95,8 +95,8 @@ impl LongestCommonSubsequence {
             .unwrap_or(0)
     }

-    /// Length of the shortest string.
-    fn shortest_len(&self) -> usize {
+    /// Length of the shortest string (upper bound on LCS length).
+    pub fn min_string_length(&self) -> usize {
         self.strings.iter().map(|s| s.len()).min().unwrap_or(0)
     }
 }
@@ -110,7 +110,7 @@ impl Problem for LongestCommonSubsequence {
     }

     fn dims(&self) -> Vec<usize> {
-        vec![2; self.shortest_len()]
+        vec![2; self.min_string_length()]
     }

     fn evaluate(&self, config: &[usize]) -> SolutionSize<i32> {
@@ -162,7 +162,7 @@ fn is_subsequence(sub: &[u8], full: &[u8]) -> bool {
 }

 crate::declare_variants! {
-    LongestCommonSubsequence => "2^total_length",
+    LongestCommonSubsequence => "2^min_string_length",
 }

 #[cfg(test)]
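To make the size of the correction concrete, the two exponents can be compared on the paper's example strings. This is a small sketch under stated assumptions: `search_space_sizes` is a hypothetical helper, not part of the crate, and it only works while the total length stays below 64.

```rust
/// Hypothetical helper: returns (2^total_length, 2^min_string_length)
/// for the given strings. Assumes the total length is below 64.
fn search_space_sizes(strings: &[&str]) -> (u64, u64) {
    let total: usize = strings.iter().map(|s| s.len()).sum();
    let shortest: usize = strings.iter().map(|s| s.len()).min().unwrap_or(0);
    assert!(total < 64, "sketch only supports total length under 64");
    (1u64 << total, 1u64 << shortest)
}
```

For `ABCDAB`, `BDCABA`, `BCADBA` the total length is 18 and the shortest length is 6, so the old label corresponds to 262,144 configurations while the new one corresponds to 64, which agrees with `dims()` returning `vec![2; 6]`.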

src/unit_tests/models/misc/longest_common_subsequence.rs

Lines changed: 16 additions & 0 deletions
@@ -141,3 +141,19 @@ fn test_lcs_serialization() {
     let restored: LongestCommonSubsequence = serde_json::from_value(json).unwrap();
     assert_eq!(restored.strings(), problem.strings());
 }
+
+#[test]
+#[should_panic(expected = "must have at least one string")]
+fn test_lcs_empty_strings_panics() {
+    LongestCommonSubsequence::new(vec![]);
+}
+
+#[test]
+fn test_lcs_empty_string_in_input() {
+    // One empty string means LCS is always empty
+    let problem = LongestCommonSubsequence::new(vec![vec![], vec![b'A', b'B']]);
+    assert_eq!(problem.dims(), Vec::<usize>::new());
+    let result = problem.evaluate(&[]);
+    assert!(result.is_valid());
+    assert_eq!(result.unwrap(), 0);
+}
