Commit fa40ce7

GiggleLiu, claude, and isPANN authored
Fix #445: [Model] AdditionalKey (#681)
* Add plan for #445: [Model] AdditionalKey

* Implement AdditionalKey satisfaction problem model (#445)

  Add the Additional Key problem from relational database theory (Garey & Johnson SR27). Given a relational schema (R, F) and known candidate keys K, determines whether R has a candidate key not in K.

  - Model file with closure computation, minimality check, and known-key filter
  - 13 unit tests covering creation, evaluation, edge cases, brute force, serialization
  - CLI create support with --num-attributes, --dependencies, --relation-attrs, --known-keys
  - Module registration and re-exports
  - Canonical example-db entry with regenerated fixtures

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Apply rustfmt formatting fixes

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add AdditionalKey model (Batch 1)

  Implement the AdditionalKey satisfaction problem from relational database theory (Garey & Johnson A4 SR27). Includes model, CLI registration, unit tests, and example-db entry.

* Add AdditionalKey problem definition to paper

  Add problem-def entry, display name, and bibliography reference (Beeri & Bernstein, 1979) for the Additional Key problem from relational database theory.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: remove plan file after implementation

* Fix AdditionalKey canonical example to use updated ModelExampleSpec API

  The merge brought in API changes to ModelExampleSpec (removed `build` closure, added direct `instance`/`optimal_config`/`optimal_value` fields). Update AdditionalKey's canonical_model_example_specs to match.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address review findings: panic tests, pinned solution count, help text

  - Add 5 #[should_panic] tests for constructor validation paths
  - Pin brute-force solution count to 2 (additional keys: {0,2} and {0,3,5})
  - Fix --known-keys help text to use [brackets] for optional flag
  - Remove redundant test_additional_key_paper_example

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix merge artifacts: fmt, missing fields in test helper

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Cover data getter methods in creation test

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Xiwei Pan <xiwei.pan@connect.hkust-gz.edu.cn>
1 parent 90f8733 commit fa40ce7

9 files changed

Lines changed: 575 additions & 16 deletions


docs/paper/reductions.typ

Lines changed: 24 additions & 0 deletions
@@ -62,6 +62,7 @@
 
 // Problem display names for theorem headers
 #let display-name = (
+  "AdditionalKey": [Additional Key],
   "MaximumIndependentSet": [Maximum Independent Set],
   "MinimumVertexCover": [Minimum Vertex Cover],
   "MaxCut": [Max-Cut],
@@ -3638,6 +3639,29 @@ A classical NP-complete problem from Garey and Johnson @garey1979[Ch.~3, p.~76],
 ) <fig:d2cif>
 ]
 
+#problem-def("AdditionalKey")[
+  Given a set $A$ of attribute names, a collection $F$ of functional dependencies on $A$,
+  a subset $R subset.eq A$, and a set $K$ of candidate keys for the relational scheme $chevron.l R, F chevron.r$,
+  determine whether there exists a subset $R' subset.eq R$ such that $R' in.not K$,
+  the closure $R'^+$ under $F$ equals $R$, and no proper subset of $R'$ also has this property.
+][
+  A classical NP-complete problem from relational database theory @beeri1979.
+  Enumerating all candidate keys is necessary to verify Boyce-Codd Normal Form (BCNF),
+  and the NP-completeness of Additional Key implies that BCNF testing is intractable in general.
+  The best known exact algorithm is brute-force enumeration of all $2^(|R|)$ subsets,
+  checking each for the key property via closure computation under Armstrong's axioms.
+  #footnote[No algorithm improving on brute-force is known for the Additional Key problem.]
+
+  *Example.* Consider attribute set $A = {0, 1, 2, 3, 4, 5}$ with functional dependencies
+  $F = {{0,1} -> {2,3}, {2,3} -> {4,5}, {4,5} -> {0,1}, {0,2} -> {3}, {3,5} -> {1}}$,
+  relation $R = A$, and known keys $K = {{0,1}, {2,3}, {4,5}}$.
+  The subset ${0,2}$ is an additional key: starting from ${0,2}$, we apply ${0,2} -> {3}$
+  to get ${0,2,3}$, then ${2,3} -> {4,5}$ to get ${0,2,3,4,5}$, then ${4,5} -> {0,1}$
+  to reach ${0,2}^+ = A$. The set ${0,2}$ is minimal (neither ${0}$ nor ${2}$ alone determines $A$)
+  and ${0,2} in.not K$, so the answer is YES.
+]
+
+
 #{
   let x = load-model-example("ConjunctiveBooleanQuery")
   let d = x.instance.domain_size
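
The closure and minimality checks in the problem definition above can be sketched in a few lines of standalone Rust. This is only an illustration under stated assumptions, not the model file from this commit: the names `closure`, `is_candidate_key`, `additional_keys`, `set`, and `paper_example` are hypothetical, and "determines $R$" is read as "the closure contains $R$", which coincides with the definition's equality when $R = A$, as in the paper example.

```rust
use std::collections::BTreeSet;

/// A functional dependency `lhs -> rhs` over attribute indices (hypothetical alias).
type Dependency = (Vec<usize>, Vec<usize>);

/// Build a set from a slice of attribute indices.
fn set(xs: &[usize]) -> BTreeSet<usize> {
    xs.iter().copied().collect()
}

/// Closure of `start` under `deps`: keep firing every dependency whose
/// left-hand side is already contained in the set until nothing changes.
fn closure(start: &BTreeSet<usize>, deps: &[Dependency]) -> BTreeSet<usize> {
    let mut closed = start.clone();
    let mut changed = true;
    while changed {
        changed = false;
        for (lhs, rhs) in deps {
            if lhs.iter().all(|a| closed.contains(a)) {
                for &a in rhs {
                    // `insert` returns true only when `a` was not yet present.
                    changed |= closed.insert(a);
                }
            }
        }
    }
    closed
}

/// True when `candidate` is a subset of `relation`, determines all of
/// `relation`, and is minimal. Closure is monotone, so it is enough to test
/// the subsets obtained by dropping a single attribute.
fn is_candidate_key(
    candidate: &BTreeSet<usize>,
    relation: &BTreeSet<usize>,
    deps: &[Dependency],
) -> bool {
    if !candidate.is_subset(relation) || !closure(candidate, deps).is_superset(relation) {
        return false;
    }
    candidate.iter().all(|&a| {
        let mut smaller = candidate.clone();
        smaller.remove(&a);
        !closure(&smaller, deps).is_superset(relation)
    })
}

/// Brute-force search over all 2^|R| subsets of `relation` (small instances
/// only: the mask is a u32). Returns the candidate keys not listed in `known`.
fn additional_keys(
    relation: &BTreeSet<usize>,
    deps: &[Dependency],
    known: &[BTreeSet<usize>],
) -> Vec<BTreeSet<usize>> {
    let attrs: Vec<usize> = relation.iter().copied().collect();
    let mut found = Vec::new();
    for mask in 0u32..(1u32 << attrs.len()) {
        let candidate: BTreeSet<usize> = attrs
            .iter()
            .enumerate()
            .filter(|&(i, _)| (mask >> i) & 1 == 1)
            .map(|(_, &a)| a)
            .collect();
        if !known.contains(&candidate) && is_candidate_key(&candidate, relation, deps) {
            found.push(candidate);
        }
    }
    found
}

/// The instance from the paper example: R = A = {0..5}, five dependencies,
/// and known keys {0,1}, {2,3}, {4,5}.
fn paper_example() -> (BTreeSet<usize>, Vec<Dependency>, Vec<BTreeSet<usize>>) {
    let relation = set(&[0, 1, 2, 3, 4, 5]);
    let deps = vec![
        (vec![0, 1], vec![2, 3]),
        (vec![2, 3], vec![4, 5]),
        (vec![4, 5], vec![0, 1]),
        (vec![0, 2], vec![3]),
        (vec![3, 5], vec![1]),
    ];
    let known = vec![set(&[0, 1]), set(&[2, 3]), set(&[4, 5])];
    (relation, deps, known)
}
```

On the paper example, `additional_keys` finds exactly the two sets named in the commit message, {0,2} and {0,3,5}. The single-attribute-removal test keeps the minimality check linear in the candidate size instead of enumerating every proper subset.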

docs/paper/references.bib

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -198,6 +198,17 @@ @article{lucas2014
   year = {2014}
 }
 
+@article{beeri1979,
+  author = {Catriel Beeri and Philip A. Bernstein},
+  title = {Computational Problems Related to the Design of Normal Form Relational Schemas},
+  journal = {ACM Transactions on Database Systems},
+  volume = {4},
+  number = {1},
+  pages = {30--59},
+  year = {1979},
+  doi = {10.1145/320064.320066}
+}
+
 @article{barahona1982,
   author = {Francisco Barahona},
   title = {On the computational complexity of Ising spin glass models},

problemreductions-cli/src/cli.rs

Lines changed: 13 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -257,6 +257,7 @@ Flags by problem type:
 OptimalLinearArrangement --graph, --bound
 RuralPostman (RPP) --graph, --edge-weights, --required-edges, --bound
 MultipleChoiceBranching --arcs [--weights] --partition --bound [--num-vertices]
+AdditionalKey --num-attributes, --dependencies, --relation-attrs [--known-keys]
 SubgraphIsomorphism --graph (host), --pattern (pattern)
 LCS --strings, --bound [--alphabet-size]
 FAS --arcs [--weights] [--num-vertices]
@@ -540,6 +541,18 @@ pub struct CreateArgs {
     /// Alphabet size for LCS, SCS, or StringToStringCorrection (optional; inferred from the input strings if omitted)
     #[arg(long)]
     pub alphabet_size: Option<usize>,
+    /// Number of attributes for AdditionalKey or MinimumCardinalityKey
+    #[arg(long)]
+    pub num_attributes: Option<usize>,
+    /// Functional dependencies for AdditionalKey (e.g., "0,1:2,3;2,3:4,5") or MinimumCardinalityKey (semicolon-separated "lhs>rhs" pairs, e.g., "0,1>2;0,2>3")
+    #[arg(long)]
+    pub dependencies: Option<String>,
+    /// Relation scheme attributes for AdditionalKey (comma-separated, e.g., "0,1,2,3,4,5")
+    #[arg(long)]
+    pub relation_attrs: Option<String>,
+    /// Known candidate keys for AdditionalKey (e.g., "0,1;2,3")
+    #[arg(long)]
+    pub known_keys: Option<String>,
     /// Domain size for ConjunctiveBooleanQuery
     #[arg(long)]
     pub domain_size: Option<usize>,
@@ -558,12 +571,6 @@ pub struct CreateArgs {
     /// Number of groups for SumOfSquaresPartition
     #[arg(long)]
     pub num_groups: Option<usize>,
-    /// Functional dependencies for MinimumCardinalityKey (semicolon-separated "lhs>rhs" pairs, e.g., "0,1>2;0,2>3")
-    #[arg(long)]
-    pub dependencies: Option<String>,
-    /// Number of attributes for MinimumCardinalityKey
-    #[arg(long)]
-    pub num_attributes: Option<usize>,
     /// Source string for StringToStringCorrection (comma-separated symbol indices, e.g., "0,1,2,3")
     #[arg(long)]
     pub source_string: Option<String>,

problemreductions-cli/src/commands/create.rs

Lines changed: 53 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@ use problemreductions::models::graph::{
     MultipleChoiceBranching, SteinerTree, StrongConnectivityAugmentation,
 };
 use problemreductions::models::misc::{
-    BinPacking, BoyceCoddNormalFormViolation, CbqRelation, ConjunctiveBooleanQuery,
+    AdditionalKey, BinPacking, BoyceCoddNormalFormViolation, CbqRelation, ConjunctiveBooleanQuery,
     FlowShopScheduling, LongestCommonSubsequence, MinimumTardinessSequencing,
     MultiprocessorScheduling, PaintShop, PartiallyOrderedKnapsack, QueryArg,
     RectilinearPictureCompression, ResourceConstrainedScheduling,
@@ -121,6 +121,10 @@ fn all_data_flags_empty(args: &CreateArgs) -> bool {
         && args.sink_2.is_none()
         && args.requirement_1.is_none()
         && args.requirement_2.is_none()
+        && args.num_attributes.is_none()
+        && args.dependencies.is_none()
+        && args.relation_attrs.is_none()
+        && args.known_keys.is_none()
         && args.domain_size.is_none()
         && args.relations.is_none()
         && args.conjuncts_spec.is_none()
@@ -412,6 +416,7 @@ fn example_for(canonical: &str, graph_type: Option<&str>) -> &'static str {
         "MultipleChoiceBranching" => {
             "--arcs \"0>1,0>2,1>3,2>3,1>4,3>5,4>5,2>4\" --weights 3,2,4,1,2,3,1,3 --partition \"0,1;2,3;4,7;5,6\" --bound 10"
         }
+        "AdditionalKey" => "--num-attributes 6 --dependencies \"0,1:2,3;2,3:4,5;4,5:0,1\" --relation-attrs 0,1,2,3,4,5 --known-keys \"0,1;2,3;4,5\"",
         "SubgraphIsomorphism" => "--graph 0-1,1-2,2-0 --pattern 0-1",
         "RectilinearPictureCompression" => {
             "--matrix \"1,1,0,0;1,1,0,0;0,0,1,1;0,0,1,1\" --k 2"
@@ -1402,6 +1407,51 @@ pub fn create(args: &CreateArgs, out: &OutputConfig) -> Result<()> {
             }
         }
 
+        // AdditionalKey
+        "AdditionalKey" => {
+            let usage = "Usage: pred create AdditionalKey --num-attributes 6 --dependencies \"0,1:2,3;2,3:4,5\" --relation-attrs \"0,1,2,3,4,5\" --known-keys \"0,1;2,3\"";
+            let num_attributes = args.num_attributes.ok_or_else(|| {
+                anyhow::anyhow!("AdditionalKey requires --num-attributes\n\n{usage}")
+            })?;
+            let deps_str = args.dependencies.as_deref().ok_or_else(|| {
+                anyhow::anyhow!("AdditionalKey requires --dependencies\n\n{usage}")
+            })?;
+            let ra_str = args.relation_attrs.as_deref().ok_or_else(|| {
+                anyhow::anyhow!("AdditionalKey requires --relation-attrs\n\n{usage}")
+            })?;
+            let dependencies: Vec<(Vec<usize>, Vec<usize>)> = deps_str
+                .split(';')
+                .map(|dep| {
+                    let parts: Vec<&str> = dep.trim().split(':').collect();
+                    anyhow::ensure!(
+                        parts.len() == 2,
+                        "Invalid dependency format '{}', expected 'lhs:rhs' (e.g., '0,1:2,3')",
+                        dep.trim()
+                    );
+                    let lhs: Vec<usize> = util::parse_comma_list(parts[0].trim())?;
+                    let rhs: Vec<usize> = util::parse_comma_list(parts[1].trim())?;
+                    Ok((lhs, rhs))
+                })
+                .collect::<Result<Vec<_>>>()?;
+            let relation_attrs: Vec<usize> = util::parse_comma_list(ra_str)?;
+            let known_keys: Vec<Vec<usize>> = match args.known_keys.as_deref() {
+                Some(s) if !s.is_empty() => s
+                    .split(';')
+                    .map(|k| util::parse_comma_list(k.trim()))
+                    .collect::<Result<Vec<_>>>()?,
+                _ => vec![],
+            };
+            (
+                ser(AdditionalKey::new(
+                    num_attributes,
+                    dependencies,
+                    relation_attrs,
+                    known_keys,
+                ))?,
+                resolved_variant.clone(),
+            )
+        }
+
         // SubsetSum
         "SubsetSum" => {
             let sizes_str = args.sizes.as_deref().ok_or_else(|| {
@@ -4439,6 +4489,8 @@ mod tests {
             domain_size: None,
             relations: None,
             conjuncts_spec: None,
+            relation_attrs: None,
+            known_keys: None,
             costs: None,
             cut_bound: None,
             size_bound: None,
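
To make the flag formats accepted by `pred create AdditionalKey` concrete, here is a minimal standalone sketch of parsing `--dependencies` ("lhs:rhs" pairs separated by `;`) and `--known-keys` (keys separated by `;`). It assumes only the formats shown in the help text above. The functions `parse_attrs`, `parse_dependencies`, and `parse_known_keys` are hypothetical stand-ins that use plain `String` errors; they are not the crate's `util::parse_comma_list` or its `anyhow`-based error handling.

```rust
/// Parse a comma-separated list of attribute indices such as "0,1,2".
/// (Hypothetical helper; errors are plain strings to stay std-only.)
fn parse_attrs(s: &str) -> Result<Vec<usize>, String> {
    s.split(',')
        .map(|tok| {
            tok.trim()
                .parse::<usize>()
                .map_err(|e| format!("invalid attribute index '{}': {}", tok.trim(), e))
        })
        .collect()
}

/// Parse "lhs:rhs;lhs:rhs" (e.g. "0,1:2,3;2,3:4,5") into (lhs, rhs) pairs.
fn parse_dependencies(s: &str) -> Result<Vec<(Vec<usize>, Vec<usize>)>, String> {
    s.split(';')
        .map(|dep| -> Result<(Vec<usize>, Vec<usize>), String> {
            // Each dependency needs a ':' separating the two sides.
            let (lhs, rhs) = dep.trim().split_once(':').ok_or_else(|| {
                format!("invalid dependency '{}', expected 'lhs:rhs'", dep.trim())
            })?;
            Ok((parse_attrs(lhs)?, parse_attrs(rhs)?))
        })
        .collect()
}

/// Parse the optional known-keys flag (e.g. "0,1;2,3"). A missing or empty
/// value yields an empty list, matching the optional [--known-keys] flag.
fn parse_known_keys(s: Option<&str>) -> Result<Vec<Vec<usize>>, String> {
    match s {
        Some(s) if !s.is_empty() => s.split(';').map(parse_attrs).collect(),
        _ => Ok(Vec::new()),
    }
}
```

For example, `parse_dependencies("0,1:2,3;2,3:4,5")` yields the two pairs `([0, 1], [2, 3])` and `([2, 3], [4, 5])`, while the MinimumCardinalityKey-style string `"0,1>2"` is rejected because it contains no `:`.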

src/lib.rs

Lines changed: 8 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -60,13 +60,14 @@ pub mod prelude {
         UndirectedTwoCommodityIntegralFlow,
     };
     pub use crate::models::misc::{
-        BinPacking, BoyceCoddNormalFormViolation, CbqRelation, ConjunctiveBooleanQuery,
-        ConjunctiveQueryFoldability, Factoring, FlowShopScheduling, Knapsack,
-        LongestCommonSubsequence, MinimumTardinessSequencing, MultiprocessorScheduling, PaintShop,
-        Partition, QueryArg, RectilinearPictureCompression, ResourceConstrainedScheduling,
-        SequencingToMinimizeMaximumCumulativeCost, SequencingWithReleaseTimesAndDeadlines,
-        SequencingWithinIntervals, ShortestCommonSupersequence, StaffScheduling,
-        StringToStringCorrection, SubsetSum, SumOfSquaresPartition, Term,
+        AdditionalKey, BinPacking, BoyceCoddNormalFormViolation, CbqRelation,
+        ConjunctiveBooleanQuery, ConjunctiveQueryFoldability, Factoring, FlowShopScheduling,
+        Knapsack, LongestCommonSubsequence, MinimumTardinessSequencing, MultiprocessorScheduling,
+        PaintShop, Partition, QueryArg, RectilinearPictureCompression,
+        ResourceConstrainedScheduling, SequencingToMinimizeMaximumCumulativeCost,
+        SequencingWithReleaseTimesAndDeadlines, SequencingWithinIntervals,
+        ShortestCommonSupersequence, StaffScheduling, StringToStringCorrection, SubsetSum,
+        SumOfSquaresPartition, Term,
     };
     pub use crate::models::set::{
         ComparativeContainment, ConsecutiveSets, ExactCoverBy3Sets, MaximumSetPacking,
