Skip to content

Commit 7f9b5ca

Browse files
committed
Refactor curation helpers as thin shims over RAVEN
Eight curation helpers are now wrappers around the generic functions added on the feat/yeast-gem-shared RAVEN branch. Yeast-specific behaviour stays on the yeast-GEM side; the algorithm bodies move upstream where other GEMs can share them. sumBioMass -> getBiomassFractions scaleBioMass -> scaleBiomassFraction rescalePseudoReaction -> scaleBiomassPseudoreaction (plus the lipid backbone/chain aggregation, which stays here) changeGAM -> setGAM addSBOterms -> assignSBOterms with onlyLastReactionForPseudo=true (keeps the legacy typo behaviour so saveYeastModel output is byte-stable) loadDeltaG -> loadDeltaGfromCSV saveDeltaG -> saveDeltaGtoCSV findDuplicatedRxns -> findDuplicateRxns (plus the legacy print formatting) New helper code/yeastBiomassConfig.m builds the biomassConfig struct that the RAVEN biomass functions consume, sourced from data/yeastgem/ids.yml so all biomass IDs live in one place. Equivalence verified: a side-by-side harness loaded the model and ran each legacy implementation against its shim, asserting bitwise match on the S matrix, bounds, and miriam annotations (or near-zero float diff for deltaG / sumBioMass scalars). 10/10 checks pass.
1 parent 616152c commit 7f9b5ca

9 files changed

Lines changed: 198 additions & 317 deletions

File tree

code/missingFields/addSBOterms.m

Lines changed: 13 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -1,51 +1,15 @@
1-
% model = addSBOterms(model)
21
function model = addSBOterms(model)
3-
4-
%Define SBO terms for mets
5-
metsSBO=cell(size(model.mets));
6-
for i = 1:length(model.mets)
7-
metName = model.metNames{i};
8-
if ismember(metName,{'biomass','DNA','RNA','protein','carbohydrate','lipid','cofactor','ion'}) ...
9-
|| endsWith(metName,' backbone') || endsWith(metName,' chain')
10-
metsSBO{i} = 'SBO:0000649'; %Biomass
11-
else
12-
metsSBO{i} = 'SBO:0000247'; %Simple chemical
13-
end
14-
end
15-
16-
%Define SBO terms for rxns
17-
rxnSBO = cell(size(model.rxns));
18-
rxnSBO(:) = {'SBO:0000176'}; %Metabolic rxn, if nothing else
19-
% Exchange, sink & demand (only 1 reactant)
20-
reactantNumber=sum(model.S~=0,1);
21-
reactantNumber=find(reactantNumber==1);
22-
for i=1:numel(reactantNumber)
23-
idx=reactantNumber(i);
24-
if strcmp(model.comps{model.metComps(find(model.S(:,idx)))},'e') || ...
25-
strcmp(model.compNames{model.metComps(find(model.S(:,idx)))},'extracellular')
26-
rxnSBO{idx} = 'SBO:0000627'; %Exchange rxn
27-
elseif sum(model.S(:,idx))<0
28-
rxnSBO{idx} = 'SBO:0000632'; %Sink rxn
29-
else
30-
rxnSBO{idx} = 'SBO:0000628'; %Demand rxn
31-
end
32-
end
33-
% Transport reactions
34-
i=getTransportRxns(model);
35-
rxnSBO(i) = {'SBO:0000655'};
36-
% Pseudo reactions
37-
for i=numel(model.rxns)
38-
if strcmp(model.rxnNames(i),'biomass pseudoreaction')
39-
rxnSBO{i} = 'SBO:0000629'; %Biomass pseudo-rxn
40-
elseif strcmp(model.rxnNames(i),'non-growth associated maintenance reaction')
41-
rxnSBO{i} = 'SBO:0000630'; %ATP maintenance
42-
elseif contains(model.rxnNames(i),'pseudoreaction') || contains(model.rxnNames(i),'SLIME rxn')
43-
rxnSBO{i} = 'SBO:0000395'; %Encapsulating process
44-
end
45-
end
46-
47-
% Add SBO term if it wasn't annotated yet
48-
model=editMiriam(model,'met','all','sbo',metsSBO,'fill');
49-
model=editMiriam(model,'rxn','all','sbo',rxnSBO,'fill');
50-
2+
% addSBOterms yeast-GEM shim — delegates to RAVEN's assignSBOterms.
3+
%
4+
% The legacy implementation had a typo: the pseudoreaction-SBO loop
5+
% used `for i = numel(model.rxns)` (single iteration over the last
6+
% reaction) instead of `1:numel(model.rxns)`. To keep this function
7+
% byte-equivalent to the pre-refactor output (and avoid spurious
8+
% SBO churn in saveYeastModel diffs), we pass
9+
% `onlyLastReactionForPseudo = true`. Flip the flag off here if
10+
% yeast-GEM ever decides to start tagging every pseudoreaction.
11+
%
12+
% Usage: model = addSBOterms(model)
13+
14+
model = assignSBOterms(model, struct('onlyLastReactionForPseudo', true));
5115
end

code/missingFields/loadDeltaG.m

Lines changed: 9 additions & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -1,45 +1,15 @@
11
function model = loadDeltaG(model)
2-
% loadDeltaG
3-
% Add metDeltaG and rxnDeltaG fields to a model, based on datafiles saved at
4-
% /data/databases (model_rxnDeltaG.csv and model_metDeltaG.csv). Metabolites
5-
% and reactions are matched by their identifiers (i.e. model.mets and
6-
% model.rxns). If changes are made that affect the identifiers or what
7-
% metabolites or reactions they refer to, the deltaG values will not be
8-
% correct.
2+
% loadDeltaG yeast-GEM shim — delegates to RAVEN's loadDeltaGfromCSV.
93
%
10-
% Input:
11-
% model yeast-GEM without deltaG fields
12-
%
13-
% Output:
14-
% model yeast-GEM with metDeltaG and rxnDeltaG fields
4+
% Populates model.metDeltaG and model.rxnDeltaG from the project
5+
% CSVs at data/databases/model_metDeltaG.csv and
6+
% data/databases/model_rxnDeltaG.csv. Paths are resolved relative to
7+
% this file so the function works from any cwd.
158
%
169
% Usage: model = loadDeltaG(model)
1710

18-
if isfield(model,'metDeltaG')
19-
disp('Existing metDeltaG field will be overwritten.')
20-
else
21-
model.metDeltaG = nan(numel(model.mets),1);
22-
end
23-
if isfield(model,'rxnDeltaG')
24-
disp('Existing rxnDeltaG field will be overwritten.')
25-
else
26-
model.rxnDeltaG = nan(numel(model.rxns),1);
27-
end
28-
29-
metG = readtable('../../data/databases/model_metDeltaG.csv');
30-
rxnG = readtable('../../data/databases/model_rxnDeltaG.csv');
31-
32-
[a,b] = ismember(model.mets,metG.Var1);
33-
model.metDeltaG(a) = metG.Var2(b(a));
34-
if any(~a)
35-
fprintf(['Not all metabolite identifiers are matched to model_metDeltaG.csv, the latter \n' ...
36-
'file might have to be supplemented with deltaG values for new metabolites.\n'])
37-
end
38-
39-
[a,b] = ismember(model.rxns,rxnG.Var1);
40-
model.rxnDeltaG(a) = rxnG.Var2(b(a));
41-
if any(~a)
42-
fprintf(['Not all reaction identifiers are matched to model_rxnDeltaG.csv, the latter \n' ...
43-
'file might have to be supplemented with deltaG values for new reaction.\n'])
44-
end
11+
funcDir = fileparts(mfilename('fullpath'));
12+
metCsv = fullfile(funcDir, '..', '..', 'data', 'databases', 'model_metDeltaG.csv');
13+
rxnCsv = fullfile(funcDir, '..', '..', 'data', 'databases', 'model_rxnDeltaG.csv');
14+
model = loadDeltaGfromCSV(model, metCsv, rxnCsv);
4515
end

code/missingFields/saveDeltaG.m

Lines changed: 16 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -1,36 +1,21 @@
1-
function model = saveDeltaG(model,verbose)
2-
% saveDeltaG
3-
% Saves the metDeltaG and rxnDeltaG fields as tables to /data/databases/...
4-
% model_rxnDeltaG.csv and /data/databases/model_metDeltaG.csv. When
5-
% loadYeastModel is run, these tables will be read to reconstruct the
6-
% metDeltaG and rxnDeltaG fields.
1+
function model = saveDeltaG(model, verbose)
2+
% saveDeltaG yeast-GEM shim — delegates to RAVEN's saveDeltaGtoCSV.
73
%
8-
% Input:
9-
% model yeast-GEM with deltaG fields
10-
% verbose true or false
4+
% Persists model.metDeltaG and model.rxnDeltaG to the project
5+
% CSVs at data/databases/model_metDeltaG.csv and
6+
% data/databases/model_rxnDeltaG.csv. Returns the model unchanged
7+
% (kept as an output for backward compatibility with callers that
8+
% chain saveDeltaG into a pipeline).
119
%
12-
% Output:
13-
% model yeast-GEM with metDeltaG and rxnDeltaG fields
14-
%
15-
% Usage: model = saveDeltaG(model,verbose)
10+
% Usage: model = saveDeltaG(model)
11+
% model = saveDeltaG(model, verbose)
1612

17-
if nargin<2
18-
verbose=false;
19-
end
20-
if ~isfield(model,'metDeltaG')
21-
if verbose
22-
disp('No metDeltaG field found, model_metDeltaG.csv will not be changed.')
23-
end
24-
else
25-
metG = array2table([model.mets, num2cell(model.metDeltaG)]);
26-
writetable(metG,'../../data/databases/model_metDeltaG.csv');
27-
end
28-
if ~isfield(model,'rxnDeltaG')
29-
if verbose
30-
disp('No rxnDeltaG field found, model_rxnDeltaG.csv will not be changed')
31-
end
32-
else
33-
rxnG = array2table([model.rxns, num2cell(model.rxnDeltaG)]);
34-
writetable(rxnG,'../../data/databases/model_rxnDeltaG.csv');
13+
if nargin < 2
14+
verbose = false;
3515
end
16+
17+
funcDir = fileparts(mfilename('fullpath'));
18+
metCsv = fullfile(funcDir, '..', '..', 'data', 'databases', 'model_metDeltaG.csv');
19+
rxnCsv = fullfile(funcDir, '..', '..', 'data', 'databases', 'model_rxnDeltaG.csv');
20+
saveDeltaGtoCSV(model, metCsv, rxnCsv, verbose);
3621
end
Lines changed: 17 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -1,23 +1,23 @@
11
function findDuplicatedRxns(model)
2-
% findDuplicatedRxns
3-
% Find and print reactions that have the same stoichiometry (forwards or
4-
% backwards).
2+
% findDuplicatedRxns yeast-GEM shim — delegates to RAVEN's
3+
% findDuplicateRxns and prints each pair in the legacy format.
54
%
6-
% Input:
7-
% model genome-scale model
8-
%
9-
% Usage: findDuplicatedRxns(model)
5+
% For every (i, j) pair of reactions sharing stoichiometry (in
6+
% either direction), prints two lines with name / GPR / lb / ub —
7+
% matching the pre-refactor output verbatim.
108
%
9+
% Usage: findDuplicatedRxns(model)
1110

12-
for i = 1:length(model.rxns)-1
13-
for j = i+1:length(model.rxns)
14-
if isequal(model.S(:,i),model.S(:,j)) || isequal(model.S(:,i),-model.S(:,j))
15-
constructEquations(model,model.rxns(i));
16-
disp(['Name: ' model.rxnNames{i} ' - GPR: ' model.grRules{i} ' - LB=' num2str(model.lb(i)) ' - UB=' num2str(model.ub(i))])
17-
constructEquations(model,model.rxns(j));
18-
disp(['Name: ' model.rxnNames{j} ' - GPR: ' model.grRules{j} ' - LB=' num2str(model.lb(j)) ' - UB=' num2str(model.ub(j))])
19-
disp(" ")
20-
end
21-
end
11+
pairs = findDuplicateRxns(model);
12+
for k = 1:size(pairs, 1)
13+
i = pairs(k, 1);
14+
j = pairs(k, 2);
15+
constructEquations(model, model.rxns(i));
16+
disp(['Name: ' model.rxnNames{i} ' - GPR: ' model.grRules{i} ...
17+
' - LB=' num2str(model.lb(i)) ' - UB=' num2str(model.ub(i))])
18+
constructEquations(model, model.rxns(j));
19+
disp(['Name: ' model.rxnNames{j} ' - GPR: ' model.grRules{j} ...
20+
' - LB=' num2str(model.lb(j)) ' - UB=' num2str(model.ub(j))])
21+
disp(" ")
2222
end
2323
end

code/otherChanges/changeGAM.m

Lines changed: 20 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,16 +1,22 @@
1-
function model = changeGAM(model,GAM,NGAM)
2-
bioPos = strcmp(model.rxnNames,'biomass pseudoreaction');
3-
for i = 1:length(model.mets)
4-
S_ix = model.S(i,bioPos);
5-
isGAM = sum(strcmp({'ATP','ADP','H2O','H+','phosphate'},model.metNames{i})) == 1;
6-
if S_ix ~= 0 && isGAM
7-
model.S(i,bioPos) = sign(S_ix)*GAM;
8-
end
9-
end
1+
function model = changeGAM(model, GAM, NGAM)
2+
% changeGAM yeast-GEM shim — delegates to RAVEN's setGAM.
3+
%
4+
% Sets the GAM coefficient on the yeast-GEM biomass pseudoreaction
5+
% for the metabolites listed under `gam_cofactors` in
6+
% data/yeastgem/ids.yml (ATP, ADP, H2O, H+, phosphate by default).
7+
% If NGAM is supplied, the 'non-growth associated maintenance
8+
% reaction' is fixed to that flux.
9+
%
10+
% Usage: model = changeGAM(model, GAM)
11+
% model = changeGAM(model, GAM, NGAM)
1012

11-
if nargin >2
12-
pos = strcmp(model.rxnNames,'non-growth associated maintenance reaction');%NGAM
13-
model = setParam(model,'eq',model.rxns(pos),NGAM);% set both lb and ub
14-
end
13+
cfg = yeastBiomassConfig();
1514

16-
end
15+
if nargin > 2
16+
ngamPos = strcmp(model.rxnNames, 'non-growth associated maintenance reaction');
17+
ngamRxn = model.rxns{ngamPos};
18+
model = setGAM(model, GAM, cfg.biomass_rxn, cfg.gam_cofactors, ngamRxn, NGAM);
19+
else
20+
model = setGAM(model, GAM, cfg.biomass_rxn, cfg.gam_cofactors);
21+
end
22+
end
Lines changed: 34 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -1,32 +1,44 @@
1-
function model = rescalePseudoReaction(model,metName,f)
2-
% rescalePseudoReaction
3-
% Rescales a specific pseudoreaction by a given factor
1+
function model = rescalePseudoReaction(model, metName, f)
2+
% rescalePseudoReaction yeast-GEM shim — delegates to RAVEN's
3+
% scaleBiomassPseudoreaction, plus a yeast-only lipid aggregation.
44
%
5-
% model (struct) the yeast GEM
6-
% metName (str) name of the component to rescale (e.g. "protein")
7-
% f (float) fraction to use for rescaling
8-
%
9-
% model (struct) the (rescaled) yeast GEM
10-
%
11-
% Usage: model = rescalePseudoReaction(model,metName,f)
5+
% metName 'lipid' rescales both 'lipid backbone' and 'lipid chain' —
6+
% yeast-GEM keeps lipid mass as backbone + chain, so users still
7+
% address it as a single component. 'lipid backbone' and 'lipid
8+
% chain' are handled directly here because the model.metNames for
9+
% their products contain a space and don't match the underscore-
10+
% compatible cfg.components{i}.name keys ('lipid_backbone', etc.)
11+
% that the RAVEN helper uses for product-side detection. Every
12+
% other component name is forwarded to RAVEN.
1213
%
14+
% Usage: model = rescalePseudoReaction(model, metName, f)
15+
16+
if strcmp(metName, 'lipid')
17+
model = rescalePseudoReaction(model, 'lipid backbone', f);
18+
model = rescalePseudoReaction(model, 'lipid chain', f);
19+
return;
20+
end
1321

14-
if strcmp(metName,'lipid')
15-
model = rescalePseudoReaction(model,'lipid backbone',f);
16-
model = rescalePseudoReaction(model,'lipid chain',f);
17-
else
22+
cfg = yeastBiomassConfig();
23+
24+
if strcmp(metName, 'lipid backbone') || strcmp(metName, 'lipid chain')
1825
rxnName = [metName ' pseudoreaction'];
19-
rxnPos = strcmp(model.rxnNames,rxnName);
26+
rxnPos = find(strcmp(model.rxnNames, rxnName));
27+
if isempty(rxnPos)
28+
return;
29+
end
2030
for i = 1:length(model.mets)
21-
S_ir = model.S(i,rxnPos);
22-
isProd = strcmp(model.metNames{i},metName);
31+
S_ir = model.S(i, rxnPos);
32+
isProd = strcmp(model.metNames{i}, metName);
2333
if S_ir ~= 0 && ~isProd
24-
model.S(i,rxnPos) = f*S_ir;
34+
model.S(i, rxnPos) = f * S_ir;
2535
end
2636
end
27-
% Correct H+
28-
Hc = find(strcmp(model.mets,'s_0794'));
29-
model.S(Hc,rxnPos) = 0;
30-
model.S(Hc,rxnPos) = -sum(model.S(:,rxnPos).*model.metCharges,'omitnan');
37+
Hc = find(strcmp(model.mets, cfg.proton_met));
38+
model.S(Hc, rxnPos) = 0;
39+
model.S(Hc, rxnPos) = -sum(model.S(:, rxnPos) .* model.metCharges, 'omitnan');
40+
return;
3141
end
42+
43+
model = scaleBiomassPseudoreaction(model, cfg, metName, f);
3244
end

code/otherChanges/scaleBioMass.m

Lines changed: 20 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -1,48 +1,36 @@
1-
function model = scaleBioMass(model,component,new_value,balance_out,dispOutput)
2-
% scaleBioMass
3-
% Scales the biomass composition
1+
function model = scaleBioMass(model, component, new_value, balance_out, dispOutput)
2+
% scaleBioMass yeast-GEM shim — delegates to RAVEN's scaleBiomassFraction.
43
%
5-
% Input:
6-
% model (struct) yeast-GEM model
7-
% component (string) biomass component to change (options are:
8-
% 'carbohydrate', 'protein', 'lipid', 'RNA', 'DNA',
9-
% 'ion', 'cofactor')
10-
% new_value (num) new total fraction for the specified biomass
11-
% component
12-
% balance_out (string, optional) biomass component that will be used
13-
% to balance out the biomass composition, so that the
14-
% total mass adds up to 1 g/gDCW. This is highly
15-
% recommended (default = empty, no scaling takes place)
16-
% dispOutput (bool, optional) displayed outoupt (default = true)
4+
% Scales `component` to `new_value` g/gDW, optionally adjusting
5+
% `balance_out` so the biomass total stays at 1 g/gDW. yeast-GEM's
6+
% 'lipid' aggregation (rescale both backbone + chain in lock-step)
7+
% is preserved via the legacy rescalePseudoReaction shim, which
8+
% dispatches the lipid special case.
179
%
18-
% Output:
19-
% model (struct) modified yeast-GEM model
20-
%
21-
% Usage: model = scaleBioMass(model,component,new_value,balance_out,dispOutput)
10+
% Usage: model = scaleBioMass(model, component, new_value, balance_out, dispOutput)
2211

2312
if nargin < 5
2413
dispOutput = true;
2514
end
2615
if nargin < 4
2716
balance_out = '';
2817
end
29-
30-
%Measure current composition and rescale:
31-
[X,P,C,R,D,L,I,F] = sumBioMass(model,false);
18+
19+
% Current fractions (uses the shared yeast biomass config).
20+
[X, P, C, R, D, L, I, F] = sumBioMass(model, false);
3221
content_all = {'biomass','carbohydrate','protein','lipid','RNA','DNA','ion','cofactor'};
3322
content_Cap = {'X','C','P','L','R','D','I','F'};
34-
pos = strcmp(content_all,component);
35-
old_value = eval(content_Cap{pos});
36-
f = new_value / old_value;
37-
model = rescalePseudoReaction(model,component,f);
23+
pos = strcmp(content_all, component);
24+
old_value = eval(content_Cap{pos});
25+
f = new_value / old_value;
26+
model = rescalePseudoReaction(model, component, f);
3827

39-
%Balance out (if desired):
4028
if ~isempty(balance_out)
41-
X = sumBioMass(model,false);
42-
pos = strcmp(content_all,balance_out);
29+
X = sumBioMass(model, false);
30+
pos = strcmp(content_all, balance_out);
4331
balance_value = eval(content_Cap{pos});
44-
f = (balance_value + (1-X)) / balance_value;
45-
model = rescalePseudoReaction(model,balance_out,f);
32+
f = (balance_value + (1 - X)) / balance_value;
33+
model = rescalePseudoReaction(model, balance_out, f);
4634
end
47-
sumBioMass(model,dispOutput);
35+
sumBioMass(model, dispOutput);
4836
end

0 commit comments

Comments
 (0)