Commit ae39ff8

fix 2D McGowan values & improve 2D mode output
Old McGowan lookup had off-by-one bug (pandas 0-based index vs 1-based atomic number), producing incorrect atomic contributions. README values now match corrected code output. Round numeric columns in CSV to 2dp. Print 2D-specific info (contribution type) instead of irrelevant grid/radii messages when running in graph mode.
1 parent a352d9b commit ae39ff8
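The off-by-one described in the commit message can be illustrated with a minimal sketch. This is not the DBSTEP lookup code: a plain Python list stands in for the positionally indexed table, and `SYMBOLS`, `lookup_buggy` and `lookup_fixed` are hypothetical names chosen for illustration. The only assumption is that rows are ordered by atomic number, so hydrogen (Z = 1) sits at position 0.

```python
# Rows ordered by atomic number, as in a periodic-table lookup file.
# Position 0 holds hydrogen (Z = 1); position 5 holds carbon (Z = 6).
SYMBOLS = ["H", "He", "Li", "Be", "B", "C", "N", "O"]


def lookup_buggy(atomic_number):
    """Use the 1-based atomic number directly as a 0-based position."""
    return SYMBOLS[atomic_number]


def lookup_fixed(atomic_number):
    """Shift the 1-based atomic number onto the 0-based position."""
    return SYMBOLS[atomic_number - 1]


print(lookup_buggy(6))  # N: carbon picks up nitrogen's row
print(lookup_fixed(6))  # C
```

Any per-atom value read from the neighbouring row is then summed into every layer, which is consistent with all numeric entries in the README table changing in this commit.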

2 files changed

Lines changed: 24 additions & 23 deletions

README.md

Lines changed: 7 additions & 12 deletions
@@ -180,7 +180,7 @@ For percent buried volume, the PyMOL script will overlay an appropriate sized sp
 To calculate 2d graph-based additive sterics, the arguments --2d --fg --maxpath and --2d-type can be used. An input file listing SMILES strings of desired molecule measurements is necessary for calculation. The --fg argument specifies a SMILES string that is common in all provided SMILES inputs to use as a reference point for layer 0. A connectivity matrix will then be used to find atoms 1, 2, 3... N bonds away where N is the max path length specified with the --maxpath argument. One of two types of measurements will be summed at each layer, either Crippen molar refractivities or McGowan volumes, computed for each atom. This can be changed with the --2d-type argument.
 
 ```
->>>python -m dbstep examples/smiles.txt --2d --fg "C(O)=O" --maxpath 5 --2d-type mcgowan
+>>>python -m dbstep dbstep/data/smiles.txt --2d --fg "C(O)=O" --maxpath 5 --2d-type mcgowan
 ```
 where smiles.txt looks like:
 ```
@@ -195,20 +195,15 @@ The output will then be written to the file "smiles_2d_output.csv" in the format
 
 |0_mcgowan|1_mcgowan|2_mcgowan|3_mcgowan|4_mcgowan|Structure|
 | ------- | ------- | ------- | ------- | ------- | ------- |
-|4.55|11.68|0|0|0|CC(O)=O|
-|4.55|8.21|11.68|0|0|CCC(O)=O|
-|4.55|8.21|8.21|11.68|0|CCCC(O)=O|
-|4.55|8.21|8.21|8.21|11.68|CCCCC(O)=O|
-|4.55|4.74|23.36|0|0|CC(C)C(O)=O|
-|4.55|4.74|19.89|11.68|0|CCC(C)C(O)=O|
+|6.51|19.52|0|0|0|CC(=O)O|
+|6.51|14.09|19.52|0|0|CCC(=O)O|
+|6.51|14.09|14.09|19.52|0|CCCC(=O)O|
+|6.51|14.09|14.09|14.09|19.52|CCCCC(=O)O|
+|6.51|8.66|39.04|0|0|CC(C)C(=O)O|
+|6.51|8.66|33.61|19.52|0|CCC(C)C(=O)O|
 
 ### Acknowledgements
 
 This work is developed by Guilian Luchini, Toby Patterson and Robert Paton and is supported by the [NSF Center for Computer-Assisted Synthesis](https://ccas.nd.edu/), grant number [CHE-1925607](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1925607&HistoricalAwards=false)
 
-<img src="https://www.nsf.gov/images/logos/NSF_4-Color_bitmap_Logo.png" width="50" height="50"> <img src="https://pbs.twimg.com/profile_images/1168617043106521088/SOLQaZ8M_400x400.jpg" width="50" height="50">
 
-### References
-
-1. Verloop, A., Drug Design. Ariens, E. J., Ed. Academic Press: New York, **1976**; Vol. III
-2. Hillier, A. C.; Sommer, W. J.; Yong, B. S.; Petersen, J. L.; Cavallo, L.; Nolan, S. P. *Organometallics* **2003**, *22*, 4322-4326.
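The layer summation described in the README text can be sketched in plain Python. This is an illustration under stated assumptions, not DBSTEP's `graph.mol_to_vec`: the molecule is a hand-built graph rather than a parsed SMILES string, and `layer_sums` and `atom_contribution` are hypothetical names. The atomic volumes (C 16.35, H 8.71, O 12.43 cm³/mol) and the 6.56 per-bond correction are the commonly tabulated McGowan values and are assumed here, not taken from this commit. Charging half of each bond correction to each bonded atom, folding hydrogens into their parent atom, and excluding functional-group atoms other than the reference atom are likewise assumptions, chosen because they are consistent with the corrected table.

```python
from collections import deque

# Assumed McGowan atomic volumes (cm^3/mol) and per-bond correction;
# not read from the DBSTEP data files.
ATOM_VOLUME = {"C": 16.35, "H": 8.71, "O": 12.43}
BOND_CORRECTION = 6.56


def atom_contribution(symbol, n_bonds):
    """Atomic volume minus half the bond correction for each bond."""
    return ATOM_VOLUME[symbol] - 0.5 * BOND_CORRECTION * n_bonds


def layer_sums(symbols, hydrogens, bonds, fg_atoms, ref_atom, max_path):
    """Sum per-atom contributions by bond distance from ref_atom."""
    neighbors = {i: [] for i in range(len(symbols))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Breadth-first search from the reference atom; the other
    # functional-group atoms are never entered.
    distance = {ref_atom: 0}
    queue = deque([ref_atom])
    while queue:
        atom = queue.popleft()
        for nxt in neighbors[atom]:
            if nxt in distance or nxt in fg_atoms:
                continue
            distance[nxt] = distance[atom] + 1
            queue.append(nxt)

    sums = [0.0] * max_path
    for atom, layer in distance.items():
        if layer >= max_path:
            continue
        # Bonds to heavy atoms plus bonds to implicit hydrogens.
        n_bonds = len(neighbors[atom]) + hydrogens[atom]
        sums[layer] += atom_contribution(symbols[atom], n_bonds)
        sums[layer] += hydrogens[atom] * atom_contribution("H", 1)
    return [round(value, 2) for value in sums]


# CCC(C)C(=O)O with heavy atoms numbered in SMILES order.
symbols = ["C", "C", "C", "C", "C", "O", "O"]
hydrogens = [3, 2, 1, 3, 0, 0, 1]
bonds = [(0, 1), (1, 2), (2, 3), (2, 4), (4, 5), (4, 6)]
row = layer_sums(symbols, hydrogens, bonds,
                 fg_atoms={4, 5, 6}, ref_atom=4, max_path=5)
print(row)  # [6.51, 8.66, 33.61, 19.52, 0.0]
```

Under these assumptions the result matches the `CCC(C)C(=O)O` row of the corrected table: the carboxyl carbon alone gives layer 0, the CH gives layer 1, the CH2 and the branch CH3 together give layer 2, and the terminal CH3 gives layer 3.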

dbstep/Dbstep.py

Lines changed: 17 additions & 11 deletions
@@ -514,18 +514,22 @@ def main():
     print(" \u2580\u2580\u2580\u2580\u2580\u2022 \u00b7\u2580\u2580\u2580\u2580 \u2580\u2580\u2580\u2580 \u2580\u2580\u2580 \u2580\u2580\u2580 .\u2580 ")
     print("")
 
-    if options.volume:
-        print(" Buried volume (Vbur) will be computed")
-    if options.sterimol:
-        print(" Sterimol parameters will be generated using {} mode".format("grid-based" if options.measure == "grid" else "classic"))
-    if options.surface == "vdw":
-        print(" Using a Cartesian grid-spacing of {:5.4f} Angstrom".format(options.grid))
-        radii_label = "Charry-Tkatchenko" if options.radii == "charry-tkatchenko" else "Bondi"
-        print(" {} atomic radii will be scaled by {}".format(radii_label, options.SCALE_VDW))
-        print(" Hydrogen atoms are {}\n".format("excluded" if options.noH else "included"))
+    if options.graph:
+        voltype_label = "McGowan volumes" if options.voltype.lower() == "mcgowan" else "Crippen molar refractivities"
+        print(" 2D graph mode: using connectivity and {} for atomic contributions\n".format(voltype_label))
     else:
-        print(" Using {} isodensity surface with cutoff value of {:5.4f} au".format(options.surface, options.isoval))
-        print(" Cartesian grid-spacing will be determined by cube file(s)\n")
+        if options.volume:
+            print(" Buried volume (Vbur) will be computed")
+        if options.sterimol:
+            print(" Sterimol parameters will be generated using {} mode".format("grid-based" if options.measure == "grid" else "classic"))
+        if options.surface == "vdw":
+            print(" Using a Cartesian grid-spacing of {:5.4f} Angstrom".format(options.grid))
+            radii_label = "Charry-Tkatchenko" if options.radii == "charry-tkatchenko" else "Bondi"
+            print(" {} atomic radii will be scaled by {}".format(radii_label, options.SCALE_VDW))
+            print(" Hydrogen atoms are {}\n".format("excluded" if options.noH else "included"))
+        else:
+            print(" Using {} isodensity surface with cutoff value of {:5.4f} au".format(options.surface, options.isoval))
+            print(" Cartesian grid-spacing will be determined by cube file(s)\n")
 
     # loop over all specified output files
     for file in files:
@@ -536,6 +540,8 @@ def main():
                 print(e, "\nPlease install necessary modules and try again.")
                 sys.exit()
             vec_df = graph.mol_to_vec(file, options.shared_fg, options.voltype, options.max_path_length, options.verbose)
+            numeric_cols = vec_df.select_dtypes(include='number').columns
+            vec_df[numeric_cols] = vec_df[numeric_cols].round(2)
             vec_df.to_csv(file.split(".")[0] + "_2d_output.csv", index=False)
         else:
             dbstep(file, options=options)
