Commit 607d680

Authored by themoenen, Phil Downey, Moenen Erbuer, and PhillipDowney
Molecules cleanup moe (#67)
* formatting and fix new_molecule bug
* merge properties bug
* gui notebook and readme
* markdown test
* readme & gui demo
* gui readme update
* stash
* Support for display molecules from a dataframe
* stash
* proxy server major upgrade and debug visuals
* stash
* Undid renaming gui-build to gui-build-proxy
* paths for routes
* achieved build
* merge conflicts merging with gui_api_moe #2
* Merge conflicts #3
* Merge wrapup
* Simplified/cleaned up JL_PROXY proxy code in gui_launcher
* Demo Notebook design
* patch launcher
* notebook updates
* stash
* gui-build & gui-build-proxy wip
* Updated gui build folders
* stash
* initial chemchat branch
* retrieving based on name before SMILES
* turn spinner off models gen for api
* API IMPROVEMENTS
* stash
* Update readme
* Update readme 1
* Update readme 2
* Update readme 3
* Update readme
* jupyter lab config sensing
* update for container proxy base_url
* Update readme
* update for container proxy base_url
* ollama fix
* update for container proxy base_url
* demo update
* debug set context
* Update readme
* Updated Notebook 'Common_Client_Intro.ipynb'
* Updated Notebook 'Common_Client_Intro.ipynb' #2
* Updated readme
* add remove
* cross origin for flask apps
* api decode info (#58) (#59)
* api decode info
* chore: lint Co-authored-by: Brian Duenas <brian.duenas@ibm.com>
* reinstate commented out notebook line for importing molecules
* Fixed dataviewer submit with proxy url
* static data viewer
* Refactored doc generation + added auto-copy function to openad-docs repo
* Intro command will now also source the text from the docs folder, centralizing all documentation to a single source of truth
* Docs readme
* fix merge batch and doco
* update version
* formatting with lint
* Fixed display run header rendering issue
* Removed superfluous old gui-build-proxy files
* Fixed command in Jupyter
* Removed banner.css which wasn't used
* Fixed notebook header layout issue on wide screen, added remove run command
* Cleanup
* Removed trash
* docs now also update main README.md file
* Updated documentation
* README.md now updated by documentation script + Analysis included in mol viewer
* Updated
* Removed gui-build and gui-build-proxy to overcome merge conflicts with main
* Restore gui-build folders from main to avoid conflicts
* Remove logger
* Macromolecules base architecture (grammar, commands, functions, api)
* Established macromolecule file datastructure
* stash
* cif / pdb support
* Removed legacy files
* Added documentation
* Separated data formats in a separate file
* Cleanup
* cleanup
* Added search_fasta_sequence() to fetch protein by its FASTA string
* CIF meta data parsing + fetch mmol by identifier pipeline
* Moved massaging-for-human-readability of keys to frontend
* cleanup
* Removed biopython PDBParser, which was replaced with gemmi which reads pdb files as cif
* smol.json file support
* rename api routes mol to smol
* Renamed smols/mmols folders
* cleanup
* Fixed error with enrich api
* Ready for merge
* Removed mols2grid
* Removed mols2grid / molecule viewer
* Removed mols2grid / molecule viewer #2
* Removed debug loggers
* Poetry lock
* Updated GUI build
* Sorted MOL_PROPERTIES
* Mol functions refactor
* Cleanup of smol_functions.py
* CLI molecule printing improvements wip
* New gui build
* stash
* Improved display mol output
* Display mol output finetuning
* refactored @dopamine>>synonyms function and output, including changes to magic commands (to be tested)
* Fixed commands after refactor: add mol/remove mol/rename mol/export mol
* Fixed commands after refactor: export molecules/export molecule
* Fixed export mols/load mols commands and added support for sdf, smi and molset.json
* Fixed 'merge with pubchem' and created more robust merge molecule function
* Fixed language everywhere to refer to 'molecule working set' instead of molecule list/list of molecules/my molecules etc
* Removed test code
* Consistent display of mol|molecule and molset|molecule-set in command docs
* Cleaned up molecule command documentation, halfway done
* Cleaned up molecule grammar help, done
* Added user feedback for analysis results, added analysis to display mol output
* Updated help, fixed small issues after CLI testing
* Various cleanup
* Various small fixes
* various fixes
* openad_model_toolkit fix for canonical_smiles identifier
* Suppress RDKit warnings
* gui-build update
* Merge_mol improvement
* Fixed _enrich_with_pubchem_data (enrich clause) for load mols from file
* Fix for merge mols data command
* Renamed verbose to uppercase
* fn_predict_reaction_batch_topn.py
* fix issue with basic molecule adding
* Fixed error with deprecated 'using' instead of 'from'
* Fix ignoring iupac_name as mol name when it's over 40 chars
* Move untracked test notebook to permanent location
* Fixed issue with viz data fetching
* Fixed minor issues with merge_mol function
* Added commands for viewing (GUI) and opening (OS) your workspace
* added a few new utility commands plus created GUI demo Notebook
* Updates to GUI demo Notebook
* Removed outdated Flask app demo
* Removed gui-build.zip that should not have been committed
* Fixed issue with pretty data, improved transposing of output table
* Removed static global print_width variable and replaced with helper function so available print width is always calculated on the fly (so resizing terminal won't mess up output)
* Centralized all terminal width measurements so they call the same function with consistent result
* black
* version increment in metadata

---------

Co-authored-by: Phil Downey <phildowney@pd-work-mac.local>
Co-authored-by: Moenen Erbuer <moenen@arthur.io>
Co-authored-by: Moenen Erbuer <themoenen@Moenens-MacBook-Pro.local>
Co-authored-by: Phil Downey <Phil.downey1@ibm.com>
Co-authored-by: Brian Duenas <brian.duenas@ibm.com>
1 parent db4ad46 commit 607d680

256 files changed

Lines changed: 3909 additions & 2924 deletions

Lines changed: 350 additions & 0 deletions
@@ -0,0 +1,350 @@
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dae64cf7-693b-403e-aad1-13fe1a2ea17f",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad display mol dopamine\n",
    "%openad display mol InChI=1S/C8H11NO2/c9-4-3-6-1-2-7(10)8(11)5-6/h1-2,5,10-11H,3-4,9H2\n",
    "%openad display mol VYFYYTLLBUKUHU-UHFFFAOYSA-N\n",
    "%openad display mol C1=CC(=C(C=C1CCN)O)O\n",
    "%openad display mol 681"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13d7876e-3325-4cda-a48c-9f0e49f11f97",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad show mol dopamine\n",
    "%openad show mol InChI=1S/C8H11NO2/c9-4-3-6-1-2-7(10)8(11)5-6/h1-2,5,10-11H,3-4,9H2\n",
    "%openad show mol C1=CC(=C(C=C1CCN)O)O\n",
    "%openad show mol 681"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "394d07a8-0745-4932-9d1e-ac45f9e386fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad show molset 'base_molecules.molset.json'\n",
    "%openad show molset 'base_molecules.sdf'\n",
    "%openad show molset 'base_molecules.csv'\n",
    "%openad show molset 'base_molecules.smi'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "58fa4ebb-3aff-4201-9ff7-3bd30cbac342",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad @dopamine>>molecular_weight\n",
    "%openad @VYFYYTLLBUKUHU-UHFFFAOYSA-N>>molecular_weight\n",
    "%openad @C1=CC(=C(C=C1CCN)O)O>>synonyms\n",
    "%openad @681>>molecular_weight\n",
    "\n",
    "# # Prints error double:\n",
    "%openad @dopamine>>feature_ring_count_3d\n",
    "\n",
    "# CID Not working in Jupyter only\n",
    "# %openad @InChI=1S/C8H11NO2/c9-4-3-6-1-2-7(10)8(11)5-6/h1-2,5,10-11H,3-4,9H2>>inchi_key # NOT WORKING"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5db14245-53fd-4202-9dec-843f1ee3298b",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad export mol dopamine as file\n",
    "%openad export mol InChI=1S/C8H11NO2/c9-4-3-6-1-2-7(10)8(11)5-6/h1-2,5,10-11H,3-4,9H2 as file\n",
    "%openad export mol VYFYYTLLBUKUHU-UHFFFAOYSA-N as file\n",
    "%openad export mol C1=CC(=C(C=C1CCN)O)O as file\n",
    "%openad export mol 681 as file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1d09e83b-0c5e-43e2-b76f-7bae53f0b2ec",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad add mol dopamine\n",
    "%openad add mol C1=CC2=C(C=C1O)C(=CN2)CCN\n",
    "%openad add mol InChI=1S/C5H9NO4/c6-3(5(9)10)1-2-4(7)8/h3H,1-2,6H2,(H,7,8)(H,9,10)/t3-/m0/s1\n",
    "%openad add mol UCTWMZQNUQWSLP-VIFPVBQESA-N\n",
    "%openad add mol 774\n",
    "\n",
    "%openad add mol gaba force\n",
    "%openad add mol C1=CC(=C(C=C1C(CN)O)O)O force\n",
    "%openad add mol InChI=1S/C2H5NO2/c3-1-2(4)5/h1,3H2,(H,4,5) force\n",
    "%openad add mol KZBUYRJDOAKODT-UHFFFAOYSA-N force\n",
    "%openad add mol 70678557 force\n",
    "\n",
    "%openad add mol Penicillin as one basic\n",
    "%openad add mol CC(=O)NC1=CC=C(C=C1)O as two basic\n",
    "%openad add mol InChI=1S/C13H18O2/c1-9(2)8-11-4-6-12(7-5-11)10(3)13(14)15/h4-7,9-10H,8H2,1-3H3,(H,14,15) as three basic\n",
    "%openad add mol PFEOZHBOMNWTJB-UHFFFAOYSA-N as four basic\n",
    "\n",
    "# Error: \n",
    "# %openad add mol 666 as five basic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ac1a99d2-ba93-4b46-b13f-ae1c140dcb76",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad remove mol dopamine\n",
    "%openad remove mol C1=CC2=C(C=C1O)C(=CN2)CCN\n",
    "%openad remove mol InChI=1S/C5H9NO4/c6-3(5(9)10)1-2-4(7)8/h3H,1-2,6H2,(H,7,8)(H,9,10)/t3-/m0/s1\n",
    "%openad remove mol UCTWMZQNUQWSLP-VIFPVBQESA-N\n",
    "%openad remove mol 774\n",
    "\n",
    "%openad remove mol gaba force\n",
    "%openad remove mol C1=CC(=C(C=C1C(CN)O)O)O force\n",
    "%openad remove mol InChI=1S/C2H5NO2/c3-1-2(4)5/h1,3H2,(H,4,5) force\n",
    "%openad remove mol KZBUYRJDOAKODT-UHFFFAOYSA-N force\n",
    "%openad remove mol 70678557 force"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d636c8be-650e-4a46-bf37-146b4eef6573",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# %openad clear mols force\n",
    "# %openad add mol dopamine force\n",
    "\n",
    "%openad list mols\n",
    "%openad list molecules"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1baffc98-dac3-4701-8868-bfb5d5e153f5",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad show mols\n",
    "%openad show molecules"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ee103605-9fe4-4ec5-a77a-8f12184def69",
   "metadata": {},
   "outputs": [],
   "source": [
    "# %openad clear sessions\n",
    "# %openad remove toolkit ds4sd\n",
    "# %openad add toolkit ds4sd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "47d82a7e-c5aa-4b84-b47b-c9ac0234056f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Enrich molecule\n",
    "\n",
    "# DS4SD\n",
    "# %openad set context ds4sd\n",
    "# %openad search for similar molecules to 'C1(C(=C)C([O-])C1C)=O'\n",
    "# %openad add mol C1(C(=C)C([O-])C1C)=O force\n",
    "\n",
    "# # RXN\n",
    "%openad add mol c1ccc2cc3ccccc3cc2c1CCO force\n",
    "%openad set context RXN\n",
    "%openad predict reaction 'BrBr.OCCc1cccc2cc3ccccc3cc12'\n",
    "\n",
    "%openad enrich mols with analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c10a464e-3758-453b-9fa7-32f7b35bdf3e",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad clear analysis cache"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5a8acad9-1ea3-404b-81ad-871d452bf072",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad display sources dopamine\n",
    "%openad display sources InChI=1S/C8H11NO2/c9-4-3-6-1-2-7(10)8(11)5-6/h1-2,5,10-11H,3-4,9H2\n",
    "%openad display sources VYFYYTLLBUKUHU-UHFFFAOYSA-N\n",
    "%openad display sources C1=CC(=C(C=C1CCN)O)O\n",
    "%openad display sources 681"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bed4da35-1a28-4869-a073-0b1467d98267",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad rename mol '2-anthracen-1-ylethanol' as foobar"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0981c4b1-a5b7-424a-9f6c-04d349f2b718",
   "metadata": {},
   "outputs": [],
   "source": [
    "# %openad load mols from file 'base_molecules.smi' enrich append\n",
    "# %openad load mols from file 'base_molecules.sdf' enrich append\n",
    "%openad load mols from file 'test.smi' enrich append"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a567cdd2-fd4c-47c6-89da-fd36b3226dc6",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "df = %openadd display data 'test.csv'\n",
    "# %openad load mols from dataframe df \n",
    "%openad load mols from dataframe df enrich"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c6aefb9f-cac8-4de8-bca4-493d8e53a503",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad clear mols force\n",
    "data_df = %openadd display data 'smol_data_for_merge.csv'\n",
    "data_df\n",
    "mols = data_df['subject'].tolist()\n",
    "for mol in mols:\n",
    "    %openad add mol {mol} basic force\n",
    "%openad merge mols data from dataframe data_df\n",
    "# %openad merge mols data from dataframe data_df enrich"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bb7671cd-4931-4b44-9722-450cfa206ab4",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad export molecules as 'jup_mols_export_test.molset.json'\n",
    "%openad export molecules as 'jup_mols_export_test.sdf'\n",
    "%openad export molecules as 'jup_mols_export_test.csv'\n",
    "%openad export molecules as 'jup_mols_export_test.smi'\n",
    "x = %openadd export molecules\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7b3542ac-a369-4493-adb0-e586c15fade8",
   "metadata": {},
   "outputs": [],
   "source": [
    "%openad clear mols force\n",
    "%openad clear mols"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f3e2c294-d72f-467b-9057-660c0f0f7119",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5ee84c3f-8a7f-4a85-961d-7417c1a326ff",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0e53d4d9-ffd8-490b-ac2f-c41296dbe6fa",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dd1e5138-d3f1-45b1-9988-df7bf523cfa6",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2a5b649e-cf87-4a8f-958d-dcfc11a5a251",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
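
Review note: the notebook passes the same molecule to each command in five forms: a name (`dopamine`), an InChI string, an InChIKey, a SMILES string, and a bare number that appears to be a PubChem CID (`681`). As a rough illustration of how such inputs could be told apart, here is a minimal sketch. The function `classify_identifier` and its rules are hypothetical and are not taken from the OpenAD code base; a real implementation would presumably validate SMILES with a chemistry library rather than a character test.

```python
import re

# An InChIKey is 14 uppercase letters, a hyphen, 10 uppercase letters,
# a hyphen, and one final uppercase letter.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")

# Characters that commonly appear in SMILES but rarely in plain names.
# This is a heuristic assumption, not a SMILES parser.
SMILES_HINT_RE = re.compile(r"[=#()\[\]]|\d")


def classify_identifier(identifier: str) -> str:
    """Guess which kind of molecule identifier a string is (hypothetical helper)."""
    text = identifier.strip()
    if text.startswith("InChI="):
        return "inchi"
    if INCHIKEY_RE.match(text):
        return "inchikey"
    if text.isdigit():
        return "cid"
    if SMILES_HINT_RE.search(text):
        return "smiles"
    return "name"


if __name__ == "__main__":
    samples = [
        "dopamine",
        "InChI=1S/C8H11NO2/c9-4-3-6-1-2-7(10)8(11)5-6/h1-2,5,10-11H,3-4,9H2",
        "VYFYYTLLBUKUHU-UHFFFAOYSA-N",
        "C1=CC(=C(C=C1CCN)O)O",
        "681",
    ]
    for sample in samples:
        print(f"{classify_identifier(sample):9s} {sample}")
```

The heuristic has known gaps: a systematic name that contains digits, such as `2-anthracen-1-ylethanol` from the rename cell, would be labeled as SMILES, and a SMILES string with no digits or brackets would be labeled as a name. That ambiguity may be related to the identifier-specific failures noted in the notebook comments, though the commit does not say so.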

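Review note: the `show molset` and `export molecules as` cells use four file types (`.molset.json`, `.sdf`, `.csv`, `.smi`), which matches the commit message's mention of added support for sdf, smi and molset.json. The sketch below shows one way a file name could be mapped to a format by suffix. The helper `molset_format` and the `SUPPORTED_SUFFIXES` tuple are assumptions made for illustration and do not reflect how OpenAD dispatches on file type.

```python
from pathlib import Path

# Suffixes seen in the notebook. The compound suffix is listed first so that
# '.molset.json' is matched as a whole rather than as a generic '.json' file.
SUPPORTED_SUFFIXES = (".molset.json", ".sdf", ".csv", ".smi")


def molset_format(filename: str) -> str:
    """Return the format label for a molecule-set file name (hypothetical helper)."""
    name = Path(filename).name.lower()
    for suffix in SUPPORTED_SUFFIXES:
        if name.endswith(suffix):
            return suffix.lstrip(".")
    raise ValueError(f"Unsupported molecule-set file type: {filename!r}")


if __name__ == "__main__":
    for example in (
        "base_molecules.molset.json",
        "base_molecules.sdf",
        "base_molecules.csv",
        "base_molecules.smi",
    ):
        print(example, "->", molset_format(example))
```

Matching on the full compound suffix instead of `Path.suffix` matters here because `Path("x.molset.json").suffix` is only `.json`.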