Commit efb561d

Update README.md
1 parent 05b0bf8 commit efb561d

1 file changed

Lines changed: 28 additions & 0 deletions

README.md

@@ -255,6 +255,34 @@ Response includes `'omop_id': '161'`:
We now support SMILES (Simplified Molecular Input Line Entry System) strings for drug molecules. SMILES data is extracted from PubChem’s `CID-SMILES.gz` file and integrated during the dictionary build step (`06_add_smiles_from_pubchem.py`), enriching each drug entry with a machine-readable structure. If available, a SMILES string is returned in the `find_drugs()` output under the `smiles` key. For visualisation, users can refer to the example notebook, which uses RDKit (an optional dependency). This enhancement provides structural information without relying on external APIs.
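As a rough illustration of reading the `smiles` key, the sketch below collects SMILES strings from a `find_drugs()`-style result. It assumes each result is a tuple whose first element is the drug's metadata dictionary; the `name` key, the trailing token indices, and the helper `collect_smiles` are assumptions made for this example, and the sample data is hand-written rather than real library output.

```python
from typing import Any, Dict, Optional, Sequence, Tuple


def collect_smiles(results: Sequence[Tuple[Any, ...]]) -> Dict[str, Optional[str]]:
    """Map each matched drug name to its SMILES string, or None if absent."""
    smiles_by_name: Dict[str, Optional[str]] = {}
    for result in results:
        info = result[0]  # assumed: first element is the metadata dictionary
        # "name" is an assumed key; "smiles" is only present when available.
        smiles_by_name[info.get("name", "<unnamed>")] = info.get("smiles")
    return smiles_by_name


# Hand-written stand-in for find_drugs() output (hypothetical shape and values).
sample_results = [
    ({"name": "Aspirin", "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O"}, 2, 2),
    ({"name": "ExampleDrugWithoutSmiles"}, 5, 5),
]

print(collect_smiles(sample_results))
```

Using `dict.get` rather than indexing keeps the lookup safe for entries that have no SMILES string.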

## Plotting the molecular structure

This requires the optional `rdkit` dependency:

```
pip install rdkit
```

```
from rdkit import Chem
from rdkit.Chem import Draw
from drug_named_entity_recognition import find_drugs

sentences = "I prescribed Dabigatran"
drugs = find_drugs(sentences.split(), is_include_structure=True)

# Take the MOL block of the first matched drug; each result's first element is its metadata dictionary.
for drug in drugs:
    structure_mol = drug[0]["structure_mol"]
    break
print(structure_mol)

# Parse the MOL block, then render the molecule, show it, and save it as a PNG.
mol = Chem.MolFromMolBlock(structure_mol)
image = Draw.MolToImage(mol, size=(1000, 500))
image.show()
image.save("dabigatran.png")
```

This displays:

![dabigatran](./dabigatran.png)
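The loop in the plotting example leaves `structure_mol` undefined when no drug is matched. A minimal sketch of a guard is shown here; the helper `first_structure_mol` and the sample results are hypothetical, and the result shape (a tuple whose first element is a dictionary) is inferred from the `drug[0]["structure_mol"]` access in the example.

```python
from typing import Any, Optional, Sequence, Tuple


def first_structure_mol(results: Sequence[Tuple[Any, ...]]) -> Optional[str]:
    """Return the first non-empty "structure_mol" value, or None if there is none."""
    for result in results:
        mol_block = result[0].get("structure_mol")
        if mol_block:
            return mol_block
    return None


# Hand-written stand-ins for find_drugs(..., is_include_structure=True) output.
no_match = []
match_without_structure = [({"name": "ExampleDrug"}, 0, 0)]
match_with_structure = [({"name": "ExampleDrug", "structure_mol": "<MOL block text>"}, 0, 0)]

print(first_structure_mol(no_match))                 # None
print(first_structure_mol(match_without_structure))  # None
print(first_structure_mol(match_with_structure))     # <MOL block text>
```

A `None` result would be the cue to skip the RDKit calls. `Chem.MolFromMolBlock` itself returns `None` when parsing fails, so the same kind of check applies to `mol` before drawing.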

# 📁Data sources