Llm rxn fix (#53)

PhillipDowney · Phil Downey · brian316 · web-flow · commit 54435c7b5df9 · 2024-06-27T10:32:56.000-07:00
* notebook and LLM updates

* first merge llm

* llm fix and rxn fix

* magic performance fix

* minor tweak on LLM settings

* update docs

---------

Co-authored-by: Phil Downey <phildowney@pd-work-mac.local>
Co-authored-by: Brian Duenas <brian.duenas@ibm.com>
diff --git a/docs/output/csv/commands.csv b/docs/output/csv/commands.csv
@@ -65,11 +65,16 @@ docs;Help
 ?;Help
 ? ...<soft>   --> List all commands containing "..."</soft>;Help
 ... ?<soft>   --> List all commands starting with "..."</soft>;Help
+model auth list;Model
+model auth add group '<auth_group>' with '<api_key>';Model
+model auth remove group '<auth_group>';Model
+model auth add service '<service_name>' to group '<auth_group>';Model
+model auth remove service '<service_name>';Model
 model service status;Model
-model service config '<service_name>'|<service_name>;Model
+model service describe '<service_name>'|<service_name>;Model
 model catalog list;Model
 uncatalog model service '<service_name>'|<service_name>;Model
-catalog model service from (remote) '<path or github>' as  '<service_name>'|<service_name>;Model
+catalog model service from (remote) '<path or github>' as  '<service_name>'|<service_name>   USING (<parameter>=<value> <parameter>=<value>);Model
 model service up '<service_name>'|<service_name> [no_gpu]};Model
 model service local up '<service_name>'|<service_name> ;Model
 model service down '<service_name>'|<service_name>;Model
diff --git a/docs/output/markdown/commands.md b/docs/output/markdown/commands.md
@@ -252,7 +252,9 @@ This command Enriches every molecule in your current working list of molecules w
             - RXN Toolkit `predict Reaction` <br> 
             - RXN Toolkit `predict retrosynthesis ` <br> 
             - DS4SD Toolkit `search for patents containing molecule` <br> 
-            - DS4SD Toolkit `search for similiar molecules` <br><br>
+            - DS4SD Toolkit `search for similiar molecules` <br> 
+
+            See the Deep Search toolkit  and RXN toolkit help for further assistance on these commands.  <br><br>
 
 `clear analysis cache`{: .cmd }
 this command clears the cache of analysis results for your current workspace. <br><br>
@@ -466,11 +468,26 @@ List all available commands. <br><br>
 
 ### Model
 
+`model auth list`{: .cmd }
+show authentication group mapping <br><br>
+
+`model auth add group '<auth_group>' with '<api_key>'`{: .cmd }
+add an authentication group for model services to use <br><br>
+
+`model auth remove group '<auth_group>'`{: .cmd }
+remove an authentication group <br><br>
+
+`model auth add service '<service_name>' to group '<auth_group>'`{: .cmd }
+attach an authentication group to a model service <br><br>
+
+`model auth remove service '<service_name>'`{: .cmd }
+detach an authentication group from a model service <br><br>
+
 `model service status`{: .cmd }
 get the status of currently cataloged services <br><br>
 
-`model service config '<service_name>'|<service_name>`{: .cmd }
-get the config of a service <br><br>
+`model service describe '<service_name>'|<service_name>`{: .cmd }
+get the configuration of a service <br><br>
 
 `model catalog list`{: .cmd }
 get the list of currently cataloged services <br><br>
@@ -481,8 +498,9 @@ uncatalog a model service  <br>
  Example:  <br> 
 `uncatalog model service 'gen'` <br><br>
 
-`catalog model service from (remote) '<path or github>' as  '<service_name>'|<service_name>`{: .cmd }
+`catalog model service from (remote) '<path or github>' as  '<service_name>'|<service_name>   USING (<parameter>=<value> <parameter>=<value>)`{: .cmd }
 catalog a model service from a path or github or remotely from an existing OpenAD service. <br> 
+(USING) optional header parameters for communicating with the service backend. <br> 
 
 Example: <br> 
 
@@ -679,7 +697,7 @@ Lists all RXN AI models currently available. <br><br>
 `predict retrosynthesis '<smiles>' [ using (option1=<value> option2=<value>) ]`{: .cmd }
 Perform a retrosynthesis route prediction on a molecule. <br> 
 
-Options for the optional `using` clause: <br> 
+Optional Parameters that can be specified in the `using` clause: <br> 
 - `availability_pricing_threshold=<int>` Maximum price in USD per g/ml of compounds. Default: no threshold. <br> 
 - `available_smiles='<smiles>.<smiles>.<smiles>'` List of molecules available as precursors, delimited with a period. <br> 
 - `exclude_smiles='<smiles>.<smiles>.<smiles>'` List of molecules to exclude from the set of precursors, delimited with a period. <br> 
@@ -703,7 +721,7 @@ Run a batch of reaction predictions. The provided list of reactions can be speci
 
 Reactions are defined by combining two SMILES strings delimited by a period. For example: `'BrBr.c1ccc2cc3ccccc3cc2c1'` <br> 
 
-Options for the optional `using` clause: <br> 
+Optional Parameters that can be specified in the `using` clause: <br> 
 - `ai_model='<model_name>'` What model to use. Use the command `list rxn models` to list all available models. The default is '2020-07-01'. <br> 
 
 You can reuse previously generated results by appending the optional `use_saved` clause. This will reuse the results of a previously run command with the same parameters, if available. <br> 
@@ -717,7 +735,7 @@ Predict the reaction between two molecules. <br>
 
 Reactions are defined by combining two SMILES strings delimited by a period. For example: `'BrBr.c1ccc2cc3ccccc3cc2c1'` <br> 
 
-Options for the optional `using` clause: <br> 
+Optional Parameters that can be specified in the `using` clause: <br> 
 - `ai_model='<model_name>'` What model to use. Use the command `list rxn models` to list all available models. The default is '2020-07-01'. <br> 
 
 You can reuse previously generated results by appending the optional `use_saved` clause. This will reuse the results of a previously run command with the same parameters, if available. <br> 
@@ -731,7 +749,7 @@ Run a batch of reaction predictions for topn. The provided list of reactions can
 
 Reactions are defined by combining two SMILES strings delimited by a period. For example: `'BrBr.c1ccc2cc3ccccc3cc2c1'` <br> 
 
-Options for the optional `using` clause: <br> 
+Optional Parameters that can be specified in the `using` clause: <br> 
 - `ai_model='<model_name>'` What model to use. Use the command `list rxn models` to list all available models. The default is '2020-07-01'. <br> 
 - `topn=<integer>` Defines the number of results returned. The default value is 3. <br> 
 
diff --git a/docs/output/markdown/installation.md b/docs/output/markdown/installation.md
@@ -284,23 +284,55 @@ To run a command in bash mode, prepend it with `openad` and make sure to escape
 
 # AI Assistant
 
-To enable our AI assistant, you'll need an account with OpenAI. There is a one month free trial.
+To enable our AI assistant, you'll need either access to [IBM BAM](https://bam.res.ibm.com/auth/signin) or, for a free open-source LLM, [Ollama](https://ollama.com).
 
-This is available for IBM BAM service and Openai.
+**Note:** Ollama requires an 8 GB GPU
 
 > **Note:** watsonx coming soon
 
+## IBM BAM Setup
 For IBM BAM, simply use your supplied API key if you have BAM access.
 
-For OpenAI
-
-1. Go to [platform.openai.com](https://platform.openai.com) and create an account
-
-2. Click on the profile icon in the top right and choose "View API keys"
-
-3. Create a new key
-
-4. Run `tell me` to be prompted for your OpenAI API credentials
+### Run BAM LLM
+Run `tell me` to be prompted for your BAM API credentials:
+```
+>> set llm bam
+>> tell me <enter prompt>
+```
+
+## Ollama setup
+Install Ollama on your platform from [here](https://ollama.com/download).
+
+Download appropriate models
+```
+ollama pull llama3:latest
+ollama pull nomic-embed-text
+```
+
+Start the server if not already started
+```
+ollama serve
+```
+That's it for local usage. If you want to run Ollama remotely, continue below.
+
+### Ollama remote setup with skypilot
+Check out our configuration file to launch ollama on skypilot [ollama_setup.yaml](./ollama_setup.yaml)
+```
+sky serve up ollama_setup.yaml
+```
+
+Setup local environment variables
+
+1. For Windows: `setx OLLAMA_HOST <sky-server-ip>:11434`
+2. For Linux and macOS: `export OLLAMA_HOST=<sky-server-ip>:11434`
+3. To reset to local, use `OLLAMA_HOST=0.0.0.0:11434`
+
+### Run ollama on openad toolkit
+> If prompted for an API key and none was set up, just leave it empty.
+```
+>> set llm ollama
+>> tell me <enter prompt>
+```
 
 <br>
 
@@ -410,4 +442,4 @@ You will need to restart your Linux session before running `pip install openad`
 
 If you get an error when running `init_magic`, you may first need to setup the default iPython profile for magic commands.
 
-    ipython profile create
+ `ipython profile create`
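
The `OLLAMA_HOST` steps in the installation.md hunk above can be sanity-checked with a small helper. This is a minimal sketch and not the toolkit's own code: `resolve_ollama_url` and `DEFAULT_OLLAMA_HOST` are hypothetical names, and the assumption is that the variable holds a `host:port` value as in the documented examples, with Ollama's usual local address as the fallback.

```python
import os

# Assumed fallback: the address a local Ollama server normally listens on.
DEFAULT_OLLAMA_HOST = "127.0.0.1:11434"


def resolve_ollama_url(environ=None):
    """Return a base URL built from OLLAMA_HOST, falling back to the local default."""
    env = os.environ if environ is None else environ
    host = env.get("OLLAMA_HOST", "").strip() or DEFAULT_OLLAMA_HOST
    # Add a scheme only when the value does not already carry one.
    if not host.startswith(("http://", "https://")):
        host = "http://" + host
    return host.rstrip("/")


if __name__ == "__main__":
    # Show which server this shell would talk to.
    print(resolve_ollama_url())
```

Passing a plain dict instead of `os.environ` keeps the helper easy to check without touching the real environment.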
diff --git a/openad/app/magic/openad_magic.py b/openad/app/magic/openad_magic.py
@@ -1,6 +1,7 @@
 import os
 import sys
 import pandas
+import atexit
 
 # required for Magic Template
 from IPython.display import Markdown
@@ -110,3 +111,11 @@ def strip_leading_blanks(input):
 
 ip = get_ipython()  # pylint: disable=undefined-variable
 ip.register_magics(AD)
+
+
+def cleanup():
+    print("killing magic")
+    openad.app.main.MAGIC_PROMPT.do_exit("exit magic")
+
+
+atexit.register(cleanup)
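
The hunk above registers an `atexit` hook that calls `do_exit` on the module-level prompt. If the prompt was never created, that call could fail during interpreter shutdown, so a guarded variant may be worth considering. The following is a minimal sketch under that assumption: `FakePrompt` and `make_cleanup` are hypothetical stand-ins for illustration, not names from the toolkit.

```python
import atexit


class FakePrompt:
    """Hypothetical stand-in for the toolkit's prompt object."""

    def __init__(self):
        self.closed = False

    def do_exit(self, reason):
        # Record that shutdown ran; the real method's behaviour is not shown in this diff.
        self.closed = True
        return reason


def make_cleanup(prompt):
    """Build an exit hook that tolerates a prompt that was never created."""

    def cleanup():
        if prompt is None:
            return None
        return prompt.do_exit("exit magic")

    return cleanup


prompt = FakePrompt()
hook = make_cleanup(prompt)
# Registered callables run at normal interpreter exit, in reverse registration order.
atexit.register(hook)
```

Binding the prompt in a closure rather than reading a global at exit time makes the hook's dependency explicit and lets it be exercised directly.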
diff --git a/openad/app/main.py b/openad/app/main.py
@@ -859,7 +859,6 @@ def api_remote(
         MAGIC_PROMPT = magic_prompt
     else:
         magic_prompt = MAGIC_PROMPT
-
     if api_context["workspace"] is None:
         api_context["workspace"] = magic_prompt.settings["workspace"]
     else:
@@ -907,6 +906,7 @@ def api_remote(
 
             # Triggered by magic commands, eg. `%openad ? list files`
             starts_with_qmark = len(inp) > 0 and inp.split()[0] == "?" and inp.strip() != "??"
+            magic_prompt.do_exit("dummy do not remove")
             return magic_prompt.do_help(inp.strip(), jup_return_format=None, display_info=starts_with_qmark)
 
         # If there is a argument and it is not a help attempt to run the command.
@@ -920,8 +920,6 @@ def api_remote(
             result = magic_prompt.default(inp)
             api_context["workspace"] = magic_prompt.settings["workspace"]
             api_context["toolkit"] = magic_prompt.settings["context"]
-
-            magic_prompt.do_exit("dummy do not remove")
             if result is not True and result is not False:
                 return result
 
diff --git a/openad/flask_apps/molsgrid/routes.py b/openad/flask_apps/molsgrid/routes.py
@@ -146,6 +146,9 @@ def render_mols2grid():
         else:
             # Display the grid.
             m2g_params = _compile_default_m2g_params(mol_frame)
+            import sys
+
+            # TODO: silence redundant error for RISE
 
             return None, the_mols2grid.display(**m2g_params)
 
diff --git a/openad/llm_assist/model_reference.py b/openad/llm_assist/model_reference.py
@@ -99,24 +99,25 @@
         "model": "instructlab/granite-7b-lab",
         "url": OLLAMA_HOST,
         "template": """You are a technical documentation writer and when responding follow the following rules:
-                - Format All Command Syntax, Clauses, Examples or Option Syntax in codeblock ipython Markdown
+                - Respond like you were writing a reference guide for a software package
+                - Format All Command Syntax, Clauses, Examples or Option  Syntax in codeblock ipython Markdown
                 - Format all Command Syntax, Options or clause quotations in codeblock ipython Markdown
-                - reply with all the paramters or options for a command from the help text if syntax requested
                 - Only format codeblocks one line at a time and place them  on single lines
                 - For each instruction used in an answer also provide full command syntax with clauses and options in codeblock format. for example " Use the `search collection` with the 'PubChem' collection to search for papers and molecules.   \n\n command: ` search collection '<collection name or key>' for '<search string>' using ( [ page_size=<int> system_id=<system_id> edit_distance=<integer> display_first=<integer>]) show (data|docs) [ estimate only|return as data|save as '<csv_filename>' ] ` \n
                 \n For Example: ` search collection 'PubChem' for 'Ibuprofen' show ( data ) ` \n"
                 - Provide All syntax, clauses, Options, Parameters and Examples separated by "\n" for a command when answering a question with no leading spaces on the line
-                - ensure bullet lines are indented consistently
                 - Compounds and Molecules are the same concept
-                - smiles or inchi strings are definitions of compounds or smiles
                 - Always explain using the full name not short form of a name
-                - Always list all parameters for a command
+                - Never refer to source files from the embeddings
                 - after explaning a command  tell them how to go to the help using `<command> ?` substituing the command into the string
-                - if asked to tell the user about a command , display the commands help
+                - respond with a output format as per following example
+                '''Command: <put command syntax here >
+
+                Description: <brief description of function>
 
-               
-               
+                Parameters: < Tell the user what Parameters are available for the command>
 
+                Examples:  < examples of how to use the function> '''
 
 Answer the question based only on the following 
 context: {context} 
@@ -125,10 +126,12 @@
 
 Answer:""",
         "settings": {
-            "temperature": 0.3,
+            "temperature": 0.5,
             "decoding_method": "greedy",
-            "max_new_tokens": 4000,
+            "max_new_tokens": 3000,
             "min_new_tokens": 1,
+            "top_p": 0.85,
+            "top_k": 50,
         },
         "embeddings": None,
         "embeddings_api": None,
diff --git a/openad/user_toolkits/RXN/fn_reactions/fn_predict_retro.py b/openad/user_toolkits/RXN/fn_reactions/fn_predict_retro.py
@@ -220,6 +220,7 @@ def __init__(self):
             except Exception as e:  # pylint: disable=broad-exception-caught
                 retries = retries + 1
                 sleep(15)
+
                 newspin.text = "Processing Retrosynthesis: Waiting"
                 if retries > 20:
                     raise Exception(
diff --git a/poetry.lock b/poetry.lock