README.md
Lines changed: 37 additions & 11 deletions
@@ -1,12 +1,10 @@
1
1
# What is GLaDOS?
2
-
GLaDOS is an open source/permissively licensed 20B model tuned to provide an open-source experience _similar to_ ChatGPT.
3
-
4
-
This repo includes the model itself and a basic web server to chat with it.
2
+
GLaDOS is a family of large language models tuned to provide an open-source experience _similar to_ ChatGPT.
5
3
4
+
This repo includes the models and a basic web server to chat with them.
6
5
7
6
## Motivation
8
-
Similar models exist but often utilize LLAMA which is only available under a noncommercial license. GLaDOS avoids this by utilizing EleutherAI's/togethercomputers apache 2.0 licensed base models and CC0 data.
9
-
7
+
Similar models exist, but they often use LLaMA, which is only available under a noncommercial license. GLaDOS avoids this by using EleutherAI's and togethercomputer's Apache 2.0 licensed base models and CC0 data.
10
8
Additionally, GLaDOS is designed to be run fully standalone so you don't need to worry about your information being collected by a third party.
11
9
12
10
## Quickstart
@@ -27,10 +25,33 @@ Then, from inside this container run
27
25
```
28
26
python src/run_server.py
29
27
```
30
-
or
28
+
This runs the server with the default settings, using the 7B RedPajama-based GLaDOS model.
29
+
To run a different model you can pass the model path. For example
The current version of GLaDOS uses an FP16 model with ~20B parameters. It can run in just under 48GB of VRAM if you modify the generation options in run_server to use a beam width of 1. I am running it on two A6000s connected via NVLink, so the default settings run on multiple GPUs.
80
+
The default model is based on RedPajama 7B and can run on 24GB NVIDIA graphics cards. Short sequences may also be possible on 16GB graphics cards, but this is untested and I wouldn't recommend it.
81
+
82
+
Other models currently require more video memory; my testing and hosting are done on 48GB A6000 GPUs.
57
83
58
-
It should be possible to use GPTQ to reduce the memory requirements to ~16GB so that the model can be run on consumer grade graphics cards.
84
+
It is possible to use GPTQ to reduce memory usage by about 4x, but there is no timeline for completing this.
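
As a rough sanity check on the numbers above, the weights-only footprint can be estimated from the parameter count and the bits stored per parameter. This is a minimal sketch, not part of the repo: `weight_memory_gb` is a hypothetical helper, the 4-bit width for GPTQ is an assumption chosen to match the "about 4x" figure relative to FP16, and activations, the KV cache, and framework overhead are ignored, so real VRAM usage will be higher.

```python
# Weights-only memory estimate. This ignores activations, the KV cache,
# beam search state, and framework overhead, so actual usage is higher.

BYTES_PER_GIB = 1024 ** 3


def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Return the approximate size of the model weights in GiB."""
    return n_params * bits_per_param / 8 / BYTES_PER_GIB


if __name__ == "__main__":
    # Parameter counts are the nominal sizes mentioned in this README.
    for name, n_params in [("7B", 7e9), ("20B", 20e9)]:
        fp16 = weight_memory_gb(n_params, 16)
        # Assumption: GPTQ quantizes the weights to 4 bits per parameter.
        int4 = weight_memory_gb(n_params, 4)
        print(f"{name}: FP16 ~{fp16:.1f} GiB, 4-bit ~{int4:.1f} GiB")
```

The gap between these estimates and the VRAM figures quoted above is the headroom needed for generation itself, which is why lowering the beam width helps the 20B model fit.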
59
85
60
86
## Misc QnA
61
87
@@ -72,7 +98,7 @@ Q : How does the model handle formatting?
72
98
A : GLaDOS uses a slight variation on GitHub Flavored Markdown to create lists, tables, and code blocks. The webserver adds extra tags to prettify the code blocks and tweak other small things.
73
99
74
100
101
+
=======
75
102
# Acknowledgements:
76
103
77
104
Big thanks to EleutherAI for GPT-NeoX, togethercomputer for GPT-NeoXT-Chat-Base, and ShareGPT/RyokoAI for ShareGPT data!