Commit 3b3c9d4
authored
Qualcomm AI Engine Direct - Support attention sink for long context usecase (pytorch#16574)
### Summary
- Support narrow operation
- Support attention sink for static llama
- Include the `--max_context_len` option to set the maximum length for
the model's memory, and use `max_seq_len` to define the maximum sequence
length for evaluation.
- Specified `--use_attention_sink <sink_size>,<eviction_batch_size>` to
enable attention sink feature in llama.py
- Behavior matrix in `llama.py`
- Given that `--compile_only`:
- Specify `--use_attention_sink` -> Compile the LLM and the attention
sink model
- Otherwise, -> Compile the LLM only
- Given that `--pre_get_pte`:
- Specify `--use_attention_sink` -> If the criteria below are not met,
compile the attention sink model before running inference. And then
inference with attention sink
- Check if the attention sink model exists
- Verify sink_size and eviction batch size are identical
- Otherwise, -> Inference LLM without attention sink
- Neither `--compile_only` nor `--pre_get_pte`:
- Specify `--use_attention_sink` -> Compile the LLM and the attention
sink model and inference with attention sink
- Otherwise, -> Compile the LLM and inference LLM without attention sink
### Test plan
- Test for narrow op:
- `python backends/qualcomm/tests/test_qnn_delegate.py -k
TestQNNFloatingPointOperator.test_qnn_backend_narrow --model SM8750
--device $DEVICE --build_folder build-android`
- `python backends/qualcomm/tests/test_qnn_delegate.py -k
TestQNNQuantizedOperator.test_qnn_backend_narrow --model SM8750 --device
$DEVICE --build_folder build-android`
- Test for attention sink in `llama.py`
- `python backends/qualcomm/tests/test_qnn_delegate.py
TestExampleLLMScript.test_attention_sink --model SM8750 --device $DEVICE
-b build-android -a unit_test`
### Results
- Test multi-conversation using attention sink in `llama.py`
- `python examples/qualcomm/oss_scripts/llama/llama.py -b build-android
-s $DEVICE -m SM8750 --checkpoint consolidated.00.pth --params
params.json --tokenizer_model tokenizer.model --decoder_model
llama3_2-1b_instruct --model_mode kv --max_seq_len 4096
--max_context_len 1024 --prompt "I would like to learn python, could you
teach me with a simple example?" "Could you give a more difficult
example in python?" "Could you add a GUI for this game?" "Could you tell
me more about tkinter?" "Is it possible to deploy on a website?" --tasks
wikitext --limit 1 --use_attention_sink 4,32`
<details>
<summary>
Set llama 3.2 1b instruct to a max context length of 1024, and activate
the attention sink feature with a sink_size of 4 and eviction_batch_size
of 32. Then, run a multi-turn conversation with a sequence length of
4096 using five prompts: <br>
"I would like to learn python, could you teach me with a simple
example?", <br>
"Could you give a more difficult example in python?", <br>
"Could you add a GUI for this game?", <br>
"Could you tell me more about tkinter?", <br>
"Is it possible to deploy on a website?"
</summary>
~~~
<|start_header_id|>user<|end_header_id|>
I would like to learn python, could you teach me with a simple
example?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
I'd be happy to help you learn Python. Here's a simple example to get
you started:
**Example: Guessing Game**
In this example, we'll create a simple game where the computer thinks of
a number between 1 and 100, and you try to guess it. After each guess,
the computer will tell you if your guess is higher or lower than the
number it's thinking of.
**Code:**
```python
import random
# The number the computer is thinking of
number = random.randint(1, 100)
# Keep track of your guesses
guesses = 0
# Keep asking for guesses until you guess the number
while True:
# Ask for a guess
guess = input("Guess a number: ")
# Check if the guess is valid
try:
# Convert the guess to an integer
guess = int(guess)
# Check if the guess is within the range
if guess < 1 or guess > 100:
print("Please enter a number between 1 and 100.")
else:
# Increment the number of guesses
guesses += 1
# Check if you've guessed the number
if guess == number:
print(f"Congratulations! You guessed the number in {guesses} guesses.")
break
else:
# Tell the computer how close your guess is
print(f"Your guess is {guess} away from the number.")
except ValueError:
print("Invalid input. Please enter a number.")
```
**How to Run:**
1. Save this code in a file with a `.py` extension (e.g.,
`guessing_game.py`).
2. Open a terminal or command prompt and navigate to the directory where
you saved the file.
3. Type `python guessing_game.py` to run the game.
**What to Do:**
1. Run the game by typing `python guessing_game.py`.
2. Follow the prompts to guess a number.
3. Keep guessing until you guess the number correctly.
I hope this example helps you get started with Python! Let me know if
you have any questions or need further
guidance.<|eot_id|><|start_header_id|>user<|end_header_id|>
Could you give a more difficult example in
python?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Here's a more challenging example in Python:
**Example: Rock, Paper, Scissors**
In this example, we'll create a game of Rock, Paper, Scissors where the
computer thinks of a random move and you try to guess it. After each
guess, the computer will tell you if your move is Rock, Paper, or
Scissors.
**Code:**
```python
import random
# Define the possible moves
moves = ["rock", "paper", "scissors"]
# The computer's move
computer_move = random.choice(moves)
# Keep track of your guesses
guesses = 0
# Keep asking for guesses until you guess the computer's move
while True:
# Ask for a guess
guess = input("Guess a move (rock, paper, or scissors): ")
# Check if the guess is valid
if guess.lower() in moves:
# Convert the guess to a move
move = guess.lower()
if move == computer_move:
print(f"Congratulations! You guessed the computer's move correctly!")
break
else:
# Increment the number of guesses
guesses += 1
# Tell the computer how close your guess is
print(f"Your guess is {guess} away from the computer's move.")
else:
print("Invalid input. Please enter a valid move (rock, paper, or scissors).")
```
**How to Run:**
1. Save this code in a file with a `.py` extension (e.g.,
`rock_paper_scissors.py`).
2. Open a terminal or command prompt and navigate to the directory where
you saved the file.
3. Type `python rock_paper_scissors.py` to run the game.
**What to Do:**
1. Run the game by typing `python rock_paper_scissors.py`.
2. Follow the prompts to guess a move.
3. Keep guessing until you guess the computer's move correctly.
This example is more challenging because it:
* Requires you to keep track of your guesses and the computer's move.
* Requires you to convert your guess to a specific move (rock, paper, or
scissors).
* Requires you to tell the computer how close your guess is to the
computer's move.
I hope this example helps you get started with more advanced Python
concepts! Let me know if you have any questions or need further
guidance.<|eot_id|><|start_header_id|>user<|end_header_id|>
Could you add a GUI for this
game?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Here's an example of how you can create a GUI for the Rock, Paper,
Scissors game using Python's Tkinter library:
**Code:**
```python
import tkinter as tk
from random import randint
class RockPaperScissors:
def __init__(self):
self.window = tk.Tk()
self.window.title("Rock, Paper, Scissors")
self.window.geometry("300x200")
self.move = tk.StringVar()
self.move.set("rock")
self.rock_button = tk.Button(self.window, text="Rock", command=self.rock)
self.rock_button.pack()
self.paper_button = tk.Button(self.window, text="Paper", command=self.paper)
self.paper_button.pack()
self.scissors_button = tk.Button(self.window, text="Scissors", command=self.scissors)
self.scissors_button.pack()
self.result_label = tk.Label(self.window, text="", font=('Arial', 24))
self.result_label.pack()
def rock(self):
computer_move = randint(0, 2)
if computer_move == 0:
self.move.set("rock")
self.result_label.config(text=f"Computer chose {self.move.get()}.")
elif computer_move == 1:
self.move.set("paper")
self.result_label.config(text=f"Computer chose {self.move.get()}.")
else:
self.move.set("scissors")
self.result_label.config(text=f"Computer chose {self.move.get()}.")
def paper(self):
computer_move = randint(0, 2)
if computer_move == 0:
self.move.set("paper")
self.result_label.config(text=f"Computer chose {self.move.get()}.")
elif computer_move == 1:
self.move.set("rock")
self.result_label.config(text=f"Computer chose {self.move.get()}.")
else:
self.move.set("scissors")
self.result_label.config(text=f"Computer chose {self.move.get()}.")
def scissors(self):
computer_move = randint(0, 2)
if computer_move == 0:
self.move.set("scissors")
self.result_label.config(text=f"Computer chose {self.move.get()}.")
elif computer_move == 1:
self.move.set("paper")
self.result_label.config(text=f"Computer chose {self.move.get()}.")
else:
self.move.set("rock")
self.result_label.config(text=f"Computer chose {self.move.get()}.")
def run(self):
self.window.mainloop()
if __name__ == "__main__":
game = RockPaperScissors()
game.run()
```
**How to Run:**
1. Save this code in a file with a `.py` extension (e.g.,
`rock_paper_scissors.py`).
2. Open a terminal or command prompt and navigate to the directory where
you saved the file.
3. Type `python rock_paper_scissors.py` to run the game.
4. A GUI window will appear with three buttons: Rock, Paper, and
Scissors.
5. Click on the buttons to play the game.
This GUI implementation provides a simple and intuitive way to play the
Rock, Paper, Scissors game. The game logic is implemented in the `rock`,
`paper`, and `scissors` methods, which are called when the corresponding
buttons are clicked. The game displays the result in the label below the
buttons.
Note that this is a basic implementation, and you can customize it to
suit your needs. For example, you could add additional features like
scoring, high scores, or a more complex game
logic.<|eot_id|><|start_header_id|>user<|end_header_id|>
Could you tell me more about
tkinter?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Tkinter is a Python library that allows you to create simple graphical
user interfaces (GUIs) for your Python programs. It's a great tool for
beginners and experienced Python developers alike.
Here are some key features and concepts related to Tkinter:
**Key Features:**
1. **Widgets:** Tkinter provides a variety of widgets, such as buttons,
labels, text boxes, and more, that you can use to create your GUI.
2. **Layout Management:** Tkinter allows you to create a layout for your
widgets using the `pack`, `grid`, or `place` methods.
3. **Event Handling:** Tkinter provides a way to handle events, such as
button clicks, key presses, and mouse movements, which can trigger
actions in your GUI.
4. **Graphics:** Tkinter can display images, charts, and other graphical
elements in your GUI.
**Basic Concepts:**
1. **Widgets:** A widget is a graphical element that you can use to
create your GUI. Examples of widgets include buttons, labels, and text
boxes.
2. **Layout:** A layout is a way of arranging widgets in your GUI. You
can use the `pack`, `grid`, or `place` methods to create a layout.
3. **Event Handling:** Event handling is the process of responding to
events, such as button clicks or key presses, in your GUI.
4. **Widgets:** You can create a GUI by creating a `Frame` widget, which
is a container that holds other widgets.
**Example Code:**
Here's an example of a simple GUI created using Tkinter:
```python
import tkinter as tk
class MyGUI:
def __init__(self):
self.window = tk.Tk()
self.window.title("My GUI")
self.label = tk.Label(self.window, text="Hello, World!")
self.label.pack()
self.button = tk.Button(self.window, text="Click me!", command=self.button_click)
self.button.pack()
def button_click(self):
print("Button clicked!")
def run(self):
self.window.mainloop()
if __name__ == "__main__":
gui = MyGUI()
gui.run()
```
In this example, we create a `MyGUI` class that creates a window with a
label and a button. When the button is clicked, the `button_click`
method is called, which prints a message to the console.
**Tips and Tricks:**
1. **Use the `pack` or `grid` layout manager:** These are the most
common layout managers in Tkinter.
2. **Use the `mainloop` method:** This is the main method that starts
the event loop of your GUI.
3. **Use the `bind` method:** This is used to bind a function to a
specific event, such as button click or key press.
4. **Use the `config` method:** This is used to change the properties of
a widget, such as its text or color.
I hope this helps! Let me know if you have any specific questions or if
you'd like more information on how to use
Tkinter.<|eot_id|><|start_header_id|>user<|end_header_id|>
Is it possible to deploy on a
website?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Yes, it is possible to deploy a Tkinter application on a website.
However, it requires some additional steps and considerations.
Here are some options:
1. **Run the application as a web server:** You can use a web server
like Apache or Nginx to run your Tkinter application on a website. This
will allow users to access your application through a web browser.
2. **Use a web framework:** You can use a web framework like Flask or
Django to create a web application that uses Tkinter as a backend. This
will allow you to create a web application that can be accessed through
a web browser.
3. **Use a Python web server:** You can use a Python web server like
`http.server` or `webserver` to run your Tkinter application on a
website. This will allow you to access your application through a web
browser.
Here's an example of how you can deploy a Tkinter application on a
website using `http.server`:
```python
import tkinter as tk
class MyGUI:
def __init__(self):
self.window = tk.Tk()
self.window.title("My GUI")
self.label = tk.Label(self.window, text="Hello, World!")
self.label.pack()
def run(self):
self.window.mainloop()
if __name__ == "__main__":
gui = MyGUI()
gui.run()
```
To run this code on a website, you'll need to install `http.server` or
`webserver` and add it to your `PATH` environment variable. Here's an
example of how to do this:
```bash
# Install http.server
pip install http.server
# Add http.server to your PATH environment variable
export PATH=/usr/local/bin:/usr/local/bin
```
Once you've installed `http.server`, you can run the following code to
deploy your Tkinter application on a website:
```python
import http.server
class MyGUI:
def __init__(self):
self.window = http.server.HTTPServer(("", 8000), MyGUI)
def run(self):
self.window.run()
if __name__ == "__main__":
gui = MyGUI()
gui.run()
```
This will start a web server on port 8000 and allow users to access your
Tkinter application through a web browser.
Note that deploying a Tkinter application on a website requires some
technical knowledge and setup. If you're not comfortable with this, you
may want to consider using a more modern web development framework or a
Python web framework like Flask or Django.<|eot_id|>
~~~
</details>
The performance for each run is nearly similar.
```
I 00:00:00.528911 executorch:prompt_processor.cpp:267] Prompt Processor: total 26 prompt tokens (AR-1 * 26 iters)
I 00:00:00.934570 executorch:runner.cpp:459] RSS after prompt prefill: 1326.957031 MiB (0 if unsupported)
I 00:00:08.314366 executorch:token_generator.cpp:345]
Reached to the end of generation
I 00:00:08.314416 executorch:runner.cpp:474] RSS after finishing text generation: 1326.957031 MiB (0 if unsupported)
I 00:00:08.314457 executorch:stats.h:143] Prompt Tokens: 26 Generated Tokens: 446
I 00:00:08.314473 executorch:stats.h:149] Model Load Time: 0.526000 (seconds)
I 00:00:08.314489 executorch:stats.h:159] Total inference time: 7.786000 (seconds) Rate: 57.282302 (tokens/second)
I 00:00:08.314506 executorch:stats.h:167] Prompt evaluation: 0.407000 (seconds) Rate: 63.882064 (tokens/second)
I 00:00:08.314527 executorch:stats.h:178] Generated 446 tokens: 7.379000 (seconds) Rate: 60.441794 (tokens/second)
I 00:00:08.314544 executorch:stats.h:186] Time to first generated token: 0.407000 (seconds)
I 00:00:08.314547 executorch:stats.h:193] Sampling time over 472 tokens: 0.797000 (seconds)
```
cc: @haowhsu-quic1 parent 6399767 commit 3b3c9d4
File tree
41 files changed
+1971
-429
lines changed- backends/qualcomm
- quantizer
- tests
- examples/qualcomm
- oss_scripts/llama
- assets
- model
- runner
- multimodal_runner
- wrappers
Some content is hidden
Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.
41 files changed
+1971
-429
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
780 | 780 | | |
781 | 781 | | |
782 | 782 | | |
783 | | - | |
| 783 | + | |
| 784 | + | |
| 785 | + | |
| 786 | + | |
| 787 | + | |
| 788 | + | |
| 789 | + | |
| 790 | + | |
| 791 | + | |
| 792 | + | |
784 | 793 | | |
785 | 794 | | |
786 | 795 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
151 | 151 | | |
152 | 152 | | |
153 | 153 | | |
154 | | - | |
| 154 | + | |
155 | 155 | | |
156 | 156 | | |
157 | 157 | | |
| |||
210 | 210 | | |
211 | 211 | | |
212 | 212 | | |
213 | | - | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
214 | 216 | | |
215 | 217 | | |
216 | 218 | | |
217 | 219 | | |
218 | 220 | | |
219 | 221 | | |
220 | 222 | | |
221 | | - | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
222 | 226 | | |
223 | 227 | | |
224 | 228 | | |
| |||
250 | 254 | | |
251 | 255 | | |
252 | 256 | | |
253 | | - | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
254 | 260 | | |
255 | 261 | | |
256 | 262 | | |
257 | 263 | | |
258 | 264 | | |
259 | 265 | | |
260 | 266 | | |
261 | | - | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
262 | 270 | | |
263 | 271 | | |
264 | 272 | | |
| |||
288 | 296 | | |
289 | 297 | | |
290 | 298 | | |
291 | | - | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
292 | 302 | | |
293 | 303 | | |
294 | 304 | | |
295 | 305 | | |
296 | 306 | | |
297 | 307 | | |
298 | 308 | | |
299 | | - | |
| 309 | + | |
| 310 | + | |
| 311 | + | |
300 | 312 | | |
301 | 313 | | |
302 | 314 | | |
| |||
330 | 342 | | |
331 | 343 | | |
332 | 344 | | |
333 | | - | |
| 345 | + | |
| 346 | + | |
| 347 | + | |
334 | 348 | | |
335 | 349 | | |
336 | 350 | | |
337 | 351 | | |
338 | 352 | | |
339 | 353 | | |
340 | 354 | | |
341 | | - | |
| 355 | + | |
| 356 | + | |
| 357 | + | |
342 | 358 | | |
343 | 359 | | |
344 | 360 | | |
345 | 361 | | |
346 | 362 | | |
347 | 363 | | |
348 | | - | |
| 364 | + | |
| 365 | + | |
| 366 | + | |
349 | 367 | | |
350 | 368 | | |
351 | 369 | | |
| |||
648 | 666 | | |
649 | 667 | | |
650 | 668 | | |
651 | | - | |
| 669 | + | |
| 670 | + | |
| 671 | + | |
652 | 672 | | |
653 | 673 | | |
654 | 674 | | |
655 | 675 | | |
656 | 676 | | |
657 | 677 | | |
658 | 678 | | |
659 | | - | |
| 679 | + | |
| 680 | + | |
| 681 | + | |
660 | 682 | | |
661 | 683 | | |
662 | 684 | | |
663 | 685 | | |
664 | 686 | | |
665 | 687 | | |
666 | | - | |
| 688 | + | |
| 689 | + | |
| 690 | + | |
667 | 691 | | |
668 | 692 | | |
669 | 693 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
73 | 73 | | |
74 | 74 | | |
75 | 75 | | |
| 76 | + | |
76 | 77 | | |
77 | 78 | | |
78 | 79 | | |
| |||
81 | 82 | | |
82 | 83 | | |
83 | 84 | | |
| 85 | + | |
84 | 86 | | |
85 | 87 | | |
86 | 88 | | |
| |||
91 | 93 | | |
92 | 94 | | |
93 | 95 | | |
| 96 | + | |
94 | 97 | | |
95 | 98 | | |
96 | 99 | | |
| |||
143 | 146 | | |
144 | 147 | | |
145 | 148 | | |
| 149 | + | |
146 | 150 | | |
147 | 151 | | |
148 | 152 | | |
| |||
153 | 157 | | |
154 | 158 | | |
155 | 159 | | |
| 160 | + | |
156 | 161 | | |
157 | 162 | | |
158 | 163 | | |
| |||
179 | 184 | | |
180 | 185 | | |
181 | 186 | | |
| 187 | + | |
182 | 188 | | |
183 | 189 | | |
184 | 190 | | |
| |||
189 | 195 | | |
190 | 196 | | |
191 | 197 | | |
| 198 | + | |
192 | 199 | | |
193 | 200 | | |
194 | 201 | | |
| |||
228 | 235 | | |
229 | 236 | | |
230 | 237 | | |
| 238 | + | |
231 | 239 | | |
232 | 240 | | |
233 | 241 | | |
| |||
257 | 265 | | |
258 | 266 | | |
259 | 267 | | |
| 268 | + | |
260 | 269 | | |
261 | 270 | | |
262 | 271 | | |
| |||
311 | 320 | | |
312 | 321 | | |
313 | 322 | | |
| 323 | + | |
314 | 324 | | |
315 | 325 | | |
316 | 326 | | |
| |||
321 | 331 | | |
322 | 332 | | |
323 | 333 | | |
| 334 | + | |
324 | 335 | | |
325 | 336 | | |
326 | 337 | | |
| |||
336 | 347 | | |
337 | 348 | | |
338 | 349 | | |
| 350 | + | |
339 | 351 | | |
340 | 352 | | |
341 | 353 | | |
| |||
359 | 371 | | |
360 | 372 | | |
361 | 373 | | |
| 374 | + | |
362 | 375 | | |
363 | 376 | | |
364 | 377 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
160 | 160 | | |
161 | 161 | | |
162 | 162 | | |
| 163 | + | |
163 | 164 | | |
164 | 165 | | |
165 | 166 | | |
| |||
173 | 174 | | |
174 | 175 | | |
175 | 176 | | |
176 | | - | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
177 | 182 | | |
178 | | - | |
| 183 | + | |
179 | 184 | | |
180 | 185 | | |
181 | 186 | | |
| |||
186 | 191 | | |
187 | 192 | | |
188 | 193 | | |
| 194 | + | |
189 | 195 | | |
190 | 196 | | |
191 | 197 | | |
192 | 198 | | |
193 | 199 | | |
194 | | - | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
195 | 203 | | |
196 | 204 | | |
197 | 205 | | |
| |||
229 | 237 | | |
230 | 238 | | |
231 | 239 | | |
232 | | - | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
233 | 243 | | |
234 | 244 | | |
235 | | - | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
236 | 248 | | |
237 | 249 | | |
238 | 250 | | |
| |||
412 | 424 | | |
413 | 425 | | |
414 | 426 | | |
| 427 | + | |
415 | 428 | | |
416 | 429 | | |
417 | 430 | | |
| |||
432 | 445 | | |
433 | 446 | | |
434 | 447 | | |
| 448 | + | |
435 | 449 | | |
436 | 450 | | |
437 | 451 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1632 | 1632 | | |
1633 | 1633 | | |
1634 | 1634 | | |
| 1635 | + | |
| 1636 | + | |
| 1637 | + | |
| 1638 | + | |
| 1639 | + | |
| 1640 | + | |
| 1641 | + | |
| 1642 | + | |
1635 | 1643 | | |
1636 | 1644 | | |
1637 | 1645 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
70 | 70 | | |
71 | 71 | | |
72 | 72 | | |
| 73 | + | |
73 | 74 | | |
74 | 75 | | |
75 | 76 | | |
| |||
0 commit comments