Tutorial triton hf #21

Draft
fredjaya wants to merge 3 commits into main from tutorial-triton-hf
Conversation

@fredjaya (Member) commented Apr 9, 2026

Description

Adds a new tutorial on spinning up a Triton Inference Server and configuring it to serve a Hugging Face model.

It feels a bit long and could be split into several more focused tutorials or how-tos.

Checklist

  • My changes build successfully with Quarto (quarto preview or quarto render)
  • No broken links (quarto check)
  • Screenshots or diagrams updated (if applicable)
  • Documented any new files in _quarto.yml or relevant index pages

How to View Changes

Pull the repo, change to the relevant branch and use quarto preview to view the marked-up version of the text in a browser window.
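
The steps above can be sketched as a small shell helper. This is only an illustration: the function name `preview_pr_branch` is hypothetical, the remote name `origin` is an assumption, and the default branch name is the one this pull request is opened from.

```shell
# Hypothetical helper: fetch the PR branch, check it out, and start a live preview.
# Assumes the remote is called "origin" and that git and quarto are on PATH.
preview_pr_branch() {
  # Default to this PR's branch; pass another branch name as the first argument.
  branch="${1:-tutorial-triton-hf}"
  # Each step runs only if the previous one succeeded.
  git fetch origin &&
    git checkout "$branch" &&
    quarto preview
}
```

Run `preview_pr_branch` from the root of a local clone; `quarto preview` then serves the rendered site and reloads it as files change.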

@fredjaya marked this pull request as ready for review April 9, 2026 08:11

@fredjaya (Member, Author) commented:

@gdmcdonald suggests using a more recent (2025-2026) model in this example, e.g.:

  • gpt-oss-120b
  • Qwen 3.5 models for smaller options

@fredjaya marked this pull request as draft April 14, 2026 01:16
