Commit 5ca2da2

first pass at personalizing Astro (#2)
* front page and site config updates
* updates the projects page
* cleanup
* updates site logo
* a bunch of small tweaks
* updates first note to announce release of invocate
* adds a note about the demo
* adds demo to the project page
* remove regular blog post section for now
* updates to the invocate post
* minor fix
* working on content
* adds some images
* minor updates to posts
1 parent c6c01d6 commit 5ca2da2

28 files changed

Lines changed: 237 additions & 260 deletions

public/big_top_tent__sm_trans.png

37.2 KB

public/sunrise.jpg

65.6 KB

src/components/BaseHead.astro

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ function formatCanonicalURL(url: string | URL) {
 <meta name="generator" content={Astro.generator} />
 
 <!-- Low Priority Global Metadata -->
-<link rel="icon" type="image/svg+xml" href="/favicon.svg" />
+<link rel="icon" type="image/svg+xml" href="/big_top_tent__sm_trans.png" />
 <link rel="sitemap" href="/sitemap-index.xml" />
 <link rel="alternate" type="application/rss+xml" href="/rss.xml" title="RSS" />

src/components/ListPosts.vue

Lines changed: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ function getYear(date: Date | string | number) {
 </template>
 <li v-for="(post, index) in list " :key="post.data.title" mb-8>
 <div v-if="!isSameYear(post.data.date, list[index - 1]?.data.date)" select-none relative h18 pointer-events-none>
-<span text-7em color-transparent font-bold text-stroke-2 text-stroke-hex-aaa op14 absolute top--0.2em>
+<span text-7em color-transparent font-bold text-stroke-2 text-stroke-hex-aaa op24 absolute top--0.2em>
 {{ getYear(post.data.date) }}
 </span>
 </div>
246 KB
220 KB

src/content/blog/notes/post-1.md

Lines changed: 17 additions & 5 deletions
@@ -1,8 +1,20 @@
 ---
-title: Note Title
-description: Your blog description, which is long text, can be an introduction to the post or a paragraph of the post.
-duration: 5min
-date: 2022-12-01
+title: Invocate
+description: I've released a wrapper around invoke that makes namespaces simpler.
+date: 2025-08-08
 ---
 
-Use [Vitesse Them for Astro](https://astro.build/themes/details/vitesse-theme-for-astro/) to start writing your blog posts.
+![](images/velocity.jpg "Velocity (the cat) wants to discuss his portions.")
+
+I just released [Invocate](https://pypi.org/project/invocate/) which is a
+packaged-up version of a wrapper I wrote a while ago to make namespacing with
+Invoke tasks a bit easier to work with.
+
+It's a huge improvement over what
+I've been doing, which is dragging around a collection of aging Python code outside
+of a proper package.
+
+It also includes some (breaking) changes that I've been wanting to make for a
+while that will make it easier and more intuitive.
+
+And there are docs!
Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
+---
+title: "Demo: Pgvector"
+description: A vector database implemented using Pgvector and PostgreSQL.
+date: 2025-08-20
+---
+
+# Terra firma: playing at scale
+![](images/cat_creepin_outside_window2.jpg "Foxy (the cat) creepin' outside my office window")
+
+## My last demo was impressive, but pitiable in a lot of ways
+For the past year or so, I've been thinking about how People can benefit from
+LLMs and I've been noodling out a design for sharing context in an interesting
+way with friends, coworkers, and customers, but I've hardly touched any code
+that actually does anything interesting with an LLM or any other more typically
+AI-adjacent construct.
+
+But, just before that, I did code up a quick demo that showed off a simple RAG
+pipeline. It worked remarkably well but it was dead simple: a lightweight
+model, a Chroma vector store, and some custom chunking code.
+
+A few weeks later, I built something similar at work to mine Basecamp
+conversations for support information. Again, though just a demo, the
+results were pretty badass.
+
+## There were some pretty obvious scaling limitations:
+* at some point, I figured I'd want to put so much data in the database that it
+wouldn't fit in memory and Chroma was an in-memory vector database. I want to
+see the day when we can search curated libraries that house vast amounts of
+text, so persistent, non-resident, non-super-expensive storage is crucial.
+* the model I chose didn't fit in the VRAM so I had to run it with the CPU which
+made it pretty slow (but not terrible really).
+* it had to download the model every time it ran.
+
+## Scoping out the future
+Because I know I'll be crossing these bridges at some point, I've been eyeing
+a bunch of answers to the demo's shortcomings. PostgreSQL is an easy choice if
+it works since I've been using it for years. Caching the model is an obvious
+upgrade too.
+
+Docling was a bit of an unknown, but it performed admirably as did vLLM in a
+Docker container, once I'd upgraded my drivers to the 580 version.
+
+## This demo
+[Demo: Pgvector](https://github.com/FredworkLemmas/demo_pgvector) is simply a
+proof-of-concept that shows off the same sort of RAG pipeline and query solution
+I'd built in the past, but with some enhancements:
+
+* A PostgreSQL-backed vector index, removing in-memory constraints.
+* Docling for cleaner chunking strategies.
+* vLLM with model caching, which makes small models easy to run repeatedly.
+* A Dockerized GPU environment, which turned out to be easier to configure than
+in the past.
+* EPUB ingestion, Project Gutenberg unlocked!
+
+## Reflections
+* Embedding dimensions are strict: mismatches are non-negotiable...it's a choice
+that's made when the DB table is created.
+* Model quality has improved: Qwen 1.5B was unexpectedly strong for its size.
+* Search thresholds were surprisingly low: semantic similarity scores were far
+lower than I expected, making me wonder if it's actually possible to set that
+as a constant. It may need to reflect the content somehow. And the oddly low
+number also makes me think I should be baking in some sort of full-text search
+(which happens to be pretty easy with PostgreSQL).
+
+## Closing thoughts
+There were no "Eureka!" moments with this demo, but it was pretty easy to get to
+where all the moving parts were in place and working and the future is bright:
+
+* the AI assistant in PyCharm was super-helpful with some key bits of this
+effort.
+* with a vector database that can scale beyond a machine's RAM capacity, a
+surprisingly capable but smallish model, and some solid caching options with
+vLLM, it looks like reliable performance on modest hardware is indeed
+possible.
+
+## What's next?
+* There's a lot to be done in the context department - searching chunks is cool
+but it's a subset of what a real-world use case will need.
+* It seems like there are some good ways to integrate MCP capabilities.
+* I'd like to see bigger models and multi-modal I/O.
+
+&nbsp;
+# Additional Notes
+THERE ARE NO TESTS!! HERE BE DRAGONS! RUN AWAY!
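
The Reflections list in the Pgvector post above makes two points that a small example may help ground: the embedding dimension is fixed once the table is created, and a constant similarity threshold may not suit every corpus. The sketch below is an assumption-laden illustration in plain Python, not code from demo_pgvector. `EMBEDDING_DIM`, `check_dim`, `search`, and `keep_ratio` are hypothetical names, and keeping results relative to the best score is just one possible way to let the cutoff reflect the content.

```python
import math

# Hypothetical dimension; in a real setup this is fixed when the table is created.
EMBEDDING_DIM = 4


def check_dim(vec, dim=EMBEDDING_DIM):
    """Reject any vector whose length differs from the fixed dimension."""
    if len(vec) != dim:
        raise ValueError(f"expected {dim} dimensions, not {len(vec)}")
    return vec


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, chunks, keep_ratio=0.8):
    """Rank (text, vector) chunks against the query.

    Instead of a constant cutoff, keep chunks scoring at least
    keep_ratio * best_score, so the threshold follows the content.
    """
    check_dim(query)
    scored = sorted(
        ((cosine_similarity(query, check_dim(vec)), text) for text, vec in chunks),
        reverse=True,
    )
    # Nothing to return if there are no chunks or nothing is positively similar.
    if not scored or scored[0][0] <= 0:
        return []
    best = scored[0][0]
    return [(score, text) for score, text in scored if score >= keep_ratio * best]


if __name__ == "__main__":
    demo_chunks = [
        ("close match", [1.0, 0.0, 0.0, 0.0]),
        ("near match", [0.9, 0.1, 0.0, 0.0]),
        ("unrelated", [0.0, 1.0, 0.0, 0.0]),
    ]
    for score, text in search([1.0, 0.0, 0.0, 0.0], demo_chunks):
        print(f"{score:.3f}  {text}")
```

With a relative cutoff, a corpus whose best matches score low still returns its top results, while a fixed constant tuned on another corpus might return nothing. Whether that trade-off is acceptable depends on the data, so treat `keep_ratio` as something to tune rather than a recommendation.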

src/content/blog/notes/post-2.md

Lines changed: 0 additions & 8 deletions
This file was deleted.

src/content/blog/notes/post-3.md

Lines changed: 0 additions & 8 deletions
This file was deleted.
