Build incrementally. Each phase adds capability without breaking previous work.
Timeline: ~2-3 weeks (assuming part-time work)
Phase 1: Foundation
Goal: Get Ollama running and build a basic inference loop
- Install Ollama on Alienware

      curl -fsSL https://ollama.com/install.sh | sh
      ollama serve  # Start server
- Pull and test models

      ollama pull llama3.3          # General purpose, 70B
      ollama pull qwen2.5-coder:7b  # Code-specific, smaller
      ollama pull codellama:7b      # Alternative coder
- Manual API testing

      # Test basic inference
      curl http://localhost:11434/api/chat -d '{
        "model": "llama3.3",
        "messages": [{"role": "user", "content": "Say hello"}],
        "stream": false
      }'
- Create minimal Python client
  - `src/llm.py` - Simple Ollama HTTP wrapper
  - `src/agent.py` - Single turn: user input → LLM → output
  - No tools yet, just text responses
- Test basic conversation

      python src/agent.py "What is Python?"
      python src/agent.py "Write a hello world function"
- ✅ Ollama running, models downloaded
- ✅ Can send prompts via API
- ✅ Agent script returns LLM responses
- ✅ Code runs without errors
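The Phase 1 client can be sketched roughly as follows. This is a minimal sketch, not a finished `src/llm.py`: the function names (`build_payload`, `chat`, `single_turn`) and the 120-second timeout are assumptions made for illustration, and it assumes the non-streaming `/api/chat` response carries the reply text under `message.content`, as in the manual curl test above.

```python
"""Sketch of src/llm.py: a thin wrapper around Ollama's /api/chat endpoint."""
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default local Ollama address


def build_payload(model, messages):
    """Build the JSON body for a non-streaming chat request."""
    return {"model": model, "messages": messages, "stream": False}


def chat(model, messages, url=OLLAMA_URL, timeout=120):
    """Send one chat request and return the assistant's text."""
    body = json.dumps(build_payload(model, messages)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = json.load(response)
    return data["message"]["content"]


def single_turn(prompt, model="llama3.3", send=chat):
    """Single turn: user input -> LLM -> output (no tools, no history).

    `send` is injectable so the function can be tested without a server.
    """
    return send(model, [{"role": "user", "content": prompt}])
```

With Ollama running, `print(single_turn("What is Python?"))` would make one request and print the reply. Using only the standard library keeps the first phase free of dependencies; swapping in `requests` later is a small change.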
Phase 2: Tools
Goal: Add tool calling - let the LLM execute functions
- Design tool interface
  - `src/tools/base.py` - Tool base class
  - Define tool schema (name, description, parameters)
- Implement first tools
  - `src/tools/filesystem.py`:
    - `read_file(path)` - Read file content
    - `write_file(path, content)` - Write file
    - `list_directory(path)` - List files
- Add tool registry
  - `src/tools/registry.py` - Register and look up tools
  - Convert tools to Ollama function format
- Implement tool calling loop
  - Modify `src/agent.py`:
    - Send tools to LLM
    - Parse tool call responses
    - Execute requested tool
    - Send result back to LLM
    - Repeat until LLM returns text response
- Test tool execution

      python src/agent.py "Read the file test.txt"
      python src/agent.py "Create a file called hello.py with a hello world function"
- ✅ LLM can call tools
- ✅ Tools execute correctly
- ✅ Results feed back to LLM
- ✅ Agent completes multi-step tasks
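The Phase 2 pieces (tool schema, registry, and calling loop) might fit together as in the sketch below. Everything here is illustrative: `Tool`, `REGISTRY`, `register`, and `run_agent` are hypothetical names, and the message shapes (`tool_calls` entries with `function.name` and `function.arguments`, results sent back with role `tool`) are assumptions about Ollama's chat format that should be checked against the Ollama API docs.

```python
"""Sketch of a tool registry plus tool-calling loop (hypothetical names)."""
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    description: str
    parameters: dict          # JSON Schema describing the arguments
    func: Callable[..., str]  # the Python function that does the work

    def to_ollama(self):
        """Convert to the function-style schema sent in the "tools" field."""
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }


REGISTRY = {}


def register(tool):
    REGISTRY[tool.name] = tool


def read_file(path):
    with open(path, encoding="utf-8") as handle:
        return handle.read()


register(Tool(
    name="read_file",
    description="Read file content",
    parameters={
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
    func=read_file,
))


def run_agent(llm, messages, max_steps=5):
    """Call the LLM, run any requested tools, stop on a plain text reply.

    `llm(messages, tools)` must return an assistant message dict.
    """
    tools = [tool.to_ollama() for tool in REGISTRY.values()]
    for _ in range(max_steps):
        reply = llm(messages, tools)
        messages.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            return reply.get("content", "")
        for call in calls:
            fn = call["function"]
            tool = REGISTRY.get(fn["name"])
            if tool is None:
                result = f"Error: unknown tool {fn['name']}"
            else:
                try:
                    result = tool.func(**fn["arguments"])
                except Exception as exc:  # report failures back to the model
                    result = f"Error: {exc}"
            messages.append({"role": "tool", "content": str(result)})
    return "Stopped: too many tool calls"
```

Passing the LLM in as a callable keeps the loop testable with a scripted fake, and the `max_steps` cap guards against a model that never stops requesting tools.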
Phase 3: Context
Goal: Handle conversation history and token limits
- Implement context manager
  - `src/context.py`:
    - Store message history
    - Estimate token usage
    - Prune old messages when needed
- Add conversation persistence
  - Save conversations to `~/.agent/sessions/`
  - Load previous context (optional)
- Implement token management strategies
  - Sliding window (keep last N messages)
  - Summarization (compress old context)
- Multi-turn conversations
  - Modify agent to maintain context across multiple user inputs
  - Add `/clear` command to reset context
- Test context handling

      python src/agent.py
      > Create a file called test.py
      > Now add a function to it
      > What's in the file?
- ✅ Agent remembers previous messages
- ✅ Context doesn't overflow token limit
- ✅ Can handle long conversations
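The sliding-window strategy from Phase 3 could look something like this. It is a sketch under stated assumptions: the `Context` class and its methods are hypothetical, and the "4 characters per token" estimate is a rough heuristic rather than the model's real tokenizer. Summarization is not shown.

```python
"""Sketch of src/context.py: message history with a rough token budget."""


def estimate_tokens(text):
    """Rough heuristic: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


class Context:
    def __init__(self, max_tokens=4096, system_prompt=None):
        self.max_tokens = max_tokens
        # The system prompt is kept apart so pruning never removes it.
        self.system = (
            [{"role": "system", "content": system_prompt}] if system_prompt else []
        )
        self.messages = []

    def total_tokens(self):
        return sum(estimate_tokens(m["content"]) for m in self.system + self.messages)

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})
        self.prune()

    def prune(self):
        """Sliding window: drop the oldest messages until under budget."""
        while len(self.messages) > 1 and self.total_tokens() > self.max_tokens:
            self.messages.pop(0)

    def clear(self):
        """Backs the /clear command: forget everything but the system prompt."""
        self.messages = []

    def as_list(self):
        """Messages in the order they would be sent to the LLM."""
        return self.system + self.messages
```

The newest message is always kept, even if it alone exceeds the budget, so a single long input degrades gracefully instead of emptying the history.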
Phase 4: Safety
Goal: Make it robust and safe to use
- Add safety checks
  - Whitelist allowed shell commands
  - Restrict file operations to safe directories
  - Time out long-running operations
  - Ask for confirmation before destructive actions
- Implement error handling
  - Graceful failure when tools error
  - Retry logic for transient failures
  - Clear error messages to user
- Add more tools
  - `src/tools/shell.py`: `execute_shell(command)` - Run shell commands (safe)
  - `src/tools/web.py`: `search_web(query)` - Web search
  - `src/tools/python.py`: `run_python(code)` - Execute Python code in sandbox
- Testing suite
  - `tests/test_tools.py` - Unit tests for each tool
  - `tests/test_agent.py` - Integration tests
  - `tests/test_context.py` - Context management tests
- Documentation
  - Update `architecture.md` with safety notes
  - Add usage examples to `README.md`
- ✅ Agent handles errors gracefully
- ✅ Dangerous operations require confirmation
- ✅ All tools have tests
- ✅ Project is well-documented
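The Phase 4 safety checks could be sketched along these lines. The whitelist contents, the 10-second timeout, and the helper names `is_safe_path` and `confirm` are assumptions made for illustration (`execute_shell` is the name used in the task list above). `Path.is_relative_to` requires Python 3.9 or newer.

```python
"""Sketch of safety helpers (hypothetical names and defaults)."""
import shlex
import subprocess
from pathlib import Path

ALLOWED_COMMANDS = {"ls", "cat", "echo", "pwd"}  # example whitelist only


def is_safe_path(path, root):
    """True if `path` resolves to a location inside `root`."""
    resolved = Path(root, path).resolve()
    return resolved.is_relative_to(Path(root).resolve())


def confirm(action, ask=input):
    """Ask the user before a destructive action; anything but 'y' means no."""
    return ask(f"Allow {action}? [y/N] ").strip().lower() == "y"


def execute_shell(command, timeout=10):
    """Run a whitelisted command without a shell, with a timeout."""
    try:
        argv = shlex.split(command)
    except ValueError as exc:
        return f"Refused: could not parse command ({exc})"
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        return f"Refused: {command!r} is not on the whitelist"
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return f"Error: timed out after {timeout}s"
    except OSError as exc:
        return f"Error: {exc}"
    return done.stdout if done.returncode == 0 else f"Error: {done.stderr.strip()}"
```

Checking only the executable name still lets a whitelisted command touch any file (for example `cat` on a path outside the project), so arguments that look like paths would also need to pass `is_safe_path`. Returning error strings rather than raising lets the agent loop feed failures back to the LLM.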
Phase 5: Polish
Goal: Make it pleasant to use and extend
- Interactive mode
  - REPL-style interface
  - Command history
  - Tab completion (optional)
- Configuration system
  - `config.yaml` - Model selection, safety settings, tool config
  - Command-line flags (`--model`, `--debug`, `--tools`)
- Logging & debugging
  - Log all LLM requests/responses
  - Debug mode showing tool execution
  - Performance metrics (tokens used, time per turn)
- Advanced features (pick what interests you)
  - Streaming responses (real-time output)
  - Multi-agent orchestration (spawn sub-agents)
  - Plugin system (load tools from external modules)
  - Web UI (Flask-based chat interface)
  - Voice interface (Whisper STT + TTS)
- Integration experiments
  - Connect to OpenClaw as ACP harness
  - Integrate with MagicMirror (control modules via agent)
  - Home Assistant integration
- ✅ Agent is production-ready for personal use
- ✅ Easy to add new tools
- ✅ Well-tested and documented
- ✅ You understand every line of code
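The command-line flags from the configuration task could be parsed with the standard library's `argparse`, roughly as below. The `parse_args` helper, the optional positional prompt, the comma-separated `--tools` format, and the default model are assumptions for illustration; loading `config.yaml` is left out because it would need a YAML library.

```python
"""Sketch of command-line flag handling for src/agent.py (assumed layout)."""
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Local LLM agent")
    parser.add_argument(
        "prompt", nargs="?", help="one-shot prompt; omit for interactive mode"
    )
    # Default model is an arbitrary choice for this sketch.
    parser.add_argument("--model", default="qwen2.5-coder:7b")
    parser.add_argument("--debug", action="store_true", help="show tool execution")
    parser.add_argument("--tools", default="", help="comma-separated tool names")
    args = parser.parse_args(argv)
    # Turn "read_file,write_file" into ["read_file", "write_file"].
    args.tools = [name for name in args.tools.split(",") if name]
    return args
```

For example, `python src/agent.py --model llama3.3 --tools read_file,write_file "Read test.txt"` would yield a one-shot run with two tools enabled, while running with no prompt could drop into the REPL.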
Track progress as you build:
Phase 1: Foundation
- Ollama installed and running
- Models downloaded and tested
- Basic Python client works
- Single-turn conversations work
Phase 2: Tools
- Tool interface designed
- File operations implemented
- Tool calling loop works
- Multi-step tasks complete
Phase 3: Context
- Context manager implemented
- Token estimation works
- Multi-turn conversations work
- History persists (optional)
Phase 4: Safety
- Safety checks in place
- Error handling robust
- Additional tools added
- Test suite passing
Phase 5: Polish
- Interactive mode works
- Configuration system
- Logging and debugging
- Advanced features (choose N)
Think about these while reviewing docs:
- Which model to use as primary?
  - llama3.3 (70B) - Smarter but slower
  - qwen2.5-coder:7b - Faster, good for code
  - Test both, see what works
- Tool calling vs prompt engineering?
  - Start with tool calling (cleaner)
  - Fall back to prompts if model doesn't support it
- Which tools to prioritize?
  - File ops (read/write) - Essential
  - Shell execution - Powerful but risky
  - Web search - Useful for research tasks
  - Code execution - Great for testing snippets
- Safety vs convenience?
  - Always ask before destructive operations?
  - Or trust the LLM with whitelisted commands?
  - Your call based on risk tolerance
- How to test?
  - Unit tests for each component?
  - Integration tests for full workflows?
  - Manual testing only?
  - Combination (recommended)
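For the "tool calling vs prompt engineering" question, the prompt-based fallback might work by instructing the model to answer with a JSON object whenever it wants a tool, then parsing that text. The sketch below assumes a made-up convention (`{"tool": ..., "arguments": {...}}`) and a hypothetical `parse_tool_call` helper; it is not a format any particular model is known to emit on its own.

```python
"""Sketch of a prompt-engineering fallback for models without tool calling."""
import json


def parse_tool_call(text):
    """Return (name, arguments) if `text` is a JSON tool request, else None."""
    try:
        data = json.loads(text.strip())
    except json.JSONDecodeError:
        return None
    if isinstance(data, dict) and isinstance(data.get("tool"), str):
        args = data.get("arguments", {})
        # Ignore malformed arguments rather than crash the agent loop.
        return data["tool"], (args if isinstance(args, dict) else {})
    return None
```

Anything that fails to parse is treated as an ordinary text reply, so the same agent loop can serve both modes.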
See reading-list.md for detailed resources.
Quick picks:
- Anthropic's tool use guide (concepts apply to any LLM)
- Ollama API docs
- OpenClaw architecture (see what you're replicating)
- Your own MagicMirror code (patterns to reuse)
When you land Thursday:
- Install Ollama on Alienware (10 minutes)
- Run through Phase 1 tasks (2-3 hours)
- Report back - share what worked, what didn't
- Adjust plan based on what you learned
This isn't just a project - it's your bootcamp for understanding AI agents. Every line you write teaches you something you can apply to future work.
Let's build something cool. 🚀