SPEC: Simplify shell scripts — remove redundancies, DRY up service config

Date: 2026-04-23
Author: Ava
Status: For coder agent

Core Problem

  • systemd/llama-node1.service and systemd/llama-node2.service are about 95% identical.
  • setup-llama.sh duplicates both unit files as heredocs, so the same content lives in 4 places in total.
  • run_all.sh is a redundant subset of setup-llama.sh.
  • register_service.sh is already removed from git; if a copy still exists on disk, delete it.

Constraints:

  • Follow EXACT changes in task description.
  • Keep comprehensive error handling on ALL functions.
  • Use uv run or make for any Python; this task is mostly shell.
  • After the changes, run make lint (ensure shellcheck passes).
  • Context size MUST be -c 125000 in the template (the current files use 200000; fix this).
  • Set the HF_HOME environment variable for BOTH nodes.
  • Template must match provided structure exactly.
  • No breaking changes to start/stop/deploy/reload/test_comm.

Required Changes (exact)

1. Delete files

  • Delete run_all.sh entirely.
  • Delete register_service.sh if present (check with ls).
  • Once the generation logic is in place, delete the old systemd/llama-node1.service and systemd/llama-node2.service.
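
The deletion step above can be sketched as a small guard that removes a file only if it exists. This is a minimal sketch, not the required implementation: `remove_if_present` is a hypothetical helper name, and the demo runs in a scratch directory rather than the real repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the spec): delete a file only if it
# exists, and report which case applied.
remove_if_present() {
  local path="$1"
  if [ -e "$path" ]; then
    rm -f -- "$path" || { echo "ERROR: could not delete $path" >&2; return 1; }
    echo "deleted: $path"
  else
    echo "absent:  $path"
  fi
}

# Demo in a scratch directory so no real file is touched.
demo_dir="$(mktemp -d)"
touch "$demo_dir/run_all.sh"
remove_if_present "$demo_dir/run_all.sh"
remove_if_present "$demo_dir/register_service.sh"
```

In the real repository the same calls would target run_all.sh and register_service.sh at the repository root, assuming that is where they live.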

2. New file: systemd/llama-node.service.in

Create with EXACT content from task (use the ini template provided).

  • Use %NODE_DESC%, %GPU_IDS%, %PORT% placeholders.
  • Set -c 125000 (not 200000).
  • Include HF_HOME and CUDA_CACHE_PATH for both.
  • Keep all other parameters (ngl 99, tensor-split 1,1, batch 512, parallel 2, temp 0.8 etc., --jinja, MemoryHigh=20G etc.).
  • Description uses %NODE_DESC% and %GPU_IDS%.
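
One way to check the template requirements listed above is a small validation function. This is a sketch under assumptions: `validate_template` is a hypothetical name, and the stand-in template in the demo is illustrative only. The real content must come from the task description.

```shell
#!/usr/bin/env bash
# Hypothetical check (not part of the spec): confirm that a service template
# carries the placeholders and settings this section requires.
validate_template() {
  local template="$1" needle
  [ -r "$template" ] || { echo "ERROR: cannot read $template" >&2; return 1; }
  for needle in '%NODE_DESC%' '%GPU_IDS%' '%PORT%' '-c 125000' 'HF_HOME' 'CUDA_CACHE_PATH'; do
    # "--" stops option parsing, because one needle starts with "-".
    grep -qF -- "$needle" "$template" \
      || { echo "ERROR: missing '$needle' in $template" >&2; return 1; }
  done
  if grep -qF -- '200000' "$template"; then
    echo "ERROR: stale context size 200000 in $template" >&2
    return 1
  fi
  echo "template OK: $template"
}

# Stand-in template for the demo only; paths and layout are assumptions.
demo_dir="$(mktemp -d)"
demo_template="$demo_dir/llama-node.service.in"
cat > "$demo_template" <<'EOF'
Description=llama node %NODE_DESC% (GPU %GPU_IDS%)
Environment=HF_HOME=/example/hf
Environment=CUDA_CACHE_PATH=/example/cuda-cache
ExecStart=/example/llama-server --port %PORT% -c 125000
EOF
validate_template "$demo_template"
```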

3. Update setup-llama.sh

  • Remove both heredocs in setup_dual() for node1 and node2 (the two cat > << 'EOF' blocks).
  • Replace setup_dual() with:
    setup_dual() {
      log "Setting up dual node (Gemma 4 26B A4B IT, alias=coder, ctx=125000)..."
      mkdir -p "$HOME_DIR/llm-serving/systemd" "$HOME_DIR/llm-serving/nginx" || error "Failed to create directories"
      
      # Generate from template (source of truth)
      generate_service "1" "0,1" "8081"
      generate_service "2" "2,3" "8082"
      
      log "✓ Dual node systemd services generated from template"
      success "Dual node setup complete"
    }
  • Add helper function:
    generate_service() {
      local node_num="$1"
      local gpu_ids="$2"
      local port="$3"
      local template="$SCRIPT_DIR/systemd/llama-node.service.in"
      local target1="$HOME_DIR/llm-serving/systemd/llama-node${node_num}.service"
      local target2="$SCRIPT_DIR/systemd/llama-node${node_num}.service"

      # Fail early if the template (source of truth) is missing
      [ -r "$template" ] || error "Template not found: $template"

      # Fill in the placeholders; values must not contain "/" (the sed delimiter)
      sed -e "s/%NODE_DESC%/${node_num}/g" \
          -e "s/%GPU_IDS%/${gpu_ids}/g" \
          -e "s/%PORT%/${port}/g" \
          "$template" > "$target1" || error "Failed to generate node${node_num}"
      # Keep a copy in SCRIPT_DIR, where register_services() copies from
      cp "$target1" "$target2" || error "Failed to copy to SCRIPT_DIR"
      success "Generated llama-node${node_num}.service (GPU ${gpu_ids}, port ${port})"
    }
  • Update the register_services() comment and log message if needed. It already copies from $SCRIPT_DIR/systemd/, which will now contain the generated files.
  • Keep ensure_llm_serving_link, check_llama_cpp, the call to deploy_nginx.sh, and register_services in main().
  • Update the top comments to reflect the new template-based approach (no more heredocs).
  • Ensure set -e stays in place and every step has error handling.
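
As a complement to the steps above, a post-generation check can confirm that no placeholder survived the sed pass. This is a minimal sketch: `has_unreplaced_placeholders` is a hypothetical helper, and the two-line template is a stand-in, not the real unit file.

```shell
#!/usr/bin/env bash
# Hypothetical post-generation check (not part of the spec): succeed (exit 0)
# and print the offending lines if any %UPPER_CASE% placeholder is left.
# systemd specifiers such as %h or %i are lowercase with no trailing "%",
# so they do not match this pattern.
has_unreplaced_placeholders() {
  grep -nE '%[A-Z_]+%' -- "$1"
}

# Stand-in template for the demo only.
work_dir="$(mktemp -d)"
printf '%s\n' \
  'Description=node %NODE_DESC% GPU %GPU_IDS%' \
  'ExecStart=/example/llama-server --port %PORT%' \
  > "$work_dir/llama-node.service.in"

# Same substitution style as generate_service, with node 1 values.
sed -e "s/%NODE_DESC%/1/g" \
    -e "s/%GPU_IDS%/0,1/g" \
    -e "s/%PORT%/8081/g" \
    "$work_dir/llama-node.service.in" > "$work_dir/llama-node1.service"

if has_unreplaced_placeholders "$work_dir/llama-node1.service"; then
  echo "ERROR: unreplaced placeholders remain" >&2
else
  cat "$work_dir/llama-node1.service"
fi
```

The check is useful mainly when the template later gains a placeholder that generate_service does not yet substitute.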

Note: The .service files are now generated artifacts. They can be in git or not; since the task doesn't specify a .gitignore update, treat them as files generated during setup.

4. Update start_service.sh

  • In check_systemd(), if the warning says "Run ./register_service.sh first", change it to "Run ./setup-llama.sh first". The current script appears to reference setup-llama.sh already, but verify.

5. Update stop_service.sh

  • Verify that "To re-register: ./setup-llama.sh" is present (it already is).

6. Update README.md

  • Remove mentions of run_all.sh and register_service.sh from Quick Start, Daily Commands, Project Structure.
  • Update the Project Structure section to list systemd/llama-node.service.in (NEW template) instead of the two llama-node*.service files.
  • Update any description of setup-llama.sh to mention that it generates the services from the template.
  • Keep other content.

7. Update AGENTS.md

  • Remove any references to run_all.sh or register_service.sh in Key Scripts section.
  • Update any text that describes service registration.
  • Ensure it still says "registration now in setup-llama.sh".

8. Update Makefile

  • No major changes expected unless it references deleted files (lint still covers *.sh).
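
The documentation and Makefile checks in sections 6 to 8 can be sketched as a single scan for leftover references to the deleted scripts. `find_stale_refs` is a hypothetical helper, and the demo files are stand-ins created in a scratch directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the spec): list every line that still
# mentions one of the deleted scripts. grep exits non-zero when nothing matches.
# The pattern does not match the register_services() function name.
find_stale_refs() {
  local root="$1"
  grep -rnE --exclude-dir=.git 'run_all\.sh|register_service\.sh' "$root"
}

# Demo with stand-in files; the real run would target the repository root.
demo_root="$(mktemp -d)"
printf '%s\n' 'Quick Start: ./run_all.sh' > "$demo_root/README.md"
printf '%s\n' 'lint:' > "$demo_root/Makefile"

if find_stale_refs "$demo_root"; then
  echo "stale references found; update the files listed above"
else
  echo "no stale references"
fi
```

A scan of the real repository would also report this spec file itself, since it names both scripts.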

9. Other

  • systemd/user-nginx.service: no update needed.
  • Ensure the generated services match the previous functionality exactly, apart from the corrected ctx=125k and the removed duplication.
  • In the template, use the exact ExecStart, environment variables, etc. from the task.
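
One possible sanity check for the "no duplication" goal: the two generated files should differ only on lines that carry a placeholder in the template. The sketch below assumes line-for-line generation (as sed produces) and distinct values for every placeholder. Both helper names are hypothetical, and the template is a stand-in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count line positions at which two files differ.
count_differing_lines() {
  awk 'NR == FNR { first[FNR] = $0; next }
       first[FNR] != $0 { n++ }
       END { print n + 0 }' "$1" "$2"
}

# Hypothetical helper: count template lines that contain a %PLACEHOLDER%.
count_placeholder_lines() {
  grep -cE '%[A-Z_]+%' -- "$1" || true
}

# Stand-in template for the demo only.
cmp_dir="$(mktemp -d)"
printf '%s\n' \
  'Description=node %NODE_DESC% (GPU %GPU_IDS%)' \
  'Environment=HF_HOME=/example/hf' \
  'ExecStart=/example/llama-server --port %PORT% -c 125000' \
  > "$cmp_dir/llama-node.service.in"

sed -e 's/%NODE_DESC%/1/g' -e 's/%GPU_IDS%/0,1/g' -e 's/%PORT%/8081/g' \
  "$cmp_dir/llama-node.service.in" > "$cmp_dir/llama-node1.service"
sed -e 's/%NODE_DESC%/2/g' -e 's/%GPU_IDS%/2,3/g' -e 's/%PORT%/8082/g' \
  "$cmp_dir/llama-node.service.in" > "$cmp_dir/llama-node2.service"

differing="$(count_differing_lines "$cmp_dir/llama-node1.service" "$cmp_dir/llama-node2.service")"
expected="$(count_placeholder_lines "$cmp_dir/llama-node.service.in")"

if [ "$differing" -eq "$expected" ]; then
  echo "OK: nodes differ on $differing placeholder-derived lines"
else
  echo "ERROR: $differing differing lines, expected $expected" >&2
fi
```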

Workflow for Coder

  1. Implement EXACTLY as per this spec and the original task description. Do not add extra features.
  2. Use precise edits where possible.
  3. After all changes, run make lint and fix any shellcheck issues.
  4. Do not write tests (that's for test-writer).
  5. Return a list of changed file paths (one per line) plus a brief summary.

After implementation, Ava will delegate to test-writer, then reviewer.

This spec follows the task's "Changes required" points 1-10 exactly.