Merged
18 changes: 12 additions & 6 deletions .github/copilot-instructions.md
@@ -132,12 +132,13 @@ handle.await?;
│ - Create feature branch │
│ - Copy implementation-verification.md to feature folder │
│ - Implement with TDD (tests first) │
│ - Check off acceptance criteria as you go │
│ - Check off acceptance criteria in tasks.md as you go │
├─────────────────────────────────────────────────────────────────────┤
│ 3. VERIFICATION PHASE │
│ - Run speckit.analyze │
│ - Verify all checklists complete │
│ - Create walkthrough.md │
│ - Run speckit.analyze to check spec/implementation alignment │
│ - Complete verification.md checklist (mark items [x] or [-]) │
│ - Create walkthrough.md for code documentation │
│ - Ensure 0 unchecked items remain in tasks.md │
├─────────────────────────────────────────────────────────────────────┤
│ 4. MERGE PHASE │
│ - Push feature branch │
@@ -153,13 +154,18 @@ Use the two-checklist system for quality assurance:
| Phase | Checklist | Command |
|-------|-----------|---------|
| Before coding | Validate spec quality | Review `.specify/checklists/requirements-quality.md` |
| After coding | Verify implementation | Check `.specify/templates/implementation-verification.md` |
| After coding | Verify implementation | Complete `specs/XXX-feature/verification.md` |

```bash
# Copy verification template to your feature
cp .specify/templates/implementation-verification.md specs/XXX-feature/verification.md

# Verify all items checked before PR
# Complete the verification checklist:
# - Mark [x] for items that pass
# - Mark [-] for items not applicable (N/A) to this feature
# - Leave [ ] only for actual issues needing fix

# Verify all items are addressed before PR (should be 0 unchecked items)
grep -c "\- \[ \]" specs/XXX-feature/verification.md # Should be 0
grep -c "\- \[ \]" specs/XXX-feature/tasks.md # Should be 0
```
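
The two `grep -c` checks above count lines that still contain an unchecked `- [ ]` box. As a rough illustration only, the same rule can be sketched in Rust; `count_unchecked` is a hypothetical helper name and is not part of the repository.

```rust
/// Hypothetical helper (not from the repository): counts the lines of a
/// markdown checklist that still contain an unchecked box, mirroring
/// `grep -c "\- \[ \]"`. Items marked `[x]` (pass) or `[-]` (N/A) are
/// not counted.
fn count_unchecked(markdown: &str) -> usize {
    markdown
        .lines()
        .filter(|line| line.contains("- [ ]"))
        .count()
}
```

A checklist is ready for a PR when this returns 0 for both `verification.md` and `tasks.md`.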
8 changes: 5 additions & 3 deletions docs/FEATURES.md
@@ -12,16 +12,18 @@ Detailed specifications for each feature in the Nexus LLM Orchestrator.
| F02 | Backend Registry | P0 | ✅ Complete | [specs/001-backend-registry](../specs/001-backend-registry/) |
| F03 | Health Checker | P0 | ✅ Complete | [specs/002-health-checker](../specs/002-health-checker/) |
| F04 | CLI and Configuration | P0 | ✅ Complete | [specs/003-cli-configuration](../specs/003-cli-configuration/) |
| F05 | mDNS Discovery | P1 | Planned | - |
| F05 | mDNS Discovery | P1 | ✅ Complete | [specs/005-mdns-discovery](../specs/005-mdns-discovery/) |
| F06 | Intelligent Router | P1 | Planned | - |
| F07 | Model Aliases | P1 | Planned | - |
| F08 | Fallback Chains | P1 | Planned | - |
| F09 | Request Metrics | P2 | Planned | - |
| F10 | Web Dashboard | P2 | Planned | - |

### MVP Status: ✅ Complete
### Current Status

All P0 features implemented with **224 tests passing**.
- **MVP (P0)**: ✅ Complete (4/4 features)
- **Phase 2 (P1)**: 🚧 In Progress (1/4 features complete)
- **Tests**: 258 passing

---

264 changes: 263 additions & 1 deletion docs/MANUAL_TESTING_GUIDE.md
@@ -32,6 +32,7 @@ ollama serve # or systemctl start ollama
2. [F02: Backend Registry](#f02-backend-registry)
3. [F03: Health Checker](#f03-health-checker)
4. [F01: Core API Gateway](#f01-core-api-gateway)
5. [F05: mDNS Discovery](#f05-mdns-discovery)

> **Note**: Features are listed in testing order, not feature number order, because CLI/Config is needed first to set up backends.

@@ -698,10 +699,271 @@ rm -f /tmp/nexus.{bash,zsh,fish}
| F02: Registry | Add/remove/list backends | Backends tracked correctly |
| F03: Health | Health status, failure detection | Accurate status reporting |
| F01: API | Models list, chat completion, streaming | OpenAI-compatible responses |
| F05: mDNS | Auto-discovery, grace period, fallback | Backends discovered, manual takes precedence |

For automated testing, run:
```bash
cargo test
```

Current test suite: **224 tests passing**.
Current test suite: **258 tests passing**.

---

## F05: mDNS Discovery

mDNS Discovery automatically finds LLM backends on your local network. This feature requires a network environment where mDNS works (typically a local network, not Docker or WSL).

### Prerequisites for mDNS Testing

- At least two machines on the same local network
- Ollama running on a different machine (it advertises via mDNS by default)
- OR: An mDNS-capable service advertising `_llm._tcp.local`

> **Note:** Testing mDNS on a single machine is limited because Ollama's mDNS advertisement is meant for network discovery. For single-machine testing, focus on verifying the configuration and graceful fallback.

### 5.1 Verify mDNS is Enabled in Configuration

```bash
# Check nexus.toml includes discovery section
grep -A5 "\[discovery\]" nexus.toml
```

**Expected**:
```toml
[discovery]
enabled = true
service_types = ["_ollama._tcp.local", "_llm._tcp.local"]
grace_period_seconds = 60
```

> **Note**: Service types can be configured with or without trailing dots. Nexus automatically normalizes them (adds the trailing dot if missing) for the mdns-sd library.
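
The normalization described in the note can be pictured with a small Rust sketch. It assumes only what the note says (append a trailing dot when it is missing); `normalize_service_type` is a hypothetical name, and the actual Nexus code may differ.

```rust
/// Hypothetical sketch of the trailing-dot normalization described above.
/// The function name and the whitespace trimming are assumptions, not
/// taken from the Nexus source.
fn normalize_service_type(service_type: &str) -> String {
    let trimmed = service_type.trim();
    if trimmed.ends_with('.') {
        // Already in fully qualified form; leave it unchanged.
        trimmed.to_string()
    } else {
        // Append the trailing dot so the name is fully qualified.
        format!("{trimmed}.")
    }
}
```

Under this sketch, `"_ollama._tcp.local"` and `"_ollama._tcp.local."` both normalize to the same string, so either spelling in `nexus.toml` would browse for the same service.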

### 5.2 Start Server with mDNS Discovery

```bash
# Start with debug logging to see discovery activity
# (redirect instead of piping to tee so that $! is the nexus PID, not tee's)
RUST_LOG=debug nexus serve > nexus.log 2>&1 &
SERVER_PID=$!
sleep 5

# Check for mDNS startup messages
grep -i "mdns\|discovery" nexus.log
```

**Expected log entries**:
```
INFO mDNS service daemon started
INFO Browsing for mDNS service: _ollama._tcp.local
INFO Browsing for mDNS service: _llm._tcp.local
```

### 5.3 Verify Discovery of Remote Ollama

If you have Ollama running on another machine (e.g., 192.168.1.100):

```bash
# Wait for discovery (may take a few seconds)
sleep 10

# List backends - should show discovered backend
nexus backends list
```

**Expected**:
```
Backends:
local-ollama (ollama) [static]
URL: http://localhost:11434
Status: Healthy

ollama-laptop (ollama) [mdns]
URL: http://192.168.1.100:11434
Status: Healthy
Models: llama3:latest, ...
```

The `[mdns]` tag indicates the backend was auto-discovered.

### 5.4 Test mDNS Disabled Mode

```bash
# Start without discovery, capturing logs for the check below
nexus serve --no-discovery > nexus.log 2>&1 &
SERVER_PID=$!
sleep 3

# Check logs - should say disabled
grep -i "discovery disabled" nexus.log

# Or check that no mDNS backends appear
nexus backends list --json | jq '[.[] | select(.source == "mdns")] | length'
```

**Expected**: 0 (no mDNS-discovered backends)

### 5.5 Test Graceful Fallback (Docker/WSL)

In environments where mDNS isn't available (Docker, WSL without special config), Nexus should gracefully continue:

```bash
# Start server in Docker or WSL
RUST_LOG=warn nexus serve > nexus.log 2>&1 &
SERVER_PID=$!
sleep 5

# Check for fallback message
grep -i "mDNS unavailable" nexus.log
```

**Expected**:
```
WARN mDNS unavailable, discovery disabled: ...
```

The server should still work, just without auto-discovery.

### 5.6 Test Manual Config Takes Precedence

```bash
# Pre-configure a backend at the same URL that would be discovered
cat > nexus.toml << 'EOF'
[server]
host = "0.0.0.0"
port = 8000

[discovery]
enabled = true

[[backends]]
name = "my-configured-ollama"
url = "http://192.168.1.100:11434"
type = "ollama"
priority = 10
EOF

nexus serve &
SERVER_PID=$!
sleep 10

# The discovered backend should NOT override the configured one
nexus backends list
```

**Expected**: Only "my-configured-ollama" appears, not a duplicate discovered backend.
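
One way to picture the precedence rule is as a filter that drops any discovered URL already present in the static configuration. The Rust sketch below is illustrative only: `filter_discovered`, `canonical`, and the URL canonicalization (trailing slash and letter case ignored) are assumptions, not the registry's actual logic.

```rust
use std::collections::HashSet;

/// Hypothetical canonical form used for comparison: ignores a trailing
/// slash and letter case. The real matching rule is not specified here.
fn canonical(url: &str) -> String {
    url.trim_end_matches('/').to_ascii_lowercase()
}

/// Hypothetical sketch: keep only discovered URLs that do not collide
/// with a statically configured backend URL.
fn filter_discovered<'a>(static_urls: &[&str], discovered: &[&'a str]) -> Vec<&'a str> {
    let configured: HashSet<String> = static_urls.iter().map(|u| canonical(u)).collect();
    discovered
        .iter()
        .copied()
        .filter(|u| !configured.contains(&canonical(u)))
        .collect()
}
```

With `http://192.168.1.100:11434` configured statically, a discovered service at the same address is filtered out, which matches the single-entry listing expected in this test.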

### 5.7 Test Grace Period (Service Disappearing)

This test requires control over a remote Ollama instance:

```bash
# 1. Start Nexus and wait for discovery
nexus serve &
sleep 10

# 2. Note the discovered backend
nexus backends list

# 3. Stop the remote Ollama (on the other machine)
# ssh user@192.168.1.100 'systemctl stop ollama'

# 4. Check status shortly after the stop - should show Unknown
sleep 5
nexus backends list # Status: Unknown

# 5. Restart the remote Ollama within the grace period (60s),
#    then give the health checker time to notice
# ssh user@192.168.1.100 'systemctl start ollama'
sleep 30

# 6. Backend should recover without being removed
nexus backends list # Status: Healthy (same backend, not re-added)
```

**Expected**: Backend transitions Unknown → Healthy without removal/re-addition.
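
The grace-period behaviour can be summarised as a single decision: a backend whose mDNS service has disappeared is kept until `grace_period_seconds` has elapsed. The Rust sketch below illustrates that rule under stated assumptions; `GraceDecision` and `grace_decision` are hypothetical names, and treating the exact boundary as "remove" is a guess rather than documented behaviour.

```rust
use std::time::{Duration, Instant};

/// Hypothetical outcome of a grace-period check.
#[derive(Debug, PartialEq)]
enum GraceDecision {
    /// Still inside the grace period: keep the backend (status Unknown).
    Keep,
    /// Grace period elapsed: remove the backend from the registry.
    Remove,
}

/// Hypothetical sketch: decide whether a backend that disappeared at
/// `disappeared_at` should still be kept at time `now`.
fn grace_decision(disappeared_at: Instant, now: Instant, grace: Duration) -> GraceDecision {
    // saturating_duration_since returns zero if `now` is earlier.
    if now.saturating_duration_since(disappeared_at) >= grace {
        GraceDecision::Remove
    } else {
        GraceDecision::Keep
    }
}
```

In the test above the remote Ollama comes back roughly 35 seconds after it stopped, which falls inside the 60-second window, so the sketch would return `Keep` throughout.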

### 5.8 Test Service Types Configuration

```bash
# Only browse for Ollama services
cat > nexus.toml << 'EOF'
[discovery]
enabled = true
service_types = ["_ollama._tcp.local"] # Only Ollama, not _llm._tcp
grace_period_seconds = 60
EOF

nexus serve &
sleep 5

nexus backends list
# Should only show Ollama services, not generic _llm services
```

### 5.9 Verify IPv6 Support

If your network has IPv6:

```bash
# Start with debug logging
RUST_LOG=debug nexus serve 2>&1 | tee nexus.log &
sleep 10

# Check if IPv6 addresses are handled correctly
grep -i "ipv6\|\[::" nexus.log
```

**Expected**: If IPv6 services are discovered, URLs use bracket notation: `http://[::1]:11434`

### 5.10 Cleanup

```bash
# Stop the server
kill $SERVER_PID 2>/dev/null || true

# Remove the test log
rm -f nexus.log
```

---

## mDNS Testing on a Single Machine

If you only have one machine, you can still test some aspects:

### Simulated Test with Avahi (Linux)

```bash
# Install Avahi if not present
sudo apt install avahi-daemon avahi-utils

# Advertise a fake LLM service
avahi-publish -s "Test LLM Server" _llm._tcp 8080 "type=generic" "api_path=/v1" &
AVAHI_PID=$!

# Start Nexus and check discovery
RUST_LOG=debug nexus serve &
SERVER_PID=$!
sleep 10

nexus backends list
# Should show discovered "Test LLM Server"

# Cleanup
kill $AVAHI_PID $SERVER_PID
```

### Simulated Test with dns-sd (macOS)

```bash
# Advertise a fake LLM service
dns-sd -R "Test LLM Server" _llm._tcp local 8080 type=generic api_path=/v1 &
DNS_SD_PID=$!

# Start Nexus
RUST_LOG=debug nexus serve &
SERVER_PID=$!
sleep 10

nexus backends list

# Cleanup
kill $DNS_SD_PID $SERVER_PID
```

---
3 changes: 3 additions & 0 deletions nexus.example.toml
@@ -9,7 +9,10 @@ request_timeout_seconds = 300
[discovery]
# Auto-discover backends via mDNS
enabled = true
# Service types to browse for (trailing dot is optional - added automatically)
service_types = ["_ollama._tcp.local", "_llm._tcp.local"]
# Grace period before removing disappeared backends (seconds)
grace_period_seconds = 60

[health_check]
enabled = true