Troubleshooting Guide

This guide helps you diagnose and resolve common issues you might encounter when using NodePass. For each problem, we provide possible causes and step-by-step solutions.

Connection Issues

Unable to Establish Tunnel Connection

Symptoms: Client cannot connect to the server's tunnel endpoint, or connection is immediately dropped.

Possible Causes and Solutions:

Network Connectivity Issues
- Verify basic connectivity with ping or telnet to the server address
- Check if the specified port is reachable: telnet server.example.com 10101
- Ensure no firewall is blocking the tunnel port (typically 10101)
Server Not Running
- Verify the NodePass server is running with ps aux | grep nodepass on Linux/macOS
- Check server logs for any startup errors
- Try restarting the server process
Incorrect Addressing
- Double-check the tunnel address format in your client command
- Ensure you're using the correct hostname/IP and port
- If using DNS names, verify they resolve to the correct IP addresses
TLS Configuration Mismatch
- If server requires TLS but client doesn't support it, connection will fail
- Check server logs for TLS handshake errors
- Ensure certificates are correctly configured if using TLS mode 2

Data Not Flowing Through the Tunnel

Symptoms: Tunnel connection established, but application data isn't reaching the destination.

Possible Causes and Solutions:

Target Service Not Running
- Verify the target service is running on both server and client sides
- Check if you can connect directly to the service locally
Port Conflicts
- Ensure the target port isn't already in use by another application
- Use netstat -tuln to check for port usage
Protocol Mismatch
- Verify you're tunneling the correct protocol (TCP vs UDP)
- Some applications require specific protocol support
Incorrect Target Address
- Double-check the target address in both server and client commands
- For server-side targets, ensure they're reachable from the server
- For client-side targets, ensure they're reachable from the client

Connection Stability Issues

Symptoms: Tunnel works initially but disconnects frequently or becomes unresponsive.

Possible Causes and Solutions:

Network Instability
- Check for packet loss or high latency in your network
- Consider a more stable network connection for production deployments
Resource Constraints
- Monitor CPU and memory usage on both client and server
- Adjust pool parameters if resources are being exhausted (see Performance section)
- Check file descriptor limits with ulimit -n on Linux/macOS
Timeout Configuration
- Adjust NP_UDP_DIAL_TIMEOUT if using UDP with slow response times
- Increase read parameter in URL for long-running transfers (default: 0)
- Consider adjusting NP_TCP_DIAL_TIMEOUT for unstable network conditions
Overloaded Server
- Check server logs for signs of connection overload
- Adjust max parameter and NP_SEMAPHORE_LIMIT to handle the load
- Consider scaling horizontally with multiple NodePass instances

Certificate Issues

TLS Handshake Failures

Symptoms: Connection attempts fail with TLS handshake errors.

Possible Causes and Solutions:

Invalid Certificate
- Verify certificate validity: openssl x509 -in cert.pem -text -noout
- Ensure the certificate hasn't expired
- Check that the certificate is issued for the correct domain/IP
Missing or Inaccessible Certificate Files
- Confirm file paths to certificates and keys are correct
- Verify file permissions allow the NodePass process to read them
- Check for file corruption by opening certificates in a text editor
Certificate Trust Issues
- If using custom CAs, ensure they are properly trusted
- For self-signed certificates, confirm TLS mode 1 is being used
- For verified certificates, ensure the CA chain is complete
Key Format Problems
- Ensure private keys are in the correct format (usually PEM)
- Check for passphrase protection on private keys (not supported directly)

Certificate Renewal Issues

Symptoms: After certificate renewal, secure connections start failing.

Possible Causes and Solutions:

New Certificate Not Loaded
- Restart NodePass to force loading of new certificates
- Check if RELOAD_INTERVAL is set correctly to automatically detect changes
Certificate Chain Incomplete
- Ensure the full certificate chain is included in the certificate file
- Verify chain order: your certificate first, then intermediate certificates

Key Mismatch

Verify the new certificate matches the private key:

openssl x509 -noout -modulus -in cert.pem | openssl md5
openssl rsa -noout -modulus -in key.pem | openssl md5

If outputs differ, certificate and key don't match

Performance Optimization

High Latency

Symptoms: Connections work but have noticeable delays.

Possible Causes and Solutions:

Pool Configuration
- Increase min parameter to have more connections ready
- Decrease MIN_POOL_INTERVAL to create connections faster
- Adjust NP_SEMAPHORE_LIMIT if connection queue is backing up
Network Path
- Check for network congestion or high-latency links
- Consider deploying NodePass closer to either the client or server
- Use a traceroute to identify potential bottlenecks
TLS Overhead
- If extreme low latency is required and security is less critical, consider using TLS mode 0
- For a balance, use TLS mode 1 with session resumption
Resource Contention
- Ensure the host system has adequate CPU and memory
- Check for other processes competing for resources
- Consider dedicated hosts for high-traffic deployments

High CPU Usage

Symptoms: NodePass process consuming excessive CPU resources.

Possible Causes and Solutions:

Pool Thrashing
- If pool is constantly creating and destroying connections, adjust timings
- Increase MIN_POOL_INTERVAL to reduce connection creation frequency
- Find a good balance for min and max pool parameters
Excessive Logging
- Reduce log level from debug to info or warn for production use
- Check if logs are being written to a slow device
TLS Overhead
- TLS handshakes are CPU-intensive; consider session caching
- Use TLS mode 1 instead of mode 2 if certificate validation is less critical
Traffic Volume
- High throughput can cause CPU saturation
- Consider distributing traffic across multiple NodePass instances
- Vertical scaling (more CPU cores) may be necessary for very high throughput

Memory Leaks

Symptoms: NodePass memory usage grows continuously over time.

Possible Causes and Solutions:

Connection Leaks
- Ensure NP_SHUTDOWN_TIMEOUT is sufficient to properly close connections
- Check for proper error handling in custom scripts or management code
- Monitor connection counts with system tools like netstat
Pool Size Issues
- If max parameter is very large, memory usage will be higher
- Monitor actual pool usage vs. configured capacity
- Adjust capacity based on actual concurrent connection needs
Debug Logging
- Extensive debug logging can consume memory in high-traffic scenarios
- Use appropriate log levels for production environments

UDP-Specific Issues

UDP Data Loss

Symptoms: UDP packets are not reliably forwarded through the tunnel.

Possible Causes and Solutions:

Buffer Size Limitations
- If UDP packets are large, increase UDP_DATA_BUF_SIZE
- Default of 8192 bytes may be too small for some applications
Timeout Issues
- If responses are slow, increase NP_UDP_DIAL_TIMEOUT
- Adjust read parameter for longer session timeouts
- For applications with variable response times, find an optimal balance
High Packet Rate
- UDP is handled one datagram at a time; very high rates may cause issues
- Consider increasing pool capacity for high-traffic UDP applications
Protocol Expectations
- Some UDP applications expect specific behavior regarding packet order or timing
- NodePass provides best-effort forwarding but cannot guarantee UDP properties beyond what the network provides

UDP Connection Tracking

Symptoms: UDP sessions disconnect prematurely or fail to establish.

Possible Causes and Solutions:

Connection Mapping
- Verify client configurations match server expectations
- Check for firewalls that may be timing out UDP session tracking
Application UDP Timeout
- Some applications have built-in UDP session timeouts
- May need to adjust application-specific keepalive settings

DNS Issues

DNS Resolution Failures

Symptoms: Connections fail with "no such host" or DNS lookup errors.

Solutions:

Verify System DNS Configuration
- Verify resolution works: nslookup example.com
- Check system's DNS settings (NodePass uses system resolver)
- Ensure network connectivity is working
Network Connectivity
- Check if firewall blocks UDP port 53
- Verify domain reachability
- Test with alternative domains to isolate issue

DNS Caching Problems

Symptoms: Resolution returns stale IPs, connections go to wrong endpoints.

Solutions:

Adjust Cache TTL (default 5 minutes)
- Dynamic environments: dns=1m
- Stable environments: dns=30m
Load Balancing Scenarios
- Use shorter TTL: dns=30s
- Or use IP addresses directly to bypass DNS caching

DNS Performance Optimization

Symptoms: High connection latency, slow startup.

Solutions:

Optimize Cache TTL
- Increase TTL for stable environments: dns=1h
- Reduce TTL for dynamic environments: dns=1m
- Balance between freshness and performance
Reduce DNS Queries
- Use IP addresses directly for performance-critical scenarios
- Increase TTL for stable hostnames
- Pre-resolve addresses when possible

Master API Issues

API Accessibility Problems

Symptoms: Cannot connect to the master API endpoint.

Possible Causes and Solutions:

Endpoint Configuration
- Verify API address and port in the master command
- Check if the API server is bound to the correct network interface
TLS Configuration
- If using HTTPS (TLS modes 1 or 2), ensure client tools support TLS
- For testing, use curl -k to skip certificate validation
Custom Prefix Issues
- If using a custom API prefix, ensure it's included in all requests
- Check URL formatting in API clients and scripts

Instance Management Failures

Symptoms: Cannot create, control, or delete instances through the API.

Possible Causes and Solutions:

JSON Format Issues
- Verify request body is valid JSON
- Check for required fields in API requests
URL Parsing Problems
- Ensure instance URLs are properly formatted and URL-encoded if necessary
- Verify URL parameters use the correct format
Instance State Conflicts
- Cannot delete running instances without stopping them first
- Check current instance state with GET before performing actions
Permission Issues
- Ensure the NodePass master has sufficient permissions to create processes
- Check file system permissions for any referenced certificates or keys

Data Recovery

Master State File Corruption

Symptoms: Master mode fails to start showing state file corruption errors, or instance data is lost.

Possible Causes and Solutions:

Recovery using automatic backup file
- NodePass automatically creates backup file nodepass.gob.backup every hour
- Stop the NodePass master service
- Copy backup file as main file: cp nodepass.gob.backup nodepass.gob
- Restart the master service

Manual state file recovery

# Stop NodePass service
pkill nodepass

# Backup corrupted file (optional)
mv nodepass.gob nodepass.gob.corrupted

# Use backup file
cp nodepass.gob.backup nodepass.gob

# Restart service
nodepass "master://0.0.0.0:9090?log=info"

When backup file is also corrupted
- Remove corrupted state files: rm nodepass.gob*
- Restart master, which will create new state file
- Need to reconfigure all instances and settings
Preventive backup recommendations
- Regularly backup nodepass.gob to external storage
- Adjust backup frequency: set environment variable export NP_RELOAD_INTERVAL=30m
- Monitor state file size, abnormal growth may indicate issues

Best Practices:

In production environments, recommend regularly backing up nodepass.gob to different storage locations
Use configuration management tools to save text-form backups of instance configurations

Connection Pool Type Issues

QUIC Pool Connection Failures

Symptoms: QUIC pool tunnel fails to establish when type=1 is enabled.

Possible Causes and Solutions:

UDP Port Blocked
- Verify UDP port is accessible on both server and client
- Check firewall rules: sudo ufw allow 10101/udp (Linux example)
- Test UDP connectivity with nc -u server.example.com 10101
- Some ISPs or networks block or throttle UDP traffic
TLS Configuration Issues
- QUIC requires TLS to be enabled (minimum tls=1)
- If type=1 is set but TLS is disabled, system auto-enables tls=1
- For production, use tls=2 with valid certificates
- Check certificate validity for QUIC connections
Client-Server Pool Type Mismatch
- Both server and client must use same type setting
- Server with type=1 requires client with type=1
- Server with type=0 requires client with type=0
- Check logs for "QUIC connection not available" errors
Mode Compatibility
- QUIC only works in dual-end handshake mode (mode=2)
- Not available in single-end forwarding mode (mode=1)
- System will fall back to TCP pool if mode incompatible

WebSocket Pool Connection Failures

Symptoms: WebSocket pool tunnel fails to establish when type=2 is enabled.

Possible Causes and Solutions:

HTTP/WebSocket Port Blocked
- Verify TCP port is accessible with WebSocket protocol support
- Check firewall rules and proxy configurations
- Some proxies or CDNs may interfere with WebSocket upgrade
- Test connectivity with WebSocket client tools
TLS Configuration Issues
- WebSocket Secure (WSS) requires TLS to be enabled (minimum tls=1)
- WebSocket pool does NOT support unencrypted mode - tls=0 is not allowed for type=2
- If type=2 is set but TLS is disabled, system will automatically enforce tls=1
- For production, use tls=2 with valid certificates
- Check certificate validity for WSS connections
Client-Server Pool Type Mismatch
- Both server and client must use same type setting
- Server with type=2 requires client with type=2
- Configuration is automatically delivered during handshake
- Check logs for "WebSocket connection not available" errors

HTTP/2 Pool Connection Failures

Symptoms: HTTP/2 pool tunnel fails to establish when type=3 is enabled.

Possible Causes and Solutions:

TCP Port or HTTP/2 Protocol Blocked
- Verify TCP port is accessible with HTTP/2 protocol support
- Check firewall rules and network policies
- Some networks may block or inspect HTTPS traffic
- Test connectivity with HTTP/2-capable client tools
TLS Configuration Issues
- HTTP/2 requires TLS to be enabled (minimum tls=1)
- If type=3 is set but TLS is disabled, system will automatically enforce tls=1
- For production, use tls=2 with valid certificates
- HTTP/2 requires TLS 1.3 with ALPN (Application-Layer Protocol Negotiation)
- Check certificate validity and ALPN configuration
Client-Server Pool Type Mismatch
- Both server and client must use same type setting
- Server with type=3 requires client with type=3
- Configuration is automatically delivered during handshake
- Check logs for "HTTP/2 connection not available" errors
Mode Compatibility
- HTTP/2 pool only works in dual-end handshake mode (mode=2)
- Not available in single-end forwarding mode (mode=1)
- System will fall back to TCP pool if mode incompatible
HTTP/2 Protocol Negotiation Failures
- Verify ALPN extension is enabled and negotiates "h2" protocol
- Some older TLS implementations may not support ALPN
- Check logs for protocol negotiation errors
- Ensure both endpoints support HTTP/2 over TLS

QUIC Pool Performance Issues

Symptoms: QUIC pool tunnel has lower performance than expected or worse than TCP pool.

Possible Causes and Solutions:

Network Path Issues
- Some networks deprioritize or shape UDP traffic
- Check if network middleboxes are interfering with QUIC
- Consider testing with TCP pool (type=0) for comparison
- Monitor packet loss rates - QUIC performs better with low loss
Pool Capacity Configuration
- Increase min and max parameters for higher throughput
- QUIC streams share single UDP connection - adequate capacity needed
- Monitor stream utilization with log=debug
- Balance between stream count and resource usage
Certificate Overhead
- TLS 1.3 handshake (mandatory for QUIC) can add initial latency
- Use 0-RTT resumption for faster reconnection
- Ensure proper certificate chain to avoid validation delays
Application Compatibility
- Some applications may not work optimally over QUIC streams
- Test with both TCP and QUIC pools to compare performance
- Consider TCP pool for applications requiring strict ordering

WebSocket Pool Performance Issues

Symptoms: WebSocket pool tunnel has lower performance than expected.

Possible Causes and Solutions:

Proxy/CDN Overhead
- WebSocket connections through proxies may add latency
- Check if intermediate proxies are buffering traffic
- Consider using TCP pool (type=0) or QUIC pool (type=1) for comparison
- Direct connections usually perform better than proxied
Frame Overhead
- WebSocket protocol adds framing overhead to each message
- Larger message sizes reduce relative overhead
- Monitor frame sizes and adjust application behavior if needed
- Balance between latency and throughput
TLS Handshake Overhead
- WSS requires TLS handshake for each connection
- Use connection pooling to amortize handshake costs
- Increase min and max parameters for better performance

QUIC Stream Exhaustion

Symptoms: "Insufficient streams" errors or connection timeouts when using QUIC.

Possible Causes and Solutions:

Pool Capacity Too Low
- Increase max parameter on server side
- Increase min parameter on client side
- Monitor active stream count in logs
- Default capacity may be insufficient for high-concurrency scenarios
Stream Leaks
- Check application properly closes connections
- Monitor stream count over time for gradual increase
- Restart instances to clear leaked streams
- Review application code for connection handling
QUIC Connection Dropped
- Check keep-alive settings (configured via NP_REPORT_INTERVAL)
- Monitor for "QUIC connection not available" errors
- NAT timeout may drop UDP connection - adjust NAT settings
- Increase connection timeout if network latency is high

Connection Pool Type Decision

When to Use QUIC Pool (type=1):

Mobile networks or frequently changing network conditions
High-latency connections (satellite, long-distance)
NAT-heavy environments where UDP traversal is better
Real-time applications benefiting from stream independence
Scenarios where 0-RTT reconnection provides value

When to Use WebSocket Pool (type=2):

Need to traverse HTTP proxies or CDNs
Corporate environments allowing only HTTP/HTTPS traffic
Environments where firewalls block raw TCP connections
Need compatibility with existing web infrastructure
Web proxy or VPN alternative solutions

When to Use TCP Pool (type=0):

Networks that block or severely throttle UDP traffic
Applications requiring strict TCP semantics
Corporate environments with UDP restrictions
Maximum compatibility requirements
When testing shows better performance with TCP

Comparison Testing:

# Test TCP pool performance
nodepass "server://0.0.0.0:10101/backend:8080?type=0&mode=2&log=event"
nodepass "client://server:10101/127.0.0.1:8080?mode=2&log=event"

# Test QUIC pool performance
nodepass "server://0.0.0.0:10102/backend:8080?type=1&mode=2&log=event"
nodepass "client://server:10102/127.0.0.1:8081?mode=2&log=event"

# Test WebSocket pool performance
nodepass "server://0.0.0.0:10103/backend:8080?type=2&mode=2&log=event"
nodepass "client://server:10103/127.0.0.1:8082?mode=2&log=event"

Monitor traffic statistics and choose based on observed performance.

Next Steps

If you encounter issues not covered in this guide:

Check the project repository for known issues
Increase the log level to debug for more detailed information
Review the How It Works section to better understand internal mechanisms
Consider joining the community discussion for assistance from other users

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Troubleshooting Guide

Connection Issues

Unable to Establish Tunnel Connection

Data Not Flowing Through the Tunnel

Connection Stability Issues

Certificate Issues

TLS Handshake Failures

Certificate Renewal Issues

Performance Optimization

High Latency

High CPU Usage

Memory Leaks

UDP-Specific Issues

UDP Data Loss

UDP Connection Tracking

DNS Issues

DNS Resolution Failures

DNS Caching Problems

DNS Performance Optimization

Master API Issues

API Accessibility Problems

Instance Management Failures

Data Recovery

Master State File Corruption

Connection Pool Type Issues

QUIC Pool Connection Failures

WebSocket Pool Connection Failures

HTTP/2 Pool Connection Failures

QUIC Pool Performance Issues

WebSocket Pool Performance Issues

QUIC Stream Exhaustion

Connection Pool Type Decision

Next Steps

FilesExpand file tree

troubleshooting.md

Latest commit

History

troubleshooting.md

File metadata and controls

Troubleshooting Guide

Connection Issues

Unable to Establish Tunnel Connection

Data Not Flowing Through the Tunnel

Connection Stability Issues

Certificate Issues

TLS Handshake Failures

Certificate Renewal Issues

Performance Optimization

High Latency

High CPU Usage

Memory Leaks

UDP-Specific Issues

UDP Data Loss

UDP Connection Tracking

DNS Issues

DNS Resolution Failures

DNS Caching Problems

DNS Performance Optimization

Master API Issues

API Accessibility Problems

Instance Management Failures

Data Recovery

Master State File Corruption

Connection Pool Type Issues

QUIC Pool Connection Failures

WebSocket Pool Connection Failures

HTTP/2 Pool Connection Failures

QUIC Pool Performance Issues

WebSocket Pool Performance Issues

QUIC Stream Exhaustion

Connection Pool Type Decision

Next Steps