Troubleshooting Guide¶
Common issues and their solutions for Cortex Linux systems.
Quick Diagnostics¶
Before diving into specific issues, run these diagnostic commands:
# System health check
cortex-ops doctor --verbose
# View recent system logs
journalctl -p err -n 50
# Check disk space
df -h
# Check memory
free -h
# Check for failed services
systemctl list-units --failed
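The checks above can be collected into a single report for later comparison. A minimal sketch, assuming a Bash-compatible shell; `run_section` and `quick_diagnostics` are hypothetical helper names and are not part of `cortex-ops`:

```shell
#!/usr/bin/env bash
# run_section is a hypothetical helper: it prints a header, runs the given
# command, and notes a failure instead of aborting the whole report.
run_section() {
    local title="$1"
    shift
    echo "== ${title} =="
    "$@" 2>&1 || echo "(command failed or is unavailable: $1)"
}

# Run the standard checks from this section in one pass.
quick_diagnostics() {
    run_section "Recent errors" journalctl -p err -n 50 --no-pager
    run_section "Disk space" df -h
    run_section "Memory" free -h
    run_section "Failed units" systemctl list-units --failed --no-pager
}

# Usage (not run automatically):
#   quick_diagnostics > "diagnostics-$(date +%F).txt"
```

Because each command is wrapped, one missing tool does not stop the remaining checks from running.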
Boot Issues¶
System Won't Boot¶
Symptoms: System hangs at boot, black screen, or boot loop.
Diagnosis:
# From recovery mode or live USB
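# Check the filesystem (the partition must be unmounted; /dev/sda1 is an example)
fsck /dev/sda1
# View errors from the previous boot
# (from a live USB, mount the installed root at /mnt and add -D /mnt/var/log/journal)
journalctl -b -1 -p err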
# Check GRUB config
cat /boot/grub/grub.cfg
Solutions:
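- Repair GRUB: reinstall the bootloader and regenerate its configuration from recovery mode or a live USB.
- Fix fstab: correct or comment out entries in /etc/fstab that reference missing or renamed devices.
- Kernel rollback: boot a previous kernel from the GRUB menu, then remove the failing one.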
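The first two solutions can be sketched as follows. This assumes a BIOS/MBR install on a Debian-style system, with the root partition at `/dev/sda1`, the disk at `/dev/sda`, and `/mnt` as the mount point; all three are placeholders, and UEFI systems need a different `grub-install` invocation. `repair_grub` and `fstab_uuids` are hypothetical helper names. Kernel rollback uses the package commands shown under Kernel Panic.

```shell
#!/usr/bin/env bash
# Sketch only: device names and the /mnt mount point are assumptions;
# check `lsblk` for the real layout before running anything.

# Reinstall GRUB from a live USB by chrooting into the installed system.
repair_grub() {
    local root_part="$1" disk="$2"
    sudo mount "$root_part" /mnt
    for fs in dev proc sys; do
        sudo mount --bind "/$fs" "/mnt/$fs"
    done
    sudo chroot /mnt grub-install "$disk"
    sudo chroot /mnt update-grub
}

# Print every UUID referenced in an fstab file (comment lines are skipped),
# so they can be compared against the UUIDs reported by `blkid`.
fstab_uuids() {
    awk '!/^[[:space:]]*#/ && $1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1 }' "$1"
}

# Usage (not run automatically):
#   repair_grub /dev/sda1 /dev/sda
#   fstab_uuids /mnt/etc/fstab
#   sudo blkid
```

Any UUID printed by `fstab_uuids` that does not appear in the `blkid` output points at an fstab entry that can block the boot.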
Kernel Panic¶
Symptoms: "Kernel panic - not syncing" message.
Solutions:
- Boot with previous kernel from GRUB menu
- Check hardware (RAM, disk)
- Review recent kernel updates
# List installed kernels
dpkg --list | grep linux-image
# Remove the problematic kernel (replace x.x.x with a version from the list above)
sudo apt remove linux-image-x.x.x-cortex
Network Issues¶
No Network Connectivity¶
Diagnosis:
# Check interfaces
ip link show
# Check IP addresses
ip addr show
# Check routes
ip route show
# Test connectivity
ping -c 3 8.8.8.8
# Test DNS
dig google.com
Solutions:
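- Restart NetworkManager: restart the service so that connections are re-established.
- Manually configure interface: assign an address and a default route with the ip command.
- Reset network configuration: restore the default network settings if the steps above do not help.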
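A minimal sketch of the first two solutions, assuming NetworkManager runs as the `NetworkManager` systemd unit. The interface name, address, and gateway are placeholders, and both function names are hypothetical. The reset step is left out because it depends on how the system's network configuration is managed.

```shell
#!/usr/bin/env bash
# Interface name, address, and gateway used below are placeholders.

restart_network_manager() {
    sudo systemctl restart NetworkManager
}

# Assign a static address by hand; the change is lost on reboot.
configure_iface() {
    if [ "$#" -ne 3 ]; then
        echo "usage: configure_iface IFACE ADDR/PREFIX GATEWAY" >&2
        return 2
    fi
    local iface="$1" addr="$2" gateway="$3"
    sudo ip link set "$iface" up
    sudo ip addr add "$addr" dev "$iface"
    sudo ip route add default via "$gateway" dev "$iface"
}

# Usage (not run automatically):
#   restart_network_manager
#   configure_iface eth0 192.168.1.50/24 192.168.1.1
```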
DNS Resolution Failing¶
Symptoms: Can ping IPs but not hostnames.
Solutions:
# Check resolv.conf
cat /etc/resolv.conf
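# Set DNS manually (temporary; may be overwritten if resolv.conf is managed by systemd-resolved or NetworkManager)
echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
# Or configure systemd-resolved
sudo nano /etc/systemd/resolved.conf
# Add under [Resolve]: DNS=8.8.8.8 8.8.4.4
sudo systemctl restart systemd-resolved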
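To see which resolver the system is actually using, it can help to list the configured nameservers before changing anything. A small sketch; `list_nameservers` is a hypothetical helper name:

```shell
# List the nameservers configured in a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no path is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Usage (not run automatically):
#   list_nameservers                 # reads /etc/resolv.conf
#   getent hosts example.com         # resolve through the system resolver
#   dig @8.8.8.8 example.com +short  # bypass the local resolver
```

If the only entry is 127.0.0.53, queries go through the systemd-resolved stub, and `resolvectl status` shows the upstream servers in use.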
Firewall Blocking Traffic¶
# Check UFW status
sudo ufw status verbose
# Temporarily disable (re-enable with "sudo ufw enable" after testing)
sudo ufw disable
# Allow specific port
sudo ufw allow 80/tcp
# Allow from specific IP
sudo ufw allow from 192.168.1.100
Disk Issues¶
Disk Full¶
Diagnosis:
# Check disk usage
df -h
# Find large files
sudo du -h --max-depth=1 / | sort -hr | head -20
# Find large log files
sudo find /var/log -type f -size +100M
Solutions:
# Clean apt cache
sudo apt clean
# Remove old kernels and other packages that are no longer needed
sudo apt autoremove
# Clean journal logs
sudo journalctl --vacuum-time=7d
# Remove files in /tmp that have not been accessed for more than 7 days
sudo find /tmp -type f -atime +7 -delete
# Use cortex-ops
cortex-ops repair apt --clean-cache
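To spot filesystems that are close to full before they cause failures, the `df` output can be filtered by a usage threshold. A minimal sketch; `filesystems_over` is a hypothetical helper name and the 90% default is an arbitrary example value:

```shell
# Read `df -P` output on stdin and print mount points at or above a usage
# threshold (percent). Mount points containing spaces are not handled.
filesystems_over() {
    local threshold="${1:-90}"
    awk -v limit="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Usage (not run automatically):
#   df -P | filesystems_over 85
```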
Filesystem Errors¶
Diagnosis:
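A minimal sketch of typical checks, assuming the kernel log is readable with `dmesg` or `journalctl -k`. The `fs_error_lines` name and its pattern list are illustrative, not exhaustive:

```shell
# Filter kernel log text on stdin for common filesystem and disk error markers.
fs_error_lines() {
    grep -iE 'ext4-fs error|xfs.*corrupt|i/o error|remounting filesystem read-only'
}

# Usage (not run automatically):
#   sudo dmesg | fs_error_lines
#   journalctl -k -p err --no-pager | fs_error_lines
#   sudo smartctl -H /dev/sda   # drive health; needs the smartmontools package
```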
Solutions:
# Boot to recovery mode and run fsck
sudo fsck -y /dev/sda1
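# For a mounted root filesystem, schedule a check at the next boot
# (if /forcefsck is ignored, boot once with the kernel parameter fsck.mode=force)
sudo touch /forcefsck
sudo reboot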
Disk I/O Slow¶
# Check I/O stats
iostat -x 1 5
# Check for processes using disk
iotop
# Check for disk errors
dmesg | grep -i error
Memory Issues¶
Out of Memory (OOM)¶
Diagnosis:
# Check memory
free -h
# Check OOM logs (reading the kernel log may require root)
sudo dmesg | grep -iE "out of memory|oom-kill"
# Find memory-hungry processes
ps aux --sort=-%mem | head -10
Solutions:
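- Increase swap: add a swap file or enlarge the existing swap area.
- Kill memory-hungry processes: stop the top consumers identified with the ps command above.
- Configure OOM killer: adjust oom_score_adj so that critical processes are less likely to be killed.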
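The swap and OOM-killer steps can be sketched as follows. The 2G size and `/swapfile` path are example values, and `add_swapfile` and `set_oom_score_adj` are hypothetical helper names. On filesystems where `fallocate` cannot produce a usable swap file, the file may need to be created with `dd` instead.

```shell
#!/usr/bin/env bash
# Create and enable a swap file. Size and path are example values.
add_swapfile() {
    local size="${1:-2G}" path="${2:-/swapfile}"
    if ! printf '%s' "$size" | grep -Eq '^[0-9]+[MG]$'; then
        echo "size must look like 512M or 2G" >&2
        return 2
    fi
    sudo fallocate -l "$size" "$path"
    sudo chmod 600 "$path"
    sudo mkswap "$path"
    sudo swapon "$path"
}

# Make a process less (negative score) or more (positive score) likely to be
# chosen by the OOM killer; valid scores range from -1000 to 1000.
set_oom_score_adj() {
    local pid="$1" score="$2"
    echo "$score" | sudo tee "/proc/${pid}/oom_score_adj" > /dev/null
}

# Usage (not run automatically):
#   add_swapfile 2G /swapfile
#   set_oom_score_adj <PID> -500
```

A swap file enabled this way lasts until reboot; adding a line such as `/swapfile none swap sw 0 0` to /etc/fstab makes it permanent.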
High Memory Usage¶
# Check what's using memory
smem -tk
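# Drop page cache, dentries, and inodes (for testing only; the kernel reclaims cache on its own when memory is needed)
sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches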
Service Issues¶
Service Won't Start¶
Diagnosis:
# Check service status
systemctl status service-name
# Check logs
journalctl -u service-name -n 50
# Check config syntax (the flag varies by service, e.g. nginx -t or sshd -t)
service-name --test
Solutions:
# Reload and restart
sudo systemctl daemon-reload
sudo systemctl restart service-name
# Check for port conflicts
sudo ss -tlnp | grep :8080
# Fix permissions
sudo chown -R service-user:service-group /var/lib/service/
Failed Systemd Units¶
# List failed units
systemctl --failed
# Reset failed state
sudo systemctl reset-failed
# Restart failed units
cortex-ops repair services --restart-failed
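Before restarting failed units, it is often worth reading why each one failed. A minimal sketch that prints the last log lines for every failed unit; both function names are hypothetical:

```shell
# Extract unit names from `systemctl --failed --plain --no-legend` output (stdin).
failed_unit_names() {
    awk 'NF { print $1 }'
}

# Show the last 20 log lines for every failed unit.
show_failed_unit_logs() {
    systemctl --failed --plain --no-legend | failed_unit_names |
    while read -r unit; do
        echo "== ${unit} =="
        journalctl -u "$unit" -n 20 --no-pager
    done
}

# Usage (not run automatically):
#   show_failed_unit_logs
```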
Package Manager Issues¶
APT Broken¶
Diagnosis:
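A minimal sketch of typical checks, assuming a dpkg-based system. `unhealthy_packages` is a hypothetical helper that reads the state flags in the first column of `dpkg -l`:

```shell
# Read `dpkg -l` output on stdin and print packages whose state flags show an
# incomplete install or a required reinstall (for example iU, iF, iHR).
unhealthy_packages() {
    awk '$1 ~ /^[uirph][UFHWt]R?$/ || $1 ~ /^[uirph][a-zA-Z]R$/ { print $1, $2 }'
}

# Usage (not run automatically):
#   dpkg -l | unhealthy_packages
#   sudo dpkg --audit     # dpkg's own report of broken packages
#   sudo apt-get check    # verify the dependency tree
```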
Solutions:
# Use cortex-ops (recommended)
cortex-ops repair apt
# Or manually:
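# Clear stale locks (only after confirming no apt or dpkg process is still running)
ps aux | grep -E '[a]pt|[d]pkg'
sudo rm /var/lib/dpkg/lock*
sudo rm /var/lib/apt/lists/lock
sudo rm /var/cache/apt/archives/lock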
# Reconfigure
sudo dpkg --configure -a
# Fix broken dependencies
sudo apt-get install -f
Package Conflicts¶
# Find conflicting packages
apt-cache policy package-name
# Install a specific version (see the candidates listed by apt-cache policy)
sudo apt install package-name=version
# Remove problematic package
sudo dpkg --remove --force-remove-reinstreq package-name
Performance Issues¶
High CPU Usage¶
Diagnosis:
# Find CPU-hungry processes
top -o %CPU
# Check load average
uptime
# Profile specific process
sudo perf top -p <PID>
Solutions:
# Limit process CPU
cpulimit -p <PID> -l 50
# Lower the process priority
renice -n 19 -p <PID>
# Stop a runaway process (try SIGTERM first; use -9 only if it does not exit)
kill <PID>
kill -9 <PID>
System Slow¶
# Check system load
vmstat 1 10
# Check I/O wait
iostat -x 1 5
# Check network
ss -s
# Run full diagnostics
cortex-ops doctor --verbose
LLM Connector Issues¶
Connection Failed¶
Diagnosis:
# Test connector
cortex-ops connectors test openai
# Check that the API key is set (prints only the first 10 characters)
echo "$OPENAI_API_KEY" | head -c 10
# Test API directly
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"
Solutions:
- Verify API key is set correctly
- Check network connectivity to API endpoint
- Verify API key has correct permissions
- Check rate limits
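The checks in this list can be narrowed down by looking at the HTTP status code the API returns. A minimal sketch, assuming `curl` is installed; `explain_status` and `check_openai_key` are hypothetical helpers, and the mapping covers only the common cases:

```shell
# Map an HTTP status code from the API to a likely cause.
explain_status() {
    case "$1" in
        200) echo "ok: key accepted" ;;
        401) echo "unauthorized: key missing or invalid" ;;
        403) echo "forbidden: key lacks permission" ;;
        429) echo "rate limited or quota exceeded" ;;
        000) echo "no response: check network or DNS" ;;
        *)   echo "unexpected status: $1" ;;
    esac
}

# Fetch only the status code of the models endpoint and explain it.
check_openai_key() {
    local code
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
        -H "Authorization: Bearer ${OPENAI_API_KEY}" \
        https://api.openai.com/v1/models)
    explain_status "$code"
}

# Usage (not run automatically):
#   check_openai_key
```

curl reports `000` when no HTTP response was received at all, which separates connectivity problems from key problems.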
Timeout Errors¶
# Increase timeout
export CORTEX_CONNECTORS__TIMEOUT=120
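# Or in config (writing to /etc needs root; skip the first line if a "connectors:" section already exists)
echo "connectors:" | sudo tee -a /etc/cortex/config.yaml
echo "  timeout: 120" | sudo tee -a /etc/cortex/config.yaml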
Getting More Help¶
Generate Support Bundle¶
cortex support-bundle
# This creates a tarball with:
# - System info
# - Logs
# - Configuration (sanitized)
# - Doctor results
Useful Log Locations¶
| Log | Location |
|---|---|
| System | /var/log/syslog |
| Kernel | /var/log/kern.log |
| Auth | /var/log/auth.log |
| Cortex | /var/log/cortex/ |
| Journal | journalctl |
Community Support¶
- Discord: discord.gg/cortexlinux
- GitHub Issues: github.com/cortexlinux/cortex
- Documentation: docs.cortexlinux.com