CVE-2026-0599

7.5 HIGH

📋 TL;DR

This vulnerability in huggingface/text-generation-inference allows unauthenticated attackers to trigger resource exhaustion by exploiting unbounded external image fetching. Attackers can send malicious Markdown image links that cause the system to download large files, consuming memory, CPU, and network bandwidth. All deployments using version 3.3.6 with VLM mode enabled are affected.

💻 Affected Systems

Products:
  • huggingface/text-generation-inference
Versions: 3.3.6
Operating Systems: All operating systems running the affected software
Default Config Vulnerable: ⚠️ Yes
Notes: Vulnerability is triggered in VLM (Vision-Language Model) mode when processing inputs containing Markdown image links. Default deployment lacks memory limits and authentication.

⚠️ Risk & Real-World Impact

🔴

Worst Case

Complete system crash due to memory exhaustion, network bandwidth saturation, and CPU overutilization, potentially causing denial of service and data loss.

🟠

Likely Case

Service degradation or temporary unavailability due to resource exhaustion, requiring system restart and cleanup.

🟢

If Mitigated

Minimal impact if proper resource limits and authentication are configured, though some performance degradation may still occur.

🌐 Internet-Facing: HIGH - Unauthenticated remote exploitation makes internet-facing deployments particularly vulnerable to simple attacks.
🏢 Internal Only: MEDIUM - Internal deployments are still vulnerable but have reduced attack surface compared to internet-facing systems.

🎯 Exploit Status

Public PoC: ❌ No
Weaponized: LIKELY
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

Exploitation requires only sending HTTP requests with malicious Markdown image links. No authentication or special privileges needed.
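The attack vector can be illustrated with a single HTTP request. Everything below is hypothetical: the host, port, and attacker URL are placeholders, and the payload shape follows TGI's `/generate` request format.

```shell
# Hypothetical malicious request against a TGI /generate endpoint.
# In VLM mode the server fetches any image referenced with Markdown
# syntax in the prompt, so one request can force an arbitrarily
# large external download (the URL below is a placeholder).
PAYLOAD='{"inputs":"![](http://attacker.example/huge-file.bin) Describe this image.","parameters":{"max_new_tokens":10}}'
echo "$PAYLOAD"
# Send it (host/port are deployment-specific):
# curl -s http://localhost:8080/generate \
#   -H 'Content-Type: application/json' \
#   -d "$PAYLOAD"
```

Because the default deployment requires no authentication, any client that can reach the endpoint can submit such a prompt.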

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: 3.3.7

Vendor Advisory: https://github.com/huggingface/text-generation-inference/commit/24ee40d143d8d046039f12f76940a85886cbe152

Restart Required: Yes

Instructions:

1. Stop the text-generation-inference service.
2. Update to version 3.3.7 using your package manager or a direct installation.
3. Restart the service.
4. Verify the update was successful.
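For a Docker deployment, the steps above can be sketched as follows; the container name, port mapping, and image tag are assumptions to adapt to your environment:

```shell
# Steps 1-3: pull the patched image and replace the running container
# (shown as comments because they modify a live host):
#   docker pull ghcr.io/huggingface/text-generation-inference:3.3.7
#   docker stop tgi && docker rm tgi
#   docker run -d --name tgi -p 8080:80 \
#     ghcr.io/huggingface/text-generation-inference:3.3.7 \
#     --model-id <your-model>
# Step 4: verify by comparing the reported version against the
# minimum patched release with a version-aware sort.
INSTALLED="3.3.7"   # substitute the version your deployment reports
MINIMUM="3.3.7"
if [ "$(printf '%s\n%s\n' "$MINIMUM" "$INSTALLED" | sort -V | head -n1)" = "$MINIMUM" ]; then
  echo "OK: $INSTALLED >= $MINIMUM"
else
  echo "VULNERABLE: $INSTALLED < $MINIMUM"
fi
```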

🔧 Temporary Workarounds

Disable VLM Mode (applies to: all platforms)

Disable Vision-Language Model mode to prevent image link processing. Set the relevant environment variable or configuration option to turn off VLM features.

Implement Rate Limiting (applies to: linux)

Add rate limiting to prevent repeated exploitation attempts. Configure a reverse proxy (nginx/Apache) with rate-limiting rules.
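The rate-limiting workaround can be sketched as an nginx front end; the zone name, request rate, and upstream address below are illustrative, not prescribed values:

```nginx
# Illustrative nginx rate limit in front of text-generation-inference.
# Zone name, rate, burst, and upstream port are assumptions.
limit_req_zone $binary_remote_addr zone=tgi_ratelimit:10m rate=5r/s;

server {
    listen 80;

    location / {
        limit_req zone=tgi_ratelimit burst=10 nodelay;
        proxy_pass http://127.0.0.1:8080;
    }
}
```

Rate limiting throttles repeated attempts but does not remove the underlying flaw: a single request can still trigger one oversized download.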

🧯 If You Can't Patch

  • Implement strict network egress filtering to block external HTTP requests from the service
  • Deploy resource limits (memory, CPU) and monitoring to detect and mitigate exploitation attempts
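Egress filtering can be sketched with iptables. This is a firewall configuration sketch, not a tested ruleset: it requires root, and the dedicated service user name (`tgi`) and the loopback-only policy are assumptions for your environment.

```shell
# Allow loopback traffic for the service user:
iptables -A OUTPUT -m owner --uid-owner tgi -o lo -j ACCEPT
# Drop all other outbound traffic from the service user, which
# blocks the external image fetches this CVE abuses:
iptables -A OUTPUT -m owner --uid-owner tgi -j DROP
```

Note that a blanket egress block will also break legitimate remote-image use cases, so scope the rules to your deployment's needs.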

🔍 How to Verify

Check if Vulnerable:

Check if running version 3.3.6 of huggingface/text-generation-inference with VLM mode enabled

Check Version:

docker inspect <container_name> | grep -i version, or check the version reported by your package manager

Verify Fix Applied:

Confirm version is 3.3.7 or higher and test that external image fetching is properly bounded
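TGI exposes a GET /info endpoint whose JSON response includes the server version; the host and port below are assumptions for your deployment. The sketch uses a sample response in place of a live one so the parsing and comparison logic is demonstrable:

```shell
# Live check (host/port are deployment-specific):
#   curl -s http://localhost:8080/info
# Sample response stands in for the live one:
INFO='{"model_id":"example/model","version":"3.3.7"}'
VER=$(printf '%s' "$INFO" | sed -n 's/.*"version":"\([^"]*\)".*/\1/p')
# Version-aware comparison against the first patched release:
if [ "$(printf '3.3.7\n%s\n' "$VER" | sort -V | head -n1)" = "3.3.7" ]; then
  echo "patched ($VER)"
else
  echo "vulnerable ($VER)"
fi
```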

📡 Detection & Monitoring

Log Indicators:

  • Outbound HTTP GET requests to external domains that return unusually large responses
  • Memory usage spikes
  • CPU utilization spikes
  • Multiple failed requests due to token limits

Network Indicators:

  • Outbound HTTP traffic to unusual domains
  • Large data downloads from external sources
  • Increased network bandwidth usage

SIEM Query:

source="text-generation-inference" AND http_method="GET" AND (url CONTAINS "http://" OR url CONTAINS "https://") AND bytes_transferred > 1000000
