CRITICAL · Security · AI/ML infrastructure · Pre-auth memory disclosure · Unpatched Windows updater · May 2026 · CVE-2026-7482 · CVE-2026-42248 · CVE-2026-42249 · CVE-2026-5757

Ollama Bleeding Llama and Windows Auto-Update RCE: Patch, Restrict Exposure, Disable Windows Auto-Update

By NewMaxx · May 10, 2026

Cyera Research disclosed CVE-2026-7482 ("Bleeding Llama") on May 5: a CVSS 9.1 heap out-of-bounds read in Ollama's GGUF model loader that lets a remote, unauthenticated attacker leak arbitrary process memory (system prompts, API keys, environment variables, other users' conversation data) in three HTTP calls. Cyera estimates roughly 300,000 Ollama servers are currently internet-exposed. The fix shipped February 25 in Ollama 0.17.1, but per Echo (the third-party CNA that finally assigned the CVE on April 28), the original 0.17.1 release notes did not flag it as a security update, so many operators never realized they should upgrade. Separately, Striga and CERT Polska published CVE-2026-42248 and CVE-2026-42249 on April 29: a path-traversal plus no-op-signature-verification chain in Ollama's Windows auto-updater that produces persistent silent code execution at every login. That chain has no published fix; no public release notes or advisory indicate a fix in v0.23.0 through the current v0.23.2, and Striga confirmed the vulnerable indicators through v0.22.0. The vendor acknowledged Striga's report in January and then went silent through the standard 90-day disclosure window.

Immediate actions

1. Upgrade Ollama to 0.17.1 or later (current stable is 0.23.2) on every instance.

2. Bind to 127.0.0.1 or firewall port 11434; put an authentication proxy in front of any instance reachable beyond the local host. Upstream Ollama has no auth by default.

3. Rotate any secrets that were ever in the Ollama process environment if a pre-0.17.1 server has been network-reachable.

4. On Windows: disable auto-download updates in Ollama Settings and remove the Ollama shortcut from the Startup folder until the vendor publishes a fix for CVE-2026-42248 / CVE-2026-42249.
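On a systemd-managed Linux host, actions 1 and 2 can look like the following sketch. The drop-in path assumes the layout created by the standard Linux install script, and the firewall rule assumes ufw; both are assumptions to adapt to your service manager and firewall.

```shell
# Hedged sketch: pin Ollama to loopback via a systemd drop-in, then
# block external traffic to port 11434. Paths assume the standard
# Linux install script; adjust for your distribution.
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/bind-local.conf >/dev/null <<'EOF'
[Service]
Environment="OLLAMA_HOST=127.0.0.1:11434"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama

# Belt and suspenders: deny inbound 11434 at the host firewall (ufw shown).
sudo ufw deny in to any port 11434 proto tcp
```

If remote clients genuinely need the API, leave the bind on loopback and publish it only through an authenticating proxy, per action 2.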

CVE-2026-7482
Affected: Ollama < 0.17.1, all platforms. Fixed: yes, in 0.17.1 (Feb 25).
Action: Upgrade to 0.17.1+. Restrict network exposure. Rotate secrets if previously reachable.

CVE-2026-42248 / CVE-2026-42249
Affected: Ollama for Windows, 0.12.10 through current (v0.23.2). Fixed: no published fix.
Action: Disable auto-download updates. Remove the Ollama shortcut from the Windows Startup folder.

CVE-2026-5757
Affected: Ollama quantization engine (CERT/CC, separate finding). Fixed: unclear; whether 0.17.1's bounds-checking fix also addresses this report is not publicly established.
Action: Upgrading to 0.17.1+ is the operator's best available action.

Unauthenticated API (design default)
Affected: All Ollama instances reachable beyond 127.0.0.1. Fixed: not a CVE; documented behavior.
Action: Operator-side: bind to 127.0.0.1, firewall port 11434, or front with an auth proxy (Tailscale, Cloudflare Access, OAuth proxy, basic-auth reverse proxy).

Framing note: two separate vulnerability classes in the same product, disclosed in the same window, with the same vendor-responsiveness pattern (slow CVE assignment on Bleeding Llama, no response on the Windows chain). This bulletin treats them as one operator response because most affected operators are on the same Ollama install they have not been told to upgrade. The third issue, CERT/CC VU#518910 / CVE-2026-5757 (Jeremy Brown), describes the same class of OOB read-and-write in Ollama's quantization engine; whether 0.17.1's fix addresses it is not publicly clarified.

Why this is a drop-everything bulletin for AI/ML operators

Three things stack:

1. Bleeding Llama is unauthenticated, three HTTP calls, no user interaction, and Cyera estimates ~300,000 servers exposed. The vulnerable endpoints (/api/blobs, /api/create, /api/push) have no authentication in the upstream distribution, and the documented OLLAMA_HOST=0.0.0.0 binding puts them on every reachable interface.

2. The patch has been available since February 25 in v0.17.1, but it was not flagged as a security release. Operators who have not upgraded since then are still vulnerable, and "still on a 2025 Ollama" is common because Ollama's release pace makes minor version drift cheap to ignore.

3. The Windows auto-update chain has no published fix on any current release: release notes for v0.23.0 through v0.23.2 do not mention the CVEs, no vendor advisory has been issued, and the vendor stopped responding to Striga's reports in January. The only operator-side mitigation is to turn off auto-update and break the silent on-login execution path.

If you run Ollama anywhere reachable by anything you do not control, treat the network exposure and the install posture as the immediate response, not the patch alone.

Affected Versions, Fix Status, and Endpoints
CVE-2026-7482 "Bleeding Llama"
Class: Heap out-of-bounds read in the GGUF model loader during quantization (fs/ggml/gguf.go, server/quantization.go, WriteTo()).
Mechanism: An attacker-supplied GGUF file declares a tensor offset and size exceeding the actual file length; the server reads past the allocated heap buffer; the leaked bytes are written into the resulting model file and exfiltrated via /api/push.
Affected: All Ollama versions before 0.17.1 (Feb 25, 2026).
Fixed in: 0.17.1 and later. Current stable: 0.23.2.
CVSS: 9.1 CRITICAL.
Reachability: /api/blobs, /api/create, and /api/push are unauthenticated in the upstream distribution. Default bind is 127.0.0.1, but OLLAMA_HOST=0.0.0.0 is widely used (Cyera estimates ~300,000 exposed instances).
Disclosed: May 5, 2026 (Cyera). CVE assigned April 28 by Echo CNA after MITRE non-response.

CVE-2026-42248 / CVE-2026-42249 (Windows updater chain)
Class: CVE-2026-42248: missing signature verification (CWE-494); verifyDownload() on Windows is literally `return nil`. CVE-2026-42249: path traversal in the installer-staging path constructed from attacker-controlled HTTP ETag and Content-Disposition headers (CWE-22).
Chain: An attacker who controls the update server response supplies an ETag with ../ sequences and arbitrary payload bytes; the payload lands in the Windows Startup folder and runs at every subsequent login. Windows-only; macOS uses code-signing verification on the bundle.
Affected: Ollama for Windows. CERT Polska tested 0.12.10 through 0.17.5 end-to-end with the Striga PoC. Striga separately confirmed the four chain indicators (filepath.Join(UpdateStageDir, etag, ...), the no-op verifyDownload, the OLLAMA_UPDATE_URL override, and the STARTF_TITLEISLINKNAME hidden-mode detection) in every release tag from 0.12.10 through 0.22.0, with no commits in that range touching the vulnerable functions.
Fix: No published fix as of May 10, 2026. Release notes for v0.23.0, v0.23.1, and v0.23.2 (current stable, May 7) do not mention these CVEs, and no advisory has been published. Treat as unpatched on every current release until the vendor states otherwise.
CVSS 4.0: 7.7 HIGH each.
Disclosed: April 29, 2026 (Striga / CERT Polska), after a 90-day disclosure window with no further vendor response.

CVE-2026-5757 / CERT/CC VU#518910 (separate finding)
Reporter: Jeremy Brown, AI-assisted research.
Class: OOB heap read/write in the same quantization engine; the mechanism overlaps Bleeding Llama.
Status: CERT/CC published April 22 with no vendor coordination and no confirmed patch. Whether 0.17.1's bounds-checking fix also addresses this report is not publicly established by either CERT/CC or the vendor.

Vulnerable endpoints (Bleeding Llama three-call chain):
1. POST /api/blobs/sha256:<hash> (upload crafted GGUF)
2. POST /api/create (trigger quantization; OOB read fires)
3. POST /api/push (exfiltrate to attacker registry)
Sources: Cyera Research (CVE-2026-7482 / Bleeding Llama), Striga and CERT Polska (CVE-2026-42248 / CVE-2026-42249), CERT/CC VU#518910 (CVE-2026-5757). Version ranges for the Windows updater chain are CERT Polska's tested range (0.12.10 through 0.17.5) plus Striga's static-analysis confirmation that the same vulnerable code is present through 0.22.0 with no remediating commits. Treat all Ollama for Windows installs from 0.12.10 forward as potentially affected by the updater chain until the vendor publishes a fix.

Priorities, by deployment

Triage on three axes: is the Ollama API reachable from anywhere outside the local host, is the install on Windows with auto-update enabled (the default), and is the install on a version older than 0.17.1. Many operators will have all three.

Highest: Ollama server reachable from the public internet (any version, any OS)
This is the configuration in Cyera's ~300,000 number. Any pre-0.17.1 server here is leaking heap memory to anyone who knows the three-call chain. Even post-patch, an unauthenticated /api/push still lets anyone upload models to your server and trigger model creation. Apply network restrictions within hours: bind to 127.0.0.1, firewall port 11434 from external traffic, and put an authentication proxy (Tailscale, Cloudflare Access, an OAuth proxy, or a basic-auth reverse proxy) in front. Then upgrade to 0.23.2 or later. Rotate any secrets that were ever in the Ollama process environment.

Higher: Ollama server reachable from a corporate / lab / shared LAN (any version)
Cyera's three-call chain works from anywhere routable to the API. A compromised endpoint, a guest VLAN with a routing mistake, an insider, a CI runner with broad egress, or a rogue script on a developer laptop can all reach the API. CVSS 9.1 does not change because the network is "internal." Apply the same actions as the public-internet tier; the only difference is your timeline.

Higher: Ollama for Windows, any version 0.12.10 or later, with auto-update enabled (default)
The Striga / CERT Polska updater chain is unpatched on every current release. An attacker who can influence Ollama's update response (TLS interception, DNS hijack with forged cert, hosts-file edit from a prior local foothold, or environment-variable write to OLLAMA_UPDATE_URL) gets persistent silent code execution at every login as the user running Ollama. The compromise survives normal Ollama updates and is invisible to Ollama's own bookkeeping. Disable auto-download updates in Settings and remove the Ollama shortcut from the Windows Startup folder.

Medium: Ollama bound to 127.0.0.1 only, on a developer workstation, pre-0.17.1
Local-only exposure plus a heap-disclosure bug means risk depends on what else runs on the host. Browsers, IDEs, and any local tool with arbitrary HTTP-request capability can reach 127.0.0.1:11434. Cyera specifically calls out agentic-tooling integrations configured to route tool output through Ollama (custom agent harnesses, LangChain-style pipelines, IDE integrations, or anything explicitly proxying Claude Code or similar through a local Ollama endpoint) as a way for sensitive content to flow into the same heap that Bleeding Llama leaks. Upgrade to 0.23.2 anyway; the patch has been available since February.

Lower: Ollama for macOS or Linux, current version, bound to 127.0.0.1, no shared workstation users
The Bleeding Llama patch shipped February 25 in 0.17.1; you are likely already past it on a current install. The Windows updater chain is Windows-specific, so this configuration is unaffected by it. Confirm the version is >= 0.17.1, confirm the bind address, and move on.

Categorization source: this bulletin's framing, based on Cyera's CVSS 9.1 score and exposure estimate and on Striga / CERT Polska's confirmation that the Windows updater chain is unpatched on every current release.


Am I affected?

1. What version of Ollama am I running?

ollama --version
# or, if the server is running but the CLI is on a different host:
curl -s http://<ollama-host>:11434/api/version

If the version is below 0.17.1, the host is vulnerable to Bleeding Llama. If the version is 0.17.1 or later, the Bleeding Llama heap-OOB-read is patched, but unauthenticated API access remains the design default and the Windows auto-update chain (if applicable) remains unpatched.
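As a quick scripted check, the reported version can be compared against the 0.17.1 threshold with `sort -V`. A minimal sketch, assuming you have already extracted a bare X.Y.Z string from `ollama --version` or the /api/version response:

```shell
#!/bin/sh
# Hedged sketch: succeeds (exit 0) if the given Ollama version predates
# the 0.17.1 Bleeding Llama fix. Input is assumed to be a bare X.Y.Z
# string already extracted from `ollama --version` or /api/version.
vulnerable_to_bleeding_llama() {
  ver="$1"
  # sort -V orders version strings numerically; if $ver sorts first
  # and is not exactly 0.17.1, it is older than the fix.
  lowest=$(printf '%s\n0.17.1\n' "$ver" | sort -V | head -n1)
  [ "$lowest" = "$ver" ] && [ "$ver" != "0.17.1" ]
}

vulnerable_to_bleeding_llama "0.16.3" && echo "0.16.3: vulnerable"
vulnerable_to_bleeding_llama "0.23.2" || echo "0.23.2: patched"
```

This only covers the Bleeding Llama threshold; the Windows updater chain has no safe version to compare against.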

2. Is the API reachable beyond localhost?

Check what address Ollama is bound to:

# On the Ollama host, find the listener:
ss -tlnp | grep 11434
# Or, from another host on your network:
curl -s -o /dev/null -w "%{http_code}\n" http://<ollama-host>:11434/api/version

A 200 from a remote host means you are reachable from at least that subnet. To check public-internet exposure, consult your asset inventory or run an external scan against port 11434. Cyera explicitly recommends Shodan as an audit input. The Ollama process environment is reachable through Bleeding Llama, so anything in OLLAMA_HOST, OLLAMA_ORIGINS, and any other environment variables (including unrelated secrets shared with the shell that started the daemon) is within the scope of the leak.
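For a scripted sweep of your own address space, the remote-host curl above generalizes to a loop. A sketch; the hostnames in HOSTS are hypothetical placeholders for your inventory:

```shell
#!/bin/sh
# Hedged sketch: probe a list of candidate hosts for a reachable
# Ollama API. HOSTS is a placeholder; substitute your own inventory
# or a subnet expansion (e.g. bash's 10.0.0.{1..254}).
probe_ollama() {
  # Prints the HTTP status code; "000" means no connection.
  curl -s -m 2 -o /dev/null -w '%{http_code}' "http://$1:11434/api/version" || true
}

HOSTS="ollama-dev.internal ollama-prod.internal"  # hypothetical names
for host in $HOSTS; do
  if [ "$(probe_ollama "$host")" = "200" ]; then
    echo "Ollama API reachable: $host"
  fi
done
```

Run this from outside the subnet you are auditing; a sweep from the Ollama host itself only proves local reachability.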

3. (Windows only) Is auto-update enabled and is Ollama in the Startup folder?

Auto-update is on by default. Confirm in the Ollama Settings UI (system tray icon, Settings, look for "Auto-download updates"). To check the Startup folder:

Get-ChildItem "$env:APPDATA\Microsoft\Windows\Start Menu\Programs\Startup"

A first-run install creates an Ollama shortcut here by default; that shortcut is what triggers the silent on-login update routine that the path-traversal chain abuses. The shortcut itself is not the vulnerability; the vulnerability is that every Ollama startup from this folder runs any installer staged by the (broken) updater. Removing the shortcut blocks the silent execution path even if the updater were tricked into staging a malicious binary in %LOCALAPPDATA%\Ollama\updates\.

4. (Windows only) Can I see whether OLLAMA_UPDATE_URL is set?

# User scope:
[Environment]::GetEnvironmentVariable("OLLAMA_UPDATE_URL", "User")
# Machine scope (requires admin):
[Environment]::GetEnvironmentVariable("OLLAMA_UPDATE_URL", "Machine")

A non-empty value is suspicious unless you set it yourself. The Striga PoC documents this variable as the cleanest demonstration path for the chain because it bypasses the need for TLS interception. If you find an unexpected override pointing at a non-Ollama domain or an HTTP (not HTTPS) URL, treat the host as potentially compromised: something with environment-write capability has run on the machine.


Response, by priority

If the Ollama API is reachable from any network beyond localhost (Highest / Higher)

  1. Restrict network exposure first, patch second. Patching to 0.17.1+ closes the heap-OOB-read but does not change the fact that /api/create and /api/push are unauthenticated in the upstream distribution. An unauthenticated /api/push is its own problem (anyone can upload arbitrary models to your server and consume disk and compute), and any future bug class in Ollama's parsers becomes another Bleeding Llama on a network-reachable instance. Bind to 127.0.0.1 (set OLLAMA_HOST=127.0.0.1:11434 and restart the service), or firewall port 11434 to allow only known client subnets, or both. If remote access is operationally required, put an authentication proxy in front: Tailscale, Cloudflare Access, an OAuth2 proxy, or basic-auth on a TLS-terminating reverse proxy. The "no auth by default" posture is documented behavior, not a bug, and operators carry the responsibility to add it. Common remote-access patterns that bypass localhost binding without solving the auth gap: Docker port forwards from 0.0.0.0, Kubernetes services exposing 11434 via NodePort or LoadBalancer, ngrok / cloudflared tunnels stood up for a demo and never torn down. Audit each before declaring the API closed.
  2. Upgrade to Ollama 0.23.2 or later. The Bleeding Llama patch is in 0.17.1, but if you are upgrading anyway, take the current stable. Cyera's three-call chain relies on the OOB read in WriteTo(); that path now validates tensor element counts against actual buffer sizes before quantization. After the upgrade, confirm the version of the running server (curl -s http://127.0.0.1:11434/api/version), not just the binary on disk: a new binary does not help if the running process is still the old one, so restart the service.
  3. Assume environment variables, system prompts, and recent conversation data have been disclosed if your server has been internet-reachable on a pre-0.17.1 version for any meaningful window. Cyera's exploitation chain is silent (the server logs no errors and does not crash), so absence of evidence is not evidence of non-exploitation. Treat any secrets that were in the Ollama process environment as potentially compromised: rotate API keys, tokens, model-registry credentials, and any third-party service credentials used by integrations that ran through Ollama. Audit recent /api/push calls in your reverse-proxy or network logs for pushes to unexpected hostnames; the data exfil step requires pushing to an attacker-controlled registry, so any /api/push to a hostname you do not recognize is a worth-investigating signal.
  4. Audit agentic-tooling integrations that route through Ollama. Cyera flags this specifically: any integration explicitly configured to route tool outputs through Ollama (custom agent harnesses, LangChain-style pipelines, IDE integrations, or anything proxying Claude Code or similar through a local Ollama endpoint) writes those outputs into the same process heap that Bleeding Llama leaks. File contents, repository data, intermediate code, and any data the agent has read pass through. If those flows ran on a pre-0.17.1 server during the disclosure window, treat the contents as potentially leaked. (Plain Claude Code installations not configured to route through Ollama are not in scope.)
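The /api/push audit in step 3 can be scripted against reverse-proxy access logs. A sketch, assuming a combined-format log ($1 = client IP, $6 = quoted method, $7 = request path); adjust the field positions to your proxy's format:

```shell
#!/bin/sh
# Hedged sketch: surface Bleeding Llama-shaped activity in a
# combined-format access log. Every /api/push is printed, and any
# client that hit all three chain endpoints (/api/blobs, /api/create,
# /api/push) is flagged as a full-chain candidate.
audit_chain() {
  awk '
    $7 ~ /^\/api\/blobs/  { blobs[$1] = 1 }
    $7 ~ /^\/api\/create/ { create[$1] = 1 }
    $7 ~ /^\/api\/push/   { push[$1] = 1; print "push from", $1, "at", $4 }
    END {
      for (ip in push)
        if ((ip in blobs) && (ip in create))
          print "FULL CHAIN candidate:", ip
    }
  ' "$1"
}
```

Any /api/push from a client you do not recognize is worth investigating on its own; the full-chain match is the stronger signal. This only sees traffic that crossed the proxy, so it cannot clear a server that was directly reachable.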

If you run Ollama for Windows (Higher, in addition to the network steps above)

  1. Disable auto-download updates in Ollama Settings. This is the single most important Windows-side action and is currently the vendor's only mitigation for the updater chain. Open the Ollama tray icon, Settings, and toggle off "Auto-download updates." Per Striga, this short-circuits the periodic background check before any update response is fetched, so the path-traversal write never happens regardless of what an attacker's update server returns. A non-tray-UI alternative: set the registry or per-user settings file to disable auto-update; Ollama's settings storage is at %LOCALAPPDATA%\Ollama\app.json on current builds, but the supported method is the Settings UI toggle.
  2. Remove the Ollama shortcut from the Windows Startup folder. This breaks the silent on-login execution route that runs any pending installer invisibly:
    Remove-Item "$env:APPDATA\Microsoft\Windows\Start Menu\Programs\Startup\Ollama.lnk" -ErrorAction SilentlyContinue
    Without the Startup-folder shortcut, Ollama no longer auto-runs on login, which means the STARTF_TITLEISLINKNAME hidden-mode detection never triggers and DoUpgradeAtStartup never runs. Start Ollama manually when you actually need it. Even if the (broken) updater were somehow tricked into staging a malicious installer, no Startup-folder shortcut means no silent execution.
  3. Inspect the updater staging directory and Startup folder for unexpected executables. The chain's artifact is a file written outside the legitimate staging directory. Spot-check both:
    # Legit staging directory (look for staged installers
    # Ollama is unaware of, e.g. files Ollama did not create):
    Get-ChildItem "$env:LOCALAPPDATA\Ollama\updates\" -Recurse -ErrorAction SilentlyContinue
    
    # Startup folder (look for any .exe, not just Ollama-named):
    Get-ChildItem "$env:APPDATA\Microsoft\Windows\Start Menu\Programs\Startup\" -Filter *.exe
    Any executable in the Startup folder that is not something you put there is a finding. The Striga PoC writes OllamaSetup.exe there, but a real attacker would pick a less obvious name. Treat unknown executables in Startup as IR-level findings, not as something to delete-and-move-on; preserve a copy and check $env:LOCALAPPDATA\Ollama\ logs for the time of the write.
  4. Confirm OLLAMA_UPDATE_URL is unset (per the diagnostic above). If it is set to anything other than the Ollama default, find out who set it. A non-default value is the simplest path for the chain because it removes the need for TLS interception.
  5. Watch for a vendor patch. The Windows updater chain has been disclosed publicly since April 29 with no vendor response in the 90-day window. The bulletin will update if and when Ollama ships a Windows build that fixes the no-op signature check and the path-traversal write. Until then, do not re-enable auto-update.

Architectural takeaway

Local-AI runtimes inherit the same trust assumptions desktop apps had in the early 2000s, with one important difference: they bind a high-value parser (model loaders that consume attacker-influenceable binary formats) to a network port that the documentation encourages users to expose. Ollama's defaults (no auth, 127.0.0.1 by default but OLLAMA_HOST=0.0.0.0 documented as the standard remote-access pattern) make the network-reachable configuration the path of least resistance, and the ~300,000 exposed servers Cyera counted are the predictable result. This is the same shape as the broader pattern this bulletin has been tracking through 2026: the xinference framework supply-chain compromise (April), the Lightning PyPI package compromise (April), and now Ollama. AI/ML infrastructure is being pulled into production deployments faster than its security posture matures. Operator-side compensation (network restriction, authentication proxies, egress controls, secrets hygiene) is the realistic baseline until upstream defaults catch up.

The Windows-updater finding is a separate but related pattern: software that ships its own updater inherits the full responsibility of a code-signing trust chain, and cutting corners on Windows specifically (where the macOS build does the right thing) is a recurring failure mode. The DAEMON Tools supply-chain compromise (May 5) is a different mechanism but the same outcome: the binary that runs on user machines is not the binary the vendor signed. Trust-chain failures are a 2026 theme, not a 2026 anomaly.

And the meta-pattern in this story is the disclosure experience itself. Cyera's CVE took three months and a third-party CNA (Echo) to assign because MITRE did not respond, and the original 0.17.1 release notes did not flag the fix as security-relevant. Striga's reports went to the Ollama security contact in January and were met with silence. CERT Polska had to take over coordination. CERT/CC published VU#518910 with no vendor coordination at all. The operator-visible result is that even a fix that exists may not be a fix the operator knows about, and operators with auto-update enabled on Windows are auto-updating into a runtime whose updater is the bug. Anchor your scanner coverage and your patch-tracking against authoritative CVE records (Echo / CERT Polska / CERT/CC) rather than vendor release notes for products with this disclosure pattern.

Caveats and unknowns

At time of writing: Cyera described the three-call exploitation mechanism in its public writeup but did not publish a working PoC. CISA had not added CVE-2026-7482 or the CVE-2026-42248 / 42249 chain to the KEV catalog as of publication; monitor the KEV catalog for change. Cyera describes the vulnerability as "immediately and broadly exploitable" but does not confirm in-the-wild exploitation against specific victims, and there are no public IOCs for Bleeding Llama. The relationship between CVE-2026-7482 (Cyera / Bleeding Llama, fixed in 0.17.1) and CVE-2026-5757 (Jeremy Brown / CERT/CC VU#518910, no confirmed patch) is not publicly clarified by either the vendor or the reporters; the bulletin does not assert they are the same bug. The Windows updater chain has no vendor patch as of May 10; the bulletin will update if and when one ships.

One-line takeaway

If you run Ollama: upgrade to 0.23.2 (or at minimum 0.17.1) to close Bleeding Llama, bind to 127.0.0.1 or put an auth proxy in front of every instance reachable beyond the local host, rotate any secrets that were ever in the Ollama process environment, and on Windows additionally disable auto-download updates and remove the Ollama shortcut from the Startup folder until the vendor ships a fix for CVE-2026-42248 and CVE-2026-42249. Authentication is not a default in upstream Ollama; treat that as the operator's responsibility, not the vendor's eventual roadmap.

Primary source: Cyera Research