A joint SentinelLABS and Censys study found an unmanaged, publicly accessible layer of Ollama deployments spanning 175,108 hosts across 130 countries, with a persistent core of roughly 23,000 hosts generating the majority of observed activity. Nearly half of hosts advertise tool-calling and roughly a fifth support vision, while the ecosystem converges on a small set of model families and the Q4_K_M 4-bit quantization format. The result is a brittle monoculture with governance gaps that complicate attribution and defense. #Ollama #Q4_K_M
Keypoints
- Scanning over 293 days produced 7.23 million observations across 175,108 unique Ollama hosts in 130 countries and 4,032 ASNs, revealing a measurable public surface of self-hosted LLM instances.
- The ecosystem is bimodal: a persistent backbone (~23,000 hosts) generates ~76% of observations while a large transient layer contains many single-observation hosts that contribute little activity.
- Model adoption is concentrated: Llama, Qwen2, and Gemma2 occupy the top ranks consistently, with broad convergence on 4-bit quantization (Q4_K_M appears on 48% of hosts and 72% of quantizations are 4-bit).
- Capability surface is substantial: ~48% of hosts advertise tool-calling, 38% show [completion, tools] capability, 22% support vision, 26% run “thinking” models, and at least 201 hosts use standardized “uncensored” prompts.
- Infrastructure spans hyperscalers, independent VPS providers, and residential/telecom networks (fixed-access telecoms account for 56% by ASN type), with geographic concentration (e.g., Virginia in the US, Beijing in China); 16–19% of infrastructure classifications return null.
- Security risks include resource hijacking, excessive agent-like capability enabling prompt injection and execution, identity laundering via residential IPs, systemic concentration risk from monoculture, and governance inversion that reduces centralized enforcement options.
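The bimodal split described in the key points can be made concrete with some back-of-the-envelope arithmetic. The sketch below uses only the figures quoted above; the backbone size (~23,000) and its share of observations (~76%) are approximate, and treating every non-backbone host as part of the transient layer is an assumption made for illustration, not a classification taken from the study.

```python
# Figures quoted in the key points; backbone size and share are approximate.
TOTAL_OBSERVATIONS = 7_230_000
TOTAL_HOSTS = 175_108
BACKBONE_HOSTS = 23_000   # "~23,000 hosts"
BACKBONE_SHARE = 0.76     # "~76% of observations"


def observations_per_host(observations: float, hosts: int) -> float:
    """Average number of scan observations attributed to each host."""
    return observations / hosts


# Assumption: every host outside the backbone belongs to the transient layer.
backbone_obs = TOTAL_OBSERVATIONS * BACKBONE_SHARE
transient_obs = TOTAL_OBSERVATIONS - backbone_obs
transient_hosts = TOTAL_HOSTS - BACKBONE_HOSTS

backbone_rate = observations_per_host(backbone_obs, BACKBONE_HOSTS)
transient_rate = observations_per_host(transient_obs, transient_hosts)
overall_rate = observations_per_host(TOTAL_OBSERVATIONS, TOTAL_HOSTS)

print(f"backbone:  {backbone_rate:.1f} observations per host")
print(f"transient: {transient_rate:.1f} observations per host")
print(f"overall:   {overall_rate:.1f} observations per host")
```

Under these assumptions a backbone host is observed roughly 239 times on average, against roughly 11 for a host in the transient layer, which is the order-of-magnitude gap the "bimodal" label refers to.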
MITRE Techniques
- No MITRE ATT&CK techniques are explicitly referenced in the article.
Indicators of Compromise
- [IP/Port] exposure configuration examples – 127.0.0.1:11434 (default local bind), 0.0.0.0 (public bind allowing internet exposure)
- [Model/Artifact] deployed model families and packaging – Llama, Qwen2, Gemma2, and Q4_K_M quantization (plus ~17 other top model families observed)
- [ASN/Geolocation] infrastructure attribution and concentration – 4,032 ASNs observed, notable regional concentrations such as Virginia (US) and Beijing (China), and 16–19% of classifications returned null
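The difference between the two bind addresses listed under exposure configuration can be checked mechanically. The following is a minimal sketch using Python's standard `ipaddress` module; the function names and the three category labels are hypothetical, it handles IPv4 `host[:port]` strings only, and a bind to all interfaces only leads to internet exposure if no firewall or NAT stands in the way.

```python
import ipaddress
from typing import Tuple

DEFAULT_PORT = 11434  # default Ollama port, as cited in the indicators above


def split_host_port(bind: str) -> Tuple[str, int]:
    """Split an IPv4 'host[:port]' string; fall back to the default port."""
    host, sep, port = bind.rpartition(":")
    if not sep:
        return bind, DEFAULT_PORT
    return host, int(port)


def classify_bind(bind: str) -> str:
    """Classify a bind address by how widely it listens (illustrative labels)."""
    host, _port = split_host_port(bind)
    addr = ipaddress.ip_address(host)
    if addr.is_loopback:
        return "local-only"          # e.g. 127.0.0.1: reachable from the host itself
    if addr.is_unspecified:
        return "all-interfaces"      # 0.0.0.0: listens on every network interface
    return "specific-interface"      # bound to one concrete address


for candidate in ("127.0.0.1:11434", "0.0.0.0"):
    print(candidate, "->", classify_bind(candidate))
```

A check like this could run as part of a configuration audit to flag instances that have moved off the loopback default, though it says nothing about whether the port is actually reachable from outside.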