Researchers Find Widespread Exposure of Internet-Facing LLMs

Open-source large language models running outside commercial platforms have quietly become a stable layer of internet-facing infrastructure. At scale, they are now being indexed, scanned, and reused in patterns consistent with earlier waves of exposed services such as mail relays, databases, and CI/CD systems.

Their security risk is not theoretical. These deployments offer programmable language generation that can be redirected into phishing, fraud, and automation-driven abuse as soon as an endpoint is reachable.


What Does the Data Show?

The most complete dataset currently available comes from a long-running joint investigation by SentinelOne and Censys, spanning 293 days of continuous observation. Their analysis identified roughly 175,000 internet-reachable hosts associated with self-hosted LLM infrastructure, most commonly deployed using Ollama.

Within that population, researchers observed a stable core of approximately 23,000 systems that remained consistently reachable and active over time. Across the full dataset, they recorded more than seven million discrete observations, which suggests repeated discovery, reuse, or probing rather than one-off exposure.

Independent measurements align closely with those figures. Separate scans conducted by Censys identified over ten thousand high-confidence exposed Ollama instances, spread across more than one thousand autonomous systems, with roughly one quarter operating on non-default ports. That last detail matters; it indicates that exposure is not limited to naïve defaults and includes hosts that were at least partially customized, yet still left open.

Cisco researchers using Shodan-style discovery reached similar conclusions, identifying over one thousand exposed Ollama servers, with roughly one fifth actively serving models in a way that allowed unauthenticated interaction.

These figures are not inferred estimates. In many cases, researchers were able to interact directly with the exposed APIs and confirm that they would respond to prompts from anywhere on the internet.
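Defenders can run the same kind of check against their own hosts. The sketch below is a minimal probe, assuming the standard Ollama HTTP API on its default port 11434, where `/api/tags` lists installed models and answers without credentials on an open instance:

```python
import json
import urllib.error
import urllib.request

OLLAMA_PORT = 11434  # Ollama's default API port


def looks_like_open_ollama(status: int, body: bytes) -> bool:
    """Heuristic: an unauthenticated Ollama /api/tags response is HTTP 200
    with a JSON object containing a "models" list."""
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and isinstance(payload.get("models"), list)


def check_host(host: str, port: int = OLLAMA_PORT, timeout: float = 3.0) -> bool:
    """Return True if host:port answers /api/tags like an open Ollama server."""
    url = f"http://{host}:{port}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return looks_like_open_ollama(resp.status, resp.read())
    except (urllib.error.URLError, OSError):
        return False
```

A host that returns a model list here is, by the criteria used in these scans, an exposed instance; anything that scanners like Censys can fingerprint this way, attackers can too.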


Guardrails Are Often Absent or Intentionally Removed

One of the more concerning technical findings involves system prompts. System prompts define how a model behaves, what it will refuse, and what categories of output are restricted.

Across the exposed population, researchers were able to observe system prompts in roughly 25% of deployments. Of those visible prompts, 7.5% explicitly permitted or failed to restrict harmful activity, including content generation tied to fraud, harassment, or abuse. The real number is likely higher, given that three quarters of observed systems did not expose their prompts at all.

More troubling is that many of these unsafe configurations were not accidental. Researchers documented hundreds of deployments where default safety mechanisms shipped with open-source models had been deliberately removed or weakened. That places these systems in a different category than simple misconfiguration; they are purpose-built to operate without content constraints.
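Operators can audit their own deployments for this: Ollama's `/api/show` endpoint returns a model's configured system prompt, which can be screened for guardrail removal. A rough sketch follows; the keyword list is purely illustrative, not a real policy:

```python
# Illustrative keywords only; a real audit would apply a richer policy check.
RISK_KEYWORDS = ("no restrictions", "ignore safety", "uncensored")


def extract_system_prompt(show_response: dict) -> str:
    """Pull the system prompt from a parsed Ollama /api/show response,
    which includes a "system" field alongside the raw Modelfile."""
    return (show_response.get("system") or "").strip()


def flag_unsafe(system_prompt: str) -> bool:
    """Crude heuristic: flag prompts that explicitly disable guardrails."""
    lowered = system_prompt.lower()
    return any(keyword in lowered for keyword in RISK_KEYWORDS)
```

The same visibility the researchers relied on cuts both ways: if a scanner can read your system prompt, so can anyone validating your endpoint for resale.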


Geographic Distribution and Enforcement Gaps

The exposed infrastructure is globally distributed, though not evenly. Roughly 30% of observed hosts were located in China, with about 20% in the United States, and the remainder spread across Europe, Southeast Asia, and other regions.

This distribution complicates remediation. Takedown authority, hosting norms, and enforcement capability vary widely by jurisdiction. Once an open model is downloaded and deployed, the originating lab no longer has direct technical control, and in many cases, no practical visibility at all.

Governance experts have pointed out that this does not eliminate responsibility. Model developers still shape downstream behavior through documentation, defaults, and deployment guidance, and once unsafe deployment patterns are published, they tend to replicate quickly.


From Exposure to Adversary Use

Security researchers operating AI-focused honeypots have recorded tens of thousands of attack sessions in a matter of weeks, specifically targeting exposed LLM endpoints. That volume only appears once tooling has been automated and shared.

The patterns mirror earlier cloud abuse cycles. Attackers do not need to compromise the host in a traditional sense. They only need to find a reachable endpoint that will do work on their behalf. That work can include:

  • Generating phishing lures tuned to specific industries or languages
  • Producing large volumes of spam or scam copy
  • Supporting disinformation campaigns with rapid iteration
  • Acting as a content engine inside a larger intrusion workflow

Once attackers identify a responsive host, it becomes reusable infrastructure. From their perspective, it is free compute that does not trigger the safeguards applied by major AI platforms.

That transition from reachable service to reusable infrastructure is no longer theoretical. It is now observable in live, monetized campaigns.


Operation Bizarre Bazaar

In late January 2026, researchers at Pillar Security published findings from a campaign they named Operation Bizarre Bazaar, documenting what appears to be the first large-scale, commercially monetized LLMjacking operation.

Between December 2025 and January 2026, Pillar’s AI-focused honeypots captured approximately 35,000 attack sessions targeting exposed LLM and Model Context Protocol (MCP) endpoints. The activity was sustained, systematic, and clearly operational rather than exploratory.

The campaign followed a structured supply chain. Distributed scanning infrastructure identified exposed AI endpoints, including unauthenticated Ollama instances, vLLM servers, and publicly accessible MCP services. Once identified, a validation phase tested model availability, response quality, and authentication behavior. Endpoints that passed validation were then monetized through a resale platform operating under the silver.inc brand.

That platform marketed itself as a unified LLM API gateway, reselling discounted access to more than thirty LLM providers without authorization. Access attempts typically followed public scan visibility by only a few hours, indicating active monitoring of internet-wide discovery data.
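That validation phase leaves traces in ordinary web-server access logs. The sketch below assumes Common Log Format and the standard Ollama API paths, and surfaces requests consistent with endpoint probing:

```python
import re
from typing import Iterator, Tuple

# Common Log Format: ip - - [time] "METHOD path HTTP/x" status size
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3})')

# API paths a validation probe would hit on an Ollama server.
PROBE_PATHS = ("/api/tags", "/api/show", "/api/generate", "/api/chat")


def probe_hits(lines) -> Iterator[Tuple[str, str]]:
    """Yield (client_ip, path) for requests touching Ollama API endpoints."""
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue
        ip, _method, path, _status = match.groups()
        if any(path.startswith(p) for p in PROBE_PATHS):
            yield ip, path
```

Given the few-hour window between public scan visibility and access attempts that Pillar observed, reviewing these hits promptly matters more than reviewing them thoroughly.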

Beyond compute theft, the findings highlight broader organizational risk. Compromised LLM endpoints can expose sensitive data held in context windows, including source code, customer conversations, and internal documentation. Exposed MCP servers extend the risk further, acting as pivot points into file systems, databases, cloud APIs, and container orchestration environments.

By late January, roughly 60% of observed attack traffic shifted toward MCP-focused reconnaissance, suggesting parallel campaigns oriented toward lateral movement rather than resale. A single exposed MCP endpoint can bridge directly into internal infrastructure, turning AI integrations into entry points.

Taken together, Operation Bizarre Bazaar provides a concrete example of what large-scale exposure data has been signaling for months. Open-source LLM deployments are no longer just being found. They are being validated, reused, and sold as infrastructure.


How Can Netizen Help?

Founded in 2013, Netizen is an award-winning technology firm that strengthens organizations by delivering cybersecurity capabilities that improve visibility, response, and resilience across modern environments. In the context of SOC-as-a-Service, our mission is centered on helping government, defense, and commercial clients build incident readiness without the burden of standing up a full in-house SOC. Our team develops and supports advanced monitoring, detection, and response solutions that give customers the level of coverage and operational structure they need to protect their networks, identities, and cloud workloads.

Our “CISO-as-a-Service” offering already demonstrates how we extend executive-level expertise to organizations that need high-end guidance without internal hiring. The same principle applies to our SOC; Netizen operates a state-of-the-art 24x7x365 Security Operations Center that provides continuous monitoring, alert triage, detection engineering, incident response coordination, and threat hunting for clients that require dependable coverage. These services support the readiness goals outlined in this article by improving early detection, reducing breakout time, and offering access to specialized analysts and hunters who understand the demands of sensitive and regulated environments.

Our portfolio complements SOCaaS by including cybersecurity assessments and advisory, hosted SIEM and EDR/XDR services, software assurance, penetration testing, cybersecurity engineering, and compliance audit support. This allows organizations to integrate SOCaaS with broader security initiatives such as modernization projects, compliance readiness, and vulnerability management. We specialize in environments where strict standards, technical precision, and operational consistency are mandatory, which makes our team a natural partner for organizations working to raise their detection and response maturity.

Netizen maintains ISO 27001, ISO 9001, ISO 20000-1, and CMMI Level III SVC registrations, reflecting the stability and maturity required for a high-quality SOC operation. As a Service-Disabled Veteran-Owned Small Business certified by the U.S. Small Business Administration, we have been recognized repeatedly through the Inc. 5000, Vet 100, national Best Workplace awards, and numerous honors for veteran hiring, innovation, and organizational excellence.

If your organization is evaluating how to strengthen detection and response capabilities across cloud, AI-enabled, and hybrid environments, Netizen can help. Start the conversation today.

