How edge AI startups are powering the next wave of intelligence

Dec 2, 2025 | By Team SR

In the beginning, most AI systems relied on cloud computing. Data such as images, text, or sensor readings was sent over the internet to powerful servers, processed there, and the results were sent back. The cloud made sense because it offered compute and storage that early devices (like phones or sensors) could never carry on board. However, a cloud-only model had real downsides: delays from sending data back and forth, privacy concerns, and the need for a reliable internet connection.

Faster and smaller processors eventually made it possible to run AI directly on local devices, collectively referred to as "the edge." Edge AI means the model runs at the edge, wherever the data is generated (e.g., your phone, smartwatch, a car, or even a security camera). It enables decisions to be made instantly without relying on the cloud, which is ideal for applications that require real-time response. It also preserves privacy, since the data never has to leave the device.

Why it matters

Real-time intelligence and low-latency decision-making matter because they let a system respond immediately to events unfolding in its environment. This is critical in scenarios where even the slightest delay could have serious consequences. Consider a self-driving car: the AI driving it cannot wait for data to travel to a far-away server and back before deciding to brake when it detects something in its path. The information must be processed and acted on within milliseconds to avoid an accident. That is exactly what real-time, low-latency decision-making makes possible.
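A back-of-envelope calculation makes the point concrete. The latency figures below are illustrative assumptions, not measurements of any real system:

```python
def distance_travelled_m(speed_kmh: float, latency_ms: float) -> float:
    """Metres a vehicle covers while waiting out a given decision latency."""
    speed_m_per_s = speed_kmh * 1000 / 3600   # km/h -> m/s
    return speed_m_per_s * (latency_ms / 1000)

speed = 100  # km/h, roughly highway speed

cloud_latency_ms = 200   # assumed round trip to a distant data centre
edge_latency_ms = 20     # assumed on-device inference time

print(f"cloud round trip: {distance_travelled_m(speed, cloud_latency_ms):.1f} m")
print(f"on-device:        {distance_travelled_m(speed, edge_latency_ms):.1f} m")
```

Under these assumed numbers, the car travels several metres further before reacting when the decision has to round-trip through the cloud, which is the difference real-time edge inference is meant to close.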

The same principle applies in other fields. In healthcare, wearables or hospital monitors can leverage AI to detect irregular heartbeats or a sudden drop in oxygen levels and alert healthcare professionals immediately. In manufacturing, machines can detect defects or safety hazards and trigger an automatic shutdown before damage or injury occurs. Even everyday technology, like voice assistants or augmented reality, depends on fast responses to feel seamless rather than slow or glitchy.

Ultimately, real-time intelligence makes AI more reliable, responsive, and safe by closing the gap between perceiving an event and acting on it.

Cloud AI is reaching its limits

Initially, it was entirely reasonable to run AI models in the cloud. It provided massive compute power, software updates were instant, and you could access anything from anywhere. However, with billions of connected devices constantly generating data, the cracks in the cloud are starting to show.

When we talk of the bottlenecks of cloud-first AI we are really referring to the practical constraints of over-relying on geographically distant data centers. 

The first major consideration is the scalability-versus-cost equation. Training and running AI models in the cloud requires enormous compute, and demand for cloud resources grows even faster than the data itself. Every request adds processing and bandwidth costs incurred outside your own data center. This can be a considerable barrier for small businesses and startups, which are forced to trade off compute budgets against the volume of data they process. Even big tech enterprises struggle to keep cloud operations sustainable as demand for infrastructure keeps rising.

Next, countries have different regulations on the processing and storage of data. In some countries, sensitive information (particularly medical or personal data) is required to remain within the country's borders, so sending everything to a cloud potentially hosted a continent away raises legal challenges. Beyond the legal questions, constant data transfer widens the attack surface, whether through breaches or misuse of data.

There is also the environmental and energy angle: data centers require a huge amount of electricity, not just to power their servers but to cool them as well. As AI workloads grow, so does the carbon footprint, adding not only cost but an environmental burden.

Finally, the "speed vs. scale" paradox illustrates why it is becoming critical to move the computing closer to the source. The cloud manages enormous amounts of data (scale), but because it is far away, it raises a certain latency. The edge offers instant responses (speed), but can't handle anything nearly at the massive workloads or scale. And this is why present day AI design is shifting toward generative models: training the existing massive datasets in the cloud, while running the models at the edge for real-time decisions.

The rise of edge AI startups

Startups are shaping how and where AI will operate, as they are more agile and more willing to take risks than big tech companies. Many are finding innovative ways to shrink AI models and make them more specialized, building models that run sustainably on edge devices. This could dramatically change the field.

Startups are also repositioning the purpose of AI. Rather than chasing general-purpose intelligence, they focus on solving very specific problems. For example, some are building AI tools to help farmers assess crop health in the field. Others are creating chips for ultra-efficient AI processing in battery-powered devices. Many more are building privacy-first AI models that process personal data locally and never send it to the cloud. Startups are likely to keep finding such opportunities because they are untethered to legacy systems and aging infrastructure. As hardware continues to improve and algorithms focus on low energy consumption, smaller companies could spur the next wave of breakthroughs.

How edge AI startups are different from Big Tech

Edge AI startups are flexible and hardware-agnostic. In contrast to large technology companies that build around substantial infrastructure and specialized hardware (think massive data centers, proprietary chips, closed ecosystems), edge AI startups take a genuinely hardware-independent approach. They aim to operate across multiple hardware platforms, allowing them to plug into whichever device or sensor is already in place. For example, a startup might write an application that runs on an ARM SoC, x86 CPUs, or other embedded processors, vastly expanding its deployment reach.
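A minimal sketch of the hardware-agnostic idea: application code calls one `infer` interface, and a platform-specific backend is picked at runtime. The backend names and detection logic here are illustrative assumptions, not any vendor's API:

```python
import platform
from typing import Callable

# Each backend is just a callable here; a real system would wrap an ARM NPU
# driver, an x86 inference runtime, and so on.
BACKENDS: dict[str, Callable[[list[float]], list[float]]] = {
    "arm": lambda x: [v * 2 for v in x],      # placeholder for an ARM SoC path
    "x86": lambda x: [v * 2 for v in x],      # placeholder for an x86 path
    "generic": lambda x: [v * 2 for v in x],  # pure-software fallback
}

def select_backend() -> str:
    """Pick a backend from the machine architecture reported by the OS."""
    machine = platform.machine().lower()
    if machine.startswith(("arm", "aarch")):
        return "arm"
    if machine in ("x86_64", "amd64", "i386"):
        return "x86"
    return "generic"

def infer(inputs: list[float]) -> list[float]:
    """Application-facing entry point: same call on every platform."""
    return BACKENDS[select_backend()](inputs)

print(infer([1.0, 2.0, 3.0]))
```

The design choice is that only `select_backend` knows about hardware; everything above the `infer` boundary ports unchanged between devices.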

Rather than trying to do “everything AI” like some larger technology companies, edge AI startups focus on specific use cases and solve real problems under real constraints (low latency, limited connectivity, local regulations). They develop deep domain knowledge and write their software to fit that domain closely. As a result, they ship solutions far more quickly than if they spent years building AI services aimed at a general audience.

Edge AI pioneers rely on open-source software and embedded systems. Since edge devices often have limited compute and power, startups build on open frameworks and embedded hardware (e.g., low-power processors, IoT devices) instead of massive cloud servers. This lets them deploy on-premise, offline, or in remote settings: edge AI use cases emphasise local processing rather than continuous cloud connectivity, which reduces dependencies and costs while adapting to hardware constraints.
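The offline-first pattern behind this can be sketched simply: infer locally, buffer results on the device, and upload only when connectivity returns. The model, threshold, and connectivity flag below are stand-ins:

```python
from collections import deque

buffer: deque = deque()  # results waiting on the device for a connection

def run_local_model(reading: float) -> dict:
    # Stand-in for on-device inference; a real model replaces this threshold.
    return {"reading": reading, "anomaly": reading > 0.8}

def process(reading: float, online: bool) -> list:
    """Infer locally; return whatever gets flushed upstream on this call."""
    buffer.append(run_local_model(reading))
    if not online:
        return []          # offline: results stay buffered on the device
    flushed = list(buffer)  # online: drain the buffer for upload
    buffer.clear()
    return flushed

process(0.5, online=False)        # offline: result is buffered, nothing sent
print(process(0.9, online=True))  # online: both buffered results flush
```

Because inference never waits on the network, the device keeps making decisions through an outage and the cloud simply receives results late rather than blocking them.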

Key edge AI players in the market

Here are three startups that illustrate the differences.

Nodeflux (Indonesia)

Nodeflux offers computer vision (people and vehicle counting, behaviour detection) that runs on multiple kinds of video sources (CCTV, drones, mobile) and is deployed in smart-city contexts in Indonesia (traffic, flooding, etc). They’re hardware-agnostic and use modular analytics - a good example of the new model.

Verdigris Technologies (US)

Verdigris provides AI-powered sensors and software for energy management in commercial/industrial buildings. They combine hardware (sensors) + embedded analytics and focus on a specific vertical (energy/buildings) - different from large tech that might treat buildings as just one of many.

DeGirum (edge AI hardware/SDK startup)

DeGirum recently released a “hardware-agnostic PySDK” aimed at streamlining edge AI development across platforms. This shows the “agnostic” piece in focus: they’re building tools that let developers deploy AI to many types of edge hardware, not locked to one big platform.

Is there investment potential in edge AI startups

Right now, edge AI has quantifiable momentum behind it. On the financial side, both venture and corporate dollars are flowing into startups that move inference onto devices or close to the user. You can see it in new rounds such as Hailo's raise of over $120 million, earmarked for launching an on-device GenAI accelerator for PCs and cars, which pushed its total funding past $340 million at a valuation above $1.2 billion. Investors are betting that more AI will run locally instead of in data centers that may be far away.

Big Tech is investing in edge AI

On the corporate front, major infrastructure players are also realigning around the edge. Akamai launched an "Inference Cloud" jointly with NVIDIA to run low-latency inference at its globally distributed edge, a fairly unambiguous signal that hyperscale compute alone will not serve real-time use cases.

Enterprises and cloud providers are increasingly collaborating with edge specialists, because keeping everything in the cloud runs into latency, bandwidth, sovereignty, and cost barriers. This is why you are seeing first-party "edge zones," 5G-adjacent deployments, and distributed-cloud strategies targeting regulated or low-connectivity environments.

Companies such as Microsoft and Google Cloud have been building and demonstrating these edge footprints (including tactical and public-sector scenarios), while telco-cloud partnerships and CDN players branch out into inference. The consistent theme across these initiatives is simple: real-time experiences require sub-second responses, and moving compute closer reduces round trips and compliance friction.

Is edge AI going to last

Predictions support the shift. Recent market analyses put edge AI at tens of billions of dollars already, growing at around a 20% compound annual growth rate over the next several years. One projection has the total market expanding from about $23 billion in 2025 to the $140-200 billion range by the early-to-mid 2030s, roughly a 21-24% CAGR. Each firm's exact figure differs, but the top-line story is the same: double-digit growth continuing as workloads migrate from the cloud to devices and CDN-like edges.

Recent M&As for edge AI startups

There are also exit signals indicating that the domain is maturing. In smart buildings and industrial IoT, Johnson Controls acquired FogHorn to incorporate an edge AI platform into its OpenBlue offering, a typical instance of a global incumbent buying to accelerate because it needed an on-prem intelligence play.

In the case of on-device AI, Apple’s acquisition of Xnor.ai was an early, symbolic move to get low-power models on devices for privacy and reliability. 

The public markets also reward companies that monetize edge-generated data at scale: Samsara's IPO put its pure-play connected-operations platform on the NYSE, and Mobileye raised the profile of edge-based perceptual computing with its 2022 public listing and declared multibillion-dollar revenue targets.

These are not all the same business model, but together they signal to buyers and public investors that real value exists in intelligence running outside the hyperscale core.

Key takeaways

AI is shifting from cloud computing to edge AI because traditional cloud systems cannot meet real-time demands, costs are rising, and privacy concerns are growing. Edge AI addresses these issues by processing data locally on the device, reducing both latency and dependence on the cloud. Startups are leading the charge because they can stay agile, remain hardware-agnostic, and focus on specific industries such as agriculture or smart cities, often with an open-source approach.

In summary, advances in chips, TinyML, 5G, IoT, and lightweight AI frameworks are fueling an explosion of fast, private, and efficient intelligence at the edge. The sheer volume of activity in this space, from venture funding and corporate investment to enterprise partnerships and acquisitions, shows that edge AI is becoming an essential component of the next decade of technology.
