Shivangi Agrawal

Shivangi Agrawal has 9 years of experience across semiconductor engineering and business strategy, with expertise in cloud and enterprise compute systems. Shivangi specializes in aligning market requirements with technology capabilities, ensuring platforms deliver performance, scalability, and efficiency for next-generation AI workloads. Her passion lies in enabling sustainable innovation that balances advanced AI performance with environmental responsibility.

From LLMs to SLMs in Embedded World

Status: Coming up in April 2026!

Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning, multimodal understanding, and natural-language interaction, but their computational and memory demands place them far beyond the reach of resource-constrained embedded platforms. As embedded systems increasingly require on-device intelligence for tasks such as voice interfaces, anomaly detection, semantic understanding, and autonomous decision-making, there is a growing need to scale language models down without sacrificing essential accuracy, responsiveness, or safety.

This presentation traces the evolution from cloud-scale LLMs to optimized Small Language Models (SLMs) tailored for embedded systems. It examines the algorithmic, architectural, and co-design innovations that make this transition possible, including model compression, quantization, structured pruning, distillation, sparsity-aware compute, and memory-efficient attention mechanisms. It also highlights system-level considerations such as real-time inference, energy constraints, thermal limits, secure deployment, and domain-specific customization.
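To give a flavor of one of the compression techniques listed above, the sketch below shows symmetric per-tensor int8 post-training quantization of a weight matrix, which cuts storage 4x relative to float32 at the cost of a bounded round-off error. This is a minimal illustrative example, not taken from the session itself; the function names and the toy weight matrix are assumptions for demonstration.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    scale = float(np.max(np.abs(w))) / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from the int8 codes."""
    return q.astype(np.float32) * scale

# Toy example: quantize a small random weight matrix and check
# that the reconstruction error stays within half a quantization step.
w = np.random.randn(64, 64).astype(np.float32)
q, scale = quantize_int8(w)
max_err = np.max(np.abs(w - dequantize(q, scale)))
assert max_err <= scale / 2 + 1e-6
```

In practice, production toolchains refine this basic idea with per-channel scales, calibration data for activations, and quantization-aware fine-tuning to recover accuracy, which is where the co-design considerations the session covers come into play.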

The session provides insight into how SLMs enable embedded devices to run meaningful language and reasoning workloads locally, reducing latency, improving privacy, increasing reliability in disconnected environments, and enabling new classes of intelligent edge applications. Representative use cases across automotive, industrial automation, smart IoT, and wearable devices illustrate the emerging potential of deploying compact language models directly at the edge.
