Small AI, Big Impact: Why Size Isn't Everything in the New AI Landscape
July 23, 2024

The landscape of AI models is changing rapidly, and it's not just about making bigger and more powerful models anymore. While massive models like GPT-4 still garner attention, there's a growing focus on smaller, lighter AI models.
Think of AI models like vehicles in a city. Huge models like GPT-4 are akin to buses—powerful but requiring substantial infrastructure. In contrast, smaller AI models are more like electric scooters—light, quick, and ideal for short trips and specific tasks.
My personal experience reflects this shift. Initially, I explored running AI models locally. With a decent hardware setup, I could run models of up to 32 billion parameters, but anything larger was practically impossible. The natural next step was turning to cloud-based GenAI providers like OpenAI and Amazon. Recently, though, smaller models have become powerful enough to handle many essential tasks directly on local devices, which is particularly valuable for sensitive, privacy-related work.
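To make that concrete, here is a minimal sketch of local inference with Hugging Face's transformers library (plus accelerate for device placement). The model ID is an assumption, used purely for illustration: any small instruction-tuned model that fits in your machine's memory works the same way.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: any small instruction-tuned model works here; Gemma 2's
# 2B variant is just an example of a model that fits on modest hardware.
model_id = "google/gemma-2-2b-it"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "In one paragraph, why do small models matter for on-device AI?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Everything below runs on local hardware; no data leaves the machine.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same few lines that struggle with a 70B-parameter model run comfortably when the model is small, which is exactly what makes local deployment practical now.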
Notable examples of smaller models making an impact include:
- Salesforce's xLAM: Nicknamed the "Tiny Giant," it packs just 1 billion parameters yet outperforms much larger models at tasks like function calling (sketched after this list).
- Google's Gemma 2: Compact enough to run effectively on standard devices, enabling rapid iteration and deployment of AI applications.
- Microsoft's Phi: Despite their small parameter counts, the Phi models deliver performance that rivals far larger models, proving that bigger isn't always better.
- Apple's small models: Apple runs compact models on its own silicon, handling queries and tasks locally with speed, security, and privacy.
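Function calling is worth unpacking, since it is where a 1-billion-parameter model like xLAM earns its nickname. The pattern is simple: the model emits structured JSON naming a tool and its arguments, and the application parses and dispatches the call. The sketch below hard-codes the model output to stay self-contained, and the tool name and schema are hypothetical; in practice, the JSON would come from the small local model itself.

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical tool; a real app would call a weather API here.
    return f"Sunny and 22°C in {city}"

# Registry of tools the application exposes to the model.
TOOLS = {"get_weather": get_weather}

# The kind of structured output a function-calling model is trained
# to produce (hard-coded here so the sketch runs on its own).
model_output = '{"tool": "get_weather", "arguments": {"city": "Berlin"}}'

# Parse the model's JSON and dispatch to the matching tool.
call = json.loads(model_output)
result = TOOLS[call["tool"]](**call["arguments"])
print(result)  # -> Sunny and 22°C in Berlin
```

Producing well-formed JSON like this reliably is a narrow, well-defined skill, which is why a carefully trained small model can beat much larger general-purpose ones at it.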
These smaller AI models offer significant advantages for edge computing:
- Reduced latency for real-time applications
- Enhanced data privacy by keeping processing local
- Bandwidth savings by minimizing data transfer to the cloud
- Ability to operate even with limited connectivity
Apple exemplifies this strategy through hardware like its Neural Engine, embedded in iPhones and iPads. It efficiently handles tasks such as facial recognition and voice commands on-device, improving both privacy and responsiveness. Apple Intelligence extends this approach, integrating AI tools directly into the operating system with a strong emphasis on on-device processing.
The future of AI will involve balancing the powerful capabilities of large models with the agility and practicality of smaller, edge-focused solutions. This dual approach isn't just reshaping AI—it's redefining what AI can achieve and how we think about technology.