Understanding the Engine Behind LLMs
Delving beyond buzzwords for effective implementation.
Dr Abiodun B Yusuf - Chief Innovations Officer - Qalhata Tech
8/23/2025 · 2 min read


From Foundation to Conversation: Understanding the Engine Behind LLMs
This is a short primer on why foundational models matter, how LLMs fit in, and what this means for the future of AI engineering and human–machine collaboration.
Introduction
There is little contest to the notion that large language models (LLMs) such as GPT, Claude, and LLaMA now sit at the forefront of what we colloquially call Artificial Intelligence ("AI"), powering everything from conversational agents to creative assistants.
What many don’t realise, however, is that these systems are built atop a more general paradigm of foundational models: massive pre-trained networks that underpin modern AI capabilities.
What Are Foundational Models?
Great question. Foundational models (in effect, learned approximations of reality) are trained on vast and varied datasets and are designed for general-purpose learning across language, vision, code, robotics, and more. Their defining traits are listed below; a short code sketch after the reference illustrates the transferability point in practice.
Scale: Trained on petabytes of data
Adaptability: Flexible across domains and tasks
Transferability: Can be fine-tuned or prompted
Architectures: Transformers, diffusion models, multimodal systems
Bommasani et al., 2021, “On the Opportunities and Risks of Foundation Models” (Stanford): https://arxiv.org/abs/2108.07258
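As a rough illustration of the transferability trait above, here is a minimal sketch, assuming the Hugging Face transformers library and the facebook/bart-large-mnli checkpoint (neither of which the article specifies): the same pre-trained model is reused for a task it was never explicitly trained on, purely through zero-shot prompting.

```python
# Minimal sketch: reusing one pre-trained checkpoint for a new task via
# zero-shot prompting, with no task-specific training.
# Assumes `pip install transformers` (plus a PyTorch or TensorFlow backend).
from transformers import pipeline

# Load a general-purpose pre-trained model once...
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# ...then point it at an arbitrary labelling task at inference time.
result = classifier(
    "The patient reports persistent chest pain and shortness of breath.",
    candidate_labels=["healthcare", "software development", "finance"],
)
print(result["labels"][0])  # highest-scoring label for the sentence
```

Fine-tuning follows the same pattern: the very same checkpoint becomes the starting point for further supervised training on a narrower dataset.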
What Are Large Language Models (LLMs)?
These so-called large language models (named quite literally for the sheer quantity of data needed to make them work) are foundational models that specialise in natural language. They are tuned to understand and generate text, from summarisation to translation to conversation. The paper linked below, which introduced the attention mechanism at the core of these models, is a great deep dive for the interested reader; a toy sketch of that computation follows the link.
Popular examples of LLMs that you are likely to have come across include OpenAI’s GPT‑4, Meta’s LLaMA, Google’s Gemini, and Anthropic’s Claude.
Paper link: Vaswani et al., 2017, “Attention Is All You Need”: https://arxiv.org/abs/1706.03762
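For the curious, here is a toy sketch of the scaled dot-product attention described in that paper, written in plain NumPy purely for illustration (the library choice and toy shapes are assumptions of this post, not anything from the paper’s code): each token’s query is scored against every key, the scores become weights via a softmax, and the weights mix the value vectors.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, as in Vaswani et al. (2017)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted mix of values

# Toy example: 3 tokens with 4-dimensional embeddings.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```

In a real transformer this runs per attention head, over batches, with learned projections producing Q, K, and V, but the core arithmetic is exactly this.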
LLMs vs Foundational Models
Trait         | Foundational Models            | LLMs
Domain        | Multimodal                     | Text/NLP
Architecture  | Varied                         | Transformers
Training      | Unsupervised across domains    | Self-supervised on text
Purpose       | General reasoning/extraction   | Text comprehension & generation
Examples      | CLIP, Whisper, DALL·E          | GPT, Claude, LLaMA
Key Takeaways
LLMs are a specialised subset of foundational models, but the two share the traits that make the paradigm so powerful (a short few-shot prompting sketch follows this list):
Both rely on massive-scale, self-supervised pre-training
Each supports effective transfer via fine-tuning or prompting
Both enable powerful zero-shot and few-shot generalization
Scalability drives capability improvements
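To make the zero-/few-shot point concrete, here is a minimal sketch, again assuming the Hugging Face transformers library and using the small gpt2 checkpoint purely as a placeholder (an instruction-tuned LLM would do far better): a few worked examples in the prompt steer the model toward the desired behaviour without updating a single weight.

```python
# Few-shot prompting sketch: the "training signal" lives entirely in the prompt.
# Assumes `pip install transformers`; gpt2 stands in for a modern LLM.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Translate English to French.\n"
    "sea otter -> loutre de mer\n"
    "cheese -> fromage\n"
    "bread ->"
)

# The model is asked to continue the pattern set by the in-context examples.
output = generator(prompt, max_new_tokens=5, do_sample=False)
print(output[0]["generated_text"])
```

Nothing about the model changed between tasks; only the prompt did, which is precisely what makes prompting such a cheap form of transfer.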
Conclusion
Human–Machine Collaboration & AI Engineering
LLMs form the backbone of human–machine interaction in fields like software development, healthcare, and strategic operations. The rise of AI Engineering is redefining how we build, govern, and monitor these systems at scale.
Understanding foundational models—and where LLMs fit within them—is key to designing AI technologies that are robust, ethical, and impactful.
This insight not only benefits developers and product leaders but also educators, policymakers, and those shaping the future of AI.