- Encoder vs Decoder: Understanding BERT, GPT, and Modern LLM Architectures
  A deep dive into encoder-only, decoder-only, and encoder-decoder architectures, and how models like BERT, GPT, and BART differ.
- A Deep Dive into Attention: Self-Attention, Multi-Head Attention, and Positional Encoding
  A comprehensive guide to attention mechanisms in Transformers, covering the intuition behind queries, keys, and values (QKV), self-attention, multi-head attention, and positional encoding.
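  The scaled dot-product self-attention that the article covers can be sketched in a few lines of NumPy. This is a minimal illustration, not the article's code: the projection matrices and input sizes here are made-up toy values.

  ```python
  import numpy as np

  def self_attention(X, Wq, Wk, Wv):
      """Scaled dot-product self-attention over a sequence X."""
      # Project the inputs into queries, keys, and values.
      Q, K, V = X @ Wq, X @ Wk, X @ Wv
      d_k = Q.shape[-1]
      # Similarity of every query with every key, scaled by sqrt(d_k).
      scores = Q @ K.T / np.sqrt(d_k)
      # Softmax over the key axis: each row becomes attention weights.
      weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
      weights /= weights.sum(axis=-1, keepdims=True)
      # Each output is a weighted mix of the value vectors.
      return weights @ V

  rng = np.random.default_rng(0)
  X = rng.normal(size=(4, 8))            # 4 tokens, model dimension 8 (toy sizes)
  Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
  out = self_attention(X, Wq, Wk, Wv)
  print(out.shape)  # (4, 8): one contextualized vector per token
  ```

  Multi-head attention simply runs several such projections in parallel and concatenates the results.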
- Transformer Architecture Explained: Attention Is All You Need
  A deep dive into the Transformer architecture, including the encoder-decoder structure, the attention mechanism, positional encoding, and multi-head attention.
- How Do LLMs Work? Understanding Next-Token Prediction
  An accessible yet in-depth explanation of how Large Language Models generate text by repeatedly predicting the most probable next token.
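  The core loop the article describes, choosing the most probable next token from a distribution over the vocabulary, can be sketched as follows. The vocabulary and probabilities here are invented for illustration; real models produce these distributions from billions of parameters.

  ```python
  # Toy greedy decoding step: given P(next token | "The cat sat"),
  # pick the highest-probability token. Values are made up.
  vocab = ["the", "cat", "sat", "on", "mat"]
  probs = [0.05, 0.10, 0.15, 0.60, 0.10]
  next_token = vocab[max(range(len(vocab)), key=lambda i: probs[i])]
  print(next_token)  # "on"
  ```

  An LLM repeats this step, appending each chosen token to the context, until it emits an end-of-sequence token.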
- What Is Generative AI? Types and Architectures
  An overview of Generative AI, including text, image, and audio generation, and the architectures behind each.