Langformers Blog

BPE Tokenizer: Training and Tokenization Explained

large language models

In my previous blog post, I explained in detail how large language models (LLMs) work, using analogies and maths. I received many requests for a simpler explanation of how a tokenizer is trained for such models, and how tokenization works. Hence, this blog post.

How LLMs Work: A Beginner's Guide to Decoder-Only Transformers

large language models

A language model like GPT (which stands for Generative Pretrained Transformer) takes text, breaks it into tokens (words or subwords), converts those tokens into numbers, processes those numbers through layers of Transformer decoders, and finally outputs a probability distribution over all possible tokens in its vocabulary. It then selects the

large language models

What is RAG? A Beginner's Guide to Retrieval-Augmented Generation

Not long ago, all large language models (LLMs) had what we called a knowledge cutoff. This meant they only knew information up until a certain date, and they simply couldn’t help you with anything that happened after that. Today, that’s changed, at least for cloud-based LLMs like ChatGPT. These models

embedding models

Sentence Embeddings & Similarity: Explained Simply

Have you ever wondered how computers understand that two sentences mean the same thing—or are similar—even if they don't use the same words? Well, that's the magic of sentence embeddings and cosine similarity. Let’s dive in, ditch the jargon, and keep it simple.

Ollama and Local LLMs: Step-by-step Guide

large language models

Everyone loves ChatGPT, right? And hey, shoutout to DeepSeek Hero too. But what about the unsung heroes of the AI world—those open-source large language models (LLMs) quietly powering innovation behind the scenes? While models like Meta’s LLaMA might ring a bell for some, there’s a whole family

Fixed-size, Semantic and Recursive Chunking Strategies for LLMs

large language models

In modern Retrieval-Augmented Generation (RAG) systems, handling large documents efficiently is critical. Embedding models have token limits that, when exceeded, can lead to incomplete processing or errors. This is where chunking becomes critical. By breaking large documents into smaller, manageable pieces, chunking ensures that information remains accessible, relevant, and optimized

embedding models

Build your own custom, lightweight transformer with Langformers

In the world of machine learning, bigger isn't always better — especially when it comes to deploying models in resource-constrained environments like mobile apps or real-time systems. Fortunately, Langformers offers an elegant solution: you can train a smaller model to mimic the embeddings of a large pretrained model. This

embedding models

Reranking Sentences for Improved Semantic Search

When working with search systems, getting the most relevant results is crucial — yet traditional vector search methods don’t always guarantee this. That’s where reranking comes into play, and with Langformers, implementing reranking has never been easier. In this article, we’ll discuss: * What reranking is * Why it’s

embedding models

Semantic Search with Vector Databases (FAISS, ChromaDB, Pinecone)

In today's data-driven world, finding relevant information quickly is crucial. Traditional keyword-based search engines often fall short when you need more context-aware retrieval. That’s where semantic search steps in — and Langformers makes it easier than ever to set up your own semantic search engine in just a

encoder-only models

Sentence Embedding and Similarity with Langformers

When working with semantic search or designing sophisticated NLP pipelines such as RAG (Retrieval-Augmented Generation), converting text into numerical vectors—a process called embedding—is often one of the very first steps. Embeddings capture the meaning and context of sentences in a way that machines can understand, enabling powerful applications

encoder-only models

Pretrain Your Own RoBERTa model from Scratch

Masked Language Models (MLMs) like BERT, RoBERTa, and MPNet have revolutionized the way we understand and process language. These models are foundational for tasks such as text classification, named-entity recognition (NER), and many other NLP applications where the entire input sequence matters. But what if you want to create your

encoder-only models

Fine-Tune Hugging Face Models for Text Classification

Training text classifiers has traditionally required a lot of setup, coding, and hyperparameter tuning. But thanks to Langformers, you can now fine-tune powerful Transformer models (encoder-only models / masked language models) like BERT, RoBERTa, or MPNet for your custom classification tasks with just a few lines of code — all while keeping

Data Labelling Using LLMs with Langformers

large language models

When most people think of Large Language Models (LLMs), they think of conversations, content generation, or summarization. But LLMs are also incredibly effective at data labelling — and now, with Langformers, you can easily utilize that power for your text labelling tasks. Whether you're preparing training data, building a

Build Your Own LLM-Powered Apps with Langformers: LLM Inference API

large language models

Today, we're excited to showcase another powerful feature of Langformers — LLM Inference via a REST API. Whether you’re building a chatbot, automating workflows, or adding AI magic to your product, Langformers gives you everything you need to bring large language models (LLMs) into your application stack effortlessly.

Chat with Open-Source LLMs Locally (Llama, Deepseek, Mistral)

large language models

Large Language Models (LLMs) are everywhere — and everyone’s using them. But there’s a big issue emerging: people are feeding sensitive emails and company data into cloud-based LLM services like ChatGPT, DeepSeek, and others. Just to draft a few emails or generate some paragraphs, confidential information is getting handed

A step-by-step forward pass and backpropagation example

There are multiple libraries (PyTorch, TensorFlow) that can assist you in implementing almost any neural network architecture. This article is not about solving a neural net using one of those libraries. Instead, in this article, we'll see a step-by-step forward pass (forward propagation) and backward pass (backpropagation) example.

Understanding the Gradient Descent Algorithm: The simplest way

According to Wikipedia, "Gradient descent is a first-order iterative optimization algorithm for finding local minima of a differentiable function". Sounds like a lot, right? In this article, let's get acquainted with the gradient descent algorithm in the most straightforward (and "simplest") way. Before we continue

PyTorch and Tensors fundamentals

PyTorch is a deep learning framework that significantly simplifies the process of writing and training deep neural networks. It supports many architectures, from shallow ones to deep ones like transformers. I mean, any neural network architecture you can think of. On the other hand, tensors are fundamental data structures in

Norms in Linear Algebra

This article is part three of the Linear Algebra series. Part 1: Scalars, Vectors, Matrices and Tensors Part 2: Basic Operations on Tensors Putting it in simple terms, norms in linear algebra let us measure how big a tensor is. In machine learning, tensors (vectors, matrices, etc.) are heavily

Basic Operations on Tensors

This article is part two of the Linear Algebra series. The first part gave a basic introduction to the fundamentals of Linear Algebra: scalars, vectors, matrices and tensors. The article also discussed scalars being 0th order tensors, vectors being 1st order tensors and matrices being 2nd order tensors. In this

Scalars, Vectors, Matrices and Tensors

A solid understanding of linear algebra is essential for better comprehension and effective implementation of neural networks. Getting started with deep learning libraries such as TensorFlow and PyTorch in a "black-box" manner is relatively easy, thanks to the abundance of tutorials and guides available. However, to truly grasp