The Core of Attention is Communication

Over the past year, perhaps the most cited paper across the software industry has been "Attention Is All You Need", which is at the heart of ChatGPT and other GPT transformer models. The first thing you will notice in the paper is the attention formula: $$\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$$ Unfortunately, very few sources have delved into …
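As a rough illustration of the formula quoted in this excerpt, here is a minimal sketch of scaled dot-product attention in plain Python. It is not taken from the post itself: the function names `softmax` and `attention` and the list-of-rows representation of Q, K and V are assumptions made for this example, and it ignores batching, masking and multiple heads.

```python
import math


def softmax(row):
    # Subtract the max before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Q, K and V are lists of row vectors (hypothetical layout for this sketch).
    K and V must have the same number of rows.
    """
    d_k = len(K[0])
    scale = math.sqrt(d_k)
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in K]
        weights = softmax(scores)
        # Each output row is a weighted average of the value vectors.
        out.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return out
```

For instance, a query that is equally similar to two keys receives the plain average of the two value vectors, since the softmax weights are then both 0.5.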

Rust and Node.js: Harmonizing Performance and Safety

Prelude: In the Rust world, the interaction between Python and Rust is well known through the excellent PyO3 ecosystem. A similar relationship exists between Rust and JavaScript, in particular Node.js, which I'm going to describe in this post. All the code is available here. Most programming language interactions happen through a C-layer ABI, i.e. …

Notes on the Current State of LLM Frameworks

This post tries to shed some light on the rapidly changing LLM frameworks, in particular LangChain (LC) and Llama-index (LI). Library vs. Framework: It's tricky to draw a clear boundary between a package/library and a framework, but for the sake of discussion, let's look at some well-known examples. Packages: NumPy falls into this category. It …

From Machine Learning to Formal Math and Type Theory

The idea for this post was sparked by the new paper Developing Bug-Free Machine Learning Systems with Formal Mathematics. I have had the idea of writing about what you're going to read for a long time, and this paper happily forced me to finally do it! The first and final parts are about my journey and …