Blobs (The Raw Bytes)
The most fundamental object in Git is the Blob. It stores the contents of a file and nothing else. Here is the secret: a Blob does not know its own filename.
Stripped Metadata
If you take an image named `puppy.jpg` and copy it to `dog.jpg`, a typical filesystem stores the bytes twice. In Git, both files contain exactly the same raw bytes, so they hash to the same ID and Git stores a single Blob object for both. The filenames play no part in this. Identical file contents are deduplicated this way across your entire repository, automatically.
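To make the point concrete, here is a minimal sketch of how a blob's ID is derived in a SHA-1 repository: Git hashes the header `blob <size>`, a NUL byte, and then the raw content. The helper name `blob_id` and the placeholder image bytes are assumptions made for illustration; only the hashing scheme comes from Git itself.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the ID a SHA-1 Git repository gives a blob with this content."""
    # The header holds the object type and the byte length, never a filename.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical stand-in for the image data; any bytes behave the same way.
image_bytes = b"\xff\xd8\xff\xe0 pretend JPEG data"

working_dir = {"puppy.jpg": image_bytes, "dog.jpg": image_bytes}
ids = {name: blob_id(data) for name, data in working_dir.items()}

print(ids["puppy.jpg"] == ids["dog.jpg"])  # True: same bytes, same blob
```

Because the filename never enters the hash, renaming or copying a file cannot produce a new blob. Only a change to the bytes can.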
Working Directory (What you see)
The Tree Object
A Tree is Git's version of a directory listing. It maps human-readable filenames to the hashes of the Blobs that hold their contents (and maps subdirectory names to other Trees).
The Blob Objects
Blobs strip away filenames. If you have 5,000 files with the exact same content, Git only stores ONE blob.
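The split between Trees and Blobs can be sketched with two dictionaries: one keyed by filename, one keyed by content hash. This is a simplified model, not Git's storage format. Real tree entries also carry a file mode, and Trees are hashed objects themselves. The names `blobs`, `tree`, and `add_file` are hypothetical.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """ID of a blob in a SHA-1 repository: hash of 'blob <size>\\0' + content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


blobs = {}  # blob ID -> raw bytes (no filenames stored here)
tree = {}   # filename -> blob ID (no file data stored here)


def add_file(name: str, content: bytes) -> None:
    """Record a file: store its bytes once, then point the name at them."""
    oid = blob_id(content)
    blobs[oid] = content  # writing the same ID again changes nothing
    tree[name] = oid


for i in range(5000):
    add_file(f"file_{i}.js", b"console.log('hello');")

print(len(tree), len(blobs))  # 5000 1
```

Five thousand names end up pointing at a single stored blob, which is the deduplication described above.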
Try It Out
In the interactive simulation above, notice the two files: `index.js` and `utils.js`. They both have the exact same contents: `console.log('hello');`
If you look at the BLOB OBJECTS section on the bottom right, you will see that Git created only ONE blob! Git completely ignores your filename when saving file data. Hover over that blob to see which files are sharing it.
Testing Deduplication
In the simulation, try deleting the word 'hello' from `index.js`. The moment the contents change, Git creates a completely new Blob object down below!
Reading a Blob
We can use Git plumbing commands such as `git cat-file -p <hash>` to inspect a Blob if we know its hash. Since blobs only contain data and no filenames, the output is just the file's raw contents.
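In a real repository `git cat-file -p <hash>` does this for you. As a rough sketch of what happens underneath for a loose object, assuming a SHA-1 repository: the file under `.git/objects/` is the zlib-compressed header plus content, named after its own hash. The functions `write_blob` and `read_blob` are hypothetical helpers, and a temporary directory stands in for `.git/objects` so nothing touches a real repository.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_blob(objects_dir: Path, content: bytes) -> str:
    """Store content as a loose blob object and return its ID."""
    store = b"blob " + str(len(content)).encode("ascii") + b"\0" + content
    oid = hashlib.sha1(store).hexdigest()
    # Loose objects live at <first 2 hex chars>/<remaining 38 hex chars>.
    path = objects_dir / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return oid


def read_blob(objects_dir: Path, oid: str) -> bytes:
    """Return the raw bytes of the blob with the given ID."""
    raw = zlib.decompress((objects_dir / oid[:2] / oid[2:]).read_bytes())
    header, _, content = raw.partition(b"\0")
    kind, size = header.split(b" ")
    if kind != b"blob" or int(size) != len(content):
        raise ValueError("not a well-formed blob object")
    return content


with tempfile.TemporaryDirectory() as tmp:
    objects_dir = Path(tmp)  # stand-in for .git/objects
    oid = write_blob(objects_dir, b"console.log('hello');")
    print(read_blob(objects_dir, oid).decode())  # console.log('hello');
```

Nothing in the stored object names `index.js` or `utils.js`. The only metadata is the type and the length.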