This content originally appeared on DEV Community and was authored by Mukesh
I'm starting a journey of building projects and learning important topics along the way, while grinding DSA on LeetCode and studying system design. My first project is building Git from scratch. Git, a distributed version control system, is one of the most used tools in the developer community, and popular platforms like GitHub and GitLab are built around it. Why do developers use it? Because it solves a major problem: letting different people contribute code to a single repo. Without Git, we would probably pass code to each other and coordinate tasks by hand, or keep a real-time connection to a main server, which is a lot of pain. It is the core of open source contribution. So now you can see how important this system is.
I wondered what project would be tough enough for me to learn from and would also help me understand one of the tools I use. The obvious one was Git. So I will share the problems I encountered and how I solved them.
To make it easier, I searched for some structured learning to start with and came across CodeCrafters. It was structured, and it made me research topics on my own while also explaining the low-level stuff. You could learn from it too. I plan to build some decent projects on this platform in the first 30 days. If you want to try it, use this link: https://app.codecrafters.io/r/graceful-worm-869324 . It gives you 1 week of premium and access to all the steps of any low-level project like Git, Redis, an interpreter, Kafka, and many more.
The first step was to choose a language, and I chose JavaScript. I would normally pick C++ for low-level projects, but I need a refresher. I chose simplicity over complexity because it's a personal project. If it were a project that I or someone else would actually use, C++ would be way better. I got a repo link, cloned the repo, and started by uncommenting some basic code for the CLI. I looked at that code for 30 minutes and searched for every part, like fs, zlib, and how the command is read from the process arguments. Later I understood that it was the git init command, which creates the directories Git needs.
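To make the init step concrete, here is a minimal sketch of what such a function might look like. It is not the CodeCrafters starter code: the name `initRepository`, the `root` parameter, and the default branch name `main` are assumptions made for illustration.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: lays out a bare-bones .git directory under `root`.
function initRepository(root) {
  const gitDir = path.join(root, ".git");
  // `recursive: true` also creates .git itself and does not fail if it exists.
  fs.mkdirSync(path.join(gitDir, "objects"), { recursive: true });
  fs.mkdirSync(path.join(gitDir, "refs"), { recursive: true });
  // HEAD is a plain text file; "main" as the branch name is an assumption.
  fs.writeFileSync(path.join(gitDir, "HEAD"), "ref: refs/heads/main\n");
  return gitDir;
}
```

Calling `initRepository(process.cwd())` would create the layout in the current directory. Real Git writes more than this (a config file, hooks, and so on), so treat it as the smallest layout that the later steps rely on.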
Under the hood, Git keeps its data in a .git directory that starts with three entries: HEAD, objects, and refs. HEAD is a file that contains the reference to the branch you are currently working on. The objects directory stores the data, and refs stores pointers to commits for branches and tags. I learned about objects in detail today. The objects directory is used as the core database. It uses SHA-1 by default (newer versions can optionally use SHA-256) to name each stored file after a hash of its contents. A SHA-1 hash is 40 hexadecimal characters: the first two characters are the folder name and the remaining 38 are the file name inside that folder. Git stores a blob as blob <size>\0<content>, compressed with zlib.
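The naming scheme described above can be sketched in a few lines. The helper names `hashBlob` and `objectPath` are hypothetical, and the sketch assumes the SHA-1 object format.

```javascript
const crypto = require("crypto");
const path = require("path");

// Hypothetical helper: the SHA-1 hex digest of "blob <size>\0<content>".
function hashBlob(content) {
  const body = Buffer.from(content);
  // The size in the header is the content length in bytes, not characters.
  const header = Buffer.from("blob " + body.length + "\0");
  return crypto
    .createHash("sha1")
    .update(Buffer.concat([header, body]))
    .digest("hex");
}

// Hypothetical helper: the first two hex characters name the folder,
// the remaining 38 name the file inside it.
function objectPath(gitDir, hash) {
  return path.join(gitDir, "objects", hash.slice(0, 2), hash.slice(2));
}

const hash = hashBlob("hello world\n");
console.log(hash);
console.log(objectPath(".git", hash));
```

The printed hash should match what `git hash-object` reports for a file with the same contents, which is an easy way to check the header format.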
The second step was to implement the git cat-file command. With the -p flag, it prints the content of an object as a string. I added the command and some extra arguments for the flag and the hash.
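const fs = require("fs");
const path = require("path");
const zlib = require("zlib");

// process.argv[0] is node and process.argv[1] is the script path,
// so the user's arguments start at index 2.
const command = process.argv[2];
const flag = process.argv[3];
const hash = process.argv[4];

switch (command) {
  case "init":
    createGitDirectory(); // from step 1, defined elsewhere in the file
    break;
  case "cat-file": // step 2
    if (flag === "-p" && hash) {
      readBlob(hash);
    } else {
      console.error("Usage: node app/main.js cat-file -p <hash>");
    }
    break;
  default:
    throw new Error(`Unknown command ${command}`);
}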
It was a simple task. The real challenge was understanding the internals, the functions within fs and zlib, and what data would be stored in that object. I spent nearly an hour coding this part. I searched the documentation, asked AI (I used AI to understand topics, not to do all the work for me), and experimented with the output I got. When I checked the logs, I understood that it was giving me back a Buffer instead of a string. I converted that to a string and printed only the content part of the data.
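function readBlob(hash) {
  // The first two characters of the hash are the folder, the rest is the file name.
  const dir = hash.slice(0, 2);
  const file = hash.slice(2);
  const objectPath = path.join(process.cwd(), ".git", "objects", dir, file);

  if (!fs.existsSync(objectPath)) {
    console.error("Object not found at path:", objectPath);
    process.exit(1);
  }

  // Objects are zlib-compressed on disk; inflateSync returns a Buffer.
  const compressedData = fs.readFileSync(objectPath);
  // Converting to a string is fine for text blobs.
  const data = zlib.inflateSync(compressedData).toString();

  // Skip the "blob <size>" header, which ends at the first NUL character.
  const idx = data.indexOf("\0");
  process.stdout.write(data.substring(idx + 1));
}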
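The Buffer-versus-string point can also be checked in memory, without touching a repository. The sketch below is an illustration rather than part of the challenge: `parseObject` is a hypothetical helper, and it splits on the NUL byte before converting anything to a string, so the body would survive even if it were binary.

```javascript
const zlib = require("zlib");

// Hypothetical helper: inflate a stored object and split header from body.
function parseObject(compressed) {
  const raw = zlib.inflateSync(compressed); // a Buffer, not a string
  const nul = raw.indexOf(0);               // the NUL byte that ends the header
  return {
    header: raw.subarray(0, nul).toString(), // e.g. "blob 12"
    body: raw.subarray(nul + 1),             // kept as a Buffer
  };
}

// Build a compressed blob the way it would sit on disk, then read it back.
const stored = zlib.deflateSync(Buffer.from("blob 12\0hello world\n"));
const result = parseObject(stored);
console.log(result.header);          // blob 12
console.log(result.body.toString()); // hello world
```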
I learned a lot, felt proud of myself for a moment, and moved on. This was real growth: I completed all of this in 2 hours. Then I solved some DSA problems and learned about Docker. The best insight of the day was how a CLI (command line interface) works under the hood. If this helped you in any way, feel free to follow!