Andrej Karpathy Just 10x’d Everyone’s Claude Code — Transcript

Learn how Andrej Karpathy’s LLM knowledge base system revolutionizes AI memory and organization using simple markdown files in just 5 minutes.

Key Takeaways

  • Karpathy’s LLM wiki system makes AI knowledge persistent and organized, unlike ephemeral chats.
  • The setup is simple, fast, and requires only markdown files, no complex tech stack.
  • It enables efficient querying and discovery of relationships across large datasets like YouTube transcripts.
  • Token usage and cost efficiency are significantly improved by compacting scattered files into a wiki.
  • The system is customizable and scalable, suitable for personal and professional knowledge management.

Summary

  • The video demonstrates organizing YouTube transcripts into a knowledge system using LLMs and markdown files.
  • Andrej Karpathy’s approach uses Claude Code and Obsidian to auto-ingest, organize, and link data without manual effort.
  • The system creates backlinks and indexes for videos, tools, techniques, and concepts, enabling efficient querying.
  • This method transforms ephemeral AI chats into a persistent, compounding knowledge base.
  • Setup requires no complex infrastructure, vector databases, or embeddings—just folders with markdown files.
  • The knowledge base can be customized and expanded, serving as a personal or professional second brain.
  • Token efficiency is greatly improved, reducing usage by up to 95% in some cases.
  • The approach is gaining traction on social media as a game-changer for AI agentic software development.
  • Users can easily add new content and update the knowledge base continuously.
  • Obsidian is recommended as a front-end tool to visualize relationships within the knowledge system.

Full Transcript

00:00
Speaker A
What you're looking at right here is 36 of my most recent YouTube videos organized into an actual knowledge system that makes sense. And in today's video, I'm going to show you how you can set this up in 5 minutes. It's super,
00:10
Speaker A
super easy. You can see here how we have these different nodes and different patterns emerging. And as we zoom in, we can see what each of these little dots represents. So, for example, this is one of my videos, $10,000 agentic workflows.
00:22
Speaker A
We can see it's got some tags, it's got the video link, it's got the raw file, and it gives an explanation of what this video is about and what the takeaways are. And the coolest part is I can
00:31
Speaker A
follow the backlinks to get where I want. There's a backlink for the WAT framework. There's a backlink for Claude Code. There's a backlink for all these different tools I mentioned like Perplexity, Visual Studio Code, Nano Banana, and n8n. It also has
00:43
Speaker A
techniques like the WAT framework or bypass permissions mode or human review checkpoint. So, as this continues to fill up, we can start to see patterns and relationships between every tool or every skill or every MCP server that I
00:55
Speaker A
might have talked about in a YouTube video, and I can just query it in a really efficient way now that we have this actual system set up. And the crazy part is I said, "Hey, Claude Code, go grab the transcripts from my recent
01:06
Speaker A
videos and organize everything." I literally didn't have to do any manual relationship building here. It just figured it all out on its own. And then right here, I have a much smaller one, but this is more of my personal brain.
01:16
Speaker A
So, this is stuff going on in my personal life. This is stuff going on with, you know, Uppit AI or my YouTube channel or my different businesses and my employees and our Q2 initiatives and things like that. This is more of my own
01:27
Speaker A
second brain. So, I've got one second brain here, and then I've got one basically YouTube knowledge system. And I could combine these or I could keep them separate, and I can just keep building more knowledge systems and plug
01:37
Speaker A
them all into other AI agents that I need to have this context. It's just super cool. So, Andrej Karpathy just released this little post about LLM knowledge bases and explaining what he's been doing with them. And in just a
01:47
Speaker A
matter of a few days, it got a ton of traction on X. So, let's do a quick breakdown, and then I'm going to show you guys how you can get this set up in basically 5 minutes. It's way more
01:55
Speaker A
simple than you may think. Something I've been finding very useful recently is using LLMs to build personal knowledge bases for various topics of research interest. So, there's different stages. The first part is data ingest.
02:05
Speaker A
He puts in basically source documents. So, he basically takes a PDF and puts it into Claude Code, and then Claude Code does the rest. He uses Obsidian as the IDE. So, this is nothing really too game-changing. Obsidian just lets you
02:16
Speaker A
visually see your markdown files. But, for example, this Obsidian project right here with all this YouTube transcript stuff, that actually lives right here.
02:23
Speaker A
This is the exact same thing. Here are the raw YouTube transcripts, and here's that wiki that I showed you guys with the different folders for what Claude Code did with my YouTube transcripts.
02:33
Speaker A
And then there's a Q&A phase where you basically can ask questions about YouTube or about the research, and it can look through the entire wiki in a much more efficient way, and it can give you answers that are super intelligent.
02:44
Speaker A
He said here, "I thought that I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all documents, and it reads all the important related data fairly easily at
02:54
Speaker A
this small scale." So, right now he's doing about 100 articles and about half a million words. So, there's a few other things that we'll cover later, but the TL;DR is you give raw data to Claude Code, it compares it, it organizes it, and
03:05
Speaker A
then it puts it into the right spots with relationships, and then you can query it about anything. And it can help you identify where there's gaps in that node or in that, you know, relationship, and it can go do research and fill in
03:16
Speaker A
the gaps. All right, so why is this a big deal? Because normal AI chats are ephemeral, meaning the knowledge disappears after the conversation. But, this method using Karpathy's LLM wiki makes knowledge compound like interest in a bank. People on X are calling it a
03:29
Speaker A
game-changer because it finally makes AI feel like a tireless colleague who actually remembers everything and it stays organized. It's also super simple.
03:36
Speaker A
It will take you 5 minutes to set up. I'll show you guys. You don't need a fancy vector database, embeddings, or complex infrastructure. It's literally just a folder with markdown files.
03:45
Speaker A
That's it. You literally just have a vault up top. So, in this example, it's called my wiki. You've got a raw folder where you put all of the stuff, and then you've got a wiki folder, which is what
03:53
Speaker A
the LLM takes from your raw and puts it into the wiki. So, in here you have all the wiki pages, which it will create, but then you also have an index, and you have a log. So, for example, in my
04:02
Speaker A
YouTube transcripts vault, here is the index. You can see that I have all these different tools, which I could obviously click on and it would take me right to that page. Or after that, I have all the different techniques, agent teams,
04:12
Speaker A
sub-agents, permission modes, the WAT framework, and then we've got different concepts, MCP servers, RAG, vibe coding.
04:19
Speaker A
We've got all these different sources, which are, you know, the YouTube videos. And then when I have people or when I have comparisons, they will be put in here in the index. And then we also have a log, which is the operation history.
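The folder layout described above can be sketched in a few lines of Python. This is only an illustration, not something shown in the video: the file names `index.md` and `log.md`, their placement inside `wiki/`, and the helper names `create_vault` and `append_log` are all assumptions. In the video, Claude Code creates and maintains these files itself.

```python
from datetime import date
from pathlib import Path


def create_vault(root: Path) -> None:
    """Create the raw/ and wiki/ folders plus starter index and log files."""
    (root / "raw").mkdir(parents=True, exist_ok=True)  # untouched source documents
    wiki = root / "wiki"
    wiki.mkdir(exist_ok=True)                          # pages the LLM writes
    # File names are assumed; the transcript only says "an index" and "a log".
    for name, heading in (("index.md", "# Index"), ("log.md", "# Log")):
        page = wiki / name
        if not page.exists():                          # never overwrite existing notes
            page.write_text(heading + "\n", encoding="utf-8")


def append_log(root: Path, message: str) -> None:
    """Append one dated line to the operation history."""
    with (root / "wiki" / "log.md").open("a", encoding="utf-8") as fh:
        fh.write(f"- {date.today().isoformat()}: {message}\n")
```

Calling `create_vault(Path("my-wiki"))` and then `append_log(Path("my-wiki"), "ingested first batch")` would leave a vault with an empty index and a one-line log, ready for the first ingest.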
04:29
Speaker A
So, in this case, in the YouTube project, the log isn't huge, 'cause I only ran one huge batch of the initial 36 YouTube videos. But now every time I have one, I say, "Hey, can you go ahead and ingest the new YouTube video into
04:40
Speaker A
the wiki?" And then we'll see every single time we update this. And then of course, you need your CLAUDE.md to explain how the project works and how to search through things and how to, you know, update things. It's also a big deal from
04:51
Speaker A
a cost perspective, token efficiency and long-term value. One X user turned 383 scattered files and over 100 meeting transcripts into a compact wiki and dropped token usage by 95% when querying with Claude. And obviously, token management and efficiency is a huge
05:06
Speaker A
conversation right now and will always be. The other thing that's really cool about this is there's not really like a GitHub repo you go copy or there's not a complicated setup. You literally just say, "Hey, Claude Code, read this idea
05:17
Speaker A
from Andrej Karpathy and implement it." And people on X are now talking about like this is how 2026 AI agentic software and products will be made. You just give it a high-level idea and it goes and builds it out. And Karpathy
05:28
Speaker A
even said, "Hey, you know, I left this prompt vague so that you guys can customize it." And I'll show you the ways in my two different vaults right now that it changed things a little bit based on the context and understanding
05:38
Speaker A
of what the project is actually for. Okay, so this was the original tweet I just showed you guys. And then he followed up and said, "Hey, this one went viral, so here is the idea in a gist format." So, if you open this up,
05:47
Speaker A
this is basically just another explanation of the core idea of how this works and why the architecture, indexing, all this kind of stuff. And by the way, this is the part where he says, "Hey, this is left vague so that you can
05:57
Speaker A
hack it and customize it to your own project." So, we're going to come right back to this in a sec, but the first prereq that we're going to do, it's not necessary, but I like to have a nice
06:05
Speaker A
little front end to see the relationships, is we're going to go to Obsidian and download it. So, if you just go to obsidian.md, you can see this is the completely free tool and you're going to go ahead and download it. So,
06:16
Speaker A
just for your operating system, download this and then open up the wizard and open up the app. So, when you open up the app, it'll look like this and what we're going to do here is we're going to
06:25
Speaker A
create a new vault. So, down here you can see I have Herc Brain and I have YouTube Transcripts. I'll just make it a little bigger. I'm going to go to Manage Vaults. I'm going to create a new
06:33
Speaker A
one and now we just have to give this a name. So, I'm just going to call this one demo vault and you're going to choose a location where you want to put this. So, I'm just throwing this on my desktop and
06:41
Speaker A
I'm going to go ahead and create this vault. Then what you're going to do is go to wherever you like to run Claude Code. So, in this case, I'm doing it in VS Code and I open up that folder. So,
06:50
Speaker A
in demo vault, we get a .obsidian folder and then we get a welcome.md. So, I'm going to open up Claude. So, I'm going to do it in my terminal. I'm going to run Claude and lately I've been liking using my terminal better for
07:01
Speaker A
Claude. I like to do it inside of VS Code, but the reason is just because I like to see the status line and I have, you know, a little bit more functionality.
07:08
Speaker A
So, anyways, now that we have Claude Code open, here's what we're going to do. We're going to go back over to the LLM Wiki thing that we got from Andrej Karpathy. We're going to copy all of this and we're going to go back into Claude
07:19
Speaker A
Code and then just paste it in there. So, that is the prompt from Karpathy that's going to build out everything we need. And then before we send that off, we're dropping this in, which you guys can screenshot and then just throw it to yours. But I'm
07:31
Speaker A
saying, "You are now my LLM Wiki agent. Implement this exact idea file as my complete second brain. Guide me step-by-step. Create the CLAUDE.md schema. Blah, blah, blah." So, this is just telling it what it needs to do with
07:43
Speaker A
this idea that we just got from Karpathy. So anyways, on the right we have this Claude Code running and on the left we have our Obsidian vault and you can see it just created those two folders. So it created the raw and it
07:52
Speaker A
created the wiki as you can see. Now by default it threw in four folders, it threw in analysis, concepts, entities, and sources. Once we start to populate stuff we can talk to it to see if that's actually the way we want to do it or
08:02
Speaker A
not. Because it's interesting in my personal kind of second brain, the wiki is literally just markdown files.
08:08
Speaker A
There's no structure to it. And in some cases that's good. Karpathy actually said sometimes I like to keep it really simple and really flat, which means like no subfolders and not a bunch of over organizing. But then you guys did see in
08:19
Speaker A
my YouTube transcript one there were different subfolders and I think that in this case it actually makes more sense.
08:25
Speaker A
But you can see that it went ahead and it created a CLAUDE.md, it created an index and a log, and then a few different folders in our wiki. But now it's saying, "Hey, let's go ahead and try it out. Drop in your first source
08:33
Speaker A
into the raw folder and tell me to ingest it." Okay, so I'm at this website called AI 2027. If you guys haven't read this before, it's kind of an interesting read, so go check it out. And now let's
08:43
Speaker A
say I want to get this into my vault. What I could do is just copy the whole page, right? And it might just come through a little weird. Or we can just get an Obsidian extension, which lets us
08:52
Speaker A
basically take articles right from the web and just put it right into our vault super easy. So search for this extension called Obsidian Web Clipper. You would go ahead and add this to Chrome. So then when you're in the article that you
09:01
Speaker A
want, you basically just click on your extensions, you open up Obsidian Web Clipper, and then you can just chuck it into your vault. And then right here you're going to want to set this to raw because this is the actual folder that
09:10
Speaker A
it's going to put it in. So you can go ahead and click add to Obsidian, open Obsidian, and then now you can see in my raw section we have this AI 2027 source with the title, the source, and it's not
09:21
Speaker A
super super populated yet because the LLM in Claude Code is going to do that.
09:26
Speaker A
So here is our file. I'm going to open up Claude Code and say, "Awesome, I just threw in an article called AI 2027 into the raw. Can you please go ahead and ingest that?" It'll ask you some
09:35
Speaker A
questions. It might also be helpful, before you start ingesting stuff, to say, "Hey, by the way, this project is specifically for my second brain. So, personal things, business things, whatever. Or, this is just a research project. This is where I'm going to
09:47
Speaker A
chuck you all of the articles and all the things that I want to learn about and all the things that I know." So, there's different ways that you can set up the project, as you saw with mine, one for YouTube, one for just personal
09:56
Speaker A
second brain. So, now what it's doing is it's going to read through this article, and then it's going to figure out where should I chuck everything into the wiki.
10:03
Speaker A
It's not just going to create one MD file for this. It might create five, or it might create 10. And there's going to be relationships between each of the different sections that it creates. So, it's kind of doing its own method of
10:12
Speaker A
chunking. Now, one thing I want to call out real quick is with this extension, if you go here and you open up the options for it, you can see that you can actually change where by default the folders are dropped, which is in the
10:24
Speaker A
location section. By default, it'll be going to a place called clippings, but just go ahead and change that to raw.
10:29
Speaker A
Okay, so here it came back with all these questions, right? It said, "Here are my key takeaways from this article, blah blah blah." And now it'll ask you, "What do you want to emphasize from this article? What's your focus? How granular
10:39
Speaker A
do you want to be? What's your plan?" So, I'm just going to say, "I want this to be extremely thorough.
10:43
Speaker A
This is my passion, looking at where AI is going to go. And this whole project, by the way, that you're setting up in this vault is basically just going to be my place to dump in research about
10:54
Speaker A
AI. So, help me keep all that organized so that I can query it and that I can, you know, keep my thoughts related." So, that's just a quick example of what it might look like for you to give it some more context to
11:03
Speaker A
continuously build your project. So, I'm going to switch over here to the graph view because I think it'll be interesting to see as it is starting to go through and create those different wiki files, it's going to go ahead and
11:14
Speaker A
it's going to create all those relationships, and we'll be able to watch it in real time. All right, so it's creating all of the wiki pages now, and you can see that it said it's going to make about 25 because there's so much
11:23
Speaker A
stuff going on in the original AI 2027 article. Okay, so our first one just popped in here, and there, a second one just came through, and now you're starting to see where you have hubs, or where you just
11:33
Speaker A
have little individual nodes. So, this is a major hub. Someone named Eli, someone named Thomas, Daniel, and you can see all the different relationships here with things like AI governance, with things like OpenBrain, superhuman coder. Okay, so that ingest took about
11:48
Speaker A
10 minutes. So, sometimes you have to be a little patient with, you know, it reading through everything and organizing everything, but it does a lot of heavy lifting, of course. When I uploaded the 36 YouTube transcripts in batch, it took about 14 minutes, so it
12:00
Speaker A
kind of just depends. But, it created 23 wiki pages. We have the source, we have six people, five organizations, and one AI systems page. Different concepts, so technical, alignment, and geopolitical.
12:11
Speaker A
And then an analysis, and then it asks some questions about it so that it can help make the relationships and make the structure even better.
12:19
Speaker A
Now, let's just open this one up a little bit deeper and see what it actually did in here with this stuff.
12:23
Speaker A
So, this is the source with all the main relationships. So, as we start to add other articles, we will see other big kind of like nodes, and maybe in some cases we'll have relationships between like compute scaling with
12:34
Speaker A
different articles that we upload as well. So, let's just see. If I click into the main source, we can see the tags that it got, we can see the authors, and we can click around. So, here's a link to OpenAI. Okay, what's
12:44
Speaker A
OpenAI? Here's references in AI 2027. Here's some other connections with OpenAI like model spec. Okay, we're in model spec, let's take a look. We can see other things about model spec, and we could also go to how the LLM
12:55
Speaker A
psychology model works. So, this is just super super cool, all the relationships that we get. And once again, all of this stuff that we're looking at was derived from one article, and automatically organized and related. So, the question
13:07
Speaker A
now is like, what do we do from here? Do we query it inside of this environment?
13:11
Speaker A
Do we query it from somewhere else? And that's completely up to the way that you want to use this. So, for example, with my YouTube project, I'm probably just going to keep this here. And whenever I want to ask questions about YouTube, or
13:20
Speaker A
if I want to turn this into like a website, I can just do that from here.
13:24
Speaker A
Or, if I need to, I can point a different project at this folder since everything's here. And it can crawl through the wiki, it can read the index, and it knows how this stuff works because you can give it the CLAUDE.md so
13:34
Speaker A
it understands the project as well. So, for example, in this one, which is just my second brain where we have all of the different things about like I drop in my meeting recordings, I drop in, you know, ClickUp channels, summaries, and things
13:44
Speaker A
like that. This is something that I want to use in my executive assistant. So, what I did in my executive assistant here called Herc 2, if I go to this CLAUDE.md, you can see that we have a wiki
13:53
Speaker A
path. So, whenever you need to read things about me and my business that you don't have already, you would basically go to my Herc brain vault. You would go to that directory, and then you would read through the wiki. You can read the
14:04
Speaker A
hot cache, which I'll explain in just a sec. You can read the index. You can read the domain sub-index. And then you can also just search through everything here. And I said, "Don't read from the wiki unless you actually need it." So,
14:13
Speaker A
here are some things that you might do that you don't need to go read the wiki for. And all of this is my business knowledge. Now, if you guys remember, if you watch my video on setting up an
14:20
Speaker A
executive assistant, I used to do this with context files inside of this project. And when I changed over to this method, I actually saw a reduction in tokens that I was actually calling in this project. So, the thing about the
14:32
Speaker A
hot cache, right? I didn't actually have this in my YouTube one. So, if I go to YouTube, you can see there's no hot cache.
14:39
Speaker A
But if I go to the Herc brain in the wiki, you can see there's a hot.md right here. And this is basically just a cache of like 500 words or 500 characters that it saves, which is like what is the most
14:49
Speaker A
recent thing that Nate just gave me or that we talked about. In the context of my executive assistant, this is really helpful, you know, it might save me from having to crawl different wiki pages.
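A hot cache like the one described here could be maintained with something as small as the sketch below. The function name `update_hot_cache`, the 500-word default, and the newest-first trimming rule are assumptions for illustration; the video does not show how `hot.md` is actually written, and in practice the LLM updates it.

```python
from pathlib import Path


def update_hot_cache(hot_file: Path, note: str, max_words: int = 500) -> str:
    """Put the newest note first and trim the cache to max_words words."""
    old = hot_file.read_text(encoding="utf-8") if hot_file.exists() else ""
    words = (note + " " + old).split()     # newest content leads
    trimmed = " ".join(words[:max_words])  # oldest words fall off the end
    hot_file.write_text(trimmed + "\n", encoding="utf-8")
    return trimmed
```

Splitting on whitespace flattens any markdown line breaks, which keeps the sketch short; a real cache would probably trim whole entries instead of individual words.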
14:58
Speaker A
But in something like the YouTube transcript project, I don't really need a hot cache. So, another thing that I alluded to but didn't really cover was the idea of linting. So, Karpathy says that he runs some LLM health checks over
15:09
Speaker A
the wiki to find inconsistent data, impute missing data with web searches, find interesting connections for new article candidates, things like that.
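The health checks mentioned here are run by the LLM itself, but a small deterministic check can cover the purely structural part. The sketch below is an assumption-laden example, not Karpathy's lint: it assumes Obsidian-style `[[Page]]` links that use bare page names, and the names `lint_wiki` and `WIKILINK` are made up for illustration. It reports links that point at missing pages and pages that nothing links to.

```python
import re
from pathlib import Path

# Captures the page name in [[Page]], [[Page|alias]] and [[Page#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def lint_wiki(wiki_dir: Path) -> dict[str, list[str]]:
    """Report links to missing pages and pages nothing links to."""
    pages = {p.stem: p for p in wiki_dir.rglob("*.md")}
    linked: set[str] = set()
    broken: list[str] = []
    for name, path in sorted(pages.items()):
        for target in WIKILINK.findall(path.read_text(encoding="utf-8")):
            target = target.strip()
            linked.add(target)
            if target not in pages:
                broken.append(f"{name} -> {target}")
    # index and log are entry points, so they are not expected to have backlinks.
    orphans = sorted(set(pages) - linked - {"index", "log"})
    return {"broken": broken, "orphans": orphans}
```

The output of a check like this could be pasted back into Claude Code as a starting point for the deeper, content-level lint that the LLM does.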
15:18
Speaker A
So, it basically helps you run a lint, you know, every day, every week, whenever you want, which helps make sure that everything is scalable and structured in the right way. And it might even come back and say, "Hey, I
15:28
Speaker A
don't fully understand this. Can you give me some more info or can you grab some more articles that might help me out here?" So, now the final question about this that I wanted to cover is like, does this kill semantic search
15:37
Speaker A
RAG? And the answer is no, but kind of yes. And it all depends on the goal of the project and the goal of the context, how much context you have. So, here's a really quick chart that I had my Claude
15:48
Speaker A
Code make. I was in my Herk brain where I dumped in a bunch of information about Karpathy's LLM knowledge and I just said, "Hey, can you please explain Karpathy knowledge as simple as possible, keep it super concise, and
16:00
Speaker A
compare it to typical semantic search RAG?" So, it found Karpathy's idea: instead of a database, you just give the LLM well-organized markdown files, and it compares it here to the actual semantic search RAG. So, actually I might as well just read it off from
16:14
Speaker A
here. So, it finds it by reading indexes and follows links rather than using similarity search, getting a deeper understanding of relationships because they're links rather than just saying, "Hey, these chunks seem similar." As far as infrastructure, it is literally just
16:26
Speaker A
markdown. So, like I said, you don't even need Obsidian, you just need these markdown files. Whereas with semantic search, you need an embedding model, you need a vector database, and a chunking pipeline. The cost over here is
16:37
Speaker A
basically free. Your only cost is going to be tokens. Whereas over here, you might have ongoing compute and storage.
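The "read the index, then follow links" lookup from the comparison can be approximated with a plain breadth-first walk. This is a rough stand-in under stated assumptions: in the real workflow the LLM chooses which links are worth following, whereas this sketch follows every `[[Page]]` link up to a fixed depth. The names `pages_to_read` and `WIKILINK`, the `index` start page, and the depth limit are all hypothetical.

```python
import re
from collections import deque
from pathlib import Path

# Captures the page name in [[Page]], [[Page|alias]] and [[Page#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def pages_to_read(wiki_dir: Path, start: str = "index", max_depth: int = 2) -> list[str]:
    """Breadth-first walk over wikilinks, beginning at the index page."""
    pages = {p.stem: p for p in wiki_dir.rglob("*.md")}
    seen = {start}
    order: list[str] = []
    queue = deque([(start, 0)])
    while queue:
        name, depth = queue.popleft()
        if name not in pages:
            continue  # skip links to pages that do not exist yet
        order.append(name)
        if depth == max_depth:
            continue  # do not expand links past the depth limit
        text = pages[name].read_text(encoding="utf-8")
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order
```

No embedding model or vector store is involved: the only inputs are markdown files and the links between them, which is the infrastructure difference the comparison is pointing at.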
16:42
Speaker A
And for maintenance, you just run a lint. You clean up things. You add more articles. You give it more context rather than having to re-embed when things change. But right now, the weakness, of course, with the LLM
16:53
Speaker A
knowledge wiki is that it doesn't scale huge across enterprises, right? Because it's just a bunch of files. And that is where the cost will probably get more and more expensive than going to something like standard semantic search or knowledge graph or LightRAG
17:08
Speaker A
or whatever other tool is out there for that. So, here you can see, if you have hundreds of pages with good indexes, you're fine with wiki graph. But if you were getting up to the millions of documents, then you're going to want to
17:16
Speaker A
actually do more of a traditional RAG pipeline. At least for now with how the current models are and everything we know right now in April 2026. So, that is going to do it for today. I hope you guys learned something new or enjoyed
17:29
Speaker A
the video, and if you did, please give it a like; it helps me out a ton. Now, after this video, if you're interested in learning how you can create your own sort of executive assistant and then plug it into this Obsidian Vault, then
17:37
Speaker A
definitely check out this video up here where I go over how I built my executive assistant and the way that you should be thinking about it. So, hopefully I'll see you guys over there, but if not, I'll see you in the next one.
Topics: Andrej Karpathy, LLM knowledge base, AI automation, Claude Code, Obsidian, markdown wiki, agentic workflows, knowledge management, token efficiency, AI memory

Frequently Asked Questions

What is the main benefit of Andrej Karpathy’s LLM knowledge base system?

The system transforms AI interactions from ephemeral chats into a persistent, organized knowledge base that compounds over time, enabling efficient querying and memory retention.

How complex is it to set up this knowledge base system?

The setup is very simple and can be done in about 5 minutes using just folders with markdown files, without the need for complex infrastructure or vector databases.

What tools are recommended to use with this knowledge base approach?

Claude Code is used for ingesting and organizing data, while Obsidian is recommended as a front-end IDE to visualize and navigate the markdown wiki files.
