Matt Pocock’s Agentic Engineering Workflow (just copy h… — Transcript

Explore Matt Pocock’s agentic engineering workflow focusing on strategic programming, AI harness optimization, and skill development for software engineers.

Key Takeaways

  • Focus on optimizing the AI harness rather than solely on the AI model.
  • Strategic programming is essential for maximizing AI’s benefits and requires long-term planning and good design.
  • AI excels at tactical programming, freeing developers to concentrate on higher-level tasks.
  • Skills and domain knowledge are multipliers that enhance AI’s effectiveness.
  • Using APIs and well-structured codebases enables AI agents to work more efficiently and reliably.

Summary

  • The video emphasizes the importance of optimizing the AI harness—prompting, skills, and environment—over just focusing on the AI model itself.
  • It distinguishes between tactical programming, which AI now handles efficiently, and strategic programming, which remains a critical human skill.
  • Strategic programming involves long-term thinking, codebase architecture, and designing well-scoped tasks to maximize AI’s potential.
  • The speaker discusses how AI acts as a multiplier for skilled developers, making senior engineers significantly more productive.
  • The concept of delegating tasks to AI is compared to delegating to junior programmers, highlighting the need for strong domain expertise.
  • The video introduces the 'teach skill,' a tool designed to help engineers learn and improve by encoding teaching principles into AI workflows.
  • Practical advice is given on using APIs like Google Search and Google Images to enhance AI agents with live data and structured information.
  • The importance of good software design, testing, and modular architecture is stressed to facilitate effective AI collaboration.
  • The speaker encourages continuous upskilling and strategic thinking to leverage AI tools for business growth and software velocity.
  • The discussion includes challenges like rate limits and infrastructure maintenance, advocating for clean, well-documented codebases.

Full Transcript

00:00
Speaker A
Everyone's obsessed with the model, and I think they should be more interested in the harness, what you can do to get the most out of the harness, giving it the right prompts, giving it the right skills to work with, and improving the environment in which the model runs.
00:25
Speaker A
As I sort of said with Fable, the model is useful, but I think the harness has an equal amount of work, and you have much more control of the harness than you do the model.
00:50
Speaker A
People are focused on the wrong thing. They're looking at the big shiny new thing when in fact, just focus on the stuff that's been working for 30, 40 years, you know, and it really does work. Like people ask me all the time,
01:21
Speaker A
how do you optimize for token spend? Have a code base that's easier to make changes in, right, Matt? So what's going to be the main difference between people who use AI to get insanely ahead and the majority of people who only get a small
01:50
Speaker A
boost from it? So in his book Philosophy of Software Design, John Ousterhout talks about the difference between tactical and strategic programming. So I find this distinction so useful when thinking about AI because tactical programming is all about the on-the-ground day-to-day
02:10
Speaker A
stuff, the actual writing of the code, the actual messing about with the syntax, figuring out bugs as they come up, and actually creating the code, creating the commits. Strategic programming is winning the war, not the battle. It's longer-term thinking. It's
02:35
Speaker A
the general sitting right at the top. How does the codebase need to look? What strategies can I use to improve our velocity? And for me, strategic programming has always been the most interesting, the most exciting. That's how I was thinking even when I was a
02:52
Speaker A
junior. How can we increase our velocity? How can we do more with less? And AI has basically eaten tactical programming. It's gone, right? It's all gone. So AI is just better at doing tactical programming than you are because it can do it for cheaper, right?
03:14
Speaker A
And so you need to be great at strategic programming in order to get the most out of this infinite fleet of tactical programmers that you now have access to.
03:35
Speaker A
So does that mean knowing how to orchestrate these agents plus some fundamentals of software design, codebase architecture? Like how would you break that down into these specific skills that people can learn?
04:02
Speaker A
Yeah, great question. So strategic programming really hasn't changed in AI, right? AI is just, all we're doing is instead of delegating to junior or mid-level programmers, we're delegating to AI instead. So the things that you need to do good delegation are still the
04:28
Speaker A
same. You need to design the hard parts up front. You need to make sure those tasks are really, really well scoped.
04:49
Speaker A
You need to be thinking about the interfaces between all of the modules in your codebase. You know, you need to be thinking about test seams and good tests. You need to essentially design a codebase that's easy to work in and have
05:11
Speaker A
just enough documentation that can point AI to the right places where it's going to make those changes and make them effectively.
05:42
Speaker A
I think everybody at this point agrees that the AI progress is very fast, if not speeding up. So, I think a lot of people also miss the part of upskilling themselves, right? Because, like, yeah, you can pay for subscriptions, you know,
06:02
Speaker A
you can get the latest tools, but ultimately anybody can do that, but there are still going to be people who use these tools to massively grow their business, you know, to ship more and better software than ever before, and
06:26
Speaker A
there are going to be people who try it a bit and, you know, maybe use the free version or the cheaper model. So how would you advise people to start teaching themselves to be better?
06:54
Speaker A
Yeah. People ask me all the time, like, because, you know, I sell developer courses, right? So I'm sort of, you know, you can take my advice here with a pinch of salt, but I personally feel that my
07:11
Speaker A
skills are a multiplier for AI, right? If I'm able to oversee a codebase and think about how things should be built and just tell AI how to do it, then AI just has so much richer
07:36
Speaker A
context to work with. And I think of this, I mean, I see this everywhere, and people like CTOs and people I talk to at conferences tell me this all the time, is that AI makes senior developers just 10 times better. And it
07:59
Speaker A
sort of doesn't make sense to hire that many juniors anymore because juniors get a little boost from AI, but seniors just get this ridiculous huge boost from it, and they can do so much more with it. So your skills are the ceiling on what AI
08:11
Speaker A
can do. And if your skills are low, then AI is not going to be able to go past that, you know. So getting good with AI is really about getting good at your domain, getting good at what AI is
08:33
Speaker A
going to be doing for you. So a better teacher can use AI to teach people better than a random can, you know.
08:53
Speaker A
So I think skills are more important now than they used to be because, again, you just have this multiplier available to you, and you can delegate more. So you recently shipped the new teach skill, right? Can you tell us more about that?
09:12
Speaker A
If you've ever tried pulling search data from the web at scale, you know it's a nightmare. You write a scraper, it works for a week, but then the layout changes.
09:40
Speaker A
Then you hit captchas, your proxies get blocked, rate limits everywhere, and suddenly you find yourself maintaining scraping infrastructure instead of building the actual project. This is where SerpApi comes in. It gives you clean structured search results from
10:02
Speaker A
Google, Bing, Yahoo, and more through a single API call. You send a request and you get back a clean JSON object with exactly the data you want. No captcha solving, no rotating proxies, no broken HTML. They handle all of it. And for AI
10:24
Speaker A
work, this is huge. Say you're building an agent that needs live information. Just use their Google search API. Or maybe you're training an AI model that needs a data set. Their Google Images API gives you pre-classified titles, URLs, and thumbnails ready to go. A ton
10:35
Speaker A
of production agents already use SerpApi as one of their core tools, and you can get started with 250 free credits. No credit card required. Just scan the QR code on screen or click the first link below the video. Oh, and a huge thank
10:52
Speaker A
you to SerpApi for sponsoring this video. Yeah, I know a lot about teaching. I've been a teacher for 10 years, actually.
11:11
Speaker A
So I was teaching singing and voice when I was just straight out of university. Then I became a developer, and now I teach developers. I've been doing that for the last four years. So I know a lot about teaching, and I
11:32
Speaker A
thought, okay, what if I take some of the teaching principles that I know about, such as the zone of proximal development, such as the difference between knowledge, skills, and wisdom, encode that into a skill and essentially use it to create a
11:53
Speaker A
course on the fly about any topic, and that's what I've done, and it's extremely effective. I've actually been learning, I'm teaching myself Rubik's Cube from this. I can solve a Rubik's Cube now from memory thanks to this skill. And
12:24
Speaker A
I've been using this for all sorts of stuff. So yesterday I was messing about with what it might look like to ask the teach skill how to become a senior developer. And it basically went on this big journey looking at a bunch
12:42
Speaker A
of trusted resources, getting a big sort of curriculum together, and just produced something that was gorgeous. And so that absolutely, let's give it a go.
13:04
Speaker A
Yeah. Yeah, I think people would love to see that. Definitely. Okay, so David, what do you want to learn?
13:24
Speaker A
Let's do systems design. I tell you what, I mean, I've got an idea here, which is that a lot of people come to me. I teach courses for engineers, really, people who already know how to be an engineer.
07:22
Speaker A
I'm intrigued by if this skill can teach you basically the basics of engineering, you know what I mean? To fill in the gaps that you might have if you're a vibe coder, you know? So, I'm going to pretend that I'm a vibe coder. I'm going
07:45
Speaker A
I am a vibe coder and I want to fill in my knowledge gap so that I can ship better software. I know some very, very basic CLI commands and I know just about enough to read some code and use the
08:06
Speaker A
So, I'm going to put that, a simple prompt, plain English. Anybody can ask you this.
08:22
Speaker A
session with the teacher today. And you can think of this really as a collaborative effort. Basically, I am talking to a teacher, and the agent is my teacher, and it should know how best to teach me.
08:44
Speaker A
Because, like, I know I'm not where I can be. You know, there's people much better than me, much more skilled than me. So, like, it's great timing that you have this skill now.
09:03
Speaker A
And so for a vibe coder who can read code and use a basic terminal, the highest leverage gap is almost never more syntax. It's the stuff around the code that lets you ship without fear.
09:26
Speaker A
aligns with what you want to do. I think of teaching and learning as not getting information into your head but orienting you in the world, putting you in a new place in the world. And this is kind of
09:53
Speaker A
What does "ship better software" mean to you right now? What's a concrete project you're working on? So let's imagine I answer this. Let's imagine I'm a voice coach wanting to learn how to be a better
10:11
Speaker A
students to teach them better and to, you know, build something that they can help practice with. That's the kind of app that I'm looking at. So, probably a full stack application with a database with some kind of authentication, but
10:29
Speaker A
Yeah. So, this is also going to be the game: how fast you can output your tokens from your brain and input them back into your brain.
10:45
Speaker A
It's very fast for me because I'm quite a fluid speaker. So I can translate my brain into words quite effectively.
10:58
Speaker A
Exactly. And it's a skill that is actually overpowered if you're a developer. It really, really is. Like, I found that being able to communicate and being able to speak was something that was just ridiculously overpowered in the development world and
11:21
Speaker A
What success looks like: being able to ship that app, not break it, get it live, and trust that it works for real students. This is now going to orient everything about this skill and what it does next. So, you can see it's doing
11:40
Speaker A
Let me set up your resources, a learning record, a reference cheat sheet, and your first lesson. So it's going to start churning out some material that's running locally. And the idea of this is, I think of there are
12:10
Speaker A
teach skill is a stateful skill because if you think about working with a great teacher, a teacher remembers what you've done before. A teacher knows where you sort of need to go next, knows what your mission is, all that stuff. And so it's saving a
12:37
Speaker A
it in a browser and have like a really rich thing to look at because learning stuff in the terminal is just brutal.
12:50
Speaker A
Not quite yet. I haven't decided whether I want to get into Fable yet or not. Um, really, well, I mean, yeah, I don't really believe in all the kind of, like, yesterday, was it yesterday when it was
13:12
Speaker A
It does seem to be slightly better. But then you've got to weigh that against the cost of the tokens and how available it is, the latency of it. I prefer to essentially not try a new model when it
13:36
Speaker A
fine. You know, you're not losing that much by just waiting a little while to see how things shake out.
13:54
Speaker A
and nicer than doing it in the terminal. This is saved locally, so you can always go back and reference it, and it's giving you actual things you can do in the terminal, giving you proper exercises to go and do. So you make a folder, go
14:08
Speaker A
into it, start Git in it, create a file, check the status, stage it, save the snapshot. And because, of course, it's running on my system, it knows what my setup is. It's probably already checked whether I've got Git installed,
14:21
Speaker A
that kind of thing. So it's, you know, perfectly personalized education, you know. Exactly. Totally personalized. So which command saves a snapshot of your staged changes? David, do you reckon you're going to answer this one for me?
14:33
Speaker A
Uh, which command saves a snapshot of your staged changes? git commit. git commit. Bam. So again, it's using techniques that are well known in education for increasing storage strength, right? So quizzes are such an awkward thing. Like, I sort of hate
14:51
Speaker A
quizzes, but quizzes are just unreasonably effective for increasing the strength with which something is stored. What command shows what has changed right now, David? git status. git status. Yeah. Oh, bugger. I pressed the wrong thing.
15:07
Speaker A
Uh, what does git add do to a change? Stages changes. Stages changes. A commit is best pictured as a save point.
15:17
Speaker A
Save point. You broke a file but haven't committed. To restore it, you run, um, git restore.
15:26
Speaker A
Yeah, I think it's git restore, isn't it? Yes, there it is. Very good. Okay.
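The four quiz answers above can be tied together with a small sketch. This is a toy model for building intuition only, not how Git is implemented: the class name `ToyRepo` and its fields are made up for illustration, and `status` is simplified to show only unstaged edits.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    """Toy model of Git's working tree, staging area, and commit history."""
    working: dict[str, str] = field(default_factory=dict)        # files on disk
    staged: dict[str, str] = field(default_factory=dict)         # the staging area
    commits: list[dict[str, str]] = field(default_factory=list)  # saved snapshots

    def add(self, path: str) -> None:
        """Like `git add`: stage the current contents of one file."""
        self.staged[path] = self.working[path]

    def commit(self) -> None:
        """Like `git commit`: save a snapshot of the staged changes."""
        self.commits.append(dict(self.staged))

    def status(self) -> list[str]:
        """Like `git status` (simplified): files that differ from what is staged."""
        return sorted(p for p, text in self.working.items()
                      if self.staged.get(p) != text)

    def restore(self, path: str) -> None:
        """Like `git restore`: throw away unstaged edits to one file."""
        self.working[path] = self.staged[path]


repo = ToyRepo()
repo.working["notes.txt"] = "first draft"
repo.add("notes.txt")
repo.commit()                               # the save point
repo.working["notes.txt"] = "oops, broken"  # break the file without committing
print(repo.status())                        # ['notes.txt']
repo.restore("notes.txt")
print(repo.working["notes.txt"])            # first draft
```

The point of the model is the one the quiz makes: a commit is a save point, and an uncommitted mistake can be undone by restoring the file.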
15:30
Speaker A
And so it then sends you off to read a primary source if you fancy it, so the Pro Git book, and then invites you to ask your teacher follow-up questions and create the next lesson. And so the idea of this
15:43
Speaker A
is, I think of knowledge as like a graph, right? It's like a big forest through which you're exploring. And what this is doing is it's creating a linear path through that graph. It's basically going, okay, you've learned this. Now
15:56
Speaker A
I know that you've learned it. It's in your learning record. We can see it's retaining a list of learning records in the top right here, which is your mission and your starting point.
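The idea of a linear path through a knowledge graph, plus a saved learning record, can be sketched roughly as follows. This is not the teach skill's actual format: the topic names, the `PREREQS` graph, and the record fields are all hypothetical, chosen only to show the shape of the idea.

```python
import json
from graphlib import TopologicalSorter

# Hypothetical prerequisite graph: topic -> topics that should be learned first.
PREREQS = {
    "terminal basics": set(),
    "git basics": {"terminal basics"},
    "branches": {"git basics"},
    "testing": {"terminal basics"},
    "deploying": {"branches", "testing"},
}


def linear_path(prereqs):
    """Flatten the graph into one lesson order that respects prerequisites."""
    return list(TopologicalSorter(prereqs).static_order())


def next_lessons(prereqs, learned):
    """Topics not yet learned whose prerequisites have all been learned."""
    return sorted(topic for topic, deps in prereqs.items()
                  if topic not in learned and deps <= learned)


# A learning record a stateful skill might keep on disk (made-up fields).
record = {"mission": "ship an app for my students", "learned": ["terminal basics"]}
saved = json.dumps(record)     # written at the end of one session
restored = json.loads(saved)   # read back at the start of the next one
print(next_lessons(PREREQS, set(restored["learned"])))  # ['git basics', 'testing']
```

Picking the next lesson only from topics whose prerequisites are already in the record is one simple way to approximate "just beyond what you already know".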
16:06
Speaker A
So, it's captured your mission, a decision to start with Git, zone of proximal development, current estimates, you get the idea. So, it's great. I freaking love it. And that's what I would recommend to anyone starting with, uh, especially development,
16:21
Speaker A
because it's sort of, I mean, I'm a developer. I know what developer education is. And so I've sort of put that into this teach skill. And I've always thought coding was quite easy to learn. Like, I didn't have that
16:33
Speaker A
much trouble when I was learning it myself. And I think this is a great way to do it.
16:39
Speaker A
So is this live on GitHub somewhere? Where can people find this? GitHub, mattpocock/skills. And if you head there, you just run this CLI command: npx skills@latest add mattpocock/skills. You can choose the teach skill
16:54
Speaker A
and it will just save to your local setup. So whether you're using Claude Code, whether you're using Codex, it will work, and you'll be able to then just invoke teach inside a fresh workspace.
17:04
Speaker A
So you have, you know, perhaps the most at least one of the most famous and popular skills repos. What separates a good agent skill from a bad one?
17:14
Speaker A
It's such a deep question. It's such a deep question because it depends what you want. Um you can think of there as being two types of skills.
17:24
Speaker A
There are skills that are procedures, skills that you intend to run yourself, and then there are skills that are more like abilities. Those are things that you intend the model to invoke itself. And so a good ability, for
17:42
Speaker A
instance, might be your coding standards, let's say. So let's say your agent is sort of doing its own thing, kind of working along, and it needs to check how you like your React code written. So it's going to write some
17:54
Speaker A
React code. It pulls in the ability, great React coding standards, let's say, and then it reads it and it understands, okay, I shouldn't use useEffect, I should use something different. A procedure is more like something, uh,
18:07
Speaker A
this is how I prefer my skills written. It's something that you invoke yourself to get the model to behave a certain way. Something I love is my grill me skill. That's one of my most popular skills. What it essentially does is it turns
18:23
Speaker A
the model into an adversarial interviewer. So this is under productivity, under grill me. It's incredibly short, and you can see it's literally just four sentences, I think, this skill, maybe five sentences, and it's unreasonably effective because it just
18:39
Speaker A
turns the agent into an adversarial interviewer asking you questions interviewing you and popping up with ideas that you might not have considered until you reach a shared understanding.
18:50
Speaker A
I've been using this for coding first of all just like as a replacement for plan mode. So before you actually go and implement some code, you go, okay, here's my idea. Uh, interview me about it. Let's reach a shared understanding.
19:02
Speaker A
Let's flush out any weirdness or any unexpected stuff before we get in as much as you can. And it's just unreasonably effective. And this is a procedure. This is not an ability.
19:14
Speaker A
I tend to prefer my skills as procedures. I like to be the one in control. I like to go, okay, we'll do grill me, and then we'll go, let's write a product requirements document. So we use to-prd, for instance. Then let's
19:26
Speaker A
take that PRD and turn it into individual issues so that we can work through them. That's just personally how I like to do it. But other skills, such as Superpowers from obra, which is probably the most popular skills repo
19:38
Speaker A
out there, it takes the opposite approach and it prefers things to be more like the model is in control. But I've always preferred to personally be in control because I know my skills, I know my abilities. I don't want to
19:50
Speaker A
delegate my thinking to the model. Yeah. I mean that I think is one of the See, I'm like playing with this idea of the list. It's like a list of abilities, you know, knowledge.
20:03
Speaker A
Basically something that like if you could take the average, you know, 100x developer that uses AI versus a, you know, 1x developer, whatever. What would be the list of the differences, right?
20:15
Speaker A
You can say, like, okay, some of these are, like, raw intelligence, you know, blah blah blah, but most of them are probably teachable. Most of them are some skills, some knowledge, something like that. So I'm obsessed with this idea, and I think
20:25
Speaker A
one of them is kind of knowing when to have the AI ask you, right? Like, kind of this grill me style of skill, because personally I found out, like, the biggest difference: instead of, like, saying one-shot this app, I describe my vision
20:38
Speaker A
for this app and say, like, list out the 10 most consequential decisions, right, the software design decisions, architectural decisions, product decisions that will shape this project, and interview me until you understand 98% about it. Right? So, kind
20:52
Speaker A
of that is like one of the things I would put on the list. What are some of the things you think are on the list?
21:00
Speaker A
Well, can we, um, can I challenge the idea that this is possible? Is it all right if I take this question in a different way?
21:08
Speaker A
Because skills are really hard to write, especially because every single skill that you write leaks a description, this description here, into the context window. Right.
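The cost being described can be sketched with a rough model. Everything here is an assumption for illustration: the `Skill` class, the `user_only` field (a stand-in for a harness setting that keeps a skill's description out of the model's view), and the word count used as a crude proxy for tokens.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str
    user_only: bool = False  # hypothetical: True means only the user can invoke it


def descriptions_in_context(skills):
    """Descriptions the model sees up front: every skill it may invoke itself."""
    return [s.description for s in skills if not s.user_only]


def rough_context_cost(skills):
    """Crude proxy for tokens: whitespace-separated words across those descriptions."""
    return sum(len(d.split()) for d in descriptions_in_context(skills))


skills = [
    Skill("react-standards", "How this team writes React components"),
    Skill("grill-me", "Interview the user about a plan", user_only=True),
    Skill("teach", "Build a personalized lesson on a topic"),
]
print(len(descriptions_in_context(skills)))  # 2
print(rough_context_cost(skills))            # 13
```

Under this model, a list of 100 model-invocable skills means 100 descriptions occupying the context window before any work starts.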
21:21
Speaker A
Yeah. And you can disable this. So there are some skills in here, I think in my engineering zoom out, I think, which has disable-model-invocation: true. So this skill can only be
21:35
Speaker A
invoked by the user, and this means its description is not leaked into context. Every single ability, let's say we have the list. Let's say we have 100 different skills. You're going to be leaking 100 descriptions into the
21:49
Speaker A
context window. Right? Okay. Maybe let me rephrase. I didn't mean it for the AI. I meant the list is the person, right? Like if you had to say, like, I know it's difficult to, like, it's maybe reductionist to take
22:03
Speaker A
someone who's, like, really insanely productive, you know, maybe, like, some of the top people at OpenAI or Anthropic who are, like, worth hundreds of typical developers, right? What would be the list of their abilities, skills, knowledge that compare them to an average developer?
22:20
Speaker A
Yeah, got you. Well, I mean, you're kind of heading in my direction, I think, which is I prefer to hide most of these descriptions from the AI itself and keep all of that knowledge inside the human,
22:34
Speaker A
right, inside the developer. And so that's how I prefer my skills to be used: you essentially are the driver. You know, you take the steering wheel. And so I do think that this is such an exciting time to be a senior dev
22:50
Speaker A
and to, like, be able to share and, like, proceduralize, maybe, your work into reusable chunks, right? Like in a codebase, you have a function that's repeated three times. You take that function and you pull it out into a
23:08
Speaker A
shared function, and you reduce the duplication, basically. And we're able to do that now with our own procedures, with how we build software. We're able to take these, like, okay, I've, you know, made this plan a hundred times.
23:23
Speaker A
I know how to make good plans. I can turn that into a skill, distribute that to my team, and everyone can be planning in the same way, contributing back to that same skill, making everyone on the team better.
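The refactoring analogy can be shown with a tiny example. The function names and the price formatting are invented for illustration: three call sites that each used to format prices inline now share one helper, in the same way a procedure repeated by hand can be pulled out into one shared skill.

```python
def format_price(cents: int) -> str:
    """Shared helper: the single place that knows how a price is rendered.

    Assumes a non-negative amount in cents.
    """
    return f"${cents // 100}.{cents % 100:02d}"


# Three call sites that previously each repeated the formatting logic.
def cart_line(name: str, cents: int) -> str:
    return f"{name}: {format_price(cents)}"


def receipt_total(item_cents: list[int]) -> str:
    return f"Total: {format_price(sum(item_cents))}"


def refund_notice(cents: int) -> str:
    return f"Refunded {format_price(cents)}"


print(cart_line("Mug", 1250))       # Mug: $12.50
print(receipt_total([1250, 749]))   # Total: $19.99
print(refund_notice(500))           # Refunded $5.00
```

Once the logic lives in one place, an improvement made by anyone on the team reaches every caller, which is the effect described for a shared planning skill.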
23:33
Speaker A
So you're raising the floor, really, on what engineers can do. It's such an exciting time. And what I would say, though, is that skills, like, I know I'm going to sort of confuse our terminology a bit. I think
23:49
Speaker A
of there as being three things that you need to be good at anything, which is: you need knowledge. You need the fundamental sort of, what is that thing, like understanding it in your head. You need the skills. You need to have
24:00
Speaker A
done it a bunch of times, to have it, you know, in muscle memory. And then you need wisdom. You need to know when to do it.
24:08
Speaker A
You need to know how it fits in in the real world. And wisdom is almost impossible to obtain without actually having done the thing in the exact context where you need to do it. So if you want to be like someone at
24:23
Speaker A
Anthropic, sure, you can gain the knowledge, you can gain the skills, but then how are you going to gain the wisdom, right? Like, you need to probably go to Anthropic to gain the wisdom to actually understand how to do the thing,
24:33
Speaker A
you know? But I think it's like being able to bundle the first two knowledge and skills into something that's reusable is such a fascinating um outcome of this weird age we're living in.
24:46
Speaker A
So currently we talked about skills. What's your agentic engineering setup? Like what tools do you use? What models?
24:54
Speaker A
How many agents? Yeah. Um, so my setup is, I use Claude Code essentially for planning and for some implementation locally. So I'm using Opus 4.8 with medium effort, is kind of what I've landed on, and it works
25:11
Speaker A
fine. I do most of my development and a lot of my work now AFK, so with me away from the keyboard, and the way I do that is with something I built, which is a tool called Sandcastle. And Sandcastle
25:27
Speaker A
is essentially a way to run agents inside sandboxes. Okay, so if you don't run an agent in a sandbox, then it's going to do weird stuff. So it might, you know, randomly delete your home directory
25:42
Speaker A
or, you know, exfiltrate your environment variables out to bad sites, etc. With Sandcastle, you're essentially able to plug in things like Docker or Podman and run agents, run either, uh, this is what it looks like,
25:57
Speaker A
run Claude Code inside some sandbox, which is extremely cool, extremely effective, and it means that you can parallelize a bunch of agents at once, either on your own machine, or you can use, like, Vercel sandboxes, for instance,
26:11
Speaker A
to just spin up a remote agent and then pull the commits back into your local workspace.
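The two risks mentioned (a deleted home directory, leaked environment variables) suggest what a sandbox needs to restrict. The sketch below is not Sandcastle's implementation or API: it only builds a plain `docker run` command line, and the image name `agent-image:latest`, the `agent-cli` command, and the `AGENT_API_KEY` variable are all hypothetical.

```python
import shlex


def sandbox_command(image, project_dir, agent_cmd, allowed_env=()):
    """Build a `docker run` argv that limits what an agent can touch.

    Only `project_dir` is mounted, so the rest of the home directory is out
    of reach, and only allow-listed environment variables are forwarded.
    """
    argv = [
        "docker", "run", "--rm",                 # remove the container afterwards
        "--volume", f"{project_dir}:/workspace",
        "--workdir", "/workspace",
    ]
    for name in sorted(allowed_env):
        # `--env NAME` with no value forwards the host's value for that one name.
        argv += ["--env", name]
    argv.append(image)
    argv += list(agent_cmd)
    return argv


argv = sandbox_command(
    image="agent-image:latest",                      # hypothetical image
    project_dir="/home/me/project",
    agent_cmd=["agent-cli", "fix the failing test"],  # hypothetical agent command
    allowed_env=["AGENT_API_KEY"],                    # hypothetical variable name
)
print(shlex.join(argv))
```

The resulting list could be handed to `subprocess.run`. Because each container is independent, several agents can be started side by side, which is what makes the parallel workflow practical.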
26:18
Speaker A
I've been doing that and I've been combining it actually with GitHub actions. So we can see inside for instance here inside the actions tab of map sand castle this one this was an agent review action which happened a
26:32
Speaker A
little while ago which checks out the branch. This runs on on a PR. It uh runs the review agent which is just a prompt I have locally. We can see all of the um things the agent did. It's checking
26:46
Speaker A
various things blah blah blah blah blah uh type check round clean and then it replies saying cool it all looks good to me. So that's mostly how I've been doing things is running uh agents using sand castle on GitHub actions and essentially
27:03
Speaker A
just telling them to do things, and that has been extremely, unreasonably effective because you just get to parallelize as much as you want. You're not worried about constraining the resources on your local machine, and yeah, it's just very, very quick to just spin
27:17
Speaker A
up an agent and get it to do something. So in terms of models, are these 5.5 extra high, or are these other Claude Codes? What do you prefer?
27:25
Speaker A
These are, I think, again just Claude Code, Opus 4.8, medium. I don't think I've varied it too much, to be honest. I mostly don't worry about models that much. I mostly just use, um, like, I think
27:42
Speaker A
yeah, this is my sort of hot take I suppose which is that everyone is obsessed with the model.
27:48
Speaker A
Everyone's obsessed with the engine of the Formula 1 car whereas in fact the engine is really only a part of the whole system. Right? You've got the entire chassis. You've got uh how it um how it moves through the air. Everyone's
28:03
Speaker A
obsessed with the uh model and I think they should be more interested in the harness, what you can do to get the most out of the harness. Uh giving it the right prompts, giving it the right skills to work with and improving the
28:15
Speaker A
environment in which the model runs, improving the codebase and all that stuff. So yeah, as I sort of said with Fable, the model is useful, but I think the harness has an equal amount of work, and you have much more control of
28:30
Speaker A
the harness than you do the model. That's true. I would maybe challenge you a bit on this because I don't see why you cannot do both because like obviously I I agree that you need the right skills, you need the right setup,
28:42
Speaker A
all of that matters, but then if you swap in a better engine, all of that is instantly better. No.
28:47
Speaker A
Yeah, it totally is. But I think you need to think of them as 50/50, right? So instead of the model being like 90% and the harness sort of the 10% optimization, like, everyone's so focused on the
29:00
Speaker A
model people are not so like intrigued by so okay let's go back one step there's a famous idea in ML which is the bitter lesson you heard of the bitter lesson yes yes the bitter lesson is the idea that
29:18
Speaker A
Um, whatever you do in machine learning research, compute, raw compute will just beat you every time because compute is increasing at such a high rate that you can just essentially trust that the underlying thing will get better and
29:32
Speaker A
that will uh beat any optimizations you put on top of it. And there's a a sort of idea here that maybe I'm falling into the bitter lesson that instead of like like optimizing my setup, optimizing my harness, I should just wait for the
29:44
Speaker A
models to get better, wait for the engine to get better, and then my car will be faster.
29:50
Speaker A
I don't know. I still think there's a lot to be gained by just optimizing the harness and focusing on creating like good code bases that the agent can do well in instead of hamstringing the agent before it even gets started. I
30:03
Speaker A
would say probably I I I agree that you shouldn't wait. Like that's that's that was a very stupid idea. People just waiting around for AGI or not doing anything. Obviously, I completely agree with you there. I would say I'm
30:13
Speaker A
somewhere in the middle. I would say like I'm actively trying to improve my setup every single day trying to, you know, get faster at using these agents.
30:22
Speaker A
Figure out, okay, should I be using Cmax here? Should I be using should I put this on VPS? Should I be using Tailscale here? trying to like actively improve everything else except for the model but also trying to use the best model
30:34
Speaker A
possible because fundamentally like you said you might be falling into that. I would say maybe if it's 50/50 now for the simplicity of this argument, what if like the model really becomes a lot better, right? Like let's let's assume
30:45
Speaker A
the next generation, right? Like Opus 6, Fable 6, GPT 6, whatever, 7. Don't you think these models will require less steering and less handholding as they become more competent, or no?
31:01
Speaker A
I'm not a pundit, right? This is what I say to every single one of these questions. I'm trying to do the best with what I have right now and I don't I don't have the insight to know whether
31:12
Speaker A
these things will get better. I don't really want to make predictions about the future. I think that if I try to keep my um workspace and my harness agent agnostic as much as possible, if I try to apply good software fundamentals
31:28
Speaker A
to what I'm doing, if I do stuff that's always worked, then it will probably continue to work in the future. You know what I mean?
31:35
Speaker A
So, if I try to overoptimize around a model, if I get too focused on the model, I will lose focus on um the fundamentals. That's that's that's my point of view.
31:45
Speaker A
Yeah. So basically you're focused on like okay what has been true for the last 10 20 30 years you know the the the really best principles of great software and it's likely going to hold up with the next model rather than people going
31:58
Speaker A
from the model first and like okay this model maybe requires shorter prompts this model you know sucks at that part let me patch that part like building up you know properly proper foundation rather than like starting with the model
32:09
Speaker A
maybe exactly people are focused on the wrong thing they're looking at the big shiny new when in fact just focus on the stuff that's been working for 30 40 years, you know, and it really does work, you know,
32:22
Speaker A
if you have a codebase that's easy to change. Like people like people ask me all the time like how do you optimize for for token spend, right? How do you optimize for token spend? Have a code base that's easier to make changes in?
32:34
Speaker A
Because then you can employ a stupider model. If your codebase architecture is better, then you can get a cheaper model to do the same work because it your guard rails are better, it's easier to explore, it needs to spend fewer tokens
32:48
Speaker A
banging its head against the wall. If you're hamstringing your model from day one, then you will need a smart model to get the most out of it. Um, but yeah, so I think thinking from the model first is
32:59
Speaker A
the wrong way to do it. Yeah. So basically I would say the exact opposite of you is the quintessential vibe coder who is switching tools every single week, right? Like, there's a new Replit update, goes to
33:11
Speaker A
Replit Agent, switches to Lovable, switches to this and that, constantly switching and never learning any programming principles, anything about software engineering, nothing. Your approach is basically, the difference is in approach. It's not like you don't believe in AI, obviously. Right
33:26
Speaker A
now you're heavily trying to be at the edge of AI and educating people on how to use it. It's more about the difference of approach. It's like, "Listen guys, learn the fundamentals. Learn how code works, what good software looks like, and
33:39
Speaker A
this is going to be valuable no matter what. No matter if OpenAI is ahead, Anthropic is ahead, Gemini is ahead." Versus the exact opposite approach, which unfortunately I think most of the people who are new to AI take, is like
33:49
Speaker A
jumping on the latest trend and like switching everything the moment, you know, some new update or tool comes out.
33:55
Speaker A
Totally. And I think, you know, that's, you know, you can do that and that's exciting. Um, but you're not really increasing your skills that way. And it's your skills. I firmly believe that are the ceiling to what AI can do. You
34:08
Speaker A
should be focused on yourself. You know, upskilling yourself for this new world instead of thinking um, right? How do I delegate my thinking? How do I delegate more? You know, you should be pulling more into your own domain and delegating
34:22
Speaker A
only the tactical stuff. Keep the strategic mindset. keep thinking about you know the next uh months and weeks ahead the road map of where you're going in your code instead of just um trying to delegate that to you know people are
34:35
Speaker A
obsessed by the idea that you know you can just delegate everything to AI and you can't you really can't and I don't see I mean again I'm not a pundit you know I'm just looking at what we have
34:44
Speaker A
right now and it doesn't yeah yeah I I am the person in the real world that's driving this stuff I need to be the one making product decisions I know where I'm going And I think me as a
34:57
Speaker A
developer, I should be in control and I need the skills to be able to do that.
35:01
Speaker A
I agree. One note I'm going to share on Fable, something that happened yesterday, which is a bit scary and definitely doesn't follow security practices, is that I was setting up a new agent for
35:12
Speaker A
Twitter, and basically the Twitter API was bugged. The developer console wasn't loading some buttons, and I tried it on a different browser.
35:20
Speaker A
It still didn't work. I disabled all extensions. It still didn't work. So I gave it like a few solid minutes to try to debug it and I failed. I mean I didn't it wasn't the main thing I needed
35:28
Speaker A
to get done, so I didn't really try as hard as I could, but I gave it to Cursor powered by Fable. It used the built-in browser inside of Cursor. You know, I had to log in obviously to the console, but
35:40
Speaker A
apart from that it started clicking. It created API keys copied them. Again I do not recommend this for production apps.
35:46
Speaker A
This is just a simple thing for me. And then it figured out when it did the testing that those API keys were in a different like app in the console and they actually weren't using the credits I charged up. So then it moved the app
35:59
Speaker A
again using the built-in browser inside of Cursor, and for me, I really felt like, what am I doing here? Like, obviously I described what we're building, why we're building it, some of the, you know, kind of my version of
36:09
Speaker A
grill me at the start, but then I felt like, okay, I just logged into the console and I just charged up a few dollars, but everything else the AI was doing, right? So I felt like my
36:20
Speaker A
value in this project was a lot lower than with previous models. So what's your thoughts on this?
36:29
Speaker A
I mean, if you think about the AI's output, right, what it what it was doing at the end there, um, it needed like how does the AI know at the end that it's done a good job, right? What it is is
36:42
Speaker A
the theory here that you can disappear from the project completely. No, you're still needed, right? Like all we're doing here is we've just given the AI a set of tools and we're it's, you know, we've given it a scoped task and it's
36:53
Speaker A
performing that task, right? you know it we've given it a goal and we said you know do blah blah blah blah blah I don't think of that as that particularly magical you know that's something that uh agents can do now you just give them
37:03
Speaker A
the tools and they go and do it. But to decide whether that's the right thing to do, to security test that at the end, that's something that you're needed for, right? You, David, are needed
37:15
Speaker A
for that to know whether it's done a good job and so yeah we can delegate more but I don't think that's a reason to start thinking um you know or have AI psychosis or anything. It's just yeah, it's a reasonable thing that the AI can
37:28
Speaker A
do with computer use. I've also seen a lot of people report like they they were, you know, maybe looking for optimizations or doing some feature and then again I'm talking about fable because it just came out. It's topical, right? So, it's on top of my
37:42
Speaker A
mind. But a lot of people reported that it found like deeper bugs that they didn't notice at all. Whereas other models completely miss those. And that I I I would like again I would challenge you slightly that that's a sign of like
37:54
Speaker A
AI being able to do more. I'm not saying we need to be completely removed from the loop, but if the AI is, you know, redesigning the front end and it finds an issue in one of the backend API
38:05
Speaker A
endpoints like a major security issue I would argue that that's like AI being more involved.
38:11
Speaker A
It's not a 50/50 at that point. Yeah. So you're saying that the better the engine is, the more value you can bring to the business just by having the engine, and those effects are emergent. You don't
38:26
Speaker A
know what you're going to get by increasing the power of it. It will still know the vision, right? It will still know what you're doing here. Like, this is an educational repository for my students in my paid community, or this is something just for
38:37
Speaker A
my team. It will be used by roughly five people. The purpose of this is XYZ. It will still know the core idea, the initiative that comes from you. But in terms of the actions and what happens, my argument would be that as
38:49
Speaker A
the models get more powerful, more and more of this is going to be done by the AI. But not only that, the AI will spot what needs to be done, such as the example with the, you know, deep bugs
38:59
Speaker A
that the user wasn't even debugging. Totally. But I think that we think that the model is the only way to get there, right? What you could be doing in your repository is you could run a
39:10
Speaker A
cron job that runs every single day, let's say, and does a security review and every day it checks a new part of the repo, right? And you could use a relatively simple model for that and you probably get some decent results. I mean
39:20
Speaker A
this idea that there are deep bugs that um you know or deep sort of security things inside your application that uh the model could spot and others cannot you know like sure that's like it sounds attractive but you could probably also
39:36
Speaker A
uncover those bugs with cheaper models if you just looked in the right places, you know, and you gave it the right prompt, let's say, for want of a better word, or the right harness. So I don't think there's something that's
39:47
Speaker A
necessarily special about the model that does those things or you know and I think that's again 50/50. If you had a harness that sort of was looking specifically for those things then you would find them and I think we're
39:58
Speaker A
lagging behind in our practices and expecting the model to just pick up the slack.
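The daily security review idea could look something like the Python sketch below. It only shows the rotation logic and a prompt; the scheduler (a cron job or similar) and the call to whatever model runs the review are left out. The directory names and the prompt wording are assumptions for illustration.

```python
from datetime import date


def area_for_day(areas: list[str], day: date) -> str:
    """Pick one area per day, cycling through all of them in order."""
    ordered = sorted(areas)  # stable order regardless of how the list was built
    return ordered[day.toordinal() % len(ordered)]


def review_prompt(area: str) -> str:
    """Hypothetical prompt text handed to whichever model runs the review."""
    return (
        f"Do a security review of the code under `{area}`. "
        "Report concrete findings with file paths; do not change any code."
    )


if __name__ == "__main__":
    areas = ["src/api", "src/auth", "src/db", "src/ui"]  # hypothetical repo layout
    print(review_prompt(area_for_day(areas, date.today())))
```

Because consecutive days map to consecutive indices, every area gets reviewed once per cycle, which is the "looking in the right places" part that does not depend on which model is used.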
40:03
Speaker A
You can absolutely just run Opus and get it to do that stuff. You know, people were talking about this like when Opus 4.5 came out, whoa, all these security things that Opus, it's just like, sure, it's found them and you can just get
40:16
Speaker A
that with a harness and just get it to do it again and again. I like Yeah, I I understand. I understand like you're basically pushing against the hype wave, you know, you're trying to like implement some sense, some wisdom
40:28
Speaker A
into this. Say like, guys, okay, the models are getting better. Yes. But at the same time, let's not lose the obvious, you know, optimizations, the obvious things that have always been true. Maybe if you had a
40:40
Speaker A
better harness, you could spot it even in the previous generation model. Or maybe you didn't have to spend $2,000 on API tokens, maybe only 200, you know, stuff like that. So yeah, I completely agree with you there. You're trying to
40:50
Speaker A
be like one thing one thing just to to finish there, which is that what is this what is this thing that you've learned from Fable looking at your code and spotting a security issue?
41:00
Speaker A
What you've actually learned? Sure, you've learned that Fable is good definitely, but you've also learned that there are security issues in your code, right? And you should probably have something that runs and checks for more security issues in the future. We need
41:12
Speaker A
to build loops into our um loops for God's sake um into our I mean, we can talk about that as well if I've got some opinions there. Um, you need to build these uh systems that just check your
41:27
Speaker A
like you need, what am I trying to say? You need to figure out why it happened like why it even got to this place. You know, it's like if someone keeps stealing your bike, maybe buy a lock.
41:38
Speaker A
Yes, exactly. Maybe we need to uh be designing systems that are self-improving over time, right? And this is something that we've been doing as software engineers for a long time.
41:50
Speaker A
We write test suites so that we can test our own code. We do human reviews so that we can make sure things are looking the way they need to. We refactor so that we can change code better in the
42:00
Speaker A
future. And sure, a model has uncovered that we need to do a bit more of that.
42:04
Speaker A
So let's do a bit more of it. But we don't need to use the fancy model in order to get those insights. See, that's what one of the things I would put on the list is like the thing that really
42:13
Speaker A
separates the people who are going to go super fast with the AI and build better and more software versus people who are not. Like most people in that situation, they would just say, "Oh yeah, Fable is great. Fix the bug. It fixes the bug."
42:24
Speaker A
But like the the people I don't know if it's like 10x developer, it's almost like 10x AI builder, you know, because everybody's becoming more of a builder, whether it's a designer background, a developer background. It's like that person would look at the underlying
42:37
Speaker A
issues. is like how did that even happen? How did I have this bug for so long that I didn't notice it and try to patch the underlying issue? You know, whether it's a new skill, a new system, better staging process, whatever that I
42:49
Speaker A
think I would put as one of the things on the list of your human capabilities or things you should have to get the most out of AI. Totally agree.
42:58
Speaker A
All right. So, you mentioned loops. This was super viral on Twitter. Maybe it still is, but like, you know, a week ago. I think it started with Peter Steinberger, if I'm not mistaken. But basically people are obsessing over
43:09
Speaker A
agentic loops. Uh half of it I would say is like the research labs selling more tokens. You know basically you should be running loops to pay us more endless tokens. Stop prompting your agents.
43:19
Speaker A
Figure out what loops it can run forever permanently. Half of it could be useful.
43:23
Speaker A
What's your thoughts? So what we're essentially talking about here is the difference between human in the loop work and AFK work. Right. human in the loop work being the human you are there with the agent um talking together
43:38
Speaker A
and like uh figuring out something. So really useful for planning, really useful for some kind of more complicated implementations, uh really useful for unscoped work, you know, stuff that you just need to uh figure it out locally with the agent. And then we're talking
43:51
Speaker A
about AFK stuff. So AFK away from keyboard, you ping off the agent and it goes and does something. Now, I think that I mean the moment that I discovered AFK was the moment I really got into AI coding and the moment I was
44:07
Speaker A
really able to massively increase my output because then instead of me having to sit in the loop, handle all the permissions requests, handle all of the, you know, anything the agent needs to ask me. The moment I can just remove
44:18
Speaker A
myself from the equation, I've parallelized myself. Suddenly there are two of me, you know, or three of me, four of me, five of me, able to go and produce so much more code that I then go and review. This idea that loops are the
44:29
Speaker A
only way to do it is crazy, you know. Like, we're essentially talking about, the history of this goes back to Geoffrey Huntley. Uh, where is it? ghuntley Ralph. It goes back to Ralph. You remember Ralph?
44:42
Speaker A
Yeah, I was talking about Ralph in January, I think. The original article comes from the 14th of July last year. And essentially it's a loop. So this is the idea where you have a while loop that says, okay, pass this prompt to Claude
44:57
Speaker A
Code, and then eventually you'll be done. It's essentially just running Claude Code again and again and again. That's the idea of the Ralph loop that I was talking about for a while.
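A minimal Python sketch of a loop in that spirit: the same prompt is handed to an agent over and over until the agent reports it is finished. The `run_agent` callable stands in for a real agent invocation, and the completion marker and iteration cap are assumptions added here so the example terminates; they are not taken from the original article.

```python
from typing import Callable


def ralph_loop(
    run_agent: Callable[[str], str],
    prompt: str,
    done_marker: str = "ALL TASKS COMPLETE",  # assumed completion signal
    max_iterations: int = 20,                 # assumed safety cap
) -> tuple[int, bool]:
    """Feed the same prompt to an agent until it reports completion.

    Returns (iterations_run, finished).
    """
    for iteration in range(1, max_iterations + 1):
        output = run_agent(prompt)  # each call stands for a fresh agent run
        if done_marker in output:
            return iteration, True
    return max_iterations, False


if __name__ == "__main__":
    # Stand-in agent: pretends the work takes three runs to finish.
    calls = {"n": 0}

    def fake_agent(prompt: str) -> str:
        calls["n"] += 1
        return "ALL TASKS COMPLETE" if calls["n"] >= 3 else "still working"

    print(ralph_loop(fake_agent, "Pick the next task from the plan and do it."))
```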
45:11
Speaker A
And what I realized is I don't really need to run this as a loop, right? The only thing I need out of this is the AFK agent to take on a specific task and do that task.
45:20
Speaker A
The way I mostly think about these things is as queues, okay, queues, not loops. The queue is really the backlog of tasks that I need to complete. I'm looking at the Sand Castle issues right now. These are bug reports coming in about Sand
45:34
Speaker A
Castle feature requests, things like that. I need to scope the item. Let's say it's this for instance. So, I've done a bit of triage here. It's um sort of explored. Okay, is this trivial? Is this possible? This uh was done AFK,
45:48
Speaker A
right? So, this this item has been picked off the queue. It's been explored, been put back on the queue. I might then need to go and actually implement this. Uh looks like yeah, this looks pretty good. I'll actually add the
45:59
Speaker A
agent implement label and I'll go and implement this in my GitHub Actions Sand Castle setup that I was talking about earlier.
46:06
Speaker A
Now, this isn't a loop really like it's sort of just uh it's a queue that eventually gets resolved. This will come off the queue once it gets um uh once the pull request gets merged. And that's all development is really. You just have
46:21
Speaker A
a queue of tasks that you need to get done. Project managers add more stuff to the queue. You complete the tasks in the queue. Like that's how we've always done it. And there are multiple nodes picking stuff off the queue, multiple
46:31
Speaker A
developers. And so an idea that there's a single loop that just sort of goes and completes all the tasks doesn't really match with how, you know, developer teams generally work. When it's all sort of inside GitHub Actions like this,
46:46
Speaker A
anyone any developer can add one of these labels can trigger something and can just get work going. So yeah, I I think the idea of the loop is useful but it's not the whole picture. And I think an idea of a queue where you're picking
47:01
Speaker A
tasks off is is better. But mostly it's just sort of nonsensical really. Like when people talk about you need a loop prompting your agent, we're really just talking about AFK agents.
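The queue framing can be sketched in Python as follows. An issue sits in the backlog until someone adds a trigger label, a worker picks it up, and it leaves the queue when the work is merged. The `Issue` type and the `agent-implement` label spelling are illustrative assumptions, and treating every PR as merged immediately is a simplification.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Issue:
    number: int
    title: str
    labels: set[str] = field(default_factory=set)
    merged: bool = False


def next_issue(backlog: list[Issue], label: str) -> Optional[Issue]:
    """Return the first unmerged issue carrying the trigger label, if any."""
    for issue in backlog:
        if label in issue.labels and not issue.merged:
            return issue
    return None


def work_queue(
    backlog: list[Issue], label: str, implement: Callable[[Issue], None]
) -> list[int]:
    """Take labelled issues off the queue one at a time until none are left."""
    done = []
    while (issue := next_issue(backlog, label)) is not None:
        implement(issue)             # e.g. an AFK agent run that opens a PR
        issue.labels.discard(label)  # the label is the trigger; drop it once handled
        issue.merged = True          # simplification: treat the PR as merged
        done.append(issue.number)
    return done


if __name__ == "__main__":
    backlog = [
        Issue(1, "Crash on empty config", {"bug", "agent-implement"}),
        Issue(2, "Support Podman", {"feature"}),  # not yet triaged, stays queued
        Issue(3, "Typo in README", {"agent-implement"}),
    ]
    print(work_queue(backlog, "agent-implement", lambda issue: None))
```

Unlike the single loop, this stops as soon as nothing carries the label, and several workers could call `next_issue` against the same backlog.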
47:10
Speaker A
Yeah. I guess uh when you talked, I don't know why, but the the image that came into my head is like a medieval king managing a a kingdom with like some ministers or whatever. And basically assuming you know the king knows the
47:24
Speaker A
best, has the most context, not like a king that just randomly inherited an empire, right? So if you deployed a minister into some region, a far region, and you never heard from him, never gave him commands, he would be
47:37
Speaker A
running on a loop and that could go wrong or could go right depending on you know how complex the issues are in that region, how how smart the minister is, whatever. But ultimately as as the king in that medieval kingdom, you want to do
47:49
Speaker A
the queue approach. You want to have people come to you and say, like, we have a problem, an upcoming invasion, you know, or there's a famine in this region, and you have this queue of problems and you are still in charge. So that would be the
48:01
Speaker A
equivalent of a human here with you know a bunch of agents bunch of AIs still you would be prioritizing okay we have these 50 bug reports only three of them are critical let's fix those first okay we have these resources um this brand deal
48:16
Speaker A
this company wants to work with us, check their reputation first. Is that a good way to think about it?
48:22
Speaker A
Totally. And what we're doing here is, like, you're still able to build tons of automation into here. Let's say that I had some kind of telemetry set up for Sand Castle, or an observability tool like Sentry or
48:34
Speaker A
something. I could get a bug report from a live application uh create an issue from it immediately tag that issue as like explore the issue. Maybe the agent could return some structured data from the explore saying can we fix this
48:46
Speaker A
immediately or does this need a human in the loop? It goes and implements it. It goes and reviews it. And then maybe it has a little tag on it saying, "Can we automatically merge this or does it finally like um ping the user to go and
48:57
Speaker A
do it?" Like, I see these systems as: you need human-in-the-loop checkpoints, and you need to push those further and further right, further and further towards the final output, as far as you can. So you would
49:12
Speaker A
essentially get these like in instead of like seeing the bug report, you would see the bug report, you would see the um exploration of the codebase, you would see the fix and you see like um can we review this?
49:25
Speaker A
Yeah. Just like that's that's what you get as the human instead of seeing the bug report and it's just so much richer and it means it's one button click away instead of a whole debugging session away.
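One way to picture the structured data and checkpoints in that pipeline is the Python sketch below. The field names, the two routing rules, and the `NextStep` values are all assumptions made for illustration; a real setup would decide these from its own triage and review agents.

```python
from dataclasses import dataclass
from enum import Enum


class NextStep(Enum):
    IMPLEMENT_AFK = "implement_afk"  # agent proceeds without a human
    ASK_HUMAN = "ask_human"          # human-in-the-loop checkpoint


@dataclass(frozen=True)
class Exploration:
    """Structured result an explore step might return (hypothetical shape)."""
    issue_number: int
    summary: str
    can_fix_immediately: bool
    changes_behavior: bool


def route_after_explore(exploration: Exploration) -> NextStep:
    """First checkpoint: implement straight away, or stop and ask?"""
    if exploration.can_fix_immediately:
        return NextStep.IMPLEMENT_AFK
    return NextStep.ASK_HUMAN


def can_auto_merge(exploration: Exploration, review_passed: bool) -> bool:
    """Last checkpoint: only merge unattended when review passed and behavior is unchanged."""
    return review_passed and not exploration.changes_behavior


if __name__ == "__main__":
    result = Exploration(
        issue_number=42,  # hypothetical issue
        summary="Misaligned button in settings page",
        can_fix_immediately=True,
        changes_behavior=False,
    )
    print(route_after_explore(result), can_auto_merge(result, review_passed=True))
```

Pushing the checkpoint "to the right" then means the human is only pinged when `can_auto_merge` is false, and at that point the report, the exploration, and the fix are already attached.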
49:36
Speaker A
So that's, I mean, so then the question here. Yeah. Yeah. So the question in that situation becomes, because it's not a loop, right, it only runs when the bug comes, there's no point in it running infinitely, just paying OpenAI or
49:49
Speaker A
Anthropic infinitely. But my question was, you know, again, as the AI gets more powerful, because you mentioned you push yourself further and further to the right, to where the last step is pushing to production: when does
50:04
Speaker A
it cross the threshold where like these type of things whether it's like a small UI change you know user requests a new color scheme whatever like it could be approved automatically, right? And then maybe we go more and more. So, how does
50:16
Speaker A
that look like? Do you see what where I'm getting at? Oh, how do you remove human in the loop checkpoints is what you Yeah. Like where do you decide basically where it's it's trivial enough for you to not even look at, right? Like maybe
50:28
Speaker A
maybe all the agents you have which again you set up the harnesses, they have your skills, you use a good model and all the agents are like, "Okay, this is a small bug. It was just a misaligned UI element. there is no, you know,
50:41
Speaker A
harmful intent from the user. The user isn't trying to hack the application. We're just going to merge it into prod right away. That will presumably grow like the the scope of things that could be merged to prod right away. So, how
50:52
Speaker A
would you think about that? Well, what I'd say is like what do you gain from review, right?
50:57
Speaker A
Sure. You gain, okay, the ability to gate dangerous things from going into production. So, prevent bad security stuff happening, prevent, you know, let's say, Claude Code source code being leaked to the
51:14
Speaker A
world you know you prevent that bad stuff. Um, so but you also gain uh insight into your own system into the into the plumbing, right? So you're watching the thing do its work and you're assessing did it do a good job.
51:28
Speaker A
And so that second one, you don't want to lose that because like again we're talking about the harness, right? You want to improve your harness over time and you want some observability into it.
51:39
Speaker A
Now you could remove some human in the loop checkpoint. So you could say, okay, this uh this PR is just an internal refactor. It just moves some code around. It doesn't actually change any behavior. And you could have an AI that
51:51
Speaker A
kind of says, okay, you don't really need to review that one. But then who reviews the AI that's doing that, right?
51:57
Speaker A
How do you give feedback to that over time? You probably do need to check some of the PRs that the agent says are fine to skip review on, to check if they are actually fine to skip. And then you improve
52:06
Speaker A
that over time. And so we need to think about this. We're not just reviewing the code. We're also reviewing the system that produces the code and that is important and useful but I agree the goal is to remove human in the loop
52:18
Speaker A
checkpoints where possible definitely. So maybe the better way rather than like okay let's say in a average day for this application AI autonomously fixes 20 things and pushes to production right away because they were super small at the end of the day instead of you like
52:33
Speaker A
reviewing all these because that'll be boring and slow maybe you get a custom you know teach skill HTML file and say like okay this is the common patterns in the bugs that were fixed right so like instead of you having to go through all
52:45
Speaker A
of the GitHub commits, PRs, whatever, which is not really optimized for this agentic era. I mean, again, GitHub was created a long time ago. It would be custom software, a custom HTML file, whatever, that, you know, knows you, your learning
52:59
Speaker A
style, your common mistakes. It has a history of the bugs in the past, you know, whatever. And it would be more optimized to helping you improve yourself and the system.
53:10
Speaker A
Totally. I mean, one really cool like what we're talking about here is in making review seamless and taking um taking the human effort out of review.
53:20
Speaker A
One thing that I've seen people do, which is crazy, is on any front-end change, it gets the AI to record a video of itself walking through the code and, like, the thing that changed. It then calls a text-to-speech API and overlays
53:36
Speaker A
some speech on top. So it's like the AI is talking to you while it walks through the code and you just have a video on the PR of the thing working like that sort of richness is something that we
53:47
Speaker A
should be building into everything that we do and trying to optimize for human review and make human review faster because everyone's sort of moaning about you know like oh man we've got so much code to review but probably you could be
53:58
Speaker A
using AI to help you review the code right like in in all sorts of interesting ways that I think we're just scratching the surface of Absolutely. So a lot of people want to build something with AI, right? Whether like you could
54:12
Speaker A
start with some personal tools, some you know something for your team, but a lot of people want to build a like business, whether it's AI startup, whether it's some other business. How would you think about that? Like you know, a lot of
54:23
Speaker A
people, there's a group of people who say, like, oh yeah, SaaS, you know, subscriptions, they're going to be more valuable than ever because you're going to be adding more seats for the agents.
54:31
Speaker A
There's a group of people who say like SAS is dead. How are you thinking about building a business, building software in the age of AI?
54:39
Speaker A
Well, I don't think that much has changed about it, to be honest. Like, again, I'm not a pundit. I don't really watch markets. I don't really care whether SaaS dies or thrives. Like, if you're building a business, what you
54:52
Speaker A
need to do is the fundamental stuff. You need to go and talk to customers. You need to figure out what they need. And then you need to build stuff, like you need to build prototypes that look like
55:00
Speaker A
what they need and solve their actual problem. I don't think anything has changed there, and I think you can learn to do that and get better at it, but I don't think AI gives you any particular advantage there because what you need to
55:10
Speaker A
do is go out in the real world and have conversations and figure out what it actually is people need. So I think all of the classic product design books will still make sense here. It's just you have a massive leg up when it comes to
55:23
Speaker A
actually implementing it and the procedures they talk about you can start delegating them to AI too. So mostly though, it's just about having the right idea and building the right thing. And that's not something that AI can help
55:36
Speaker A
with if you're not also talking to actual people and figuring out what they want. As soon as you figure out what people want, you're good to go.
55:44
Speaker A
Yeah, I think that's actually the thing that AI is notoriously bad at: original, out-of-the-box ideas. And yeah, that would probably be one of the main pieces of advice I would give to people: you need to be
55:57
Speaker A
choosing the features that get added, right? If you see somebody who's delegating all of that, asking "what's the next big thing we should add?", it's like no, you should be in charge of the product. Yeah, obviously you
56:06
Speaker A
don't have to learn the exact syntax or whatever. You don't have to read every file, but you cannot be asking the AI to build your app. You need to have the vision. You need to know why you're building it and like
56:16
Speaker A
what problem it's solving. Absolutely. You should be asking AI what thing you can remove from your app.
56:21
Speaker A
Basically, you should be asking: how do I make this simpler? How do I improve the UX? How do I actually focus in on what people want instead of ending up like one of those dreadful VC
56:32
Speaker A
funded apps that we've all seen where there's a thousand features and you can't find the thing that you want to do. So again, this is just product design fundamentals.
56:40
Speaker A
We mentioned that senior devs get like a 10x improvement, you know, speed-up. From my experience that's true, but only if they actually use the AI tools. There's still a group of developers that are
56:54
Speaker A
kind of refusing to believe it, or saying, you know, AI is not that good. They tried it a year ago, two years ago, and they were disappointed, but obviously tools, harnesses, and models are much better now. But my counterargument, or maybe it's not a
57:05
Speaker A
counterargument, is: what about just hiring young people who are true believers in AI, who know these tools inside and out? They use them all the time. They know what's the best
57:17
Speaker A
model, what's the best skill, what's the best, you know, agent in each situation. And obviously, they need to have some technical fundamentals. But how would you reconcile this tension of like these are seniors who have 10, 15, 20
57:31
Speaker A
years of experience and they get a 10x versus these are like true AI believers who might not have as much experience as the seniors, but like are better operators at using the AI? Well, hiring great juniors has always been the goal
57:46
Speaker A
of any company, basically, because if you find a great junior, anyone who's enthusiastic will do a better job than someone who's more experienced. Enthusiasm beats experience just in pure output, because they develop so much faster and they learn so much
58:01
Speaker A
faster. And so people who are really excited about this new age and know a lot about this stuff, if you can just pair that with a little bit of software fundamentals. Because what we're talking about here is, I think of there
58:14
Speaker A
as being a difference between DX, developer experience, and AX, agent experience. Agent experience is the experience that the agent has working in the codebase. And anything you can do, whether that's better skills, increasing the
58:31
Speaker A
power of the model, which works of course, improving the harness, and improving the codebase as well, that's amazing. Often people forget about improving the codebase for better AX. They
58:44
Speaker A
forget about all the edges you can get with good software fundamentals, and so that's where the senior will be useful, because the senior knows how to build good DX, right? If they're a good senior, they know how to
58:57
Speaker A
build a codebase that can work well with humans. And there's a huge overlap between good DX and good AX. The junior who's great at AI is just coming at the problem from a different point of view
59:11
Speaker A
from the senior. What was your original question? How do they get hired, or how do you hire out of both of them?
59:19
Speaker A
Sure. But not like who would you hire, but who will maybe get more alpha, you know, who will be more valuable? Is it the senior who has a lot of these experiences, you know, the right way
59:28
Speaker A
of thinking about software, but maybe isn't as much of a true AI believer, versus somebody who's fully embracing AI to the maximum and knows how to use it to the fullest?
59:38
Speaker A
I think if you have an experimental mindset and you're excited about AI then you're going to get a ton out of it whether you're junior or senior. And I think again if you're intrigued by the harness first of all and intrigued by
59:50
Speaker A
improving AX everywhere that you can, then you're going to thrive and love it. Now there are obviously a lot of good reasons that people have for not wanting to get on the AI train.
60:01
Speaker A
You know, they might just be a bit squeamish with the ethical stuff.
60:05
Speaker A
You know, like Anthropic stealing everyone's novels and just sort of pumping them into Claude. But it is here, and that's how the job is now. You know, if you're just a tactical programmer just plumbing away
60:22
Speaker A
doing your work, you're gone, right? Like that's out. You know, you can't be a code monkey anymore. You need to think strategically. And so seniors can absolutely make the most of that. But juniors can learn that, too.
60:34
Speaker A
All right. My closing question is going to be practical for the people watching. If you could take the average AI enthusiast and give him like one or two action steps to do today to either improve his setup, improve his harness,
60:46
Speaker A
learn something, what would those one or two things be? First thing I would do is I would delete every single skill, every single plugin, every single MCP server. I would go back. I'd delete your CLAUDE.md, delete your AGENTS.md, go back to absolutely
60:59
Speaker A
nothing, and then observe the agent. See what it does. In my experience, everyone bloats up their context window with too much stuff, with too many instructions.
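The "go back to nothing" step can be done without losing anything. What follows is a minimal sketch, not the speaker's own tooling: it assumes the instruction files are named CLAUDE.md and AGENTS.md and sit at the project root, and the function name `stash_instruction_files` is hypothetical. It moves the files into a backup folder rather than deleting them, so individual pieces can be layered back later.

```python
import shutil
import tempfile
from pathlib import Path

# Assumed file names (not confirmed by the transcript); adjust for your setup.
INSTRUCTION_FILES = ("CLAUDE.md", "AGENTS.md")


def stash_instruction_files(project_dir: Path, backup_dir: Path) -> list[str]:
    """Move agent instruction files out of project_dir into backup_dir.

    Returns the names of the files that were moved. Nothing is deleted, so
    each file can be restored later if it turns out to be missed.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in INSTRUCTION_FILES:
        source = project_dir / name
        if source.is_file():
            shutil.move(str(source), str(backup_dir / name))
            moved.append(name)
    return moved


if __name__ == "__main__":
    # Demonstrate on a throwaway directory instead of a real project.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        project.mkdir()
        (project / "CLAUDE.md").write_text("Always use tabs.\n")
        print(stash_instruction_files(project, Path(tmp) / "backup"))
```

Skills, plugins, and MCP servers are configured differently by each tool, so this sketch deliberately leaves them out; the same move-to-backup idea applies to whatever directories your harness uses.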
61:09
Speaker A
Go back to a blank slate and see what the agent does. Once you're seeing what the agent does in that basic sort of mode, then layer things on top of it and make sure those things are procedures,
61:22
Speaker A
procedure skills, not ability skills. Layer things on that you yourself decide. And my skills repo is a great place to start there. If you really miss something, if you really miss like brainstorming from superpowers, then bring that back. If you miss this, if
61:38
Speaker A
you miss that. And make sure that you install them in a way that you can customize them, so you can play around with them and experiment. If you're noticing problems, then try to find solutions to fix those problems and
61:50
Speaker A
try as much as you can to delegate the implementation to an AFK agent. AFK is just an incredible way to work. It just takes a little bit of setup, but once it's set up, it just goes crazy.
62:02
Speaker A
Right, V? Appreciate your time. Where should people find you? Uh, find me on Twitter, find me at aihero.dev, and I've got a newsletter where I post about all this stuff. So, aihero.dev, especially if you want to learn about my skills and learn about
62:15
Speaker A
updates to them, then go to aihero.dev/skills. All right, I'm going to link all of that below. Once again, thank you for your time, Matt, and have a great day. Now where is David?
Topics: agentic engineering, strategic programming, AI harness, software design, AI delegation, developer productivity, teach skill, codebase architecture, AI tools, software development

Frequently Asked Questions

What is the main difference between focusing on the AI model and the AI harness?

The AI model refers to the underlying machine learning system, while the harness includes the prompts, skills, and environment that control and optimize how the model is used. The video stresses that optimizing the harness gives developers more control and better results than focusing solely on the model.

How does strategic programming differ from tactical programming in the context of AI?

Tactical programming involves day-to-day coding tasks, which AI can now perform more efficiently. Strategic programming involves long-term planning, designing code architecture, and scoping tasks well, which remain critical human skills to maximize AI’s potential.

What role do developer skills play when working with AI according to the video?

Developer skills act as a multiplier for AI effectiveness. Skilled developers who understand their domain and software design can delegate tasks to AI more effectively, resulting in higher productivity and better software outcomes.
