/handoff is my new favourite skill — Transcript

Matt Pocock introduces his new 'handoff' skill for AI coding agents, explaining its use and benefits in managing long coding sessions efficiently.

Key Takeaways

  • Handoff skill enables efficient splitting of AI coding sessions to maintain focus and avoid context dilution.
  • Large AI model context windows have practical limits, requiring smart management of conversation history.
  • Compact is useful for long single sessions, but handoff is better for branching work into separate sessions.
  • Creating clear, markdown handoff documents helps continue work seamlessly across sessions.
  • Redacting sensitive data in handoff files is essential for security and privacy.
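
The redaction takeaway can be illustrated with a minimal sketch. The two patterns below are assumptions chosen for illustration (a `key=value` style secret and an email address); they are not taken from the skill itself, and real secrets come in many more shapes.

```python
import re

# Hypothetical patterns for illustration only; extend them for real use.
PATTERNS = [
    # "api_key=...", "password: ...", "token = ..." style assignments
    (re.compile(r"(?i)\b(api[_-]?key|password|token)\b(\s*[:=]\s*)\S+"), r"\1\2[REDACTED]"),
    # Simple email addresses, as one example of PII
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "[REDACTED_EMAIL]"),
]


def redact(text: str) -> str:
    """Return text with every pattern match replaced by a placeholder."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Running `redact` over the handoff text before it is written to disk keeps the placeholder names visible, so the next session still knows a key or password was involved without seeing its value.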

Summary

  • Matt Pocock discusses his new 'handoff' skill designed to compress and transfer session context between AI coding agents.
  • The skill creates a markdown summary of the current conversation to hand off to a fresh agent session.
  • He explains the challenge of large context windows in AI models and the concept of 'smart' and 'dumb' zones.
  • The 'compact' tool summarizes conversations to keep the context window manageable but is limited to a single session.
  • Handoff allows splitting work into separate sessions to avoid diluted contexts and maintain focus on distinct tasks.
  • Matt demonstrates using handoff during a grilling session to manage out-of-scope tasks efficiently.
  • The skill helps sharpen the current session by clearly defining what is handed off and what remains in scope.
  • He highlights the importance of redacting sensitive information in handoff documents.
  • The video also promotes his AI coding course for engineers starting June 1st.
  • Overall, the handoff skill improves session management and productivity when working with AI coding agents.

Full Transcript

00:00
Speaker A
A few weeks ago, I noticed myself doing something with agents that I thought was very clever, but I thought it was just too simple to require a skill. For those who don't know, I'm constantly thinking about skills. I'm constantly thinking
00:14
Speaker A
about how to package my instincts and coding practices into reusable skills. And this has meant my skills repo has almost 100,000 stars at the time of recording. The skill that I started to think about was a handoff skill. And the
00:28
Speaker A
theory was that this skill would take the context window of the current session and compress it down into a markdown file that could be handed off to another session. And so a couple of weeks ago, I shipped this. It's inside
00:39
Speaker A
skills inside productivity and it's inside handoff here. And it's a very, very simple skill. It says to write a handoff document summarizing the current conversation so a fresh agent can continue the work. Save it to the temporary directory of the user's
00:54
Speaker A
operating system, not the current workspace. I put this into my skills folder as an experiment to see how much I would use it. And it turns out I used it a lot. In this video, I'm going to show you a deep dive of the skill, kind
01:05
Speaker A
of why I designed it, what is the point of it, how it compares to built-in tools in some of these harnesses like compact, and also how you can get the most out of it to make the most of your grilling
01:17
Speaker A
sessions. And if you dig the kind of stuff I've been showing you, then you will love the course that I've put together, which is AI coding for real engineers, a two-week cohort for folks who want to use AI coding tools for
01:28
Speaker A
shipping quality code, not slop. It starts on June the 1st. We're doing a discount right now. Get into the link below so you can check it out. Let's start first of all by explaining why I made this skill and how it differs from
01:41
Speaker A
compaction, which you may have heard of before. When we're inside a session like this, a coding session, we essentially as we, you know, converse with the agent as it does tool calls, as it makes file edits, then this context window is going
01:53
Speaker A
to be filled up and filled up with more and more stuff in it. More and more tokens will fill up the context window.
01:59
Speaker A
Now in the harness I use, Claude Code, its context window is huge, right? You get 1 million tokens worth of context window, but there is actually a smart zone and a dumb zone in these context windows. Early on in the context
02:13
Speaker A
window, you are going to get much better performance from the agent because the attention relationships are not so strained there because there are much fewer tokens to calculate, fewer attention relationships between those tokens. Then the agent's attention isn't so diffuse. In other words, it's better
02:31
Speaker A
able to focus when there's less content in there. This means that as your conversation develops, you're going to get dumber and dumber and dumber responses from the agent all the way up to going up to, you know, 800,000
02:44
Speaker A
tokens, which personally I've never been in because around the 120k token mark, I start to feel like I'm in the dumb zone. So this means, yes, that even though Anthropic advertises a ton of context window on these models, really
02:59
Speaker A
for, you know, proper smart tasks, you've only got about 120k to work with, which means you need to budget really efficiently and you need to be aware of your context window at all times. So the question then becomes, what do you do
03:10
Speaker A
when you're starting to hit up against this dumb zone? How do you recover your conversation? How do you continue the conversation beyond the dumb zone while staying smart? And the answer to that is compact. What compact does is it will
03:24
Speaker A
take a large conversation like this and summarize it. So you go essentially from near to the dumb zone to all the way into the smart zone here.
03:34
Speaker A
And there's even sometimes an auto-compact buffer depending on what harness you're using and whether you've got it turned on, which means that when you're near to the end of the context window, let's say deep in the dumb zone, the
03:45
Speaker A
auto-compact buffer will kick in and automatically summarize your conversation inside a new session. This summary usually looks like the files referenced. So just a list of files that have been referenced. The things that you said in the conversation are usually
04:00
Speaker A
included and the general tone of the conversation as well. This is then included as a little nugget at the start of the new session. And as you build up context in the new session, then you're continually referencing the old session.
04:12
Speaker A
This means as you continue to compact and compact, you're going to end up with this kind of sediment of different layers here from previous conversations.
04:20
Speaker A
And this can be a little bit inefficient, but it's also a decent way if you want to do certain types of sessions where you just need to barrel on on the same problem again and again and again. It can be really useful for
04:32
Speaker A
debugging actually because you can compact all of the other options that you've tried and then continue to try different things, hit the barrier and then compact again to just save your state essentially. So it's a way of doing a long-running session, but it's
04:48
Speaker A
only really one session. So I continue to find compact a really, really useful tool for creating these long single sessions. But what I started to notice was I wanted to do other things with compact. I wanted to compact into
05:02
Speaker A
another session. For instance, let's say I was in one session here and while I was in this session, I noticed a little refactoring opportunity, something that was totally out of bounds, out of scope from my current session, but I knew I
05:14
Speaker A
would need to get there eventually. So, what were my choices? I could extend my current session, but then I would end up with this sort of like diluted context where I was half working on one thing, half working on the other, and I would
05:26
Speaker A
definitely hit the dumb zone, right? So, I probably wouldn't be able to finish my initial goal. I could compact, but then I would clobber all of the progress that I'd made in my current session, right?
05:38
Speaker A
What I really wanted to do was just say, "Okay, I want to complete this other thing in a separate session and keep my current session pure." In other words, this was what I wanted. I wanted to essentially take the context or take
05:50
Speaker A
just the slice that pertains to this extra bug fix, hand it off to another session, and then these two could just run independently. And so for a while what I was doing was saying, "Okay, take the stuff in my current session. I want
06:02
Speaker A
to fix this particular bug. Write me a handoff.md document so that I can then just pass that into another agent." And it turned out I was doing this so freaking often that I just decided, okay, I need a skill for this. I most
06:15
Speaker A
often use handoff while I'm grilling here. Here I'm inside a grilling session that I did for planning some future features for Sandcastle, which is my sort of software factory. And what you can see here is that I'm kind of
06:27
Speaker A
answering some questions. I'm only in Q2 of this grilling session. So not a long one. And I say here, I think in future we may want to move the iterations and the completion signal onto a separate API. In fact, let's hand off that task
06:40
Speaker A
to a separate agent. You can see here that when I'm defining the handoff, I'm saying the reason why I'm handing off and exactly what should be in that document. This does two things.
06:52
Speaker A
First of all, it actually sharpens the current grilling session I'm on. So, it says that given that constraint, Q2 collapses. So, it doesn't actually, like, it helps my current grilling session because I'm saying that's out of scope.
07:03
Speaker A
We'll pick that up somewhere else. It then goes and creates a markdown file just here with the focus for the next session: file a GitHub issue and eventual design for splitting iterations and the completion signal into a separate API. And then later, I
07:16
Speaker A
just pass this into another agent in order to create the issue. Simple. Another pattern that I really strongly recommend is handing off
07:32
Speaker A
grill with docs, which are more of my skills, you will often find there's two categories of questions you need to answer. There are the kind of known unknowns, the ones that the agent can ask you about, and then there's stuff
07:43
Speaker A
that you really need to see in code or need to see prototyped. This can be really true with like UI prototypes or complicated bits of logic that you're not quite sure how to deal with yet. So in this grilling session, we're down to
07:55
Speaker A
question 13 actually and we've got a sort of final resolution from the agent. And then we can see I say hand off to prototype the difficult bits here, the window communication, the tldraw SDK integration which was something
08:09
Speaker A
I was building at the time. It creates the handoff and then I go and implement the prototype on that branch. So in the prototype session, this ended up being a huge session. So 169K tokens, so way bigger than would have fit inside the
08:22
Speaker A
grilling. And what I did was I created this prototype of the UI and the kind of interaction that I wanted to see. And then I said, okay, let's hand this off back to the grilling session that spawned this. Take all of the learnings
08:35
Speaker A
from the prototype, anything that's not directly captured in the prototype itself or that's nonobvious, give me a handoff document that I can pass back to the planner. This is actually a really common pattern that I'm using here where
08:47
Speaker A
you have the initial session where you do some work, you hand off to another session. That session then creates another handoff document and then passes it back to the original session. It's almost like you've done a kind of DIY
08:59
Speaker A
subagent where you're able to use a context window for one specific task, compress your learnings from that task, and pass it back to the parent. Then I was able to finish the grilling session and create some proper PRDs and issues
09:13
Speaker A
with the prototype in there. So it's an incredibly rich pattern for actually getting what you need out of AFK agents and using prototypes. It's very very cool. It's worth saying too that the thing that's cool about just using like
09:27
Speaker A
a markdown document here and not relying on kind of native agent stuff is that you can have this first session be Claude Code, but you can just pass this to another agent, right? You can pass it to Codex or pass it to, you know, Copilot
09:40
Speaker A
CLI, whatever you're using. So if you want to do any kind of adversarial review or any kind of, you know, interaction between different coding agents, this is a very, very simple way to do it. We should also just read through
09:52
Speaker A
the final bits of the skill here just so you understand the reasoning behind everything. The theory here is include a suggested skills section in the document which suggests skills that the agent should invoke. I added this because I use skills to
10:08
Speaker A
kind of define the flavor of that session. And so having a suggested skills section means that you can kind of just paste the handoff document into the new session. It will invoke the skills needed like grill with docs or
10:20
Speaker A
diagnose or prototype or something and then you're kind of good to go. So you don't need to think about the skills that you need to use in the next session. It's pretty handy. Another one is do not duplicate content already
10:30
Speaker A
captured in other artifacts. I would often find these handoff documents just got really big and they were just duplicating stuff that was already present either in other markdown files or in resources like GitHub issues or things like that. So it's basically
10:44
Speaker A
saying just use pointers instead of, you know, repeating everything that's in the document. I also really strongly believe that you should save these handoff files to the temporary directory of the user's OS. In other words, these handoff files are disposable. They are
10:58
Speaker A
not something to be kept around for a long time to rot in your codebase as documentation. Another one is redact any sensitive information, API keys, passwords or PII. This is, you know, pretty essential. You don't want these floating around in markdown files in
11:12
Speaker A
just random places. And finally, if the user passed arguments, in other words, what the next session will be used for, treat those as a description as to what the next session will focus on and tailor the doc accordingly. I think of
11:22
Speaker A
this as essential for handoff because in order to write a decent document, the agent needs to know what the next agent session is going to focus on. Every time I use handoff, I always describe the purpose, the reason that we're handing
11:36
Speaker A
off because I just can't see how you would write a good handoff document otherwise. And of course, dictation makes this really easy because I just blast it out and we're good to go. So there we go. That's handoff. This is an
11:47
Speaker A
essential skill in my toolkit that, you know, just like a lot of my other skills, didn't exist but a few weeks ago. If you've been enjoying my skills, then you should check out the cohort course. It is an absolute banger. We had
11:57
Speaker A
about 2,500 people take it last time, and I'm expecting, you know, a decent whack this time, too. Other than that, thank you so much for watching. My bookshelf behind me is filling up with new coding books that I'm going to be
12:09
Speaker A
reading over the next couple of weeks. I'm thinking about maybe making a sort of what's on my bookshelf video of recommended books. And I don't know, if you like that, then maybe give us a like and a comment or let me know what you
12:20
Speaker A
want to see next. Either way, thanks for watching and I will see you very
Topics: AI coding, handoff skill, context window, compact tool, session management, coding agents, Matt Pocock, productivity, AI tools, software development

Frequently Asked Questions

What is the handoff skill introduced by Matt Pocock?

The handoff skill is a tool that compresses the current AI coding session's context into a markdown file, allowing the work to be handed off to a new session or agent to continue independently.
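
As a rough sketch of what such a handoff file might contain, the example below assembles a markdown document and saves it to the operating system's temporary directory, as the transcript describes. The section headings, function names, and default file name are hypothetical; the actual skill is a prompt for an agent, not a Python program.

```python
import tempfile
from pathlib import Path


def build_handoff(focus: str, summary: str, pointers: list[str], skills: list[str]) -> str:
    """Assemble a short markdown handoff document (hypothetical layout)."""
    lines = [
        "# Handoff",
        "",
        "## Focus for the next session",
        focus,
        "",
        "## Summary",
        summary,
        "",
        # Point at existing artifacts rather than copying their content.
        "## Pointers",
    ]
    lines += [f"- {p}" for p in pointers]
    lines += ["", "## Suggested skills"]
    lines += [f"- {s}" for s in skills]
    return "\n".join(lines) + "\n"


def save_handoff(markdown: str, name: str = "handoff.md") -> Path:
    """Write the document to the OS temporary directory, not the workspace."""
    path = Path(tempfile.gettempdir()) / name
    path.write_text(markdown, encoding="utf-8")
    return path
```

Keeping the focus as the first section mirrors the point made in the video that a good handoff depends on knowing what the next session is for; saving outside the workspace reflects that the file is meant to be disposable.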

How does the handoff skill differ from the compact tool?

While compact summarizes a conversation to fit within a single session's context window, handoff allows splitting work into separate sessions, preventing context dilution and enabling parallel task management.

Why is managing the context window important in AI coding sessions?

AI models have large but practically limited context windows, with performance degrading in the 'dumb zone' as more tokens accumulate, so managing context efficiently ensures smarter and more focused AI responses.
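
One way to picture the budgeting decision is the small sketch below. The 120,000-token threshold is the speaker's personal rule of thumb from the transcript, not a vendor figure, and the decision rule itself is an interpretation of the video rather than anything built into a harness. The function and parameter names are hypothetical.

```python
# Speaker's rule of thumb for where the "dumb zone" starts; an assumption, not a spec.
SMART_ZONE_LIMIT = 120_000


def next_step(tokens_used: int, has_out_of_scope_task: bool, limit: int = SMART_ZONE_LIMIT) -> str:
    """Suggest how to manage the session given the current context usage."""
    if has_out_of_scope_task:
        # Side work goes to a separate session so the current one stays focused.
        return "handoff"
    if tokens_used >= limit:
        # Same problem, but the context is getting too large: summarize in place.
        return "compact"
    return "continue"
```

For example, `next_step(150_000, False)` suggests compacting, while `next_step(40_000, True)` suggests a handoff even though there is plenty of budget left.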
