Every short-form video that works is made of the same seven pieces.
Source: This framework comes from Kallaway's appearance on Open Residency. Kallaway runs one of the most analytically rigorous marketing YouTube channels, known for breaking down content strategy into testable frameworks. Open Residency is a podcast that goes deep on content creation, marketing, and building audiences.
Most creators approach content like it's magic. Something either works or it doesn't, and the reasons feel mysterious. But when you break down any viral short into its constituent parts, patterns emerge. The creators who consistently perform aren't lucky. They've figured out which pieces to hold constant and which to vary.
Here's the framework.
The Seven Pieces
1. Topic - What the video is about. The subject matter.
2. Angle - Your specific take on the topic. Two creators can cover the same topic with completely different angles.
3. Hook Structure - This breaks into three sub-components: visual hook, text hook, and spoken hook. They need to work together, which we'll get into.
4. Story Structure - How the information is organized. Case study, breakdown, listicle, narrative, tutorial. The container for your content.
5. Visual Format - How the visuals are laid out on screen. Split screen, full screen, picture-in-picture, vlog POV, talking head with graphics.
6. Key Visuals - The actual visual assets you're using. A-roll, B-roll, screen recordings, graphics, text overlays.
7. Audio - Music and sound effects. Often overlooked, but it shapes how content feels.
That's it. Every short-form video you've ever watched is some combination of these seven pieces. Think of them as bricks.
Why This Matters
The framework changes how you approach content creation. Instead of staring at a blank screen wondering what to make, you can reverse-engineer what's already working.
Find creators winning in your space. Take their best-performing videos. Explode each one into these seven pieces. Write down exactly what they did for each category.
Now you have a blueprint.
The goal isn't to copy. It's to identify which elements are driving performance and which are just stylistic choices. If every top performer in your niche uses the same story structure, that's probably load-bearing. If they all use different visual formats, that's probably flexible.
The Beginner Strategy
If you're early in your content journey, you have too many unknown unknowns. There are a hundred things you can't do yet and another hundred you don't even know you should be doing.
The fastest way to reduce that sphere of unknowns is to hold most variables constant.
Of the seven bricks, topic and angle will probably change for every video. That's fine. But hook structure, story structure, visual format? You can replicate what's already working in your space. Same split-screen motion. Same case-study format. Same hook cadence.
This isn't creative bankruptcy. It's learning the fundamentals before you try to innovate. Beginners who try to invent new formats before mastering the basics usually fail because they're solving too many problems at once.
Copy the structure. Change the topic. Ship the video. Learn what happens.
The Expert Strategy
Once you've internalized the fundamentals, you can start varying the pieces strategically.
Maybe everyone in your niche uses talking-head format with text overlays. You could borrow the visual format from a completely different industry and see what happens. Maybe fitness creators are doing something with split-screen motion that no one in B2B has tried yet.
This is where creativity actually lives. Not in ignoring what works, but in remixing elements from different sources into something that feels fresh while still being structurally sound.
The framework gives you a vocabulary for these experiments. Instead of vaguely trying to "be more creative," you can say: I'm going to hold story structure and audio constant, but try a completely different visual format. That's a testable hypothesis.
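One way to keep such an experiment honest is to write both videos down as seven-piece records and check which pieces actually changed. The sketch below is only an illustration: the `VideoSpec` class, the `changed_bricks` helper, and every example value are hypothetical, not part of the framework itself. Topic and angle are ignored by default because they change from video to video anyway.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class VideoSpec:
    """One video described by the seven pieces (hypothetical record type)."""
    topic: str
    angle: str
    hook_structure: str
    story_structure: str
    visual_format: str
    key_visuals: str
    audio: str


def changed_bricks(baseline, variant, ignore=("topic", "angle")):
    """Return the names of the pieces that differ, skipping the ignored ones."""
    return [
        f.name
        for f in fields(baseline)
        if f.name not in ignore
        and getattr(baseline, f.name) != getattr(variant, f.name)
    ]


# Made-up example values, for illustration only.
baseline = VideoSpec(
    topic="cold email",
    angle="what most teams get wrong",
    hook_structure="aligned visual/text/spoken",
    story_structure="case study",
    visual_format="talking head with graphics",
    key_visuals="a-roll plus text overlays",
    audio="low-key background track",
)

# New topic and angle, plus the one piece under test.
variant = replace(
    baseline,
    topic="pricing pages",
    angle="one change that matters",
    visual_format="split screen",
)

print(changed_bricks(baseline, variant))
```

If the list contains more than one name, the experiment is varying several pieces at once, and the result will be harder to attribute to any single change.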
Hook Alignment
The hook structure brick deserves extra attention because it's where most videos fail.
Your hook has three pieces: visual, text, and spoken. They need to deliver the same message, or close to it. Viewers can't reliably process several competing messages at once. If your visual says one thing, your text says another, and your spoken hook says a third, the viewer can't keep up. They swipe.
Think about it like this: your viewer is scrolling fast. They see motion in their peripheral vision (visual hook). Their eye catches text on screen (text hook). They hear someone start talking (spoken hook). All three of these inputs hit almost simultaneously. If they're not aligned, there's cognitive friction. Friction means swipe.
The best hooks have all three elements reinforcing the same core message. The visual demonstrates it. The text states it. The spoken word elaborates on it. Same message, three channels, maximum comprehension.
Visual Hooks: What Actually Stops the Scroll
Your brain is wired like a deer's. It detects motion before anything else. High motion, high color, high contrast. That's what your visual hook needs.
This is why split-screen formats with left-to-right motion work so well. The movement catches the eye before the content even registers. Static talking heads have to work much harder to stop the scroll because there's nothing triggering that primal motion-detection system.
You can engineer this. Fast cuts in the first second. Camera movement. Graphics that animate in. Anything that creates visual motion gives you an advantage in the first fraction of a second when the viewer decides whether to stop or keep scrolling.
Story Structures That Work
Different story structures serve different purposes:
Listicle - "7 ways to do X" - Easy to produce, but heavily saturated. Works for topics with genuine list-based answers.
Breakdown - "How X achieved Y" - Case study format. Borrows credibility from the subject you're analyzing.
Tutorial - "Here's how to do X" - Step-by-step instruction. Best when you can show the actual process.
Narrative - "I experienced X, here's what happened" - Personal storytelling. Harder to execute, but harder to commoditize.
The trend is moving toward narrative formats. Transactional content (listicles, quick tips) dominated early short-form platforms because supply was low. Now that everyone can make a "5 tips for X" video, those formats are saturated. Personal narrative content is harder to copy because no one else has lived your experiences.
Applying the Framework
Here's a practical workflow:
- Identify 5-10 creators consistently winning in your space
- Pull each creator's 10 top-performing videos
- For each video, document all seven bricks
- Look for patterns across the set
- Note which elements are consistent (probably important) vs. varied (probably flexible)
- Build your first 10 videos holding the consistent elements constant
- Once you have baseline data, start experimenting with one brick at a time
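The pattern-finding steps above can be sketched as a small tally. This is a minimal illustration under stated assumptions: the `summarize_bricks` helper, the 0.7 cutoff for calling a piece "consistent", and the sample rows are all made up for the example. In practice the rows would come from your own notes on real videos.

```python
from collections import Counter

BRICKS = (
    "topic", "angle", "hook_structure", "story_structure",
    "visual_format", "key_visuals", "audio",
)


def summarize_bricks(videos, threshold=0.7):
    """For each piece, find the most common value and how dominant it is.

    A piece is labeled "consistent" when its most common value appears in at
    least `threshold` of the videos (an arbitrary cutoff chosen for this
    sketch), otherwise "varied".
    """
    summary = {}
    for brick in BRICKS:
        counts = Counter(video[brick] for video in videos)
        value, count = counts.most_common(1)[0]
        share = count / len(videos)
        summary[brick] = {
            "most_common": value,
            "share": share,
            "label": "consistent" if share >= threshold else "varied",
        }
    return summary


# Made-up breakdowns of four videos, one row per video, in BRICKS order.
rows = [
    ("pricing", "contrarian", "question", "case study", "split screen", "b-roll", "upbeat"),
    ("hiring", "personal", "question", "case study", "talking head", "a-roll", "upbeat"),
    ("sales", "data-driven", "question", "case study", "full screen", "graphics", "calm"),
    ("churn", "how-to", "statement", "case study", "split screen", "b-roll", "upbeat"),
]
videos = [dict(zip(BRICKS, row)) for row in rows]

summary = summarize_bricks(videos)
for brick, info in summary.items():
    print(f"{brick}: {info['label']} ({info['most_common']}, {info['share']:.0%})")
```

Pieces labeled "consistent" are candidates to hold constant in your first batch of videos. The "varied" ones are where experimenting one piece at a time is likely cheaper.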
The key is treating content like a system, not a lottery. Every video teaches you something about which elements matter for your specific audience. Over time, you develop instincts. But those instincts are built on data, not guesswork.
The Bigger Picture
Frameworks like this matter because content creation is increasingly competitive. The days of posting whatever and hoping it works are over. The creators and agencies that win are the ones who approach this analytically.
That doesn't mean removing creativity. It means channeling creativity into the right places. The framework handles the structure. Your unique perspective, your specific experiences, your particular angle on topics: that's where the differentiation lives.
Seven bricks. Infinite combinations. The constraint is actually liberating once you understand it.