By comparing frames, the encoder finds that most objects don't disappear; they just move. We use two main processes to exploit this:
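The first is motion estimation, which happens during encoding. The encoder divides the frame into small blocks and, for each one, searches between the I-Frame and the next frame for the best match. The offset it finds is a Motion Vector, and the vectors for all blocks together form a mathematical "map" of the movement.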
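To make the search concrete, here is a minimal sketch of exhaustive block matching, assuming grayscale frames stored as lists of rows and a sum-of-absolute-differences (SAD) cost. The names `make_frame`, `sad`, and `estimate_motion`, the 4x4 block size, and the 3-pixel search range are all illustrative assumptions; real encoders use much faster search strategies and sub-pixel precision.

```python
def make_frame(width, height, left, top, size, value=200):
    """Hypothetical test frame: black background with one bright square."""
    frame = [[0] * width for _ in range(height)]
    for y in range(top, top + size):
        for x in range(left, left + size):
            frame[y][x] = value
    return frame


def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a block of ref and a block of cur."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(ref[ry + dy][rx + dx] - cur[cy + dy][cx + dx])
    return total


def estimate_motion(ref, cur, cx, cy, size=4, search=3):
    """Find the offset (dx, dy) into ref that best matches cur's block at (cx, cy)."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(ref, cur, cx, cy, cx, cy, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that would read outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(ref, cur, rx, ry, cx, cy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# The square sits at (2, 3) in the reference frame and at (4, 4) in the next one.
ref = make_frame(12, 12, left=2, top=3, size=4)
cur = make_frame(12, 12, left=4, top=4, size=4)
print(estimate_motion(ref, cur, cx=4, cy=4))  # ((-2, -1), 0)
```

The vector (-2, -1) says "fetch this block from 2 pixels to the left and 1 pixel up in the reference frame", and the cost of 0 means the match is exact in this toy case.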
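The second is motion compensation, which happens during playback. The decoder takes the I-Frame and "warps" it using the stored map, then adds a small Residual (error correction) to fix whatever the motion vectors could not predict. Because the residual is normally quantized to save space, the result is a very close approximation of the new frame rather than a bit-exact copy.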
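The playback step can be sketched in the same style. This is an illustration under stated assumptions, not how any particular decoder is written: `predict`, `residual`, `quantize`, and `reconstruct` are hypothetical helper names, the frames are small synthetic grids, and the uniform quantizer stands in for the transform-and-quantize stage of a real codec.

```python
def predict(ref, vectors, size):
    """Build a predicted frame by copying each block from ref at its vector offset."""
    height, width = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for (bx, by), (dx, dy) in vectors.items():
        for y in range(size):
            for x in range(size):
                pred[by + y][bx + x] = ref[by + dy + y][bx + dx + x]
    return pred


def residual(cur, pred):
    """Per-pixel difference between the real frame and the prediction."""
    return [[c - p for c, p in zip(crow, prow)] for crow, prow in zip(cur, pred)]


def quantize(res, step):
    """Round each residual value to the nearest multiple of step (lossy)."""
    return [[step * round(r / step) for r in row] for row in res]


def reconstruct(pred, res):
    """Add the residual back onto the prediction."""
    return [[p + r for p, r in zip(prow, rrow)] for prow, rrow in zip(pred, res)]


# Synthetic 8x8 reference frame; the next frame is shifted right by one pixel
# and slightly brighter, so motion alone cannot predict it exactly.
ref = [[(x * 7 + y * 13) % 256 for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] + 3 for x in range(8)] for y in range(8)]

# One vector per 4x4 block, keyed by the block's top-left corner (assumed values).
vectors = {(0, 0): (0, 0), (4, 0): (-1, 0), (0, 4): (0, 0), (4, 4): (-1, 0)}

pred = predict(ref, vectors, size=4)
res = residual(cur, pred)

exact = reconstruct(pred, res)
print(exact == cur)  # True

lossy = reconstruct(pred, quantize(res, step=8))
worst = max(abs(a - c) for arow, crow in zip(lossy, cur) for a, c in zip(arow, crow))
print(worst <= 4)  # True
```

With an unquantized residual the frame comes back exactly. Once the residual is rounded to multiples of 8, each pixel can be off by at most half a step, which is the trade a lossy codec makes for a smaller file.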
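Modern H.264 video organizes frames into a Group of Pictures (GOP), typically using a pattern such as IBBP. Unlike P-frames, which only look backward, B-frames (bi-directional) can reference both previous and future frames. As a consequence, a future reference frame has to be decoded before the B-frames that depend on it, so the order in which frames are stored and decoded differs from the order in which they are displayed.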
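Because B-frames need a future frame, that future frame must reach the decoder first. The sketch below reorders a GOP from display order into one possible decode order, assuming the simple classic case where each B-frame depends only on the nearest I- or P-frame on either side. `decode_order` is a hypothetical helper, and H.264 itself permits more flexible reference structures than this.

```python
def decode_order(display):
    """Reorder frame types from display order so references precede B-frames."""
    out, pending_b = [], []
    for index, kind in enumerate(display):
        label = f"{kind}{index}"
        if kind == "B":
            # Hold B-frames back until the next I- or P-frame has been emitted.
            pending_b.append(label)
        else:
            out.append(label)
            out.extend(pending_b)
            pending_b = []
    # Trailing B-frames with no later reference in this GOP go last.
    out.extend(pending_b)
    return out


print(decode_order("IBBPBBP"))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The numbers are display positions: P3 is shown fourth but decoded second, because B1 and B2 cannot be built without it.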