The important thing to understand about datamoshing is that it exploits a technique common to many digital video compression algorithms. In digital video compression, some frames are coded only with reference to themselves ('I-frames' or 'key frames,' &c.) and some frames are coded with reference to other frames ('P-frames' and 'B-frames'). Frames that are coded with reference to other frames store motion vector displacements for small chunks of the frame (macroblocks, &c.). The idea of P- and B-frames is that small chunks of a video often do not change from frame to frame, or can be approximated very closely as a small amount of motion of a chunk from the previous frame. Encoding chunks of an image this way can take much, much less data than encoding the full image.
Datamoshing involves deleting certain frames, or otherwise changing their order, so that P- and B-frames reference the wrong I-frames. The decoder then applies the stored motion to image data it was never meant for, which produces the characteristic smeared, glitchy look.
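To make the mechanism concrete, here is a minimal sketch of a toy codec in Python. It is not how any real codec is implemented: the 2x2 block size, the one-pixel search range, the exact (lossless) residual, and every name in it (`encode_p`, `decode_p`, `make_frame`, and so on) are assumptions chosen for illustration. It encodes two short "scenes" as an I-frame followed by P-frames, then drops the second I-frame so that the second scene's P-frames get decoded against the last frame of the first scene.

```python
from typing import List

Frame = List[List[int]]
BLOCK = 2  # toy macroblock size; real codecs use larger blocks


def get_block(frame: Frame, y: int, x: int) -> Frame:
    """Return the BLOCK x BLOCK patch at (y, x), clamping to the frame edges."""
    h, w = len(frame), len(frame[0])
    return [[frame[min(max(y + j, 0), h - 1)][min(max(x + i, 0), w - 1)]
             for i in range(BLOCK)] for j in range(BLOCK)]


def encode_p(ref: Frame, cur: Frame, search: int = 1) -> list:
    """Encode cur relative to ref: one (dy, dx, residual) entry per block."""
    blocks = []
    for y in range(0, len(cur), BLOCK):
        for x in range(0, len(cur[0]), BLOCK):
            target = get_block(cur, y, x)
            best = None
            # Try every displacement in the search window, keep the closest match.
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    cand = get_block(ref, y + dy, x + dx)
                    cost = sum(abs(t - c) for tr, cr in zip(target, cand)
                               for t, c in zip(tr, cr))
                    if best is None or cost < best[0]:
                        best = (cost, dy, dx, cand)
            _, dy, dx, cand = best
            # Residual: whatever the motion-shifted block fails to predict.
            residual = [[t - c for t, c in zip(tr, cr)]
                        for tr, cr in zip(target, cand)]
            blocks.append((dy, dx, residual))
    return blocks


def decode_p(ref: Frame, blocks: list) -> Frame:
    """Rebuild a frame by applying motion vectors and residuals to ref."""
    h, w = len(ref), len(ref[0])
    out = [[0] * w for _ in range(h)]
    entries = iter(blocks)
    for y in range(0, h, BLOCK):
        for x in range(0, w, BLOCK):
            dy, dx, residual = next(entries)
            pred = get_block(ref, y + dy, x + dx)
            for j in range(BLOCK):
                for i in range(BLOCK):
                    out[y + j][x + i] = pred[j][i] + residual[j][i]
    return out


def encode_stream(scenes: List[List[Frame]]) -> list:
    """Each scene starts with an I-frame; its other frames become P-frames."""
    stream = []
    for scene in scenes:
        stream.append(("I", scene[0]))
        for prev, cur in zip(scene, scene[1:]):
            stream.append(("P", encode_p(prev, cur)))
    return stream


def decode_stream(stream: list) -> List[Frame]:
    """Decode in order; each P-frame uses the previously decoded frame as ref."""
    frames, ref = [], None
    for kind, payload in stream:
        ref = payload if kind == "I" else decode_p(ref, payload)
        frames.append(ref)
    return frames


def make_frame(bg: int, fg: int, x: int) -> Frame:
    """4x4 frame of value bg with a 2x2 square of value fg at column x."""
    return [[fg if 1 <= r <= 2 and x <= c <= x + 1 else bg for c in range(4)]
            for r in range(4)]


scene_a = [make_frame(0, 9, x) for x in (0, 1, 2)]  # square moving right
scene_b = [make_frame(5, 1, x) for x in (2, 1, 0)]  # different scene, moving left

stream = encode_stream([scene_a, scene_b])
clean = decode_stream(stream)

# "Datamosh": delete the second I-frame (index 3 in this stream).
moshed = decode_stream([item for n, item in enumerate(stream) if n != 3])

for row in moshed[3]:
    print(row)
```

With the stream intact, decoding reproduces both scenes exactly. With the second I-frame removed, the first scene still decodes correctly, but the second scene's motion vectors and residuals are applied to the first scene's pixels, so those frames no longer match the originals. Real datamoshing works on actual bitstreams and lossy codecs, but the failure it relies on is the same.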