I think it's because the pixel reads and writes are all done individually between motherboard memory and graphics hardware memory, at the OS level. Block transfers are much more efficient, as are, of course, operations that involve only the graphics hardware.
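To make the difference concrete, here is a rough sketch in plain C. It is only a toy model, not real driver or OpenGL code: the hypothetical `transfer()` function stands in for one trip between system memory and graphics memory, and `TransferStats`, `copy_per_pixel()` and `copy_as_block()` are names made up for the illustration. It just counts how many separate transfers each approach issues for the same image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model (an assumption for illustration): each call to transfer()
   represents one separate trip between system and graphics memory. */
typedef struct {
    size_t calls;  /* number of separate transfers issued */
    size_t bytes;  /* total bytes moved */
} TransferStats;

/* Move `count` 32-bit pixels in a single transfer and record it. */
void transfer(uint32_t *dst, const uint32_t *src, size_t count,
              TransferStats *stats)
{
    assert(dst != NULL && src != NULL && stats != NULL);
    memcpy(dst, src, count * sizeof *src);
    stats->calls += 1;
    stats->bytes += count * sizeof *src;
}

/* Per-pixel approach: one transfer for every pixel in the image. */
TransferStats copy_per_pixel(uint32_t *dst, const uint32_t *src,
                             size_t width, size_t height)
{
    TransferStats stats = {0, 0};
    for (size_t i = 0; i < width * height; ++i)
        transfer(dst + i, src + i, 1, &stats);
    return stats;
}

/* Block approach: the whole image goes across in one transfer. */
TransferStats copy_as_block(uint32_t *dst, const uint32_t *src,
                            size_t width, size_t height)
{
    TransferStats stats = {0, 0};
    transfer(dst, src, width * height, &stats);
    return stats;
}
```

For a 64 x 48 image, `copy_per_pixel()` issues 3072 transfers and `copy_as_block()` issues one, yet both move the same number of bytes. On real hardware each transfer carries a fixed overhead, which is why the per-pixel route ends up so slow.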
OpenGL compiled (display) lists are the most efficient of all, and ideal for objects whose shape and texture mapping do not change (you can still stretch them).
Charles