Feb 2, 2022
I wrote a comment on that. On nearly all processors from the 2000s onwards, your approach is faster. The extra if-statements/jumps needed to do the 4-byte and 2-byte copies are not worth it. The processor is much better at doing the same thing over and over very fast, so doing up to 7 one-byte copies is much faster.
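To make the comparison concrete, here is a rough sketch of the two ways to copy the last few bytes. The names `copy_tail_bytes`, `copy_tail_branchy` and `tail_copies_match` are made up for illustration and are not from the code under discussion. I am assuming the tail is at most 7 bytes, left over after the bulk was copied in 8-byte chunks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simple tail: one tight loop doing up to 7 single-byte copies. */
static void copy_tail_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
    while (n--) {
        *dst++ = *src++;
    }
}

/* Branchy tail: separate tests for a 4-byte, a 2-byte and a 1-byte chunk. */
static void copy_tail_branchy(unsigned char *dst, const unsigned char *src, size_t n)
{
    if (n & 4) { memcpy(dst, src, 4); dst += 4; src += 4; }
    if (n & 2) { memcpy(dst, src, 2); dst += 2; src += 2; }
    if (n & 1) { *dst = *src; }
}

/* Returns 1 if both variants copy exactly n bytes (n <= 7) and agree. */
static int tail_copies_match(size_t n)
{
    const unsigned char src[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    unsigned char a[8] = {0};
    unsigned char b[8] = {0};
    unsigned char expect[8] = {0};

    assert(n <= 7);
    copy_tail_bytes(a, src, n);
    copy_tail_branchy(b, src, n);

    /* Expected result: first n bytes of src, the rest still zero. */
    memcpy(expect, src, n);
    return memcmp(a, expect, 8) == 0 && memcmp(b, expect, 8) == 0;
}
```

Both produce the same result; the difference is only in how many distinct branches the processor has to get through. Which one wins on a given machine and compiler is something to measure rather than take from this sketch, since an optimizer may rewrite either version.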