I replaced some doubled-up nops with pushes and pops again, saving two bytes. There was also a nop in a loop that didn't look necessary: the jump back to the top of the loop already takes 13 cycles, so far more than 80 cycles are spent in that loop anyway.
I reworked things a little in parseHexPair and saved 5 bytes and 6 cycles, with more cycles saved in error cases.
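
For anyone who hasn't looked at that routine, here is a rough C sketch of what a function like parseHexPair presumably does: turn two ASCII hex characters into one byte. This is only an illustration, not the actual assembly. The names `hex_digit_value` and `parse_hex_pair`, the return convention, and the acceptance of lowercase digits are all assumptions. It does show why error cases can come out cheaper: a bad first digit bails out before the second digit is even looked at.

```c
#include <stdint.h>

/* Hypothetical helper: value of one ASCII hex digit, or -1 if invalid. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical stand-in for parseHexPair.
 * Reads s[0] and s[1], writes the combined byte to *out.
 * Returns 0 on success, -1 if either character is not a hex digit. */
int parse_hex_pair(const char *s, uint8_t *out)
{
    int hi = hex_digit_value(s[0]);
    if (hi < 0) return -1;   /* early exit: error path skips the second digit */

    int lo = hex_digit_value(s[1]);
    if (lo < 0) return -1;

    *out = (uint8_t)((hi << 4) | lo);  /* high nibble first, then low nibble */
    return 0;
}
```

Calling `parse_hex_pair("3F", &b)` returns 0 and leaves 0x3F in `b`, while `parse_hex_pair("G0", &b)` returns -1 without touching `b`.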