mirror of
https://github.com/Mbed-TLS/mbedtls.git
synced 2025-02-05 09:40:32 +00:00
fb0e4f0d1a
(A similar commit for Arm follows.)

Use specific instructions for moving bytes around in a word. This speeds
things up and, as a side effect, slightly lowers code size.

ARIA_P3 (aka reverse byte order) is now 1 instruction on x86, which speeds
up the key schedule. (Clang 3.8 finds this but GCC 5.4 doesn't.)

I couldn't find an Intel equivalent of ARM's rev16 (aka ARIA_P1), so I made
it two instructions, which is still much better than the code generated with
the previous mask-shift-or definition, and speeds up en/decryption. (Neither
Clang 3.8 nor GCC 5.4 finds this.)

Before:
O   aria.o  ins
s    7976   43,865
2   10520   37,631
3   13040   28,146

After:
O   aria.o  ins
s    7768   33,497
2    9816   28,268
3   11432   20,829

For measurement method, see previous commit:
"aria: turn macro into static inline function"
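For illustration, the two byte permutations discussed above can be sketched in portable C as follows. This is a minimal sketch, not the code from this commit: the function names are hypothetical, the mask-shift-or form is assumed to match the previous definition, and `__builtin_bswap32` (a GCC/Clang builtin) stands in for the x86 byte-swap instruction. The two-step version of P1 is a byte swap followed by a 16-bit rotate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* P1, mask-shift-or form: swap the two bytes inside each 16-bit half
 * (what rev16 does on Arm). */
static inline uint32_t p1_generic(uint32_t x)
{
    return ((x >> 8) & 0x00FF00FFu) ^ ((x & 0x00FF00FFu) << 8);
}

/* Helper: swap the two 16-bit halves, i.e. rotate by 16 bits. */
static inline uint32_t swap_halves(uint32_t x)
{
    return (x >> 16) | (x << 16);
}

/* P3, generic form: full byte reversal built from the two steps above. */
static inline uint32_t p3_generic(uint32_t x)
{
    return swap_halves(p1_generic(x));
}

/* P3 with a byte-swap primitive: compilers typically lower this builtin
 * to a single instruction on x86. */
static inline uint32_t p3_bswap(uint32_t x)
{
    return __builtin_bswap32(x);
}

/* P1 in two steps: reverse all four bytes, then rotate by 16 bits. */
static inline uint32_t p1_bswap(uint32_t x)
{
    return swap_halves(__builtin_bswap32(x));
}

/* Self-check: both forms of each permutation agree on a sample word. */
static inline void aria_perm_selfcheck(void)
{
    const uint32_t x = 0x11223344u;
    assert(p1_generic(x) == p1_bswap(x));
    assert(p3_generic(x) == p3_bswap(x));
}
```

Whether a compiler turns the generic forms into the shorter sequence on its own depends on the compiler and version, which is why spelling out the intended primitive can help.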