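Complex numbers stored as float16 and float32 can be multiplied on an ARMv8-A core such as the Cortex-A53. When compiling for AArch32, the multiply-accumulate steps map to the VMLA.F32 instruction; in AArch64 the corresponding instructions are FMUL and FMLA. The following example code shows how to multiply two complex numbers in float16 and float32 format: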
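#include <arm_neon.h>  /* provides float16_t and float32_t on ARM targets */

/* float16_t is a scalar type, so a complex value needs its own struct. */
typedef struct { float16_t real, imag; } complex_f16_t;

/* float16 x float16: widen to float32, multiply, narrow the result back to float16. */
complex_f16_t complex_mul_float16(complex_f16_t a, complex_f16_t b) {
    float32_t ar = (float32_t)a.real, ai = (float32_t)a.imag;
    float32_t br = (float32_t)b.real, bi = (float32_t)b.imag;
    return (complex_f16_t){ (float16_t)(ar * br - ai * bi), (float16_t)(ar * bi + ai * br) };
}

/* float16 x float32: only the float16 operand needs widening. */
complex_f16_t complex_mul_float32(complex_f16_t a, float32_t b_real, float32_t b_imag) {
    float32_t ar = (float32_t)a.real, ai = (float32_t)a.imag;
    return (complex_f16_t){ (float16_t)(ar * b_real - ai * b_imag), (float16_t)(ar * b_imag + ai * b_real) };
}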
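On the Cortex-A53 (ARMv8.0-A), half precision is a storage and conversion format only: the core can convert between float16 and float32, but native float16 arithmetic requires the ARMv8.2-A FP16 extension, which this core does not implement. Casting the float16_t values to float32_t before the computation is therefore all that is needed to multiply the complex numbers, and the arithmetic itself runs in single precision. Narrowing the result back to float16_t rounds it to roughly three decimal digits, so keep the result in float32_t if that precision matters.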
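The widen-then-multiply idea can also be tried on a machine without ARM headers. The sketch below is only an illustration under stated assumptions: the names `cf16_bits`, `cf32`, `half_to_float`, and `cmul_f16_f32` are hypothetical, the float16 operand is held as raw IEEE 754 binary16 bit patterns, and a small software routine stands in for the hardware float16-to-float32 conversion. The result is kept in float32 rather than narrowed back.

```c
#include <math.h>    /* INFINITY and NAN macros */
#include <stdint.h>

/* Hypothetical types for illustration: a complex number held as raw
   IEEE 754 binary16 bit patterns, and one held as float32. */
typedef struct { uint16_t re, im; } cf16_bits;
typedef struct { float re, im; } cf32;

/* Widen one binary16 bit pattern to float. This is a software stand-in
   for the conversion the hardware would perform. */
static float half_to_float(uint16_t h) {
    unsigned exp = (h >> 10) & 0x1Fu;   /* 5-bit biased exponent */
    unsigned man = h & 0x3FFu;          /* 10-bit mantissa field */
    float mag;
    if (exp == 0) {
        /* Zero or subnormal: value is man * 2^-24. */
        mag = (float)man * (1.0f / 16777216.0f);
    } else if (exp == 31) {
        /* All-ones exponent encodes infinity or NaN. */
        mag = man ? NAN : INFINITY;
    } else {
        /* Normal: (1 + man/1024) * 2^(exp - 15). */
        mag = 1.0f + (float)man / 1024.0f;
        for (unsigned e = exp; e > 15; e--) mag *= 2.0f;
        for (unsigned e = exp; e < 15; e++) mag *= 0.5f;
    }
    return (h & 0x8000u) ? -mag : mag;
}

/* Multiply a float16 complex number by a float32 complex number.
   Only the float16 operand is widened; all arithmetic is float32. */
cf32 cmul_f16_f32(cf16_bits a, cf32 b) {
    float ar = half_to_float(a.re);
    float ai = half_to_float(a.im);
    cf32 r;
    r.re = ar * b.re - ai * b.im;
    r.im = ar * b.im + ai * b.re;
    return r;
}
```

For example, with a = 1 + 2i (bit patterns 0x3C00 and 0x4000) and b = 3 + 4i, `cmul_f16_f32` returns -5 + 10i. On an ARM target the conversion routine would normally be replaced by a plain cast from the compiler's half-precision type.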