The relationship between packed and scalar in the SSE instruction set



Reposted from: http://stackoverflow.com/questions/16218665/simd-and-difference-between-packed-and-scalar-double-precision


In SSE, a 128-bit register can be treated as four 32-bit single-precision elements.

SSE defines two types of operations: scalar and packed. A scalar operation works only on the least-significant data element (bits 0 to 31), while a packed operation computes all four elements in parallel.
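To make the difference concrete, here is a minimal sketch using the SSE addition intrinsics `_mm_add_ss` (scalar) and `_mm_add_ps` (packed). The function names `add_scalar` and `add_packed` are made up for this illustration and are not part of any library.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_add_ss, _mm_add_ps */

/* Scalar add: only element 0 becomes a[0] + b[0];
   elements 1..3 of the result are copied unchanged from a. */
void add_scalar(const float a[4], const float b[4], float out[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ss(va, vb));
}

/* Packed add: all four 32-bit elements are added in parallel. */
void add_packed(const float a[4], const float b[4], float out[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, `add_scalar` yields {11, 2, 3, 4}, whereas `add_packed` yields {11, 22, 33, 44}.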

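_mm_cmpeq_ss compares only the least-significant data element (the low 32 bits) of the two operands, while _mm_cmpeq_ps compares each group of 32 bits in parallel. The double-precision scalar form, _mm_cmpeq_sd, likewise compares only the least-significant element, which in that case is the low 64 bits.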
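As a rough sketch of single-precision comparison, the helpers below (hypothetical names, chosen for this example) run `_mm_cmpeq_ps` and `_mm_cmpeq_ss` and use `_mm_movemask_ps` to collapse the result into an integer. A lane that compares equal is set to all ones, so its sign bit shows up in the mask.

```c
#include <xmmintrin.h>  /* SSE: _mm_cmpeq_ps, _mm_cmpeq_ss, _mm_movemask_ps */

/* Packed compare: returns a 4-bit mask, bit i is set when a[i] == b[i]. */
int cmpeq_packed_mask(const float a[4], const float b[4])
{
    __m128 r = _mm_cmpeq_ps(_mm_loadu_ps(a), _mm_loadu_ps(b));
    return _mm_movemask_ps(r);
}

/* Scalar compare: only element 0 is compared. The upper three lanes of
   the result are copied from a, so they are masked off here. */
int cmpeq_scalar_low(const float a[4], const float b[4])
{
    __m128 r = _mm_cmpeq_ss(_mm_loadu_ps(a), _mm_loadu_ps(b));
    return _mm_movemask_ps(r) & 1;
}
```

For a = {1, 2, 3, 4} and b = {1, 9, 3, 8}, the packed version returns 5 (bits 0 and 2 set), and the scalar version returns 1 because only the first elements are examined.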

If you're using 64-bit doubles, you can pack them in pairs to fill the 128-bit register. That way, _mm_cmpeq_pd performs two comparisons (on four doubles, two per operand) in parallel.

If you want to make only one comparison at a time, you can use _mm_cmpeq_sd to compare two 64-bit doubles.
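For the double-precision case, a small sketch under the same assumptions: `_mm_cmpeq_pd` compares both 64-bit lanes of its operands, and `_mm_cmpeq_sd` compares only the low lane. Both are SSE2 intrinsics from `<emmintrin.h>`. The wrapper names `cmpeq_pd_mask` and `cmpeq_sd_low` are invented for this example.

```c
#include <emmintrin.h>  /* SSE2: __m128d, _mm_cmpeq_pd, _mm_cmpeq_sd */

/* Packed double compare: returns a 2-bit mask, bit i set when a[i] == b[i]. */
int cmpeq_pd_mask(const double a[2], const double b[2])
{
    __m128d r = _mm_cmpeq_pd(_mm_loadu_pd(a), _mm_loadu_pd(b));
    return _mm_movemask_pd(r);
}

/* Scalar double compare: a single comparison on the low 64-bit lane.
   _mm_set_sd places the value in the low lane and zeroes the upper lane. */
int cmpeq_sd_low(double a, double b)
{
    __m128d r = _mm_cmpeq_sd(_mm_set_sd(a), _mm_set_sd(b));
    return _mm_movemask_pd(r) & 1;
}
```

For a = {1.5, 2.5} and b = {1.5, 7.0}, `cmpeq_pd_mask` returns 1 (only lane 0 matches), while `cmpeq_sd_low(1.5, 1.5)` returns 1 and `cmpeq_sd_low(1.5, 2.5)` returns 0.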

Note that _mm_cmpeq_pd is SSE2 while _mm_cmpeq_ps is SSE.

