In C++, the standard way of accessing AltiVec support is mutually exclusive with the use of the Standard Template Library vector<> class template, because "vector" is treated as a reserved word when the compiler does not implement the context-sensitive keyword version of vector. However, it may be possible to combine them using compiler-specific workarounds; for instance, in GCC one may do #undef vector to remove the vector keyword, and then use the GCC-specific __vector keyword in its place.

AltiVec prior to Power ISA 2.06 with VSX lacks loading from memory using a type's natural alignment. For example, the code below requires special handling for Power6 and below when the effective address is not 16-byte aligned. The special handling adds 3 additional instructions to a load operation when VSX is not available.

    #include <altivec.h>

    typedef __vector unsigned char uint8x16_p;
    typedef __vector unsigned int  uint32x4_p;
    ...
    int main(int argc, char* argv[])
    {
        /* Natural alignment of vals is 4; and not 16 as required */
        unsigned int vals[4] = { 1, 2, 3, 4 };
        uint32x4_p vec;

    #if defined(__VSX__) || defined(_ARCH_PWR8)
        /* VSX: a single load that tolerates an unaligned address */
        vec = vec_xl(0, vals);
    #else
        /* No VSX: load the two aligned blocks that straddle vals,
           then permute the wanted 16 bytes into place */
        const uint8x16_p perm = vec_lvsl(0, vals);
        const uint8x16_p low  = vec_ld(0, (const unsigned char*)vals);
        const uint8x16_p high = vec_ld(15, (const unsigned char*)vals);
        vec = (uint32x4_p)vec_perm(low, high, perm);
    #endif
    }

AltiVec prior to Power ISA 2.07 lacks 64-bit integer arithmetic. Developers who wish to operate on 64-bit data must build routines from 32-bit components. For example, below are 64-bit add and subtract routines in C using a vector with four 32-bit words on a big-endian machine. The permutes move the carry and borrow bits from columns 1 and 3 to columns 0 and 2, as in schoolbook arithmetic. A little-endian machine would need a different mask.

    #include <altivec.h>

    typedef __vector unsigned char uint8x16_p;
    typedef __vector unsigned int  uint32x4_p;
    ...
    /* Performs a+b as if the vector held two 64-bit double words */
    uint32x4_p add64(const uint32x4_p a, const uint32x4_p b)
    {
        /* Move words 1 and 3 into words 0 and 2; index 16 selects a zero byte */
        const uint8x16_p cmask = {4,5,6,7, 16,16,16,16, 12,13,14,15, 16,16,16,16};
        const uint32x4_p zero = {0, 0, 0, 0};

        uint32x4_p cy = vec_addc(a, b);      /* carry out of each 32-bit add */
        cy = vec_perm(cy, zero, cmask);      /* carry from low word to high word */
        return vec_add(vec_add(a, b), cy);
    }

    /* Performs a-b as if the vector held two 64-bit double words */
    uint32x4_p sub64(const uint32x4_p a, const uint32x4_p b)
    {
        const uint8x16_p bmask = {4,5,6,7, 16,16,16,16, 12,13,14,15, 16,16,16,16};
        const uint32x4_p amask = {1, 1, 1, 1};
        const uint32x4_p zero = {0, 0, 0, 0};

        uint32x4_p bw = vec_subc(a, b);      /* 1 where no borrow occurred */
        bw = vec_andc(amask, bw);            /* invert: 1 where a borrow occurred */
        bw = vec_perm(bw, zero, bmask);      /* borrow from low word to high word */
        return vec_sub(vec_sub(a, b), bw);
    }

Power ISA 2.07, used in Power8, finally provided native 64-bit double word integer operations. A developer targeting Power8 needs only the following.

    #include <altivec.h>

    typedef __vector unsigned long long uint64x2_p;
    ...
    /* Performs a+b using native vector 64-bit double words */
    uint64x2_p add64(const uint64x2_p a, const uint64x2_p b)
    {
        return vec_add(a, b);
    }

    /* Performs a-b using native vector 64-bit double words */
    uint64x2_p sub64(const uint64x2_p a, const uint64x2_p b)
    {
        return vec_sub(a, b);
    }

== Implementations ==