MXU

MXU is the name for the XBurst SIMD instructions. SIMD means Single Instruction Multiple Data and is often used to speed up audio/video processing. Examples of SIMD instruction sets for other CPUs are MMX, SSE and AltiVec.

= Instruction Naming =

The initial letter indicates the number of elements in the vector(s) operated upon: S(ingle) for 1, D(ual) for 2, Q(uad) for 4. The letter is followed by a number, which denotes the length of the input elements in bits. The number is followed by the name of the operation that will be performed.
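
The naming scheme is regular enough to be decoded mechanically. The following is a minimal sketch in C, not part of any Ingenic toolchain; the type `mxu_shape` and the function `mxu_parse_mnemonic` are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper type: the vector shape implied by an MXU mnemonic. */
struct mxu_shape {
    int elements;      /* 1, 2 or 4, from the leading S, D or Q */
    int element_bits;  /* the number that follows the letter */
    const char *op;    /* points at the operation name inside the mnemonic */
};

/* Splits a mnemonic such as "Q16ADD" into its three parts.
 * Returns 0 on success, -1 if the mnemonic does not follow the scheme. */
static int mxu_parse_mnemonic(const char *name, struct mxu_shape *out)
{
    assert(name != NULL && out != NULL);

    switch (name[0]) {
    case 'S': out->elements = 1; break;
    case 'D': out->elements = 2; break;
    case 'Q': out->elements = 4; break;
    default:  return -1;
    }

    if (!isdigit((unsigned char)name[1]))
        return -1;

    char *end;
    out->element_bits = (int)strtol(name + 1, &end, 10);
    out->op = end;  /* first character after the number */
    return 0;
}
```

For example, parsing "Q16ADD" with this helper yields 4 elements of 16 bits each and the operation name "ADD".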

= Register Naming =

There is a dedicated register set for the MXU operations. It contains 17 32-bit registers, which will be referred to as xr0..xr16. Registers xr0..xr15 are used in computations; xr16 is a control register. MXU register xr0 always has value 0; writes to it have no effect.

The main MIPS registers will be referred to as .. .

= Enabling MXU =

Before the MXU can be used, it must be enabled. This is done by setting bit 0 (the lowest bit) of xr16 to 1.

= Load and Store Instructions =

S32I2M
Assigns the value of main register  to MXU register.

S32M2I
Assigns the value of MXU register  to main register.

S32LDD
Loads the contents of the memory at  (pointer + offset) into MXU register.

S32LDDV
Loads the contents of the memory at  (pointer + shifted offset) into MXU register.

S32LDI
Loads the contents of the memory at  (pointer + offset) into MXU register. After that,  is incremented by.

S32LDIV
Loads the contents of the memory at  (pointer + shifted offset) into MXU register. After that,  is incremented by.
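
The difference between the plain loads (S32LDD) and the loads with increment (S32LDI) can be modelled in plain C. This is only a sketch of the behaviour described above, not MXU code: the function names are hypothetical, and it assumes that the pointer is incremented by the same offset that was used for the access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* S32LDD-style load: read 32 bits at (base + offset); base is unchanged. */
static uint32_t model_s32ldd(const uint8_t *base, ptrdiff_t offset)
{
    assert(base != NULL);
    uint32_t value;
    memcpy(&value, base + offset, sizeof value);  /* host byte order */
    return value;
}

/* S32LDI-style load: the same read, after which the pointer is advanced.
 * Assumption: the increment equals the offset used for the access. */
static uint32_t model_s32ldi(const uint8_t **base, ptrdiff_t offset)
{
    assert(base != NULL);
    uint32_t value = model_s32ldd(*base, offset);
    *base += offset;
    return value;
}
```

Calling `model_s32ldi` repeatedly with an offset of 4 walks through an array of 32-bit words, which is the typical use of the incrementing variants in a loop.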

S32STD
Stores the contents of MXU register  into the memory at   (pointer + offset).

S32STDV
Stores the contents of MXU register  into the memory at   (pointer + shifted offset).

S32SDI
Stores the contents of MXU register  into the memory at   (pointer + offset). After that,  is incremented by.

S32SDIV
Stores the contents of MXU register  into the memory at   (pointer + shifted offset). After that,  is incremented by.

= Addition and Subtraction Instructions =

D32ADD, Q16ADD
Performs addition and/or subtraction on vectors  and   and writes the results to vectors   and.

Whether the values are added or subtracted is controlled by, as shown in the following table:

When the vector elements are 16-bit, it is possible to swizzle the values read from vector  as follows:

The values read from vector  are always used as-is.

D32ACC, Q16ACC
Performs addition and/or subtraction on vectors  and   and adds the results to vectors   and.

Whether the values are added or subtracted is controlled by, as shown in the following table:

When the vector elements are 16-bit, it is possible to swizzle the values read from vector  as follows:

The values read from vector  are always used as-is.

Q8ADD
Adds or subtracts the four 8-bit values in the vectors  and. The four 8-bit results are stored in the vector.

Whether the values are added or subtracted is controlled by, as shown in the following table:

Q8ADDE
Adds or subtracts the four 8-bit unsigned values in the vectors  and. The four 16-bit results are stored in the vectors  and.

Whether the values are added or subtracted is controlled by, as shown in the following table:

Q8ACCE
Adds or subtracts the four 8-bit unsigned values in the vectors  and. The four 16-bit results are added to the vectors  and.

Whether the values are added or subtracted is controlled by, as shown in the following table:

D16AVG, Q8AVG
Computes the average, rounded down, of the unsigned values in vectors  and   and assigns the result to vector.

D16AVGR, Q8AVGR
Computes the average, rounded up, of the unsigned values in vectors  and   and assigns the result to vector.
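
The two rounding modes differ only in whether 1 is added before halving. A minimal C model of the 8-bit variants, with the hypothetical name `model_q8avg`, could look like this; the intermediate sum is kept wider than 8 bits so that it cannot overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Averages the four unsigned bytes of a and b lane by lane.
 * round_up == 0 models Q8AVG, round_up == 1 models Q8AVGR. */
static uint32_t model_q8avg(uint32_t a, uint32_t b, int round_up)
{
    assert(round_up == 0 || round_up == 1);

    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t x = (a >> (8 * lane)) & 0xFFu;
        uint32_t y = (b >> (8 * lane)) & 0xFFu;
        uint32_t avg = (x + y + (uint32_t)round_up) >> 1;  /* at most 255 */
        result |= avg << (8 * lane);
    }
    return result;
}
```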

Q8SAD
Computes the absolute difference of the unsigned values in vectors  and. The sum of these 4 differences is assigned to the full register  and added to the full register.
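
A sum of absolute differences is the core operation of block matching in video encoders. The sketch below models the description above in plain C; `model_q8sad` is a hypothetical name, the return value stands for the register that receives the sum, and `accumulator` stands for the register to which the sum is added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sums |a[i] - b[i]| over the four unsigned bytes of a and b.
 * The sum is returned and also added to *accumulator. */
static uint32_t model_q8sad(uint32_t a, uint32_t b, uint32_t *accumulator)
{
    assert(accumulator != NULL);

    uint32_t sum = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t x = (a >> (8 * lane)) & 0xFFu;
        uint32_t y = (b >> (8 * lane)) & 0xFFu;
        sum += (x > y) ? (x - y) : (y - x);
    }
    *accumulator += sum;
    return sum;
}
```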

= Multiply Instructions =

D16MUL, Q8MUL
Multiplies the signed values in vector  by the signed values in vector   and assigns the results to vectors   and.

When the vector elements are 16-bit, it is possible to swizzle the values read from vector  as follows:

The values read from vector  are always used as-is.

D16MAC, Q8MAC
Multiplies the signed values in vector  by the signed values in vector   and adds or subtracts the results to vectors   and.

Whether the values are added or subtracted is controlled by, as shown in the following table:

When the vector elements are 16-bit, it is possible to swizzle the values read from vector  as follows:

The values read from vector  are always used as-is.

D16MADL, Q8MADL
Multiplies the signed values in vector  by the signed values in vector. The results of the multiplication are added or subtracted from the values in vector  and that final result is written to vector.

Whether the values are added or subtracted is controlled by, as shown in the following table:

When the vector elements are 16-bit, it is possible to swizzle the values read from vector  as follows:

The values read from vector  are always used as-is.

D16MULF
Multiplies the signed values in vector  by the signed values in vector. The highest 16 bits of the results of the multiplication are written to vector. Note that the result of multiplying two 16-bit signed numbers is a 31-bit signed number (bit 30 being the sign bit), so vector  will contain bits 30..15 of the two multiplication results, not bits 31..16.
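
Taking bits 30..15 of the product is the usual way to multiply two fixed-point fractions in Q15 format. The following C sketch models one lane exactly as described above and nothing more: `model_mulf_lane` is a hypothetical name, and any rounding or saturation that the hardware may apply is not modelled. The one input pair whose result does not fit in 16 bits is excluded by an assertion.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of a D16MULF-style multiply: bits 30..15 of the 32-bit product.
 * Relies on >> being an arithmetic shift for negative values, which is
 * implementation-defined in C but holds on common compilers. */
static int16_t model_mulf_lane(int16_t a, int16_t b)
{
    /* -32768 * -32768 gives 2^30, whose bits 30..15 do not fit in int16_t. */
    assert(!(a == INT16_MIN && b == INT16_MIN));

    int32_t product = (int32_t)a * (int32_t)b;
    return (int16_t)(product >> 15);
}
```

Read as Q15 fractions, 16384 is 0.5, so multiplying 16384 by 16384 gives 8192, which is 0.25.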

It is possible to swizzle the values read from vector  as follows:

The values read from vector  are always used as-is.

D16MACF
Multiplies the signed values in vector  by the signed values in vector. These results are doubled to make two 32-bit signed numbers. Those numbers are then added to or subtracted from vector  and. The upper 16 bits of those numbers, rounded up, are written to vector.

Whether the values are added or subtracted is controlled by, as shown in the following table:

It is possible to swizzle the values read from vector  as follows:

The values read from vector  are always used as-is.

S16MAD
Multiplies a 16-bit signed value from vector  with a 16-bit signed value from vector. The result is added to or subtracted from  and the final result is written to.

Whether the multiplication result is added or subtracted is controlled by, as shown in the following table:

Which parts of  and   are used is controlled by , as shown in the following table:

= Other Math =

S32MAX, D16MAX, Q8MAX
Takes the maximum of the signed values of vector  and vector   and assigns those to vector.

S32MIN, D16MIN, Q8MIN
Takes the minimum of the signed values of vector  and vector   and assigns those to vector.

Q16SAT
Saturate: The values in  and   are taken as four 16-bit signed integers and clamped to the range [0..255]. The result is written to, with from high to low: upper half of  , lower half of  , upper half of  , lower half of.
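
Saturating to [0..255] is what is needed when converting intermediate 16-bit results back to 8-bit pixels. Below is a minimal C model; the names `first` and `second` for the two input registers are placeholders, and the assumption is that the halves of `first` end up in the two highest bytes of the result.

```c
#include <assert.h>
#include <stdint.h>

/* Interprets the low 16 bits of x as a signed value (portable sign extension). */
static int32_t sign_extend16(uint32_t x)
{
    return (int32_t)((x & 0xFFFFu) ^ 0x8000u) - 0x8000;
}

/* Clamps a signed value to the range [0..255]. */
static uint32_t clamp_to_byte(int32_t v)
{
    int32_t clamped = v < 0 ? 0 : (v > 255 ? 255 : v);
    assert(clamped >= 0 && clamped <= 255);
    return (uint32_t)clamped;
}

/* Q16SAT-style pack: four signed 16-bit inputs become four unsigned bytes.
 * Assumed order, high to low: first upper, first lower, second upper, second lower. */
static uint32_t model_q16sat(uint32_t first, uint32_t second)
{
    return (clamp_to_byte(sign_extend16(first >> 16)) << 24) |
           (clamp_to_byte(sign_extend16(first)) << 16) |
           (clamp_to_byte(sign_extend16(second >> 16)) << 8) |
            clamp_to_byte(sign_extend16(second));
}
```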

S32CPS, D16CPS
Copy Sign: For each signed value in vector : If it is non-negative, assign the corresponding value from vector , unmodified, to vector. Otherwise, assign the corresponding value from vector, negated, to vector.
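
In C terms, the 32-bit variant behaves like a conditional negation. The sketch below uses the hypothetical names `value` for the operand that is copied and `sign_source` for the operand whose sign is tested; the most negative value is excluded because negating it overflows in C, and how the hardware treats that case is not stated here.

```c
#include <assert.h>
#include <stdint.h>

/* S32CPS-style copy sign: returns value if sign_source is non-negative,
 * otherwise returns value negated. */
static int32_t model_s32cps(int32_t value, int32_t sign_source)
{
    assert(value != INT32_MIN);  /* -INT32_MIN is not representable */
    return (sign_source >= 0) ? value : -value;
}
```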

Q8ABD
Absolute difference: Computes the absolute value of the difference of the unsigned values in vector  and vector   and assigns the result to vector.

Q8SLT
Set on Less Than: Compares the signed values in vector  and vector. If the value from  is less than the value from , 1 is assigned to the corresponding position in vector  , otherwise 0 is assigned.

This is a vectorized version of the MIPS slt instruction.

= Shift and Shuffle Instructions =

D32SLL
Shift Logical Left: The value of  is shifted   bits to the left and the result is assigned to. Also, the value of  is shifted   bits to the left and the result is assigned to. is a constant in the range [0..31].

D32SLLV
Shift Logical Left: The value of  is shifted   bits to the left and the result is assigned to. Also, the value of  is shifted   bits to the left and the result is assigned to. is [0..31]: the value of the lowest 5 bits of main MIPS register.

D32SLR
Shift Logical Right: The unsigned value of  is shifted   bits to the right and the result is assigned to. Also, the unsigned value of  is shifted   bits to the right and the result is assigned to. is a constant in the range [0..31].

D32SLRV
Shift Logical Right: The unsigned value of  is shifted   bits to the right and the result is assigned to. Also, the unsigned value of  is shifted   bits to the right and the result is assigned to. is [0..31]: the value of the lowest 5 bits of main MIPS register.

D32SAR
Shift Arithmetic Right: The signed value of  is shifted   bits to the right and the result is assigned to. Also, the signed value of  is shifted   bits to the right and the result is assigned to. is a constant in the range [0..31].

D32SARV
Shift Arithmetic Right: The signed value of  is shifted   bits to the right and the result is assigned to. Also, the signed value of  is shifted   bits to the right and the result is assigned to. is [0..31]: the value of the lowest 5 bits of main MIPS register.
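
The three 32-bit shift flavours differ only in what is shifted in on the right shifts: zeros for logical shifts, copies of the sign bit for arithmetic shifts. The C sketch below models a single register; the function names are hypothetical. The arithmetic shift is built from a logical shift so that it does not depend on how the compiler shifts negative values.

```c
#include <assert.h>
#include <stdint.h>

/* Shift amount as taken by the ...V variants: lowest 5 bits of a register. */
static unsigned shift_from_register(uint32_t reg)
{
    return reg & 31u;
}

static uint32_t model_sll32(uint32_t v, unsigned s)
{
    assert(s <= 31);
    return v << s;
}

static uint32_t model_slr32(uint32_t v, unsigned s)
{
    assert(s <= 31);
    return v >> s;  /* zeros enter from the left */
}

static uint32_t model_sar32(uint32_t v, unsigned s)
{
    assert(s <= 31);
    uint32_t shifted = v >> s;
    if ((v & 0x80000000u) && s > 0)
        shifted |= ~(0xFFFFFFFFu >> s);  /* fill vacated bits with the sign */
    return shifted;
}
```

For the input 0x80000000 and a shift of 4, the logical right shift gives 0x08000000 while the arithmetic right shift gives 0xF8000000.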

D32SARL
Shift Arithmetic Right: The signed value of  is shifted   bits to the right and the lower 16 bits of the result are assigned to the higher 16 bits of. Also, the signed value of  is shifted   bits to the right and the lower 16 bits of the result are assigned to the lower 16 bits of. S is a constant in the range [0..31].

D32SARW
Shift Arithmetic Right: The signed value of  is shifted   bits to the right and the lower 16 bits of the result are assigned to the higher 16 bits of. Also, the signed value of  is shifted   bits to the right and the lower 16 bits of the result are assigned to the lower 16 bits of. is [0..31]: the value of the lowest 5 bits of main MIPS register.

Q16SLL
Shift Logical Left: The values of the upper and lower halves of  are shifted   bits to the left and the result is assigned to. Also, the values of the upper and lower halves of  are shifted   bits to the left and the result is assigned to. is a constant in the range [0..15].

Q16SLLV
Shift Logical Left: The values of the upper and lower halves of  are shifted   bits to the left and the result is assigned to. Also, the values of the upper and lower halves of  are shifted   bits to the left and the result is assigned to. is [0..15]: the value of the lowest 4 bits of main MIPS register.

Q16SLR
Shift Logical Right: The unsigned values of the upper and lower halves of  are shifted   bits to the right and the result is assigned to. Also, the unsigned values of the upper and lower halves of  are shifted   bits to the right and the result is assigned to. is a constant in the range [0..15].

Q16SLRV
Shift Logical Right: The unsigned values of the upper and lower halves of  are shifted   bits to the right and the result is assigned to. Also, the unsigned values of the upper and lower halves of  are shifted   bits to the right and the result is assigned to. is [0..15]: the value of the lowest 4 bits of main MIPS register.

Q16SAR
Shift Arithmetic Right: The signed values of the upper and lower halves of  are shifted   bits to the right and the result is assigned to. Also, the signed values of the upper and lower halves of  are shifted   bits to the right and the result is assigned to. is a constant in the range [0..15].

Q16SARV
Shift Arithmetic Right: The signed values of the upper and lower halves of  are shifted   bits to the right and the result is assigned to. Also, the signed values of the upper and lower halves of  are shifted   bits to the right and the result is assigned to. is [0..15]: the value of the lowest 4 bits of main MIPS register.
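
The 16-bit shifts treat each half of a register as an independent value, so no bits move from one half into the other. A minimal C model of the arithmetic right shift on one register, under the hypothetical name `model_q16sar`, is:

```c
#include <assert.h>
#include <stdint.h>

/* Shifts the upper and lower 16-bit halves of v right by s bits,
 * filling each half with copies of its own sign bit. */
static uint32_t model_q16sar(uint32_t v, unsigned s)
{
    assert(s <= 15);

    uint32_t out = 0;
    for (int lane = 0; lane < 2; lane++) {
        uint32_t half = (v >> (16 * lane)) & 0xFFFFu;
        uint32_t shifted = half >> s;
        if ((half & 0x8000u) && s > 0)
            shifted |= ~(0xFFFFu >> s) & 0xFFFFu;  /* sign fill within the half */
        out |= shifted << (16 * lane);
    }
    return out;
}
```

With the input 0x80000010 and a shift of 4, the upper half 0x8000 becomes 0xF800 and the lower half 0x0010 becomes 0x0001, giving 0xF8000001.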

S32ALN
Takes the value of, shifts it   bytes (0..4) to the left and assigns the highest 32 bits of the result to. Can be used to realign values that are not aligned in memory.
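
Since the highest 32 bits of the shifted result are kept, the input must be wider than one register. The sketch below assumes that it is the 64-bit concatenation of two MXU registers, here given the placeholder names `high` and `low`; the function name `model_s32aln` is likewise hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* S32ALN-style realignment: concatenate high:low into 64 bits, shift left
 * by the given number of bytes and keep the upper 32 bits.
 * Assumption: the source value is the pair of registers high:low. */
static uint32_t model_s32aln(uint32_t high, uint32_t low, unsigned bytes)
{
    assert(bytes <= 4);

    uint64_t wide = ((uint64_t)high << 32) | low;
    return (uint32_t)((wide << (8 * bytes)) >> 32);
}
```

With `high` = 0x11223344 and `low` = 0x55667788, a shift of 1 byte yields 0x22334455. This is the word that starts one byte into the first of two adjacent aligned words.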

S32SFL
Shuffles (swizzles) the bytes of  and   as indicated in the table below and writes the result into   and.

= New instructions in JZ4770 =

The JZ4770 has quite a few additional MXU instructions. Ingenic writes 3 or 7 to register xr16 to activate these. This may imply that there are two levels of extension between JZ4740 and JZ4770.

Load and store instructions

Other math

Addition and subtraction instructions

Multiply instructions

Bitwise instructions