Chapter 7 Code Optimization
© National Instruments Corporation 7-17 AutoCode Reference
constant block is optimized away, including the output variable. Also notice
that the existing Constant Propagation optimization can be used with the
constant block, but will only operate on scalar pieces of the constant block
output.

Optimizing with Callout Blocks
The MatrixInverse, MatrixRightDivide, and MatrixLeftDivide blocks are
implemented with callouts and therefore carry a special set of rules that
must be followed to generate optimal code. The most important rule,
applicable to all three blocks, is that the block output should be labeled as
a single matrix or vector (or that code be generated using maximal vectorization).
This is because the algorithms associated with these blocks are stand-alone
entities with fixed interfaces; a single array is used for the block output for
the MatrixInverse, MatrixRightDivide, and MatrixLeftDivide algorithms.
When the output of a callout block in the generated code is spread among
several different variables, a copy-back must be emitted after the callout to
ensure the results are correctly stored in the desired output variables.
Following this rule can have a significant impact on the code generated for
these three blocks. The callout interface also places certain constraints on
the inputs if optimal code is desired.
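
The output rule can be pictured with a small C sketch. This is not actual AutoCode output: callout_stub is a hypothetical stand-in for a callout with a fixed, single-array output interface (it only doubles its input), and all variable names are made up for illustration. The first function corresponds to an output labeled as one vector; the second shows the copy-back that becomes necessary when the output is spread over separate variables.

```c
#include <stddef.h>

/* Hypothetical stand-in for a callout with a fixed interface: it writes
   its n results into one contiguous array. It only doubles the input so
   that the data flow is easy to follow. */
static void callout_stub(const double *in, double *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = 2.0 * in[i];
    }
}

/* Output labeled as a single vector: the callout writes straight into
   the output variable, so no extra copy is needed. */
void output_as_single_vector(const double in[4], double y[4])
{
    callout_stub(in, y, 4);
}

/* Output spread over four separate variables: the callout still needs
   one array, so a temporary is filled and then copied back. */
void output_as_scalars(const double in[4],
                       double *y_1, double *y_2, double *y_3, double *y_4)
{
    double tmp[4];

    callout_stub(in, tmp, 4);

    /* copy-back from the callout array to the individual outputs */
    *y_1 = tmp[0];
    *y_2 = tmp[1];
    *y_3 = tmp[2];
    *y_4 = tmp[3];
}
```

Both functions compute the same values; the second pays for a temporary array and four extra assignments.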

Optimizing with Inverse Blocks
The MatrixInverse block callout has a single argument that is both the
input to and the output of the inversion algorithm. Thus, the input is modified by
the algorithm. For this reason, a copy-in must always be done for this block
and the input connectivity is not nearly as important as the output
connectivity. You may see tighter code with good input connectivity,
because the copy-in will be looped rather than unrolled, but the copy-in will
still be present.
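
The difference between a looped and an unrolled copy-in can be sketched in C as follows. The sketch is illustrative only: invert2x2_in_place is a hypothetical stand-in for an in-place inversion callout, fixed at 2x2 with no singularity check, and the function and variable names are assumptions rather than names AutoCode emits.

```c
/* Hypothetical stand-in for an in-place inversion callout: the single
   argument holds the matrix on entry and its inverse on return.
   Fixed at 2x2, row-major, with no singularity check. */
static void invert2x2_in_place(double a[4])
{
    double det = a[0] * a[3] - a[1] * a[2];
    double a00 = a[0];

    a[0] =  a[3] / det;
    a[1] = -a[1] / det;
    a[2] = -a[2] / det;
    a[3] =  a00 / det;
}

/* Input arrives as one matrix variable: the copy-in can be a loop. */
void inverse_from_matrix(const double u[4], double y[4])
{
    for (int i = 0; i < 4; ++i) {
        y[i] = u[i];              /* copy-in, looped */
    }
    invert2x2_in_place(y);        /* overwrites y, leaves u intact */
}

/* Input arrives as separate scalar variables: the copy-in is unrolled. */
void inverse_from_scalars(double u_1, double u_2, double u_3, double u_4,
                          double y[4])
{
    y[0] = u_1;                   /* copy-in, unrolled */
    y[1] = u_2;
    y[2] = u_3;
    y[3] = u_4;
    invert2x2_in_place(y);
}
```

In both cases the copy-in is present, because the callout overwrites its argument; only its form changes with the input connectivity.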

Optimizing with Division Blocks
The two division blocks, MatrixRightDivide and MatrixLeftDivide, solve
the equations XA = B and AX = B, respectively. When you consider which
to use for your application, note that you will get more efficient code by
using MatrixRightDivide rather than MatrixLeftDivide. This is
because AutoCode generates output matrices in row-major order, whereas
the LINPACK callouts are written expecting column-major inputs.
Solving AX = B under such mismatched conditions requires an extra
transpose-copy not required to solve XA = B.
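
The layout argument can be checked with a small C sketch. A row-major buffer read as column-major is the transpose of the matrix, and transposing XA = B gives A'X' = B' (where ' denotes transpose), so a column-major solver for AX = B can work on the row-major buffers of XA = B directly. The code below is illustrative only: solve_colmajor_2x2 is a hypothetical 2x2 stand-in for a column-major solver, not a LINPACK routine, and the transpose copies in left_divide show one plausible arrangement rather than the exact copies AutoCode emits.

```c
/* Hypothetical stand-in for a column-major solver: solves A*X = B for
   2x2 matrices. All buffers are column-major, so element (i,j) is at
   index i + 2*j. No singularity check. */
static void solve_colmajor_2x2(const double a[4], const double b[4],
                               double x[4])
{
    double det = a[0] * a[3] - a[2] * a[1];

    for (int j = 0; j < 2; ++j) {
        double b0 = b[0 + 2 * j];
        double b1 = b[1 + 2 * j];
        x[0 + 2 * j] = ( a[3] * b0 - a[2] * b1) / det;
        x[1 + 2 * j] = (-a[1] * b0 + a[0] * b1) / det;
    }
}

/* Copy a 2x2 matrix while swapping its storage order. */
static void transpose2x2(const double in[4], double out[4])
{
    out[0] = in[0];
    out[1] = in[2];
    out[2] = in[1];
    out[3] = in[3];
}

/* X*A = B with row-major buffers: the solver sees A' and B', solves
   A'*Y = B', and its column-major Y = X' is exactly X in row-major
   order. No transpose copy is needed. */
void right_divide(const double a_rm[4], const double b_rm[4],
                  double x_rm[4])
{
    solve_colmajor_2x2(a_rm, b_rm, x_rm);
}

/* A*X = B with row-major buffers: the buffers must be converted to
   column-major before the solve, and the result converted back. */
void left_divide(const double a_rm[4], const double b_rm[4],
                 double x_rm[4])
{
    double a_cm[4], b_cm[4], x_cm[4];

    transpose2x2(a_rm, a_cm);
    transpose2x2(b_rm, b_cm);
    solve_colmajor_2x2(a_cm, b_cm, x_cm);
    transpose2x2(x_cm, x_rm);
}
```

With A = [1 2; 3 4] and X = [2 1; 0 1], both functions recover X from the corresponding B, but only left_divide needs the extra copies.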