www.ti.com

6.11 Submitting DMA Transfer Requests

The ACPY2 interface specification strikes a delicate trade-off between enabling high performance and requiring error checking at run time. Optimized algorithms require high-speed transfer mechanisms and invariably use aligned addresses and 32-bit or 16-bit element sizes as their fundamental type of data transfer. At the other end of the spectrum are algorithms that need a DMA library to transfer the required number of bytes from any source address to any destination address, through an interface no more complicated than the memcpy() function in the C standard library.

The ACPY2 interface provides algorithm developers with two functions for submitting DMA transfer requests: ACPY2_start() and ACPY2_startAligned(). The only operational difference between them is that ACPY2_startAligned() additionally requires its source and destination addresses to be properly aligned with respect to the configured element size. In 32-bit transfer mode, these addresses must be at least 32-bit aligned; for 16-bit transfers, 16-bit alignment is required. When called with properly aligned addresses, both functions behave identically. However, on architectures such as C6000, which permit DMA transfers using 8-bit or 16-bit alignment of source or destination addresses irrespective of the actual transfer element size, the ACPY2_startAligned() function can be optimized to operate more efficiently. On the other hand, certain architectures, such as C55x, may impose device-dependent DMA rules that require stricter alignment of the source and destination addresses for all transfers, and may therefore provide the same implementation for both APIs.
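The alignment requirement above can be expressed as a small check. The following is a minimal host-side sketch, not part of the ACPY2 interface: the names is_aligned(), aligned_contract_ok(), and check_aligned_contract() are hypothetical, and the code assumes a byte-addressed machine on which uintptr_t is available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an ACPY2 function: returns 1 when addr is a
 * multiple of elem_bytes (1, 2, or 4 for 8-, 16-, or 32-bit elements). */
static int is_aligned(const void *addr, size_t elem_bytes)
{
    return ((uintptr_t)addr % elem_bytes) == 0;
}

/* Hypothetical helper: returns 1 when both addresses satisfy the alignment
 * that the text requires for the configured element size. */
static int aligned_contract_ok(const void *src, const void *dst,
                               size_t elem_bytes)
{
    return is_aligned(src, elem_bytes) && is_aligned(dst, elem_bytes);
}

/* Hypothetical debug-only guard: aborts on a misaligned address. Defining
 * NDEBUG compiles the assert away, so release builds pay no run-time cost. */
static void check_aligned_contract(const void *src, const void *dst,
                                   size_t elem_bytes)
{
    assert(aligned_contract_ok(src, dst, elem_bytes));
    (void)src; (void)dst; (void)elem_bytes;  /* silence NDEBUG warnings */
}
```

An algorithm could call such a guard in debug builds just before each aligned transfer request, catching contract violations during development without adding checks to the production code path.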

ACPY2_start() makes no assumptions about the alignment of the source and destination addresses. It accepts addresses at any alignment and, when allowed by the architecture, adjusts the transfer parameters (including element size, number of elements, and transfer type) to transparently perform the desired transfer using the given alignment. It is intended to simplify algorithm development in the initial stages; ACPY2_start() thus favors simplicity while maintaining reasonable levels of performance. The ACPY2_startAligned() API, on the other hand, performs no run-time alignment checks and carries out the transfer using the configured transfer settings of the channel. Passing source or destination addresses that are incorrectly aligned with respect to the configured element size of the DMA handle results in unspecified behavior. In this respect, the sole aim of ACPY2_startAligned() is to guarantee performance by replacing run-time checks with a pre-negotiated contract with the algorithm developer.
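To make the idea of adjusting element size and element count concrete, the sketch below shows one way a general-purpose start routine could choose transfer parameters from arbitrary addresses. It is an illustration only and does not describe how any ACPY2 implementation works internally: the TransferPlan type and plan_transfer() function are hypothetical, the transfer length is assumed to be given in bytes, and a byte-addressed machine is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: the element size and count actually used. */
typedef struct {
    size_t elem_bytes;  /* 4, 2, or 1 */
    size_t num_elems;   /* number of elements of that size */
} TransferPlan;

/* Hypothetical planner: picks the widest element size (4, 2, or 1 bytes)
 * for which the source address, the destination address, and the byte
 * count are all multiples of that size. */
static TransferPlan plan_transfer(const void *src, const void *dst,
                                  size_t num_bytes)
{
    /* OR-ing the three values folds their low bits together, so a single
     * mask test covers all three alignment constraints. */
    uintptr_t bits = (uintptr_t)src | (uintptr_t)dst | (uintptr_t)num_bytes;
    TransferPlan plan;

    if ((bits & 3u) == 0) {
        plan.elem_bytes = 4;
    } else if ((bits & 1u) == 0) {
        plan.elem_bytes = 2;
    } else {
        plan.elem_bytes = 1;
    }
    plan.num_elems = num_bytes / plan.elem_bytes;

    /* The plan must cover exactly the requested number of bytes. */
    assert(plan.num_elems * plan.elem_bytes == num_bytes);
    return plan;
}
```

The selection logic is exactly the kind of per-request work that an aligned-only entry point can skip, which is the trade-off the two functions embody.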

6.12 Device Independent DMA Optimization Guideline

In this section, we outline a general guideline, applicable to all architectures, that may yield significant performance gains. The basic premise is that configuring a logical channel is an expensive operation in terms of cycles, even when compared to the standard ACPY2 scheduling and synchronization APIs. Therein lies the motivation for the following new guideline:

DMA Guideline 2

All algorithms should minimize channel (re)configuration overhead by requesting a dedicated logical DMA channel for each distinct type of DMA transfer they issue. They should also avoid calling ACPY2_configure() and use the new fast configuration APIs where possible.

DMA Guideline 2 is useful when different types of DMA transfers are needed in a critical loop of an algorithm. If a separate IDMA2 logical channel is defined for each transfer type, ACPY2_configure() can be called once on each channel at the beginning of the algorithm code. Transfer requests can then be submitted rapidly on these preconfigured channels inside the critical loop using the new ACPY2_start() or ACPY2_startAligned() function.
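The pattern described by DMA Guideline 2 can be sketched with a small host-side mock. Everything below is hypothetical and merely stands in for the real interface: Channel and ChannelParams play the role of an IDMA2 logical channel and its settings, channel_configure() stands in for ACPY2_configure(), and channel_start() stands in for a transfer request, implemented here as a synchronous memcpy() rather than an asynchronous DMA transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for a logical channel and its settings. */
typedef struct { size_t elem_bytes; } ChannelParams;
typedef struct {
    ChannelParams params;
    unsigned configure_calls;  /* counts the (expensive) configurations */
} Channel;

/* Stand-in for the expensive channel configuration step. */
static void channel_configure(Channel *ch, const ChannelParams *p)
{
    ch->params = *p;
    ch->configure_calls++;
}

/* Stand-in for a transfer request on a preconfigured channel. */
static void channel_start(Channel *ch, const void *src, void *dst,
                          size_t num_elems)
{
    assert(ch->params.elem_bytes != 0);  /* channel must be configured */
    memcpy(dst, src, num_elems * ch->params.elem_bytes);
}

/* One dedicated channel per transfer type: each is configured once,
 * before the loop, and the loop only submits transfer requests. */
static void run_transfers(Channel *ch32, Channel *ch16,
                          const uint32_t *src32, uint32_t *dst32,
                          const uint16_t *src16, uint16_t *dst16,
                          size_t blocks, size_t elems)
{
    const ChannelParams p32 = { sizeof(uint32_t) };
    const ChannelParams p16 = { sizeof(uint16_t) };
    size_t i;

    channel_configure(ch32, &p32);
    channel_configure(ch16, &p16);

    for (i = 0; i < blocks; i++) {
        channel_start(ch32, src32 + i * elems, dst32 + i * elems, elems);
        channel_start(ch16, src16 + i * elems, dst16 + i * elems, elems);
    }
}
```

With a single shared channel, the same loop would need two reconfigurations per iteration; with dedicated channels, the configuration count stays at one per channel regardless of the number of iterations.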

In the next two sections, we present additional DMA rules and guidelines specific to C5000 or C6000 architectures.

SPRU352G –June 2005 –Revised February 2007
