Deep.Net


CudaExecUnit

Namespace: SymTensor.Compiler.Cuda

Nested types and modules

Type | Description
BlasArgOperation

The operation that blasArg will perform.

Functions and values

Function or value | Description
batchReduceLastAxis (...)
Signature: memAllocator:MemAllocatorT -> reduceFn:(ArrayNDManikinT -> ArrayNDManikinT -> CudaExecItemT list) -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list

execution items to reduce src over the last axis into trgt

blasArg (...)
Signature: memAllocator:(TypeNameT -> int64 -> MemAllocKindT -> MemManikinT) -> manikin:ArrayNDManikinT -> shared:bool -> willOverwrite:bool -> ArrayNDManikinT * BlasTransposeOpT * CudaExecItemT list * bool

BLAS input argument passing, so that orientation is preserved. Can return copy items if deemed necessary.

blasArgOperation (...)
Signature: manikin:ArrayNDManikinT -> shared:bool -> willOverwrite:bool -> BlasArgOperation

Returns the operation that blasArg will perform.
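The decision behind blasArgOperation can be illustrated with a minimal sketch. It is not the library's implementation: the type `ArgOp`, its cases, and the function `argOperation` are hypothetical names, the layout is given as plain shape and stride lists instead of an ArrayNDManikinT, and the roles of `shared` and `willOverwrite` are guessed from their names. The sketch relies on the fact that BLAS expects column-major storage, so a row-major matrix appears transposed to BLAS.

```fsharp
// Hypothetical stand-in for BlasArgOperation; case names are assumptions.
type ArgOp =
    | PassAsIs        // memory is already column-major, as BLAS expects
    | PassTransposed  // memory is row-major, so BLAS sees the transpose
    | CopyFirst       // layout is unusable or unsafe; copy into fresh memory

// Decides how the last two dimensions of an array could be handed to BLAS.
// shape and stride are given in elements, outermost dimension first.
let argOperation (shape: int64 list) (stride: int64 list)
                 (shared: bool) (willOverwrite: bool) : ArgOp =
    match List.rev shape, List.rev stride with
    | cols :: rows :: _, sCol :: sRow :: _ when List.length shape = List.length stride ->
        // Assumption: storage that is shared and will be overwritten must be copied.
        if shared && willOverwrite then CopyFirst
        // Column-major: unit stride along rows, leading dimension >= number of rows.
        elif sRow = 1L && sCol >= max 1L rows then PassAsIs
        // Row-major: unit stride along columns, leading dimension >= number of columns.
        elif sCol = 1L && sRow >= max 1L cols then PassTransposed
        else CopyFirst
    | _ -> invalidArg "shape" "shape and stride must have equal rank of at least 2"
```

For a 3 x 4 matrix, strides `[4L; 1L]` (row-major) give `PassTransposed`, strides `[1L; 3L]` (column-major) give `PassAsIs`, and strides `[8L; 2L]` give `CopyFirst`.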

blasTarget manikin
Signature: manikin:ArrayNDManikinT -> ArrayNDManikinT

BLAS target argument passing, so that orientation is preserved

copyExecItems trgt src
Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list

generates ExecItems to copy src to trgt

copyKeepingBroadcasted (...)
Signature: memAllocator:(TypeNameT -> int64 -> MemAllocKindT -> MemManikinT) -> broadcastAllowed:bool list -> src:ArrayNDManikinT -> ArrayNDManikinT * CudaExecItemT list

Generates ExecItems to copy src into newly allocated memory in C-order. Broadcasted dimensions of src for which broadcastAllowed is true are kept broadcasted.
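The layout that such a copy would produce can be sketched on plain lists. This is an illustration only, under two assumptions: a broadcasted dimension is represented by a stride of zero, and the helper name `cOrderKeepingBroadcast` is hypothetical. It computes the strides of the new C-order array and the number of elements that would have to be allocated.

```fsharp
// Computes C-order strides for a copy of an array with the given shape.
// Dimensions that are broadcasted in the source (stride 0) and for which
// broadcastAllowed is true keep stride 0 and take up no extra memory.
// Returns the new strides and the number of elements to allocate.
let cOrderKeepingBroadcast (shape: int64 list) (srcStride: int64 list)
                           (broadcastAllowed: bool list) : int64 list * int64 =
    // List.zip3 raises an exception if the three lists differ in length.
    let dims = List.zip3 shape srcStride broadcastAllowed
    // Walk from the last (fastest-varying) dimension to the first.
    let folder (size, srcStr, allowed) (strides, running) =
        if srcStr = 0L && allowed then (0L :: strides, running)
        else (running :: strides, running * size)
    List.foldBack folder dims ([], 1L)
```

With shape `[2L; 3L; 4L]` and source strides `[0L; 4L; 1L]`, allowing broadcasting in every dimension gives strides `[0L; 4L; 1L]` and 12 elements; disallowing it in the first dimension gives strides `[12L; 4L; 1L]` and 24 elements.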

cppTemplateInstantiation tmpl args
Signature: tmpl:string -> args:string list -> string

returns the C++ template instantiation code for the given template and argument list
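A minimal sketch of what such a function might look like, matching the documented signature. The name `templateInstantiation` is hypothetical, and returning the bare template name for an empty argument list is an assumption, not documented behavior.

```fsharp
// Builds C++ template instantiation code such as "copy<float, 2>".
// Assumption: with no arguments the plain name is returned unchanged.
let templateInstantiation (tmpl: string) (args: string list) : string =
    match args with
    | [] -> tmpl
    | _ -> sprintf "%s<%s>" tmpl (String.concat ", " args)
```

For example, `templateInstantiation "copy" ["float"; "2"]` evaluates to `"copy<float, 2>"`.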

dynamicSubtensorTmplAndIdx (...)
Signature: bas:ArrayNDManikinT -> rngs:UExprRngsSpecT -> rngManikins:ArrayNDManikinT list -> ArrayNDArgTmpl * ArrayNDSDArgTmpl * CPPArrayTmpl<IntPtr>

elementsFuncnameAndArgs (...)
Signature: trgt:ArrayNDManikinT -> cOp:ICudaArgTmpl -> srcViews:'?179146 list -> workSize:int64 list -> string * ICudaArgTmpl list * ICudaArgTmpl list
Type parameters: '?179146

function name of elements wrapper and its arguments for the given target, operation and sources

elemwiseFuncnameAndArgs (...)
Signature: trgt:ArrayNDManikinT -> cOp:'?179134 -> srcViews:'?179135 list -> string * ICudaArgTmpl list
Type parameters: '?179134, '?179135

function name of element-wise wrapper and its arguments for the given target, operation and sources

execItemsForCFunc tmplTmpls argTmpls
Signature: tmplTmpls:ICudaArgTmpl list -> argTmpls:ICudaArgTmpl list -> CudaExecItemT list
Type parameters: 'FuncDelegate

generate ExecItems to call a C++ template function

execItemsForCopyFromDynamicSubtensor (...)
Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> rngs:UExprRngsSpecT -> rngManikins:ArrayNDManikinT list -> CudaExecItemT list

execItemsForCopyToDynamicSubtensor (...)
Signature: trgt:ArrayNDManikinT -> rngs:UExprRngsSpecT -> rngManikins:ArrayNDManikinT list -> src:ArrayNDManikinT -> CudaExecItemT list

execItemsForElements (...)
Signature: compileEnv:CudaCompileEnvT -> trgt:ArrayNDManikinT -> elemFunc:UElemFuncT -> srcViews:ArrayNDManikinT list -> CudaExecItemT list

execution items for an elements operation

execItemsForElemwise trgt cOp srcViews
Signature: trgt:ArrayNDManikinT -> cOp:'?179137 -> srcViews:'?179138 list -> CudaExecItemT list
Type parameters: '?179137, '?179138

execution items for an element-wise operation

execItemsForGather trgt src idxViews
Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> idxViews:'?179140 option list -> CudaExecItemT list
Type parameters: '?179140

execution items for a gather operation

execItemsForIdxReduceAxis (...)
Signature: memAllocator:'?179161 -> ax:int -> eOpName:string -> initial:ConstSpecT -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list
Type parameters: '?179161

reduce one axis by applying an operation on indices such as argMax, argMin, ...

execItemsForKernel (...)
Signature: cppFuncName:string -> tmplTmpls:ICudaArgTmpl list -> argTmpls:ICudaArgTmpl list -> (int64 * int64 * int64) -> CudaExecItemT list

execution item to launch the given kernel template function

execItemsForOp compileEnv arg2
Signature: compileEnv:CudaCompileEnvT -> ExecItemsForOpArgs -> CudaExecItemT list

returns the execution units for the specified op

execItemsForReduce (...)
Signature: memAllocator:MemAllocatorT -> eOpName:string -> initial:ConstSpecT -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list

execution items to reduce all elements of src into the scalar trgt

execItemsForReduceAxis (...)
Signature: memAllocator:MemAllocatorT -> ax:int -> eOpName:string -> initial:ConstSpecT -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list

reduce one axis by applying an operation such as sum, max, min, ...
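What reducing one axis with an operation and an initial value means can be shown with a host-side sketch on a two-dimensional array. It only illustrates the semantics; it generates no CUDA execution items, and the function `reduceAxis` and its use of `float[,]` are assumptions made for this example.

```fsharp
// Reduces the given axis of a 2D array by folding `op` over it,
// starting each fold from `initial` (0.0 for sum, -infinity for max, ...).
let reduceAxis (ax: int) (op: float -> float -> float) (initial: float)
               (src: float[,]) : float[] =
    let rows, cols = Array2D.length1 src, Array2D.length2 src
    match ax with
    | 0 ->
        // Axis 0 disappears: one result per column.
        Array.init cols (fun c ->
            Seq.fold (fun acc r -> op acc (src.[r, c])) initial (seq { 0 .. rows - 1 }))
    | 1 ->
        // Axis 1 disappears: one result per row.
        Array.init rows (fun r ->
            Seq.fold (fun acc c -> op acc (src.[r, c])) initial (seq { 0 .. cols - 1 }))
    | _ -> invalidArg "ax" "axis must be 0 or 1 for a two-dimensional array"
```

For the array `array2D [[1.0; 2.0; 3.0]; [4.0; 5.0; 6.0]]`, summing over axis 1 with initial value `0.0` gives `[|6.0; 15.0|]`, and taking the maximum over axis 0 with initial value negative infinity gives `[|4.0; 5.0; 6.0|]`.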

execItemsForReduction (...)
Signature: trgt:ArrayNDManikinT -> indexed:bool -> cOp:ICudaArgTmpl -> cInitialOp:ICudaArgTmpl -> src:ArrayNDManikinT -> CudaExecItemT list

execution items for a reduction operation

execItemsForScatter trgt src idxViews
Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> idxViews:'?179142 option list -> CudaExecItemT list
Type parameters: '?179142

execution items for a scatter operation

exprToCudaExecUnits compileEnv
Signature: compileEnv:CudaCompileEnvT -> UExprT -> ExecUnitsForExprT

generates CUDA execution units that will evaluate the given unified expression

needExtra op
Signature: op:'?179112 -> '?179113
Type parameters: '?179112, '?179113

failure for extra ops

reductionFuncnameAndArgs (...)
Signature: trgt:ArrayNDManikinT -> indexed:bool -> cOp:ICudaArgTmpl -> cInitialOp:ICudaArgTmpl -> src:ArrayNDManikinT -> string * ICudaArgTmpl list

function name of reduction wrapper and its arguments for the given target, operation, initial value and source

srcReqs cudaEnv arg2
Signature: cudaEnv:CudaCompileEnvT -> SrcReqsArgs -> ChannelReqsT list

Computes desired source views given desired target view. There is no guarantee that the desired source views will be used.

toCudaUOp uop
Signature: uop:obj -> ICudaUOp

converts an IUOp or an IOp to an ICudaUOp

toIExecItem items
Signature: items:'?179169 list -> IExecItem list
Type parameters: '?179169

tracePostItemsForExpr compileEnv arg2
Signature: compileEnv:'?179167 -> TraceItemsForExprArgs -> CudaExecItemT list
Type parameters: '?179167

returns the execution units for tracing the result after execution of the op items

tracePreItemsForExpr compileEnv arg2
Signature: compileEnv:'?179165 -> TraceItemsForExprArgs -> CudaExecItemT list
Type parameters: '?179165

returns the execution units for tracing before execution of the op items

trgtGivenSrcs compileEnv arg2
Signature: compileEnv:CudaCompileEnvT -> TrgtGivenSrcsArgs -> ChannelManikinsAndSharedT

computes the definitive target view of an op given its source views

trimUnitaryBatchedBlasDims manikin
Signature: manikin:ArrayNDManikinT -> ArrayNDManikinT

If all batch dimensions (all dimensions but the last two) of the array are of size one, a view of the last two dimensions is returned. Otherwise the original array is returned.
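The rule above can be sketched on plain shape and stride lists. The helper `trimUnitaryBatchDims` is hypothetical and stands in for an operation on an ArrayNDManikinT, which is not modeled here.

```fsharp
// If every batch dimension (all but the last two) has size one, returns the
// shape and stride of the last two dimensions only; otherwise returns the
// input unchanged. Arrays of rank two or less are returned unchanged.
let trimUnitaryBatchDims (shape: int64 list) (stride: int64 list)
                         : int64 list * int64 list =
    if List.length shape <> List.length stride then
        invalidArg "stride" "shape and stride must have the same rank"
    let nBatch = List.length shape - 2
    if nBatch > 0 && shape |> List.take nBatch |> List.forall (fun s -> s = 1L) then
        List.skip nBatch shape, List.skip nBatch stride
    else
        shape, stride
```

A shape of `[1L; 1L; 3L; 4L]` is trimmed to `[3L; 4L]`, while `[2L; 1L; 3L; 4L]` is left as it is because its first batch dimension has size two.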

unsupLoc dev
Signature: dev:ITensorDevice -> '?179115
Type parameters: '?179115

workDimForElemwise trgt hetero
Signature: trgt:ArrayNDManikinT -> hetero:bool -> int64 * int64 * int64

returns the CUDA work dimensions (x, y, z) for an element-wise or elements operation

workDimForWorkSize workSize hetero
Signature: workSize:int64 list -> hetero:bool -> int64 * int64 * int64

returns the CUDA work dimensions (x, y, z) for work of given size
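One plausible mapping from a work size to CUDA work dimensions is sketched below. The mapping is an assumption, not taken from the library: the last axis is assigned to x (the fastest-varying CUDA dimension), the second-to-last to y, and all remaining axes are multiplied together into z. The meaning of `hetero` is not documented on this page; the sketch assumes it requests a flat, one-dimensional launch. The name `workDim` is hypothetical.

```fsharp
// Product of all entries; 1 for the empty list.
let product (xs: int64 list) : int64 = List.fold ( * ) 1L xs

// Maps a work size (outermost axis first) to (x, y, z) work dimensions.
let workDim (workSize: int64 list) (hetero: bool) : int64 * int64 * int64 =
    if hetero then
        // Assumption: heterogeneous work is launched as one flat dimension.
        (product workSize, 1L, 1L)
    else
        match List.rev workSize with
        | [] -> (1L, 1L, 1L)
        | [x] -> (x, 1L, 1L)
        | [x; y] -> (x, y, 1L)
        | x :: y :: rest -> (x, y, product rest)   // fold leading axes into z
```

Under these assumptions, `workDim [2L; 3L; 4L; 5L] false` evaluates to `(5L, 4L, 6L)` and `workDim [2L; 3L; 4L] true` to `(24L, 1L, 1L)`.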
