Deep.Net


CudaElemExpr

Namespace: SymTensor.Compiler.Cuda

Nested types and modules

Type    Description
CodeT
VarNameT

Functions and values

Function or value    Description
bestPosOrder trgt srcs uElemFunc
Signature: trgt:ArrayNDManikinT -> srcs:ArrayNDManikinT list -> uElemFunc:UElemFuncT -> int list

Returns the position ordering that maximizes coalesced memory access.
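
One way such an ordering could be derived is sketched below. This is an illustration only, not the library's implementation: `orderByStrideOne` is a hypothetical helper, its input is assumed to be a per-dimension count of stride-one accesses, and the first entry of the result is assumed to be the dimension that should become the X work dimension.

```fsharp
/// Hypothetical helper (not part of SymTensor): orders dimension indices so
/// that dimensions with more stride-one accesses come first.
/// `stats` holds one count per dimension, indexed by dimension.
let orderByStrideOne (stats: int64 list) : int list =
    stats
    |> List.indexed                 // pair each count with its dimension index
    |> List.sortByDescending snd    // highest stride-one count first
    |> List.map fst                 // keep only the dimension indices

// Dimension 1 has two stride-one accesses, dimension 0 has one.
let order = orderByStrideOne [1L; 2L]   // [1; 0]
```

Sorting by the access count is the simplest policy that favours coalescing; the actual function may weigh reads and writes differently or take the element function into account.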

generateFunctor name arg2 posOrder
Signature: name:string -> UElemFuncT -> posOrder:int list -> string

Generates the code of a functor that evaluates the UElemFuncT and returns it as a string.

generateSizeSpecCode sizeSymVars ss
Signature: sizeSymVars:Map<SizeSymbolT,VarNameT> -> ss:SizeSpecT -> string

strideStats trgt srcs arg3
Signature: trgt:ArrayNDManikinT -> srcs:ArrayNDManikinT list -> UElemFuncT -> int64 list

Returns a list, indexed by dimension, giving the number of stride-one reads/writes that would occur if that dimension were the X work dimension of the CUDA kernel computing the element expression.
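
The idea behind such a statistic can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: `Layout` is a hypothetical stand-in for ArrayNDManikinT that keeps only the strides, all arrays are assumed to have the same rank as the target, and the element function is ignored, so every source counts as one read per element.

```fsharp
/// Hypothetical stand-in for ArrayNDManikinT: only the per-dimension
/// strides (in elements) are kept.
type Layout = { Stride: int64 list }

/// For each dimension, counts how many of the arrays (target plus sources)
/// have stride one in that dimension. Threads adjacent along the X work
/// dimension touch adjacent memory exactly for those arrays.
let strideOneCounts (trgt: Layout) (srcs: Layout list) : int64 list =
    let arrays = trgt :: srcs
    List.init (List.length trgt.Stride) (fun dim ->
        arrays
        |> List.filter (fun a -> List.item dim a.Stride = 1L)
        |> List.length
        |> int64)

// A row-major 3x4 target, one transposed source and one row-major source.
let counts =
    strideOneCounts
        { Stride = [4L; 1L] }
        [ { Stride = [1L; 3L] }; { Stride = [4L; 1L] } ]
// counts = [1L; 2L]: dimension 1 gives more stride-one accesses.
```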
