GPU Radix Sort written in Apple Metal

I was hoping one would be announced at WWDC but no such luck.

I need a GPU radix sort written in Metal for integer key-value pairs. I am currently using thrust::stable_sort_by_key on NVIDIA GPUs but want to get the same thing working in Metal.

One option would be to look at CUDPP: [login to view URL]

But I know there are faster options:

[login to view URL]

Merrill, D. and Grimshaw, A. High Performance and Scalable Radix Sorting: A Case Study of Implementing Dynamic Parallelism for GPU Computing. Parallel Processing Letters 21 (2011).

Ideally, I would like this implemented as a single compute kernel function.

// thrust::stable_sort_by_key(thrust::device_ptr<uint>(dGridParticleHash),
//                            thrust::device_ptr<uint>(dGridParticleHash + numParticles),
//                            thrust::device_ptr<int>(dGridParticleIndex));
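
As a reference for the expected behavior, here is a minimal CPU sketch (not the Metal kernel itself) of a stable least-significant-digit radix sort on 32-bit unsigned keys that reorders a value array alongside them, which is the behavior the thrust::stable_sort_by_key call provides. The function name radixSortByKey and the 8-bit digit width are assumptions chosen for illustration. They are not part of Thrust, CUDPP, or Metal.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (illustration only): stable LSD radix sort of 32-bit
// unsigned keys, permuting `values` alongside `keys`.
// Assumes keys.size() == values.size().
void radixSortByKey(std::vector<std::uint32_t>& keys, std::vector<int>& values) {
    constexpr unsigned kBits = 8;                  // assumed digit width
    constexpr std::size_t kBuckets = 1u << kBits;  // 256 buckets per pass
    const std::size_t n = keys.size();

    std::vector<std::uint32_t> keysOut(n);
    std::vector<int> valuesOut(n);

    for (unsigned shift = 0; shift < 32; shift += kBits) {
        // Step 1: histogram of the current digit.
        std::array<std::size_t, kBuckets> count{};
        for (std::size_t i = 0; i < n; ++i) {
            ++count[(keys[i] >> shift) & (kBuckets - 1)];
        }

        // Step 2: exclusive prefix sum turns counts into start offsets.
        std::size_t sum = 0;
        for (std::size_t b = 0; b < kBuckets; ++b) {
            const std::size_t c = count[b];
            count[b] = sum;
            sum += c;
        }

        // Step 3: scatter in input order, which keeps equal keys stable.
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t digit = (keys[i] >> shift) & (kBuckets - 1);
            const std::size_t dst = count[digit]++;
            keysOut[dst] = keys[i];
            valuesOut[dst] = values[i];
        }

        // The output of this pass becomes the input of the next one.
        keys.swap(keysOut);
        values.swap(valuesOut);
    }
}
```

A CPU version like this can be used to check the output of a Metal implementation on the same input. How the histogram, prefix-sum, and scatter steps map onto threadgroups, and whether they fit in a single kernel, is left open here.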

Skills: C++ Programming, CUDA, GPGPU, Swift


About the Employer:
( 136 reviews ) Bonita Springs, United States

Project ID: #17139435