Kokkos notes

Kokkos is a C++ library.

Hierarchy:
device, host-parallel, host-serial

Synchronization:
Kokkos::fence()

Two things to decide:

memory space: device / host
memory layout: LayoutLeft / LayoutRight
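
To make the layout choice concrete, here is a minimal sketch in plain standard C++ (no Kokkos calls) of how a rank-2 index (i, j) could map to a linear offset under each layout. The function names are hypothetical, and the sketch assumes an unpadded array: LayoutLeft is treated as column-major (left index contiguous) and LayoutRight as row-major (right index contiguous).

```cpp
#include <cassert>
#include <cstddef>

// Offset of element (i, j) in an unpadded n0 x n1 array stored column-major.
// This models LayoutLeft: the leftmost index i is the contiguous one.
inline std::size_t offset_layout_left(std::size_t i, std::size_t j,
                                      std::size_t n0, std::size_t n1) {
  assert(i < n0 && j < n1);
  (void)n1;  // only needed for the bounds check
  return i + j * n0;
}

// Offset of element (i, j) in an unpadded n0 x n1 array stored row-major.
// This models LayoutRight: the rightmost index j is the contiguous one.
inline std::size_t offset_layout_right(std::size_t i, std::size_t j,
                                       std::size_t n0, std::size_t n1) {
  assert(i < n0 && j < n1);
  (void)n0;  // only needed for the bounds check
  return i * n1 + j;
}
```

For a 4 x 3 array, element (1, 2) lands at offset 1 + 2*4 = 9 under the column-major mapping and at 1*3 + 2 = 5 under the row-major one. Which index is contiguous is what decides whether neighboring threads touch neighboring memory.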

DualView: manages a Kokkos::View in device memory together with its Kokkos::View mirror in host memory, so the same data is maintained in two different memory spaces.

Template arguments: DataType, Layout, Device
using view_type = Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device>;
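
The bookkeeping behind "two copies kept consistent" can be sketched with a toy model in plain standard C++. This is not Kokkos's implementation: ToyDualView is a hypothetical type, the "device" copy is just a second host vector, and the member names only mirror the modify/sync naming that DualView uses. Refusing to sync when both sides are marked modified is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the DualView idea: two copies plus "modified" flags.
struct ToyDualView {
  std::vector<double> h_data;  // host copy
  std::vector<double> d_data;  // stand-in for the device copy
  bool host_modified = false;
  bool device_modified = false;

  explicit ToyDualView(std::size_t n) : h_data(n, 0.0), d_data(n, 0.0) {}

  // The user marks which side was written to.
  void modify_host() { host_modified = true; }
  void modify_device() { device_modified = true; }

  // Copy host -> "device" only if the host side was marked modified.
  void sync_device() {
    assert(!device_modified);  // both sides dirty: a copy would lose data
    if (host_modified) {
      d_data = h_data;
      host_modified = false;
    }
  }

  // Copy "device" -> host only if the device side was marked modified.
  void sync_host() {
    assert(!host_modified);  // both sides dirty: a copy would lose data
    if (device_modified) {
      h_data = d_data;
      device_modified = false;
    }
  }
};
```

The point of the flags is that a sync is a no-op unless the other side was actually marked as modified, so forgetting the modify call means the update never propagates.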

Atomic operations

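scatter-add: two particles can share a neighbor, and a race may occur when both update that neighbor at the same time.
The race can be resolved either with data replication or with atomic operations.
ScatterView: transparently chooses at compile time how the scatter-add is handled, using data replication on CPUs and atomic operations on GPUs.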

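`Kokkos::Experimental::contribute(View &dest, Kokkos::Experimental::ScatterView const &src)` writes the reduction result held by the ScatterView back into dest. It is called after the parallel kernel that performed the scatter has finished, possibly after `Kokkos::parallel_reduce()`.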
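
The two strategies can be compared with a small sketch in plain standard C++ (no Kokkos calls). All names here are hypothetical. Each entry of `work` is the list of contributions one worker would scatter; the workers run one after another so the sketch stays deterministic, but each strategy is written so that the workers could run as real threads. The final summation in the replicated version plays the role that contribute() plays for a ScatterView.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct Contribution {
  std::size_t target;  // index of the neighbor being updated
  double value;        // amount added to that neighbor
};
using Work = std::vector<std::vector<Contribution>>;  // one list per worker

// Strategy 1, data replication: each worker scatters into a private copy,
// then a final pass sums all copies into the destination.
std::vector<double> scatter_add_replicated(const Work& work, std::size_t n) {
  std::vector<std::vector<double>> copies(work.size(),
                                          std::vector<double>(n, 0.0));
  for (std::size_t w = 0; w < work.size(); ++w)
    for (const Contribution& c : work[w]) copies[w][c.target] += c.value;

  std::vector<double> dest(n, 0.0);
  for (const std::vector<double>& copy : copies)
    for (std::size_t k = 0; k < n; ++k) dest[k] += copy[k];
  return dest;
}

// Atomic add on a double, implemented with a compare-and-swap loop.
inline void atomic_add(std::atomic<double>& target, double value) {
  double old = target.load();
  while (!target.compare_exchange_weak(old, old + value)) {
    // on failure, old now holds the current value; retry with it
  }
}

// Strategy 2, atomic operations: all workers add into one shared array.
std::vector<double> scatter_add_atomic(const Work& work, std::size_t n) {
  std::vector<std::atomic<double>> shared(n);
  for (std::atomic<double>& s : shared) s.store(0.0);
  for (const std::vector<Contribution>& list : work)
    for (const Contribution& c : list) atomic_add(shared[c.target], c.value);

  std::vector<double> dest(n);
  for (std::size_t k = 0; k < n; ++k) dest[k] = shared[k].load();
  return dest;
}
```

Replication trades memory (one copy per worker) for contention-free updates, which tends to suit a modest number of CPU threads; atomics keep a single array, which tends to suit the very large thread counts on GPUs.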

Inspecting Kokkos properties

Checking the layout:

  if (std::is_same<typename decltype(fpair->f)::traits::array_layout, Kokkos::LayoutLeft>::value) {
    printf("array fpair->f is LayoutLeft\n");
  }
  if (std::is_same<typename decltype(fpair->f)::traits::array_layout, Kokkos::LayoutRight>::value) {
    printf("array fpair->f is LayoutRight\n");
  }

Getting the strides:

  int strides[3];  // stride() fills rank + 1 entries; fpair->x is assumed to be rank 2
  (fpair->x).stride(strides);
  printf("array fpair->x stride : (%d, %d)\n", strides[0], strides[1]);

Access traits

RandomAccess: Kokkos::MemoryTraits<Kokkos::RandomAccess>. When a kernel runs in the Cuda execution space and reads a const View that lives in CudaSpace or CudaUVMSpace, Kokkos accesses it through texture fetches.

Unmanaged View: Kokkos::MemoryTraits<Kokkos::Unmanaged>. The View wraps a raw pointer, and Kokkos performs neither reference counting nor deallocation for it.
