MPI && OMP notes

Author: leenldk

Date: 2024-06-11

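CPU topology:
threads per core : hardware threads per physical core (hyper-threading)
cores per socket : physical cores in each socket
sockets : number of CPU sockets on the node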

MPI

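Each MPI process has an affinity mask with one bit per logical CPU.
--bind-to core : only the bits of the process's assigned core are set in the affinity mask
--bind-to socket : the bits of every core in the process's assigned socket are set
--bind-to none : the process is not bound to any particular CPU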

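--map-by node : place ranks round-robin across nodes
--map-by socket : place ranks round-robin across sockets
--map-by node:PE=8 : PE is the number of physical cores assigned to each process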

Multiple nodes

hostfile :

i1 slots=2 max-slots=8
i2 slots=2 max-slots=8

`which mpirun` -np 2 --host i1:1,i2:1 hostname
`which mpirun` -np 4 --hostfile ./hostfile hostname

When running a script, its first line must be #!/bin/bash

OMP

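OMP_DISPLAY_ENV=true : print the OpenMP settings, including the binding-related ones, at program start
OMP_PLACES=threads, OMP_PLACES=cores, OMP_PLACES=sockets : granularity of the places threads are bound to (hardware thread, core, or socket)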

Hyper-thread CPU layout: cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list

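#define _GNU_SOURCE  // needed for sched_getcpu() on glibc
#include <mpi.h>
#include <omp.h>
#include <sched.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    #pragma omp parallel
    {
        int id = omp_get_thread_num();            // index of this thread in the team
        int num_threads = omp_get_num_threads();  // size of the current team
        int cpuid = sched_getcpu();               // logical CPU this thread is running on
        printf("hello from cpu: %d thread: %d out of %d threads @ rank = %d\n", cpuid, id, num_threads, rank);
    }
    MPI_Finalize();
    return 0;
}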
