Profiler Tools

Author: leenldk

Date: 2021-01-15

vtune

Intel's profiler. Load its environment before use:

source /home/leenldk/intel/oneapi/vtune/2021.2.0/env/vars.sh  # load the VTune environment

gprof

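The open-source GNU profiler, used together with gcc.

Compile with the -pg option to add instrumentation.
Run the program; on normal exit it writes gmon.out to the current directory.
Run gprof on the executable to produce the profiling report.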

gcc example.c -o temp -g -pg
./temp
gprof temp > profiling.out
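
The commands above assume an example.c that is not shown here. As a minimal sketch, the script below writes a hypothetical example.c, builds it with -pg, runs it, and prints the top of the gprof report. The function names busy_work and light_work and their loop counts are made up for illustration.

```shell
# Write a hypothetical example.c: one expensive function, one cheap one.
cat > example.c <<'EOF'
#include <stdio.h>

/* Deliberately expensive: many floating-point divisions. */
static double busy_work(long n) {
    double acc = 0.0;
    for (long i = 1; i <= n; i++)
        acc += 1.0 / (double)i;
    return acc;
}

/* Cheap by comparison: far fewer iterations. */
static double light_work(long n) {
    double acc = 0.0;
    for (long i = 1; i <= n; i++)
        acc += (double)i;
    return acc;
}

int main(void) {
    printf("busy:  %f\n", busy_work(100000000L));
    printf("light: %f\n", light_work(1000000L));
    return 0;
}
EOF

gcc example.c -o temp -g -pg          # instrument the build
./temp                                # writes gmon.out on exit
gprof temp gmon.out > profiling.out   # flat profile plus call graph
head -n 15 profiling.out              # peek at the flat profile
```

With the default optimization level, busy_work should account for most of the time in the flat profile, since it runs far more iterations than light_work.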

nvprof

Update: nvprof no longer supports the newest GPUs; use nsys and ncu instead.

A tool bundled with the CUDA Toolkit.
Usage:

nvprof ./gemm # print the profiling results
# when the program uses unified memory, you may need to add --unified-memory-profiling off
nvprof --unified-memory-profiling off ./gemm

-o prof.nvvp : write the results to an nvvp file
--metrics [all/gld_throughput] : collect all metrics / only Global Load Throughput (may require sudo)

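Visualization: open the output file in nvvp over X11 forwarding: nvvp prof.nvvp
With CUDA 11, nvvp may fail because of the Java version; in that case install JDK 8 and point nvvp at it:
sudo apt install openjdk-8-jdk
nvvp -vm /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java prof.nvvp

Windows:
.\nvvp.exe -vm 'D:\Program Files\Java\jdk1.8.0_311\jre\bin\java.exe'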

nsys (Nsight Systems)

Coarse-grained, timeline-based profiling of the whole application.

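ncu (Nsight Compute)

Fine-grained profiling at the level of individual kernels.
ncu --list-sets lists the available metric section sets.

--set full : collect the full section set
-o file : write the report to a file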
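
The two options above are typically combined in one invocation. The helper below is a minimal sketch of that, not a fixed recipe: ncu_full, the DRY_RUN switch, the kernel name gemm_kernel and the report name are all hypothetical. It additionally uses -k (kernel name filter) and -c (launch count) to restrict collection to a single launch of one kernel, which keeps a --set full run short.

```shell
# Hypothetical helper: collect the full section set for one kernel launch.
#   $1 = application, $2 = kernel name to filter on, $3 = report name
# Set DRY_RUN=1 to print the command instead of running it.
ncu_full() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo ncu --set full -k "$2" -c 1 -o "$3" "$1"
    else
        ncu --set full -k "$2" -c 1 -o "$3" "$1"
    fi
}

# Show the command that would run (names are placeholders).
DRY_RUN=1 ncu_full ./gemm gemm_kernel gemm_report
```

Dropping -k and -c profiles every kernel launch in the program, which can be very slow with the full set.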

MPI profiling

For single-node profiling, nsys can be placed before mpirun:
nsys profile [nsys args] mpirun [mpirun args] ...

For multi-node profiling, nsys must be placed after mpirun, so that mpirun starts it alongside each rank:
mpirun [mpirun args] nsys profile [nsys args] ...
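
When nsys is placed after mpirun, each rank starts its own nsys instance, so each one needs a distinct output file. The sketch below prints one possible launch line for each case. It assumes Open MPI, which sets OMPI_COMM_WORLD_RANK for every rank, and relies on nsys expanding %q{VAR} in the -o name; nsys_mpi_cmd, the report names and ./app are placeholders.

```shell
# Hypothetical helper: print an nsys + mpirun launch line.
#   $1 = mode (single|multi), $2 = number of ranks, $3 = application
nsys_mpi_cmd() {
    case "$1" in
        single)
            # One nsys instance wraps the whole mpirun job.
            printf '%s\n' "nsys profile -o report mpirun -np $2 $3" ;;
        multi)
            # One nsys instance per rank; %q{...} is expanded by nsys itself.
            printf '%s\n' "mpirun -np $2 nsys profile -o report_rank%q{OMPI_COMM_WORLD_RANK} $3" ;;
        *)
            echo "usage: nsys_mpi_cmd single|multi NP APP" >&2
            return 1 ;;
    esac
}

nsys_mpi_cmd single 4 ./app
nsys_mpi_cmd multi 4 ./app
```

Other MPI implementations export the rank under a different variable name, so the %q{...} part would need adjusting there.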
