#ROCm

Replied in thread

@eugenialoli Same with photo #raw processing: none of the #foss apps use the cameras' color profiles, there is no 16-bit raw, and there are many other issues. I gave up retouching because of the still-bad GIMP UX and the poor implementation of non-destructive editing.

The sad reality is the #linuxdesktop is not ready for professional #mediaproduction and this is such a bad thing in times like this.

#davinciresolve also barely runs on anything except nvidia on #linux and still has no #flatpak

Even blender is a pain with amd #rocm

Any #Linux #kernel, #graphics or #GPU people out there?

I'm trying to understand the relationship between the #amdgpu driver shipped with the kernel and the "amdgpu-dkms" driver that comes with #ROCm.

Specifically, with a recent enough kernel, do we really need to install the ROCm version of the driver? Does the ROCm version contain stuff the general driver does not? Or is the ROCm stack (esp. libhsa) tightly tied to a very specific version of the driver?

Chat, is it OK to install rocm?
I want the GPU-accelerated AV1 encoder for OBS. I used to have it, then I installed the AMD drivers from their website instead of using the ones that come with Ubuntu. After that it disappeared, so I uninstalled those drivers and went back to the kernel ones, but the encoder is still missing.
#ubuntu #apt #amd #rocm

sudo apt install rocm
[sudo] password for baa: 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  amd-smi-lib amdgpu-core comgr composablekernel-dev g++-13-multilib
  g++-multilib gcc-11-base gcc-13-multilib gcc-multilib half hip-dev hip-doc
  hip-runtime-amd hip-samples hipblas hipblas-common-dev hipblas-dev hipblaslt
  hipblaslt-dev hipcc hipcub-dev hipfft hipfft-dev hipfort-dev hipify-clang
  hiprand hiprand-dev hipsolver hipsolver-dev hipsparse hipsparse-dev
  hipsparselt hipsparselt-dev hiptensor hiptensor-dev hsa-amd-aqlprofile
  hsa-rocr hsa-rocr-dev lib32asan8 lib32atomic1 lib32gcc-13-dev lib32gcc-s1
  lib32gomp1 lib32itm1 lib32quadmath0 lib32stdc++-13-dev lib32stdc++6
  lib32ubsan1 libamd-comgr2 libamdhip64-5 libasan6 libavcodec-dev
  libavformat-dev libavutil-dev libc6-dev-i386 libc6-dev-x32 libc6-i386
  libc6-x32 libdc1394-dev libdrm-amdgpu-amdgpu1 libdrm-amdgpu-common
  libdrm-amdgpu-dev libdrm-amdgpu-radeon1 libdrm2-amdgpu libelf-dev
  libevent-pthreads-2.1-7t64 libexif-dev libexif-doc libfabric1
  libfile-copy-recursive-perl libgcc-11-dev libgdcm-dev libgl2ps1.4
  libgphoto2-dev libhsa-runtime64-1 libhsakmt1 libimath-dev libllvm17t64
  libmunge2 libnuma-dev libopencv-calib3d-dev libopencv-contrib-dev
  libopencv-core-dev libopencv-dev libopencv-dnn-dev libopencv-features2d-dev
  libopencv-flann-dev libopencv-highgui-dev libopencv-imgcodecs-dev
  libopencv-imgproc-dev libopencv-java libopencv-ml-dev
  libopencv-objdetect-dev libopencv-photo-dev libopencv-photo406t64
  libopencv-shape-dev libopencv-shape406t64 libopencv-stitching-dev
  libopencv-stitching406t64 libopencv-superres-dev libopencv-superres406t64
  libopencv-video-dev libopencv-videoio-dev libopencv-videoio406t64
  libopencv-videostab-dev libopencv-videostab406t64 libopencv-viz-dev
  libopencv-viz406t64 libopencv406-jni libopenexr-dev libopenmpi3t64
  libpmix2t64 libpsm-infinipath1 libpsm2-2 libraw1394-dev libraw1394-tools
  librdmacm1t64 libstdc++-11-dev libswresample-dev libswscale-dev libtbb-dev
  libtsan0 libucx0 libvtk9.1t64 libx32asan8 libx32atomic1 libx32gcc-13-dev
  libx32gcc-s1 libx32gomp1 libx32itm1 libx32quadmath0 libx32stdc++-13-dev
  libx32stdc++6 libx32ubsan1 mesa-common-dev migraphx migraphx-dev miopen-hip
  miopen-hip-dev mivisionx mivisionx-dev opencv-data openmp-extras-dev
  openmp-extras-runtime python3-argcomplete rccl rccl-dev rocalution
  rocalution-dev rocblas rocblas-dev rocfft rocfft-dev rocm-cmake rocm-core
  rocm-dbgapi rocm-debug-agent rocm-developer-tools rocm-device-libs rocm-gdb
  rocm-hip-libraries rocm-hip-runtime rocm-hip-runtime-dev rocm-hip-sdk
  rocm-language-runtime rocm-llvm rocm-ml-libraries rocm-ml-sdk rocm-opencl
  rocm-opencl-dev rocm-opencl-runtime rocm-opencl-sdk rocm-openmp-sdk
  rocm-smi-lib rocm-utils rocminfo rocprim-dev rocprofiler rocprofiler-compute
  rocprofiler-dev rocprofiler-plugins rocprofiler-register rocprofiler-sdk
  rocprofiler-sdk-roctx rocprofiler-systems rocrand rocrand-dev rocsolver
  rocsolver-dev rocsparse rocsparse-dev rocthrust-dev roctracer roctracer-dev
  rocwmma-dev rpp rpp-dev valgrind
Suggested packages:
  lib32stdc++6-13-dbg libx32stdc++6-13-dbg opencv-doc libraw1394-doc
  libstdc++-11-doc libtbb-doc mpi-default-bin vtk9-doc vtk9-examples
  valgrind-dbg valgrind-mpi kcachegrind alleyoop valkyrie
The following NEW packages will be installed:
  amd-smi-lib amdgpu-core comgr composablekernel-dev g++-13-multilib
  g++-multilib gcc-11-base gcc-13-multilib gcc-multilib half hip-dev hip-doc
  hip-runtime-amd hip-samples hipblas hipblas-common-dev hipblas-dev hipblaslt
  hipblaslt-dev hipcc hipcub-dev hipfft hipfft-dev hipfort-dev hipify-clang
  hiprand hiprand-dev hipsolver hipsolver-dev hipsparse hipsparse-dev
  hipsparselt hipsparselt-dev hiptensor hiptensor-dev hsa-amd-aqlprofile
  hsa-rocr hsa-rocr-dev lib32asan8 lib32atomic1 lib32gcc-13-dev lib32gcc-s1
  lib32gomp1 lib32itm1 lib32quadmath0 lib32stdc++-13-dev lib32stdc++6
  lib32ubsan1 libamd-comgr2 libamdhip64-5 libasan6 libavcodec-dev
  libavformat-dev libavutil-dev libc6-dev-i386 libc6-dev-x32 libc6-i386
  libc6-x32 libdc1394-dev libdrm-amdgpu-amdgpu1 libdrm-amdgpu-common
  libdrm-amdgpu-dev libdrm-amdgpu-radeon1 libdrm2-amdgpu libelf-dev
  libevent-pthreads-2.1-7t64 libexif-dev libexif-doc libfabric1
  libfile-copy-recursive-perl libgcc-11-dev libgdcm-dev libgl2ps1.4
  libgphoto2-dev libhsa-runtime64-1 libhsakmt1 libimath-dev libllvm17t64
  libmunge2 libnuma-dev libopencv-calib3d-dev libopencv-contrib-dev
  libopencv-core-dev libopencv-dev libopencv-dnn-dev libopencv-features2d-dev
  libopencv-flann-dev libopencv-highgui-dev libopencv-imgcodecs-dev
  libopencv-imgproc-dev libopencv-java libopencv-ml-dev
  libopencv-objdetect-dev libopencv-photo-dev libopencv-photo406t64
  libopencv-shape-dev libopencv-shape406t64 libopencv-stitching-dev
  libopencv-stitching406t64 libopencv-superres-dev libopencv-superres406t64
  libopencv-video-dev libopencv-videoio-dev libopencv-videoio406t64
  libopencv-videostab-dev libopencv-videostab406t64 libopencv-viz-dev
  libopencv-viz406t64 libopencv406-jni libopenexr-dev libopenmpi3t64
  libpmix2t64 libpsm-infinipath1 libpsm2-2 libraw1394-dev libraw1394-tools
  librdmacm1t64 libstdc++-11-dev libswresample-dev libswscale-dev libtbb-dev
  libtsan0 libucx0 libvtk9.1t64 libx32asan8 libx32atomic1 libx32gcc-13-dev
  libx32gcc-s1 libx32gomp1 libx32itm1 libx32quadmath0 libx32stdc++-13-dev
  libx32stdc++6 libx32ubsan1 mesa-common-dev migraphx migraphx-dev miopen-hip
  miopen-hip-dev mivisionx mivisionx-dev opencv-data openmp-extras-dev
  openmp-extras-runtime python3-argcomplete rccl rccl-dev rocalution
  rocalution-dev rocblas rocblas-dev rocfft rocfft-dev rocm rocm-cmake
  rocm-core rocm-dbgapi rocm-debug-agent rocm-developer-tools rocm-device-libs
  rocm-gdb rocm-hip-libraries rocm-hip-runtime rocm-hip-runtime-dev
  rocm-hip-sdk rocm-language-runtime rocm-llvm rocm-ml-libraries rocm-ml-sdk
  rocm-opencl rocm-opencl-dev rocm-opencl-runtime rocm-opencl-sdk
  rocm-openmp-sdk rocm-smi-lib rocm-utils rocminfo rocprim-dev rocprofiler
  rocprofiler-compute rocprofiler-dev rocprofiler-plugins rocprofiler-register
  rocprofiler-sdk rocprofiler-sdk-roctx rocprofiler-systems rocrand
  rocrand-dev rocsolver rocsolver-dev rocsparse rocsparse-dev rocthrust-dev
  roctracer roctracer-dev rocwmma-dev rpp rpp-dev valgrind
0 upgraded, 199 newly installed, 0 to remove and 1 not upgraded.
Need to get 3,122 MB/3,273 MB of archives.
After this operation, 20.4 GB of additional disk space will be used.
Do you want to continue? [Y/n] Y

Whoa. First new personal computer hardware in a long while and it shows. #darktable is _very_ fast on the new framework 13. Enabled opencl via rocm and did have some weird behavior last night, but I've yet to reproduce.

I may actually get through my backlog of edits one of these days :)

Continued thread

Good news: today I had the opportunity to talk with #amd staff:
1. Yes, transformer support was a higher priority than convolutions.
2. My problem can be traced to the backward pass of Conv3D.
3. By setting the MIOPEN_FIND_MODE=3 and MIOPEN_FIND_ENFORCE=3 environment variables in #rocm 6.4, I got a huge performance boost, such that my code now runs faster on the #mi300a than on the #A100 🤩
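
For anyone who wants to try the same two variables, here is a minimal sketch of one way to set them for a single training run without touching the rest of the shell session. Only the variable names and the value 3 come from the post above; the helper names `with_miopen_tuning` and `run_training` are made up for illustration, and whether the settings help on other hardware or models is untested.

```python
import os
import subprocess

# Variable names and values as quoted in the post; nothing else is assumed
# about what MIOpen does with them.
MIOPEN_TUNING = {
    "MIOPEN_FIND_MODE": "3",
    "MIOPEN_FIND_ENFORCE": "3",
}


def with_miopen_tuning(base_env=None):
    """Return a copy of base_env (default: os.environ) with the MIOpen variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(MIOPEN_TUNING)
    return env


def run_training(argv):
    """Run a training command (hypothetical, e.g. ["python", "train.py"]) with the tuned environment."""
    return subprocess.run(argv, env=with_miopen_tuning(), check=True)
```

Calling `run_training(["python", "train.py"])` would start the child process with both variables set. Because they are passed through the child's environment, they are in place before the framework is loaded there.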

#AMD splits #ROCm toolkit into two parts – ROCm #AMDGPU drivers get their own branch under Instinct #datacenter #GPU moniker
The new #datacenter Instinct driver is a renamed version of the #Linux AMDGPU driver packages that are already distributed and documented with ROCm. Previously, everything related to ROCm (including the amdgpu driver) existed as part of the ROCm software stack.
tomshardware.com/pc-components

Tom's Hardware · AMD splits ROCm toolkit into two parts – ROCm AMDGPU drivers get their own branch under Instinct datacenter GPU moniker · By Aaron Klotz
Continued thread

Then your Docker compose container should have:

image-name:
  build:
    context: .
  devices:
    - /dev/dri
    - /dev/kfd
  group_add:
    - video
  shm_size: 4G
  environment:
    - PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.9,max_split_size_mb:512
    - PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.9,max_split_size_mb:512
    - HSA_OVERRIDE_GFX_VERSION=11.0.0

#rocm #gfx1103 #780M
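
A quick way to confirm that the compose settings above actually reached the container is to check for the two device nodes and the override variable from inside it. The sketch below does only that, using the standard library. The function name `check_rocm_container` is hypothetical, and passing this check does not prove that PyTorch will be stable on the GPU.

```python
import os

# Device nodes and override value taken from the compose fragment above.
REQUIRED_DEVICES = ("/dev/kfd", "/dev/dri")
EXPECTED_ENV = {"HSA_OVERRIDE_GFX_VERSION": "11.0.0"}


def check_rocm_container(devices=REQUIRED_DEVICES, expected_env=EXPECTED_ENV,
                         environ=None, exists=os.path.exists):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    environ = os.environ if environ is None else environ
    problems = []
    # Each device passed through by compose should be visible as a path.
    for path in devices:
        if not exists(path):
            problems.append(f"missing device node: {path}")
    # Each expected variable should be set to exactly the expected value.
    for name, wanted in expected_env.items():
        actual = environ.get(name)
        if actual != wanted:
            problems.append(f"{name} is {actual!r}, expected {wanted!r}")
    return problems
```

Run inside the container, `print(check_rocm_container() or "looks fine")` lists anything that is missing. The `environ` and `exists` parameters exist only so the function can be exercised without real hardware.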

So, good news. ROCm 6.3.4 and PyTorch 2.4.0 seem stable enough with gfx1103 if I use the HSA override for 11.0.0, with the latest firmware blobs and kernel 6.13.10 on Fedora 41.

In your Dockerfile, build your AI app from:
```
FROM rocm/pytorch:rocm6.3.4_ubuntu24.04_py3.12_pytorch_release_2.4.0
```

Been fighting the whole day trying to get ROCm to play nice with the 780M and PyTorch. With the latest #rocm my laptop just freezes on gfx1103, both with the HSA override set to 11.0.0 and with 10.3.0 :blobcatknife:

#amd really needs to fix this crap for their GPUs. Using Docker and their provided ROCm images. I know, 780M is not supported. But c’mon, ALL Nvidia cards can run #CUDA just fine. #rant

Makes me sad that #AMD #ROCm isn't officially supported on iGPUs. Like, it'll kinda run, but then it's likely to crash my window manager and freeze my machine. Sadly the stuff I want to use doesn't have vulkan support yet.

The B-17 Bomber was amazing and helped win WWII. I flew on one in 2002 as a tourist - I have family members that were ball turret gunners - bad place to be.

This video was shot on Hi-8, and thankfully I digitized it (at 720x480) way back in that day. Now, I've up-scaled it with local AI (1408x954) and the improvement is astounding.

Sadly, this actual B17 crashed in 2019: en.wikipedia.org/wiki/2019_Boe

#localai
#stablediffusion
#rocm
#amd
#b17
#flyingfortress

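🌘 AITER: AI Tensor Engine for ROCm
➤ Accelerating AI computation with ROCm's AITER for a major performance boost
rocm.blogs.amd.com/software-to
The AI Tensor Engine for ROCm (AITER) is AMD's high-performance AI operator library. It offers a wide range of features and strong performance optimizations to help developers get the most out of their GPUs.
+ The article clearly explains how ROCm's AITER improves the performance of AI applications, and it is a useful reference for developers who want to optimize GPU computation.
+ Optimizing the performance of AI computation is crucial to the future of every industry. ROCm's AITER brings a new solution that makes AI applications more competitive.
#ArtificialIntelligence #ROCm #TensorEngine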

rocm.blogs.amd.com · AITER: AI Tensor Engine For ROCm · ROCm Blogs · We introduce AMD's AI Tensor Engine for ROCm (AITER), our centralized high performance AI operators repository, designed to significantly accelerate AI workloads on AMD GPUs