mastodon.world is one of the many independent Mastodon servers you can use to participate in the fediverse.
Generic Mastodon server for anyone to use.

Server stats: 8.1K active users
#opencl

1 post · 1 participant · 0 posts today
HGPU group: Performant Unified GPU Kernels for Portable Singular Value Computation Across Hardware and Precision
#OpenCL #SYCL #HIP #Kokkos #Julia
https://hgpu.org/?p=30096
@reiver ⊼ (Charles): (programming) Golang & OpenGL
LLMs: ConTraPh: Contrastive Learning for Parallelization and Performance Optimization. "With the advancement of HPC platforms, the demand for high-performing applications continues to grow. One effective w..."
#Computer #science #OpenCL #paper #Code #generation #Heterogeneous #systems #HPC #LLM #nVidia
Origin: https://hgpu.org/?p=30084 | Interest: https://awakari.com/sub-details.html?id=LLMs | Match: https://awakari.com/pub-msg.html?id=1QXUjmnGWJOcg97Pjiqo6zYBZBY&interestId=LLMs
HGPU group: ConTraPh: Contrastive Learning for Parallelization and Performance Optimization
#OpenCL #OpenACC #OpenMP #HPC #LLM #CodeGeneration
https://hgpu.org/?p=30084
Habr: Learning GPU development through the GEMM operation. "Hi, Habr! Today I'll cover implementing matrix multiplication and the specifics of developing for GPUs. I'll introduce the GPU architecture, explain how its programming differs from the familiar CPU model, and which details matter for an efficient GEMM implementation. Then we'll compare the performance of the different approaches."
https://habr.com/ru/companies/yadro/articles/934878/
#gpu_вычисления #opencl #gemm
Martin Boller: Short write-up on running Hashcat 7 (or older) with OpenCL on CPUs and/or using the Nouveau FOSS driver for NVIDIA cards.
http://www.infosecworrier.dk/blog/2025/08/opencl/
All the good stuff is from @tychotithonus's original post. The rest is just me standing on his shoulders.
#Hashcat #OpenCL #Nouveau #Linux #ABC #AlwaysBeCracking #NVIDIA #Legacy
Dantali0n: One of the major problems with #OpenCL is that its kernel placement across the global and local ranges is more a suggestion for how to distribute execution among compute units and threads than a hard requirement. But on most vendors you can still expect poor performance if your local range is smaller than the wavefront (warp) size of your architecture.
#GPU
Dr. Moritz Lehmann: Turns out the #OpenCL __builtin_amdgcn_sdot4 intrinsic for dp4a on AMD #GPUs is only supported up to RDNA2. RDNA3+ needs another intrinsic, __builtin_amdgcn_sudot4 🖖🤯
My OpenCL-Benchmark now supports both: https://github.com/ProjectPhysX/OpenCL-Benchmark/blob/master/src/kernel.cpp#L6-L20
https://github.com/llvm/llvm-project/blob/c1968fee972859dfd03a7e698422e18a5bc1d478/llvm/include/llvm/IR/IntrinsicsAMDGPU.td#L3213
Neustradamus: #Mesa 25.1.5 has been released (#Mesa3D / #Mesa3DGraphicsLibrary / #GraphicsLibrary / #OpenGL / #EGL / #OpenCL / #Vulkan / #Gallium3D) https://mesa3d.org/
Khronos Group: OpenCL v3.0.19 maintenance update released with bug fixes and clarifications, and two new extensions: cl_khr_spirv_queries, to simplify querying the SPIR-V capabilities of a device, and cl_khr_external_memory_android_hardware_buffer, to interoperate more efficiently with other APIs on Android devices. In addition, the cl_khr_kernel_clock extension, for sampling a clock within a kernel, has been finalized and is no longer experimental.
Khronos #OpenCL Registry: https://registry.khronos.org/OpenCL/
Rainer: @GuettisKnippse Under Settings/Editing, is #OpenCL enabled?
Gamey: I want to get #davinci_resolve working on #Fedora 42 with my now very old AMD RX 480 8GB, but it uses #OpenCL. The obvious choice would be #rocm, but that dropped support for my GPU years ago and, from what I found, has also caused issues with DaVinci Resolve for even longer. The other obvious choice would be Mesa's implementation, but while #Rusticl has improved things, it's still not feature-complete and rather slow. Is it smart to use the amdgpu-pro ICD with Mesa drivers for this?
रञ्जित (Ranjit Mathew): "Blackwell: Nvidia's Massive GPU", Chester Lam, Chips And Cheese (https://chipsandcheese.com/p/blackwell-nvidias-massive-gpu).
On HN: https://news.ycombinator.com/item?id=44409391
#Nvidia #GPU #Blackwell #Hardware #HPC #OpenCL
Dr. Moritz Lehmann: Finally I can "SLI" AMD+Intel+Nvidia #GPUs at home! I simulated this crow in flight at 680M grid cells in 36GB VRAM, pooled together from
- 🟥 #AMD Radeon RX 7700 XT 12GB (RDNA3)
- 🟦 #Intel Arc B580 12GB (Battlemage)
- 🟩 #Nvidia Titan Xp 12GB (Pascal)
My #FluidX3D #CFD software can pool the VRAM of any combination of any GPUs together via #OpenCL.
#Krähenliebe #birds #crow
https://www.youtube.com/watch?v=1z5-ddsmAag
John-Mark Gurney: As usual, getting something like cross-platform GPU compute working is a mess, because everyone likes to do their own thing and reinvent the wheel.
I would like something that is [modern] macOS and FreeBSD compatible, but that doesn't look possible since Apple deprecated OpenCL.
(Also, could Apple have picked a less searchable term for their new GPU framework?)
It's again looking like the best way to be cross-platform is to use JS+browser.
Or am I missing some library?
#OpenCL #GPUCompute #FreeBSD
karolherbst: Who is using CL_sRGBA images with #OpenCL, specifically to write to them (cl_khr_srgb_image_writes)?
There is limited hardware support for writing to sRGBA images, and I'm now curious what even uses that feature.
It was apparently important enough to require support in OpenCL 2.0, but... that's not telling me much.
Dr. Moritz Lehmann: Is it possible to run AMD+Intel+Nvidia #GPUs in the same PC? Yes! 🖖😋
Got this RDNA3 chonker for free from the 11 bit studios contest! It completes my 36GB VRAM RGB SLI abomination setup:
- 🟥 #AMD Radeon RX 7700 XT 12GB
- 🟦 #Intel Arc B580 12GB
- 🟩 #Nvidia Titan Xp 12GB
The drivers all work together in #Linux Ubuntu 24.04.2. The backbone is an ASUS ProArt Z790 with an i7-13700K and 64GB RAM, PCIe 4.0 x8/x8 + 3.0 x4: plenty of interconnect bandwidth.
Finally I can develop and test #OpenCL on all major platforms!
txt.file: Today’s hate about computers and software
LLMs: CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark. "We introduce CASS, the first l..."
https://hgpu.org/?p=29913
#Computer #science #CUDA #OpenCL #paper #AI #AMD #Radeon #RX #7900 #XT
Result Details: https://awakari.com/pub-msg.html?id=BOlq25XcQ0BBmvhFBBHMoaU3P7Y&interestId=LLMs