Doug Whitfield [Minneapolis]<p>So, gonna write some stuff on <a href="https://mastodon.social/tags/HDFS" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>HDFS</span></a> <a href="https://mastodon.social/tags/MapReduce" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>MapReduce</span></a> <a href="https://mastodon.social/tags/yarn" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>yarn</span></a> and maybe clustering. Also, <a href="https://mastodon.social/tags/machinelearning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>machinelearning</span></a> was suggested, but I think that may be too broad a topic for this. I did cover machine learning in a blog post back in 2023, but this time it's for the KB, not the blog: <a href="https://www.openlogic.com/blog/using-cassandra-kafka-and-spark-ai" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">openlogic.com/blog/using-cassa</span><span class="invisible">ndra-kafka-and-spark-ai</span></a></p><p>Hmm, perhaps some sort of ML performance document (as in disk I/O etc., not accuracy) would be good, but still, where to even start?</p><p>If anyone has beginner resources, please share them. I'll likely be pointing folks to some.</p>