#apache-spark #data-science #technology #data-engineering #programming

What makes tools truly useful?
Episode 2 of #TheTestSet features Wes McKinney (Part 1 of 2!) sharing his experience building Pandas & Arrow, plus his surprising past in speedrun communities.
Tune in for his story at thetestset.co, on Spotify, or on Apple Podcasts.
I asked my ETL job how life was going.
#etl #dataengineering #data #sql #meme
3 Ways to use Apache Kafka in a Real-time Data Stack – #dataengineering #streaming #kafka #shorts
Discover how CocoIndex transforms data orchestration with a pure Data Flow Programming model — ensuring traceable, immutable, and declarative pipelines for know https://hackernoon.com/redefining-data-operations-with-data-flow-programming-in-cocoindex-u486ao8 #dataengineering
@hynek released another great video on uv, where he explained how he uses the just tool to store commands in a cross‑platform, portable way for everyday tasks like installing or refreshing virtual environments, running tests and code checks and even development tasks like sending requests.
Ever wonder about the mind behind Pandas & Apache Arrow? Ep. 2 of #TheTestSet (Part 1!) unpacks Wes McKinney's journey – including his speedrunning past! What makes good tools good?
Listen at https://thetestset.co, on Spotify, or Apple Podcasts
#dataengineering If you needed to use a data lake with Redshift, would you use Iceberg, given some native support, over Delta Lake, which is arguably a better format?
Asking for a friend who is me
Excited about AXLearn for modular ML training, Pinterest's Moka for massive data processing, and PromiseTune for causal configuration tuning! #MachineLearning #DataEngineering
New tech news! Apache Parquet is developing a breakthrough feature that allows user-defined indexes to be embedded directly in Parquet files. This promises to significantly improve query performance, making big data processing faster and more efficient.
#ApacheParquet #DataEngineering #BigData #Indexes #DataFusion #CôngNghệDữLiệu #DữLiệuLớn #TốiƯuHiệuNăng
https://datafusion.apache.org/blog/2025/07/14/user-defined-parquet-indexes/