#retrievalaugmentedgeneration


New #preprint on #DataAnalysis:
intelligent #agents with #RetrievalAugmentedGeneration to automate data analysis, dataset curation and indexing at scale.

Check out the work of Mara Graziani et al. at: arxiv.org/abs/2502.15718

arXiv.org
Making Sense of Data in the Wild: Data Analysis Automation at Scale

As the volume of publicly available data continues to grow, researchers face the challenge of limited diversity in benchmarking machine learning tasks. Although thousands of datasets are available in public repositories, the sheer abundance often complicates the search for suitable data, leaving many valuable datasets underexplored. This situation is further amplified by the fact that, despite longstanding advocacy for improving data curation quality, current solutions remain prohibitively time-consuming and resource-intensive. In this paper, we propose a novel approach that combines intelligent agents with retrieval augmented generation to automate data analysis, dataset curation and indexing at scale. Our system leverages multiple agents to analyze raw, unstructured data across public repositories, generating dataset reports and interactive visual indexes that can be easily explored. We demonstrate that our approach results in more detailed dataset descriptions, higher hit rates and greater diversity in dataset retrieval tasks. Additionally, we show that the dataset reports generated by our method can be leveraged by other machine learning models to improve the performance on specific tasks, such as improving the accuracy and realism of synthetic data generation. By streamlining the process of transforming raw data into machine-learning-ready datasets, our approach enables researchers to better utilize existing data resources.

Discover how #Uber has developed Genie - an #AI-powered on-call copilot designed to improve the efficiency of on-call support engineers.

Powered by #RetrievalAugmentedGeneration (RAG), Genie delivers accurate real-time responses and significantly enhances incident response speed and effectiveness.

Since its launch, Genie has:
• Answered 70,000+ questions across 154 Slack channels
• Saved an estimated 13,000 engineering hours
• Achieved a 48.9% helpfulness rate from user feedback

Read more on #InfoQ 👉 bit.ly/3YKbP1N

🎙️ New InfoQ Podcast: A Primer on AI for Architects with Anthony Alford

In this episode of the #InfoQ #podcast with Thomas Betts, Anthony Alford, Senior Director at Genesys and InfoQ Editor, breaks down the essential AI concepts that software architects need to understand in today’s evolving tech landscape. From machine learning models to large language models (LLMs) and retrieval-augmented generation (RAG), Anthony covers everything you need to know.

Key Takeaways:
1️⃣ AI ≠ Magic: Most AI today is machine learning. LLMs like GPT are functions that take inputs and provide outputs, just like any other API call.
2️⃣ LLM Adoption: Define success criteria before integrating LLMs into your app. Start with Retrieval-Augmented Generation (RAG) to improve results.
3️⃣ Vector Databases: They enable nearest-neighbor searches, helping find relevant content to improve LLM responses.
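
To make takeaways 2️⃣ and 3️⃣ concrete, here is a minimal Python sketch of the nearest-neighbor lookup that a vector database performs, followed by the RAG step of placing the retrieved text in the prompt. Everything in it is an assumption for illustration: the three-dimensional "embeddings" are made up by hand, the function names (`cosine_similarity`, `nearest_neighbors`, `build_prompt`) are hypothetical, and a real system would use an embedding model and an approximate index rather than a brute-force scan. It is not taken from the podcast or from any product's API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest_neighbors(query_vec, index, k=2):
    """Brute-force scan: return the k texts whose vectors are closest to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_prompt(question, passages):
    """RAG step: put the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Toy index: (text, hand-made embedding). Real embeddings come from a model.
INDEX = [
    ("Restart the payment service with the runbook steps.", [0.9, 0.1, 0.0]),
    ("The cafeteria opens at 8 am.", [0.0, 0.1, 0.9]),
    ("Payment alerts page the on-call engineer.", [0.8, 0.3, 0.1]),
]

query_vec = [1.0, 0.2, 0.0]  # assumed embedding of the question below
top = nearest_neighbors(query_vec, INDEX, k=2)
prompt = build_prompt("How do I handle a payment alert?", top)
print(prompt)
```

The resulting `prompt` string is what would be passed to the LLM, which fits takeaway 1️⃣: the model is called like any other function, with the retrieved context as part of its input.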

👉 Listen to the full episode: bit.ly/4gqYORN

#AI #LLMs #RAG