
#CSOs

arXiv.org: Unlocking True Elasticity for the Cloud-Native Era with Dandelion

Elasticity is fundamental to cloud computing, as it enables quickly allocating resources to match the demand of each workload as it arrives, rather than pre-provisioning resources to meet performance objectives. However, even serverless platforms -- which boot sandboxes in 10s to 100s of milliseconds -- are not sufficiently elastic to avoid over-provisioning expensive resources. Today's FaaS platforms rely on pre-provisioning many idle sandboxes in memory to reduce the occurrence of slow cold starts. A key obstacle to high elasticity is booting a guest OS and configuring features like networking in sandboxes, which are required to expose an isolated POSIX-like interface to user functions. Our key insight is that redesigning the interface for applications in the cloud-native era enables co-designing a much more efficient and elastic execution system. Now is a good time to rethink cloud abstractions, as developers are building applications to be cloud-native. Cloud-native applications typically consist of user-provided compute logic interacting with cloud services (for storage, AI inference, query processing, etc.) exposed over REST APIs. Hence, we propose Dandelion, an elastic cloud platform with a declarative programming model that expresses applications as DAGs of pure compute functions and higher-level communication functions. Dandelion can securely execute untrusted user compute functions in lightweight sandboxes that cold start in hundreds of microseconds, since pure functions do not rely on extra software environments such as a guest OS. Dandelion makes it practical to boot a sandbox on demand for each request, decreasing performance variability by two to three orders of magnitude compared to Firecracker and reducing committed memory by 96% on average when running the Azure Functions trace.
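
To make the declarative model concrete, here is a minimal, hypothetical Python sketch of an application expressed as a DAG of pure compute functions and communication functions. Every name here (Node, run_dag, the example services) is an illustrative assumption, not Dandelion's actual API:

```python
# Hypothetical sketch of a Dandelion-style declarative DAG (names are
# assumptions, not the paper's API): nodes are either pure compute
# functions or higher-level communication functions.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    fn: callable                      # pure function: no OS, no sockets, no files
    deps: list = field(default_factory=list)

def run_dag(nodes):
    """Topologically execute the DAG; each compute node could cold-start in
    its own lightweight sandbox, since it needs no guest OS underneath."""
    done = {}
    remaining = list(nodes)
    while remaining:
        for n in list(remaining):
            if all(d in done for d in n.deps):
                done[n.name] = n.fn(*[done[d] for d in n.deps])
                remaining.remove(n)
    return done

# fetch (communication) -> transform (pure, untrusted user code) -> store (communication)
fetch = Node("fetch", lambda: b"raw data")                         # e.g. GET from a storage REST API
transform = Node("transform", lambda raw: raw.upper(), ["fetch"])
store = Node("store", lambda out: f"stored {len(out)} bytes", ["transform"])
print(run_dag([fetch, transform, store])["store"])                 # -> stored 8 bytes
```

The point of the separation is that the pure transform step carries no POSIX dependencies, which is what would let a platform boot its sandbox in microseconds rather than milliseconds.
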
arXiv.org: Concurrency Testing in the Linux Kernel via eBPF

Concurrency is vital for our critical software to meet modern performance requirements, yet concurrency bugs are notoriously difficult to detect and reproduce. Controlled Concurrency Testing (CCT) can make bugs easier to expose by enabling control over thread interleavings and systematically exploring the interleaving space through scheduling algorithms. However, existing CCT solutions for kernel code are heavyweight, leading to significant performance, maintainability, and extensibility issues. In this work, we introduce LACE, a lightweight CCT framework for kernel code empowered by eBPF. Without hypervisor modification, LACE features a custom scheduler tailored for CCT algorithms to serialize non-deterministic thread execution into a controlled ordering. LACE also provides a mechanism to safely inject scheduling points into the kernel for fine-grained control. Furthermore, LACE employs a two-phase mutation strategy to integrate the scheduler with a concurrency fuzzer, allowing for automated exploration of both the input and schedule space. In our evaluation, LACE achieves 38% more branches, 57% overhead reduction, and an 11.4x speed-up in bug exposure compared to state-of-the-art kernel concurrency fuzzers. Our qualitative analysis also demonstrates the extensibility and maintainability of LACE. Furthermore, LACE discovers eight previously unknown bugs in the Linux kernel, with six confirmed by developers.
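
The core CCT idea, serializing non-deterministic threads into one chosen interleaving and then systematically exploring the interleaving space, can be illustrated in user space. The sketch below is a toy Python analogue (the real system enforces orderings inside the kernel via eBPF):

```python
# Toy user-space analogue of controlled concurrency testing: run the
# "threads" one step at a time in a chosen order, so every interleaving
# is deterministic and the whole schedule space can be enumerated.
import itertools

def run_schedule(schedule, ops):
    """Execute one step of thread `tid` at each scheduling point, in the
    exact order given by `schedule` (a tuple of thread ids)."""
    shared = {"x": 0}
    steps = {tid: iter(fns) for tid, fns in ops.items()}
    for tid in schedule:
        next(steps[tid])(shared)
    return shared["x"]

# Two threads each do a racy read-modify-write, split into two steps.
tmp = {}
ops = {
    0: [lambda s: tmp.__setitem__(0, s["x"]), lambda s: s.__setitem__("x", tmp[0] + 1)],
    1: [lambda s: tmp.__setitem__(1, s["x"]), lambda s: s.__setitem__("x", tmp[1] + 1)],
}
for sched in sorted(set(itertools.permutations((0, 0, 1, 1)))):
    if run_schedule(sched, ops) != 2:
        print("lost update exposed by schedule", sched)
```

A fuzzer like the one the abstract describes would mutate both the program input and these schedules rather than enumerating them exhaustively.
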
arXiv.org: Terminal Lucidity: Envisioning the Future of the Terminal

The Unix terminal, or simply the terminal, is applied in almost every facet of computing. It is available across all major platforms and is often integrated into other applications. Due to its ubiquity, even marginal improvements to the terminal have the potential to yield massive improvements to productivity on a global scale. We believe that evolutionary improvements to the terminal, in its current incarnation as a windowed terminal emulator, are possible, and that developing a thorough understanding of the issues current terminal users face is fundamental to knowing how the terminal should evolve. To develop that understanding, we mined Unix and Linux Stack Exchange using a fully reproducible method that extracted and categorized 91.0% of 1,489 terminal-related questions (from the full set of nearly 240,000 questions) without manual intervention. We present an analysis, to our knowledge the first of its kind, of windowed-terminal-related questions posted over a 15-year period and viewed, in aggregate, approximately 40 million times. As expected, given its longevity, we find the terminal's many features being applied across a wide variety of use cases. We find evidence that the terminal, as a windowed terminal emulator, has neither fully adapted to its now-current graphical environment nor completely untangled itself from features more suited to incarnations in previous environments. We also find evidence of areas where we believe the terminal could be extended, along with other areas where it could be simplified. Surprisingly, while many current efforts to improve the terminal include improving its social and collaborative aspects, we find little evidence of this as a prominent pain point.
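
For a rough idea of what a fully reproducible mining step might look like, here is a hedged Python sketch that filters terminal-related questions out of a Stack Exchange data dump. The tag set and the dump's Tags format are assumptions; the paper's actual pipeline and category taxonomy are not reproduced here:

```python
# Hedged sketch: stream Posts.xml from a Stack Exchange data dump and keep
# questions carrying terminal-related tags. Tag list and Tags attribute
# format are assumptions, not the paper's method.
import xml.etree.ElementTree as ET

TERMINAL_TAGS = {"terminal", "terminal-emulator", "xterm", "gnome-terminal"}

def terminal_questions(posts_xml_path):
    for _, row in ET.iterparse(posts_xml_path, events=("end",)):
        if row.tag == "row" and row.get("PostTypeId") == "1":   # 1 = question
            raw = row.get("Tags", "")                           # e.g. "<linux><terminal>"
            tags = set(raw.strip("<>").split("><")) if raw else set()
            if tags & TERMINAL_TAGS:
                yield row.get("Id"), row.get("Title"), int(row.get("ViewCount", "0"))
        row.clear()                                             # keep memory flat
```
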

After 2013, the way #Pakistan views NGOs has changed drastically, with many now regarding their role with doubt and suspicion of Western influence.

In this conversation, we are joined by Ms. Sabiha Shaheen, executive director of BARGAD Youth; Mr. Mohammad Waseem, director of the Interactive Resource Centre; and Ms. Uzma Yaqoob, director of FDI Pakistan, to talk about the different roles of #NGOs, Civil Societies, and #CSOs in Pakistan today.

Watch the full episode here: youtu.be/DBsEoYkur94

arXiv.org: Large Language Models as Realistic Microservice Trace Generators

Computer system workload traces, which record hardware or software events during application execution, are essential for understanding the behavior of complex systems and managing their processing and memory resources. However, obtaining real-world traces can be challenging due to the significant overheads of collection and the privacy concerns that arise in proprietary systems. As a result, synthetic trace generation is considered a promising alternative to using traces collected in real-world production deployments. This paper proposes to train a large language model (LLM) to generate synthetic workload traces, specifically microservice call graphs. To capture complex and arbitrary hierarchical structures and implicit constraints in such traces, we fine-tune LLMs to generate each layer recursively, making call-graph generation a sequence of easier steps. To further enforce learning constraints in traces and generate uncommon situations, we apply additional instruction-tuning steps to align our model with the desired trace features. Our evaluation results show that our model can generate diverse, realistic traces under various conditions and outperforms existing methods in accuracy and validity. We show that our synthetically generated traces can effectively substitute for real-world data in optimizing or tuning systems-management tasks. We also show that our model can be adapted to perform key downstream trace-related tasks, specifically predicting key trace features and infilling missing data given partial traces. Code is available at https://github.com/ldos-project/TraceLLM.
#csse #csai #csdc
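
The recursive, layer-by-layer generation described above can be sketched as follows. The model call is a stub standing in for the fine-tuned LLM, and the function names are illustrative (the real code is in the repository linked above):

```python
# Sketch of layer-wise recursive call-graph generation: each model call
# expands only one span's direct children, so generating a whole call
# graph becomes a sequence of easier steps.
def generate_layer(model, parent_span):
    prompt = f"Parent call: {parent_span}. List direct downstream calls:"
    return model(prompt)                      # e.g. ["auth-svc", "db-svc"]

def generate_call_graph(model, root, depth):
    if depth == 0:
        return {root: []}
    children = generate_layer(model, root)
    graph = {root: children}
    for child in children:                    # recurse: each child is the next parent
        graph.update(generate_call_graph(model, child, depth - 1))
    return graph

# Toy stand-in model so the sketch runs end to end.
fake = lambda prompt: ["auth-svc", "db-svc"] if "frontend" in prompt else []
print(generate_call_graph(fake, "frontend", 2))
# -> {'frontend': ['auth-svc', 'db-svc'], 'auth-svc': [], 'db-svc': []}
```
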
arXiv.org: Cache-Craft: Managing Chunk-Caches for Efficient Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) is often used with Large Language Models (LLMs) to infuse domain knowledge or user-specific information. In RAG, given a user query, a retriever extracts chunks of relevant text from a knowledge base. These chunks are sent to an LLM as part of the input prompt. Typically, any given chunk is repeatedly retrieved across user questions. However, currently, for every question, attention layers in LLMs fully compute the key values (KVs) repeatedly for the input chunks, as state-of-the-art methods cannot reuse KV-caches when chunks appear at arbitrary locations with arbitrary contexts, and naive reuse leads to output quality degradation. This results in potentially redundant computations on expensive GPUs and increased latency. In this work, we propose Cache-Craft, a system for managing and reusing precomputed KVs corresponding to text chunks (which we call chunk-caches) in RAG-based systems. We present how to identify chunk-caches that are reusable, how to efficiently perform a small fraction of recomputation to fix a cache so as to maintain output quality, and how to efficiently store and evict chunk-caches in the hardware to maximize reuse while masking any overheads. With real production workloads as well as synthetic datasets, we show that Cache-Craft reduces redundant computation by 51% over state-of-the-art prefix-caching and 75% over full recomputation. Additionally, with continuous batching on a real production workload, we get a 1.6X speed-up in throughput and a 2X reduction in end-to-end response latency over prefix-caching while maintaining quality, for both the LLaMA-3-8B and LLaMA-3-70B models.
#csdc #csai #cscl
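
A minimal sketch of the chunk-cache idea, assuming an LRU eviction policy and a caller-supplied fixup step for the partial recomputation (both are simplifying assumptions; the paper's reuse and eviction mechanisms are more involved):

```python
# Illustrative chunk-cache manager: reuse a chunk's precomputed KVs on a
# hit and apply a cheap fixup, or do the full (expensive) prefill on a miss.
import hashlib
from collections import OrderedDict

class ChunkCache:
    def __init__(self, capacity=128):
        self.kv = OrderedDict()               # chunk hash -> KV entry (stubbed)
        self.capacity = capacity

    def get_or_compute(self, chunk_text, compute_kv, fixup_kv):
        key = hashlib.sha256(chunk_text.encode()).hexdigest()
        if key in self.kv:
            self.kv.move_to_end(key)          # LRU touch
            return fixup_kv(self.kv[key])     # small recomputation to fit the new context
        kv = compute_kv(chunk_text)           # full prefill for a miss
        self.kv[key] = kv
        if len(self.kv) > self.capacity:
            self.kv.popitem(last=False)       # evict least recently used
        return kv

cache = ChunkCache()
kv1 = cache.get_or_compute("chunk A", lambda t: f"KV[{t}]", lambda kv: kv)        # miss: full compute
kv2 = cache.get_or_compute("chunk A", lambda t: f"KV[{t}]", lambda kv: kv + "*")  # hit: cheap fixup
print(kv1, kv2)
```
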