Session Chairs: Moshe Gabel, University of Toronto, and Joseph Gonzalez, University of California, Berkeley, John Thorpe, Yifan Qiao, Jonathan Eyolfson, and Shen Teng, UCLA; Guanzhou Hu, UCLA and University of Wisconsin, Madison; Zhihao Jia, CMU; Jinliang Wei, Google Brain; Keval Vora, Simon Fraser; Ravi Netravali, Princeton University; Miryung Kim and Guoqing Harry Xu, UCLA. Hence, CLP enables efficient search and analytics on archived logs, something that was impossible without it. USENIX NSDI, 2021 Acceptance Rate: 15.99% Fluid: Resource-Aware Hyperparameter Tuning Engine P. Yu*, J. Liu*, M. Chowdhury (*Equal contribution) MLSys, 2021 Acceptance Rate: 23.53% NetLock: Fast, Centralized Lock Management Using Programmable Switches Z. Yu, Y. Zhang, V. Braverman, M. Chowdhury, X. Jin ACM SIGCOMM, 2020 Acceptance Rate: 21.6% We develop rigorous theoretical foundations to simplify equivalence examination and correction for partially equivalent transformations, and design an efficient search algorithm to quickly discover highly optimized programs by combining fully and partially equivalent optimizations at the tensor, operator, and graph levels. In particular, I'll argue for re-engaging with what computer hardware really is today and give two suggestions (among many) about how the OS research community can usefully do this, and exploit what is actually a tremendous opportunity. We also welcome work that explores the interface to related areas such as computer architecture, networking, programming languages, analytics, and databases. See www.cs.cmu.edu/~mmv/Veloso.html for her scientific publications. HotCRP.com signin Sign in using your HotCRP.com account. To this end, we propose GNNAdvisor, an adaptive and efficient runtime system to accelerate various GNN workloads on GPU platforms. Conference Dates: Apr 12, 2021 - Apr 14, 2021. Submissions violating the detailed formatting and anonymization rules will not be considered for review. The overhead of GPT is 5% for memory-intensive workloads (e.g., Redis) and negligible for CPU-intensive workloads (e.g., RV8 and Coremarks). Foreshadow was chosen as an IEEE Micro Top Pick. To adapt to different workloads, prior works mix or switch between a few known algorithms using manual insights or simple heuristics. First, GNNAdvisor explores and identifies several performance-relevant features from both the GNN model and the input graph, and use them as a new driving force for GNN acceleration. (Jan 2019) Our REPT paper won a best paper at OSDI'18 (Oct 2018) I will serve in the SOSP'19 PC. Responses should be limited to clarifying the submitted work. To resolve the problem, we propose a new LFS-aware ZNS interface, called ZNS+, and its implementation, where the host can offload data copy operations to the SSD to accelerate segment compaction. We introduce a hybrid cryptographic protocol for privacy-adhering transformations of encrypted data. DMons targeted optimizations provide 16.83% speedup on average (up to 53.14%), compared to a baseline that uses the highest level of compiler optimization. After request completion, an I/O device must decide either to minimize latency by immediately firing an interrupt or to optimize for throughput by delaying the interrupt, anticipating that more requests will complete soon and help amortize the interrupt cost. Storm ensures security using a Security Typed ORM that refines the (type) abstractions of each layer of the MVC API with logical assertions that describe the data produced and consumed by the underlying operation and the users allowed access to that data. Cores can safely and concurrently read from their local kernel replica, eliminating remote NUMA accesses. Used Zotero to organize papers about the stress and diffusion between anode and electrolyte and made a summary . In 2023 I started another two-year term on the . Forgot your password? By monitoring the status of each job during training, Pollux models how their goodput (a novel metric we introduce that combines system throughput with statistical efficiency) would change by adding or removing resources. We implement and evaluate a suite of applications, including MICA, Raft and Set Algebra for document retrieval; and we demonstrate that the nanoPU can be used as a high performance, programmable alternative for one-sided RDMA operations. The main contribution of this paper is GoJournal, a verified, concurrent journaling system that provides atomicity for storage applications, together with Perennial 2.0, a framework for formally specifying and verifying concurrent crash-safe systems. Pollux is implemented and publicly available as part of an open-source project at https://github.com/petuum/adaptdl. Compared to a state-of-the-art fuzzer, Fluffy improves the fuzzing throughput by 510 and the code coverage by 2.7 with various optimizations: in-process fuzzing, fuzzing harnesses for Ethereum clients, and semantic-aware mutation that reduces erroneous test cases. (Registered attendees: Sign in to your USENIX account to download these files. 64 papers accepted out of 341 submitted. Such centralized engines are in a perfect position to censor content and violate users privacy, undermining some of the key tenets behind decentralization. Writing a correct operating system kernel is notoriously hard. A PC member is a conflict if any of the following three circumstances applies: Institution: You are currently employed at the same institution, have been previously employed at the same institution within the past two years (not counting concluded internships), or are going to begin employment at the same institution during the review period. . Moreover, as of October 2020, a review of the 50 most cited empirical papers that list personality as a keyword indicates that all 50 papers were authored by people with insti tutional affiliations in the United States, Canada, Germany, the UK, and New Zealand, and only three papers included samples outside of these regions (see Supplementary We demonstrate the above using design, implementation and evaluation of blk-switch, a new Linux kernel storage stack architecture. Hence, kernel developers are constantly refining synchronization within OS kernels to improve scalability at the risk of introducing subtle bugs. A graph neural network (GNN) enables deep learning on structured graph data. Finding the inductive invariant of the distributed protocol is a critical step in verifying the correctness of distributed systems, but takes a long time to do even for simple protocols. Papers accompanied by nondisclosure agreement forms will not be considered. Questions? Existing systems that hide voice call metadata either require trusted intermediaries in the network or scale to only tens of users. To evaluate the security guarantees of Storm, we build a formally verified reference implementation using the Labeled IO (LIO) IFC framework. Existing frameworks optimize tensor programs by applying fully equivalent transformations, which maintain equivalence on every element of output tensors. If you have any questions about conflicts, please contact the program co-chairs. However, memory allocation decisions also impact overall application performance via data placement, offering opportunities to improve fleetwide productivity by completing more units of application work using fewer hardware resources. Based on the observation that invariants are often concise in practice, DistAI starts with small invariant formulas and enumerates all strongest possible invariants that hold for all samples. This formulation of memory management, which we call memory programming, is a generalization of paging that allows MAGE to provide a highly efficient virtual memory abstraction for SC. Paper Submission Information All submissions must be received by 11:59 PM AoE (UTC-12) on the day of the corresponding deadline. Jiachen Wang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Ding Ding, Department of Computer Science, New York University; Huan Wang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Conrad Christensen, Department of Computer Science, New York University; Zhaoguo Wang and Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Jinyang Li, Department of Computer Science, New York University. Session Chairs: Ryan Huang, Johns Hopkins University, and Manos Kapritsos, University of Michigan, Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh, Suman Jana, and Gabriel Ryan, Columbia University. Ankit Bhardwaj and Chinmay Kulkarni, University of Utah; Reto Achermann, University of British Columbia; Irina Calciu, VMware Research; Sanidhya Kashyap, EPFL; Ryan Stutsman, University of Utah; Amy Tai and Gerd Zellweger, VMware Research. We identify that current systems for learning the embeddings of large-scale graphs are bottlenecked by data movement, which results in poor resource utilization and inefficient training. JEL codes: Q18, Q28, Q57 . This post is for recording some notes from a few OSDI'21 papers that I got fun. We particularly encourage contributions containing highly original ideas, new approaches, and/or groundbreaking results. Memory allocation represents significant compute cost at the warehouse scale and its optimization can yield considerable cost savings. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 1416, 2021. Perennial 2.0 makes this possible by introducing several techniques to formalize GoJournals specification and to manage the complexity in the proof of GoJournals implementation. USENIX new Date().getFullYear()>document.write(new Date().getFullYear()); Grants for Black Computer Science Students Application, Propose an interesting, compelling solution, Demonstrate the practicality and benefits of the solution, Clearly describe the paper's contributions, Clearly articulate the advances beyond previous work. We convert five state-of-the-art PM indexes using Nap. We present Nap, a black-box approach that converts concurrent persistent memory (PM) indexes into NUMA-aware counterparts. All deadline times are 23:59 hrs UTC. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 14-16, 2021. All papers will be available online to registered attendees before the conference. However, with the increasingly speedy transactions and queries thanks to large memory and fast interconnect, commodity HTAP systems have to make a tradeoff between data freshness and performance degradation. We built a functional NFSv3 server, called GoNFS, to use GoJournal. Password sosp ACM Symposium on Operating Systems Principles. PET then automatically corrects results to restore full equivalence. Attaching supplementary material is optional; if your paper says that you have source code or formal proofs, you need not attach them to convince the PC of their existence. For instance, FAST 21 and NSDI 21 have author-notification dates after the OSDI 21 abstract-registration deadline. We present NrOS, a new OS kernel with a safer approach to synchronization that runs many POSIX programs. Our evaluation shows that, compared to existing participant selection mechanisms, Oort improves time-to-accuracy performance by 1.2X-14.1X and final model accuracy by 1.3%-9.8%, while efficiently enforcing developer-specified model testing criteria at the scale of millions of clients. Professor Veloso is the Past President of AAAI (the Association for the Advancement of Artificial Intelligence), and the co-founder, Trustee, and Past President of RoboCup. Penglai also reduces the latency of secure memory initialization by three orders of magnitude and gains 3.6x speedup for real-world applications (e.g., MapReduce). In addition, increasing CPU core counts further complicate kernel development. Paper abstracts and proceedings front matter are available to everyone now. For general conference information, see https://www . Session Chairs: Deniz Altinbken, Google, and Rashmi Vinayak, Carnegie Mellon University, Tanvir Ahmed Khan and Ian Neal, University of Michigan; Gilles Pokam, Intel Corporation; Barzan Mozafari and Baris Kasikci, University of Michigan. A glance at this year's OSDI program shows that Operating Systems are a small niche topic for this conference, not even meriting their own full session. Authors are required to register abstracts by 3:00 p.m. PST on December 3, 2020, and to submit full papers by 3:00 p.m. PST on December 10, 2020. The key insight in blk-switch is that Linux's multi-queue storage design, along with multi-queue network and storage hardware, makes the storage stack conceptually similar to a network switch. Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that enables in-situ model training and testing on edge data. Metadata from voice calls, such as the knowledge of who is communicating with whom, contains rich information about peoples lives. Many application domains can benefit from hybrid transaction/analytical processing (HTAP) by executing queries on real-time datasets produced by concurrent transactions. OSDI brings together professionals from academic and industrial backgrounds in a premier forum for discussing the design, implementation, and implications of systems software. She is the author of the textbook Interconnections (about network layers 2 and 3) and coauthor of Network Security. will work with the steering committee to ensure that the symposium program will accommodate presentations for all accepted papers. This talk will discuss several examples with very different solutions. Across a wide range of pages, phones, and mobile networks covering web workloads in both developed and emerging regions, Horcrux reduces median browser computation delays by 31-44% and page load times by 18-37%. Grand Rapids, Michigan, United States . The copyback-aware block allocation considers different copy costs at different copy paths within the SSD. Academic and industrial participants present research and experience papers that cover the full range of theory . KEVIN combines a fast, lightweight, and POSIX compliant file system with a key-value storage device that performs in-storage indexing. Title Page, Copyright Page, and List of Organizers | For realistic workloads, KEVIN improves throughput by 68% on average. In this paper, we propose a software-hardware co-design to support dynamic, fine-grained, large-scale secure memory as well as fast-initialization. Starting with small invariant formulas and strongest possible invariants avoids large SMT queries, improving SMT solver performance. The NVMe zoned namespace (ZNS) is emerging as a new storage interface, where the logical address space is divided into fixed-sized zones, and each zone must be written sequentially for flash-memory-friendly access. In particular, responses must not include new experiments or data, describe additional work completed since submission, or promise additional work to follow. A significant obstacle to using SC for practical applications is the memory overhead of the underlying cryptography. At a high level, Addra follows a template in which callers and callees deposit and retrieve messages from private mailboxes hosted at an untrusted server. blk-switch evaluation over a variety of scenarios shows that it consistently achieves s-scale average and tail latency (at both 99th and 99.9th percentiles), while allowing applications to near-perfectly utilize the hardware capacity. Timothy Roscoe is a Full Professor in the Systems Group of the Computer Science Department at ETH Zurich, where he works on operating systems, networks, and distributed systems, and is currently head of department. DeSearch uses trusted hardware to build a network of workers that execute a pipeline of small search engine tasks (crawl, index, aggregate, rank, query). NrOS is primarily constructed as a simple, sequential kernel with no concurrency, making it easier to develop and reason about its correctness. Mothy received a PhD in 1995 from the Computer Laboratory of the University of Cambridge, where he was a principal designer and builder of the Nemesis OS. Important Dates Abstract registrations due: Thursday, December 3, 2020, 3:00 pm PST Complete paper submissions due: Thursday, December 10, 2020, 3:00pm PST Author Response Period This motivates the need for a new approach to data privacy that can provide strong assurance and control to users. Unfortunately, because devices lack the semantic information about which I/O requests are latency-sensitive, these heuristics can sometimes lead to disastrous results. Welcome to the SOSP 2021 Website. Sijie Shen, Rong Chen, Haibo Chen, and Binyu Zang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai Artificial Intelligence Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. These are hard deadlines, and no extensions will be given. Consensus bugs are extremely rare but can be exploited for network split and theft, which cause reliability and security-critical issues in the Ethereum ecosystem. This approach misses possible optimization opportunities as transformations that only preserve equivalence on subsets of the output tensors are excluded. Sponsored by USENIX in cooperation with ACM SIGOPS. PC members are not required to read supplementary material when reviewing the paper, so each paper should stand alone without it. blk-switch uses this insight to adapt techniques from the computer networking literature (e.g., multiple egress queues, prioritized processing of individual requests, load balancing, and switch scheduling) to the Linux kernel storage stack. Please identify yourself as a presenter and include your mailing address in your email. This paper addresses a key missing piece in the current ecosystem of decentralized services and blockchain apps: the lack of decentralized, verifiable, and private search. OSDI 2021 papers summary. We argue that a key-value interface between a file system and an SSD is superior to the legacy block interface by presenting KEVIN. Qing Wang, Youyou Lu, Junru Li, and Jiwu Shu, Tsinghua University. Welcome to the SOSP 2021 Website. Despite their extensive use for debugging and vulnerability discovery, sanitizer checks often induce a high runtime cost. The NAL eliminates remote PM accesses to hot items without inducing extra local PM accesses. Just using Lambdas on top of CPU servers offers up to 2.75 more performance-per-dollar than training only with CPU servers. If in doubt about whether your submission to OSDI 2021 and your upcoming submission to SOSP are the same paper or not, please contact the PC chairs by email. Marius is open-sourced at www.marius-project.org. OSDI'20: 14th USENIX Conference on Operating Systems Design and ImplementationNovember 4 - 6, 2020 ISBN: 978-1-939133-19-9 Published: 04 November 2020 Sponsors: ORACLE, VMware, Google Inc., Amazon, Microsoft Get Alerts for this Conference Save to Binder Export Citation Bibliometrics Citation count 96 Downloads (6 weeks) 317 Downloads (12 months) Kyuhwa Han, Sungkyunkwan University and Samsung Electronics; Hyunho Gwak and Dongkun Shin, Sungkyunkwan University; Jooyoung Hwang, Samsung Electronics. We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator that adds privacy as a new type of resource to be managed alongside other traditional compute resources, such as CPU, GPU, and memory. Papers not meeting these criteria will be rejected without review, and no deadline extensions will be granted for reformatting. 23 artifacts received the Artifacts Functional badge (88%). Mingyu Li, Jinhao Zhu, and Tianxu Zhang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Cheng Tan, Northeastern University; Yubin Xia, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Sebastian Angel, University of Pennsylvania; Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. We also propose two file system techniques for ZNS+-aware LFS. Accepted papers will be allowed 14 pages in the proceedings, plus references. While verifying GoJournal, we found one serious concurrency bug, even though GoJournal has many unit tests. The blockchain community considers this hard fork the greatest challenge since the infamous 2016 DAO hack. Web pages today commonly include large amounts of JavaScript code in order to offer users a dynamic experience. Although the number of submissions is lower than the past, it's likely only due to the late announcement; being in my first OSDI PC, I think the quality of the submitted and accepted papers remains as high as ever. To achieve low overhead, selective profiling gathers runtime execution information selectively and incrementally. Youngseok Yang, Seoul National University; Taesoo Kim, Georgia Institute of Technology; Byung-Gon Chun, Seoul National University and FriendliAI. CLP's gains come from using a tuned, domain-specific compression and search algorithm that exploits the significant amount of repetition in text logs. OSDI '21 Technical Sessions All the times listed below are in Pacific Daylight Time (PDT).