Active Learning Materials for Computer Organization and Architecture
Research has shown that active learning can increase student performance and engagement, but access to materials is a notable barrier to adopting research-based instruction strategies in CS and Engineering. We aim to create, pilot, revise, and disseminate POGIL activities for Computer Organization and Architecture. POGIL is a research-based instruction strategy built on self-managed teams, the development of process skills, and activities designed around a theory of instruction known as learning cycles. The strategy has been shown to improve student performance and engagement in scientific disciplines (such as Chemistry) and in CS.
Fast query processing with parallel languages
Recent efforts have shown that compiling queries to machine code for a single core can remove iterator and control overhead for significant performance gains. So far, systems that generate distributed programs compile plans only for single processors and stitch them together with messaging, so the low-level compiler sees each fragment in isolation and cannot optimize across them. We are investigating a different approach with Radish: take advantage of parallel language compilers and runtimes by generating parallel programs from queries.
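The overhead that query compilation removes can be illustrated with a toy example (a hypothetical sketch, not Radish's actual generated code): the same query evaluated through generic per-tuple iterators versus as a single fused loop, the style of code a query compiler emits.

```python
# Toy relation t(a, b) for the query: SELECT SUM(b) FROM t WHERE a > 10
table = [(a, a * 2) for a in range(100)]

# Iterator-style evaluation: each operator is a generator, so every
# tuple pays a function-call boundary at each operator.
def scan(rows):
    for row in rows:
        yield row

def select(rows, pred):
    for row in rows:
        if pred(row):
            yield row

def agg_sum(rows, col):
    total = 0
    for row in rows:
        total += row[col]
    return total

iterated = agg_sum(select(scan(table), lambda r: r[0] > 10), 1)

# Compiled-style evaluation: the whole pipeline fused into one tight
# loop, with no operator boundaries hiding work from the compiler.
fused = 0
for a, b in table:
    if a > 10:
        fused += b

assert iterated == fused
```

Both strategies return the same answer; the fused form is what a low-level compiler can actually vectorize and optimize, which is the opportunity Radish extends to whole parallel programs.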
Verified relational query optimizers
Query optimizers apply transformations to query plans to find the fastest implementation. How do we know the resulting program is correct and that the query will return the intended answer? In correctness-critical applications, testing may not be sufficient to ensure queries return the right answer. Crimp is a query transformer, written in Coq, that generates verified imperative (C-like) implementations of query specifications (e.g., programs written in SQL).
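To make the correctness obligation concrete, consider a classic rewrite: pushing a selection below a product, so that sigma_p(R x S) becomes sigma_p(R) x S when the predicate p reads only R's attributes. Crimp proves such equivalences in Coq for all inputs; the Python sketch below (an illustration, not Crimp's code) can only test one instance, which is exactly why testing alone may not suffice.

```python
# One instance of the rewrite sigma_p(R x S) == sigma_p(R) x S,
# valid when p depends only on R's attribute.
def cross(r, s):
    return [(x, y) for x in r for y in s]

def select(rows, pred):
    return [row for row in rows if pred(row)]

R = [1, 2, 3, 4]
S = ["a", "b"]
p = lambda pair: pair[0] > 2       # predicate touches R's column only

original = select(cross(R, S), p)              # filter after the product
rewritten = cross(select(R, lambda x: x > 2), S)  # filter pushed down

# Testing checks this one input; a verified transformer proves it
# for every R, S, and predicate of this shape.
assert original == rewritten
```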
Productivity and performance for parallel irregular applications
A number of important, data-intensive applications are irregular: they may involve unbalanced data structures (such as natural graphs) and unpredictable data accesses. Achieving good performance for such programs usually requires a large effort, the key challenge being to expose enough concurrency to keep the machine (efficiently) busy. The programmer must write non-application logic, such as batching and sorting communication and managing fine-grained synchronization.
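The kind of non-application logic involved can be sketched as follows (a minimal, hypothetical example; the partitioning function and message names are illustrative): instead of sending one tiny message per remote update, updates are buffered per destination and flushed in bulk, trading latency for far fewer, larger messages.

```python
from collections import defaultdict

NUM_NODES = 4

def owner(vertex):
    # Illustrative partitioning: vertices assigned to nodes round-robin.
    return vertex % NUM_NODES

# Toy edge list; each edge triggers a degree update at the node that
# owns the destination vertex.
edges = [(0, 5), (1, 6), (2, 7), (3, 9), (4, 10), (6, 5)]

# Unbatched: one small message per remote update.
unbatched_messages = len(edges)

# Batched: group updates by destination node, then send one message
# per outbox instead of one per update.
outboxes = defaultdict(list)
for src, dst in edges:
    outboxes[owner(dst)].append(("increment_degree", dst))

batched_messages = len(outboxes)
assert batched_messages <= unbatched_messages
```

In a real runtime this buffering competes with latency and memory pressure, which is part of why hand-tuning such plumbing takes so much programmer effort.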
High-performance parallel systems for data-intensive computing.
Ph.D. thesis, University of Washington, 2016.
Apache REEF: Retainable Evaluator Execution Framework.
In TOCS 2017.
The Myria Big Data Management and Analytics System and Cloud Service.
In Conference on Innovative Data Systems Research (CIDR) 2017.
Compiling queries for high-performance computing.
Technical Report UW-CSE-16-02-02, University of Washington, 2016.
Latency-Tolerant Software Distributed Shared Memory.
In USENIX ATC (Best Paper Award) 2015.
REEF: Retainable Evaluator Execution Framework.
In SIGMOD 2015.
Integrating query processing with parallel languages.
In Ph.D. Symposium @ ICDE 2015.
Grappa: A Latency-Tolerant Runtime for Large-Scale Irregular Applications.
In WRSC @ Eurosys 2014.
Flat Combining Synchronized Global Data Structures.
In PGAS 2013.
Pomace: A Grappa for Non-Volatile Memory.
In NVMW 2013.
Compiled Plans for In-Memory Path-Counting Queries.
In IMDM @ VLDB 2013.
Do we need a crystal ball for task migration?
In USENIX HotPar 2012.
Crunching Large Graphs with Commodity Processors.
In USENIX HotPar 2011.
REEF: Retainable Evaluator Execution Framework.
In Demo @ VLDB 2013.