Publications

(2023). Streaming Task Graph Scheduling for Dataflow Architectures. Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing (HPDC'23).

(2021). Productivity, Portability, Performance: Data-Centric Python. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC21).

(2021). StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems. Proceedings of the 19th ACM/IEEE International Symposium on Code Generation and Optimization (CGO'21).

(2020). FBLAS: Streaming Linear Algebra on FPGA. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC20).

(2019). Streaming Message Interface: High-Performance Distributed Memory Programming on Reconfigurable Hardware. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19).

(2018). Reducing Message Latency and CPU Utilization in the CAF Actor Framework. Proceedings of the 26th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2018.

(2018). D2K: Scalable Community Detection in Massive Networks via Small-Diameter k-Plexes. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.

(2017). Evaluating Concurrency Throttling and Thread Packing on SMT Multicores. Proceedings of the 25th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2017.

(2017). Elastic Scaling for Distributed Latency-sensitive Data Stream Operators. Proceedings of the 25th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2017.

(2016). Keep Calm and React with Foresight: Strategies for Low-Latency and Energy-Efficient Elastic Data Stream Processing. Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP).

(2016). A Divide-and-conquer Parallel Pattern Implementation for Multicores. Proceedings of the 3rd International Workshop on Software Engineering for Parallel Systems.

(2015). Parallelizing High-Frequency Trading Applications by using C++11 Attributes. Proceedings of the first IEEE International Workshop on Reengineering for Parallelism in Heterogeneous Parallel Platforms.

(2015). A Multicore Parallelization of Continuous Skyline Queries on Data Streams. Proceedings of the 2015 International Conference on Parallel Processing (Euro-Par).

(2014). Autonomic Parallel Data Stream Processing. High Performance Computing & Simulation (HPCS), 2014 International Conference on.

(2014). Optimizing Message-Passing on Multicore Architectures Using Hardware Multi-threading. Parallel, Distributed and Network-Based Processing (PDP), 2014 22nd Euromicro International Conference on.

(2014). A Lightweight Run-Time Support for Fast Dense Linear Algebra on Multi-Core. Proceedings of 12th IASTED International Conference on Parallel and Distributed Computing and Networks.

(2014). A High-Throughput and Low-Latency Parallelization of Window-based Stream Joins on Multicores. 12th IEEE International Symposium on Parallel and Distributed Processing with Applications.

(2013). Evaluation of Architectural Supports for Fine-Grained Synchronization Mechanisms. Proceedings of the 11th IASTED International Conference on Parallel and Distributed Computing and Networks.