Publications

Collaborations always make the whole greater than the sum of its parts! I was lucky to learn from and write papers with the following 84 co-authors (ordered alphabetically by last name):

Mustafa Abbas, Mohamed A. Abd El Ghany, Tanmay Anand, Anupreetham, Pranavi Appana, Aman Arora, Charles Augustine, Valeria Bertacco, Vaughn Betz, Kwadwo Boateng, Aatman Borda, Prerna Budhkar, Yu Cao, Gregory Chen, Paul Chow, Seyed Alireza Damghani, Reetuparna Das, Aravind Dasu, Mohamed Eldafrawy, Mohamed A. Elgammal, Hongxiang Fan, Evangelos Georganas, Barbara Georgey, Diana Göhringer, Vidushi Goyal, Brett Grady, Sergey Gribok, Karthik Gururaj, Mathew Hall, Alexander Heinecke, Salma Hesham, James C. Hoe, Suyeon Hur, Mohamed Ibrahim, Ravi Iyer, Ali Jafari, Lizy K. John, Sangram Kate, Kenneth B. Kent, Jangwoo Kim, Joonsung Kim, Phil Knag, Ram Krishnamurthy, Raghavan Kumar, Ajay Kuzhively, Dongup Kwon, Martin Langhammer, Wayne Luk, Rui Ma, Fatemehsadat Mahmoudi, Debbie Marr, Karan Mathur, Samidh Mehta, Jiuxi Meng, Amin Mohaghegh, Abinash Mohanty, Vedant Mohanty, Stephen More, Seongmin Na, Mishali Naik, Hiroki Nakahara, Xinyu Niu, Eriko Nurvitadhi, Nicolas Papernot, Bogdan Pasca, Pragnesh Patel, Abirami Prabhakaran, Zhiqiang Que, Aishwarya Rajen, Daniel Rauch, Jens Rettkowski, Jae-sun Seo, David Sheffield, Jaewoong Sim, Srivatsan Srinivasan, Huseyin Sumbul, Phillip Tomson, Kuen Hung Tsoi, Xiaowei Wang, Sadegh Yazdanshenas, Jiecao Yu, Chenglong Zeng, Taikun Zhang, Zhipeng Zhao

2024

  1. A Software-Programmable Neural Processing Unit for Graph Neural Network Inference on FPGAs
    Taikun Zhang, Andrew Boutros, Sergey Gribok, Kwadwo Boateng, and Vaughn Betz
    In IEEE International Conference on Field-Programmable Logic and Applications (FPL), 2024
  2. Field-Programmable Gate Array Architecture for Deep Learning: Survey & Future Directions
    Andrew Boutros, Aman Arora, and Vaughn Betz
    In arXiv preprint arXiv:2404.10076, 2024
  3. High Throughput FPGA-Based Object Detection via Algorithm-Hardware Co-Design
    Anupreetham, Mohamed Ibrahim, Mathew Hall, Andrew Boutros, Ajay Kuzhively, Abinash Mohanty, Eriko Nurvitadhi, Vaughn Betz, Yu Cao, and Jae-sun Seo
    In ACM Transactions on Reconfigurable Technology and Systems (TRETS), 2024

2023

  1. Into the Third Dimension: Architecture Exploration Tools for 3D Reconfigurable Acceleration Devices
    Andrew Boutros, Fatemehsadat Mahmoudi, Amin Mohaghegh, Stephen More, and Vaughn Betz
    In IEEE International Conference on Field-Programmable Technology (FPT), 2023
  2. Field-Programmable Gate Array Architecture
    Andrew Boutros and Vaughn Betz
    In Handbook of Computer Architecture, 2023
  3. A Whole New World: How to Architect Beyond-FPGA Reconfigurable Acceleration Devices?
    Andrew Boutros, Stephen More, and Vaughn Betz
    In IEEE International Conference on Field-Programmable Logic and Applications (FPL), 2023
  4. Placement Optimization for NoC-Enhanced FPGAs
    Srivatsan Srinivasan, Andrew Boutros, Fatemehsadat Mahmoudi, and Vaughn Betz
    In IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2023
  5. Koios 2.0: Open-Source Deep Learning Benchmarks for FPGA Architecture and CAD Research
    Aman Arora, Andrew Boutros, Seyed Alireza Damghani, Karan Mathur, Vedant Mohanty, Tanmay Anand, Mohamed A. Elgammal, Kenneth B. Kent, Vaughn Betz, and Lizy K. John
    In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2023
  6. A Fast and Flexible FPGA-Based Accelerator for Natural Language Processing Neural Networks
    Suyeon Hur, Seongmin Na, Dongup Kwon, Joonsung Kim, Andrew Boutros, Eriko Nurvitadhi, and Jangwoo Kim
    In ACM Transactions on Architecture and Code Optimization (TACO), 2023

2022

  1. Architecture and Application Co-Design for Beyond-FPGA Reconfigurable Acceleration Devices
    Andrew Boutros, Eriko Nurvitadhi, and Vaughn Betz
    In IEEE Access, 2022
  2. RAD-Sim: Rapid Architecture Exploration for Novel Reconfigurable Acceleration Devices
    Andrew Boutros, Eriko Nurvitadhi, and Vaughn Betz
    In IEEE International Conference on Field-Programmable Logic and Applications (FPL), 2022
  3. FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems
    Rui Ma, Evangelos Georganas, Alexander Heinecke, Sergey Gribok, Andrew Boutros, and Eriko Nurvitadhi
    In IEEE Computer Architecture Letters (CAL), 2022

2021

  1. Recurrent Neural Networks with Column-Wise Matrix-Vector Multiplication on FPGAs
    Zhiqiang Que, Hiroki Nakahara, Eriko Nurvitadhi, Andrew Boutros, Hongxiang Fan, Chenglong Zeng, Jiuxi Meng, Kuen Hung Tsoi, Xinyu Niu, and Wayne Luk
    In IEEE Transactions on Very Large Scale Integration Systems (TVLSI), 2021
  2. Specializing for Efficiency: Customizing AI Inference Processors on FPGAs
    Andrew Boutros, Eriko Nurvitadhi, and Vaughn Betz
    In IEEE International Conference on Microelectronics (ICM), 2021
  3. Koios: A Deep Learning Benchmark Suite for FPGA Architecture and CAD Research
    Aman Arora, Andrew Boutros, Daniel Rauch, Aishwarya Rajen, Aatman Borda, Seyed Alireza Damghani, Samidh Mehta, Sangram Kate, Pragnesh Patel, Kenneth B. Kent, and others
    In IEEE International Conference on Field-Programmable Logic and Applications (FPL), 2021
  4. End-to-End FPGA-Based Object Detection using Pipelined CNN and Non-Maximum Suppression
    Anupreetham, Mohamed Ibrahim, Mathew Hall, Andrew Boutros, Ajay Kuzhively, Abinash Mohanty, Eriko Nurvitadhi, Vaughn Betz, Yu Cao, and Jae-sun Seo
    In IEEE International Conference on Field-Programmable Logic and Applications (FPL), 2021
  5. FPGA Architecture: Principles and Progression
    Andrew Boutros and Vaughn Betz
    In IEEE Circuits and Systems Magazine (CAS-M), 2021
  6. Compute-Capable Block RAMs for Efficient Deep Learning Acceleration on FPGAs
    Xiaowei Wang, Vidushi Goyal, Jiecao Yu, Valeria Bertacco, Andrew Boutros, Eriko Nurvitadhi, Charles Augustine, Ravi Iyer, and Reetuparna Das
    In IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2021

2020

  1. Neighbors From Hell: Voltage Attacks Against Deep Learning Accelerators on Multi-Tenant FPGAs
    Andrew Boutros, Mathew Hall, Nicolas Papernot, and Vaughn Betz
    In IEEE International Conference on Field-Programmable Technology (FPT), 2020
  2. Beyond Peak Performance: Comparing the Real Performance of AI-Optimized FPGAs and GPUs
    Andrew Boutros, Eriko Nurvitadhi, Rui Ma, Sergey Gribok, Zhipeng Zhao, James C. Hoe, Vaughn Betz, and Martin Langhammer
    In IEEE International Conference on Field-Programmable Technology (FPT), 2020
  3. FPGA Logic Block Architectures for Efficient Deep Learning Inference
    Mohamed Eldafrawy, Andrew Boutros, Sadegh Yazdanshenas, and Vaughn Betz
    In ACM Transactions on Reconfigurable Technology and Systems (TRETS), 2020

2019

  1. Scalable Low-Latency Persistent Neural Machine Translation on CPU Server with Multiple FPGAs
    Eriko Nurvitadhi, Andrew Boutros, Prerna Budhkar, Ali Jafari, Dongup Kwon, David Sheffield, Abirami Prabhakaran, Karthik Gururaj, Pranavi Appana, and Mishali Naik
    In IEEE International Conference on Field-Programmable Technology (FPT), 2019
  2. Why Compete When You Can Work Together: FPGA-ASIC Integration for Persistent RNNs
    Eriko Nurvitadhi, Dongup Kwon, Ali Jafari, Andrew Boutros, Jaewoong Sim, Phillip Tomson, Huseyin Sumbul, Gregory Chen, Phil Knag, Raghavan Kumar, and others
    In IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2019
  3. Math Doesn’t Have to Be Hard: Logic Block Architectures to Enhance Low-Precision Multiply-Accumulate on FPGAs
    Andrew Boutros, Mohamed Eldafrawy, Sadegh Yazdanshenas, and Vaughn Betz
    In ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), 2019

2018

  1. You Cannot Improve What You Do Not Measure: FPGA vs. ASIC Efficiency Gaps for Convolutional Neural Network Inference
    Andrew Boutros, Sadegh Yazdanshenas, and Vaughn Betz
    In ACM Transactions on Reconfigurable Technology and Systems (TRETS), 2018
  2. Embracing Diversity: Enhanced DSP Blocks for Low-Precision Deep Learning on FPGAs
    Andrew Boutros, Sadegh Yazdanshenas, and Vaughn Betz
    In IEEE International Conference on Field-Programmable Logic and Applications (FPL), 2018

2017

  1. Hardware Acceleration of Novel Chaos-Based Image Encryption for IoT Applications
    Andrew Boutros, Salma Hesham, Barbara Georgey, and Mohamed A. Abd El Ghany
    In IEEE International Conference on Microelectronics (ICM), 2017
  2. Build Fast, Trade Fast: FPGA-Based High-Frequency Trading Using High-Level Synthesis
    Andrew Boutros, Brett Grady, Mustafa Abbas, and Paul Chow
    In IEEE International Conference on Reconfigurable Computing and FPGAs (ReConFig), 2017
  3. HW/SW Co-Design of the HOG Algorithm on a Xilinx Zynq SoC
    Jens Rettkowski, Andrew Boutros, and Diana Göhringer
    In Journal of Parallel and Distributed Computing (JPDC), 2017

2015

  1. Real-Time Pedestrian Detection on a Xilinx Zynq Using the HOG Algorithm
    Jens Rettkowski, Andrew Boutros, and Diana Göhringer
    In IEEE International Conference on Reconfigurable Computing and FPGAs (ReConFig), 2015