Selected Research Projects

Learn more about things that I am working on (or have worked on in the past).

CascadeCNN

Keywords: CNN Quantisation, Embedded FPGAs, Design Space Exploration, Performance Modelling, Input-Dependent Computation, Adaptive Inference 
 
-> A novel two-stage FPGA-based accelerator for CNN classifiers, exploiting the fact that not all inputs require the same level of numerical precision in computations to yield a confident prediction, without the need for model re-training.
-> Comprises an excessively quantised low-precision unit (LPU) providing rapid classification predictions, followed by a confidence evaluation unit that determines which samples should be re-processed by a high-precision unit to restore accuracy, based on the LPU prediction confidence.
-> Analytical performance modelling and Design Space Exploration are employed to parametrise a configurable hardware architecture, yielding a tailored instance for the given CNN-FPGA pair, according to user-specified requirements in performance (throughput or latency) and accuracy.
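
A minimal software sketch of the two-stage cascade described above. The confidence metric (the margin between the two highest class probabilities), the threshold value and the function names are illustrative assumptions, not the metric or interface of the actual accelerator; the two classifiers are passed in as plain callables.

```python
from typing import Callable, List, Sequence, Tuple

Classifier = Callable[[object], Sequence[float]]


def top2_margin(probs: Sequence[float]) -> float:
    """Assumed confidence metric: gap between the two largest probabilities."""
    ordered = sorted(probs, reverse=True)
    return ordered[0] - ordered[1]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def cascade_predict(
    sample: object,
    low_precision: Classifier,
    high_precision: Classifier,
    threshold: float = 0.3,  # hypothetical value, tuned per model in practice
) -> Tuple[int, str]:
    """Classify with the low-precision unit; fall back if it is not confident."""
    probs = low_precision(sample)
    if top2_margin(probs) >= threshold:
        return argmax(probs), "LPU"
    # Low confidence: re-process the sample with the high-precision unit.
    return argmax(high_precision(sample)), "HPU"
```

Raising the threshold sends more samples to the high-precision stage, trading throughput for accuracy.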

Research Project at: iDSL, Imperial College London
Joint work with: S.Venieris, C.S.Bouganis

Progressive LSTM Inference

Keywords: SVD-based compression, pruning, anytime inference, Embedded FPGAs, Model-Hardware co-design, Self-driving Cars

-> A methodology combining iterative refinement (through low-rank approximation) and structured pruning, to conduct the most information-carrying calculations first in LSTM inference.
-> Provides a rapid approximation of the final output, which is iteratively refined as a function of the latency budget. Implemented on an autonomous navigation task for self-driving cars, enabling fast reaction.
-> Configurable FPGA-based hardware architecture, tailored to the target LSTM-FPGA pair, co-designed with the approximate LSTM model.
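
The idea of computing the most informative terms first can be sketched for a single matrix-vector product. This assumes the weight matrix has already been factorised offline into rank-1 terms (sigma, u, v) sorted by decreasing sigma, for example via an SVD; the function name and the `budget` parameter are hypothetical, and pruning and the LSTM gates themselves are not modelled.

```python
from typing import Iterator, List, Optional, Sequence, Tuple

# One rank-1 term of the factorised weight matrix: sigma * u * v^T
Rank1Term = Tuple[float, Sequence[float], Sequence[float]]


def progressive_matvec(
    terms: Sequence[Rank1Term],
    x: Sequence[float],
    budget: Optional[int] = None,
) -> Iterator[List[float]]:
    """Yield successively refined approximations of W @ x.

    Each yielded vector adds one more rank-1 term, so a caller with a tight
    latency budget can stop early and use the latest approximation.
    """
    y = [0.0] * len(terms[0][1])
    for k, (sigma, u, v) in enumerate(terms):
        if budget is not None and k >= budget:
            break
        # Contribution of this term: sigma * (v . x) * u
        coeff = sigma * sum(vi * xi for vi, xi in zip(v, x))
        y = [yi + coeff * ui for yi, ui in zip(y, u)]
        yield list(y)
```

Because the terms are ordered by singular value, each additional step refines the output by a smaller amount than the previous one.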

Research Project at: iDSL, Imperial College London
Joint work with: S.Venieris, C.S.Bouganis

Autonomous Drone Navigation

Keywords: Self-supervised Learning, Visual Navigation, Obstacle Avoidance, Spatio-temporal representations, Two-stream CNNs

-> Two-stream CNN, simultaneously processing the current and previous frames of a UAV camera to extract spatio-temporal representations, improving efficiency in autonomous navigation.
-> Predicting distance-to-collision towards multiple directions through regression, enabling a custom motion planner to make more informed action decisions, avoiding collision in real-world environments.
-> Trained in a self-supervised manner, with a UAV performing safe autonomous navigation through the use of mounted distance sensors, also used as ground-truth signals for training.
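
To illustrate how per-direction distance-to-collision predictions can drive action decisions, here is a toy planner. The policy (steer towards the most open direction, scale forward speed with the free distance ahead), the direction names and all numeric defaults are assumptions made for this sketch and do not describe the custom motion planner used in the project.

```python
from typing import Dict, Tuple


def plan_action(
    distances: Dict[str, float],  # predicted distance-to-collision per direction (m)
    max_range: float = 5.0,       # hypothetical sensor/prediction range (m)
    max_speed: float = 1.0,       # hypothetical top forward speed (m/s)
    stop_distance: float = 0.5,   # hypothetical safety margin ahead (m)
) -> Tuple[str, float]:
    """Return (steering direction, forward speed) from predicted distances."""
    # Steer towards the direction with the largest predicted free distance.
    direction = max(distances, key=lambda name: distances[name])
    ahead = distances["centre"]
    if ahead <= stop_distance:
        speed = 0.0  # too close to an obstacle ahead: stop and turn
    else:
        speed = max_speed * min(ahead, max_range) / max_range
    return direction, speed
```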

Research Project at: iDSL, Imperial College London
Joint work with: C.S.Bouganis

UAV-based Object Detection

Keywords: Object Detection, Domain-specific models, Data-driven optimisation, Altitude-aware region proposals, Embedded GPUs

-> Exploiting application-specific information and prior domain knowledge, to optimise region-proposal-based object detectors for efficient altitude-aware vehicle detection in UAVs.
-> Considering UAV flight altitude captured by sensors at runtime, to eliminate false positive candidate detections of vehicles on the ground (e.g. based on size/density), effectively reducing inference workload.
-> Exploiting a lightweight road segmentation method for further performance optimisation, without sacrificing accuracy.
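
A minimal sketch of size-based proposal filtering using flight altitude. It assumes a downward-facing camera and a pinhole model, so that an object of length L metres seen from altitude h spans roughly f * L / h pixels for a focal length f in pixels; the vehicle size bounds and function names are illustrative assumptions, not values from the project.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height) in pixels


def expected_pixel_length(object_m: float, altitude_m: float, focal_px: float) -> float:
    """Approximate image size of an object on the ground (pinhole, nadir view)."""
    return focal_px * object_m / altitude_m


def filter_proposals(
    boxes: List[Box],
    altitude_m: float,
    focal_px: float,
    min_vehicle_m: float = 1.5,  # hypothetical smallest vehicle dimension
    max_vehicle_m: float = 6.0,  # hypothetical largest vehicle dimension
) -> List[Box]:
    """Keep only candidate boxes whose size is plausible at this altitude."""
    lo = expected_pixel_length(min_vehicle_m, altitude_m, focal_px)
    hi = expected_pixel_length(max_vehicle_m, altitude_m, focal_px)
    # Compare the longest side of each box against the plausible range.
    return [box for box in boxes if lo <= max(box[2], box[3]) <= hi]
```

Discarding implausible candidates before the classification stage is what reduces the inference workload.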

Research Project at: iDSL, Imperial College London and KIOS
Joint work with: C. Kyrkou and C.S. Bouganis

Visual Semantic Mapping (SLAM)

Keywords: Semantic Segmentation, Inpainting, Occlusion Handling, RGB-D perception, Simultaneous Localisation and Mapping (SLAM)

(work in progress) - more details soon...

Research Project at: iDSL and Dyson Robotics Lab at Imperial College
Joint work with: A.Nicastro, S.Leutenegger, C.S.Bouganis

CNN-to-FPGA Mapping

Keywords: survey, Automated toolflows, DNNs, FPGAs

-> Survey on automated toolflows for mapping DNNs to FPGAs, along with a set of promising research areas that can bridge the gap between the DL and hardware design communities.
-> From a deep learning practitioner's perspective, we extensively study the supported DL models, achieved inference speed and applicability to visual scene understanding tasks, from latency-critical mobile systems to high-throughput cloud services.
-> From a hardware engineer's perspective, we present a detailed analysis of architectural design choices, design space exploration methods and supported implementation optimisations (such as numerical precision).

Research Project at: iDSL, Imperial College London
Joint work with: S.Venieris and C.S.Bouganis

Human-Robot Collaboration

Keywords: Frequency domain filtering, collision detection, Admittance/Impedance Control, contact distinction

-> A novel contact distinction method is developed, monitoring externally applied forces/torques on a robot manipulator (arm) during physical human-robot interaction, in order to distinguish collisions from intended contacts.
-> The method is based on frequency component analysis, and adapts with respect to the desired dynamic behavior (inertia, damping and elasticity) of the admittance/impedance controller.
-> Collisions trigger an appropriate reaction from the robot, improving the safety of the operator during the interaction with the robot.
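
The intuition behind frequency-based contact distinction can be sketched as follows: an abrupt impact spreads energy across high frequencies, while an intended push is dominated by low frequencies. The energy-ratio criterion, the fixed cutoff and the threshold below are assumptions chosen for illustration; the adaptation to the controller's inertia, damping and elasticity is not modelled here.

```python
import cmath
import math
from typing import Sequence


def high_freq_ratio(samples: Sequence[float], sample_rate_hz: float, cutoff_hz: float) -> float:
    """Fraction of (non-DC) signal energy at or above cutoff_hz."""
    n = len(samples)
    mean = sum(samples) / n
    centred = [s - mean for s in samples]
    total = 0.0
    high = 0.0
    # Plain DFT over the one-sided spectrum, skipping the DC bin.
    for k in range(1, n // 2 + 1):
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n) for t, c in enumerate(centred))
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate_hz / n >= cutoff_hz:
            high += power
    return high / total if total > 0.0 else 0.0


def is_collision(
    samples: Sequence[float],
    sample_rate_hz: float,
    cutoff_hz: float = 10.0,       # hypothetical cutoff frequency
    ratio_threshold: float = 0.5,  # hypothetical decision threshold
) -> bool:
    """Flag a force window as a collision if high frequencies dominate."""
    return high_freq_ratio(samples, sample_rate_hz, cutoff_hz) > ratio_threshold
```

In a real controller the window would slide over the measured force signal, and an FFT would replace the quadratic-time DFT used here for clarity.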

Research Project at: Robotics Group, University of Patras
Joint work with: F.Dimeas and N.Aspragathos

Line Following Robots

Keywords: FSM-based state-estimation, Adaptive PD Control, Mobile Robots

-> Finite State Machine-based state estimation for line following robots, able to detect line irregularities or noisy sensor measurements on an array of IR reflectance sensors, enhancing the robot's situational awareness.
-> Based on the estimated state, the robot switches between a proposed variable-gain PD controller for line following and an open-loop controller handling special cases.
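
A compact sketch of the switching scheme described above. The three states, the dropout-tolerance rule, the gain schedule (proportional gain growing with the error magnitude) and all numeric defaults are assumptions for illustration, not the state machine or controller gains of the actual robots.

```python
from enum import Enum
from typing import Sequence


class State(Enum):
    FOLLOWING = 1
    LINE_LOST = 2
    INTERSECTION = 3


class LineFollower:
    def __init__(self, lost_patience: int = 3, kp_base: float = 1.0,
                 kp_boost: float = 1.0, kd: float = 0.1) -> None:
        self.lost_patience = lost_patience  # empty readings tolerated as noise
        self.kp_base, self.kp_boost, self.kd = kp_base, kp_boost, kd
        self.state = State.FOLLOWING
        self.misses = 0
        self.prev_error = 0.0

    def step(self, readings: Sequence[int], dt: float) -> float:
        """Update the state estimate and return a steering command."""
        n = len(readings)
        active = [i for i, r in enumerate(readings) if r]
        if not active:
            # A few empty readings are treated as sensor noise, not a lost line.
            self.misses += 1
            if self.misses >= self.lost_patience:
                self.state = State.LINE_LOST
        else:
            self.misses = 0
            self.state = State.INTERSECTION if len(active) == n else State.FOLLOWING
        if self.state is not State.FOLLOWING:
            return 0.0  # open-loop handling of special cases: drive straight
        if active:
            # Line position in [-1, 1]: centroid of the sensors seeing the line.
            error = sum(2.0 * i / (n - 1) - 1.0 for i in active) / len(active)
        else:
            error = self.prev_error  # momentary dropout: hold the last estimate
        kp = self.kp_base + self.kp_boost * abs(error)  # variable proportional gain
        command = kp * error + self.kd * (error - self.prev_error) / dt
        self.prev_error = error
        return command
```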

Research Project at: Robotics Group, University of Patras
Joint work with: A.Toumpa, F. Dimeas, N.Aspragathos (et al.)

Acceleration of Tridiagonal System Solvers

Keywords: CUDA, Nvidia GPUs, QR-decomposition, Givens Rotations, gSpike, Parallel System Solvers 

-> g-Spike: a parallel GPU algorithm for solving general nonsymmetric tridiagonal systems. The solver applies Givens rotations and QR factorization without pivoting. It also implements a low-rank modification strategy to compute the Spike DS decomposition even when the partitioning defines singular submatrices along the diagonal.
-> Numerical experiments with problems of high order indicate that g-Spike is competitive in runtime with existing GPU methods, and can provide acceptable results when other methods cannot be applied or fail.
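
The core building block, QR factorisation of a tridiagonal matrix by Givens rotations without pivoting, can be sketched sequentially as below. This is a plain single-threaded illustration only: the Spike partitioning, the low-rank modification for singular blocks and the GPU parallelisation of g-Spike are not shown, and the function name and argument layout are assumptions.

```python
import math
from typing import List, Sequence


def solve_tridiagonal_givens(
    sub: Sequence[float],   # sub-diagonal, length n - 1
    diag: Sequence[float],  # main diagonal, length n
    sup: Sequence[float],   # super-diagonal, length n - 1
    rhs: Sequence[float],   # right-hand side, length n
) -> List[float]:
    """Solve a nonsingular tridiagonal system via Givens-rotation QR."""
    n = len(diag)
    r0 = [float(v) for v in diag]         # diagonal of R
    r1 = [float(v) for v in sup] + [0.0]  # first super-diagonal of R
    r2 = [0.0] * n                        # second super-diagonal (fill-in)
    y = [float(v) for v in rhs]
    for i in range(n - 1):
        # Rotation that zeroes the sub-diagonal entry in column i.
        p, q = r0[i], float(sub[i])
        r = math.hypot(p, q)
        c, s = (1.0, 0.0) if r == 0.0 else (p / r, q / r)
        upper, lower, nxt = r1[i], r0[i + 1], r1[i + 1]
        r0[i] = r
        r1[i] = c * upper + s * lower
        r2[i] = s * nxt
        r0[i + 1] = -s * upper + c * lower
        r1[i + 1] = c * nxt
        y[i], y[i + 1] = c * y[i] + s * y[i + 1], -s * y[i] + c * y[i + 1]
    # Back-substitution with the upper-triangular, three-band R.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = y[i]
        if i + 1 < n:
            acc -= r1[i] * x[i + 1]
        if i + 2 < n:
            acc -= r2[i] * x[i + 2]
        x[i] = acc / r0[i]
    return x
```

Unlike elimination without pivoting, the rotations remain well defined when a diagonal entry is zero, for example for the system with diagonal (0, 0) and off-diagonals equal to 1.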

Research Project at: HPC Lab, University of Patras
Joint work with: A.Sobczyk, E.Gallopoulos and A.Sameh