Perception Engineer
I design and deploy real-time perception systems for robotics and autonomous platforms, with a focus on multi-sensor fusion, 3D/4D reconstruction, and edge inference.
My work bridges research and production, turning advanced perception models into reliable systems that operate under real-world constraints.
Focus
3D Vision, Robotics, Autonomous Systems, and Edge AI.
Main Stack
C++, Python, ROS2, CUDA, TensorRT, ZED, and multi-sensor systems.
Research Interests
3D reconstruction, high-resolution segmentation, and neural implicit representations.


Designed and deployed real-time perception and 3D reconstruction systems across robotics and medical imaging workflows.
Robotic Perception
Architected a full ROS2 perception stack integrating 6 ZED cameras and ultrasonics with sub-millimeter calibration using Kalibr.
Fine-tuned Mask-RT-DETR on a custom 5x augmented dataset, reaching 0.89 mAP.
Deployed the pipeline in C++ with TensorRT on Jetson Orin Nano, achieving ~15x faster inference.
Built a robust 3D perception pipeline combining stereo depth, classical filtering, and segmentation fusion to compute precise OBBs under harsh outdoor sunlight.
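As an illustration of the OBB step, here is a minimal sketch that fits a yaw-only oriented bounding box to an already segmented and filtered set of 3D points. It assumes the heading can be taken from the principal axis of the points in the ground plane; the function name `fit_yaw_obb` and this PCA-based approach are illustrative assumptions, not the pipeline's actual implementation.

```python
import math

def fit_yaw_obb(points):
    """Fit a yaw-only oriented bounding box to (x, y, z) points.

    The heading comes from the principal axis of the x-y covariance;
    the vertical extent is taken directly from the z range.
    Returns (center, size, yaw) with yaw in radians.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Closed-form principal-axis angle of a 2x2 covariance matrix.
    yaw = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(yaw), math.sin(yaw)
    # Project the points onto the box axes to measure the extents.
    u = [(p[0] - mx) * c + (p[1] - my) * s for p in points]
    v = [-(p[0] - mx) * s + (p[1] - my) * c for p in points]
    zs = [p[2] for p in points]
    u_mid = (max(u) + min(u)) / 2
    v_mid = (max(v) + min(v)) / 2
    # Move the box-frame midpoint back into the original frame.
    center = (mx + u_mid * c - v_mid * s,
              my + u_mid * s + v_mid * c,
              (max(zs) + min(zs)) / 2)
    size = (max(u) - min(u), max(v) - min(v), max(zs) - min(zs))
    return center, size, yaw
```

The returned yaw is only defined up to 180 degrees, so a real system would likely need an extra cue (for example object geometry or tracking) to resolve the heading direction.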
Medical Reconstruction
Led 4D virtual patient reconstruction using Gaussian Splatting and face avatar models, including INSTA and Fate-Avatar.
Developed a pipeline for accurate 3D/4D dental model alignment and tracking inside dynamic facial avatars using 3D Slicer, PnP, and Metrical Tracker.
Built a full-stack autonomous vehicle system and conducted research on high-resolution segmentation for real-world perception.
Autonomous System (Golf Cart)
Architected the full stack for an autonomous golf cart conversion, including perception, communication, and control.
Built a high-performance C++ server integrating 6 cameras via GStreamer/DeepStream for parallel acquisition.
Deployed dynamic-batch YOLOv11 TensorRT inference with custom CUDA pre/post-processing.
Integrated a complete drive-by-wire system using WebRTC (libdatachannel), a Kotlin Android app, and ESP32 for real-time control and streaming.
Research (Segmentation)
Developed a hybrid segmentation architecture combining Mamba and Implicit Neural Representations (INR) to handle scale variation.
Achieved strong boundary precision on DIS5K by modeling long-range dependencies without costly upsampling.
Applied AI engineering focused on LLM, RAG, and VLM-based systems.
LLM / RAG / VLM Systems
Designed and deployed RAG pipelines for knowledge retrieval and structured reasoning.
Worked with LLMs and VLMs for multimodal understanding and task automation.
Built production-oriented AI workflows integrating external data sources and model inference.
Improved 3D annotation workflows by automating bounding box fitting and orientation estimation in CVAT.
3D Annotation Workflow
Developed a box fitting algorithm with a lightweight 3D model for yaw angle prediction inside CVAT.
Enabled 2D bird’s-eye-view annotation to automatically generate oriented 3D bounding boxes on point clouds.
Optimized the pipeline for CPU execution to support real-time annotation workflows.
Reduced annotation time by ~40%.
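The idea of lifting a bird's-eye-view annotation into a 3D box can be sketched as follows. This is a simplified illustration, not the CVAT integration itself: the function name `bev_box_to_3d` is hypothetical, the yaw is passed in directly (in the workflow above it would come from the yaw prediction model), and the height is simply taken from the z range of the points under the footprint.

```python
import math

def bev_box_to_3d(points, cx, cy, length, width, yaw):
    """Lift a BEV rectangle to a 3D box using the points under its footprint.

    points: iterable of (x, y, z) tuples.
    Returns (center, size, yaw), or None when no point falls inside.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    zs = []
    for x, y, z in points:
        dx, dy = x - cx, y - cy
        # Express the point in the box frame (rotate by -yaw).
        u = dx * c + dy * s
        v = -dx * s + dy * c
        if abs(u) <= length / 2 and abs(v) <= width / 2:
            zs.append(z)
    if not zs:
        return None
    z_min, z_max = min(zs), max(zs)
    center = (cx, cy, (z_min + z_max) / 2)
    size = (length, width, z_max - z_min)
    return center, size, yaw
```

A production version would probably trim ground points and outliers before taking the z range, since a single stray return can inflate the box height.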
Evaluated and optimized SLAM pipelines and map processing workflows for autonomous vehicle mapping and HD map correction.
SLAM & Mapping
Evaluated multiple SLAM pipelines including Point-LIO, FAST-LIO2, and LIO-SAM using camera, LiDAR, and GPS data.
Generated high-precision environment maps and converted outputs to XODR format for HD map correction.
Integrated SLAM workflows into ROS2 for efficient offline map generation and processing.
Map Refinement
Developed dynamic object filtering pipelines using ERASOR, Removert, and RANSAC on LiDAR point clouds.
Extracted static road structure and removed dynamic obstacles to improve map quality.
Achieved ~15% improvement in map accuracy for autonomous navigation.
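One common use of RANSAC on LiDAR point clouds is fitting the dominant road plane so that static ground structure can be separated from everything above it. The sketch below shows that idea in plain Python; the function name `ransac_plane`, the threshold, and the iteration count are illustrative assumptions, and it does not reproduce ERASOR, Removert, or the actual filtering pipeline.

```python
import math
import random

def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Fit a plane to 3D points with RANSAC.

    points: list of (x, y, z) tuples.
    Returns ((a, b, c, d), inlier_indices) for the plane
    a*x + b*y + c*z + d = 0 with unit normal (a, b, c).
    """
    if len(points) < 3:
        raise ValueError("need at least three points")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        p1, p2, p3 = rng.sample(points, 3)
        # Plane normal = cross product of two edges of the sampled triangle.
        ux, uy, uz = p2[0] - p1[0], p2[1] - p1[1], p2[2] - p1[2]
        vx, vy, vz = p3[0] - p1[0], p3[1] - p1[1], p3[2] - p1[2]
        nx = uy * vz - uz * vy
        ny = uz * vx - ux * vz
        nz = ux * vy - uy * vx
        norm = math.sqrt(nx * nx + ny * ny + nz * nz)
        if norm < 1e-12:
            continue  # degenerate (collinear) sample, try again
        a, b, c = nx / norm, ny / norm, nz / norm
        d = -(a * p1[0] + b * p1[1] + c * p1[2])
        # Points within `threshold` of the plane count as inliers.
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(a * x + b * y + c * z + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = (a, b, c, d), inliers
    return best_plane, best_inliers
```

The inlier indices can then be treated as candidate road surface, while the remaining points are passed on to whatever dynamic-object filter is in use.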
Projects representing real system ownership, deployment work, and technical decision-making.
A multi-camera ROS2 perception system built for real-time robotic manipulation, enabling precise 3D localization and grasping of solar panels under harsh outdoor conditions.
Leading perception and system development for a camera-based autonomous golf cart operating in unstructured environments without lane assumptions.
Designed and integrated 3D annotation acceleration features into CVAT, reducing labeling time by up to 4x for autonomous driving datasets.

Evaluated and benchmarked multiple SLAM and HD mapping algorithms on real autonomous vehicle data to identify the most accurate pipeline for offline map generation and correction.
Projects
C++, TensorRT, and CUDA inference pipeline for YOLOv12 focused on fast deployment and efficient preprocessing on edge hardware.
C++ ONNX Runtime implementation for YOLOv12 built to keep the inference path lightweight while preserving real-time performance.
C++ inference project for YOLOv8 and YOLOv11 with oriented bounding box support, tailored for practical deployment workflows.
TensorRT and CUDA inference pipeline for RF-DETR, extending the same deployment-first approach beyond YOLO-based models.
Merged Pull Requests
Geekgineer/YOLOs-CPP
YOLOs-CPP Contributions
Merged a sequence of contributions into YOLOs-CPP covering YOLO pose support, Oriented Bounding Box detection support, YOLOv9 detection and segmentation support, YOLOv11 fixes, YOLOv12 integration, header fixes, and documentation improvements. Taken together, these PRs materially expanded the library’s real-time C++ deployment utility.
Pose
OBB
Segmentation
THU-MIG/yolov10
YOLOv10 Multi-Object Tracking Integration (BoxMOT)
Developed and contributed a YOLOv10 + BoxMOT integration supporting multiple tracking backends, later incorporated into the official YOLOv10 repository.
Winter School
KAUST (King Abdullah University of Science and Technology), Saudi Arabia
Highly selective program: selected as one of 300 participants from more than 2,300 applicants globally for an intensive, week-long deep learning and AI program.
World-Class Instruction
Rigorous lectures, hands-on labs, and research roundtables led by researchers and engineers from Google DeepMind, OpenAI, NVIDIA, Apple, and leading academic institutions.
Advanced Topics Explored
Embodied AI, reasoning, large language models, and quantum computing through a systems-oriented machine learning lens.



Master's Degree
Concordia University, Quebec, Canada
Starts Fall 2026
Bachelor's Degree
Alexandria University
Jun 2025
Excellence with Honor
GPA 3.76 / 4.0
Bachelor Thesis
Estats: A Non-Parametric Python Statistical Library