Autonomous Systems / 3D Perception / Multi-Sensor Fusion

Mohamed Samir

Perception Engineer

I design and deploy real-time perception systems for robotics and autonomous platforms, with a focus on multi-sensor fusion, 3D/4D reconstruction, and edge inference.

My work bridges research and production, turning advanced perception models into reliable systems that operate under real-world constraints.

Focus

3D Vision, Robotics, Autonomous Systems, and Edge AI.

Main Stack

C++, Python, ROS2, CUDA, TensorRT, ZED, and multi-sensor systems.

Research Interests

3D reconstruction, high-resolution segmentation, and neural implicit representations.

Experience

3D Computer Vision Engineer

Mar 2025 - Present

Anovate.Ai

Designed and deployed real-time perception and 3D reconstruction systems across robotics and medical imaging workflows.

Robotic Perception

Architected a full ROS2 perception stack integrating 6 ZED cameras and ultrasonics with sub-millimeter calibration using Kalibr.

Fine-tuned Mask-RT-DETR on a custom dataset expanded 5x through augmentation, reaching 0.89 mAP.

Deployed the pipeline in C++ with TensorRT on Jetson Orin Nano, achieving ~15x faster inference.

Built a robust 3D perception pipeline combining stereo depth, classical filtering, and segmentation fusion to compute precise OBBs under harsh outdoor sunlight.

Medical Reconstruction

Led 4D virtual patient reconstruction using Gaussian Splatting and face avatar models, including INSTA and Fate-Avatar.

Developed a pipeline for accurate 3D/4D dental model alignment and tracking inside dynamic facial avatars using 3D Slicer, PnP, and Metrical Tracker.

Research Assistant

Feb 2025 - Present

Connected Autonomous Vehicles Lab, AUC

Built a full-stack autonomous vehicle system and conducted research on high-resolution segmentation for real-world perception.

Autonomous System (Golf Cart)

Architected the full stack for an autonomous golf cart conversion, including perception, communication, and control.

Built a high-performance C++ server integrating 6 cameras via GStreamer/DeepStream for parallel acquisition.

Deployed dynamic-batch YOLOv11 TensorRT inference with custom CUDA pre/post-processing.

Integrated a complete drive-by-wire system using WebRTC (libdatachannel), a Kotlin Android app, and ESP32 for real-time control and streaming.

Research (Segmentation)

Developed a hybrid segmentation architecture combining Mamba and Implicit Neural Representations (INR) to handle scale variation.

Achieved strong boundary precision on DIS5K by modeling long-range dependencies without costly upsampling.

AI Engineer

Part-time

Nov 2025 - Present

Makkook.ai

Applied AI engineering focused on LLM, RAG, and VLM-based systems.

LLM / RAG / VLM Systems

Designed and deployed RAG pipelines for knowledge retrieval and structured reasoning.

Worked with LLMs and VLMs for multimodal understanding and task automation.

Built production-oriented AI workflows integrating external data sources and model inference.

Perception Engineer Intern

Jul 2024 - Sep 2024

Brightskies

Improved 3D annotation workflows by automating bounding box fitting and orientation estimation in CVAT.

3D Annotation Workflow

Developed a box fitting algorithm with a lightweight 3D model for yaw angle prediction inside CVAT.

Enabled 2D bird’s-eye-view annotation to automatically generate oriented 3D bounding boxes on point clouds.

Optimized the pipeline for CPU execution to support real-time annotation workflows.

Reduced annotation time by ~40%.

Perception Engineer Intern

Sep 2023 - Dec 2023

Brightskies

Evaluated and optimized SLAM pipelines and map processing workflows for autonomous vehicle mapping and HD map correction.

SLAM & Mapping

Evaluated multiple SLAM pipelines, including Point-LIO, FAST-LIO2, and LIO-SAM, using camera, LiDAR, and GPS data.

Generated high-precision environment maps and converted outputs to XODR format for HD map correction.

Integrated SLAM workflows into ROS2 for efficient offline map generation and processing.

Map Refinement

Developed dynamic object filtering pipelines using ERASOR, Removert, and RANSAC on LiDAR point clouds.

Extracted static road structure and removed dynamic obstacles to improve map quality.

Achieved ~15% improvement in map accuracy for autonomous navigation.

Projects

Projects representing real system ownership, deployment work, and technical decision-making.

Robotic Perception System for Solar Panel Manipulation

Anovate.Ai / Swap Robotics, 2025 - Present

A multi-camera ROS2 perception system built for real-time robotic manipulation, enabling precise 3D localization and grasping of solar panels under harsh outdoor conditions.

ROS2, ZED Stereo, C++, CUDA, TensorRT, Jetson, 3D Perception

Autonomous Golf Cart

Connected Autonomous Vehicles Lab, AUC, 2025 - Present

Leading perception and system development for a camera-based autonomous golf cart operating in unstructured environments without lane assumptions.

Drive-by-Wire, WebRTC, ESP32, Kotlin, ROS2, GStreamer, TensorRT, Multi-Camera Perception

3D Annotation Tool Enhancement for Autonomous Driving (CVAT Extension)

BrightDrive / Brightskies, 2024

Designed and integrated 3D annotation acceleration features into CVAT, reducing labeling time by up to 4x for autonomous driving datasets.

CVAT, TypeScript, BEV, Point Cloud Processing

Figure: Point-LIO mapping result used for SLAM and HD map evaluation on autonomous vehicle data.

SLAM & HD Mapping Evaluation for Autonomous Vehicle Map Correction

BrightDrive / Brightskies, 2023 - 2024

Evaluated and benchmarked multiple SLAM and HD mapping algorithms on real autonomous vehicle data to identify the most accurate pipeline for offline map generation and correction.

ROS, SLAM, HD Mapping, Dynamic Object Removal

Open Source

Deployment-focused projects and merged contributions.

Projects

YOLOv12-TensorRT-CPP

C++, TensorRT, and CUDA inference pipeline for YOLOv12 focused on fast deployment and efficient preprocessing on edge hardware.

Merged into official YOLOv12 repo

YOLOv12-ONNX-CPP

C++ ONNX Runtime implementation for YOLOv12 built to keep the inference path lightweight while preserving real-time performance.

Merged into official YOLOv12 repo

YOLO_OBB_CPP

C++ inference project for YOLOv8 and YOLOv11 with oriented bounding box support, tailored for practical deployment workflows.

RF-DETR-TensorRT-CPP

TensorRT and CUDA inference pipeline for RF-DETR, extending the same deployment-first approach beyond YOLO-based models.

Merged Pull Requests

Geekgineer/YOLOs-CPP

YOLOs-CPP Contributions

Merged a sequence of contributions into YOLOs-CPP covering YOLO pose support, Oriented Bounding Box detection support, YOLOv9 detection and segmentation support, YOLOv11 fixes, YOLOv12 integration, header fixes, and documentation improvements. Taken together, these PRs materially expanded the library’s real-time C++ deployment utility.

Pose

OBB

Segmentation

THU-MIG/yolov10

YOLOv10 Multi-Object Tracking Integration (BoxMOT)

Developed and contributed a YOLOv10 + BoxMOT integration supporting multiple tracking backends, later incorporated into the official YOLOv10 repository.

Selected Academic Programs

Winter School

MENA ML Winter School, 2026

KAUST (King Abdullah University of Science and Technology), Saudi Arabia

Highly selective program: selected as one of 300 participants from more than 2,300 applicants globally for an intensive, week-long deep learning and AI program.

World-Class Instruction

Rigorous lectures, hands-on labs, and research roundtables led by researchers and engineers from Google DeepMind, OpenAI, NVIDIA, Apple, and leading academic institutions.

Advanced Topics Explored

Embodied AI, reasoning, large language models, and quantum computing through a systems-oriented machine learning lens.

Education

Master's Degree

Master's in Electrical Engineering

Concordia University, Quebec, Canada

Starts Fall 2026

Bachelor's Degree

B.Sc. in Computing and Data Science

Alexandria University

Jun 2025

Excellence with Honor

GPA 3.76 / 4.0

Bachelor Thesis

Estats: A Non-Parametric Python Statistical Library

Contact me on LinkedIn or through email