Real-time robotic perception on resource-constrained hardware demands reliable object
detection where RGB-only systems routinely fail — specular surfaces, low texture, motion blur,
and occlusion. This thesis presents a Mixture-of-Experts (MoE) perception architecture fusing a
TensorRT-accelerated YOLOv8 segmentation expert with a GPU-based depth expert, coordinated
by a deterministic gating mechanism that routes fusion decisions per instance at inference time.
Unlike existing RGB-D fusion approaches requiring joint training on annotated datasets, the
proposed system is training-free at fusion time. A GPU RANSAC support-plane fit serves as a live
depth reliability signal, enabling per-instance routing (RGB-led, depth-assisted, or depth-proposed)
without modifying the underlying RGB detector.
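The routing idea can be summarized in a minimal sketch. The field names, thresholds, and the use of the plane-fit inlier ratio as the depth reliability score below are illustrative assumptions, not the exact thesis implementation.

```python
# Minimal sketch of deterministic, training-free per-instance gating.
# Thresholds and names are illustrative assumptions, not the thesis code.
from dataclasses import dataclass
from enum import Enum, auto

class Route(Enum):
    RGB_LED = auto()         # trust the RGB instance as-is
    DEPTH_ASSISTED = auto()  # refine the RGB instance with depth support
    DEPTH_PROPOSED = auto()  # propose the instance from depth alone

@dataclass
class GateInputs:
    rgb_confidence: float      # YOLOv8 instance score in [0, 1]
    plane_inlier_ratio: float  # RANSAC support-plane inlier ratio in [0, 1]

def gate(x: GateInputs,
         rgb_thresh: float = 0.5,
         depth_thresh: float = 0.6) -> Route:
    """Route one detected instance using only live, unlearned signals."""
    depth_reliable = x.plane_inlier_ratio >= depth_thresh
    if x.rgb_confidence >= rgb_thresh:
        # RGB expert is confident: depth is used only to refine, if trustworthy.
        return Route.DEPTH_ASSISTED if depth_reliable else Route.RGB_LED
    # RGB expert is weak (blur, darkness, occlusion): fall back to depth if reliable.
    return Route.DEPTH_PROPOSED if depth_reliable else Route.RGB_LED

if __name__ == "__main__":
    print(gate(GateInputs(rgb_confidence=0.12, plane_inlier_ratio=0.81)))  # Route.DEPTH_PROPOSED
```

Because the gate is a fixed decision rule over measured signals, its behavior is deterministic and auditable, which is what permits the per-instance routing statistics reported below.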
The system is evaluated across four areas: detection robustness, contextual reasoning,
embedded real-time performance, and interpretability. Under RGB failure conditions including
missing backgrounds, darkness, motion blur, and occlusion, RGB-only recall ranges from 2.2% to
79.8%, while the proposed MoE achieves 63.6% to 96.4% recall and recovers 60.0% to 78.9% of
objects missed by RGB-only detection. Gate statistics indicate 97.9–100% RGB-led operation under
stable conditions, rising to 40.1–87.7% depth-led under failure, consistent with modality-adaptive
routing in the absence of any learned signal. Performance benchmarks on the Jetson Orin NX and AGX Xavier
demonstrate 10.8–36.0 FPS and 8.5–30.2 FPS, respectively, meeting soft real-time requirements
under worst-case clutter. Observed failure modes are diagnosable from intermediate gate signals,
supporting interpretability claims.
Keywords: RGB-D perception, mixture of experts, embedded robotics, instance segmentation,
sensor fusion, interpretability