Mondrian Forest for On-Device Learning on Resource-Constrained IoT Devices
2025 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
This thesis investigates the practical feasibility of Mondrian Forests as an online learning method for resource-constrained Internet of Things (IoT) devices. Mondrian Forests are well-suited for on-device learning because they support incremental updates without full retraining, offer bounded memory usage, and are computationally efficient. However, their deployment on embedded systems with limited memory and processing power remains underexplored.

We present a native C implementation of Mondrian Forests tailored for microcontroller-class hardware, ensuring functional parity with the original Python algorithm across multiple benchmark datasets, including MNIST, USPS, DNA, and Satimage. To address the strict memory constraints typical of IoT devices, we develop and evaluate a suite of memory optimization techniques, including 8-bit integer precision conversion, PCA-based dimensionality reduction, and class-balanced data sampling, alongside model-level optimizations such as limiting tree growth and dynamic memory management.

We evaluate the memory footprint of the model alone, excluding other components such as data storage and inference, across three embedded platforms, limiting forest growth once the model reaches each platform's memory capacity. On the nRF52 platform (0.25 MB memory budget), this constraint led to a 5.5 percentage point drop in accuracy on the Satimage dataset, from 84.5% (no memory constraint) to 79%, for a 6-class classification task. While peak memory usage (including model, dataset, and inference) could not be reduced to fit within the 0.25 MB budget, we came close, reaching 0.4 MB with 70% accuracy on Satimage. For more complex datasets, accuracy under tight memory constraints approached that of random guessing. Our results demonstrate that Mondrian Forests can support on-device learning in moderately constrained embedded systems, especially for simpler classification tasks and datasets.
However, their need to retain training samples for online learning introduces a key bottleneck, limiting their practicality on ultra-low-memory devices such as the Zoul. This work provides a foundation for advancing memory-efficient, adaptive online learning methods tailored for embedded intelligence. Future research will focus on hybrid memory designs that leverage flash storage for node data, alongside architectural simplifications aimed at reducing memory demands and enhancing training efficiency on severely constrained devices.
Place, publisher, year, edition, pages
2025, p. 86
Series
IT ; mDV 25 015
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:uu:diva-562643
OAI: oai:DiVA.org:uu-562643
DiVA, id: diva2:1979257
External cooperation
Research Institutes of Sweden
Educational program
Master Programme in Computer Science
Presentation
2025-06-10, Uppsala, 15:15 (English)
Supervisors
Examiners
2025-06-30, Bibliographically approved