- Raw Data Is All You Need: Virtual Axle Detector with Enhanced Receptive Field Rising maintenance costs of ageing infrastructure necessitate innovative monitoring techniques. This paper presents a new approach for axle detection, enabling real-time application of Bridge Weigh-In-Motion (BWIM) systems without dedicated axle detectors. The proposed method adapts the Virtual Axle Detector (VAD) model to handle raw acceleration data, which allows the receptive field to be increased. The proposed Virtual Axle Detector with Enhanced Receptive field (VADER) improves the \(F_1\) score by 73\% and spatial accuracy by 39\%, while cutting computational and memory costs by 99\% compared to the state-of-the-art VAD. VADER reaches a \(F_1\) score of 99.4\% and a spatial error of 4.13~cm when using a representative training set and functional sensors. We also introduce a novel receptive field (RF) rule for an object-size driven design of Convolutional Neural Network (CNN) architectures. Based on this rule, our results suggest that models using raw data could achieve better performance than those using spectrograms, offering a compelling reason to consider raw data as input. 3 authors · Sep 4, 2023
- Bidirectional Diffusion Bridge Models Diffusion bridges have shown potential in paired image-to-image (I2I) translation tasks. However, existing methods are limited by their unidirectional nature, requiring separate models for forward and reverse translations. This not only doubles the computational cost but also restricts their practicality. In this work, we introduce the Bidirectional Diffusion Bridge Model (BDBM), a scalable approach that facilitates bidirectional translation between two coupled distributions using a single network. BDBM leverages the Chapman-Kolmogorov Equation for bridges, enabling it to model data distribution shifts across timesteps in both forward and backward directions by exploiting the interchangeability of the initial and target timesteps within this framework. Notably, when the marginal distribution given endpoints is Gaussian, BDBM's transition kernels in both directions possess analytical forms, allowing for efficient learning with a single network. We demonstrate the connection between BDBM and existing bridge methods, such as Doob's h-transform and variational approaches, and highlight its advantages. Extensive experiments on high-resolution I2I translation tasks demonstrate that BDBM not only enables bidirectional translation with minimal additional cost but also outperforms state-of-the-art bridge models. Our source code is available at [https://github.com/kvmduc/BDBM||https://github.com/kvmduc/BDBM]. 5 authors · Feb 11, 2025
- Enhancing Physical Consistency in Lightweight World Models A major challenge in deploying world models is the trade-off between size and performance. Large world models can capture rich physical dynamics but require massive computing resources, making them impractical for edge devices. Small world models are easier to deploy but often struggle to learn accurate physics, leading to poor predictions. We propose the Physics-Informed BEV World Model (PIWM), a compact model designed to efficiently capture physical interactions in bird's-eye-view (BEV) representations. PIWM uses Soft Mask during training to improve dynamic object modeling and future prediction. We also introduce a simple yet effective technique, Warm Start, for inference to enhance prediction quality with a zero-shot model. Experiments show that at the same parameter scale (400M), PIWM surpasses the baseline by 60.6% in weighted overall score. Moreover, even when compared with the largest baseline model (400M), the smallest PIWM (130M Soft Mask) achieves a 7.4% higher weighted overall score with a 28% faster inference speed. 11 authors · Sep 15, 2025