Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Vision without Images: End-to-End Computer Vision from Single Compressive Measurements

Created by
  • Haebom

Authors

Fengpu Pan, Heting Gao, Jiangtao Wen, Yuxing Han

Outline

This paper presents a computer vision framework based on Snapshot Compressive Imaging (SCI) that uses an 8x8 pseudo-random binary mask. It targets a limitation of existing SCI techniques, which perform poorly under low-light and low-SNR conditions. At its core is the Compressive Denoising Autoencoder (CompDAE), built on the STFormer architecture, which performs downstream tasks such as edge detection and depth estimation directly, without reconstructing an image. CompDAE is trained with a rate-constrained strategy inspired by BackSlash to produce compressible models, and it pairs a shared encoder with lightweight task-specific decoders to form a unified multi-task platform. Experiments on several datasets show that CompDAE achieves state-of-the-art performance at significantly lower complexity, particularly under ultra-low-light conditions where existing CMOS and SCI pipelines fail.
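To make the idea of a single compressive measurement concrete, here is a minimal sketch of a generic SCI forward model: each frame is multiplied element-wise by a binary mask and the modulated frames are summed into one 2D measurement. The summary does not say how the paper applies its 8x8 mask, so the periodic tiling, the use of one mask per frame, and the names make_masks and sci_measure are all assumptions made for illustration, not the authors' implementation.

```python
import random


def make_masks(num_frames, size=8, seed=0):
    """Return num_frames pseudo-random binary masks, each size x size.

    Assumption: one independent mask per frame; the paper may differ.
    """
    rng = random.Random(seed)
    return [
        [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
        for _ in range(num_frames)
    ]


def sci_measure(frames, masks):
    """Collapse T frames (nested lists, T x H x W) into one H x W measurement.

    Each frame is modulated by its mask, tiled periodically across the
    frame (an assumption), and the modulated frames are summed.
    """
    height, width = len(frames[0]), len(frames[0][0])
    size = len(masks[0])
    measurement = [[0.0] * width for _ in range(height)]
    for frame, mask in zip(frames, masks):
        for r in range(height):
            for c in range(width):
                measurement[r][c] += mask[r % size][c % size] * frame[r][c]
    return measurement


# Toy usage: 4 random 16x16 frames become a single 16x16 measurement.
T, H, W = 4, 16, 16
rng = random.Random(1)
frames = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(T)]
y = sci_measure(frames, make_masks(T))
print(len(y), len(y[0]))
```

In a pipeline like the one described above, a measurement such as y would be passed to the task network directly, rather than first being inverted back into the T frames.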

Takeaways, Limitations

Takeaways:
Presents an SCI-based computer vision framework that demonstrates superior performance under low-light and low-SNR conditions.
Uses a small mask that is easy to implement in hardware.
Performs downstream tasks (edge detection, depth estimation, etc.) directly, without image reconstruction.
Provides a unified platform for multiple tasks.
Achieves state-of-the-art performance with lower complexity than existing methods.
Limitations:
The 8x8 mask may cause a loss of resolution.
Results are reported only for specific datasets, so generalization still needs to be verified.
The details and effectiveness of the BackSlash-inspired rate-constrained training strategy need further explanation.