Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

SKGE-SWIN: End-To-End Autonomous Vehicle Waypoint Prediction and Navigation Using Skip Stage Swin Transformer

Created by
  • Haebom

Authors

Fachri Najm Noer Kartiman, Rasim, Yaya Wihardi, Nurul Hasanah, Oskar Natan, Bambang Wahono, Taufik Ibnu Salim

Outline

This study develops an end-to-end autonomous driving model that accounts for the contextual relationships between pixels, and proposes the SKGE-Swin architecture. SKGE-Swin uses a Swin Transformer with a skip-stage mechanism to broaden feature representation both globally and across multiple network levels. The Swin Transformer's Shifted Window-based Multi-head Self-Attention (SW-MSA) lets the model extract information from distant pixels, and the architecture retains important information from the initial to the final stages of feature extraction, improving its ability to understand complex patterns in the surrounding environment. The model is evaluated on the CARLA platform using adversarial scenarios that simulate real-world conditions, where it achieves a higher driving score than existing methods. An ablation study then evaluates the contribution of each architectural component, including the skip connections and the use of the Swin Transformer.
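To make the shifted-window idea concrete, here is a minimal sketch of how a generic Swin-style layer regroups pixels between a regular window partition and a shifted one. It is an illustration of the general SW-MSA windowing scheme, not the authors' implementation: the helper names `cyclic_shift` and `window_partition`, the 4x4 toy grid, and the window size of 2 are all assumptions chosen for readability, and the attention computation itself is omitted.

```python
def cyclic_shift(grid, shift):
    """Roll a 2D grid up and left by `shift` cells, wrapping around the edges."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, window):
    """Split a 2D grid into non-overlapping window x window blocks (row-major)."""
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must tile evenly"
    blocks = []
    for r0 in range(0, h, window):
        for c0 in range(0, w, window):
            blocks.append([grid[r][c]
                           for r in range(r0, r0 + window)
                           for c in range(c0, c0 + window)])
    return blocks


# Toy 4x4 "feature map" whose entries are just pixel indices (hypothetical data).
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

# Regular partition: attention would be computed inside each 2x2 window only.
regular = window_partition(grid, 2)

# Shifted partition: shift by half the window size, then partition again.
# Pixels that sat in different regular windows now share a window.
shifted = window_partition(cyclic_shift(grid, 1), 2)

print(regular[0])  # [0, 1, 4, 5]
print(shifted[0])  # [5, 6, 9, 10]
```

In this toy case the first shifted window holds pixels 5, 6, 9, and 10, each of which belonged to a different regular window, which is how alternating the two partitions lets information travel between windows over successive layers. Windows that wrap around the border (such as the last shifted window here) mix pixels from opposite edges; the original Swin Transformer design handles this with an attention mask, which this sketch leaves out.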

Takeaways, Limitations

Takeaways:
Proposes an autonomous driving architecture that captures inter-pixel context by combining the Swin Transformer with a skip-stage mechanism.
Achieves a higher driving score than existing methods on the CARLA platform.
Includes an ablation study that analyzes the contribution of each architectural component.
Limitations:
Evaluation is limited to the CARLA simulation environment; performance under real road conditions still needs to be verified.
Details on the results of the ablation study are lacking.
Further research is needed on generalization performance across different environments and situations.