This paper addresses an on-demand delivery system that uses unmanned aerial vehicles (UAVs) with heterogeneous energy storage capacities and unknown delivery times. Unlike previous studies, we propose a distributed deployment strategy that processes randomly arriving orders without knowledge of the UAVs' energy consumption models. The strategy combines auction-based task assignment with online learning: each UAV independently decides whether to bid for an order based on its energy storage capacity, the parcel weight, and the delivery distance. Simulation results show that, counterintuitively, assigning orders to the least confident bidder reduces delivery times and increases the number of successfully completed orders. In addition, we propose a strategy variant that uses the learned policy to promise fulfillment of orders at a specific future time, thereby prioritizing early orders. This study highlights the advantages of distributed, energy-aware decision-making and online learning, and provides new insights into the long-term deployment of UAV swarms in dynamic real-world environments.
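The auction-with-online-learning idea can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual algorithm: the energy-cost proxy, the confidence update rule, and all class and function names (`UAV`, `assign_order`) are assumptions introduced here for clarity. It shows the counterintuitive assignment rule of awarding each order to the least confident feasible bidder.

```python
class UAV:
    """Hypothetical UAV agent; fields and bid rule are illustrative only."""

    def __init__(self, uav_id, battery_capacity_wh):
        self.uav_id = uav_id
        self.battery_capacity_wh = battery_capacity_wh
        # Online-learned estimate of delivery success probability ("confidence").
        self.confidence = 0.5

    def bid(self, parcel_weight_kg, distance_km):
        # Crude feasibility screen: heavier/farther orders need more stored
        # energy. The linear cost proxy below is an assumption, since the
        # true energy consumption model is unknown to the UAVs.
        required_wh = 10.0 * parcel_weight_kg * distance_km
        if required_wh > self.battery_capacity_wh:
            return None  # decline to bid on an infeasible order
        return self.confidence

    def update(self, success):
        # Exponential-moving-average update standing in for the learned policy.
        self.confidence = 0.9 * self.confidence + 0.1 * (1.0 if success else 0.0)


def assign_order(uavs, parcel_weight_kg, distance_km):
    """Award the order to the LEAST confident bidder (the counterintuitive rule)."""
    bids = [(u, b) for u in uavs
            if (b := u.bid(parcel_weight_kg, distance_km)) is not None]
    if not bids:
        return None  # no UAV can feasibly serve this order
    winner, _ = min(bids, key=lambda ub: ub[1])
    return winner
```

In a full simulation, each winning UAV would call `update` after completing (or failing) its delivery, so that confidences adapt online as orders arrive.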