Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Noise-based reward-modulated learning

Created by
  • Haebom

Authors

Jesús García Fernández, Nasir Ahmad, Marcel van Gerven

Outline

This paper presents a novel noise-based learning rule that mimics biological neural systems, which learn efficiently from delayed rewards. The rule remains applicable in resource-constrained environments and in systems containing non-differentiable components. Conventional reward-modulated Hebbian learning (RMHL) struggles with time delays and hierarchical processing. To address these limitations, the authors propose an algorithm that uses the reward prediction error as its optimization objective and incorporates an eligibility trace to enable retrospective credit assignment. The method uses only local information. In experiments on reinforcement learning tasks with both immediate and delayed rewards, it outperforms RMHL and achieves performance comparable to backpropagation (BP). Although it converges more slowly, the method is suited to low-power adaptive systems where energy efficiency and biological plausibility are crucial. It also offers insight into how dopamine-like signals and synaptic stochasticity may contribute to learning in biological networks.
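The ingredients named in the outline (injected noise, a reward prediction error, and an eligibility trace that bridges a reward delay) can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic node-perturbation-style rule on a toy linear task, and the function name `train_noise_rule`, the running-average reward prediction, the squared-error reward, and every hyperparameter value are assumptions chosen only for illustration.

```python
import numpy as np

def train_noise_rule(episodes=5000, delay=3, lr=1e-3, sigma=0.5,
                     trace_decay=0.9, baseline_rate=0.05, seed=0):
    """Toy noise-based, reward-modulated update (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)
    n_in, n_out = 4, 2
    # Hypothetical task: recover an unknown linear map from scalar rewards.
    w_target = 0.5 * rng.normal(size=(n_out, n_in))
    w = np.zeros((n_out, n_in))
    reward_pred = 0.0  # running estimate of the expected reward
    initial_error = float(np.sum((w - w_target) ** 2))

    for _ in range(episodes):
        x = rng.normal(size=n_in)
        noise = sigma * rng.normal(size=n_out)  # noise injected at the outputs
        y = w @ x + noise

        # Eligibility trace built from local quantities only:
        # postsynaptic noise times presynaptic input, scaled by 1 / sigma^2.
        trace = np.outer(noise, x) / sigma**2

        # The environment scores the action now but delivers the reward later.
        reward = -float(np.sum((y - w_target @ x) ** 2))
        for _ in range(delay):
            trace *= trace_decay  # the trace fades while the reward is pending

        # Reward prediction error gates the update; no gradients are computed.
        rpe = reward - reward_pred
        w += lr * rpe * trace
        reward_pred += baseline_rate * rpe

    final_error = float(np.sum((w - w_target) ** 2))
    return initial_error, final_error

if __name__ == "__main__":
    start, end = train_noise_rule()
    print(f"squared weight error: {start:.3f} -> {end:.3f}")
```

In this sketch the weight change depends only on the presynaptic input, the locally injected noise, and a single global scalar (the reward prediction error). The small learning rate needed to keep the noisy updates stable is consistent with the slow convergence noted in the summary.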

Takeaways, Limitations

Takeaways:
Presents a noise-based learning rule that remains effective with delayed rewards.
Demonstrates applicability in resource-constrained environments and non-differentiable systems.
Advances understanding of learning mechanisms in biological neural circuits.
Suggests potential applications for low-power adaptive systems, especially those where energy efficiency and biological plausibility are important.
Provides insights into the role of dopamine-like signaling and synaptic stochasticity.
Limitations:
Experiments were conducted only on networks with simple structures.
Convergence is slower than with backpropagation-based learning.
Applicability to complex real-world problems requires further study.