This paper addresses the problem of data leakage, whose prevalence is exacerbated by the growing accessibility of machine learning (ML) through user-friendly, "push-a-button" interfaces that require no specialized knowledge. Data leakage occurs when training data contains unintended information that distorts model evaluation, potentially yielding misleading performance estimates. We categorize data leakage in ML and discuss how it propagates through ML workflows under specific conditions. Furthermore, we investigate the association between data leakage and specific tasks, examine its occurrence in transfer learning, and compare standard inductive ML with transferable ML frameworks. Ultimately, we highlight the importance of addressing data leakage for robust and reliable ML applications.
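
To make the definition concrete, the sketch below illustrates one canonical form of leakage, preprocessing leakage, in which normalization statistics are computed on the full dataset before the train/test split so that test-set information leaks into training. This is a minimal, hypothetical illustration assuming scikit-learn, not an example taken from the paper.

```python
# Minimal sketch of preprocessing leakage (hypothetical example, not from
# the paper): fitting a scaler on the full dataset before splitting leaks
# test-set statistics into training and can inflate the evaluated score.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Leaky pipeline: the scaler "sees" the test set before the split.
X_scaled = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X_scaled, y, random_state=0)
leaky_acc = accuracy_score(
    y_te, LogisticRegression().fit(X_tr, y_tr).predict(X_te)
)

# Correct pipeline: the scaler is fit on the training split only.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr)
clean_acc = accuracy_score(
    y_te,
    LogisticRegression()
    .fit(scaler.transform(X_tr), y_tr)
    .predict(scaler.transform(X_te)),
)

print(f"leaky estimate: {leaky_acc:.3f}, clean estimate: {clean_acc:.3f}")
```

On small or high-dimensional datasets the gap between the two estimates can be substantial; the leaky score overstates how well the model would perform on genuinely unseen data.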