This paper presents DynaMark, a reinforcement learning-based dynamic watermarking technique that addresses the vulnerability of networked machine tool controllers (MTCs) to replay attacks in Industry 4.0 environments. Unlike existing dynamic watermarking techniques, which assume linear-Gaussian dynamics and constant watermark statistics, DynaMark learns online an adaptive policy that adjusts the covariance of a Gaussian watermark using system measurements and detector feedback, without requiring system knowledge. The policy maximizes a reward function that dynamically balances control performance, energy consumption, and detection reliability. For linear systems, we also develop a Bayesian belief update mechanism that estimates detection reliability in real time. Using a digital twin of a Siemens Sinumerik 828D controller and a real stepper motor testbed, we demonstrate that DynaMark reduces watermark energy by 70% compared to existing methods while preserving the nominal trajectory and maintaining an average detection delay of one sampling interval.