In this paper, we propose a modular actor-critic architecture, consisting of an LLM actor and LTLCrit, an LLM critic that communicates through linear temporal logic (LTL), to address a key limitation of large language models (LLMs): in long-horizon planning tasks their errors accumulate, leading to unsafe and inefficient behavior. The LLM actor selects high-level actions from natural-language observations, while LTLCrit analyzes full trajectories and proposes new LTL constraints that shield the actor from unsafe or inefficient future actions. The architecture supports both fixed, manually specified safety constraints and adaptively learned soft constraints that improve long-term efficiency, and it is model-agnostic. On the Minecraft diamond-mining benchmark, our approach achieves a 100% completion rate and improved efficiency over existing LLM planners, demonstrating that logic-based supervision enables safe and generalizable LLM decision-making.
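To make the actor-critic loop described above concrete, the following is a minimal Python sketch, not the authors' implementation: the actor proposes high-level actions, a constraint store of hard (safety) and soft (learned efficiency) LTL rules shields inadmissible actions, and after each episode the critic reads the full trajectory and appends new soft constraints. All names here (`LLM` actor/critic interfaces, `propose_actions`, `propose_constraints`, the environment API, and the example formula) are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class LTLConstraint:
    """One temporal-logic rule: a formula string plus a checker over (history, action)."""
    formula: str                         # e.g. "G(low_health -> X eat_food)" (illustrative)
    check: Callable[[list, str], bool]   # returns True if the action is admissible

@dataclass
class ConstraintStore:
    hard: List[LTLConstraint] = field(default_factory=list)  # fixed, hand-specified safety rules
    soft: List[LTLConstraint] = field(default_factory=list)  # learned rules proposed by the critic

    def allows(self, history: list, action: str) -> bool:
        # An action is admissible only if every accumulated constraint permits it.
        return all(c.check(history, action) for c in self.hard + self.soft)

def run_episode(actor, critic, env, constraints: ConstraintStore, max_steps: int = 200):
    """One episode of the loop: the actor picks high-level actions from natural-language
    observations, the constraint shield filters them, and afterwards the critic analyzes
    the full trajectory and proposes new soft LTL constraints for future episodes."""
    obs, history = env.reset(), []
    for _ in range(max_steps):
        # Actor returns candidate actions in preference order; take the first admissible one.
        action = next(
            (a for a in actor.propose_actions(obs) if constraints.allows(history, a)),
            None,
        )
        if action is None:
            break  # no admissible action under the current constraints
        obs, done = env.step(action)
        history.append((obs, action))
        if done:
            break
    # Trajectory-level feedback: new soft constraints shape the actor's future behavior.
    constraints.soft.extend(critic.propose_constraints(history))
    return history
```

The sketch only fixes the division of labor implied by the abstract: the shield enforces constraints at every step, while the critic operates offline on whole trajectories; how constraints are actually generated and checked is left to the paper's method.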