This paper presents an effective method for improving sample efficiency in off-policy reinforcement learning with function approximation, focusing on the Actor-Critic (AC) framework. We address the key challenges of off-policy AC methods: instability caused by the "deadly triad," the problem of evaluating a continuously changing policy, and the difficulty of accurately estimating off-policy policy gradients. To this end, we introduce a novel concept, functional critic modeling, and propose the first objective-based off-policy AC algorithm that is shown to converge in the linear function approximation setting. From a practical perspective, we further present a carefully designed neural network architecture for functional critic modeling and demonstrate its effectiveness through preliminary experiments on widely used RL tasks from the DeepMind Control Suite benchmark.
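To make the idea of functional critic modeling concrete, the sketch below shows one possible realization: a critic that, in addition to the usual state-action input, is conditioned on a compact representation of the policy it evaluates, so a single network can track a continuously changing policy. This is a minimal illustrative sketch under our own assumptions; the class names, layer sizes, and the choice of flattened policy parameters as the policy representation are hypothetical and are not taken from the paper's architecture.

```python
# Hypothetical sketch of a "functional critic": Q(s, a; pi) approximated as
# Q_phi(s, a, e(pi)), where e(pi) is a learned embedding of the policy.
# All names and dimensions below are illustrative assumptions.
import torch
import torch.nn as nn


class FunctionalCritic(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, policy_param_dim: int,
                 policy_embed_dim: int = 64, hidden_dim: int = 256):
        super().__init__()
        # Encode the (flattened) policy parameters into a low-dimensional code.
        self.policy_encoder = nn.Sequential(
            nn.Linear(policy_param_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, policy_embed_dim),
        )
        # Standard MLP critic that also consumes the policy embedding.
        self.q_net = nn.Sequential(
            nn.Linear(state_dim + action_dim + policy_embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, state, action, policy_params):
        policy_embedding = self.policy_encoder(policy_params)
        x = torch.cat([state, action, policy_embedding], dim=-1)
        return self.q_net(x)


# Usage: the same critic evaluates different (changing) policies by varying
# the policy representation, rather than retraining a critic per policy.
critic = FunctionalCritic(state_dim=17, action_dim=6, policy_param_dim=128)
s = torch.randn(32, 17)
a = torch.randn(32, 6)
theta = torch.randn(32, 128)   # batch of flattened policy parameters
q_values = critic(s, a, theta)  # shape: (32, 1)
```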