This paper addresses two limitations of existing AI models: their lack of autonomous behavior and independent reasoning ability, and their reliance on explicit queries for data input. It also notes that AI agents, unlike humans, have difficulty integrating knowledge from different fields, and it proposes incorporating mental imagery, which plays an important role in human thought, into a machine thinking framework. To this end, we propose a framework centered on cognitive thought units, which consist of input data units, desire units, and mental imagery units, and we describe a method that uses natural language sentences or sketched pictures as data for conveying information and making decisions. Finally, we present and discuss the verification results for the proposed framework.