This paper addresses compositional personalized image generation, in which multiple concepts are combined into a single generated image. Existing work has focused primarily on preserving the appearance of the target entities and has overlooked fine-grained control over the interactions between them. We introduce a task called "Custom Human-Object Interaction Image Generation" (CHOI), which targets human-object interaction scenarios and requires both preserving the identities of the target human and object and controlling the interaction semantics between them. CHOI poses two key challenges: (1) preserving identity while controlling interaction requires decomposing the human and the object into self-contained identity features and pose-oriented interaction features, yet existing HOI image datasets do not provide suitable samples for learning this decomposition; and (2) an inappropriate spatial configuration between the human and the object can fail to convey the desired interaction semantics. To address these challenges, we first process a large-scale dataset in which each sample contains the same human-object pair in different interaction poses, and then design a two-stage model, Interact-Custom. Interact-Custom first explicitly models the spatial configuration by generating a foreground mask that depicts the interaction behavior. Guided by this mask, it then generates the target human and object interacting with each other while preserving their identity characteristics. Interact-Custom also lets users optionally specify a background image and the union region in which the target human and object should appear, providing a high level of content control. Extensive experiments with metrics tailored to the CHOI task demonstrate the effectiveness of the proposed approach.