This paper examines how the components used to build Generative Artificial Intelligence (GenAI) applications, including inference servers, object stores, vector and graph databases, and user interfaces, are interconnected via web-based APIs. In particular, it highlights the growing trend toward containerized deployment of these components in cloud environments and the corresponding need for related technology development in high-performance computing (HPC) centers. The paper discusses the integration of HPC and cloud computing environments and presents a converged computing architecture that combines HPC and Kubernetes platforms to run containerized GenAI workloads. A case study deploying the Llama Large Language Model (LLM) demonstrates how a containerized inference server (vLLM) can be run using multiple container runtimes on both Kubernetes and HPC platforms. The paper concludes with practical considerations and opportunities for the HPC container community and offers guidance for future research and tool development.
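To make the web-based API interconnection concrete, the sketch below shows one way a client might query a containerized vLLM inference server through its OpenAI-compatible HTTP endpoint. The URL, model identifier, and prompt are illustrative assumptions, not details drawn from the case study.

```python
# Minimal sketch: querying a vLLM inference server over its web-based API.
# vLLM exposes an OpenAI-compatible HTTP API; the host, port, and model
# name below are placeholders for illustration, not values from the paper.
import requests

VLLM_URL = "http://localhost:8000/v1/completions"  # assumed local deployment

payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",  # hypothetical model id
    "prompt": "Summarize converged HPC and cloud computing in one sentence.",
    "max_tokens": 64,
}

response = requests.post(VLLM_URL, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["text"])
```

Because the API is plain HTTP, the same client code works whether the server runs in a Kubernetes pod or under an HPC container runtime, which is the interoperability point the paper develops.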