This paper presents Photonic Fabric™ and the Photonic Fabric Appliance™ (PFA), a photonic-based switch and memory subsystem that delivers low latency, high bandwidth, and low energy consumption. It integrates high-bandwidth HBM3E memory, on-module optical switches, and external DDR5 into a 2.5D electro-optical system-in-package that provides up to 32 TB of shared memory and 115 Tbps of full-bandwidth digital switching. Photonic Fabric™ enables distributed AI training and inference to execute parallelism strategies more efficiently. It removes the silicon area constraint that enforces the fixed memory-to-compute ratio of conventional XPU accelerator designs: replacing an XPU's local HBM stacks with a chiplet that connects to the Photonic Fabric expands both memory capacity and memory bandwidth. Using CelestiSim, a lightweight analytical simulator validated on NVIDIA H100 and H200 systems, we evaluate LLM inference performance and energy savings on the PFA without changing the GPU core design. Simulation results show up to 3.66x throughput improvement and 1.40x latency reduction in LLM inference at 405B parameters, up to 7.04x throughput improvement and 1.41x latency reduction at 1T parameters, and a 60-90% reduction in data-movement energy across all LLM training scenarios. While these results are demonstrated on NVIDIA GPUs, they apply similarly to other AI accelerator designs (XPUs) that share the same memory-to-compute constraint.