RACCooN is a framework that converts a video into a paragraph-level text description and then generates a video from that text, so that users can easily edit individual raw videos. The framework automatically describes the scenes of a video in natural language, and users can then remove, add, or modify video content by editing that text. Its two main stages are Video-to-Paragraph (V2P) and Paragraph-to-Video (P2V).
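
The two-stage flow described above can be sketched as a small data-flow example. This is only an illustration of the V2P, text edit, P2V order, not RACCooN's actual API: the names `edit_via_text`, `toy_v2p`, and `toy_p2v` are hypothetical, a "video" is stood in for by a list of scene labels, and passing the source video to the P2V step is an assumption.

```python
from typing import Callable, List

# Toy stand-in for a video: one short text label per scene (assumption).
Video = List[str]


def edit_via_text(
    video: Video,
    v2p: Callable[[Video], str],
    edit_text: Callable[[str], str],
    p2v: Callable[[Video, str], Video],
) -> Video:
    """Run the V2P -> text edit -> P2V flow with caller-supplied stages."""
    paragraph = v2p(video)          # V2P: describe the video in natural language
    edited = edit_text(paragraph)   # the user removes, adds, or modifies text
    return p2v(video, edited)       # P2V: generate a video from the edited text


def toy_v2p(video: Video) -> str:
    # Hypothetical placeholder: join the scene labels into one description.
    return "; ".join(video)


def toy_p2v(source: Video, paragraph: str) -> Video:
    # Hypothetical placeholder: split the description back into scene labels.
    # The source video is accepted but unused in this toy version.
    return [part.strip() for part in paragraph.split(";") if part.strip()]


source = ["a dog on grass", "a red ball"]
result = edit_via_text(
    source,
    toy_v2p,
    lambda text: text.replace("red", "blue"),  # a "modify" edit expressed as text
    toy_p2v,
)
print(result)
```

In a real system the two placeholder stages would be learned models; the point of the sketch is only that every edit is expressed as a change to the intermediate paragraph.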