The Open Large-scale Korean Audio-Visual Speech (OLKAVS) dataset is the largest publicly available audio-visual speech dataset, comprising 1,150 hours of video from 1,107 Korean speakers. It was recorded in a studio environment from nine viewpoints and under various noise conditions. The dataset also provides pre-trained baseline models for audio-visual speech recognition and lip reading, together with experimental results validating the effectiveness of multimodal and multi-view learning. It is expected to overcome the limitations of existing English-centric datasets and to facilitate multimodal research across diverse areas, including Korean speech recognition, speaker recognition, pronunciation-level classification, and lip movement analysis.
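To make the multi-view, multimodal structure concrete, the sketch below shows one plausible way to represent a single OLKAVS-style utterance (synchronized audio plus nine camera views). The directory layout, file names (`audio.wav`, `view_*.mp4`, `transcript.txt`), viewpoint identifiers, and noise-condition labels are illustrative assumptions, not the dataset's actual schema.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Optional

# Hypothetical view identifiers: OLKAVS records nine camera angles,
# but this naming scheme is an assumption, not the dataset's own.
VIEWPOINTS = [f"view_{i}" for i in range(9)]


@dataclass
class OLKAVSSample:
    """One utterance: synchronized audio plus nine camera views."""
    speaker_id: str
    transcript: str                         # Korean ground-truth text
    audio_path: Path                        # audio file (format assumed)
    video_paths: Dict[str, Path]            # viewpoint name -> video file
    noise_condition: Optional[str] = None   # e.g. "clean", "babble" (assumed labels)


def load_sample(root: Path, utterance_id: str) -> OLKAVSSample:
    """Assemble one multi-view sample from an assumed directory layout:
    root/<utterance_id>/audio.wav, root/<utterance_id>/<view>.mp4,
    and root/<utterance_id>/transcript.txt."""
    base = root / utterance_id
    return OLKAVSSample(
        speaker_id=utterance_id.split("_")[0],  # assumes id encodes the speaker
        transcript=(base / "transcript.txt").read_text(encoding="utf-8").strip(),
        audio_path=base / "audio.wav",
        video_paths={v: base / f"{v}.mp4" for v in VIEWPOINTS},
    )
```

A structure like this makes the dataset's two distinguishing axes explicit: multimodality (audio alongside video) and multi-view coverage (one entry per camera angle), which is exactly what the baseline multimodal and multi-view learning experiments exploit.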