This paper proposes a TVM-based compilation workflow that efficiently maps AI workloads onto RISC-V vector units. Instead of relying on handcrafted kernel libraries or compiler-specific auto-vectorization, we integrate the RISC-V Vector (RVV) extension into TVM's MetaSchedule framework to optimize the performance of diverse AI workloads. Experiments on several RISC-V SoCs, both FPGA-based and commercial, demonstrate an average 46% reduction in execution latency compared to GCC's auto-vectorization, a 29% reduction compared to muRISCV-NN, and mappings that execute on average 35% faster than those produced by LLVM's auto-vectorization. Furthermore, the generated binaries have a smaller code memory footprint, making them well suited to embedded devices. The proposed workflow is open-sourced so that it can be applied to other RISC-V extensions.
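For readers unfamiliar with the entry point involved, the sketch below shows how a TIR workload can be tuned for an RVV-capable target with stock TVM MetaSchedule. It is a minimal illustration only: the matmul module, the target string, and the trial budget are our own assumptions for exposition and do not reproduce the paper's modified workflow.

```python
# Minimal sketch, assuming a recent TVM (>= 0.13) built with an LLVM backend
# that supports RISC-V code generation. Module, target, and trial budget are
# illustrative assumptions, not the paper's configuration.
import tvm
from tvm import meta_schedule as ms
from tvm.script import ir_module
from tvm.script import tir as T


@ir_module
class MatmulModule:
    @T.prim_func
    def main(A: T.Buffer((128, 128), "float32"),
             B: T.Buffer((128, 128), "float32"),
             C: T.Buffer((128, 128), "float32")) -> None:
        T.func_attr({"global_symbol": "main", "tir.noalias": True})
        for i, j, k in T.grid(128, 128, 128):
            with T.block("C"):
                vi, vj, vk = T.axis.remap("SSR", [i, j, k])
                with T.init():
                    C[vi, vj] = T.float32(0)
                C[vi, vj] = C[vi, vj] + A[vi, vk] * B[vk, vj]


# RVV is enabled via the LLVM target attribute "+v"; MetaSchedule then searches
# over loop tiling, unrolling, and vectorization choices for this target.
target = tvm.target.Target(
    "llvm -mtriple=riscv64-unknown-linux-gnu -mcpu=generic-rv64 -mattr=+m,+v"
)

# Run the MetaSchedule search and record the best candidates in a database.
database = ms.tune_tir(
    mod=MatmulModule,
    target=target,
    work_dir="./ms_work_dir",
    max_trials_global=64,
)

# Recover the best schedule found and compile it into a deployable artifact.
sch = ms.tir_integration.compile_tir(database, MatmulModule, target)
if sch is not None:
    rvv_kernel = tvm.build(sch.mod, target=target)  # cross-compiled for RISC-V
```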