This paper explores the multilingual extension of long chains of thought (CoTs), which underpin the strong reasoning performance of large language models (LLMs). We fine-tuned Qwen 2.5 (7B) and Qwen 3 (8B) on two English reasoning datasets translated into French, Japanese, Latvian, and Swahili. Experiments show that the effectiveness of English as a bridge language varies by target language: it is ineffective for French, effective for Japanese and Latvian, and only weakly effective for Swahili. The extensive multilingual pretraining of Qwen 3 narrows, but does not eliminate, the cross-lingual performance gap. Fine-tuning on even a small dataset (1k traces) improves Swahili performance by more than 30%. Finally, the trade-off between data quality and scale is language-dependent: English and French benefit from smaller, carefully curated datasets, whereas Swahili and Latvian benefit from larger but noisier corpora. These results clarify how and why long CoTs transfer across languages, and we release a translated dataset to support fair multilingual reasoning research.