In this paper, we propose ACME, an adaptive customization method for Transformer-based large-scale models that leverages a distributed system to address the data privacy and response latency issues that arise when large language models are deployed in cloud environments. ACME performs progressively fine-grained collaborative model customization over a bidirectional single-loop distributed system, avoiding the inefficiency of centralized approaches. To accommodate user heterogeneity, we identify the Pareto front under model-size constraints to customize backbone generation; we then apply personalized architecture aggregation based on each user's data distribution to refine header generation, so that the resulting model also accommodates data heterogeneity. Evaluation results on multiple datasets show that ACME obtains a cost-effective model under the model-size constraint, reducing data transmission by 6% compared to a centralized system, improving average accuracy by 10% over the baseline model, and increasing the overall performance indicator by about 30%.
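To make the backbone-customization step concrete, the following is a minimal sketch of Pareto-front selection under a model-size constraint. It assumes each candidate backbone is summarized by a (size, validation accuracy) pair; the function names, candidate list, and scoring are illustrative assumptions for exposition, not the paper's implementation.

```python
# Hypothetical sketch: select a backbone from the Pareto front under a
# size budget. Candidates, names, and scores are illustrative only.
from typing import List, Tuple

Candidate = Tuple[str, float, float]  # (name, size_mb, accuracy)

def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates not dominated in both objectives:
    smaller size is better, higher accuracy is better."""
    front = []
    for c in candidates:
        dominated = any(
            o[1] <= c[1] and o[2] >= c[2] and (o[1] < c[1] or o[2] > c[2])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front

def select_backbone(candidates: List[Candidate],
                    size_budget_mb: float) -> Candidate:
    """Among Pareto-optimal candidates within the budget,
    pick the most accurate one."""
    feasible = [c for c in pareto_front(candidates)
                if c[1] <= size_budget_mb]
    if not feasible:
        raise ValueError("no candidate fits the size budget")
    return max(feasible, key=lambda c: c[2])

if __name__ == "__main__":
    candidates = [("tiny", 120, 0.71), ("small", 240, 0.78),
                  ("base", 480, 0.80), ("bloated", 500, 0.79)]
    # "bloated" is dominated by "base"; under a 300 MB budget,
    # "small" is the most accurate feasible Pareto point.
    print(select_backbone(candidates, size_budget_mb=300))
```

In this sketch the size budget plays the role of the per-user model-size constraint; a per-user header would then be generated and aggregated separately according to the user's data distribution.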