Sparse autoencoders (SAEs) extract features corresponding to interpretable concepts from the internal activations of large language models (LLMs). A key SAE training hyperparameter is L0, the average number of SAE features activated per token. Previous studies compare SAE algorithms using sparsity-reconstruction tradeoff plots, implying that L0 is a free parameter with no single correct value beyond its effect on reconstruction. This study investigates the effect of L0 on SAEs and shows that if L0 is not set properly, the SAE fails to separate the underlying features of the LLM. If L0 is too low, the SAE mixes correlated features to improve reconstruction. If L0 is too high, the SAE finds a corrupted solution that also mixes features. We also present a proxy metric that helps identify the correct L0 for an SAE on a given training distribution. Our method finds the correct L0 in toy models, and the L0 it selects matches the best sparse probing performance for SAEs trained on LLMs. We find that most commonly used SAEs have an L0 that is too low. This study shows that L0 must be set accurately to train SAEs that recover the correct features.
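
To make the role of the L0 hyperparameter concrete, the sketch below shows a minimal TopK-style SAE in PyTorch, in which exactly k features fire per token so the average L0 equals k by construction. The class name, dimensions, and the choice k=32 are illustrative assumptions for exposition, not the architecture or settings used in this work.

```python
import torch
import torch.nn as nn

class TopKSAE(nn.Module):
    """Illustrative TopK sparse autoencoder: exactly k features fire per token,
    so the average L0 equals k by construction (hypothetical sketch)."""
    def __init__(self, d_model: int, d_sae: int, k: int):
        super().__init__()
        self.k = k  # the L0 hyperparameter discussed above
        self.W_enc = nn.Parameter(torch.randn(d_model, d_sae) * 0.01)
        self.W_dec = nn.Parameter(torch.randn(d_sae, d_model) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(d_sae))
        self.b_dec = nn.Parameter(torch.zeros(d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Encode, then keep only the k largest activations per token.
        acts = torch.relu((x - self.b_dec) @ self.W_enc + self.b_enc)
        topk = torch.topk(acts, self.k, dim=-1)
        sparse = torch.zeros_like(acts).scatter_(-1, topk.indices, topk.values)
        # Decode from the sparse feature vector.
        return sparse @ self.W_dec + self.b_dec

# Usage: reconstruction loss for one chosen L0 (k=32 is an arbitrary example).
sae = TopKSAE(d_model=768, d_sae=16384, k=32)
x = torch.randn(4, 768)            # stand-in for LLM residual-stream activations
loss = ((sae(x) - x) ** 2).mean()  # sweeping k traces the sparsity-reconstruction tradeoff
```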