This paper focuses on transfer learning for accelerating sequential decision making by leveraging offline data drawn from heterogeneous sources. Data from heterogeneous sources that differ in observed features, distributions, or unobserved confounders often render causal effects unidentifiable and bias naive estimates. To address this, we form an ambiguity set of structural causal models defined by integral constraints on joint densities. Optimizing a causal effect over such a set typically yields a nonconvex program whose solution sharply bounds the range of effects possible under heterogeneity or perturbation. For an efficient solution, we develop a hit-and-run sampler that explores the entire ambiguity set and combine it with a local optimization oracle to produce causal bound estimates that converge almost surely to the true bounds. To accommodate estimation errors, we relax the ambiguity set and establish explicit error-propagation guarantees by exploiting the Lipschitz continuity of causal effects. These causal bounds are incorporated into a bandit algorithm via arm elimination and truncated UCB indices, yielding optimal gap-dependent and minimax regret bounds. To handle estimation errors, we also develop a safe algorithm that incorporates noisy causal bounds. In the contextual bandit setting with function approximation, our method uses causal bounds to prune both the function class and the per-context action set, achieving matching upper and lower regret bounds with only logarithmic dependence on the function-class complexity. Our analysis precisely characterizes when and how causal side information accelerates online learning, and experiments on synthetic benchmarks demonstrate significant regret reduction in data-scarce or perturbed environments.
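The bandit construction described above can be sketched in miniature. The following is a hypothetical illustration, not the paper's algorithm: arms whose causal upper bound is dominated by another arm's causal lower bound are eliminated up front, and each surviving arm's UCB1 index is truncated at its causal upper bound. The Bernoulli environment inside `pull` and the specific bound values are assumptions for the sake of a runnable example.

```python
import numpy as np

def truncated_ucb(T, n_arms, causal_bounds, seed=None):
    """Sketch: UCB1 with arm elimination and index truncation driven by
    externally supplied causal bounds [l_a, u_a] on each arm's mean."""
    rng = np.random.default_rng(seed)
    l = np.array([b[0] for b in causal_bounds])
    u = np.array([b[1] for b in causal_bounds])
    # Arm elimination: an arm whose causal upper bound falls below the
    # best causal lower bound can never be optimal.
    active = [a for a in range(n_arms) if u[a] >= l.max()]
    counts = np.zeros(n_arms)
    sums = np.zeros(n_arms)

    def pull(a):
        # Stand-in Bernoulli environment (assumption, not from the paper).
        return float(rng.random() < 0.5 + 0.1 * a / max(n_arms - 1, 1))

    for t in range(1, T + 1):
        idx = {}
        for a in active:
            if counts[a] == 0:
                idx[a] = u[a]  # unexplored arm: index is its causal ceiling
            else:
                ucb = sums[a] / counts[a] + np.sqrt(2 * np.log(t) / counts[a])
                idx[a] = min(ucb, u[a])  # truncate index at causal upper bound
        a = max(idx, key=idx.get)
        r = pull(a)
        counts[a] += 1
        sums[a] += r
    return counts
```

Truncation ensures the optimism of the index never exceeds what the offline causal analysis already rules out, which is the mechanism by which tighter causal bounds translate into smaller regret.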