Soft-Hard ESG Constraints in Portfolio Optimization via Cost-Aware Bayesian Optimization of Deep Reinforcement Learning Agents

Abstract

The integration of Environmental, Social, and Governance (ESG) criteria into algorithmic portfolio optimization presents a fundamental challenge: how to balance financial performance with sustainability goals without sacrificing one for the other. Deep Reinforcement Learning (DRL) has emerged as a powerful approach for portfolio management, yet incorporating ESG constraints into the hyperparameter optimization process remains an open problem. We propose ECCA (ESG-Constrained Cost-Aware Expected Improvement), a novel Bayesian Optimization framework that combines hard and soft ESG constraints through dual Gaussian Process surrogates. The hard constraint ensures probabilistic feasibility by rejecting configurations below a user-defined ESG threshold, while an adaptive soft constraint progressively rewards higher ESG scores among feasible solutions. Through a controlled experimental study on a synthetic benchmark designed to isolate the soft constraint's effect, we demonstrate that ECCA achieves statistically significant improvements in ESG scores over hard-constraint-only baselines, as confirmed by both parametric and non-parametric tests at the α = 0.05 significance level, while maintaining equivalent risk-adjusted returns. We further validate ECCA on a real-world portfolio optimization task using 28 Dow Jones Industrial Average constituents with ESG scores derived from MSCI ratings. In a preliminary out-of-sample evaluation (March 2023–March 2024), ECCA achieves a Sharpe ratio of 2.63 versus 2.32 for the equal-weight (1/N) benchmark [1], while simultaneously attaining a higher ESG score (0.67 vs. 0.65), demonstrating that the framework's advantages extend from synthetic benchmarks to practical portfolio management. Our framework provides practitioners with interpretable control over the exploration-exploitation-ESG trade-off, offering a principled methodology for sustainable portfolio optimization.