Similar items: Reinforcement Learning and Stochastic Optimization