1. This paper proposes a new class of algorithms called Robust Blackbox Optimization (RBO) for Reinforcement Learning (RL).

2. RBO can recover gradients to high accuracy even when up to 23% of all measurements are corrupted.

3. RBO has been applied to MuJoCo robot control tasks and legged locomotion tasks, including path tracking for quadruped robots.
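
To make point 2 concrete, the following minimal sketch shows how a robust regression can recover a gradient from finite-difference measurements when a fraction of them is corrupted. It is an illustration under stated assumptions, not the authors' implementation: the toy quadratic objective, the function names `lad_gradient` and `demo`, the 20% corruption rate, and the use of iteratively reweighted least squares as the least-absolute-deviations solver are all hypothetical choices made for this example.

```python
import numpy as np


def lad_gradient(directions, diffs, iters=100, eps=1e-6):
    """Fit g minimizing sum |diffs - directions @ g| (least absolute
    deviations) with iteratively reweighted least squares (IRLS)."""
    # Start from the ordinary least-squares fit.
    g = np.linalg.lstsq(directions, diffs, rcond=None)[0]
    for _ in range(iters):
        resid = np.abs(diffs - directions @ g)
        # Weighting each row by 1/|residual| turns the squared loss into an
        # absolute loss, so measurements with large residuals lose influence.
        sqrt_w = 1.0 / np.sqrt(np.maximum(resid, eps))
        g = np.linalg.lstsq(
            directions * sqrt_w[:, None], diffs * sqrt_w, rcond=None
        )[0]
    return g


def demo(seed=0, dim=10, n=200, corrupt_frac=0.2, sigma=1e-3):
    """Compare robust and ordinary least-squares gradient estimates."""
    rng = np.random.default_rng(seed)
    center = rng.normal(size=dim)
    x0 = rng.normal(size=dim)

    def f(x):
        # Hypothetical blackbox objective with a known gradient x - center.
        return 0.5 * np.sum((x - center) ** 2)

    true_grad = x0 - center
    directions = rng.normal(size=(n, dim))
    # Finite-difference measurements: (f(x0 + sigma*d) - f(x0)) / sigma
    # is approximately the directional derivative d . grad f(x0).
    diffs = np.array(
        [(f(x0 + sigma * d) - f(x0)) / sigma for d in directions]
    )
    # Overwrite a fraction of the measurements with large arbitrary values.
    bad = rng.choice(n, size=int(corrupt_frac * n), replace=False)
    diffs[bad] = rng.normal(scale=100.0, size=bad.size)

    lad_est = lad_gradient(directions, diffs)
    ls_est = np.linalg.lstsq(directions, diffs, rcond=None)[0]
    lad_err = np.linalg.norm(lad_est - true_grad)
    ls_err = np.linalg.norm(ls_est - true_grad)
    return lad_err, ls_err
```

Calling `lad_err, ls_err = demo()` returns the estimation error of the robust fit and of plain least squares on the same corrupted data. In this toy setting the robust fit should stay close to the true gradient while the least-squares fit is pulled away by the corrupted measurements; the sketch does not attempt to reproduce the 23% threshold reported in the paper.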

The article appears generally trustworthy and reliable: it supports its claims with experiments on MuJoCo robot control tasks and legged locomotion tasks. The authors also describe their proposed algorithm, Robust Blackbox Optimization (RBO), in detail; it allows for off-policy updates and, as noted above, recovers gradients to high accuracy even when up to 23% of all measurements are corrupted. Furthermore, the authors provide evidence that RBO can effectively train policies in situations where other RL approaches fail due to adversarial noise.

The only potential bias in the article is that it does not explore counterarguments or alternative solutions to the problem it addresses. This is understandable, however, because it is a research paper proposing a new algorithm rather than a survey of existing solutions.