Commentary
- This is a spicy take, but the points are valid
- Averaging is a critical mistake
- They skip checking whether an answer exists at all
- Always answering guarantees hallucination
- It should assess the question and then generate, not generate and then assess.
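
A minimal sketch of the assess-then-generate order from the last point. Everything here is an assumption for illustration, not something taken from the work being commented on: `is_answerable`, `answer`, `ABSTAIN`, the stopword list, and the word-overlap heuristic are all hypothetical, and a real system would likely use a stronger answerability check.

```python
from typing import Callable

ABSTAIN = "I don't know."  # hypothetical abstention message

# Tiny illustrative stopword list (assumption, far from complete).
STOPWORDS = {"what", "who", "is", "the", "of", "a", "an", "in"}


def _content_words(text: str) -> set:
    """Lowercase the text, strip basic punctuation, and drop stopwords."""
    words = {w.strip("?.,!").lower() for w in text.split()}
    return {w for w in words if w and w not in STOPWORDS}


def is_answerable(question: str, context: str, min_overlap: int = 2) -> bool:
    """Toy check: does the context share enough content words with the question?"""
    shared = _content_words(question) & _content_words(context)
    return len(shared) >= min_overlap


def answer(question: str, context: str,
           generate: Callable[[str, str], str]) -> str:
    """Assess first; call the generator only when an answer plausibly exists."""
    if not is_answerable(question, context):
        return ABSTAIN  # abstain instead of generating and judging afterwards
    return generate(question, context)
```

The point of the ordering is that `generate` is never invoked for a question the check rejects, so there is no fabricated answer to filter out afterwards.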