BLOG

Nov 23, 2023

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Posted in category: robotics/AI

Q* appears to apply an RL technique that uses AI-generated data and teaches LLMs how to solve multi-step logic problems. Q* techniques could be applied to GPT-5, endowing it with excellent reasoning and retrieval skills. This may not be AGI, but it would be an extremely powerful LLM.


We’re on a journey to advance and democratize artificial intelligence through open source and open science.

1 Comment (comments are now closed)


  1. Lance says:

    What happens when “it” causes you to do something you don’t want to do?