Nov 23
2023

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Q* appears to apply an RL technique that uses AI-generated data and teaches LLMs how to solve multi-step logic problems. Q* techniques can be applied to GPT-5, endowing it with excellent reasoning and retrieval skills. This may not be AGI, but it is an extremely powerful LLM.

1 Comment so far

Lance says:

November 23, 2023 at 8:42 pm

What happens when “it” causes you to do something you don’t want to do?

