Considering the high-quality and/or thought-provoking takes he posts, this poster's follower count is far too low.
mike64_t · Oct 18, 12:33
I think the observation that LLMs are "bad tutors" in that they cannot precisely probe understanding is accurate. The fact that "upweighting the entire rollout" is stupid is also true. However, it's not obvious to me that the remedy for that is LLM reflection on "what went well". I think this runs into very similar issues of collapse risk or misallocation of supervision. Because while we might be sucking supervision through a straw, the only thing that's even worse is sucking tainted supervision through a straw.
Not that Mike is a niche poster, but I just wonder how many garbage accounts have 2 to 10 times as many followers as he does.