2024-09-19
- Toronto AI Safety Meetup: Weak-to-strong generalization
- Paper: Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
- Empirical methods
- Weak-to-Strong generalization
- Used various NLP classification tasks
- Training GPT-4 to predict task labels generated by GPT-2
- Fine-tuning on GPT-2 labels does better than 0-shot prompting
- Models can still be accurate with 99% random labels
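Toy sketch of the last point, to make the intuition concrete. This is not the paper's setup: there are no GPT models here, only a hypothetical 1-D two-class dataset and a nearest-centroid classifier, and all names (`sample`, `corrupt`, `fit_centroids`, `accuracy`) are made up for illustration. The assumption is that the noise is uniformly random, so it averages out and the small fraction of correct labels still shifts the class centroids in the right direction.

```python
import random


def sample(n, rng):
    """Draw n (x, y) pairs: x ~ N(+1, 1) for class 1, x ~ N(-1, 1) for class 0."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.0 if y == 1 else -1.0, 1.0)
        data.append((x, y))
    return data


def corrupt(data, random_fraction, rng):
    """Replace the given fraction of labels with uniformly random labels."""
    return [
        (x, rng.randint(0, 1) if rng.random() < random_fraction else y)
        for x, y in data
    ]


def fit_centroids(data):
    """Return the mean feature value for each label (class 0, class 1)."""
    sums = [0.0, 0.0]
    counts = [0, 0]
    for x, y in data:
        sums[y] += x
        counts[y] += 1
    return sums[0] / counts[0], sums[1] / counts[1]


def accuracy(centroids, data):
    """Classify each x by its nearest centroid and score against the true labels."""
    c0, c1 = centroids
    correct = sum((1 if abs(x - c1) < abs(x - c0) else 0) == y for x, y in data)
    return correct / len(data)


rng = random.Random(0)
train = sample(400_000, rng)   # clean labels, never shown to the model
test = sample(5_000, rng)      # clean held-out data for evaluation
noisy = corrupt(train, 0.99, rng)  # 99% of training labels are random

# How often the noisy labels agree with the truth (about 50.5% in expectation).
label_acc = sum(yn == y for (_, yn), (_, y) in zip(noisy, train)) / len(train)
model_acc = accuracy(fit_centroids(noisy), test)

print(f"noisy-label accuracy: {label_acc:.3f}")
print(f"model accuracy on clean test set: {model_acc:.3f}")
```

With this much training data, the classifier trained only on the noisy labels should score far above the labels themselves, close to the roughly 84% ceiling for this toy distribution. The same script with a smaller `random_fraction` (say 0.3) gives a rough analogue of training a student on an imperfect supervisor's labels.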