[r/ML] [D] unpopular opinion: instruct tuning is going to be a thing of the past.

Summary

This post argues that instruct tuning inherently harms models by trading knowledge for communication ability. It proposes instead a "snap-on communication head" that is applied after the model has decided its answer. The method aims to preserve core knowledge: the post reports a 0.0% change in MMLU and a much higher safety refusal rate (52% vs 8%) than the official instruct versions. Constraint following still needs improvement, but the approach suggests a way to avoid the "alignment tax".

