AI Dose

[Paper] Grounded Token Initialization for New Vocabulary in LMs for Generative Recommendation

Impact: 7/10

Summary

This paper analyzes the standard practice of initializing new vocabulary tokens in Language Models (LMs) for domain-specific tasks such as generative recommendation. Typically, these tokens are initialized as the mean of the existing vocabulary embeddings and then updated via supervised fine-tuning. The paper systematically demonstrates that this mean-initialization strategy causes all new token embeddings to collapse onto a single point, hindering effective representation learning for the new vocabulary.
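The collapse the paper describes follows directly from the initialization itself: if every new token starts at the same mean vector, the new embeddings are indistinguishable before any fine-tuning happens. A minimal sketch of this (using a toy NumPy embedding table; the array names and sizes are illustrative, not from the paper):

```python
import numpy as np

# Toy embedding table: 1000 existing vocabulary tokens, 64-dim embeddings.
rng = np.random.default_rng(0)
existing = rng.normal(size=(1000, 64))

# The standard practice the paper critiques: each new token's embedding
# is set to the mean of all existing embeddings, so every new row is
# identical before supervised fine-tuning begins.
mean_vector = existing.mean(axis=0)
new_tokens = np.tile(mean_vector, (8, 1))  # 8 new domain-specific tokens

# All new embeddings coincide exactly: every pairwise distance is zero.
pairwise = np.linalg.norm(new_tokens[:, None] - new_tokens[None, :], axis=-1)
print(np.allclose(pairwise, 0))  # True
```

Because gradients through a shared softmax/embedding layer treat identical rows identically at the start of training, the new tokens can remain hard to separate, which is the failure mode the paper targets with grounded initialization.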

