✨EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control
📝 Summary:
EntroPIC stabilizes entropy during long-term LLM training by adaptively tuning loss coefficients with Proportional-Integral (PI) control. Keeping entropy stable is meant to sustain efficient exploration and keep the model from settling into sub-optimal behaviors, which makes reinforcement learning for LLMs more stable over long runs.
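To make the control idea concrete, here is a minimal sketch of a generic discrete PI controller steering a toy "entropy" value toward a target. This is not the paper's implementation: the class name `PIController`, the `simulate` helper, the gains, the target, and the toy dynamics are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Generic discrete PI controller (hypothetical names and gains)."""
    target: float          # desired entropy level (assumed setpoint)
    kp: float = 0.5        # proportional gain (illustrative value)
    ki: float = 0.05       # integral gain (illustrative value)
    integral: float = 0.0  # running sum of past errors

    def update(self, measured: float) -> float:
        # Positive error means the measured entropy is above the target.
        error = measured - self.target
        self.integral += error
        # Output combines the current error with the accumulated error.
        return self.kp * error + self.ki * self.integral


def simulate(kp: float, ki: float, steps: int = 200) -> list[float]:
    """Toy loop: entropy drifts downward unless the coefficient pushes back."""
    controller = PIController(target=1.0, kp=kp, ki=ki)
    entropy = 1.5
    history = []
    for _ in range(steps):
        coeff = controller.update(entropy)
        # Toy dynamics (an assumption, not a model of LLM training):
        # a constant downward drift plus a correction proportional to -coeff.
        entropy += -0.05 - 0.5 * coeff
        history.append(entropy)
    return history


if __name__ == "__main__":
    print("P only:", round(simulate(kp=0.5, ki=0.0)[-1], 3))
    print("PI    :", round(simulate(kp=0.5, ki=0.05)[-1], 3))
```

In this toy setup the proportional-only run settles below the target because of the constant drift, while the run with the integral term returns to the target. That steady-state correction is the usual reason for adding the integral term; how the controller output maps onto actual loss coefficients is described in the paper and repository linked below.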
🔹 Publication Date: Nov 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.15248
• PDF: https://arxiv.org/pdf/2511.15248
• Project Page: https://huggingface.co/spaces/yangkaiSIGS/entropic
• Github: https://github.com/yk7333/EntroPIC
🔹 Models citing this paper:
• https://huggingface.co/hunterbown/shannon-control-unit
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/yangkaiSIGS/entropic
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLM #MachineLearning #ReinforcementLearning #ControlTheory #DeepLearning