
2025-06-26 15:27:00

Tencent’s tech team has optimized DeepSeek’s open-source DeepEP communication framework, boosting its performance across different network environments, according to the Chinese AI startup. Testing showed a 100% improvement on RoCE networks and a 30% gain on InfiniBand (IB), offering more efficient solutions for AI model training. On GitHub, DeepSeek acknowledged that the Chinese tech giant’s contribution had led to a “huge speedup.” DeepEP is a communication library tailored for mixture-of-experts (MoE) models and expert parallelism (EP), supporting high-throughput, low-latency GPU kernels and low-precision computing, including FP8. Tencent’s Starlink Networking team identified two main bottlenecks: underutilized dual-port NIC bandwidth and CPU control latency. After targeted optimizations, performance doubled on RoCE and improved by 30% on IB. The enhanced framework is now fully open-source and has been deployed in training Tencent’s Hunyuan large model, demonstrating strong versatility in environments built on Tencent’s Starlink and H20 servers, Chinese tech media outlet iThome reported. [iThome, in Chinese]
