SameTime WMT Series on GrepCode

https://www.grepcode.cn/wmt/index.html
Recent content in SameTime WMT Series on GrepCode
Generator: Hugo. Source language: zh-CN. Last updated: Tue, 05 May 2026 00:00:00 +0000

SameTime WMT Series: Phase 6 Transformer, the Jump from 3 to 11 and the Road to the Academic Baseline
https://www.grepcode.cn/wmt/008-wmt-phase6-transformer.html
Tue, 05 May 2026 00:00:00 +0000
The Attention ceiling is 3.77; the word-level Transformer reaches 11.49 and the small BPE model 10.66. Next step: a large BPE d512 model to chase the academic baseline of 20-25.

SameTime WMT Series: Attention, from a Temporal Chain to a Fully Connected Adjacency Graph
https://www.grepcode.cn/wmt/007-wmt-phase2-attention.html
Mon, 04 May 2026 00:00:00 +0000
An RNN locks the relationships among all words into a single temporal chain; Attention puts each word into its own bucket and uses an adjacency matrix to preserve the fully connected graph. SoftBLEU's next bottleneck: the vocabulary dilutes the gradient.

SameTime WMT Series: Experiment Overview, Technique Combinations and Ceilings from RNN to Attention
https://www.grepcode.cn/wmt/009-wmt-expr-overview.html
Mon, 04 May 2026 00:00:00 +0000
A summary of all experiment IDs and best BLEU scores for Phase 1 RNN and Phase 2 Attention.

SameTime WMT Series: The Algorithmic Cost of Differentiable BLEU, from Blackboard Formulas to GPU Reality
https://www.grepcode.cn/wmt/006-wmt-softbleu-cost.html
Sun, 03 May 2026 00:00:00 +0000
The SoftBLEU theory is correct, but running Python loops on a GPU is not part of that correctness.

SameTime WMT Series: The Gradient Function Determines How Learning Behaves
https://www.grepcode.cn/wmt/005-wmt-gradient-theory.html
Sat, 02 May 2026 00:00:00 +0000
The distance from sin(0.76) to tanh(3.06) is not a matter of one function being better than the other; it is the self-consistency of the gradient direction that determines how far a model can go.

SameTime WMT Series: Phase 1, from RNN Memory to LSTM Gating
https://www.grepcode.cn/wmt/004-wmt-phase1.html
Mon, 27 Apr 2026 00:00:00 +0000
SameTime WMT Phase 1 study notes, split into 1.0 vanilla RNN (understanding how memory works) and 1.1 LSTM (addressing vanishing gradients), compared step by step.

SameTime WMT Series: Phase 0, Skeleton of the Experiment Foundation
https://www.grepcode.cn/wmt/003-wmt-phase0.html
Sun, 26 Apr 2026 00:00:00 +0000
Kicks off SameTime's WMT study series, focusing on the skeleton experiments in benchmark/wmt/phase0 and on building the translation pipeline.