SwiGLU Activation Function
Build Your Own Llama 3 Architecture from Scratch with PyTorch | 爱搜AI工具资源导航站
Add Swiglu activation function · Issue #128712 · pytorch/pytorch · GitHub
Exploring SwiGLU: The Activation Function Powering Modern LLMs | by ...
[Pytorch] Swiglu implementation not aligned with jiterator version in ...
python - How to implement SwiGLU activation? Why does SwiGLU takes in ...
Triton vs PyTorch: SwigLU activation function performance | Suraj kumar ...
Why Do Large Language Models All Use SwiGLU as the Activation Function? - Zhihu
performance of swiglu operator · Issue #734 · facebookresearch/xformers ...
Tutorial on Scaled Dot-Product Attention with PyTorch Implementation ...
[Deep Learning] LLaMA Model Summary (RoPE (Rotary Positional Embedding), GELU, SwiGLU ...
SwiGLU with GELU: Redefining the Art of Activation-Function Design for Feed-Forward Networks - CSDN Blog
GitHub - ZiyuanMa/nlp_transformer: PyTorch implementation of nlp ...
Beyond ReLU: Discovering the Power of SwiGLU | by heping_LU | Medium
SwiGLU Activation Function Study Notes - CSDN Blog
Why Do Large Language Models All Use SwiGLU as the Activation Function? - Alibaba Cloud Developer Community
Building Neural Networks in PyTorch | AI Learning Journey
SwiGLU with SiLU: The Activation-Function Revolution and Architecture-Design Essentials of the Large-Model Era - CSDN Blog
Why Do Large Language Models All Use SwiGLU as the Activation Function? - Tencent Cloud Developer Community
GitHub - nanowell/Differential-Transformer-PyTorch: PyTorch ...
Implementing PyTorch Flash Attention for Scalable Deep Learning Models ...
Boosting Llama 2 Performance with RMSNorm: PyTorch and TensorFlow ...
PyTorch Activation Functions for Deep Learning • datagy
LLaMA-2 from the Ground Up - by Cameron R. Wolfe, Ph.D.
From GPT-2 to gpt-oss: Analyzing the Architectural Advances
Activation Functions: SwiGLU - CSDN Blog
Demystifying SwiGLU: The Mysterious Force Behind Soaring Neural-Network Performance! - Zhihu
How Does SwiGLU Improve Neural-Network Model Performance? - Zhihu
A Detailed Explanation of the SwiGLU Activation Function - Zhihu
GLU Variants: ReGLU, GEGLU, SwiGLU - CSDN Blog
An Introduction to the Swish and SwiGLU Activation Functions - CSDN Blog
Large Model Series: SwiGLU and the GLU Gated Linear Unit Explained - CSDN Blog
Transformer Activation Functions and their Details | JoeLogs
SwiGLU: The Activation Function Powering Modern LLMs | by Saeed Mehrang ...
Transformer Design Guide (Part 2: Modern Architecture) | Rohit Bandaru
The Evolution of Llama: From Llama 1 to Llama 3.1 | Towards Data Science
Large-Model Study Notes: RMS Norm and the SwiGLU Activation Function in the Llama 3 Architecture - 技术栈
Large-Model Fundamentals | Activation Functions | From ReLU to SwiGLU - Zhihu
All the Activation Functions
Building an Efficient Machine Learning API
Reading the SwiGLU Paper - CSDN Blog
What Is PyTorch? A Thorough Beginner's Guide to Its Features, Use Cases, and Installation - Practical Python Programming
Decoder-Only Transformers: The Workhorse of Generative LLMs
Linear Layers and Activation Functions in Transformer Models ...
Reading the SwiGLU Paper | by MLTalks | Medium
[Large Models] The SwiGLU Activation Function in Detail - CSDN Blog
[Frequent NLP Interview Questions - LLM Architecture] What Are the Benefits of SwiGLU over ReLU? - CSDN Blog
[Notes] The SwiGLU Activation Function, Commonly Used in Large Models - CSDN Blog
Neural-Network Activation Functions: From ReLU to the Cutting-Edge SwiGLU - 技术栈
LLaMA Source Code Study · model.py [2]: The SwiGLU Activation Function - CSDN Blog
Discovering SwiGLU: The Activation Function Powering Modern LLMs
[Hand-Coding LLMs - Mixtral-8x7B] A PyTorch Implementation - Zhihu
The Evolution of Activation Functions: From Sigmoid to SwiGLU, Deep Learning's Neural Triggers - ITPUB Blog
What is SwiGLU? • Carlos Roldán
[Large-Model Architecture Notes] SwiGLU, an Activation Function Commonly Used in Large Models - Zhihu
SwiGLU: The FFN Upgrade I Use to Get Free Performance - DEV Community
SwiGLU: The Gated Activation Fueling Modern LLMs
[P] Coding LLaMA 2 from scratch in PyTorch, with step by step ...
The SwiGLU Activation Function in DeepSeek: Does SwiGLU Include a Bias or Not? - CSDN Blog
Python Made Simple Series: Transformer Optimization Techniques - KV Cache, RMSNorm, SwiGLU, GQA, RoPE, Rotary ...
An In-Depth Analysis of How LLaMA Improves the Underlying Transformer Architecture - Huawei Cloud Developer Alliance - Cnblogs
SwiGLU: GLU Variants Improve Transformer (2020) – Naoki Shibuya
Coding the Swish Activation Function in PyTorch: Step-by-Step Guide ...
PyTorch Source Code Analysis (2): How Dynamic Graphs Work - Zhihu
Squared ReLU and Laplace functions · Issue #1 · lucidrains/Mega-pytorch ...
Introducing Llama 2 | FeedForward with SwiGLU - CSDN Blog
The SwiGLU Activation Function Paper: GLU Variants Improve Transformer - Zhihu
Large Model Series: SwiGLU and the GLU Gated Linear Unit Explained - 51CTO Blog
SparseLLM/swiglu-45B · Hugging Face
LLMs: llama3-from-scratch (Implementing and Walking Through LLaMA-3 from Scratch in PyTorch ...
Study Notes: Stanford CS336 Language Modeling from Scratch [5] | 🍒 Han ...
What Exactly Does SwiGLU Contribute in Deep Learning? - Zhihu
Mastering LLaMA 3 from Scratch with Python
Deepseek
An Introduction to the LLaMA Model Architecture - Zhihu
Is SwiGLU the Better Choice? - Zhihu
A Brief Summary of the SwiGLU Activation Function - Zhihu
Neural-Network Activation Functions (5): The Gated Family - GLU, Swish, and SwiGLU - Zhihu
Reproducing the Pathways Language Model with Colossal-AI - CSDN Blog
Intel Smooth-SwiGLU: 34% Faster FP8 LLM Training - AI.x AIGC Community - 51CTO.COM
Breaking Down the SwiGLU Activation Function from Its Formula - Zhihu
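The common thread through the results above is the Llama-style SwiGLU feed-forward block. As a reference point, here is a minimal PyTorch sketch of that construction; the class and projection names (`SwiGLUFeedForward`, `gate_proj`, `up_proj`, `down_proj`) and the dimensions are illustrative choices, not taken from any single linked post:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUFeedForward(nn.Module):
    """Llama-style FFN: down_proj(SiLU(gate_proj(x)) * up_proj(x))."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        # Llama-family models typically omit biases on all three projections.
        self.gate_proj = nn.Linear(dim, hidden_dim, bias=False)
        self.up_proj = nn.Linear(dim, hidden_dim, bias=False)
        self.down_proj = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU = Swish (SiLU) on the gate branch, multiplied elementwise
        # by the linear "up" branch, then projected back down to `dim`.
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))

ffn = SwiGLUFeedForward(dim=64, hidden_dim=172)  # hidden_dim is often ~8/3 * dim
out = ffn(torch.randn(2, 10, 64))
print(out.shape)  # torch.Size([2, 10, 64])
```

Because SwiGLU uses three weight matrices instead of the classic FFN's two, the hidden width is usually scaled by roughly 2/3 (e.g. 8/3 × dim instead of 4 × dim) to keep the parameter count comparable.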