Techniques
Contents
Series: LeetGPU - Hand-Writing CUDA Operators from Scratch
  1D Convolution
  2D Convolution
  2D Jacobi Stencil
  2D Max Pooling
  2D Subarray Sum
  3D Convolution
  3D Subarray Sum
  All-Pairs Shortest Paths
  Attention with Linear Biases (ALiBi)
  Batch Normalization
  Batched Matrix Multiplication
  BFS Shortest Path
  Categorical Cross Entropy Loss
  Causal Self-Attention
  Color Inversion
  Count 2D Array Element
  Count 3D Array Element
  Count Array Element
  Dot Product
  Fast Fourier Transform (FFT)
  FP16 Batched Matrix Multiplication
  FP16 Dot Product
  Gaussian Blur
  Gaussian Error Gated Linear Unit
  General Matrix Multiplication (GEMM)
  GPT-2 Transformer Block
  Histogramming
  INT8 Quantized MatMul
  Interleave Arrays
  K-Means Clustering
  Leaky ReLU
  Linear Self-Attention
  Logistic Regression
  Matrix Addition
  Matrix Copy
  Matrix Multiplication
  Matrix Power
  Matrix Transpose
  Max Subarray Sum
  Mean Squared Error
  Merge Sorted Arrays
  MoE Top-K Gating
  Monte Carlo Integration
  Multi-Agent Simulation (Boids)
  Multi-Head Attention
  Nearest Neighbor
  Ordinary Least Squares
  Parallel Merge
  Prefix Sum
  Radix Sort
  Rainbow Table
  Reduction
  ReLU
  Reverse Array
  RGB to Grayscale
  RMS Normalization
  Rotary Positional Embedding (RoPE)
  Sigmoid Activation
  Sigmoid Linear Unit (SiLU)
  Simple Inference
  Sliding Window Self-Attention
  Softmax
  Softmax Attention
  Sorting
  Sparse Matrix-Dense Matrix Multiplication
  Sparse Matrix-Vector Multiplication
  Stream Compaction
  Subarray Sum
  Swish-Gated Linear Unit
  Top K Selection
  Top-p Sampling (Nucleus)
  Value Clipping
  Weight Dequantization
  Vector Addition
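Series: Transformer Deep Dive (Transformer Insider)
Using Linux Networking Mechanisms to Bridge Isolated Networks
Understanding RPC Semantics in Depth: From Maybe Delivery to Exactly Once
Understanding Memory Consistency in Depth: From Atomic Operations to Instruction Set Architectures
A Deep Dive into Traffic Governance: From Retry Storms to Adaptive Rate-Limiting Algorithms
Cyclic Redundancy Check Codes Explained: The Power to Both Detect and Correct Errors
The Three Forms of Git Merge Request Explained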