Planner/Generator/Evaluator orchestration harness for Claude Code (and Codex)

Shell 33 2 Updated May 2, 2026

Ideas → Roadmap → Steps — a planning plugin for Claude Code

20 Updated Apr 14, 2026

FlashInfer: Kernel Library for LLM Serving

Python 5,874 1,094 Updated Jun 30, 2026

NCCL communication API layer, and transport layer created from first principles.

C++ 16 Updated Aug 20, 2025

NCCL Tests

Cuda 1,570 385 Updated Jun 25, 2026

A Quirky Assortment of CuTe Kernels

Python 1,037 139 Updated Jun 29, 2026

Optimized primitives for collective multi-GPU communication

C++ 4,834 1,315 Updated Jun 30, 2026

cuDNN Frontend is NVIDIA's modern, open-source entry point to the cuDNN library and a growing collection of high-performance open-source kernels.

Python 857 193 Updated Jun 29, 2026

An nvImageCodec library of GPU- and CPU-accelerated codecs featuring a unified interface

Jupyter Notebook 150 16 Updated Apr 14, 2026

TRaSH-Guides is a comprehensive collection of guides for Radarr, Sonarr, and related media management applications.

Markdown 3,041 309 Updated Jun 28, 2026

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,763 326 Updated Oct 19, 2024

The official PyTorch implementation of the paper "Human Motion Diffusion Model"

Python 4,057 460 Updated Oct 1, 2025

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,975 1,928 Updated Jun 26, 2026

RTX compute samples

C++ 71 13 Updated Jun 17, 2023

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda 1,834 462 Updated Oct 9, 2023