
info

AI engineer
github repo
LinkedIn
LLM inference & ML systems
Contributing to vLLM (and llm-compressor), SGLang, and ONNX Runtime; so far on quantization, SD, and attention kernels (my PRs)

Pinned Posts

2026-04-21 Exploring torch.compile
2025-12-02 Int4 to FP16 dequantization optimization
2024-09-27 Matryoshka Representation Learning Review

all posts

2026-04-21 Exploring torch.compile
2026-02-07 FFT on GPU
2025-12-28 Data_format
2025-12-02 Int4 to FP16 dequantization optimization
See all...

categories

Under construction...

Under construction...

TIP

Working...