All categories (55)
-
[Arxiv 24] REST: Retrieval-Based Speculative Decoding
* Speculative Decoding
* RAG
Contribution: instead of using a draft model, REST retrieves draft tokens from a datastore based on the preceding tokens.
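The retrieval step described above can be sketched roughly as follows. This is a toy illustration and not the paper's implementation: `retrieve_draft`, `max_suffix`, `draft_len`, and the tiny `datastore` are all hypothetical names and values chosen for the example. It finds the longest suffix of the context that appears in the datastore and returns the most frequent continuation as the draft. Verification of the draft by the target model, which speculative decoding still requires, is omitted.

```python
from collections import Counter


def retrieve_draft(context, datastore, max_suffix=3, draft_len=2):
    """Return draft tokens that followed the longest matching suffix of `context`.

    Toy sketch: a linear scan over token lists stands in for a real index.
    """
    # Try the longest suffix first, then fall back to shorter ones.
    for n in range(min(max_suffix, len(context)), 0, -1):
        suffix = context[-n:]
        continuations = Counter()
        for seq in datastore:
            # Stop early enough that at least one token follows the match.
            for i in range(len(seq) - n):
                if seq[i:i + n] == suffix:
                    continuations[tuple(seq[i + n:i + n + draft_len])] += 1
        if continuations:
            # The most frequent continuation becomes the draft.
            return list(continuations.most_common(1)[0][0])
    return []  # nothing retrieved; fall back to ordinary decoding


# Hypothetical datastore of previously seen token sequences.
datastore = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "cat", "sat", "down"],
    ["a", "cat", "sat", "on", "the", "rug"],
]

draft = retrieve_draft(["my", "cat", "sat"], datastore)
print(draft)
```

Here the three-token suffix has no match, so the lookup falls back to `["cat", "sat"]`, whose most common continuation in the datastore is used as the draft.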
2025.02.12 -
pdb
Though you can't step backwards through code execution, the next best thing pdb offers is jumping between stack frames. Use "w" to see where you are in the stack (the newest frame is at the bottom), and u(p) or d(own) to move up or down the stack, for example to reach the frame whose function call stepped you into the current frame.
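To get a feel for the frame ordering without opening an interactive session, here is a small sketch that uses the standard `traceback.extract_stack()` in place of pdb's "w". It lists frames oldest first, so the newest frame comes last, which matches what "w" prints. The function names `inner` and `outer` are made up for the example.

```python
import traceback


def inner():
    # In an interactive session, calling breakpoint() here and typing "w"
    # would show the same frames, with inner() at the bottom.
    # Typing "u" would then move up into outer().
    return [frame.name for frame in traceback.extract_stack()]


def outer():
    return inner()


frames = outer()
# The last two entries are the caller and then the current frame.
print(frames[-2:])
```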
2025.01.07 -
torch hook & torchview
Code that uses torch hooks to access each layer at run time:
def get_all_layers(module):
    ret = []
    children = list(module.children())
    if children == []:
        return module
    else:
        for child in children:
            try:
                ret.extend(get_all_layers(child))
            except TypeError:
                ret.append(get_all_layers(child))
    return ret

# for post-forward hook
def hook_fn(module..
2025.01.02 -
open-mpi build
./configure --with-cuda=/usr/local/cuda-11.4 --prefix=/usr/local
make -j$(nproc)
sudo make install
* nproc: prints the number of CPU cores available on the system.
2024.12.31 -
PPT tips
Shortcuts to memorize: Shift + E = eyedropper; Shift + R + R = bring object to front. Icon site: https://www.flaticon.com/
2024.11.27 -
cuda stream priority
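Priority range check:
#include <cstdio>
#include <cuda_runtime.h>

int main(int argc, char** argv) {
    // a receives the least priority, b the greatest; lower numbers mean higher priority.
    int a, b;
    cudaDeviceGetStreamPriorityRange(&a, &b);
    printf("%d to %d\n", a, b);
    return 0;
}

Signature:
cudaStreamCreateWithPriority ( cudaStream_t* pStream, unsigned int flags, int priority )

Example:
cudaStream_t mystream;
cudaStreamCreateWithPriority(&mystream, cudaStreamDefault, -1);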
2024.11.19