kubecoder's recent timeline updates

kubecoder

V2EX member #414769, joined on 2019-05-22 21:43:52 +08:00

Today's activity rank 15946


It turns out middle-aged men really have no one who cares about them

Life • kubecoder • Mar 4 • Last replied by gurachin


kubecoder's recent replies

2 days ago

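Replied to a topic by KaiWuBOSS › Local LLM › I built a tool that takes a 30B model on an 8GB GPU from 3 tok/s to 21 tok/s; notes on the technical findings

@shen09darkareas Thanks. I was careless and just installed the latest version of openssl, which still didn't work. Looking more carefully, the installer you named was Win64OpenSSL-3_6_2.exe. I reinstalled 3.6.2 and it now runs. Thanks again.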

4 days ago

Replied to a topic by KaiWuBOSS › Local LLM › I built a tool that takes a 30B model on an 8GB GPU from 3 tok/s to 21 tok/s; notes on the technical findings

PS E:\kaiwu> kaiwu run qwen3-4b --reset

██╗ ██╗ █████╗ ██╗██╗ ██╗██╗ ██╗
██║ ██╔╝██╔══██╗██║██║ ██║██║ ██║
█████╔╝ ███████║██║██║ █╗ ██║██║ ██║
██╔═██╗ ██╔══██║██║██║███╗██║██║ ██║
██║ ██╗██║ ██║██║╚███╔███╔╝╚██████╔╝
╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚══╝╚══╝ ╚═════╝
Local LLM deployer vv0.1.5 · llama.cpp b8864
by llmbbs.ai · local AI tech community

[1/6] Probing hardware...
GPU: NVIDIA GeForce RTX 3060 (SM86, 12288 MB VRAM, 360 GB/s)
RAM: 31 GB DDR4
OS: windows amd64

[2/6] Selecting configuration...
Model: Qwen3-4B (dense, 4B)
Quant: q8-0 (4.4 GB)
Mode: full_gpu
Accel: Flash Attention

[3/6] Checking files...
Using bundled iso3 binary: llama-server-cuda.exe
Binary: llama-server-cuda.exe [cached]
Model: Qwen3-4B-Q8_0.gguf [cached]

[4/6] Preflight check...
llama-server does not support iso3, falling back to q8_0/q4_0
✓ VRAM sufficient

[5/6] Warmup benchmark...
Cache cleared, re-probing
Probe 1: ctx=32K ... OOM
Probe 2: ctx=16K ... OOM
Probe 3: ctx=8K ... OOM
⚠️ Warmup failed: all ctx probes failed (tried down to 4K)
Using default parameters

[6/6] Starting server...
Waiting for llama-server to be ready (port 11434)...
⚠️ Insufficient VRAM, lowering context to 16K and retrying...
Waiting for llama-server to be ready (port 11434)...
⚠️ Insufficient VRAM, lowering context to 8K and retrying...
Waiting for llama-server to be ready (port 11434)...
Error: failed to start llama-server: startup failed 3 times in a row; cannot run even at the minimum context (4K)

NVIDIA GeForce RTX 3060: 12288 MB VRAM
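Model Qwen3-4B: ~4505 MB
KV cache (4K, q4_0): ~144 MB
Estimated total required: ~5673 MB

Suggestions:
1. Run kaiwu run qwen3-4b --reset to re-probe parameters
2. The model is small but still hits OOM; this may be a parameter configuration problem. Please upgrade to the latest version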

Usage:
kaiwu run <model> [flags]

Flags:
--bench Run benchmark after starting
--ctx-size int Manually set the context size (0 = auto)
--fast Skip warmup, use cached profile
-h, --help help for run
--llama-server string Use a custom llama-server binary (full path)
--reset Clear the cache and re-run warmup to probe for optimal parameters

PS E:\kaiwu>
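
The numbers in the error report above can be sanity-checked with a little arithmetic. The sketch below is not part of kaiwu; the function names are hypothetical, and reading the gap between the reported total and the two listed items as a fixed runtime overhead is an assumption, since the log does not say what it covers. It only shows that the estimate leaves a large share of the card's VRAM unused, which fits the log's own hint that the OOM is a configuration problem rather than a real capacity limit.

```python
# Figures copied from the error report in the log (all in MB).
VRAM_TOTAL_MB = 12288      # RTX 3060 VRAM as probed
MODEL_MB = 4505            # Qwen3-4B weights
KV_CACHE_4K_MB = 144       # KV cache at 4K context, q4_0
REPORTED_TOTAL_MB = 5673   # estimated total requirement printed by the tool


def implied_overhead_mb(reported_total: int, model: int, kv_cache: int) -> int:
    """Part of the reported total not explained by weights plus KV cache.

    Assumption: this remainder is some fixed runtime overhead; the log
    does not itemize it.
    """
    return reported_total - model - kv_cache


def headroom_mb(vram_total: int, reported_total: int) -> int:
    """VRAM left over if the reported estimate were accurate."""
    return vram_total - reported_total


overhead = implied_overhead_mb(REPORTED_TOTAL_MB, MODEL_MB, KV_CACHE_4K_MB)
headroom = headroom_mb(VRAM_TOTAL_MB, REPORTED_TOTAL_MB)

print(f"implied overhead: {overhead} MB")
print(f"headroom on a 12 GB card: {headroom} MB")
```

With these figures the estimate uses less than half of the available VRAM, so a failure at 4K context points at the launch parameters or the binary rather than at the model size.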

Feb 28

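Replied to a topic by kubecoder › Life › It turns out middle-aged men really have no one who cares about them

@eventlooped Thanks for the criticism. I have elderly parents above me, young kids below me, and a wife in the middle who needs looking after too. Just going to work every day already feels like running over capacity, and on weekends I'm the main childcare provider. Squeezed like this, a sip of liquor is already a rare treat. I hadn't realized that Shanxi folks (老西儿) don't drink sauce-aroma baijiu.
For 2026 I'm setting myself a FLAG: jog 25 km every week.
As for alcohol, a little is a pleasure and that is enough. After all, to break even on my pension I need to live to 73 (the male average is 67.7).
Pleasing myself: for this one I still need some expert to write me a guide. I'd love to properly play through the games in my Steam library (Mount & Blade, Red Dead Redemption, Horizon, World of Warcraft), but I really don't have the means right now.
I'd like a change of work environment too.
I'd like a different home too.
Our household's place in the queue for an electric-vehicle license plate should come up this year as well. Do I buy a car for around 100,000 yuan to make do for now, or...?
One step at a time.
Over the holiday I watched a pretty good variety show called 喜人. It seems to have run for several seasons; I've only watched the latest one and haven't finished it yet, but I recommend it to everyone. I really like the Zhuge Liang character in it. One of his lines, "I, I've got a pile of things to deal with," had me bursting out laughing.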

Feb 28

Replied to a topic by kubecoder › Life › It turns out middle-aged men really have no one who cares about them

@CloudG You didn't let me off the hook either, brother.

Dec 9, 2025

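Replied to a topic by jchencode › Android › Thoughts on the Doubao phone

@kubecoder Who cares, just go for it. Enjoy it first and tighten things up gradually later; private information is already flying around everywhere these days anyway.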

Dec 9, 2025

Replied to a topic by jchencode › Android › Thoughts on the Doubao phone

By your logic, we might as well stop using Baidu Maps and all the phone cloud-storage services too.

Nov 9, 2020

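Replied to a topic by ccming › iPhone › Now every time I pay with Apple Pay the transit card pops up first, so awkward

@wasgay Which merchants accept payment by transit card, apart from supermarkets?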
