[22:50:44Z] === predator-a3b-ngl-matrix run_id=b37836bf-d1a2-4d6b-a732-aff89da1fa07 ===
[22:50:44Z] Output: /Users/slobodan/projects/WeeyugaWeb/docs/BENCHMARKS/runs/b37836bf-d1a2-4d6b-a732-aff89da1fa07.jsonl
[22:50:44Z] Mirror: predator:D:\WeeyugaBench\runs\b37836bf-d1a2-4d6b-a732-aff89da1fa07\
[22:50:44Z] Plan: NGL=6 12 24 on ctx=32768
[22:50:44Z] Creating run-mirror dir on Predator: D:\WeeyugaBench\runs\b37836bf-d1a2-4d6b-a732-aff89da1fa07
[22:50:47Z] >>> CELL: NGL=6
[22:50:47Z] Stopping llama-server on Predator...
[22:50:51Z] Starting llama-server with Qwen3-30B-A3B-UD-IQ2_M.gguf, NGL=6, ctx=32768...
[22:50:54Z] Waiting for llama-server (NGL=6, up to 12 min — 32k KV cache + 30B mmap takes a while)...
[22:51:28Z] llama-server up after 25s
[22:51:29Z] Running llama-bench for ngl6...
[22:52:52Z] Running prompts for NGL=6 (cold + 3 warm per prompt)...
[22:52:52Z] prompt=hello cold
[22:53:16Z] prompt=hello warm 1
[22:53:36Z] prompt=hello warm 2
[22:53:58Z] prompt=hello warm 3
[22:54:31Z] prompt=P-MEDIUM cold
[22:56:19Z] prompt=P-MEDIUM warm 1
[22:57:31Z] prompt=P-MEDIUM warm 2
[22:59:03Z] prompt=P-MEDIUM warm 3
[23:00:44Z] prompt=P-HARD cold
[23:03:19Z] prompt=P-HARD warm 1
[23:04:52Z] prompt=P-HARD warm 2
[23:07:25Z] prompt=P-HARD warm 3
[23:09:58Z] >>> CELL: NGL=12
[23:09:58Z] Stopping llama-server on Predator...
[23:10:03Z] Starting llama-server with Qwen3-30B-A3B-UD-IQ2_M.gguf, NGL=12, ctx=32768...
[23:10:06Z] Waiting for llama-server (NGL=12, up to 12 min — 32k KV cache + 30B mmap takes a while)...
[23:10:13Z] llama-server up after 6s
[23:10:14Z] Running llama-bench for ngl12...
[23:11:25Z] Running prompts for NGL=12 (cold + 3 warm per prompt)...
[23:11:25Z] prompt=hello cold
[23:12:07Z] prompt=hello warm 1
[23:12:18Z] prompt=hello warm 2
[23:12:27Z] prompt=hello warm 3
[23:12:46Z] prompt=P-MEDIUM cold
[23:13:51Z] prompt=P-MEDIUM warm 1
[23:15:27Z] prompt=P-MEDIUM warm 2
[23:17:01Z] prompt=P-MEDIUM warm 3
[23:18:11Z] prompt=P-HARD cold
[23:20:40Z] prompt=P-HARD warm 1
[23:22:13Z] prompt=P-HARD warm 2
[23:24:02Z] prompt=P-HARD warm 3
[23:26:26Z] >>> CELL: NGL=24
[23:26:26Z] Stopping llama-server on Predator...
[23:26:31Z] Starting llama-server with Qwen3-30B-A3B-UD-IQ2_M.gguf, NGL=24, ctx=32768...
[23:26:34Z] Waiting for llama-server (NGL=24, up to 12 min — 32k KV cache + 30B mmap takes a while)...
[23:26:44Z] llama-server up after 9s
[23:26:45Z] Running llama-bench for ngl24...
[23:28:17Z] Running prompts for NGL=24 (cold + 3 warm per prompt)...
[23:28:17Z] prompt=hello cold
[23:28:28Z] prompt=hello warm 1
[23:28:44Z] prompt=hello warm 2
[23:28:54Z] prompt=hello warm 3
[23:29:07Z] prompt=P-MEDIUM cold
[23:30:14Z] prompt=P-MEDIUM warm 1
[23:31:14Z] prompt=P-MEDIUM warm 2
[23:32:25Z] prompt=P-MEDIUM warm 3
[23:33:33Z] prompt=P-HARD cold
[23:35:32Z] prompt=P-HARD warm 1
[23:36:44Z] prompt=P-HARD warm 2
[23:38:41Z] prompt=P-HARD warm 3
[23:40:49Z] Stopping llama-server on Predator...
[23:40:53Z] === bench complete ===
[23:40:53Z] Mirroring run-dir to Predator...
[23:41:01Z] On-device mirror at predator:D:\WeeyugaBench\runs\b37836bf-d1a2-4d6b-a732-aff89da1fa07\