[23:47:58Z] === predator-a3b-ctx-sweep run_id=ad28cb95-5134-40b5-8514-c8a381f83d87 ===
[23:47:58Z] Output: /Users/slobodan/projects/WeeyugaWeb/docs/BENCHMARKS/runs/ad28cb95-5134-40b5-8514-c8a381f83d87.jsonl
[23:47:58Z] Plan: ngl=36/ctx=32k, ngl=6/ctx=64k, ngl=6/ctx=128k
[23:47:58Z] Creating run-mirror dir on Predator: D:\WeeyugaBench\runs\ad28cb95-5134-40b5-8514-c8a381f83d87
[23:48:01Z] >>> CELL: ngl36-ctx32k (NGL=36 ctx=32768)
[23:48:01Z] Stopping llama-server on Predator...
[23:48:05Z] Starting llama-server (key=ngl36-ctx32k, NGL=36, ctx=32768)...
[23:48:08Z] Waiting for llama-server (key=ngl36-ctx32k, up to 15 min — large KV cache allocation)...
[23:48:18Z] llama-server up after 9s
[23:48:21Z] Running llama-bench for ngl36-ctx32k (NGL=36)...
[23:52:06Z] Running prompts for ngl36-ctx32k (cold + 3 warm per prompt)...
[23:52:06Z] prompt=hello cold
[23:52:40Z] prompt=hello warm 1
[23:53:02Z] prompt=hello warm 2
[23:53:30Z] prompt=hello warm 3
[23:53:49Z] prompt=P-MEDIUM cold
[23:54:59Z] prompt=P-MEDIUM warm 1
[23:56:14Z] prompt=P-MEDIUM warm 2
[23:57:23Z] prompt=P-MEDIUM warm 3
[23:58:30Z] prompt=P-HARD cold
[00:01:18Z] prompt=P-HARD warm 1
[00:04:12Z] prompt=P-HARD warm 2
[00:05:51Z] prompt=P-HARD warm 3
[00:08:36Z] >>> CELL: ngl6-ctx64k (NGL=6 ctx=65536)
[00:08:36Z] Stopping llama-server on Predator...
[00:08:41Z] Starting llama-server (key=ngl6-ctx64k, NGL=6, ctx=65536)...
[00:08:43Z] Waiting for llama-server (key=ngl6-ctx64k, up to 15 min — large KV cache allocation)...
[00:09:12Z] llama-server up after 24s
[00:09:14Z] Running llama-bench for ngl6-ctx64k (NGL=6)...
[00:10:31Z] Running prompts for ngl6-ctx64k (cold + 3 warm per prompt)...
[00:10:31Z] prompt=hello cold
[00:10:51Z] prompt=hello warm 1
[00:11:20Z] prompt=hello warm 2
[00:11:58Z] prompt=hello warm 3
[00:12:16Z] prompt=P-MEDIUM cold
[00:13:52Z] prompt=P-MEDIUM warm 1
[00:15:25Z] prompt=P-MEDIUM warm 2
[00:16:44Z] prompt=P-MEDIUM warm 3
[00:17:37Z] prompt=P-HARD cold
[00:19:07Z] prompt=P-HARD warm 1
[00:21:33Z] prompt=P-HARD warm 2
[00:24:01Z] prompt=P-HARD warm 3
[00:26:21Z] >>> CELL: ngl6-ctx128k (NGL=6 ctx=131072)
[00:26:21Z] Stopping llama-server on Predator...
[00:26:25Z] Starting llama-server (key=ngl6-ctx128k, NGL=6, ctx=131072)...
[00:26:28Z] Waiting for llama-server (key=ngl6-ctx128k, up to 15 min — large KV cache allocation)...
[00:26:35Z] llama-server up after 7s
[00:26:38Z] Running llama-bench for ngl6-ctx128k (NGL=6)...
[00:28:04Z] Running prompts for ngl6-ctx128k (cold + 3 warm per prompt)...
[00:28:04Z] prompt=hello cold
[00:28:27Z] prompt=hello warm 1
[00:29:10Z] prompt=hello warm 2
[00:29:31Z] prompt=hello warm 3
[00:29:45Z] prompt=P-MEDIUM cold
[00:31:22Z] prompt=P-MEDIUM warm 1
[00:33:10Z] prompt=P-MEDIUM warm 2
[00:34:46Z] prompt=P-MEDIUM warm 3
[00:36:30Z] prompt=P-HARD cold
[00:39:15Z] prompt=P-HARD warm 1
[00:41:56Z] prompt=P-HARD warm 2
[00:43:35Z] prompt=P-HARD warm 3
[00:45:53Z] Stopping llama-server on Predator...
[00:45:58Z] === bench complete ===
[00:45:58Z] Mirroring run-dir to Predator...
[00:46:03Z] On-device mirror: predator:D:\WeeyugaBench\runs\ad28cb95-5134-40b5-8514-c8a381f83d87\