Sloba 2026-05-06T17:30Z chat: "i want to copy this public repo to my
github and make it a clone there as a public github repo with
sponsorship enabled".
GitHub mirror created at https://github.com/slobodanmargetic988/weeyuga-benchmarks-public
(public, with description + homepage pointing at
https://benchmarks.weeyuga.com); both `main` and the
`feature/runner-and-agent-instructions` branch pushed.
This commit lights up the "Sponsor" button on the GitHub repo's
landing page via .github/FUNDING.yml. The button routes to
github.com/sponsors/slobodanmargetic988, but it does nothing useful
until Sloba completes the separate one-time enrollment of his GitHub
Sponsors profile at that URL (see GitHub's "Set up GitHub Sponsors"
flow). Once enrolled, every public repo of his with this FUNDING.yml
gets a Sponsor button automatically.
The file also leaves commented-out placeholders for `custom`,
`patreon`, `ko_fi`, `tidelift`, `liberapay`, `polar` etc.; uncomment
and edit each one when the corresponding account exists.
The file lives in `.github/FUNDING.yml` (the canonical GitHub-side
path). Gitea ignores it harmlessly since it has no equivalent
sponsorship integration today, so the file is identical on both
mirrors with no Gitea-side noise.
Sloba 2026-05-06T16:55Z chat: "lets flip the switch i want the repo public".
Pre-launch security audit signed off:
- G2-P0-1 (/Users/slobodan/ paths) cleared end-to-end
- G2-P0-2 (10.8.0.x WG mesh IPs) cleared end-to-end
- G2-P0-3 (Slobodans-MacBook-Air mDNS) cleared end-to-end
- G2-P0-4 (MyBoard / TruthGraph names) cleared end-to-end
All four verified clean on the live https://benchmarks.weeyuga.com/
deploy at SHA 0ba4451 by Miljan's pen-test re-grep on
/index.html, /catalog.html, /benchmarks/09d8fbde.html (0 hits per
pattern per page). Full live security probe earlier the same day
also cleared: path traversal blocked, methods 405-locked, autoindex
OFF, hidden files (.git/.DS_Store/.env/.htaccess) all 404,
_template.html 404 via vhost SPA-fallback fix, all 6 security
headers + HSTS holding.
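The per-page re-grep can be approximated with a short stdlib script. This is a minimal sketch, not the pen-test tooling actually used: the patterns are derived from the four G2-P0 findings listed above, while the names `LEAK_PATTERNS` and `count_hits` and the exact regexes are assumptions made for illustration.

```python
import re

# One pattern per G2-P0 finding named in this message.
# The regexes are illustrative assumptions, not the audit's own patterns.
LEAK_PATTERNS = {
    "G2-P0-1": re.compile(r"/Users/slobodan/"),
    "G2-P0-2": re.compile(r"\b10\.8\.0\.\d{1,3}\b"),
    "G2-P0-3": re.compile(r"Slobodans-MacBook-Air"),
    "G2-P0-4": re.compile(r"MyBoard|TruthGraph"),
}


def count_hits(page_text: str) -> dict:
    """Return the number of matches per finding ID in one page's text."""
    return {fid: len(rx.findall(page_text)) for fid, rx in LEAK_PATTERNS.items()}
```

Feeding each fetched page body through `count_hits` and requiring every count to be zero mirrors the "0 hits per pattern per page" criterion.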
Updated:
- README banner: "PRIVATE STAGING — pre-launch audit pending" →
"PUBLIC — anonymous-readable since 2026-05-06"
- Status table: "Pre-launch security audit | scheduled" →
"cleared 2026-05-06 (G2 P0-1/2/3/4 all verified clean on live
benchmarks.weeyuga.com at SHA 0ba4451)"
- Status table: "Visibility flipped to public | pending audit
sign-off" → "2026-05-06 ✓"
This commit lands the README copy update inside the repo. The
Gitea-side visibility flip (Settings → Visibility → Make Public)
is a UI click Sloba does himself; no Gitea API token is available
locally to drive it from this session.
crowdsourced runner
Sloba's chat directive 2026-05-06: "this project is preparation for
going public ... ship the harness along so others can join in."
The repo's original purpose (Ben's catalogue + 21 reference run
ledgers, shipped 2026-05-05) stays intact. This commit ADDS a second
purpose: a portable harness + agent runbook so a friend's coding agent
can clone, read CLAUDE.md, run the same suite on the friend's hardware,
and submit results back as a PR.
What landed:
CLAUDE.md + AGENTS.md (byte-identical, ~520 lines)
Full agent runbook: hardware probe, runtime + model selection,
canonical knob reference (Sloba's Pavilion methodology values),
hardware-adaptation decision rules, run-instructions, output-schema
templates for hardware.json + metadata.json + run.md, PR submission
flow (fork → branch → push → PR; nothing auto-merges), privacy
guardrails, methodology lineage. Per Sloba's Q3 directive: the
runbook explicitly tells the friend's agent to ADAPT to hardware
reality and document deviations rather than blindly run defaults.
CONTRIBUTING.md (~110 lines)
Human-readable companion for the friend (not the agent). What you
need, how it works, what we ask, what maintainers commit to,
license, code-of-conduct short version.
harness/
├── README.md         Technical readme for the harness folder
├── run_benchmark.py  ~520 LOC runner. Stdlib-only. Adapted from
│                     WeeyugaWeb/scripts/benchmarks/run_pavilion_weeyuga.py
│                     v3 with the cluster-internal IP defaults
│                     (10.8.0.x) replaced by 127.0.0.1:11434, the
│                     cluster /v1/cluster/* endpoints removed, the
│                     canonical-suite paths under ~/Documents/MyServers
│                     replaced by harness/suites/ paths, the git-sha
│                     enforcement on WeeyugaWeb dropped, and the
│                     output written under submissions/<handle>/<tag>/
│                     instead of docs/BENCHMARKS/runs/. Supports all
│                     six suite phases via --phases, plus 'all'.
├── prompts.py        Verbatim copy of the canonical 3 frozen prompts
│                     (P-EASY/P-MEDIUM/P-HARD) from
│                     WeeyugaWeb/scripts/benchmarks/prompts.py.
├── requirements.txt  Empty by intent (stdlib-only); placeholder for
│                     pip-tools / agent auto-install patterns.
├── .gitignore        __pycache__/ etc.
└── suites/           Six bundled JSON suites copied verbatim from
                      Sloba's MyServers/instances/vps-81-17-99-14/telemetry/:
                        small_model_eval_questions.json,
                        python_task_suite_questions.json,
                        parallel_qwen_same_model_20q_suite.json,
                        parallel_qwen_mixed_model_20q_suite.json,
                        python_context_edge_append_questions.json,
                        python_context_edge_suite_only.json.
submissions/
    README.md   Folder convention + naming + reviewability rules
    EXAMPLE/mac-m1-8gb/run-00000000-...-000000000000/
                Synthetic-but-shape-complete contribution template:
                manifest.json, hardware.json, run.jsonl (5 example lines),
                metadata.json, run.md (with privacy attestation, methodology
                deviations, reproducibility command). Marked as synthetic at
                the top so future analysis doesn't accidentally cite it.
LICENSE-MIT
MIT for harness/*.py and future helper code. Existing LICENSE
(CC-BY-4.0) covers data files.
README.md (modified)
Updated to reflect dual purpose. Layout diagram updated.
Maintainer credits: Ben for catalogue/methodology + Bane for harness.
Contributor quick-start added. Status table extended.
Privacy posture:
- All 6 suite JSON files privacy-scanned for cluster IPs / hostnames /
paths / tokens. Two prompts were flagged in chat for Sloba's review as
containing project names: 20Q-Q14 ("MyBoard" auth debugging) and
5Q-Q03 (generic SSH troubleshooting). Otherwise clean.
- run_benchmark.py default target_url is 127.0.0.1:11434 (no internal
IPs leaked).
- manifest.json captures host_hostname_short via
`socket.gethostname().split('.')[0]`; the agent should review it
before opening a PR if the hostname is sensitive.
- CLAUDE.md §8 spells out the privacy-grep before push.
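For reviewers unfamiliar with the hostname capture noted above, a minimal sketch of the idea follows. Only the `socket.gethostname().split('.')[0]` expression comes from this commit; the function name `short_hostname`, the `redact` flag, and the `"redacted-host"` placeholder are hypothetical and are not part of run_benchmark.py.

```python
import socket


def short_hostname(fqdn=None, redact=False):
    """Return the first dot-separated label of the hostname.

    fqdn:   optional hostname to shorten; defaults to this machine's name.
    redact: hypothetical opt-out that replaces the value with a placeholder.
    """
    name = socket.gethostname() if fqdn is None else fqdn
    short = name.split(".")[0]  # same expression the manifest capture uses
    return "redacted-host" if redact else short
```

A contributor whose hostname is sensitive could inspect the value this returns and edit manifest.json by hand before opening the PR.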
Verification:
- py_compile run_benchmark.py: OK
- --help renders cleanly
- All 6 suite JSON files: valid
- All 4 example JSON files: valid
- Example run.jsonl (5 lines): valid
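The JSONL validity check can be reproduced with the standard library alone. This is a sketch under stated assumptions: the function name `validate_jsonl` is hypothetical, and the commit does not record which command was actually used for the check.

```python
import json


def validate_jsonl(text):
    """Parse every non-blank line as JSON and return the record count.

    Raises ValueError naming the 1-based line number of the first bad line.
    """
    count = 0
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines, e.g. a trailing newline
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {lineno}: {exc.msg}") from exc
        count += 1
    return count
```

Reading the example run.jsonl and passing its contents to `validate_jsonl` should return 5 if the file matches the description above.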
This commit lands on branch feature/runner-and-agent-instructions.
NOT pushed to main; staying on the feature branch until Sloba reviews
on Gitea and merges. Bus dispatch to Ben + Sam announcing the
architectural pivot lives in the WeeyugaWeb coordination repo.