# Contest Runners

This directory contains two self-contained contest entrypoints:

- `tools/tn_contest_runner.py`: general tensor-network path search and contraction.
- `tools/mps_contest_runner.py`: Vidal/MPS multi-node expectation runner.

Each script keeps its circuit and observable definitions inline, so a contest
case can be edited in one place.

## Environment

Run commands from the repository root:

```bash
cd /home/yx/qibotn
```

For Intel MPI on two nodes, use this known-working launch style (4 ranks in
total, 2 per host):

```bash
mpirun -np 4 -hostfile /home/yx/qibotn/hostfile -perhost 2 ...
```
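The relationship between `-np`, `-perhost`, and the node count can be sketched in a few lines of Python. This is only an illustration: `mpirun_command` is a hypothetical helper, not part of the repository, and it assumes the single-node form drops `-hostfile` and `-perhost`, as the single-node commands in this README do.

```python
import shlex


def mpirun_command(nodes, ranks_per_host, hostfile, script_args):
    """Build an mpirun argv where total ranks = nodes * ranks_per_host."""
    total_ranks = nodes * ranks_per_host
    argv = ["mpirun", "-np", str(total_ranks)]
    if nodes > 1:
        # Multi-node runs need the hostfile and an explicit per-host rank count.
        argv += ["-hostfile", hostfile, "-perhost", str(ranks_per_host)]
    argv += ["python", "-u", *script_args]
    return argv


two_node = mpirun_command(
    nodes=2,
    ranks_per_host=2,
    hostfile="/home/yx/qibotn/hostfile",
    script_args=["tools/mps_contest_runner.py", "run", "--case", "main1"],
)
one_node = mpirun_command(
    nodes=1,
    ranks_per_host=2,
    hostfile="/home/yx/qibotn/hostfile",
    script_args=["tools/mps_contest_runner.py", "run", "--case", "main1"],
)
print(shlex.join(two_node))
print(shlex.join(one_node))
```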

Set `TCM_ENABLE=1` for CPU runs:

```bash
export TCM_ENABLE=1
```

## TN Workflow

List built-in TN contest cases:

```bash
python -u tools/tn_contest_runner.py list
```

TN path search uses dask by default. Without `--dask-address`, the script starts
a local dask cluster. For multiple servers, start one scheduler and workers
with the helper script, then pass the scheduler address to the search command.

Start the default two-node dask cluster:

```bash
cd /home/yx/qibotn
tools/manage_tn_dask_cluster.sh start
```

Check status:

```bash
cd /home/yx/qibotn
tools/manage_tn_dask_cluster.sh status
```

Stop the cluster:

```bash
cd /home/yx/qibotn
tools/manage_tn_dask_cluster.sh stop
```

The helper defaults are:

```bash
SCHEDULER_HOST=10.20.1.103
WORKER_HOSTS="10.20.1.103 10.20.1.102"
NWORKERS=48
NTHREADS=1
ROOT_DIR=/home/yx/qibotn
PYTHON_BIN=.venv/bin/python
DASK_WORKER_TTL="24 hours"
DASK_TICK_LIMIT="30 minutes"
DASK_LOST_WORKER_TIMEOUT="30 minutes"
```

Override them inline if needed:

```bash
WORKER_HOSTS="10.20.1.103 10.20.1.102" NWORKERS=48 \
  tools/manage_tn_dask_cluster.sh restart
```

Check that both nodes are connected by adding `--tn-debug-trials` to a small
search. The output should include `qibotn_dask_workers` with both hosts.

`tools/tn_contest_runner.py search` stops the external dask cluster after the
search phase by default. Pass `--keep-dask` if you want to reuse the same dask
cluster for several searches.

Use enough trials to fill the cluster. The default two-node setup has 96 worker
slots (2 hosts x 48 workers x 1 thread), so `--tn-search-repeats` should be at
least 96. The contest runner default is 2048.
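As a quick sanity check when changing the helper settings, the slot count can be recomputed by hand. The sketch below assumes `NWORKERS` is a per-host count, which is what the 96-slot figure for the default settings implies; `worker_slots` is a hypothetical name used only for illustration.

```python
def worker_slots(worker_hosts: str, nworkers: int, nthreads: int) -> int:
    """Total dask task slots, assuming NWORKERS workers are started on each host."""
    return len(worker_hosts.split()) * nworkers * nthreads


# Helper defaults from this README.
slots = worker_slots("10.20.1.103 10.20.1.102", nworkers=48, nthreads=1)

# Runner default for --tn-search-repeats.
repeats = 2048
trials_per_slot = repeats / slots

print(f"slots={slots}, average trials per slot={trials_per_slot:.1f}")
if repeats < slots:
    print("warning: fewer trials than slots, some workers will sit idle")
```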

Cotengra trials are CPU-bound and can hold the Python GIL long enough for dask
to report `Event loop was unresponsive`. Dask's defaults are much stricter:
`distributed.scheduler.worker-ttl` is 5 minutes, `distributed.admin.tick.limit`
is 3s, and `distributed.deploy.lost-worker-timeout` is 15s. The helper script
raises these limits so that dask does not kill workers during a search. The
intended timeout is `--tn-search-time`; once it expires, the runner stops the
external dask cluster unless `--keep-dask` is passed.
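If you start dask workers by hand instead of using the helper, one documented way to raise these limits is through Dask's `DASK_`-prefixed environment variables, where a double underscore marks nesting and hyphens become underscores. The sketch below only derives the variable names. How `manage_tn_dask_cluster.sh` applies its settings internally is not described here, so treat the mapping from the helper variables as an assumption.

```python
def dask_env_name(config_key: str) -> str:
    """Convert a dotted Dask config key to its environment-variable name."""
    return "DASK_" + config_key.upper().replace(".", "__").replace("-", "_")


# Values mirror the helper defaults (DASK_WORKER_TTL, DASK_TICK_LIMIT,
# DASK_LOST_WORKER_TIMEOUT); pairing them with these keys is an assumption.
RAISED_LIMITS = {
    "distributed.scheduler.worker-ttl": "24 hours",
    "distributed.admin.tick.limit": "30 minutes",
    "distributed.deploy.lost-worker-timeout": "30 minutes",
}

env = {dask_env_name(key): value for key, value in RAISED_LIMITS.items()}
for name, value in env.items():
    print(f'export {name}="{value}"')
```

The variables must be set in the environment of the scheduler and of every worker process before they start.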

Small correctness check against statevector:

```bash
python -u tools/tn_contest_runner.py validate \
  --case main1 \
  --nqubits 8 \
  --nlayers 2 \
  --torch-threads 4 \
  --tn-search-repeats 8 \
  --tn-search-time 5
```

Search and save contraction trees:

```bash
TCM_ENABLE=1 python -u tools/tn_contest_runner.py search \
  --case main1 \
  --torch-threads 48 \
  --dtype complex64 \
  --dask-address tcp://10.20.1.103:8786 \
  --tn-search-repeats 2048 \
  --tn-search-time 300
```

Contract using the saved tree on one node:

```bash
TCM_ENABLE=1 mpirun -np 2 python -u tools/tn_contest_runner.py contract \
  --mpi \
  --case main1 \
  --torch-threads 48 \
  --dtype complex64
```

Contract using the saved tree on two nodes:

```bash
TCM_ENABLE=1 mpirun -np 4 -hostfile /home/yx/qibotn/hostfile -perhost 2 \
  python -u tools/tn_contest_runner.py contract \
  --mpi \
  --case main1 \
  --torch-threads 48 \
  --dtype complex64
```

Run search and contract in one command:

```bash
TCM_ENABLE=1 python -u tools/tn_contest_runner.py all \
  --case main1 \
  --torch-threads 48 \
  --dtype complex64 \
  --dask-address tcp://10.20.1.103:8786 \
  --tn-search-repeats 2048 \
  --tn-search-time 300
```

Run only selected observables:

```bash
python -u tools/tn_contest_runner.py search \
  --case main2 \
  --observables open_zz
```

Tree files are written to `trees/contest_tn/` by default. The tree filename
encodes the case, observable, qubit count, layer count, and target slice count.
If any of these change, run the search again.
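The reuse rule can be pictured as comparing a small key between the search run and the contract run. The sketch below is purely illustrative: `TreeKey` and `changed_fields` are hypothetical names, the field values are made up, and the runner's real filename format is not reproduced here.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TreeKey:
    """Settings that the tree filename is described as encoding."""
    case: str
    observable: str
    nqubits: int
    nlayers: int
    target_slices: Optional[int]


def changed_fields(saved: TreeKey, current: TreeKey) -> List[str]:
    """Return the names of the settings that differ between two runs."""
    old, new = asdict(saved), asdict(current)
    return [name for name in old if old[name] != new[name]]


# Example values only; they do not correspond to a real contest case scale.
searched = TreeKey("main2", "open_zz", nqubits=8, nlayers=2, target_slices=None)
wanted = TreeKey("main2", "open_zz", nqubits=8, nlayers=4, target_slices=None)

diff = changed_fields(searched, wanted)
if diff:
    print("search again, changed:", ", ".join(diff))
else:
    print("saved tree can be reused")
```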

Edit TN contest cases in `tools/tn_contest_runner.py`:

- `CASES`: case name, circuit kind, observable list, default scale.
- `build_circuit`: circuit definitions.
- `pauli_sum_observable`: observable definitions.

## MPS Workflow

List built-in Vidal/MPS contest cases:

```bash
python -u tools/mps_contest_runner.py list
```

Small correctness check against statevector:

```bash
mpirun -np 2 python -u tools/mps_contest_runner.py validate \
  --case main1 \
  --nqubits 8 \
  --nlayers 2 \
  --bond 64 \
  --torch-threads 4
```

Run one MPS case on one node:

```bash
TCM_ENABLE=1 mpirun -np 2 python -u tools/mps_contest_runner.py run \
  --case main1 \
  --torch-threads 48
```

Run one MPS case on two nodes:

```bash
TCM_ENABLE=1 mpirun -np 4 -hostfile /home/yx/qibotn/hostfile -perhost 2 \
  python -u tools/mps_contest_runner.py run \
  --case main1 \
  --torch-threads 48
```

Run only one observable:

```bash
TCM_ENABLE=1 mpirun -np 4 -hostfile /home/yx/qibotn/hostfile -perhost 2 \
  python -u tools/mps_contest_runner.py run \
  --case main1 \
  --observables ring_xz \
  --torch-threads 48
```

Override scale:

```bash
TCM_ENABLE=1 mpirun -np 4 -hostfile /home/yx/qibotn/hostfile -perhost 2 \
  python -u tools/mps_contest_runner.py run \
  --case main1 \
  --nqubits 128 \
  --nlayers 24 \
  --bond 1024 \
  --torch-threads 48
```

Edit MPS contest cases in `tools/mps_contest_runner.py`:

- `CASES`: case name, circuit kind, observable list, default scale and bond.
- `build_circuit`: circuit definitions.
- `observable`: observable definitions, including dense local terms.

## Notes

- TN uses path search plus contraction. Reuse tree files only for the exact same
  circuit, observable, qubit count, layer count, seed, and slicing setup.
- TN path search defaults to dask. Use `--tn-search-backend processpool` only
  for fallback/debugging.
- Prefer the default `--tn-target-size 4294967296` (2^32) memory target. Do not
  force `--tn-target-slices` unless you have already verified that cotengra can
  find valid trees for that exact setting.
- MPS/Vidal does not use contraction-tree search. It runs the circuit directly
  and reports `trunc_sum` and `trunc_max`.
- Default TN contraction is the stable torch/quimb path. Do not pass
  `--tn-contract-implementation cpp` for contest runs.
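To get a feel for the default memory target, the sketch below converts it to GiB per intermediate tensor. It assumes `--tn-target-size` counts tensor elements, as cotengra's own `target_size` slicing option does; whether the runner passes the value through unchanged is not confirmed here.

```python
# Bytes per element for the complex dtypes used in this README's examples.
BYTES_PER_ELEMENT = {"complex64": 8, "complex128": 16}


def largest_intermediate_gib(target_size: int, dtype: str) -> float:
    """Memory for one intermediate tensor of `target_size` elements, in GiB."""
    return target_size * BYTES_PER_ELEMENT[dtype] / 2**30


default_target = 4294967296  # 2**32, the default --tn-target-size
for dtype in BYTES_PER_ELEMENT:
    gib = largest_intermediate_gib(default_target, dtype)
    print(f"{dtype}: {gib:.0f} GiB per largest intermediate")
```

Peak memory during contraction is higher than this figure, because the inputs and the output of a pairwise contraction are held at the same time.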