jaunatisblue/qibotn

jaunatisblue 915c24dc7b

Pre-contest stable version

2026-05-15 09:32:26 +08:00

Contest Runners

This directory contains two self-contained contest entrypoints:

tools/tn_contest_runner.py: general tensor-network path search and contraction.
tools/mps_contest_runner.py: Vidal/MPS multi-node expectation runner.

Both scripts keep circuit and observable definitions inside the script so a contest case can be edited in one place.

Environment

Run commands from the repository root:

cd /home/yx/qibotn

For Intel MPI on two nodes, use this known-working invocation (4 ranks in total, 2 per host):

mpirun -np 4 -hostfile /home/yx/qibotn/hostfile -perhost 2 ...

Set TCM_ENABLE=1 for CPU runs:

export TCM_ENABLE=1

TN Workflow

List built-in TN contest cases:

python -u tools/tn_contest_runner.py list

TN path search uses dask by default. Without --dask-address, the script starts a local dask cluster. For multiple servers, start one scheduler and workers with the helper script, then pass the scheduler address to the search command.

Start the default two-node dask cluster:

cd /home/yx/qibotn
tools/manage_tn_dask_cluster.sh start

Check status:

cd /home/yx/qibotn
tools/manage_tn_dask_cluster.sh status

Stop the cluster:

cd /home/yx/qibotn
tools/manage_tn_dask_cluster.sh stop

The helper defaults are:

SCHEDULER_HOST=10.20.1.103
WORKER_HOSTS="10.20.1.103 10.20.1.102"
NWORKERS=48
NTHREADS=1
ROOT_DIR=/home/yx/qibotn
PYTHON_BIN=.venv/bin/python
DASK_WORKER_TTL="24 hours"
DASK_TICK_LIMIT="30 minutes"
DASK_LOST_WORKER_TIMEOUT="30 minutes"

Override them inline if needed:

WORKER_HOSTS="10.20.1.103 10.20.1.102" NWORKERS=48 \
  tools/manage_tn_dask_cluster.sh restart

Check that both nodes are connected by adding --tn-debug-trials to a small search. The output should include qibotn_dask_workers with both hosts.
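The host check described above can also be done by hand on any list of worker addresses. The following is a minimal sketch, not part of the runner: the function names and the sample addresses (including the port numbers) are hypothetical, and it only assumes that worker addresses have the usual tcp://host:port form.

```python
from collections import Counter
from urllib.parse import urlparse


def workers_per_host(worker_addresses):
    """Count workers per host from addresses such as 'tcp://10.20.1.103:40123'."""
    return Counter(urlparse(addr).hostname for addr in worker_addresses)


def missing_hosts(worker_addresses, expected_hosts):
    """Return the expected hosts that contributed no worker at all."""
    counts = workers_per_host(worker_addresses)
    return [host for host in expected_hosts if counts.get(host, 0) == 0]


# Illustrative addresses only; real ones come from the running cluster.
sample = [
    "tcp://10.20.1.103:40123",
    "tcp://10.20.1.103:40124",
    "tcp://10.20.1.102:40123",
]
print(workers_per_host(sample))
print(missing_hosts(sample, ["10.20.1.103", "10.20.1.102"]))
```

An empty list from missing_hosts means every expected host has at least one connected worker.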

tools/tn_contest_runner.py search stops the external dask cluster after the search phase by default. Pass --keep-dask if you want to reuse the same dask cluster for several searches.

Use enough trials to fill the cluster. The default two-node setup provides 96 worker slots (2 hosts × 48 workers each), so --tn-search-repeats should be at least 96. The contest runner default is 2048.
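The slot count can be recomputed for other cluster shapes. This is a rough sketch under two assumptions: NWORKERS is a per-host count (which matches the 96 slots quoted for two hosts), and each worker thread counts as one slot. The function name is hypothetical.

```python
import math


def worker_slots(worker_hosts: str, nworkers: int, nthreads: int = 1) -> int:
    """Slots = hosts x workers per host x threads per worker (assumed model)."""
    hosts = worker_hosts.split()
    return len(hosts) * nworkers * nthreads


# Helper defaults: two hosts, NWORKERS=48, NTHREADS=1.
slots = worker_slots("10.20.1.103 10.20.1.102", nworkers=48, nthreads=1)

# Number of full passes over the cluster needed for the default 2048 trials.
rounds = math.ceil(2048 / slots)

print(slots, rounds)
```

Any --tn-search-repeats value below the slot count leaves some workers idle for the whole search.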

Cotengra trials are CPU-bound and can hold the Python GIL long enough for dask to report "Event loop was unresponsive". Dask's own defaults are far stricter than the helper's: distributed.scheduler.worker-ttl is 5 minutes, distributed.admin.tick.limit is 3s, and distributed.deploy.lost-worker-timeout is 15s. The helper script raises these limits so that dask does not kill workers in the middle of a search. The intended time limit is --tn-search-time; once it expires, the runner stops the external dask cluster.
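For reference, dask reads configuration from environment variables named DASK_ plus the upper-cased key, with nesting expressed by double underscores. The sketch below derives those names for the three raised limits. How the helper script actually applies DASK_WORKER_TTL, DASK_TICK_LIMIT, and DASK_LOST_WORKER_TIMEOUT is not shown here, so the pairing of helper values with dask keys is an assumption, and dask_env_name is a hypothetical helper.

```python
def dask_env_name(key: str) -> str:
    """Map a dotted dask config key to its environment-variable name."""
    return "DASK_" + key.upper().replace(".", "__").replace("-", "_")


# Helper-script values paired with the dask keys they presumably control.
RAISED_LIMITS = {
    "distributed.scheduler.worker-ttl": "24 hours",
    "distributed.admin.tick.limit": "30 minutes",
    "distributed.deploy.lost-worker-timeout": "30 minutes",
}

env = {dask_env_name(key): value for key, value in RAISED_LIMITS.items()}

for name, value in env.items():
    print(f'export {name}="{value}"')
```

Exporting these variables before starting a scheduler or worker process is one way to reproduce the raised limits outside the helper script.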

Small correctness check against statevector:

python -u tools/tn_contest_runner.py validate \
  --case main1 \
  --nqubits 8 \
  --nlayers 2 \
  --torch-threads 4 \
  --tn-search-repeats 8 \
  --tn-search-time 5

Search and save contraction trees:

TCM_ENABLE=1 python -u tools/tn_contest_runner.py search \
  --case main1 \
  --torch-threads 48 \
  --dtype complex64 \
  --dask-address tcp://10.20.1.103:8786 \
  --tn-search-repeats 2048 \
  --tn-search-time 300

Contract using the saved tree on one node:

TCM_ENABLE=1 mpirun -np 2 python -u tools/tn_contest_runner.py contract \
  --mpi \
  --case main1 \
  --torch-threads 48 \
  --dtype complex64

Contract using the saved tree on two nodes:

TCM_ENABLE=1 mpirun -np 4 -hostfile /home/yx/qibotn/hostfile -perhost 2 \
  python -u tools/tn_contest_runner.py contract \
  --mpi \
  --case main1 \
  --torch-threads 48 \
  --dtype complex64

Run search and contract in one command:

TCM_ENABLE=1 python -u tools/tn_contest_runner.py all \
  --case main1 \
  --torch-threads 48 \
  --dtype complex64 \
  --dask-address tcp://10.20.1.103:8786 \
  --tn-search-repeats 2048 \
  --tn-search-time 300

Run only selected observables:

python -u tools/tn_contest_runner.py search \
  --case main2 \
  --observables open_zz

Tree files are written to trees/contest_tn/ by default. The tree filename contains case, observable, qubit count, layer count, and target slice count. If any of these change, search again.
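To see why a change in any of these fields forces a new search, consider a filename built from all of them. The layout below is purely illustrative: the real naming scheme lives in tools/tn_contest_runner.py and is not reproduced here, and tree_path, the field order, and the .pkl suffix are all assumptions.

```python
from pathlib import Path


def tree_path(case, observable, nqubits, nlayers, target_slices,
              root="trees/contest_tn"):
    """Build a hypothetical tree filename keyed on every field that matters."""
    name = f"{case}_{observable}_q{nqubits}_l{nlayers}_s{target_slices}.pkl"
    return Path(root) / name


small = tree_path("main2", "open_zz", nqubits=8, nlayers=2, target_slices=1)
large = tree_path("main2", "open_zz", nqubits=16, nlayers=2, target_slices=1)

# Different qubit counts map to different files, so the old tree is not reused.
print(small.as_posix())
print(large.as_posix())
```

With a key like this, a contract step for a changed setup simply finds no matching file instead of loading a tree that belongs to a different network.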

Edit TN contest cases in tools/tn_contest_runner.py:

CASES: case name, circuit kind, observable list, default scale.
build_circuit: circuit definitions.
pauli_sum_observable: observable definitions.
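As a rough picture of how a case table with these fields can drive observable selection, here is a hypothetical sketch. It does not reproduce the real CASES structure: the ContestCase class, the "demo" entry, the "brickwork" circuit kind, and select_observables are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContestCase:
    """Hypothetical case record: circuit kind, observables, default scale."""
    circuit_kind: str
    observables: tuple
    nqubits: int
    nlayers: int


# Illustrative entry only; the real table is defined in the runner script.
CASES = {
    "demo": ContestCase(
        circuit_kind="brickwork",
        observables=("open_zz", "ring_xz"),
        nqubits=8,
        nlayers=2,
    ),
}


def select_observables(case_name, requested=None):
    """Return all observables of a case, or a validated requested subset."""
    case = CASES[case_name]
    if requested is None:
        return list(case.observables)
    unknown = [name for name in requested if name not in case.observables]
    if unknown:
        raise ValueError(f"unknown observables for {case_name}: {unknown}")
    return list(requested)


print(select_observables("demo"))
print(select_observables("demo", ["open_zz"]))
```

Keeping the case table and the selection logic in one file is what lets a contest case be edited in a single place.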

MPS Workflow

List built-in Vidal/MPS contest cases:

python -u tools/mps_contest_runner.py list

Small correctness check against statevector:

mpirun -np 2 python -u tools/mps_contest_runner.py validate \
  --case main1 \
  --nqubits 8 \
  --nlayers 2 \
  --bond 64 \
  --torch-threads 4

Run one MPS case on one node:

TCM_ENABLE=1 mpirun -np 2 python -u tools/mps_contest_runner.py run \
  --case main1 \
  --torch-threads 48

Run one MPS case on two nodes:

TCM_ENABLE=1 mpirun -np 4 -hostfile /home/yx/qibotn/hostfile -perhost 2 \
  python -u tools/mps_contest_runner.py run \
  --case main1 \
  --torch-threads 48

Run only one observable:

TCM_ENABLE=1 mpirun -np 4 -hostfile /home/yx/qibotn/hostfile -perhost 2 \
  python -u tools/mps_contest_runner.py run \
  --case main1 \
  --observables ring_xz \
  --torch-threads 48

Override scale:

TCM_ENABLE=1 mpirun -np 4 -hostfile /home/yx/qibotn/hostfile -perhost 2 \
  python -u tools/mps_contest_runner.py run \
  --case main1 \
  --nqubits 128 \
  --nlayers 24 \
  --bond 1024 \
  --torch-threads 48

Edit MPS contest cases in tools/mps_contest_runner.py:

CASES: case name, circuit kind, observable list, default scale and bond.
build_circuit: circuit definitions.
observable: observable definitions, including dense local terms.

Notes

TN uses path search plus contraction. Reuse tree files only for the exact same circuit, observable, qubit count, layer count, seed, and slicing setup.
TN path search defaults to dask. Use --tn-search-backend processpool only for fallback/debugging.
Prefer the default memory target, --tn-target-size 4294967296 (2^32). Do not force --tn-target-slices unless you have already verified that cotengra can find valid trees for that exact setting.
MPS/Vidal does not use contraction-tree search. It runs the circuit directly and reports trunc_sum and trunc_max.
Default TN contraction is the stable torch/quimb path. Do not pass --tn-contract-implementation cpp for contest runs.
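The default target size can be translated into memory with a short calculation. This sketch assumes that --tn-target-size counts tensor elements, as the target size does in cotengra's slicing options; if the runner interprets it differently, the figures below do not apply.

```python
TARGET_SIZE = 4294967296  # default --tn-target-size, equal to 2**32

# Bytes per element for the complex dtypes a contraction might use.
BYTES_PER_ELEMENT = {"complex64": 8, "complex128": 16}


def largest_tensor_gib(target_size: int, dtype: str) -> float:
    """Size in GiB of one tensor holding target_size elements of dtype."""
    return target_size * BYTES_PER_ELEMENT[dtype] / 2**30


for dtype in BYTES_PER_ELEMENT:
    print(dtype, largest_tensor_gib(TARGET_SIZE, dtype), "GiB")
```

Under that assumption a single largest intermediate takes 32 GiB in complex64 and twice that in complex128, before counting the operands held alongside it.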
