7  Preferential Bayesian Optimization for Outpatient Appointment Scheduling

7.1 Objective

This experiment aims to apply Preferential Bayesian Optimization (PBO) to find optimal or near-optimal solutions for the outpatient appointment scheduling problem as defined by Kaandorp and Koole (2007). Specifically, the objective is to minimize a weighted sum of Expected Waiting Time (EWT) and Expected Staff Penalty (ESP) by efficiently searching the space of schedule perturbations. The experiment leverages dictionary-based Hamming embeddings (HED) as proposed by Deshwal et al. (2023) to handle the high-dimensional combinatorial space of perturbation selection vectors within the PBO framework, drawing on the original preferential-BO formulation of González et al. (2017).

7.2 Background

We consider an outpatient appointment scheduling problem as described by Kaandorp and Koole (2007), where the schedule is represented by a vector \(\mathbf{x} = (x_0, x_1, \ldots, x_{T-1})^{\top}\). This vector comprises \(T\) components, where \(x_j\) denotes the non-negative allocation (number of patients) to time slot \(j\), for \(j = 0, \ldots, T-1\). A fundamental constraint is that the total allocation across all time slots must equal a fixed constant \(N\):

\[\sum_{j=0}^{T-1} x_j = N\] where \(N\) is the total number of patients to be scheduled. This constraint ensures that the schedule is feasible and respects the total patient load.

We require \(x_j \ge 0\) for all \(j = 0, \ldots, T-1\). Consequently, a valid schedule \(\mathbf{x}\) belongs to the feasible set \(\mathcal{F} = \{ \mathbf{z} \in \mathbb{D}^{T} \mid \sum_{j=0}^{T-1} z_j = N,\ z_j \ge 0 \text{ for all } j \}\), where \(\mathbb{D}\) is the set of non-negative integers (\(\mathbb{Z}_{\ge 0}\)).

Kaandorp and Koole (2007) define a neighborhood structure for local search based on perturbation vectors derived from a set of \(T\) basis change vectors, \(v_i \in \mathbb{Z}^{T}\) (their components may be negative), for \(i = 0, \ldots, T-1\). These basis vectors represent elementary shifts of allocation between time slots:

  • \(v_0 = (-1, 0, \ldots, 0, 1)\) (Shift unit from slot 0 to slot \(T-1\))
  • \(v_1 = (1, -1, 0, \ldots, 0)\) (Shift unit from slot 1 to slot 0)
  • \(v_i = (0, \ldots, 0, \underbrace{1}_{\text{pos } i-1}, \underbrace{-1}_{\text{pos } i}, 0, \ldots, 0)\) for \(i = 2, \ldots, T-1\) (Shift unit from slot \(i\) to slot \(i-1\))

A key property of these basis vectors is that the sum of components for each vector is zero: \(\sum_{j=0}^{T-1} v_{ij} = 0\) for all \(i=0, \ldots, T-1\).
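To make the construction concrete, below is a minimal NumPy sketch of the \(T \times T\) matrix whose rows are these basis vectors; the `get_v_star` helper imported later (Section 7.4) is assumed to implement an equivalent construction.

import numpy as np

def make_basis_vectors(T: int) -> np.ndarray:
    """Rows are the basis change vectors v_0, ..., v_{T-1} (illustrative sketch)."""
    V = np.zeros((T, T), dtype=int)
    V[0, 0], V[0, T - 1] = -1, 1       # v_0: shift one unit from slot 0 to slot T-1
    for i in range(1, T):
        V[i, i - 1], V[i, i] = 1, -1   # v_i: shift one unit from slot i to slot i-1
    return V

assert (make_basis_vectors(5).sum(axis=1) == 0).all()  # each row sums to zero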

Perturbations are constructed using a binary selection vector \(\mathbf{U} = (u_0, u_1, \ldots, u_{T-1})\), where \(u_i \in \{0, 1\}\). Each \(u_i\) indicates whether the basis change \(v_i\) is included in the perturbation. The resulting perturbation vector \(\mathbf{r}(\mathbf{U}) \in \mathbb{Z}^{T}\) is the linear combination:

\[ \mathbf{r}(\mathbf{U}) := \sum_{i=0}^{T-1} u_i v_i \]

Since each \(v_i\) sums to zero, any perturbation \(\mathbf{r}(\mathbf{U})\) also sums to zero: \(\sum_{j=0}^{T-1} r_j(\mathbf{U}) = 0\). This ensures that applying such a perturbation to a valid schedule \(\mathbf{x}\) preserves the total allocation \(N\).
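A small worked example, reusing `make_basis_vectors` from the sketch above (values chosen purely for illustration):

x = np.array([2, 1, 1, 1, 1])      # schedule with T = 5 slots, N = 6 patients
U = np.array([0, 1, 0, 0, 1])      # select basis vectors v_1 and v_4
r = U @ make_basis_vectors(5)      # r(U) = sum_i u_i v_i = (1, -1, 0, 1, -1)
x_new = x + r                      # -> (3, 0, 1, 2, 0)
assert x_new.sum() == x.sum()      # total allocation N = 6 is preserved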

The neighborhood of a schedule \(\mathbf{x} \in \mathcal{F}\), denoted \(\mathcal{N}(\mathbf{x})\), comprises all distinct, feasible schedules \(\mathbf{x}'\) reachable by applying a non-zero perturbation \(\mathbf{r}(\mathbf{U})\). (Kaandorp and Koole (2007) use a slightly different but related neighborhood definition based on combinations of these basis vectors.)

The objective function to be minimized is a weighted sum of Expected Waiting Time (EWT) and Expected Staff Penalty (ESP), as defined by Kaandorp and Koole (2007):

\[ C(\mathbf{x}) = w \cdot EWT(\mathbf{x}) + (1-w) \cdot ESP(\mathbf{x}) \]

Kaandorp and Koole (2007) prove that this objective function is multimodular, which guarantees that a local search algorithm using their defined neighborhood converges to the global optimum.
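For reference, a function \(f: \mathbb{Z}^{T} \to \mathbb{R}\) is called multimodular with respect to the set \(\mathcal{V} = \{v_0, \ldots, v_{T-1}\}\) if

\[ f(\mathbf{x} + v) + f(\mathbf{x} + w) \ge f(\mathbf{x}) + f(\mathbf{x} + v + w) \quad \text{for all } \mathbf{x} \text{ and all } v, w \in \mathcal{V},\ v \ne w, \]

a discrete analogue of convexity: it guarantees that any local minimum with respect to the neighborhood above is a global minimum.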

However, evaluating this function can be computationally expensive for large \(N\) and \(T\), and the search space of binary selection vectors \(\mathbf{U}\) is high-dimensional: of the \(2^T\) binary vectors, \(2^T - 2\) yield non-zero perturbations (both \(\mathbf{U} = \mathbf{0}\) and \(\mathbf{U} = \mathbf{1}\) produce the zero perturbation, since the basis vectors sum to zero).

Dictionary‐Based Hamming Embeddings. Deshwal et al. (2023) identify that directly modeling high-dimensional binary vectors with a Gaussian process is both statistically and computationally challenging, since the search space grows as \(2^d\). They propose Hamming Embedding via Dictionaries (HED): choose a small set of \(m\) “dictionary” vectors \(\{a_i\} \subset \{0,1\}^d\) and embed any candidate \(z\) by its Hamming distances \(\phi_i(z) = \mathrm{Hamming}(a_i, z)\). By using carefully constructed binary-wavelet (Hadamard) dictionaries, they both dramatically reduce input dimensionality \((m \ll d)\) and obtain provable regret bounds \(\widetilde{O}(\sqrt{T_{\mathrm{iter}}\, m})\), where \(T_{\mathrm{iter}}\) denotes the number of BO iterations (distinct from the number of time slots \(T\) above). Empirically, on combinatorial tasks (e.g., MAX-SAT, feature selection, compiler flags), BO with HED (“BODi”) converges faster and to better optima than state-of-the-art discrete methods.
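As a toy illustration of the embedding (a hypothetical 3-row dictionary over \(d = 6\) bits; the construction used in Section 7.8 instead derives dictionary rows from a Hadamard matrix):

import numpy as np

A = np.array([[0, 0, 0, 0, 0, 0],
              [1, 1, 1, 1, 1, 1],
              [1, 0, 1, 0, 1, 0]])   # dictionary rows a_1, a_2, a_3
z = np.array([1, 1, 0, 0, 1, 0])     # candidate binary vector
phi = (A != z).sum(axis=1)           # phi_i(z) = Hamming(a_i, z)
print(phi)                           # [3 3 2]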

Preferential Bayesian Optimization.
González et al. (2017) introduced the first PBO framework for optimizing a latent black-box function using only pairwise “duels.” They place a Gaussian-process prior over a latent utility function \(f\), then squash it through a probit (or logistic) likelihood to model the probability \(\pi_f([x, x'])\) that \(x\) is preferred to \(x'\). From this they define acquisition functions, such as a Copeland expected-improvement extension and a “dueling” Thompson-sampling rule, that balance exploration and exploitation by selecting the next pair \([x_t, x'_t]\) to query.
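Concretely, under the probit model the preference probability takes the form (up to a noise-scale factor)

\[ \pi_f([\mathbf{x}, \mathbf{x}']) = P(\mathbf{x} \succ \mathbf{x}') = \Phi\big(f(\mathbf{x}) - f(\mathbf{x}')\big), \]

where \(\Phi\) is the standard normal CDF; the logistic variant replaces \(\Phi\) with the sigmoid \(\sigma(u) = 1/(1 + e^{-u})\).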

Because the preferential likelihood is non-conjugate, each iteration requires expensive approximate inference (refitting the GP classification model) and an inner optimization over pairs to maximize the acquisition. Although the method is sample-efficient (it drastically reduces the number of duels needed to find the Condorcet winner), the per-step computational overhead motivates integrating faster surrogate updates or approximate acquisition schemes; this is exactly the gap our HED-based embedding approach helps address by reducing problem dimensionality before applying PBO.

7.3 Hypothesis

By applying PBO with HED embeddings and Thompson sampling, we expect to efficiently identify perturbation vectors \(\mathbf{U}\) that yield significantly improved schedules (lower cost) compared to random or simple local search heuristics.

7.4 Methodology

We use BoTorch (Balandat et al., 2020) for the PBO implementation.

import torch
import numpy as np
import gpytorch
from botorch.models.pairwise_gp import PairwiseGP, PairwiseLaplaceMarginalLogLikelihood
from botorch.fit import fit_gpytorch_mll
from botorch.models.transforms.input import Normalize
from scipy.linalg import hadamard
from scipy.optimize import minimize
import random
from typing import List, Tuple, Dict, Optional, Union
import plotly.graph_objects as go

from functions import (
    bailey_welch_schedule,
    get_v_star,
    compute_convolutions,
    calculate_objective_serv_time_lookup,
)

7.5 Helper Functions

# --- Helper: generate_weighted_list ---
def generate_weighted_list(max_s: int, l: float, i: int) -> Optional[np.ndarray]:
    """Build a service-time pmf over durations 1..max_s with target mean l.

    Solves for probabilities x (via SLSQP) minimizing (E[duration] - l)^2 subject
    to sum(x) = 1, then sorts the first i entries ascending and the remaining
    entries descending to obtain a unimodal shape (the re-sorting may shift the
    realized mean away from l). Returns an array of length max_s + 1 whose
    index 0 (zero duration) has probability 0, or None on failure.
    """
    if not isinstance(max_s, int) or max_s <= 0: return None
    if not isinstance(l, (int, float)) or not (1 <= l <= max_s): return None
    if not isinstance(i, int) or not (0 <= i < max_s): return None
    def objective_fn(x: np.ndarray) -> float:
        # Squared deviation of the pmf's mean from the target l
        return (np.dot(np.arange(1, max_s + 1), x) - l) ** 2
    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1.0})
    bounds = [(0, 1)] * max_s
    initial_guess = np.random.dirichlet(np.ones(max_s))
    try:
        result = minimize(objective_fn, initial_guess, method='SLSQP', bounds=bounds,
                          constraints=constraints, options={'maxiter': 300, 'ftol': 1e-9})
        if not result.success: return None
        optimized_probs = result.x
        optimized_probs[optimized_probs < 0] = 0  # clip tiny negative values from the solver
        current_sum = np.sum(optimized_probs)
        if not np.isclose(current_sum, 1.0):
            if current_sum > 1e-8: optimized_probs /= current_sum  # renormalize
            else: return None
    except Exception: return None
    # Sort the first i probabilities ascending, the rest descending
    first_part_probs = optimized_probs[:i] if i > 0 else np.array([])
    second_part_probs = optimized_probs[i:]
    values = np.zeros(max_s + 1)
    if i > 0: values[1 : i + 1] = np.sort(first_part_probs)
    values[i + 1 : max_s + 1] = np.sort(second_part_probs)[::-1]
    final_sum = np.sum(values[1:])
    if not np.isclose(final_sum, 1.0):
        if final_sum > 1e-8: values[1:] /= final_sum
        else: return None
    return values

7.6 Problem Setup

N_patients = 50                    # total number of patients N
T_param = 48                       # number of time slots T
d_interval_len = 10                # length of each time slot (time units)
max_s_time = 30                    # maximum service time (time units)
q_no_show = 0.20                   # no-show probability
w_weight = 0.1                     # weight w on EWT in the objective
l_target_avg_service_time = 14.0   # target mean of the generated service-time pmf
i_sorting_split = 10               # split index for ascending/descending sorting

v_star_matrix = get_v_star(T_param)
s_dist = generate_weighted_list(max_s_time, l_target_avg_service_time, i_sorting_split)
if s_dist is None: raise ValueError("Failed to generate service time distribution.")
print(f"Service time distribution (s): {s_dist.tolist()}")
# Plot the generated service time distribution
fig = go.Figure(data=go.Bar(
    x=np.arange(1, len(s_dist) + 1),
    y=s_dist,
    marker_line_width=1
))
fig.update_layout(
    title="Service Time Distribution",
    xaxis_title="Service Time (units)",
    yaxis_title="Probability",
    template="plotly_white"
)
fig.show()
print(f"Average generated service time: {np.dot(np.arange(len(s_dist)), s_dist):.4f}")
convolutions_dict = compute_convolutions(s_dist.tolist(), N_patients, q_no_show)
X_initial_schedule = np.array(bailey_welch_schedule(T_param, d_interval_len, N_patients, s_dist))
print(f"Initial base schedule (X_vec): {X_initial_schedule.tolist()}")
print(f"Sum of patients in X_vec: {np.sum(X_initial_schedule)}")
LARGE_PENALTY_VAL = 1e10  # cost assigned to infeasible perturbed schedules
Service time distribution (s): [0.0, 0.0010182870243508066, 0.004383818596928343, 0.0072059574089277465, 0.019019241762008206, 0.02349126759082732, 0.02411042572052284, 0.036652289273485704, 0.05318511479116829, 0.06150071962665944, 0.08187670519847493, 0.1788545013306628, 0.07393315723224196, 0.0715779633838313, 0.0671838052994548, 0.05037559566484032, 0.04515039510431882, 0.03549795298413222, 0.03025627131841115, 0.02540926306634486, 0.02140321880236048, 0.02126470732882247, 0.019372014336892757, 0.011485516916314666, 0.010503815298936134, 0.010175822380668713, 0.007889358087376343, 0.002183484247262923, 0.0020919317092953096, 0.0015849218428687784, 0.0013624766861605455]
Average generated service time: 12.7394
Initial base schedule (X_vec): [2, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 12]
Sum of patients in X_vec: 50

7.7 Objective Function

def evaluate_objective(U_np: Union[np.ndarray, List[int]], X_vec: np.ndarray, v_star: np.ndarray,
                       conv_dict: Dict[int, np.ndarray], d_len: int, w_val: float) -> float:
    """Cost of the schedule obtained by applying perturbation r(U) to X_vec.

    Returns w * EWT + (1 - w) * ESP for the perturbed schedule, or
    LARGE_PENALTY_VAL if the perturbation makes the schedule infeasible
    (a negative allocation or a changed patient total).
    """
    if not isinstance(U_np, np.ndarray): U_np = np.array(U_np, dtype=int)
    if U_np.ndim != 1: raise ValueError("Input U must be 1-dimensional")
    if U_np.shape[0] != v_star.shape[0]: raise ValueError(f"U length {U_np.shape[0]} != V* rows {v_star.shape[0]}.")
    if X_vec.shape[0] != v_star.shape[1]: raise ValueError(f"X length {X_vec.shape[0]} != V* columns {v_star.shape[1]}.")
    if not np.all((U_np == 0) | (U_np == 1)): raise ValueError("Input U must be binary.")
    V_sum = np.sum(v_star[U_np == 1, :], axis=0)  # r(U) = sum of selected basis vectors
    Y_schedule = X_vec + V_sum
    if np.all(Y_schedule >= 0) and np.sum(Y_schedule) == np.sum(X_vec):
        ewt, esp = calculate_objective_serv_time_lookup(Y_schedule.tolist(), d_len, conv_dict)
        return w_val * ewt + (1 - w_val) * esp
    return LARGE_PENALTY_VAL

U_zeros = np.zeros(T_param, dtype=int)
initial_obj_val = evaluate_objective(U_zeros, X_initial_schedule, v_star_matrix, convolutions_dict, d_interval_len, w_weight)
print(f"Initial objective value (U=zeros): {initial_obj_val:.4f}")
Initial objective value (U=zeros): 205.2664

7.8 PBO Setup

# --- HED: Binary Wavelet Dictionary Construction ---
def get_binary_wavelet_dictionary(T_dim_U: int, m_dict_size: int) -> np.ndarray:
    """Build an m x T binary dictionary from the rows of a Hadamard matrix."""
    d_hadamard = 1
    while d_hadamard < T_dim_U: d_hadamard *= 2  # Hadamard order must be a power of two
    H = hadamard(d_hadamard)
    if m_dict_size > d_hadamard: m_dict_size = d_hadamard
    dictionary = np.zeros((m_dict_size, T_dim_U), dtype=int)
    for i in range(m_dict_size):
        dictionary[i, :] = (H[i, :T_dim_U] + 1) // 2  # map {-1, +1} entries to {0, 1}
    return dictionary

def embed_HED(U_np: np.ndarray, dictionary: np.ndarray) -> np.ndarray:
    """Embed binary vector(s) U as Hamming distances to each dictionary row."""
    U_np_reshaped = U_np.reshape(1, -1) if U_np.ndim == 1 else U_np
    # Broadcast to shape (n, m, T) and count disagreements along the bit axis
    hamming_distances = np.sum(dictionary != U_np_reshaped[:, np.newaxis, :], axis=2).T
    # For a single 1-D input this flattens to a length-m feature vector
    return hamming_distances.flatten().astype(float) if U_np.ndim == 1 else hamming_distances.astype(float)

m_dictionary_size = min(T_param * 2, 32 if T_param <= 20 else 64)  # dictionary size m (64 for T = 48)
hed_dictionary = get_binary_wavelet_dictionary(T_param, m_dictionary_size)
print(f"HED dictionary shape: {hed_dictionary.shape}")

num_total_initial_pairs = 10
num_initial_U_zeros_comparisons = 3 # Number of times to compare U_zeros with random U
num_purely_random_initial_pairs = num_total_initial_pairs - num_initial_U_zeros_comparisons

num_pbo_iterations = 50
n_candidates_for_thompson = 200
n_thompson_posterior_samples = 20

all_U_vectors_list = []
all_U_embeddings_tensor = None
comparison_pairs_indices = []

def add_U_to_master_list(U_np: np.ndarray, embedding: np.ndarray) -> int:
    global all_U_vectors_list, all_U_embeddings_tensor
    U_np = U_np.astype(int) # Ensure consistent type
    for i, existing_U_np_item in enumerate(all_U_vectors_list):
        if np.array_equal(existing_U_np_item, U_np): return i
    new_index = len(all_U_vectors_list)
    all_U_vectors_list.append(U_np.copy())
    if embedding.ndim > 1: embedding = embedding.flatten()
    embedding_tensor = torch.from_numpy(embedding).double().unsqueeze(0)
    if all_U_embeddings_tensor is None: all_U_embeddings_tensor = embedding_tensor
    else: all_U_embeddings_tensor = torch.cat([all_U_embeddings_tensor, embedding_tensor], dim=0)
    return new_index

def generate_random_U_vector(dim: int) -> np.ndarray:
    return np.random.randint(0, 2, size=dim, dtype=int)

def get_hamming_neighbors(U_vector: np.ndarray, num_neighbors: int, max_flips: int = 1) -> List[np.ndarray]:
    """Sample num_neighbors vectors, each obtained by flipping 1..max_flips random bits of U_vector."""
    neighbors = []
    dim = len(U_vector)
    if dim == 0: return []
    for _ in range(num_neighbors):
        neighbor = U_vector.copy()
        actual_flips = np.random.randint(1, max_flips + 1)
        flip_indices = np.random.choice(dim, size=min(actual_flips, dim), replace=False)
        for idx in flip_indices: neighbor[idx] = 1 - neighbor[idx]
        neighbors.append(neighbor)
    return neighbors
HED dictionary shape: (64, 48)

7.9 PBO Initialization

print("Starting PBO Initialization...")
best_U_overall = U_zeros.copy() # Start with U_zeros as initial best
best_obj_overall = initial_obj_val
best_iter_found = 0 # 0 for initial U_zeros evaluation

# Add U_zeros to the master list first
idx_U_zeros = add_U_to_master_list(U_zeros, embed_HED(U_zeros, hed_dictionary))
obj_U_zeros = initial_obj_val # Already calculated

# --- Initial comparisons involving U_zeros ---
print(f"Generating {num_initial_U_zeros_comparisons} initial comparisons involving U_zeros...")
for i in range(num_initial_U_zeros_comparisons):
    U_competitor = generate_random_U_vector(T_param)
    attempts = 0
    while np.array_equal(U_competitor, U_zeros) and attempts < 100: # Avoid comparing U_zeros to itself
        U_competitor = generate_random_U_vector(T_param)
        attempts += 1
    if np.array_equal(U_competitor, U_zeros): continue

    obj_competitor = evaluate_objective(U_competitor, X_initial_schedule, v_star_matrix, convolutions_dict, d_interval_len, w_weight)
    print(f"  Init U_zeros vs U_rand_{i}: Obj_U_zeros={obj_U_zeros:.4f}, Obj_U_rand={obj_competitor:.4f}")

    if obj_competitor < best_obj_overall:
        best_obj_overall = obj_competitor
        best_U_overall = U_competitor.copy()
        best_iter_found = 0 # Still initialization phase

    idx_competitor = add_U_to_master_list(U_competitor, embed_HED(U_competitor, hed_dictionary))
    
    if not np.isclose(obj_U_zeros, obj_competitor):
        pref_pair = None
        if obj_U_zeros < obj_competitor: pref_pair = [idx_U_zeros, idx_competitor]
        elif obj_competitor < obj_U_zeros: pref_pair = [idx_competitor, idx_U_zeros]
        
        if pref_pair and pref_pair not in comparison_pairs_indices:
            comparison_pairs_indices.append(pref_pair)

# --- Purely random initial comparisons (excluding U_zeros unless randomly picked again) ---
print(f"Generating {num_purely_random_initial_pairs} purely random initial comparisons...")
for i in range(num_purely_random_initial_pairs):
    U_A = generate_random_U_vector(T_param)
    U_B = generate_random_U_vector(T_param)
    attempts = 0
    while np.array_equal(U_A, U_B) and attempts < 100:
        U_B = generate_random_U_vector(T_param)
        attempts += 1
    if np.array_equal(U_A, U_B): continue

    obj_A = evaluate_objective(U_A, X_initial_schedule, v_star_matrix, convolutions_dict, d_interval_len, w_weight)
    obj_B = evaluate_objective(U_B, X_initial_schedule, v_star_matrix, convolutions_dict, d_interval_len, w_weight)
    print(f"  Init U_randA_{i} vs U_randB_{i}: Obj_A={obj_A:.4f}, Obj_B={obj_B:.4f}")


    if obj_A < best_obj_overall: best_obj_overall = obj_A; best_U_overall = U_A.copy(); best_iter_found = 0
    if obj_B < best_obj_overall: best_obj_overall = obj_B; best_U_overall = U_B.copy(); best_iter_found = 0

    idx_A = add_U_to_master_list(U_A, embed_HED(U_A, hed_dictionary))
    idx_B = add_U_to_master_list(U_B, embed_HED(U_B, hed_dictionary))
    
    if not np.isclose(obj_A, obj_B):
        pref_pair = None
        if obj_A < obj_B: pref_pair = [idx_A, idx_B]
        elif obj_B < obj_A: pref_pair = [idx_B, idx_A]

        if pref_pair and pref_pair not in comparison_pairs_indices:
            comparison_pairs_indices.append(pref_pair)

print(f"Initialization complete. Observed {len(all_U_vectors_list)} unique schedules.")
print(f"Number of preference pairs: {len(comparison_pairs_indices)}")
if best_U_overall is not None:
    print(f"Best U after initialization: {best_U_overall.tolist()}, Obj: {best_obj_overall:.4f}")
Starting PBO Initialization...
Generating 3 initial comparisons involving U_zeros...
  Init U_zeros vs U_rand_0: Obj_U_zeros=205.2664, Obj_U_rand=10000000000.0000
  Init U_zeros vs U_rand_1: Obj_U_zeros=205.2664, Obj_U_rand=10000000000.0000
  Init U_zeros vs U_rand_2: Obj_U_zeros=205.2664, Obj_U_rand=10000000000.0000
Generating 7 purely random initial comparisons...
  Init U_randA_0 vs U_randB_0: Obj_A=10000000000.0000, Obj_B=10000000000.0000
  Init U_randA_1 vs U_randB_1: Obj_A=10000000000.0000, Obj_B=207.1167
  Init U_randA_2 vs U_randB_2: Obj_A=10000000000.0000, Obj_B=225.0753
  Init U_randA_3 vs U_randB_3: Obj_A=10000000000.0000, Obj_B=10000000000.0000
  Init U_randA_4 vs U_randB_4: Obj_A=10000000000.0000, Obj_B=10000000000.0000
  Init U_randA_5 vs U_randB_5: Obj_A=10000000000.0000, Obj_B=10000000000.0000
  Init U_randA_6 vs U_randB_6: Obj_A=215.4007, Obj_B=10000000000.0000
Initialization complete. Observed 18 unique schedules.
Number of preference pairs: 6
Best U after initialization: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 205.2664

7.10 PBO Loop

import time

# Start timer for PBO loop
start_time = time.time()

pairwise_model = None

for pbo_iter in range(num_pbo_iterations):
    print(f"\n--- PBO Iteration {pbo_iter + 1}/{num_pbo_iterations} ---")
    if len(comparison_pairs_indices) < 1 or all_U_embeddings_tensor is None or all_U_embeddings_tensor.shape[0] < 2:
        print("Not enough data for GP. Generating a random pair to add to comparisons.")
        # (Simplified fallback: add one random comparison)
        U_A_fallback = generate_random_U_vector(T_param)
        U_B_fallback = generate_random_U_vector(T_param)
        attempts = 0
        while np.array_equal(U_A_fallback, U_B_fallback) and attempts < 100:
            U_B_fallback = generate_random_U_vector(T_param)
            attempts += 1
        if np.array_equal(U_A_fallback, U_B_fallback):
            continue

        obj_A_fb = evaluate_objective(U_A_fallback, X_initial_schedule, v_star_matrix, convolutions_dict, d_interval_len, w_weight)
        obj_B_fb = evaluate_objective(U_B_fallback, X_initial_schedule, v_star_matrix, convolutions_dict, d_interval_len, w_weight)
        if obj_A_fb < best_obj_overall:
            best_obj_overall = obj_A_fb
            best_U_overall = U_A_fallback.copy()
            best_iter_found = pbo_iter + 1
        if obj_B_fb < best_obj_overall:
            best_obj_overall = obj_B_fb
            best_U_overall = U_B_fallback.copy()
            best_iter_found = pbo_iter + 1

        idx_A_fb = add_U_to_master_list(U_A_fallback, embed_HED(U_A_fallback, hed_dictionary))
        idx_B_fb = add_U_to_master_list(U_B_fallback, embed_HED(U_B_fallback, hed_dictionary))
        if not np.isclose(obj_A_fb, obj_B_fb):
            pref_pair_fb = [idx_A_fb, idx_B_fb] if obj_A_fb < obj_B_fb else [idx_B_fb, idx_A_fb]
            if pref_pair_fb not in comparison_pairs_indices:
                comparison_pairs_indices.append(pref_pair_fb)
        if len(comparison_pairs_indices) < 1 or all_U_embeddings_tensor.shape[0] < 2:
            continue

    train_X_gp = all_U_embeddings_tensor.double()
    # PairwiseGP convention: each comparison row [i, j] means item i is preferred over item j
    train_Y_gp = torch.tensor(comparison_pairs_indices, dtype=torch.long)
    # Hamming-distance features lie in [0, T], so normalize inputs to the unit cube
    min_bounds = torch.zeros(train_X_gp.shape[-1], dtype=torch.double)
    max_bounds = torch.full((train_X_gp.shape[-1],), float(T_param), dtype=torch.double)
    input_transform = Normalize(d=train_X_gp.shape[-1], bounds=torch.stack([min_bounds, max_bounds]))

    pairwise_model = PairwiseGP(
        train_X_gp,
        train_Y_gp,
        input_transform=input_transform,
        covar_module=gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.MaternKernel(nu=2.5, ard_num_dims=train_X_gp.shape[-1])
        )
    )
    mll = PairwiseLaplaceMarginalLogLikelihood(pairwise_model.likelihood, pairwise_model)

    U_query1, U_query2 = None, None
    try:
        fit_gpytorch_mll(mll)
        U_cand_pool_list = []
        for _ in range(n_candidates_for_thompson // 2):
            U_cand_pool_list.append(generate_random_U_vector(T_param))
        if best_U_overall is not None:
            U_cand_pool_list.extend(
                get_hamming_neighbors(best_U_overall, n_candidates_for_thompson // 2, max_flips=2)
            )
        else:
            for _ in range(n_candidates_for_thompson // 2):
                U_cand_pool_list.append(generate_random_U_vector(T_param))

        unique_U_cand_tuples = {tuple(u.tolist()) for u in U_cand_pool_list}
        U_cand_pool_np = np.array([list(t) for t in unique_U_cand_tuples], dtype=int)
        if len(U_cand_pool_np) == 0:
            raise ValueError("Candidate pool is empty.")

        embedded_candidates_np = np.array([embed_HED(u, hed_dictionary) for u in U_cand_pool_np])
        embedded_candidates_torch = torch.from_numpy(embedded_candidates_np).double()

        pairwise_model.eval()
        with torch.no_grad(), gpytorch.settings.fast_pred_var():
            # Caveat: posterior() applies the model's input transform internally,
            # so pre-transforming here normalizes the candidates a second time,
            # scaling them inconsistently with the (singly transformed) training inputs.
            posterior_f = pairwise_model.posterior(input_transform(embedded_candidates_torch))
            f_samples = posterior_f.sample(torch.Size([n_thompson_posterior_samples]))

        best_indices_per_sample = torch.argmax(f_samples, dim=1)
        query_idx1_in_cand_pool = best_indices_per_sample[0].item()
        U_query1 = U_cand_pool_np[query_idx1_in_cand_pool]
        query_idx2_in_cand_pool = -1

        if len(U_cand_pool_np) > 1:
            if len(best_indices_per_sample) > 1:
                for i in range(1, len(best_indices_per_sample)):
                    if best_indices_per_sample[i].item() != query_idx1_in_cand_pool:
                        query_idx2_in_cand_pool = best_indices_per_sample[i].item()
                        break
            if (
                query_idx2_in_cand_pool == -1
                or query_idx1_in_cand_pool == query_idx2_in_cand_pool
            ):
                rand_indices = np.arange(len(U_cand_pool_np))
                np.random.shuffle(rand_indices)
                for rand_idx in rand_indices:
                    if rand_idx != query_idx1_in_cand_pool:
                        query_idx2_in_cand_pool = rand_idx
                        break
                U_query2 = (
                    U_cand_pool_np[query_idx2_in_cand_pool]
                    if query_idx2_in_cand_pool != -1
                    else generate_random_U_vector(T_param)
                )
            else:
                U_query2 = U_cand_pool_np[query_idx2_in_cand_pool]
        else:
            U_query2 = generate_random_U_vector(T_param)
    except Exception as e:
        print(f"Error in GP/TS: {e}. Falling back to random.")
        U_query1 = generate_random_U_vector(T_param)
        U_query2 = generate_random_U_vector(T_param)

    attempts = 0
    while np.array_equal(U_query1, U_query2) and attempts < 100:
        U_query2 = generate_random_U_vector(T_param)
        attempts += 1
    if np.array_equal(U_query1, U_query2):
        continue

    obj_q1 = evaluate_objective(
        U_query1,
        X_initial_schedule,
        v_star_matrix,
        convolutions_dict,
        d_interval_len,
        w_weight,
    )
    obj_q2 = evaluate_objective(
        U_query2,
        X_initial_schedule,
        v_star_matrix,
        convolutions_dict,
        d_interval_len,
        w_weight,
    )
    print(f"  Query 1 (U): {U_query1.tolist()}, Obj: {obj_q1:.4f}")
    print(f"  Query 2 (U): {U_query2.tolist()}, Obj: {obj_q2:.4f}")

    if obj_q1 < best_obj_overall:
        best_obj_overall = obj_q1
        best_U_overall = U_query1.copy()
        best_iter_found = pbo_iter + 1
        print(
            f"  New best U from Q1: {best_U_overall.tolist()}, Obj: {best_obj_overall:.4f} (Iter: {best_iter_found})"
        )
    if obj_q2 < best_obj_overall:
        best_obj_overall = obj_q2
        best_U_overall = U_query2.copy()
        best_iter_found = pbo_iter + 1
        print(
            f"  New best U from Q2: {best_U_overall.tolist()}, Obj: {best_obj_overall:.4f} (Iter: {best_iter_found})"
        )

    idx_q1 = add_U_to_master_list(
        U_query1, embed_HED(U_query1, hed_dictionary)
    )
    idx_q2 = add_U_to_master_list(
        U_query2, embed_HED(U_query2, hed_dictionary)
    )

    if not np.isclose(obj_q1, obj_q2):
        pref_pair_iter = [idx_q1, idx_q2] if obj_q1 < obj_q2 else [idx_q2, idx_q1]
        print(f"  Preference: {'Q1 > Q2' if obj_q1 < obj_q2 else 'Q2 > Q1'}")
        if pref_pair_iter not in comparison_pairs_indices:
            comparison_pairs_indices.append(pref_pair_iter)
        else:
            print("  (Preference already recorded)")
    else:
        print("  Preference: Tie")
    print(f"  Total unique Us: {len(all_U_vectors_list)}, Total preference pairs: {len(comparison_pairs_indices)}")

print("\n--- PBO Experiment Finished ---")
if best_U_overall is not None:
    print(f"Best U vector found (by direct evaluation): {best_U_overall.tolist()}")
    final_Y_schedule = X_initial_schedule + np.sum(v_star_matrix[best_U_overall == 1, :], axis=0)
    print(f"Corresponding Y schedule: {final_Y_schedule.tolist()}")
    print(f"Objective value: {best_obj_overall:.4f}")
    iter_str = "during initialization (before PBO iterations)" if best_iter_found == 0 else f"at PBO iteration {best_iter_found}"
    print(f"Found {iter_str}")
    print(f"Patient count in final Y schedule: {np.sum(final_Y_schedule)}")
else: print("No valid U vector found.")

if pairwise_model is not None and all_U_embeddings_tensor is not None and len(all_U_embeddings_tensor) > 0:
    pairwise_model.eval()
    # Same input-transform caveat as above: posterior() will normalize these again.
    transformed_all_U_embeddings = input_transform(all_U_embeddings_tensor)
    with torch.no_grad(), gpytorch.settings.fast_pred_var():
        posterior_f_all = pairwise_model.posterior(transformed_all_U_embeddings)
        mean_f_all = posterior_f_all.mean
    best_idx_model = torch.argmax(mean_f_all).item()
    U_reco_model = all_U_vectors_list[best_idx_model]
    obj_reco_model = evaluate_objective(U_reco_model, X_initial_schedule, v_star_matrix, convolutions_dict, d_interval_len, w_weight)
    print(f"\nRecommendation based on GP model (highest posterior mean utility over all {len(all_U_vectors_list)} evaluated Us):")
    print(f"  U vector: {U_reco_model.tolist()}")
    Y_reco = X_initial_schedule + np.sum(v_star_matrix[U_reco_model == 1, :], axis=0)
    print(f"  Corresponding Y schedule: {Y_reco.tolist()}")
    print(f"  Objective value of this GP recommended U: {obj_reco_model:.4f}")
    print(f"  GP's posterior mean utility for this U: {mean_f_all[best_idx_model].item():.4f}")
else: print("\nGP model not available for recommendation.")

# Report total experiment runtime
end_time = time.time()
elapsed = end_time - start_time
print(f"\n--- PBO Experiment Finished in {elapsed:.2f} seconds ---")

--- PBO Iteration 1/50 ---
  Query 1 (U): [0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 205.9627
  Preference: Q2 > Q1
  Total unique Us: 20, Total preference pairs: 7

--- PBO Iteration 2/50 ---
  Query 1 (U): [1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 22, Total preference pairs: 7

--- PBO Iteration 3/50 ---
  Query 1 (U): [1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0], Obj: 10000000000.0000
  Query 2 (U): [1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 24, Total preference pairs: 7

--- PBO Iteration 4/50 ---
  Query 1 (U): [1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0], Obj: 10000000000.0000
  Query 2 (U): [1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 26, Total preference pairs: 7

--- PBO Iteration 5/50 ---
  Query 1 (U): [1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 28, Total preference pairs: 7

--- PBO Iteration 6/50 ---
  Query 1 (U): [1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 30, Total preference pairs: 7

--- PBO Iteration 7/50 ---
  Query 1 (U): [1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 32, Total preference pairs: 7

--- PBO Iteration 8/50 ---
  Query 1 (U): [1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 34, Total preference pairs: 7

--- PBO Iteration 9/50 ---
  Query 1 (U): [1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 36, Total preference pairs: 7

--- PBO Iteration 10/50 ---
  Query 1 (U): [0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0], Obj: 10000000000.0000
  Query 2 (U): [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 38, Total preference pairs: 7

--- PBO Iteration 11/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.6964
  Query 2 (U): [1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1], Obj: 10000000000.0000
  Preference: Q1 > Q2
  Total unique Us: 40, Total preference pairs: 8

--- PBO Iteration 12/50 ---
  Query 1 (U): [0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 207.0453
  Query 2 (U): [0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 10000000000.0000
  Preference: Q1 > Q2
  Total unique Us: 42, Total preference pairs: 9

--- PBO Iteration 13/50 ---
  Query 1 (U): [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.8452
  Preference: Q2 > Q1
  Total unique Us: 44, Total preference pairs: 10

--- PBO Iteration 14/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.1122
  Query 2 (U): [1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1], Obj: 10000000000.0000
  Preference: Q1 > Q2
  Total unique Us: 46, Total preference pairs: 11

--- PBO Iteration 15/50 ---
  Query 1 (U): [1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 48, Total preference pairs: 11

--- PBO Iteration 16/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.7525
  Query 2 (U): [0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0], Obj: 10000000000.0000
  Preference: Q1 > Q2
  Total unique Us: 50, Total preference pairs: 12

--- PBO Iteration 17/50 ---
  Query 1 (U): [0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1], Obj: 207.0562
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0], Obj: 10000000000.0000
  Preference: Q1 > Q2
  Total unique Us: 52, Total preference pairs: 13

--- PBO Iteration 18/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0], Obj: 206.9445
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 205.5335
  Preference: Q2 > Q1
  Total unique Us: 54, Total preference pairs: 14

--- PBO Iteration 19/50 ---
  Query 1 (U): [0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 56, Total preference pairs: 14

--- PBO Iteration 20/50 ---
  Query 1 (U): [0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1], Obj: 10000000000.0000
  Query 2 (U): [1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 58, Total preference pairs: 14

--- PBO Iteration 21/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.3939
  Query 2 (U): [1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1], Obj: 10000000000.0000
  Preference: Q1 > Q2
  Total unique Us: 60, Total preference pairs: 15

--- PBO Iteration 22/50 ---
  Query 1 (U): [0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0], Obj: 10000000000.0000
  Query 2 (U): [1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 62, Total preference pairs: 15

--- PBO Iteration 23/50 ---
  Query 1 (U): [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0], Obj: 10000000000.0000
  Query 2 (U): [0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 64, Total preference pairs: 15

--- PBO Iteration 24/50 ---
  Query 1 (U): [1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.0704
  Preference: Q2 > Q1
  Total unique Us: 66, Total preference pairs: 16

--- PBO Iteration 25/50 ---
  Query 1 (U): [1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 221.4890
  Query 2 (U): [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 10000000000.0000
  Preference: Q1 > Q2
  Total unique Us: 68, Total preference pairs: 17

--- PBO Iteration 26/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.1706
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.1301
  Preference: Q2 > Q1
  Total unique Us: 70, Total preference pairs: 18

--- PBO Iteration 27/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0], Obj: 10000000000.0000
  Query 2 (U): [1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 72, Total preference pairs: 18

--- PBO Iteration 28/50 ---
  Query 1 (U): [0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.9641
  Query 2 (U): [1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0], Obj: 10000000000.0000
  Preference: Q1 > Q2
  Total unique Us: 74, Total preference pairs: 19

--- PBO Iteration 29/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 10000000000.0000
  Query 2 (U): [0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 76, Total preference pairs: 19

--- PBO Iteration 30/50 ---
  Query 1 (U): [0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0], Obj: 10000000000.0000
  Query 2 (U): [1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 78, Total preference pairs: 19

--- PBO Iteration 31/50 ---
  Query 1 (U): [0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.8882
  Preference: Q2 > Q1
  Total unique Us: 80, Total preference pairs: 20

--- PBO Iteration 32/50 ---
  Query 1 (U): [0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 207.0182
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0], Obj: 10000000000.0000
  Preference: Q1 > Q2
  Total unique Us: 82, Total preference pairs: 21

--- PBO Iteration 33/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 84, Total preference pairs: 21

--- PBO Iteration 34/50 ---
  Query 1 (U): [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0], Obj: 206.7306
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 10000000000.0000
  Preference: Q1 > Q2
  Total unique Us: 86, Total preference pairs: 22

--- PBO Iteration 35/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], Obj: 205.4661
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0], Obj: 10000000000.0000
  Preference: Q1 > Q2
  Total unique Us: 88, Total preference pairs: 23

--- PBO Iteration 36/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0], Obj: 10000000000.0000
  Query 2 (U): [1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 90, Total preference pairs: 23

--- PBO Iteration 37/50 ---
  Query 1 (U): [0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1], Obj: 10000000000.0000
  Query 2 (U): [1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 92, Total preference pairs: 23

--- PBO Iteration 38/50 ---
  Query 1 (U): [1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 94, Total preference pairs: 23

--- PBO Iteration 39/50 ---
  Query 1 (U): [1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 205.3123
  Preference: Q2 > Q1
  Total unique Us: 96, Total preference pairs: 24

--- PBO Iteration 40/50 ---
  Query 1 (U): [1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 98, Total preference pairs: 24

--- PBO Iteration 41/50 ---
  Query 1 (U): [0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.7417
  Preference: Q2 > Q1
  Total unique Us: 100, Total preference pairs: 25

--- PBO Iteration 42/50 ---
  Query 1 (U): [0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 206.8088
  Preference: Q2 > Q1
  Total unique Us: 102, Total preference pairs: 26

--- PBO Iteration 43/50 ---
  Query 1 (U): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 205.2584
  Query 2 (U): [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 207.0135
  New best U from Q1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Obj: 205.2584 (Iter: 43)
  Preference: Q1 > Q2
  Total unique Us: 104, Total preference pairs: 27

--- PBO Iteration 44/50 ---
  Query 1 (U): [0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 106, Total preference pairs: 27

--- PBO Iteration 45/50 ---
  Query 1 (U): [1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 108, Total preference pairs: 27

--- PBO Iteration 46/50 ---
  Query 1 (U): [1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 110, Total preference pairs: 27

--- PBO Iteration 47/50 ---
  Query 1 (U): [1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 112, Total preference pairs: 27

--- PBO Iteration 48/50 ---
  Query 1 (U): [0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1], Obj: 10000000000.0000
  Query 2 (U): [1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 114, Total preference pairs: 27

--- PBO Iteration 49/50 ---
  Query 1 (U): [1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1], Obj: 10000000000.0000
  Query 2 (U): [1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 116, Total preference pairs: 27

--- PBO Iteration 50/50 ---
  Query 1 (U): [0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0], Obj: 10000000000.0000
  Query 2 (U): [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1], Obj: 10000000000.0000
  Preference: Tie
  Total unique Us: 118, Total preference pairs: 27

--- PBO Experiment Finished ---
Best U vector found (by direct evaluation): [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Corresponding Y schedule: [2, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 12]
Objective value: 205.2584
Found at PBO iteration 43
Patient count in final Y schedule: 50

Recommendation based on GP model (highest posterior mean utility over all 118 evaluated Us):
  U vector: [1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0]
  Corresponding Y schedule: [2, 0, 2, 0, 0, 2, 1, 0, 2, 0, 1, 0, 2, -1, 2, 0, 1, 2, 0, 1, 0, 1, 2, 0, 1, 1, 1, 0, 0, 1, 2, 1, 0, 0, 2, 0, 2, -1, 2, 0, 2, 0, 1, 1, 1, 1, -1, 13]
  Objective value of this GP recommended U: 10000000000.0000
  GP's posterior mean utility for this U: -0.1030

--- PBO Experiment Finished in 55.63 seconds ---

7.11 Discussion

In this study, we demonstrated how Preferential Bayesian Optimization (PBO) combined with Hamming Embeddings via Dictionaries (HED) can navigate the high-dimensional binary space of schedule perturbations using only pairwise comparisons of candidate schedules. In the run reported above, however, the gain was modest: the best perturbation found improved the initial Bailey–Welch schedule from 205.2664 to 205.2584, most randomly sampled perturbations were infeasible (penalty \(10^{10}\)) so many duels ended as uninformative ties, and the GP's own recommendation was an infeasible schedule. Feasibility-aware candidate generation and tie handling therefore appear to be prerequisites before this pipeline can be expected to clearly outperform naïve random or local-search heuristics.

7.11.1 Future Directions

Automated Acquisition–Function Discovery with FunBO.
Aglietti et al. (2024) introduce FunBO, a technique that leverages large language models to programmatically generate, refine, and evaluate novel Bayesian‐optimization acquisition functions. By iterating between an LLM prompt (to propose new code variants) and empirical benchmarking, FunBO has discovered bespoke strategies that outperform classical rules like Expected Improvement or Upper Confidence Bound on standard continuous benchmarks.

Why this matters for HED+PBO: Our current pipeline uses off‐the‐shelf PBO acquisitions (e.g. Thompson sampling on Hamming‐dictionary inputs). FunBO could be tasked to propose duel‐specific acquisition rules—perhaps embedding‐aware diversity measures or adaptive exploration–exploitation schedules—that are tailor‐made for the combinatorial neighborhood of outpatient schedules. Systematically comparing these LLM‐generated acquisitions could lead to even faster convergence or better final schedules, especially in large or highly constrained clinics.

Inner–Loop Amortization via Learned Proposal Policies.
Swersky et al. (2020) train a lightweight neural policy via reinforcement learning to amortize the inner maximization of discrete‐space acquisition functions. Instead of exhaustively scanning candidates at each BO step, the network instantly proposes high‐quality points, yielding dramatic per‐iteration speed‐ups on tasks such as protein‐sequence design.

Why this matters for HED+PBO: In our implementation, we still sample and score hundreds of Hamming‐neighbors each iteration—incurring nontrivial runtime as \(T\) and the dictionary size grow. By training a “deep evolutionary” proposal network on synthetic perturbation‐scheduling tasks (varying \(N,T\), service distributions, no‐show rates), we could replace that random/Hamming sampling step with a single neural forward pass, enabling near‐real‐time PBO for large‐scale or interactive applications.

Fully Amortized Preferential BO with Meta‐Learning.
Zhang et al. (2025) develop PABBO, a transformer-based meta-learner that jointly models both the surrogate posterior and the duel acquisition policy. Once pre-trained on a diverse corpus of BO problems, it requires no GP refitting or inner-loop optimization at test time: it simply maps past duels to new pair scores in one forward pass, achieving orders-of-magnitude speed-ups while matching or exceeding GP-based PBO baselines.

Why this matters for HED+PBO: Although our HED+PairwiseGP already reduces input dimensionality, each iteration still fits a GP and samples dozens of candidates. A PABBO‐inspired extension could meta‐train on HED‐embedded scheduling duels, so that in a new clinic setting, the model instantly proposes top perturbations—unlocking truly interactive preferential optimization for clinician‐in‐the‐loop scheduling or adaptive appointment systems.

By exploring these directions—LLM‐driven acquisition design, learned inner‐loop proposals, and fully amortized PBO—we can further improve the scalability, runtime efficiency, and automation of HED‐based preferential optimization for outpatient scheduling.

7.12 Timeline

  • Experiment Design: 19-05-2025
  • Implementation: 19-05-2025
  • Execution: (to be filled after run)
  • Analysis: (to be filled after run)

7.13 References
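
  • Aglietti, V., et al. (2024). FunBO: Discovering Acquisition Functions for Bayesian Optimization with FunSearch. arXiv preprint.
  • Balandat, M., Karrer, B., Jiang, D. R., Daulton, S., Letham, B., Wilson, A. G., & Bakshy, E. (2020). BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization. Advances in Neural Information Processing Systems 33.
  • Deshwal, A., Ament, S., Balandat, M., Bakshy, E., Doppa, J. R., & Eriksson, D. (2023). Bayesian Optimization over High-Dimensional Combinatorial Spaces via Dictionary-Based Embeddings. Proceedings of AISTATS 2023.
  • González, J., Dai, Z., Damianou, A., & Lawrence, N. D. (2017). Preferential Bayesian Optimization. Proceedings of ICML 2017.
  • Kaandorp, G. C., & Koole, G. (2007). Optimal Outpatient Appointment Scheduling. Health Care Management Science, 10(3), 217–229.
  • Swersky, K., Rubanova, Y., Dohan, D., & Murphy, K. (2020). Amortized Bayesian Optimization over Discrete Spaces. Proceedings of UAI 2020.
  • Zhang, X., Huang, D., Kaski, S., & Martinelli, J. (2025). PABBO: Preferential Amortized Black-Box Optimization. ICLR 2025.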