2024-06-06

Author

Witek ten Hove

OBP:

Afspraken:

We gaan verder kijken naar Simulation Optimization methodes
Wellicht icm Gradient Boosting, mogelijk ML-toepassingen
Onderzoeken wat de stand van zaken is mbt SO en Appointment Scheduling
Start met artikel van Homem-de-Mello, Kong, and Godoy-Barba (2022)
Waarom zou het probleem dat besproken wordt in Homem-de-Mello, Kong, and Godoy-Barba (2022) non-convex zijn?
Aanmaken van Overleaf document voor samenwerking.
Literatuurnotities maken
Literatuuroverzicht maken
Problem description uitwerken in Overleaf.

import pandas as pd
import json
from itables import init_notebook_mode, show
init_notebook_mode(all_interactive=False)
import math
import numpy as np
import plotly.graph_objects as go

This is the init_notebook_mode cell from ITables v2.1.0
(you should not see this message - is your notebook trusted?)

# Read the JSON data from the file
with open('methods.json', 'r') as file:
    data = json.load(file)

# Sort the data by the "Year" field
sorted_data = sorted(data, key=lambda x: x["Year"])

# Extracting the data for the table
headers = ["BibTex ID", "Time Slots", "Schedule Format", "Overbooking Allowed", "Service Time", "No-Shows Assumed", "Emergency Patients Assumed", "Punctuality Assumed", "Waiting Time", "Idle Time", "Overtime", "Solution Method"]
rows = []

for article in sorted_data:
    row = [
        article["BibTex ID"],
        article["Modeling Approach"]["time slot length"],
        article["Modeling Approach"]["schedule format"],
        article["Modeling Approach"]["overbooking allowed"],
        article["Modeling Approach"]["service time"],
        article["Modeling Approach"]["no-shows assumed"],
        article["Modeling Approach"]["emergency patients assumed"],
        article["Modeling Approach"]["punctuality assumed"],
        article["Modeling Approach"]["objective function elements"]["waiting time"],
        article["Modeling Approach"]["objective function elements"]["idle time"],
        article["Modeling Approach"]["objective function elements"]["overtime"],
        article["Solution method summary"]
    ]
    rows.append(row)

# Create a DataFrame
df = pd.DataFrame(rows, columns=headers)

# Use Styler to format the DataFrame
styled_table = df.style.set_properties(**{'text-align': 'left', 'white-space': 'pre-wrap'}) \
                 .set_table_styles([dict(selector='th', props=[('text-align', 'left'), ('vertical-align', 'top')])])  
show(styled_table,
     paging=False,
     classes='display compact cell-border',
     buttons=['copyHtml5', 'csvHtml5', 'excelHtml5'],
     layout={'top1': 'searchPanes'},
     searchPanes={"cascadePanes": True, 'initCollapsed': True},)

BibTex ID	Time Slots	Schedule Format	Overbooking Allowed	Service Time	No-Shows Assumed	Emergency Patients Assumed	Punctuality Assumed	Waiting Time	Idle Time	Overtime	Solution Method
kaandorp_optimal_2007	fixed	number of patients per time slot	yes	exponential distribution	yes	no	yes	yes	yes	yes	A local search procedure ensures globally optimal schedules by evaluating and improving performance iteratively.
klassen_improving_2009	variable, results from schedule	appointment time, discrete	yes	lognormal distribution	yes	no	yes	yes	yes	yes	A simulation optimization approach integrates metaheuristics to iteratively search for optimal scheduling solutions.
koeleman_optimal_2012	fixed	number of patients per time slot	yes	general distribution	yes	yes	yes	yes	yes	yes	A generalized local search method evaluates and iteratively improves schedules to find the optimal solution.
cayirli_universal_2012	variable	['number of patients per time slot', 'appointment time, integer']	yes	lognormal distribution	yes	yes	yes	yes	yes	yes	The 'Dome' rule is parameterized using simulation and nonlinear regression to accommodate no-shows and walk-ins.
qingxia_kong_scheduling_2013	variable, results from schedule	time interval, continuous	yes	general distribution	no	no	yes	yes	no	yes	A convex conic programming approach uses a two-stage stochastic optimization problem to minimize maximum expected cost.
klassen_strategies_2014	variable, results from schedule	appointment time, integer	no	lognormal distribution	yes	no	no	yes	yes	yes	A simulation optimization framework explores various scheduling rules to design effective schedules under patient unpunctuality.
kuiper_computational_2015	variable, results from schedule	appointment time, continuous	no	general distribution	no	no	yes	yes	yes	yes	A computational approach uses phase-type distributions to approximate service times for sequential and simultaneous optimization.
jiang_integer_2017	variable, results from schedule	time interval, continuous	yes	general distribution	yes	no	yes	yes	yes	yes	Distributionally robust optimization models use integer programming to manage random no-shows and service durations.
deceuninck_outpatient_2018	fixed	appointment time, integer	yes	lognormal distribution	yes	no	no	yes	yes	yes	A numerical methodology using a modified Lindley recursion evaluates schedules accounting for no-shows and unpunctuality.
wang_managing_2020	fixed	number of patients per time slot	yes	general distribution	yes	yes	yes	yes	yes	yes	A two-stage stochastic linear programming model incorporates walk-in processes and heterogeneous no-show behaviors.
zacharias_multimodularity_2020	fixed	number of patients per time slot	yes	general distribution	yes	yes	no	yes	yes	yes	A novel method uses nonlinear integer programming and multimodularity to ensure globally optimal schedules.
homem-de-mello_simulation_2022	variable, results from schedule	time interval, continuous	no	general distribution	yes	no	yes	yes	yes	yes	A simulation optimization approach uses Monte Carlo simulations and a projected gradient path algorithm.
kuiper_flexible_2023	variable	time interval, continuous	no	phase-type distribution	yes	yes	no	yes	yes	yes	A method using phase-type distribution approximations minimizes idle time and waiting time with penalties for overtime.

Solution landscape

The scheduling problem can be illustrated as a landscape that contains all possible solutions (points) and the task is to find a path that leads to the solution that has the lowest value. Each solution \(x\) can be defined as an array of \(T\) coordinates with values between \(0\) and \(N\), such that the sum of all coordinates equals \(N\):

\[ x = (x_1, x_2, \ldots, x_T) \quad \text{where} \quad x_t \in \{0, 1, \ldots, N\} \quad \text{and} \quad \sum_{t=1}^{T} x_t = N \]

def binomial_coefficient(N, T):
    return math.comb(N + T - 1, N)

def create_square_matrix(i):
    if i < 1:
        raise ValueError("The input must be an integer greater than 1.")
    
    # Find n and m such that n * m = i and |n - m| is minimized
    n = int(np.sqrt(i))
    while i % n != 0:
        n -= 1
    m = i // n
    
    # Generate the n x m matrix with random values
    matrix = np.random.rand(n, m)
    
    return matrix

def find_positions(matrix, values):
    positions = []
    for value in values:
        result = np.where(matrix == value)
        positions.append((result[0][0], result[1][0]))
    return positions

def plot_matrix_with_arrows(N, T, l):
    i = binomial_coefficient(N, T)
    matrix = create_square_matrix(i)
    n, m = matrix.shape
    x, y, values = [], [], []

    for i in range(n):
        for j in range(m):
            x.append(j)
            y.append(n - i - 1)  # Invert y-axis to have the origin at the top-left corner
            values.append(matrix[i, j])
    
    # Find the four lowest values and their positions
    sorted_values = np.sort(matrix, axis=None)
    lowest_values = sorted_values[:l]
    
    positions = find_positions(matrix, lowest_values)
    
    fig = go.Figure(data=go.Scatter(
        x=x,
        y=y,
        mode='markers',
        marker=dict(
            size=20,
            color=values,
            colorscale='deep',
        ),
        hovertemplate = 'Value: %{marker.color:.5f}<extra></extra>'
    ))
    
    fig.update_layout(
        title=f'Landscape with all {n * m} possible solutions for N = {N} and T = {T}',
        xaxis=dict(
            tickmode='linear',
            dtick=1,
            range=[-0.5, m - 0.5],
            visible=False  # numbers below
        ),
        yaxis=dict(
            tickmode='linear',
            dtick=1,
            range=[-0.5, n - 0.5],
            autorange='reversed',
            visible=False  # numbers below
        ),
        # xaxis_title='Column',
        # yaxis_title='Row',
        width=800,
        height=800,
        showlegend = False
    )

    # Add arrows from the fourth lowest to the lowest value
    for i in range(l-1):
        (start_i, start_j) = positions[i + 1]
        (end_i, end_j) = positions[i]
        fig.add_annotation(
            x=end_j,
            y=n - end_i - 1,
            ax=start_j,
            ay=n - start_i - 1,
            xref="x",
            yref="y",
            axref="x",
            ayref="y",
            showarrow=True,
            arrowhead=2,
            arrowsize=3,
            arrowwidth=2,
            arrowcolor="red"
        )
    
    fig.show()
    fig.write_image("images/landscape-with-arrows.png", scale=4)

# Example usage
N = 7
T = 5

# Plot the matrix with arrows
plot_matrix_with_arrows(N, T, 6)

What we know is:

A scheduling problem with fixed and equal time slots defined as:

\[ \min \left\{ \alpha_W W(x) + \alpha_I I(x) + \alpha_L L(x) \ \middle| \ \begin{array}{c} \sum_{t}^{T} x_t = N \\ x_t \in \mathbb{N}_0 \end{array} \right\} \]
has a multimodular objective (Kaandorp and Koole 2007). This means that any local optimum is also the global optimum.
In the case where appointment times are continuous and variable the optimal schedule is dome-shaped, meaning the time allotted for each appointment initially increases sharply, then climbs slowly to a peak, and decreases slowly after the peak. This allows for some buffer time to absorb increasing service time variability as the day progresses due to accumulated delays, while keeping slots tight at the start and end (Klassen and Yoogalingam 2009).
Would that mean that schedules with overbooking at the start and towards the finish and more zeroes in the middle show optimal performance?

References

Homem-de-Mello, Tito, Qingxia Kong, and Rodrigo Godoy-Barba. 2022. “A Simulation Optimization Approach for the Appointment Scheduling Problem with Decision-Dependent Uncertainties.” INFORMS Journal on Computing 34 (5): 2845–65.

Kaandorp, Guido C, and Ger Koole. 2007. “Optimal Outpatient Appointment Scheduling.” Health Care Management Science 10: 217–29.

Klassen, Kenneth J., and Reena Yoogalingam. 2009. “Improving Performance in Outpatient Appointment Services with a Simulation Optimization Approach.” Production and Operations Management 18 (4): 447–58. https://doi.org/10.1111/j.1937-5956.2009.01021.x.