Publications

Yiman Bao, Jie Gao, Jinke He, Frans A. Oliehoek, and Oded Cats. A timely match for ride-hailing and ride-pooling services using a deep reinforcement learning approach. Transportation Research Part C: Emerging Technologies, 187:105644, 2026.

@article{Bao26TRC,
author = {Yiman Bao and Jie Gao and Jinke He and Frans A. Oliehoek and Oded Cats},
title = {A timely match for ride-hailing and ride-pooling services using a deep reinforcement learning approach},
journal = {Transportation Research Part C: Emerging Technologies},
volume = {187},
pages = {105644},
year = {2026},
issn = {0968-090X},
doi = {10.1016/j.trc.2026.105644},
OPTurl = {https://www.sciencedirect.com/science/article/pii/S0968090X26001324},
keywords = {refereed},
abstract = {
Efficient matching in ride-hailing and ride-pooling services depends
not only on how matches are constructed, but also on when the
platform triggers a matching operation. Many systems use batched
matching with a fixed time interval to accumulate requests before
matching, which increases the candidate set but cannot adapt to
real-time supply-demand fluctuations and may induce unnecessary
waiting. This paper proposes a reinforcement learning approach that
learns when to trigger matching based on current system conditions.
We formulate the timing problem as a finite-horizon Markov decision
process and train the policy using the Proximal Policy Optimization
algorithm. To address sparse and delayed feedback, we introduce a
finite-horizon, potential-based reward shaping scheme that
preserves the optimal policy while densifying the learning signal;
the same framework applies to both ride-hailing and ride-pooling, where
detour delay is incorporated into the reward for pooling. Using a
data-driven simulator calibrated on NYC trip records, the learned
policy adapts matching timing decisions to the current state of
waiting requests and available drivers and outperforms
fixed-interval, rule-based dynamic, and first-dispatch baselines.
It reduces total waiting time by 3.1% in ride-hailing and 20.1% in
ride-pooling, and detour delay by 36.1% in pooling, while
maintaining short matching times.
}
}
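The entry above mentions a finite-horizon, potential-based reward shaping scheme that preserves the optimal policy while densifying the learning signal. As general background (not the paper's implementation), potential-based shaping in the sense of Ng et al. (1999) adds F(s, s') = γΦ(s') − Φ(s) to each step's reward; in the finite-horizon case, a time-indexed potential that is forced to zero at the horizon keeps the optimal policy unchanged because the shaping terms telescope. A minimal sketch, using a hypothetical potential (negative count of waiting requests):

```python
# Minimal sketch of finite-horizon potential-based reward shaping.
# The potential Phi is a hypothetical example (negative number of waiting
# requests); the paper's actual potential and state are not reproduced here.

def potential(num_waiting: int, t: int, horizon: int) -> float:
    """Time-indexed potential Phi(s, t); forced to zero at the horizon."""
    if t >= horizon:
        return 0.0
    return -float(num_waiting)


def shaped_reward(r: float, s_waiting: int, s_next_waiting: int,
                  t: int, horizon: int, gamma: float = 1.0) -> float:
    """Return r + gamma * Phi(s', t+1) - Phi(s, t).

    Because the added terms telescope over an episode (and Phi vanishes at
    the horizon), the optimal policy of the original MDP is preserved.
    """
    return (r
            + gamma * potential(s_next_waiting, t + 1, horizon)
            - potential(s_waiting, t, horizon))
```

With γ = 1, summing the shaping terms over a full episode yields Φ(s_T, T) − Φ(s_0, 0) = −Φ(s_0, 0), a constant independent of the agent's actions, which is why the learned policy's optimality is unaffected while per-step feedback becomes denser.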
Generated by bib2html.pl (written by Patrick Riley) on Thu Apr 02, 2026 14:56:28 UTC