Introduction

The vast scale of NYC can magnify even a slight improvement in the efficiency of transportation solutions, translating it into significant cumulative economic, environmental and societal impacts. The rapidly growing for-hire vehicle (FHV) service is one area that can realize such optimization by drastically improving the efficiency of car and taxi transportation, with the aim of cutting traffic, congestion and energy consumption (Santi et al. 2014). Companies like Uber, Lyft, Via and many others provide their services in most U.S. cities as well as around the world, with customers able to book an FHV or a shared FHV (ride-sharing with other customers) through mobile applications. The surge in ride-sharing trips in recent years has demonstrated that the FHV service is playing an increasingly important role in the city’s overall transportation: according to New York City Taxi & Limousine Commission (NYC TLC) open data, shared rides grew over 2.5 times from mid-2017 to the end of 2018, with over 25 million miles traveled monthly by the end of 2018 (Atkinson-Palombo et al. 2019). Such potential has been further unleashed with an in-depth understanding of the basic urban quantities/parameters (such as city size and driving speed) that affect the fraction of individual trips that can be shared (Tachet et al. 2017). Unfortunately, modal shifts resulting from the increased affordability of the FHV service can easily offset those positive impacts, contributing a substantial proportion of the overwhelming road traffic and emissions. Meanwhile, an evaluation of the impacts against different transport alternatives and for different population groups with distinct demographics is essential (Kodransky and Lewenstein 2014). Another issue resulting from the growing number of vehicles is the increased traffic congestion in the city.
Both citywide bus speeds and the average travel speed within Manhattan’s central business district (the area south of 60th Street) are the lowest they have been in decades. Buses average 7.58 miles per hour, down from 8 miles per hour in 1990, while the travel speed in Manhattan is now just over 7 miles per hour, down from 9 miles per hour in 1990 (NYC 2019). Meanwhile, close to 45% of New Yorkers get a delivery at home once per week, which affects not only the number of trucks on city streets but also how vehicles can get around. The city is putting congestion pricing into place as one of the measures that may combat these problems. Urban stakeholders and municipal managers need to make informed decisions when considering policies and adopting solutions, based on travel behavior simulations driven by such knowledge and run, ideally, in a social petri dish that minimizes the impact of irrelevant external factors.

A behavioral framework is required for the complete set of inter-related choices made by travelers and potential travelers in the travel market. Both aggregate and disaggregate approaches have been developed to estimate travel demand and modal split (Koppleman and Bhat 2006). Popular and widely used approaches include gravitational models (Anas 1983), Probit models (Alemi et al. 2019), Logit models (Wen and Koppelman 2001) and many others. The explanatory variables included in the models often involve demographic and socioeconomic characteristics, trip characteristics and mode attributes (Wen and Koppelman 2001; Scheiner and Holz-Rau 2007). In addition, traveler and trip-related data, including the actual mode choice of the traveler, are often required for the estimation and evaluation of a practical mode choice model, and should be obtained by surveying a sample of travelers from the population of interest. For decades, transportation researchers have largely used survey data from active solicitation (Chen et al. 2016), which are detailed but limited by a relatively small sample size (small data). The rapid rise and prevalence of mobile technologies have enabled the collection of a massive amount of passive data (big data), which differ greatly from the actively solicited data (small data) familiar to most transportation researchers and require different methods and techniques for processing and modeling (González et al. 2008; Liu et al. 2015; Yue et al. 2014; Hasan and Ukkusuri 2014). In recent years, data on human mobility and interactions in the city space have seen an increasing number of applications. Data sources leveraged as proxies for human mobility include anonymized cell phone connections (Girardin et al. 2008; Gonzalez et al. 2008; Amini et al. 2014; Kung et al. 2014; Grauwin et al. 2017), credit card transactions (Sobolevsky et al. 2014, 2016), GPS readings (Santi et al. 2014; Nyhan et al. 2016; Qian et al. 2019), geo-tagged social media (Hawelka et al. 2014; Paldino et al. 2015; Belyi et al. 2017) as well as various sensor data (Kontokosta and Johnson 2016).

A critical drawback is that the available data either do not include any user demographic information for individual trips or provide travel statistics with demographic information at the aggregate level only, in response to privacy and surveillance concerns (Douriez et al. 2016). A synergy of disclosed (small and big) travel data from different data providers and departments is therefore often required (Huang et al. 2018; Li et al. 2019; Beiró et al. 2016) to represent the resultant complete travel information (such as the number of trips, travel time, and monetary cost) at a certain aggregate level, and the result is consequently not as accurate and detailed as the incomplete data. Such a compromise imposes uncertainties on both the data reliability and the modeling process (Manzo et al. 2015; Trajcevski 2011; Rasouli and Timmermans 2012), suggesting that the point estimates of modeled modal choices represent only one of the possible outputs generated by the models; instead, anticipated modal choices are better expressed as a central estimate and an overall range of uncertainty margins articulated in terms of output values and the likelihood of occurrence (Boyce 1999).

The key focus of our work lies in developing a data-driven approach applicable to scenarios where ground-truth mobility information is insufficient and fragmentary. The lack of individual point-level data creates roadblocks in modeling a city-wide assessment of any transportation-related policy intervention. Furthermore, evaluating intervention-related changes becomes even more complex for small urban geographies such as zip codes and census tracts, which can be crucial for localized decisions by policymakers in a big city. Thus, a primary focus of our work is demonstrating how the transportation impact assessment can be performed in the fairly typical situation of having incomplete data on urban mobility. We put forward a probabilistic framework to explore modal choice behaviors in NYC using a data-driven method based on partial ground-truth data, with consideration of both data and model uncertainties. The proposed individual choice-based simulation model utilizes the synthesized data (from the NYU C2SMART center) along with NYC TLC ridership, making it possible to simulate the mode-choice probabilities across all the transportation modes in question. We demonstrate that the model can learn from multiple data sources that may have different scales and different information on transport modes. The applicability of the model can thus be extended to any urban area where mobility information is incomplete or fragmented. This is typical for many cities, where mobility across all transport modes is difficult to measure and may be collected by independent agencies.

By evaluating the synthesized transportation choices under scoping scenarios as well as the actual up-to-date taxi and FHV ridership, we train a mode-choice simulation model capable of simulating further mode-shift on the individual level under the intervention scenarios of interest: the introduction of ridesharing FHV in NYC and the Manhattan congestion surcharge. Once quantified, the mode-shift impacts can be translated into the economic, environmental and societal impacts of the considered scenarios, aiming to quantitatively inform stakeholders and policymakers of the implications of shared mobility and congestion pricing for the entire city as well as for specific populations and neighborhoods.

Data Overview

The Origin–Destination flows are retrieved from two sources: the C2SMART simulation test bed and NYC TLC open data. The C2SMART test bed represents synthesized travel flows across multiple transport modes, of which the flows for Taxi, Transit, Walking, and Driving are aggregated at the Taxi Zone level for our work. The NYC TLC data give ground-truth flows for Taxis, FHVs, and shared FHVs, which are originally aggregated at the Taxi Zone level. The trip distribution and spatial coverage across NYC for the four travel modes from the C2SMART simulation test bed are shown in Fig. 1. We supplement these data with travel cost and travel time (retrieved from API services) for each O–D pair in question. Furthermore, the income wage brackets for commuters are accessed from the American Community Survey (ACS) data and Longitudinal Employer-Household Dynamics (LEHD) (both U.S. Census Bureau programs). The LEHD also provides a population breakdown across the income brackets for each Taxi Zone.

Fig. 1 Mode-wise distribution of C2SMART Simulation Test Bed data and spatial distribution at taxi zone level in NYC

The detailed discussion of the data and comparison metrics is presented in Appendix A: “The Data.”

Methods

The objective of this study is to prototype a simulation modeling framework suitable for understanding mode-choice behavior and assessing the city-scale impacts of transportation innovations and policies on urban transportation systems, along with the associated environmental, economic and social implications. The framework is evaluated on two pilot use cases: the introduction of ride-sharing in New York City (offered through UberPOOL, Lyft Shared and other FHV companies) and the Manhattan congestion surcharge. The impacts in question include travel time and cost for passengers, traffic and congestion, and gas consumption/vehicular emissions. Particular focus is placed on the equity of impacts across populations, comparing how overall changes in travel time and mileage translate among different income groups.

Traditional counterfactual impact assessment is challenged by the facts that (1) a spatial counterfactual does not seem feasible (interventions are implemented city-wide and there is no comparable territory without deployment to be considered as a control area); (2) the utility of a temporal counterfactual (comparing the same urban system before and after the deployment) is limited by multiple major trends and transformations happening within a complex urban system simultaneously with the deployment in question; and (3) many target quantities of interest, such as overall urban traffic, gas consumption and emissions, are hardly measurable with the available data and are again affected by multiple urban transformations happening simultaneously. It is also important to mention that a body of studies has applied the Geographically Weighted Regression (GWR) model to transport mode choice analysis, but these studies generally compared the GWR model with the ordinary least squares (OLS) model to highlight the spatial variations in the relationship between transport accessibility, land uses, etc. (Andersson 2017; Paez and Currie 2010; Torun et al. 2020; Chow et al. 2006; Chiou et al. 2015). The factors considered in those related works are defined at each single spatial point, whereas in our research context we consider the utility of transport between a pair of origin–destination (O–D) points. While the data from the C2SMART test bed and TLC present an opportunity to model travel flows with respect to the variables of interest, the scales of these two datasets are quite different, with TLC representing a much larger number of trips (Appendix A: “Data Metrics”). We therefore do not have a ground truth of the mobility for all the travel modes we consider, and a GWR model would not be appropriate in this regard.

As an alternative, the present paper proposes a methodology based on a data-driven integrated transportation simulation modeling framework, assessing the mode choice between the six major transportation modes in question: walking, private and public transportation, taxi and for-hire vehicles, including ride-share modes. For an estimated transportation demand, an agent-based choice model is simulated, estimating the unknown parameters of the individual utility of the considered transportation modes as well as the agent characteristics (distribution parameters for individual preferences) through a multi-step Bayesian inference framework that sequentially gains information from the available partial observations of actual mobility choices. A Bayesian inference framework for mode-choice model inference was applied earlier in our work on assessing the impact of bike-sharing (Sobolevsky et al. 2018), although the model used there was the more traditional multinomial logit discussed below.

The simulated individual choices within the model enable a direct assessment of the mode-shift consistent with individual preferences, which can be further translated into the impacts of interest. Uncertainty about the data and parameter estimates is incorporated into the simulations and the resulting impact assessment.

Multinomial Logit

We first consider the broadly used Multinomial Logit Model as the baseline approach for estimating the mode choice for the regular commute. The model, as well as its nested version (which we can use in the case of related modes such as taxi and FHV), offers the advantage of estimating the mode-choice probabilities using closed-form formulas representing the aggregate-level choices of a simulation model. However, the parameters of the nested model lack a direct connection with the underlying simulation parameters (which are based on individual choices of commuters with respect to travel time/cost), which limits the utility of the model for individual-level mode-shift assessment. Nevertheless, it can still serve as a baseline to assess the efficiency of the proposed simulation model, so we include it in that capacity.

A Multinomial Logit (MNL) discrete choice model (Fig. 2) and its nested version, with a nest for taxi+FHV and a sub-nest for shared and non-shared FHV (a discussion of different nesting structures is in “Appendix B”), are trained based on the two available datasets: (1) the number of trips between each O–D pair by wage group and 4 transport modes (Taxi, Transit, Walk, Driving) from C2SMART; and (2) the number of trips between each O–D pair by 3 transport modes (Taxi, FHV, shared FHV) from TLC. The models depend on a set of parameters: \(\lambda\), which controls the impact of the mode utility differences on the mode choice probability, and \(\beta\), which adjusts the objective value of time (time multiplied by the individual wage rate) to an anticipated monetary cost, incorporating possible irrationality of individual decisions, before combining it with the direct monetary cost to assess the overall utility. The nested model further includes \(\tau _{\textrm{taxi}+\textrm{FHV}}\) and \(\tau _{\textrm{FHV}}\), controlling the choices between nests and within each nest (Koppleman and Bhat 2006).

Fig. 2 Multinomial Logit model framework

Mathematically, the utility score \(U_j\) for alternative j depends on the time taken \(T_j\) between the O–D pair in consideration, the monetary cost \(P_j\) for choosing the alternative, the hourly income W of the commuter, and a random component of error \(\epsilon _j\), yielding a base utility function

$$\begin{aligned} U_j = -(\beta WT_j + P_j) \end{aligned}$$
(1)

and the individual utility of \(U_j+\epsilon _j\), where \(\epsilon _j\) follows a Gumbel distribution. The probability for each of the four major transportation modes to be chosen as having the highest utility is defined as

$$\begin{aligned} P_{\textrm{mode}} = \frac{e^{\lambda U_{\textrm{mode}}}}{e^{\lambda U_{\textrm{taxi}}} + e^{\lambda U_{\textrm{transit}}} + e^{\lambda U_{\textrm{walk}}} + e^{\lambda U_{\textrm{drive}}}}. \end{aligned}$$
(2)

We further consider another version of the MNL with log-utilities (logMNL), corresponding to having a multiplicative random factor applied to the original utilities. Specifically, we adjust (1) as

$$\begin{aligned} U_j = -\ln (\beta WT_j + P_j) \end{aligned}$$

considering log-utilities and assuming individual log-utility to be \(U_j+\epsilon _j\) with a random term again following Gumbel distribution. This will correspond to choosing a mode with a minimal inverse negative utility \(e^{-U_j}=(\beta WT_j + P_j)e^{-\epsilon _j}\) rather than a minimal negative utility \(-U_j=\beta WT_j + P_j-\epsilon _j\) in the classical setup, i.e. having a multiplicative exp-Gumbel individual random factor instead of an additive Gumbel random term.
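The closed-form probabilities of Eq. (2) and of the log-utility variant can be illustrated with a minimal sketch. It assumes that the generalized cost \(\beta WT_j + P_j\) enters the exponent with a negative sign (as in the deterministic part \(V_j = -\lambda (\beta WT_j + P_j)\) used for the nested model); the function name `mnl_probabilities` and all numerical inputs are hypothetical and are not taken from the data.

```python
import numpy as np

def mnl_probabilities(time_h, cost, wage, beta, lam, log_utility=False):
    """Mode-choice probabilities for one O-D pair and one wage level.

    time_h, cost: one entry per mode (hours, dollars).
    wage: hourly wage W; beta, lam: the parameters beta and lambda.
    """
    generalized_cost = beta * wage * np.asarray(time_h) + np.asarray(cost)
    # Utility is the negative (log-)cost, so cheaper modes score higher.
    utility = -np.log(generalized_cost) if log_utility else -generalized_cost
    z = lam * utility
    z -= z.max()                 # stabilise the softmax numerically
    expz = np.exp(z)
    return expz / expz.sum()

# Illustrative (made-up) inputs for taxi, transit, walk, drive.
time_h = [0.4, 0.7, 1.5, 0.5]
cost = [18.0, 2.75, 0.0, 9.0]
p_mnl = mnl_probabilities(time_h, cost, wage=25.0, beta=1.0, lam=0.2)
p_log = mnl_probabilities(time_h, cost, wage=25.0, beta=1.0, lam=3.0,
                          log_utility=True)
```

In the log-utility case the probabilities are proportional to the generalized cost raised to the power \(-\lambda\), so the ratio of two mode probabilities depends only on the ratio of their costs.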

When considering FHV and shared FHV modes one needs to acknowledge the relation with the taxi mode and the corresponding correlations between individual preferences. This can be accounted for by introducing a nest of taxi and these modes, along with a sub-nest of FHV and shared FHV modes, to the model. For the nested model, the marginal probability of the outcome j is calculated based on the deterministic part \(V_j\) of the utility (i.e., \(V_j = -\lambda (\beta WT_j + P_j)\)) and the inclusive value \(IV_k\), which signifies how inclusive each nest \(N_k\) is based on its dissimilarity parameter \(\tau _k\) (i.e., \(IV_k = \ln \sum _{l \in N_k}e^{\frac{1}{\tau _k}V_l}\)), yielding the probability of choosing mode j from nest k

$$\begin{aligned} P_r(y=j) = \frac{e^{\frac{1}{\tau _k}V_j}}{e^{IV_k}}\cdot \frac{e^{\tau _kIV_k}}{\sum _m e^{\tau _m IV_m}}. \end{aligned}$$
(3)

The parameter \(\tau _k\) cancels itself out for the nests containing a single transport mode. Eventually, the dissimilarity parameters \(\tau _1, \tau _2\) for the taxi, non-shared FHV, shared FHV nests/sub-nests together with the utility parameters \(\lambda , \beta\) determine the shift between each alternative, while \(\tau _1, \tau _2\) largely control the balance within the taxi + FHV nest and FHV sub-nest. The baseline utility parameters \(\lambda , \beta\) were estimated on C2SMART data by minimizing the Weighted Root Mean Squared Error (WRMSE) between the number of trips predicted by the model and the real data for Taxi, Public Transit, Walk and Driving. The tested models measure the goodness of fit between the model prediction and the C2SMART simulation test bed data based on several metrics and search a wide range of parameters for the optimal fit in a reasonable time. The final nested model splits people’s regular mobility between origins and destinations across the city (from C2SMART or LEHD data) and predicts aggregated transportation mode choices. The model also provides the wage distribution for each transport mode, to be used when assessing the preferred transport mode choice for commuters from a given wage group. In the next evolution of the model, this will enable further direct simulation of their future choices under changing conditions according to the scenarios of interest.
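The inclusive-value construction can be illustrated with a minimal sketch of a textbook single-level nested logit, in which the probability of a mode is the product of the within-nest probability and the nest probability. This is a simplification: it does not reproduce the sub-nest of FHV and shared FHV, and the function name, nest labels and utility values below are hypothetical.

```python
import numpy as np

def nested_logit_probabilities(v, nests, tau):
    """v: dict mode -> deterministic utility V_j.
    nests: dict nest name -> list of member modes.
    tau: dict nest name -> dissimilarity parameter tau_k."""
    # Inclusive value of each nest: IV_k = ln sum_l exp(V_l / tau_k).
    inclusive = {k: np.log(sum(np.exp(v[m] / tau[k]) for m in members))
                 for k, members in nests.items()}
    denom = sum(np.exp(tau[k] * inclusive[k]) for k in nests)
    probs = {}
    for k, members in nests.items():
        p_nest = np.exp(tau[k] * inclusive[k]) / denom      # P(nest k)
        for m in members:
            p_within = np.exp(v[m] / tau[k]) / np.exp(inclusive[k])
            probs[m] = p_within * p_nest                    # P(mode m)
    return probs

# Illustrative (made-up) deterministic utilities V_j.
v = {"taxi": -2.8, "fhv": -2.5, "sfhv": -2.2,
     "transit": -2.0, "walk": -3.7, "drive": -2.1}
nests = {"taxi_fhv": ["taxi", "fhv", "sfhv"], "transit": ["transit"],
         "walk": ["walk"], "drive": ["drive"]}
tau = {"taxi_fhv": 0.6, "transit": 1.0, "walk": 1.0, "drive": 1.0}
p_nested = nested_logit_probabilities(v, nests, tau)
```

For a nest with a single mode the within-nest probability equals one and the nest term reduces to \(e^{V_j}\), which is why \(\tau _k\) has no effect there; with all \(\tau _k = 1\) the sketch collapses to the plain multinomial logit.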

Individual Choice-Based Simulation Model

This approach is based on agent-based simulations of individual choices. In fact, the multinomial logit model is also based on such simulations: it represents one particular scenario in which an additive random term following a Gumbel distribution represents individual preferences. This enables a closed-form representation of the resulting probabilities; however, not relying on it allows further flexibility in choosing the modeling framework. Besides, direct control of the original simulation parameters enables a direct individual-level assessment of the mode-shift consistent with individual preferences.

To avoid the reliance on a closed-form representation of the mode probabilities, we simulate mode choices for each individual origin–destination pair and a specific passenger of the given income category, and use a Neural Network architecture for parameter estimation, without having to use explicit analytic formulas.

Model parameters

We define the utility for each given O–D pair, passenger wage w and transportation mode based on travel time and cost estimates as well as a random factor representing individual preferences towards each mode. The utility can be interpreted as a perceived "cost of travel" to an individual, weighting travel time and monetary cost by suitable parameters that can be learned by the model. The utility in this model setting is thus defined as \(U = \beta *t*w + c\), where \(\beta\) is the rationality adjustment for the cost of time estimate as before, t is the travel time estimate, c is the travel fare/cost estimate and w is the wage of the commuter. We also introduce a random multiplicative factor \(e^{\epsilon }\) with \(\epsilon \sim N(0,\sigma ^2)\), representing individual preference towards the given mode. Thus the log-utility is defined as \(\ln U = \ln (\beta *t*w + c) + \epsilon\).

We assume the \(\epsilon\) terms to be generally independent across transport modes, except for taxi, FHV and shared FHV, which are of course related: if one has an increased preference towards taxi, it is likely that FHV will also be preferred, and even more so between FHV and shared FHV, which can be seen as even more closely related since, while offering a slightly different type of service, they are facilitated by the same provider/app. Thus two new parameters are introduced: corTFS, the correlation coefficient between the random factors \(\epsilon\) of taxi and FHV or shared FHV (SFHV) modes, and corFS, the correlation coefficient between the random factors of FHV and SFHV. (A detailed discussion of different model nesting variations and a performance comparison is given in “Appendix B”.) This way, the model parameters to be estimated are \(\beta\), \(\sigma\), corTFS, and corFS. The Neural Network (NN) model thus outputs mode-choice probabilities corresponding to each O–D pair, which are then used for likelihood estimation (Fig. 3).
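The correlated random factors and the resulting individual choices can be illustrated with a minimal Monte Carlo sketch. It assumes that each simulated agent picks the mode with the lowest perceived log-cost \(\ln U\), and it builds the covariance of \(\epsilon\) from \(\sigma\), corTFS and corFS as described above. The function names, the mode ordering and the travel times and costs are hypothetical; only the parameter values are the estimates reported in this paper.

```python
import numpy as np

MODES = ["transit", "walk", "drive", "taxi", "fhv", "sfhv"]

def error_covariance(sigma, cor_tfs, cor_fs):
    """Covariance of epsilon: independent modes except taxi, FHV and SFHV."""
    corr = np.eye(len(MODES))
    t, f, s = (MODES.index(m) for m in ("taxi", "fhv", "sfhv"))
    corr[t, f] = corr[f, t] = cor_tfs     # taxi vs FHV
    corr[t, s] = corr[s, t] = cor_tfs     # taxi vs shared FHV
    corr[f, s] = corr[s, f] = cor_fs      # FHV vs shared FHV
    return sigma ** 2 * corr

def simulate_choice_probabilities(time_h, cost, wage, beta, sigma,
                                  cor_tfs, cor_fs, n_agents=100_000, seed=0):
    rng = np.random.default_rng(seed)
    log_cost = np.log(beta * wage * np.asarray(time_h) + np.asarray(cost))
    eps = rng.multivariate_normal(
        np.zeros(len(MODES)), error_covariance(sigma, cor_tfs, cor_fs),
        size=n_agents)
    # Assumption: each agent chooses the mode with the lowest perceived cost.
    chosen = np.argmin(log_cost + eps, axis=1)
    return np.bincount(chosen, minlength=len(MODES)) / n_agents

# Illustrative (made-up) travel times in hours and costs in dollars.
time_h = [0.7, 1.5, 0.5, 0.4, 0.4, 0.5]
cost = [2.75, 0.0, 9.0, 18.0, 16.0, 11.0]
probs = simulate_choice_probabilities(time_h, cost, wage=25.0, beta=0.71,
                                      sigma=0.38, cor_tfs=0.31, cor_fs=0.58)
```

The requirement corFS > corTFS used later in the sampling keeps FHV and shared FHV more strongly tied to each other than either is to taxi.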

Fig. 3 Individual choice model pipeline

Model Training and Likelihood Estimation

The NN model is used in a two-phase Bayesian inference framework (Fig. 4) based on the data of individual simulated trips generated by the C2SMART simulation test bed as well as the taxi+FHV data available from TLC. The parameters \(\beta , \sigma\) are estimated in the first step with the C2SMART test bed, while the correlation parameters corTFS and corFS are estimated in the next step by training on TLC data while keeping \(\beta , \sigma\) fixed. The NN fits the mode-choice probabilities \(P_m\) between the six transportation modes m as a function of their log-utilities and the model parameters. Notice that \(\sigma\) can be treated as the scaling factor for the log-utilities to simplify the model. To fit the model, we simulate \(P_m\) for various values of \(U_m/\sigma\) sampled from a random (normal) distribution and for corTFS, corFS sampled uniformly (provided corFS > corTFS), using 50,000 random samplings, and use the results to train the neural network. The model architecture consists of three hidden layers with 8, 12 and 8 neurons respectively, with a rectified linear unit (“relu”) activation for the hidden layers and a sigmoid activation for the output layer, trained on the “binary cross-entropy” objective function.
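The layer structure described above can be illustrated with a minimal NumPy forward pass. This is only a sketch of the shapes and activations, with untrained random weights and no training loop. It assumes an input of eight features (six scaled log-utilities plus corTFS and corFS) and six output probabilities, which is our reading of the text rather than a documented specification, and all function names are hypothetical.

```python
import numpy as np

def init_params(layer_sizes, seed=0):
    """Random (untrained) weights and zero biases for a dense network."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.5, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(params, x):
    h = x
    for w, b in params[:-1]:
        h = np.maximum(0.0, h @ w + b)            # ReLU hidden layers
    w, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ w + b)))     # sigmoid output layer

def binary_cross_entropy(p_true, p_pred, eps=1e-7):
    p_pred = np.clip(p_pred, eps, 1.0 - eps)
    return -np.mean(p_true * np.log(p_pred)
                    + (1.0 - p_true) * np.log(1.0 - p_pred))

# Assumed sizes: 8 inputs -> hidden layers of 8, 12, 8 -> 6 outputs.
params = init_params([8, 8, 12, 8, 6])
x = np.random.default_rng(1).normal(size=(4, 8))  # 4 made-up samples
p_hat = forward(params, x)
```

Because each output passes through its own sigmoid, the six outputs are not constrained to sum to one; a sketch that needs proper probabilities could renormalise them, which is an additional assumption.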

Fig. 4 Individual choice-based simulation model framework - training model parameters

For mode choice probabilities \(P_m(o,d,w,\sigma ,\beta )\) for each set of origin(o), destination(d) and wages(w), the log-likelihood for four modes given the observed C2SMART ridership \(R_m(o,d,w)\) is calculated as

$$\begin{aligned} L(\sigma , \beta ) = \sum _{o,d,w}\sum _m R_m(o,d,w)\ln P_m(o,d,w,\sigma ,\beta ) \end{aligned}$$
(4)

For each \(\sigma , \beta\), the log-likelihood of the observed TLC ridership \(R_m(o,d,w)\) for \(m \in \{\textrm{taxi},\textrm{FHV},\textrm{SFHV}\}\), given the model probabilities \(P_m(o,d,w,\sigma ,\beta ,\textrm{corTFS},\textrm{corFS})\), is calculated as

$$\begin{aligned} L_{\textrm{FHV}}(\textrm{corTFS},\textrm{corFS}) = \sum _{o,d,w}\sum _m R_m(o,d,w)\ln \frac{P_m(o,d,w,\sigma ,\beta )}{P_{\textrm{TFHV}}(o,d,w,\sigma ,\beta )} \end{aligned}$$
(5)

where \(P_{\textrm{TFHV}} = \sum _{m \in \{\textrm{taxi},\textrm{FHV},\textrm{SFHV}\}}P_m\) and the dependence of the probabilities on corTFS and corFS is omitted for brevity.
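The two log-likelihoods in Eqs. (4) and (5) can be illustrated with a minimal sketch operating on arrays with one row per origin–destination–wage cell and one column per mode. The function names, the column ordering and all numbers are hypothetical; the small clipping constant is an added numerical safeguard, not part of the equations.

```python
import numpy as np

def log_likelihood(ridership, probs, eps=1e-12):
    """Eq. (4): sum over cells and modes of R_m * ln P_m."""
    return float(np.sum(ridership * np.log(np.clip(probs, eps, None))))

def fhv_log_likelihood(ridership_tlc, probs, tfhv_idx, eps=1e-12):
    """Eq. (5): probabilities renormalised within taxi + FHV + SFHV."""
    p = probs[:, tfhv_idx]
    p_cond = p / p.sum(axis=1, keepdims=True)   # P_m / P_TFHV
    return log_likelihood(ridership_tlc, p_cond, eps)

# Illustrative (made-up) numbers for two O-D-wage cells.
# Columns of p4: transit, walk, drive, taxi; p6 adds FHV and SFHV.
p4 = np.array([[0.60, 0.10, 0.20, 0.10],
               [0.40, 0.30, 0.20, 0.10]])
r4 = np.array([[55, 12, 22, 11],
               [38, 31, 19, 12]])
p6 = np.array([[0.55, 0.10, 0.20, 0.08, 0.05, 0.02],
               [0.38, 0.30, 0.18, 0.07, 0.05, 0.02]])
r_tlc = np.array([[40, 30, 10],
                  [35, 25, 12]])                # taxi, FHV, SFHV trips
ll_c2smart = log_likelihood(r4, p4)
ll_fhv = fhv_log_likelihood(r_tlc, p6, tfhv_idx=[3, 4, 5])
```

Dividing by \(P_{\textrm{TFHV}}\) means that the second likelihood only scores how trips split among taxi, FHV and SFHV, so the much larger scale of the TLC data does not have to match the scale of the test bed.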

Based on the above framework, we obtained the best parameter set of \(\beta = 0.71, \sigma = 0.38\), corTFS \(=\) 0.31 and corFS \(=\) 0.58 according to the likelihood values. The \(\beta , \sigma\) parameter values are sampled from log-normal prior distributions with \(\ln \beta \sim N(\ln \mu _{\textrm{beta}}, \sigma _{\textrm{beta}}^2)\), \(\ln \sigma \sim N(\ln \mu _{\textrm{sigma}}, \sigma _{\textrm{sigma}}^2)\). The prior for \(\beta\) assumes that time is mostly underestimated, by up to 3 times, with \(P(0.33<\beta <1)=68\%\) confidence, i.e. \(P(-\ln 3<\ln \beta <0)=68\%\), which is achieved when \(\ln \mu _{\textrm{beta}}=-(\ln 3)/2, \sigma _{\textrm{beta}}=(\ln 3)/2\), so that the interval spans one standard deviation on either side of the mean. Similarly, for \(\sigma\), the prior distribution assumes that the individual correction factor \(\epsilon\) lies within [1/2, 2] (correction up to twice) with \(68\%\) confidence. This can be achieved if we take \(\ln \mu _{\textrm{sigma}}=\ln (\ln 2)\) and \(\sigma _{\textrm{sigma}} = \left| \ln (\ln 2)\right|\): if one simulates multiple \(\ln \sigma \sim N(\ln (\ln 2), (\ln (\ln 2))^2)\), then for the resulting \(\epsilon\) the probability \(P(0.5<\epsilon <2)\) is again going to be 68%. The correlation parameters corTFS and corFS are sampled from the uniform distribution on [0,1], provided that corFS > corTFS. The sampling then simply takes the evenly distributed percentiles of each distribution with equal weights.
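The prior for \(\beta\) and the percentile-based sampling can be checked with a minimal sketch using only the Python standard library. The grid of percentile midpoints \((i+0.5)/n\) is one possible reading of "evenly distributed percentiles" and is an assumption, as are the function name and the grid size of 10.

```python
from math import exp, log
from statistics import NormalDist

def lognormal_percentile_grid(ln_mu, s, n):
    """n equally weighted samples at evenly spaced percentile midpoints
    of a log-normal variable X with ln X ~ N(ln_mu, s^2)."""
    dist = NormalDist(ln_mu, s)
    return [exp(dist.inv_cdf((i + 0.5) / n)) for i in range(n)]

# Prior on ln(beta): mean -(ln 3)/2, standard deviation (ln 3)/2.
beta_prior = NormalDist(-log(3) / 2, log(3) / 2)
# P(1/3 < beta < 1) = P(-ln 3 < ln beta < 0): one sd either side of the mean.
p_beta = beta_prior.cdf(0.0) - beta_prior.cdf(-log(3))

beta_grid = lognormal_percentile_grid(-log(3) / 2, log(3) / 2, n=10)
sigma_grid = lognormal_percentile_grid(log(log(2)), abs(log(log(2))), n=10)
```

Here `p_beta` evaluates to roughly 0.68, which is the one-standard-deviation mass of a normal distribution and matches the 68% stated for \(\beta\).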

Once the parameters are sampled and the model fit likelihoods are assessed, the mode choices can be simulated for a variety of sampled parameters, with the results weighted by the joint likelihood \(e^{L(\sigma ,\beta )+L_{\textrm{FHV}}(\textrm{corTFS},\textrm{corFS})}\) (as the prior sampling ensures even probability intervals). For an express assessment, one can simulate the results just for the max-likelihood parameters; however, comprehensive parameter sampling provides an assessment with respect to the model uncertainty.
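The likelihood weighting can be illustrated with a minimal sketch that turns joint log-likelihoods into normalised weights and uses them for a weighted mean and a weighted quantile of a simulated quantity. Subtracting the maximum log-likelihood before exponentiating is a standard numerical safeguard added here; the function names and all numbers are hypothetical.

```python
import numpy as np

def likelihood_weights(log_lik):
    """Normalised weights proportional to exp(joint log-likelihood)."""
    log_lik = np.asarray(log_lik, dtype=float)
    w = np.exp(log_lik - log_lik.max())   # subtract the max to avoid underflow
    return w / w.sum()

def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight reaches the fraction q."""
    order = np.argsort(values)
    v = np.asarray(values)[order]
    cdf = np.cumsum(np.asarray(weights)[order])
    idx = min(int(np.searchsorted(cdf, q * cdf[-1])), len(v) - 1)
    return v[idx]

# Made-up joint log-likelihoods L + L_FHV for three sampled parameter sets,
# and a made-up simulated mode-shift share for each of them.
joint_log_lik = [-1050.0, -1000.0, -1002.0]
mode_shift_share = [0.10, 0.12, 0.15]

weights = likelihood_weights(joint_log_lik)
mean_shift = float(np.dot(weights, mode_shift_share))
median_shift = weighted_quantile(mode_shift_share, weights, 0.5)
```

Weighted quantiles of this kind give the uncertainty range around the central estimate, while restricting the computation to the parameter set with the largest weight corresponds to the express assessment.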

Based on the estimated parameter likelihoods, we simulate the final mode choices between origins and destinations for each individual commuter or group of commuters of a given wage group under two different scenarios of interest: (A) intervention scenario (having shared FHV unavailable or after imposing Manhattan Congestion surcharge) and (B) the baseline scenario with all the transportation modes available with their original utilities. Individual correction factors \(\epsilon\) are maintained the same between scenarios (A) and (B). For each individual simulation and the set of model parameters, the mode-shift can be directly assessed and aggregated into percentage mode-shift over the entire city or origin, destination and/or wage group of interest. Being assessed for multiple sampled parameters, it also provides probability distributions with respect to parameter likelihood weighting. The percentage mode-shift can be further translated into the impacts of interest with respect to the differences in travel time, cost and mileage driven between the transport modes.
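The scenario comparison with shared individual factors can be illustrated with a minimal sketch in which the same draws of \(\epsilon\) are reused for the baseline and the intervention, so that every simulated agent is compared with itself. It assumes that agents choose the lowest perceived log-cost and, for brevity only, uses independent draws of \(\epsilon\) rather than the correlated ones of the model; the function name and all numbers are hypothetical.

```python
import numpy as np

def mode_shift(log_cost_base, log_cost_alt, eps, available_alt=None):
    """Share of agents by (baseline mode, intervention mode) pair.

    Rows index the baseline choice, columns the intervention choice;
    eps holds one row of individual factors per agent, shared by both runs.
    """
    n_modes = log_cost_base.shape[-1]
    base_choice = np.argmin(log_cost_base + eps, axis=1)
    perceived_alt = log_cost_alt + eps
    if available_alt is not None:
        # Unavailable modes get an infinite perceived cost.
        perceived_alt = np.where(available_alt, perceived_alt, np.inf)
    alt_choice = np.argmin(perceived_alt, axis=1)
    shift = np.zeros((n_modes, n_modes))
    np.add.at(shift, (base_choice, alt_choice), 1.0)
    return shift / len(eps)

rng = np.random.default_rng(0)
# Made-up generalized costs: transit, walk, drive, taxi, FHV, SFHV.
log_cost = np.log([20.25, 37.5, 21.5, 28.0, 26.0, 23.5])
eps = rng.normal(0.0, 0.38, size=(50_000, 6))   # independent, for brevity
no_sfhv = np.array([True, True, True, True, True, False])
shift = mode_shift(log_cost, log_cost, eps, available_alt=no_sfhv)
```

Because \(\epsilon\) is shared, removing shared FHV only moves the agents who chose it in the baseline: the last row of `shift` shows where those agents go, and all other off-diagonal entries are zero.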

Model Comparisons

We first evaluate the above simulation model against the classic MNL and logMNL (a version with multiplicative random factors for further consistency with the simulation framework above) according to their capability of fitting the reported choices of four major modes (walking, driving, public transit and taxi) during the pre-FHV era.

All of the discussed approaches estimate mode-choice probabilities \(P_t\) for each origin–destination–wage combination based on the defined utility involving the income of a commuter, travel time and costs. The MNL framework gives probabilities based on Eq. (2). For the individual choice simulation model, in contrast, we developed an approach to estimate choice probabilities and resulting likelihoods for each parameter set through an NN model. The simulations corresponding to each parameter set are weighted by the likelihoods, whose logarithms are estimated by Eqs. (4) and (5). Table 1 reports the likelihood-weighted averages of the mode choices provided by each model as well as the R-squared values based on the net 4-mode prediction values for the models discussed.

We observe that both multiplicative model specifications provide estimates much closer overall to the C2SMART test bed, according to the R-squared score, than the additive MNL specification, while the individual choice simulation model performs slightly better than logMNL. It also yields a much closer prediction of the taxi ridership, which is the most important for the considered use cases, as they primarily concern taxi and FHV trips. Specifically, MNL underestimates the ground-truth taxi ridership by over 1.5 times and \(\log\)MNL overestimates it by approximately 1.6 times, while the individual choice simulation model shows just a 9% deviation. It also gives much closer estimates for walking and driving, while underperforming on public transit. Furthermore, the choice simulation model provides a more adequate estimate for the travel time rationality parameter: the max-likelihood parameter of \(\beta =0.71\) corresponds to a quite realistic 29% undervaluing of time, while the optimal \(\beta\) for MNL and \(\log\)MNL is above 1, corresponding to time overestimation, which contradicts the common intuition that people generally value direct monetary benefits more than indirect benefits of the same estimated value. This further asserts that the main advantage of the simulation model lies not in being vastly better than MNL when evaluated on the whole data, but in being more interpretable, providing a better understanding of the underlying parameters and, apparently, a better fit as far as the taxi ridership is concerned.

Finally, as discussed, the simulation model framework provides better intuition and flexibility when simulating individual trips and evaluating alternative choices for the mode-shift part of the analysis. Based on this initial evaluation, we use the simulation model going forward.

Table 1 Comparison of aggregated number of modal trips of MNL versus Choice simulation model for 4 modes

Uncertainty Analysis

Accounting for uncertainties is critically important for the impact assessment to assess statistical significance of the reported city-wide quantities as well as their difference per wage group or areas across the city. We address uncertainties from two sources: uncertainty in the data and uncertainty in the model.

Uncertainty in the data is accounted for by incorporating the random distributions of travel times and fares into the model and running the simulations multiple times. The variation in the trips from the data-based uncertainty simulations was observed to be too low to have any significant impact on the mode shift and the resulting impacts of interest (Appendix B: “Uncertainty Analysis”).

Model-based uncertainty is analyzed using the approach described above, weighting the results from different model simulations by the model fit likelihood. The uncertainty in the mode-choice assessment obtained this way turns out to be much more significant (Appendix B: “Uncertainty Analysis”); hence, going forward, we primarily focus on this type of uncertainty in the mode-shift and related impact assessment.

Impact Assessment

To evaluate the applicability of the proposed framework to assess the impacts of transportation interventions and policies, the paper considers two use cases: the introduction of shared FHV in NYC after 2014 and the imposition of the Manhattan congestion surcharge in early 2019.

The data for the six major transportation modes in question (transit, walking, driving, taxi, FHV and shared FHV) is drawn from three major sources: (1) the C2SMART simulation test bed, (2) the NYC Taxi and Limousine Commission (TLC) and (3) data retrieved from public web APIs (Appendix A: “The Data”). The C2SMART simulation test bed (He et al. 2020) includes approximately 27.3 million trips across four travel modes (taxi, transit, walking and driving) and 16 income groups, following the travel agendas from the historic Regional Household Travel Survey with a synthetic population. The data provides a representative estimate of city-wide travel choices during the pre-FHV era for people from different income groups across NYC. For the estimation of taxi and FHV choices, we use the up-to-date open data from TLC, which is further used to estimate travel times and costs for taxis and driving. The travel costs and times for the other travel modes are retrieved from publicly available API services (Google Maps and HereMaps). Because accounting for uncertainty is one of the key goals of our analysis, we retrieved this information multiple times for each origin–destination pair to capture the variation in costs and times. More details regarding each data set are given in Appendix A: “The Data”.

Impact of Ridesharing in NYC

Now that shared FHV have become an integral part of NYC transportation, understanding their actual impact is hindered by the lack of an appropriate control area where shared FHV were not available. Historic pre-2014 mobility cannot serve as an adequate baseline, as the rapidly evolving transportation system was likely affected by multiple trends, not only the spread of shared FHV. For example, the increased adoption of FHV service as such (not necessarily shared) could have had a larger impact.

However, the proposed mode-choice model allows us to simulate a hypothetical scenario with the same transportation demand in which shared FHV were not available. As described before, we first train the model on the historic mobility represented by the C2SMART simulation test bed and then further estimate the FHV-related parameters based on the actual taxi, FHV and shared FHV ridership reported by TLC. It is important to note that the model is used to simulate the relative distribution of the ridership per mode for each origin–destination pair and passenger wage group, while to estimate the actual scale of the impact we rely on the actual number of shared FHV trips reported by TLC (these are the trips that would not have happened without ridesharing, and for which the alternative modes that would have been used are to be determined). This way, the dependence of the model on the historic simulation test bed data is limited to estimating the likelihood of the parameters.

We analyzed the mode shift simulated by the model, i.e. the modes that would have served the shared FHV trips if each of them were carried by its second-choice mode, under different parameters weighted by the model fit likelihood, to determine the anticipated effect of shared FHV on the NYC transportation system. The mode shift (i.e. the percentage of the observed shared FHV trips that would have been served by public transportation, walking, taxi, FHV and private vehicles) is reported in Fig. 5. The model-based uncertainties seem relatively small, highlighting the robustness of the pattern.
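The second-choice logic behind this mode shift can be sketched in a few lines of Python. This is an illustrative simplification rather than the paper's simulation code: the function name, the utility matrix layout (one row per simulated trip, one column per mode, higher meaning more attractive) and the example values are assumptions.

```python
import numpy as np

MODES = ["transit", "walking", "driving", "taxi", "FHV", "shared FHV"]

def second_choice_shares(utilities, modes=MODES, removed="shared FHV"):
    """Percent of trips choosing `removed` that fall back to each other mode.

    `utilities` is a hypothetical (n_trips, n_modes) array of simulated
    per-trip mode utilities; higher means more attractive.
    """
    u = np.asarray(utilities, dtype=float)
    r = modes.index(removed)
    # Keep only the trips whose top choice is the removed mode.
    chose_removed = u.argmax(axis=1) == r
    alt = u[chose_removed].copy()
    # Make the removed mode unavailable and take the best remaining one.
    alt[:, r] = -np.inf
    fallback = alt.argmax(axis=1)
    counts = np.bincount(fallback, minlength=len(modes))
    total = counts.sum()
    return {
        m: (100.0 * counts[i] / total if total else 0.0)
        for i, m in enumerate(modes)
        if m != removed
    }

# Hypothetical usage with three simulated trips (made-up utilities).
example = [
    [0.2, 0.1, 0.3, 0.4, 0.8, 0.9],  # shared FHV first, FHV second
    [0.7, 0.1, 0.3, 0.2, 0.4, 0.9],  # shared FHV first, transit second
    [0.2, 0.1, 0.3, 0.9, 0.4, 0.5],  # taxi first, so not counted
]
print(second_choice_shares(example))
```

In a likelihood-weighted analysis, one such set of shares would be computed per simulated parameter set and the sets then combined using the model fit likelihood as weights.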

Fig. 5 Percent of shared FHV trips accommodated by each alternative mode if shared FHV were not available

As one would expect, the majority of the shared FHV trips would have been served by FHV and taxi as the closest alternatives. Together with driving, this adds up to nearly 70%. However, around 30% of the trips have actually replaced transit and walking. So while the majority of the shared FHV rides potentially (in case ridesharing actually occurred) cut the traffic by combining trips that would otherwise involve individual driving, around 30% of those trips replace non-driving mobility, thereby increasing the traffic.

On the aggregate citywide scale, we observe a net travel time decrease of 1.77% (95% confidence interval: 1.71%–1.83%) and a net mileage increase of 1.14% (95% confidence interval: 1.06%–1.22%), even if we assume that each shared FHV trip has actually combined two trips. Unfortunately, we do not have ground-truth data on that, so this likely represents an optimistic scenario in terms of the traffic impact: some shared FHV might still serve individual passengers, while sharing more than two trips at once seems to be rather rare. Scaled to the yearly citywide ridership of 2019, this corresponds to more than 495,000 hours saved for NYC commuters at the price of 940,000 extra miles driven citywide over the year. So, on average, every hour saved comes at the price of a traffic increase of 1.9 miles. The net decrease in travel times comes mainly from the reduction of about 14 M transit trips. The extra miles driven translate to close to 47,000 extra gallons of fuel, emitting around 420 tons of carbon dioxide, assuming 9 kg of CO2 emitted per gallon of gas (data from the U.S. Environmental Protection Agency (United 2022)). In terms of economic impacts, the mode shift accounts for a citywide time–cost reduction of $4.72 M.
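The arithmetic linking these figures can be checked with a short Python sketch. Only the rounded totals and the 9 kg per gallon factor come from the text; the variable names are illustrative, the fuel economy is derived from the reported miles and gallons rather than stated in the paper, and "tons" is read here as metric tonnes, which is an assumption.

```python
# Rounded figures reported in the text.
HOURS_SAVED = 495_000          # yearly hours saved by commuters
EXTRA_MILES = 940_000          # yearly extra miles driven
EXTRA_GALLONS = 47_000         # yearly extra fuel consumed
KG_CO2_PER_GALLON = 9.0        # emission factor quoted in the text

# Traffic cost of each hour saved, in extra miles driven.
miles_per_hour_saved = EXTRA_MILES / HOURS_SAVED

# Fuel economy implied by the reported miles and gallons (not stated in the paper).
implied_miles_per_gallon = EXTRA_MILES / EXTRA_GALLONS

# CO2 emitted, converting kilograms to metric tonnes (assumed unit).
co2_tonnes = EXTRA_GALLONS * KG_CO2_PER_GALLON / 1000.0

print(f"{miles_per_hour_saved:.2f} extra miles per hour saved")
print(f"{implied_miles_per_gallon:.1f} miles per gallon implied")
print(f"{co2_tonnes:.0f} tonnes of CO2")
```

The results round to the 1.9 miles per hour saved and roughly 420 tons of CO2 quoted above, with an implied average fuel economy of 20 miles per gallon.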

While shared FHV cause an overall travel time decrease and traffic increase across the city, those impacts are highly uneven spatially. At the level of individual taxi zones, the largest travel time decrease, of up to 8%, occurred in the inner areas of Brooklyn, Queens and Staten Island, which seem to benefit the most (Fig. 6), as the new, relatively affordable commute option has likely bridged local gaps in transportation accessibility. Some areas, such as the airports, saw the opposite effect, with up to an 8% increase in travel time. This can be related to the use of shared FHV as a replacement for the more expensive taxi and FHV services heavily used in such locations, which generally involve lengthy and expensive commutes for which people may trade travel time for significant cost savings.

Fig. 6 Percent change in travel times and mileage across taxi zones after the introduction of shared FHV

By providing individual simulations with respect to commuter wealth, the model allows us to analyze the equitability of the impacts across urban populations. We observe the most significant percent changes in mileage for the low-income groups (Fig. 7), while the highest changes in travel times are observed for the higher-income groups (>$100k annual income). For the high-income groups, the majority of shared FHV trips come from transit and driving, so there is an increase in mileage from the switch from transit to shared FHV and, at the same time, a decrease from the switch from driving to shared FHV. In the case of the low-income groups (<$60k annual income), the mileage increase comes from shared FHV trips replacing walking and transit. In short, the shared FHV service appears to be the most efficient for the wealthier in terms of the trade-off between improved travel time and the traffic footprint, while when used by low-income passengers it causes a much heavier traffic footprint with a smaller travel time improvement. Additionally, we observe that the mode-shift differences across income groups are significant with respect to the model-based uncertainties.

Fig. 7 Percent changes in travel times and mileage across income groups after the introduction of shared FHV

Manhattan Congestion Pricing Impact

Another use case is the impact of a new pricing policy, the Manhattan congestion surcharge, which adds a fixed cost to taxi ($2.50), FHV ($2.75) and shared FHV ($0.75 per passenger) fares for all trips originating in Manhattan. For shared FHV, we assumed an average of 2 passengers per ride, so the total cost added was $1.50.
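As a minimal illustration of how this policy enters the trip costs, the Python sketch below applies the surcharge to a single trip. The amounts and the assumption of 2 passengers per shared ride come from the text; the function name, the mode labels and the borough string are hypothetical.

```python
# Surcharge amounts from the text; shared FHV is charged per passenger.
SURCHARGE = {"taxi": 2.50, "FHV": 2.75, "shared FHV": 0.75}
ASSUMED_SHARED_PASSENGERS = 2  # average occupancy assumed in the text

def congestion_surcharge(mode, origin_borough,
                         passengers=ASSUMED_SHARED_PASSENGERS):
    """Fixed cost added to a trip under the Manhattan congestion surcharge."""
    # Only trips originating in Manhattan on surcharged modes are affected.
    if origin_borough != "Manhattan" or mode not in SURCHARGE:
        return 0.0
    if mode == "shared FHV":
        return SURCHARGE[mode] * passengers
    return SURCHARGE[mode]

# Hypothetical usage: add the surcharge to a made-up base fare.
base_fare = 14.00
print(base_fare + congestion_surcharge("FHV", "Manhattan"))
print(base_fare + congestion_surcharge("FHV", "Brooklyn"))
```

Since the surcharge is a fixed amount, it changes only the cost term of the affected modes, leaving travel times and the costs of transit, walking and driving unchanged in the simulated choice.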

According to the model simulation, on a city-wide scale, we observe an increase of 1.09% in travel times and a 0.87% decrease in mileage, which can be attributed to lower usage of taxis and FHV and the mode-shift to alternative non-driving modes.

Looking at the number of reduced trips across modes, we observe an almost equal drop for taxis and FHVs, although the highest reduction is seen for shared FHV. Almost 60% of the reduced trips are accommodated by transit, which translates into a $16 M projected increase in revenue for the MTA. Driving and walking accommodate 28% and 12% of the reduced trips, respectively (Fig. 8). Scaled to the total 2019 taxi+FHV ridership, the decrease in the number of taxi and FHV trips accounts for around 681,000 fewer miles driven, which comes at a net travel time increase of 329,000 hours. The decrease in driving mileage causes a revenue loss of $19 M for taxi and $11 M for FHV (shared and non-shared), due to the drop in the number of taxi+FHV trips originating in Manhattan, although the net revenue increase for taxi+FHV services is $119 M from the increased prices per trip. This further translates into a citywide economic impact of a $2.7 M increase in time–cost value after the Manhattan congestion surcharge is added.

Fig. 8 Change in number of trips across transport modes after Manhattan congestion surcharge (on C2SMART test bed scale)

From an equitability perspective, we observe the most dramatic percent changes in travel times and mileage for the high-income groups (Fig. 9), meaning that the commute choices of the richest are affected the most. Compared to the low-income population, we see an increase of about 1 percentage point in travel times for the high-income groups. The same is observed for the total mileage driven, where the decrease is about 0.8% smaller for the low-income populations than for the high-income groups. This makes sense, as taxi and FHV ridership is seen mostly among the high-income population. The highest mileage cut comes from the top mode choice switching from FHV and taxi to transit. This change is most pronounced for the $100k–$150k income group, whereas for the >$150k income groups the mileage cut is smaller, as the top mode choice after the congestion surcharge is private car instead of transit. The mileage cut is significantly smaller for the lower-income groups, as their number of taxi and FHV trips is low to begin with. With the congestion surcharge, their top mode choice becomes transit or walking, but the net change in the number of trips is low compared to the higher-income groups. The highest change among the low-income groups is observed for the $60k–$100k group, where the top mode choice switches from FHV/shared FHV to transit after the congestion surcharge is introduced. The same trend is seen for travel times, where the rich see the highest time increase owing to their switch from taxis/FHVs to transit.

In terms of spatial impact, the biggest impact is seen in the high-income neighborhoods of Manhattan, specifically the Lower East Side, Upper East Side and Upper West Side. Compared to upper Manhattan neighborhoods like East Harlem and Washington Heights, the impacts on both travel times and mileage are relatively higher in Midtown and Lower Manhattan.

Fig. 9 Percent changes in travel times and mileage across income groups after Manhattan congestion surcharge

The relatively low changes in total travel times and mileage for the whole city can be explained by the low overall proportion of taxi trips in the data. Together, taxis and FHVs make up around 7% of the total mobility in the C2SMART simulations. Thus, any change in taxi and FHV fares for one borough (Manhattan) translates into a small change in net travel times and mileage.

So, in general, the policy seems to be effective in causing a statistically significant decrease in the overall traffic, while the vulnerable populations seem to be the least affected overall.

Consensus Across C2SMART Test Bed and LEHD Mobility

Under the intervention scenario of the introduction of shared FHV, we can expect an increase in citywide net travel mileage and a decrease in travel times. However, as the C2SMART simulation test bed data might not be a perfect representation of true mobility within the city, it is important to test our model across different mobility data sets to see how much the results might differ and whether the model gives a reasonable estimate across different representations of mobility. We therefore also tested it on another data set for NYC, the LEHD mobility data. The LEHD data contains mobility information for 47,000 O–D pairs, compared to 21,000 pairs from C2SMART. Running the simulation model with the best-likelihood parameters on the LEHD pairs, we observed a net citywide travel time decrease of 1.91% and a net mileage increase of 1.29% in the scenario where shared FHV was available as a transport mode.

So the resulting impacts from the simulation model depend mildly on the data source of mobility demand (and, compared to the other quantifiable sources of assessment uncertainty, the data source remains the most significant one), but they generally remain consistent with and close to the range of percent changes originally obtained from the C2SMART simulation test bed data.

Conclusions

This work constructed a simulation modeling and probabilistic inference framework suitable for assessing the city-scale impacts of transportation innovations and policies on the transportation system, along with the associated environmental and economic implications, while accounting for the uncertainty of such impacts. The key aspect of this work lies in the framework’s ability to learn from diverse and possibly inconsistent data sets (such as historic transportation surveys and actual taxi and FHV ridership) that provide partial information on urban mobility, gaining information stepwise from each source. This provides an important way to model and measure impacts in the event of incomplete mobility data, which is the case in many urban locations. The framework’s applicability is illustrated in two use cases: the introduction of shared FHV in NYC and the Manhattan congestion surcharge.

Broadly, our results indicate that shared mobility helped to decrease travel times by between 1% and 2% for all categories of passengers. However, it does so while increasing the traffic by 0.5–1.5%: the decreases from trip sharing seem to be offset by a growing number of riders due to the increased affordability of the service. It works more efficiently for high-income categories of passengers, providing a higher travel time decrease with a lower mileage increase. On the other hand, the Manhattan congestion surcharge noticeably decreases the FHV traffic, by up to 1%. However, it does so at the price of increased travel time, in particular for high-income travelers, who are perhaps the most frequent users of the taxis and FHVs targeted by the surcharge. The uncertainty analysis confirms the statistical significance of the impacts as well as their heterogeneity across populations. The impacts above are further translated into implications for total traffic, gas consumption, emissions, monetary savings and public transit earnings.

While we hope that this study can be a proof of concept for other cities considering shared mobility, congestion pricing or other similar interventions, it should be noted that New York City’s transportation system is unique in many ways, which makes a switch to public transportation more practical than in many other cities. In addition, while the impact assessments in the paper provide proof-of-concept use cases for the proposed framework, further work may be needed to develop a comprehensive and accurate picture of the mode choices and mode shift. The outdated survey-based ground truth refined by the C2SMART simulation test bed might not, by itself, be fully representative of actual urban mobility. The current landscape of urban mobility might differ significantly from the RHTS and similar available transportation surveys conducted in the pre-FHV era. And although the historic data is only used as part of the parameter estimation for the model, while the scale of the impacts is based on the up-to-date TLC data, the mode-choice proportions and the reliability of the impact assessment might still be affected.

Another limitation of the study is the simplicity of the utility function as presently considered. Accounting only for travel time and cost, it may not reflect all the critical factors in how a person makes a transportation choice. The present utility function would work very well in an ideal world where everyone worked out the economics of their commute daily, but in reality transportation choices are influenced by habits, comfort preferences and other human factors, as well as by environmental conditions. We focused our study on commuters because it allowed us to infer demographic and transportation demand information, but the morning commute makes up only a small part of New York City’s complex transportation system. The collection of more comprehensive ground-truth data and accounting for more aspects of individual choices could further improve the reliability of the impact assessment. Finally, the validity of the mode-choice and impact assessments is conditional on the validity of the model: although the uncertainty assessment related to inaccuracies in the data and in the model fit allows us to assess the degree of confidence in such assessments, the specific model behind the mode choices needs to be assumed.

With that, the main contribution of the paper is a proof-of-concept demonstration that a robust data-driven probabilistic modeling framework incorporating incomplete and inconsistent available mobility data is capable of assessing the holistic picture of the urban commute and the impact of transportation interventions with a reasonable degree of certainty.