Past Projects

Traffic flow implications of autonomous and partially autonomous vehicles

While it may be a long time before all the vehicles on our roadways are completely automated (if ever), it is likely that in the near future there will be an increasing number of autonomous (or partially autonomous) vehicles. These vehicles will likely drive somewhat differently than human drivers, and will thus influence the traffic dynamics. Just as traffic state estimation shifted from Eulerian to Lagrangian sensing when GPS-enabled smartphones entered the mainstream, a shift from Eulerian control (control at fixed locations in the infrastructure, e.g., ramp metering) to Lagrangian control within the traffic flow may be possible, even if only a small number of autonomous vehicles in the stream can be actuated to influence the overall flow.

The goal of this project is to identify the extent to which a small number of autonomous vehicles in the traffic flow (e.g., 5% of vehicles) are able to alter the traffic dynamics and mitigate adverse emergent phenomena such as traffic oscillations or phantom traffic jams. The work includes both theoretical contributions on traffic stability and extensive experimental work demonstrating the ability of a single autonomous vehicle in a flow of 20 human-piloted vehicles to completely eliminate traffic instabilities, reducing overall fuel consumption by nearly 40%.

Funding:  National Science Foundation

Students: Raphael Stern

Publications and Products:

  1. R. Stern, S. Cui, M. L. Delle Monache, R. Bhadani, M. Bunting, M. Churchill, N. Hamilton, R. Haulcy, H. Pohlmann, F. Wu, B. Piccoli, B. Seibold, J. Sprinkle, D. Work. "Dissipation of stop-and-go waves via control of autonomous vehicles: Field experiments." Transportation Research Part C: Emerging Technologies, 2018. Download: manuscript.
  2. S. Cui, B. Seibold, R. Stern, and D. Work. "Stabilizing Traffic Flow via a Single Autonomous Vehicle: Possibilities and Limitations." in Proceedings of the IEEE Intelligent Vehicle Symposium, Redondo Beach, CA, June 2017. Download: manuscript.
  3. F. Wu, R. Stern, M. Churchill, M. L. Delle Monache, Han, Piccoli, and D. Work. "Measuring fuel consumption in oscillatory traffic: experimental results" in Proceedings of the Transportation Research Board Annual Meeting, Washington, DC, January 2017. Download: manuscript.

Detecting extreme traffic events via a context augmented graph autoencoder

Accurate and timely detection of large events on urban transportation networks enables informed mobility management. This work tackles the problem of extreme event detection on large scale transportation networks using origin-destination mobility data, which is now widely available. Such data is highly structured in time and space, but high dimensional and sparse. Current multivariate time series anomaly detection methods cannot fully address these challenges. To exploit the structure of mobility data, we formulate the event detection problem in a novel way, as detecting anomalies in a set of time dependent directed weighted graphs. We further propose a Context augmented Graph Autoencoder (Con-GAE) model to solve the problem, which leverages graph embedding and context embedding techniques to capture the spatial and temporal patterns. Con-GAE adopts an autoencoder framework and detects anomalies via semi-supervised learning.

The performance of the method is assessed on several city-scale travel time datasets from Uber Movement, New York taxi and Chicago taxi, and compared to state of the art approaches. The proposed Con-GAE can achieve a 0.1-0.4 improvement in the area under the curve (AUC) score compared to the baselines. We also discuss real-world traffic anomalies detected by Con-GAE.

Funding:  National Science Foundation

Students: Yue Hu

Publications and Products:

  1. Y. Hu, A. Qu, D. Work. "Detecting extreme traffic events via a context augmented graph autoencoder" in submission. Download: Manuscript. Code: github.

Quantifying Social Distancing Compliance Using Computer Vision

Social distancing has become a pressing and challenging issue during the COVID-19 pandemic. In a smart cities context, it is possible to measure inter-personal distance using networked cameras and computer vision analysis. We deploy a computer vision pipeline based on RetinaNet that identifies pedestrians in continuously streaming video frames, then converts their positions to GPS coordinates for distance calculation and further analysis. This processing is applied to nine camera streams at three locations across Vanderbilt University's campus. We collected 70 hours of baseline distancing data over the course of two weeks, after which we deployed small behavioral interventions at the three locations aimed at increasing distancing compliance. Another 70 hours of data with the interventions in place will be analyzed against the baseline data to determine whether the interventions increase social distancing compliance.
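As a rough illustration of the distance-calculation step, the sketch below computes great-circle distances between pedestrian positions already expressed as latitude/longitude pairs and lists the pairs that fall under a threshold. The function names and the 1.83 m (about 6 ft) default threshold are assumptions made for this example; the project description does not state which distance formula or threshold was used.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def close_pairs(positions, threshold_m=1.83):
    """Return index pairs of pedestrians closer than threshold_m.

    positions is a list of (lat, lon) tuples for one video frame.
    The default threshold is a hypothetical choice for illustration.
    """
    pairs = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if haversine_m(*positions[i], *positions[j]) < threshold_m:
                pairs.append((i, j))
    return pairs
```

For a single frame, `close_pairs([(36.1447, -86.8027), (36.144705, -86.8027), (36.1450, -86.8027)])` flags only the first two pedestrians, who are roughly half a metre apart.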

Students: Nicole Gloudemans, Derek Gloudemans, Will Barbour

Large-scale traffic estimation with performance guarantees

Despite important advances in computation and sensing, real-time traffic estimation problems are still open to a number of critical issues, including: (i) the entire state of the transportation network is too large for the estimators to scale in real time; (ii) few results are available that provide a theoretical analysis of the performance of traffic estimation algorithms; and (iii) the non-observability of the traffic model is inevitable due to the existence of shocks and the sparsity of sensor measurements.

Our work aims at designing scalable distributed traffic estimation algorithms to address issues (i) and (ii), with specific care for issue (iii). Large-scale networks can be partitioned into overlapping local sections, with the traffic density on each local section described by a traffic model on the local state, and estimated by a cheap commodity computer (i.e., an agent). The main algorithmic challenge is to obtain theoretical bounds on the performance of the estimator, even as the system state switches between observable and unobservable modes. We are also interested in the development of estimation algorithms with performance guarantees when not all sensor data can be transmitted due to energy and communication costs.

Funding:  National Science Foundation

Students: Ye Sun

Publications and Products:

  1. Y. Sun and D. Work. "Online estimation with synthetic measurements under an event-triggered sensor scheduler." submitted to the IEEE Transactions on Control Systems Technology, 2016. Download: preprint.
  2. Y. Sun and D. Work. "Kalman filtering with synthetic measurements under an event-triggered sensor scheduler." to appear in the European Control Conference, June 2016.
  3. Y. Sun and D. Work. "Scaling the Kalman filter for large-scale traffic estimation." submitted to the IEEE Transactions on Control of Network Systems, February 2015. Download: extended version, source code.
  4. Y. Sun and D. Work. "A distributed local Kalman consensus filter for traffic estimation." in Proceedings of the IEEE Conference on Decision and Control, pp. 6484–6491, 2014.

Quantifying traffic due to potential COVID-19 commute mode shifts

This work is done in cooperation with teams from Cornell and UT Austin to quantify the sensitivity of commute travel times in about a hundred US metro areas to potential post-COVID-19 shifts in commute patterns from transit and carpooling to single-occupancy vehicles (SOV). A Bayesian regression model is applied to US census data to relate commute travel time to the number of passenger vehicles. The findings have been covered by more than 30 media outlets, including Reuters, Bloomberg CityLab, CBS News (also here), and the New York Post.

The findings of The Rebound study are coalesced into the Rebound Calculator. The tool estimates one-way commute travel times using models built around recent commuting data for most U.S. metro areas. As the number of vehicles on the roads increases, so too does the travel time, according to a fundamental traffic relationship called the BPR (Bureau of Public Roads) model. Travel times, therefore, increase if existing commuters switch from transit or carpool to single-occupancy vehicles. Travel times will decrease if fewer vehicles are on the road due to unemployment or remote work. How mode shift could affect travel times is particularly important in the era of COVID-19, as it could impose high costs on commuters due to increased time spent in traffic.
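To make the BPR relationship concrete, here is a minimal sketch of the standard BPR volume-delay function, t = t0 * (1 + alpha * (v / c)^beta). The coefficients alpha = 0.15 and beta = 4 are the commonly cited textbook defaults and are an assumption here; the Rebound Calculator fits its own models to commuting data, and its parameters are not given in this description.

```python
def bpr_travel_time(free_flow_time, volume, capacity, alpha=0.15, beta=4.0):
    """Travel time under the BPR volume-delay function.

    free_flow_time: travel time with no congestion (any time unit)
    volume:         number of vehicles using the road
    capacity:       nominal road capacity, in the same units as volume
    alpha, beta:    shape coefficients (textbook defaults, assumed here)
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)
```

With these defaults, a 20-minute free-flow commute takes about 23 minutes when volume equals capacity and about 26.2 minutes when volume exceeds capacity by 20%, which illustrates how sharply travel time grows once commuters shift to single-occupancy vehicles.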

Students: Yue Hu, Will Barbour

Publications and Products:

  1. Yue Hu, Will Barbour, Kun Qian, Christian Claudel, Samitha Samaranayake, and Dan Work, "Quantifying traffic due to potential Covid-19 commute mode shifts" in submission, 2020. Blog: The rebound. Download: preprint.

Automatic Data Cleaning for Urban Sensor Networks

Low-cost urban sensing networks enhance our understanding of cities and urban life. The impacts of mitigation strategies in communities can be measured at a fine-grained scale, informing context-aware policies and infrastructure design. However, fine-grained city-scale data analysis is complicated by common, tedious data cleaning tasks such as removing outliers and imputing missing data. To address the challenge of data cleaning, this project applies a robust low-rank tensor factorization method to automatically correct anomalies and impute missing entries for high-dimensional urban environmental datasets.

The method is applied to the Array of Things (AoT) city-scale sensor network. Located in the City of Chicago, IL, AoT collects real time data on the city's environment and activity with more than 90 nodes. Further analysis of AoT data and its broader usages are also under way.

Students: Yue Hu, Yanbing Wang

Publications and Products:

  1. Y. Hu, Y. Wang, C. Jiao, R. Sankaran, C. E. Catlett, D. Work. "Automatic data cleaning via tensor factorization for large urban environmental sensor networks." Tackling Climate Change with Machine Learning Workshop at NeurIPS, 2019. Code: github.

Improving intelligent transportation systems (ITS) in safety-critical environments

The enormous and rapidly increasing amount of traffic data opens tremendous opportunities for the development of intelligent transportation systems. However, real time traffic estimation and optimal traffic management in safety critical environments are still open problems. One example occurs in rural work zones, where existing sensor coverage is sparse.

Our work aims at improving ITS in safety-critical situations via: i) the development of a low-cost and low-power wireless sensor network for traffic monitoring in work zones; and ii) the development of mathematical traffic models on networks that are amenable to optimal traffic management. In this work, a low-cost and low-power wireless traffic sensor platform is built using a passive infrared camera and machine learning algorithms to detect vehicles and estimate the traffic speed. Concurrently, we are developing a theoretical framework for traffic estimation and optimal traffic management on networks of conservation laws. The developed sensor network, a variety of traffic estimation algorithms (including the ensemble Kalman filter and the particle filter), and the optimal traffic management (using convex programs) are simulated and validated in a micro-simulation environment.

Funding:  Illinois Department of Transportation

Students: Yanning Li, Juan Carlos Martinez Mori, Chris Chen, Fangyu Wu

Publications and Products:

  1. Y. Li, J. C. Martinez Mori, and D. Work. "Estimating traffic conditions from smart work zone systems." submitted to the Journal of Intelligent Transportation Systems, 2016. Download: preprint.
  2. Y. Li, C. Claudel, B. Piccoli, and D. Work. "A convex formulation of traffic dynamics on transportation networks." submitted to SIAM Journal on Applied Mathematics, 2016. Download: preprint.
  3. Y. Li, J. C. Martinez Mori, and D. Work. "Improving the effectiveness of smart work zone technologies." Illinois Department of Transportation Technical Report, submitted for review July 2016.
  4. B. Donovan, Y. Li, R. Stern, J. Jiang, C. Claudel, and D. Work. "Vehicle detection and speed estimation with PIR sensors." International Conference on Information Processing in Sensor Networks, peer reviewed poster session, April 2015. Download: preprint.

Management of shared e-scooter parking using historical data

Proliferation of shared urban mobility devices (SUMDs), particularly dockless e-scooters, has created opportunities for users desiring efficient, short trips. Simultaneously, these devices have raised management challenges for cities and regulators in terms of safety, infrastructure, and parking. There is a need in some high-demand areas for dedicated parking locations for dockless e-scooters and other devices.

We use the data generated by SUMD trips for establishing locations of parking facilities and assessing their required capacity and anticipated utilization. The problem objective is: find locations for a given number of parking facilities that maximize the number of trips that could reasonably be ended and parked at these facilities. Posed another way, what is the minimum number and best locations of parking facilities needed to cover a desired portion of trips at these facilities?

We find areas of high-density trip destination points using unsupervised machine learning algorithms to serve as parking locations. The dwell time of each device is used to estimate the number of devices parked in a location over time and the necessary capacity of the parking facility. We test these methods on scooter data totalling approximately 100,000 trips at Vanderbilt University. DBSCAN is the most effective algorithm tested for determining high-performing parking locations. A selection of 19 parking locations is enough to capture roughly 25% of all trips in the dataset. The vast majority of parking facilities found require a mean capacity of 6 scooters when sized for the 98th percentile of observed demand.
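The capacity-sizing step can be sketched as follows: given the dwell intervals of devices parked at one candidate facility, sample the occupancy over time and take a high percentile of it. This is a simplified illustration, not the project's implementation; the function names, the regular sampling grid, and the nearest-rank percentile rule are all assumptions made for the example.

```python
import math


def occupancy_series(intervals, t_start, t_end, step):
    """Count devices parked at each sample time in [t_start, t_end).

    intervals is a list of (arrival, departure) times for one facility;
    a device counts as parked when arrival <= t < departure.
    """
    counts = []
    t = t_start
    while t < t_end:
        counts.append(sum(1 for a, b in intervals if a <= t < b))
        t += step
    return counts


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def required_capacity(intervals, t_start, t_end, step, pct=98.0):
    """Capacity needed to cover the pct-th percentile of sampled occupancy."""
    return percentile_nearest_rank(
        occupancy_series(intervals, t_start, t_end, step), pct
    )
```

Sizing for a high percentile rather than the maximum keeps a facility from being dimensioned by a single rare peak, at the cost of occasional overflow.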

Students: Will Barbour, Ricardo Sandoval, Caleb Van Geffen

Publications and Products:

  1. W. Barbour, M. Wilbur, R. Sandoval, C. Van Geffen, B. Hall, A. Dubey, D. Work. "Data driven methods for effective micromobility parking." In Proceedings of the Transportation Research Board Annual Meeting, 2020 (submitted). Download: preprint.

Extreme Event Detection from Massive Transportation Data

This research project focuses on extreme events in urban transportation systems. Motivated by fast urbanization and the increasing frequency of extreme weather events, the need for methods to quantify infrastructure performance and resilience at city scales has become a priority. Research on extreme events can be greatly aided by the high volume of empirical data collected in recent years, such as the large taxi dataset published by New York City, or the Waze app dataset collected in Nashville. Data may of course be sparse and is in some cases masked. However, the sheer volume of data from various sources provides an underexploited starting point to understand how transportation systems respond to disruptions.

This project aims to find a method that can overcome the constraints of high-volume traffic data and distinguish "extreme" behaviors from "regular" behaviors. Exploiting the regular patterns helps: the traffic data can be rearranged into higher-dimensional arrays (tensors). The purpose of the project is to develop a reliable algorithm to analyze massive city traffic data in tensor format and to give better insight into traffic patterns and extreme event behavior.

Funding:  National Science Foundation

Students: Yue Hu

Publications and Products:

  1. Y. Hu and D. Work. "Robust Tensor Recovery with Fiber Outliers for Traffic Events." submitted to ACM Transactions on Knowledge Discovery from Data (TKDD), 2019 (under review). Download: preprint. Code: github.

Electric motor failure classification

Predictive maintenance of machinery has recently become a higher-performance methodology compared to traditional condition-based maintenance. This is due to improved sensing capabilities through the internet of things (IoT) and renewed interest in machine learning methods that perform diagnosis well. Fault states as well as machine health values can be predicted in some cases.

We are interested in classifying electric motor failure conditions using accelerometer data, with specific application to traction motors on railroad locomotives. Machine learning classifiers were first trained and tested on motor data from a test bench to make the binary prediction of faulty or healthy condition. Ensemble models were then developed to make the multi-label classification of which fault, or which combination of faults, is present in a motor.

Funding:  CRRC Corporation Limited

Students: Will Barbour, Derek Gloudemans


TrafficTurk

TrafficTurk is a mobile application that allows easy collection of traffic data. The app functions as a turning movement counter: the user stands at an intersection and swipes the screen when a vehicle passes. The user swipes the trajectory of the car on an intersection image, and different types of swipes correspond to different types of turns. All of these turns are sent wirelessly to a central server in real time, so they can be analyzed and stored in a database.

The purpose of TrafficTurk is to collect data on a large scale during unusual events. Ideally, many users will simultaneously collect data at nearby intersections in order to get a complete picture of the traffic patterns. The server sends a map to the app, allowing users to scroll around and see various intersections. They choose the one at which they intend to collect data, and begin counting.

TrafficTurk is designed for the worst-case scenario. During some types of unusual events, mobile networks become crowded or even fail entirely. For this reason, the app must be able to function with intermittent or absent internet connections. To support this, the app can easily pre-cache all of the intersections in an area. In this way, it is possible to count at these intersections even if there are no mobile networks available near them. Similarly, the app maintains a cache of collected data which only clears once the server has received it. In this way, the app has a 100% data collection guarantee as long as the phone eventually connects to a wireless network.
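The "cache that only clears once the server has received it" idea can be sketched as a small upload queue. This is an illustrative sketch, not TrafficTurk's actual code: the `UploadQueue` class, the `send` callback, and its return convention are hypothetical names introduced here, and the queue is kept in memory for brevity.

```python
from collections import deque


class UploadQueue:
    """Holds collected records until the server acknowledges each one."""

    def __init__(self, send):
        # send is a hypothetical callable(record) -> bool that returns True
        # only when the server has acknowledged the record; it may raise
        # OSError (e.g. ConnectionError) when the network is unavailable.
        self._send = send
        self._pending = deque()

    def record(self, item):
        """Store a new observation locally; never blocks on the network."""
        self._pending.append(item)

    def flush(self):
        """Upload pending records in order, stopping at the first failure.

        A record is removed only after a successful acknowledgement, so a
        failed upload leaves it in the queue for the next attempt.
        Returns the number of records uploaded in this call.
        """
        sent = 0
        while self._pending:
            item = self._pending[0]
            try:
                ok = self._send(item)
            except OSError:
                ok = False
            if not ok:
                break
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._pending)
```

Calling `flush()` whenever connectivity returns drains the queue in order. A deployed app would presumably also persist the queue to storage so that records survive a restart.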

Funding:  Civil and Environmental Engineering, UIUC; College of Engineering, UIUC; Illinois Department of Transportation; National Science Foundation

Students: Brian Donovan, Sudeep Gowrishankar, Mostafa Reisi Gahrooei, Jon Que, Bhinav Sura, Meng Han

Publications and Products:

  1. M. Reisi Gahrooei and D. Work. "Inferring traffic signal phases from turning movement counters using hidden Markov models." IEEE Transactions on Intelligent Transportation Systems, PP(99), pp. 1-11, June 2014. Download: preprint, source code
  2. S. Gowrishankar and D. Work. "Estimating traffic control strategies with inverse optimal control." in Proceedings of the IEEE Conference on Intelligent Transportation Systems, April 2013. Download: preprint, source code
  3. M. Reisi Gahrooei and D. Work. "Estimating traffic signal phases from turning movement counters." in Proceedings of the IEEE Conference on Intelligent Transportation Systems, April 2013, pp. 1113–1118. Download: preprint, source code
  4. AwesomeStitch for OpenStreetMap:

Estimating arrival times on freight railroads using machine learning

We have written a mixed integer linear optimization model that dispatches trains according to signaling constraints on single track railway lines with passing sidings and does so optimally according to the minimization of a weighted delay measure.

This concept is similar to that used in commercial computer-aided dispatching systems, but those systems have the notable shortcoming of being overridden often in areas with complex dispatching situations. That is, they do not match the behavior of human dispatchers well. I propose to remedy this problem by performing inverse optimization according to known historical data for single track rail lines.

Specifically, the forward optimization problem (dispatching trains) can be tuned to match historical dispatching behavior as closely as possible. The resulting dispatching model is a useful simulation tool that could be used for prediction of train arrivals, assessment of dispatching performance, investigation of track infrastructure layout, and schedule optimization.

Students: Will Barbour

Traffic State Estimation and Incident Detection

Joint traffic state estimation and incident detection is a critical problem for both traffic monitoring and highway safety. In this project, multiple model filtering techniques (e.g., particle filter and Kalman filter) and macroscopic traffic flow models are proposed to jointly estimate the traffic state and detect traffic incidents in real time using sensor data. The algorithms are tested on incident data generated by microscopic traffic simulation software (CORSIM) and on field data collected during the Mobile Century experiment (Berkeley, 2008). The left figure shows the traffic density evolution from a CORSIM simulation, where the red area in the plot indicates the traffic congestion caused by an incident.

Funding:  Federal Highway Administration

Students: Ren Wang

Publications and Products:

  1. R. Wang, S. Fan, and D. Work. "Efficient multiple model particle filtering for joint traffic state estimation and incident detection." Transportation Research Part C: Emerging Technologies (in press), 2016. Download: preprint.
  2. R. Wang, D. Work, and R. Sowers. "Multiple Model Particle Filter for Traffic Estimation and Incident Detection." IEEE Transactions on Intelligent Transportation Systems (in press), 2016. Download: preprint, source code.
  3. R. Wang and D. Work. "Joint parameter and state estimation algorithms for real-time traffic monitoring." NEXTRANS, technical report. No. DTRT12-G-UTC05, December 2013.
  4. R. Wang and D. Work. "Interactive multiple model ensemble Kalman filter for traffic estimation and incident detection." in Proceedings of the IEEE Conference on Intelligent Transportation Systems, pp. 804–809, October 2014. Download: preprint, source code

Decision Making with Data Uncertainty in Life Cycle Assessment

In life cycle assessment, when product systems are optimized to minimize environmental impacts, uncertainty in the process data may impact optimal decisions. In this project, a robust optimization approach is proposed for decision making under uncertainty at the life cycle inventory stage. The level of protection against data uncertainty can be controlled to reflect varying degrees of conservatism. The left figure shows the comparison between the proposed robust optimization approach (red) and deterministic optimization approach (blue) in terms of carbon oxygen output for an electricity generation problem with varying degrees of protection against data uncertainty.

Students: Ren Wang

Publications and Products:

  1. R. Wang and D. Work. "Application of robust optimization in matrix-based LCI for decision making under uncertainty." International Journal of Life Cycle Assessment, 19(5), pp. 1110-1118, 2014. Download: preprint, source code, manuscript.

Estimating arrival times on freight railroads using machine learning

Numerous methods exist by which to estimate the arrival times (ETAs) of freight trains. Many have been applied to passenger rail, particularly in Europe. But few have investigated the predictability of freight railroads in the United States with large amounts of historical data. Using a dataset from CSX Transportation, we applied machine learning regression techniques on a rich feature set. The features mined from historical railroad operational data included train characteristics, crew information, and network state information.

The dynamics of train operations across route segments are complex and varied due to numerous factors, including topography, locations of passing sidings, and locations of intermediate yards. For this reason, independent regression models were built for a series of discrete points across each route segment. The intuition behind this modeling decision is that the factors contributing to ETA prediction change as the train completes its route. This hypothesis is supported if significant changes are observed in regression feature weights between independent models.

The methodology was tested on sections of the railroad with notoriously varied and unpredictable behavior. ETA predictive performance was compared to that of simple statistical methods, which are used in some prediction systems on the railroad. Results showed a consistent 20-30% improvement in prediction accuracy (mean average error) at the beginning of route segments.

Feature weights did indeed change significantly between various regression models on the same route segment. In fact, difficult topographical areas were even identifiable by increased feature weight observed for train tonnage and length.

Funding:  Roadway Safety Institute, the University Transportation Center for USDOT Region 5

Students: Will Barbour

Publications and Products:

  1. W. Barbour, J. C. Martinez Mori, S. Kuppa, D. Work. "Estimating Arrival Times for US Freight Rail Traffic." Transportation Research Part C: Emerging Technologies, 2018. Download: preprint.
  2. W. Barbour, J. C. Martinez Mori, S. Kuppa, D. Work. "Predicting Delay Occurrence at Freight Rail Sidings." Proceedings of Transportation Research Board Annual Meeting, 2018 (to appear).
  3. W. Barbour, S. Kuppa, and D. Work. "Supporting Automated Operations with Improved Arrival Time Predictions on US Freight Railroads." in Proceedings of the ITRL Conference on Integrated Transport, Stockholm, Sweden, 2016. Download: preprint.

Quantifying resilience with coarse GPS data

What effects do natural disasters have on our transportation infrastructure?  How resilient are our cities to extreme events, and how quickly do they recover?  Our work aims to answer these questions in a way that is low-cost and practical, by analyzing millions of GPS records from taxis equipped with GPS.  In this way, it is possible to determine the pace of traffic (i.e., travel time per mile) between various regions of the city, and detect atypical congestion. Our method is tested using a dataset of nearly 700 million taxi trips from New York City.  We have made the dataset publicly available here.

The basic idea is that city traffic conditions follow a fairly periodic pattern.  The figure (left) shows the average pace (travel time per mile) between various regions of the city over the course of three weeks (East of the Hudson, Upper Manhattan, Midtown, and Lower Manhattan).  High travel times are obvious during rush hour each day, especially in the Midtown-to-Midtown trips (M-M).

On a particular day, the traffic conditions may deviate significantly from this expected behavior.  This is usually due to an extreme event such as a natural disaster or bad weather conditions.  Based on this deviation, it is possible to measure how "usual" or "unusual" the traffic conditions are.  For more information about how this probabilistic computation is performed, see our paper:

Brian Donovan and Daniel B. Work. Using coarse GPS data to quantify city-scale transportation system resilience to extreme events. Presented at the Transportation Research Board 94th Annual Meeting, January 2015. preprint, source code.
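The idea of scoring how unusual a given hour is, relative to the periodic pattern, can be illustrated with a deliberately simplified stand-in: a one-dimensional z-score of the observed pace against historical paces for the same hour of the week and the same pair of regions. The paper's probabilistic computation is more involved than this, and the function names below are assumptions made for the example.

```python
from statistics import mean, stdev


def pace(travel_time_s, distance_mi):
    """Pace in seconds per mile (travel time per mile)."""
    return travel_time_s / distance_mi


def deviation_score(observed_pace, historical_paces):
    """Z-score of an observed pace against historical paces.

    historical_paces should hold paces from the same hour of the week and
    the same origin/destination regions, so the periodic pattern cancels.
    Large positive values mean slower-than-usual traffic; large negative
    values mean faster-than-usual traffic.
    """
    mu = mean(historical_paces)
    sigma = stdev(historical_paces)  # sample standard deviation
    if sigma == 0:
        raise ValueError("historical paces have zero variance")
    return (observed_pace - mu) / sigma
```

A negative score corresponds to faster-than-expected travel, so a score of this kind can register both congestion and unusually empty streets.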

You can also download the data here:

Brian Donovan and Daniel B. Work. “New York City Taxi Trip Data (2010-2013)”. 1.0. University of Illinois at Urbana-Champaign. Dataset, 2014.

This video shows all of the taxi trips during the weeks before, during and after Hurricane Sandy, color coded by their pace.  On the right is a normal week for comparison.  The graph at the top indicates how typical or atypical the traffic conditions are, and indicates that it took over five days for the city traffic network to recover from the storm.

One interesting finding is that travel times are actually lower than expected in many parts of the city during the recovery process. This indicates a change in the usage of the transportation infrastructure: it is likely that the hurricane caused a decrease in usage. This is especially true in areas like Lower Manhattan, which experienced severe flooding and power outages. Another surprising result is observable throughout the city on Wednesday the 31st, when the network is completely congested by people trying to return to normal operations. This result indicates that there is room for further research in the post-disaster process of repopulating the city after an evacuation.

Funding:  National Science Foundation

Students: Brian Donovan

Publications and Products:

  1. Source Code:
  2. 2010-2013 NYC Taxi Dataset
  3. 2010-2013 NYC Traffic Estimates
  4. B. Donovan, J. Lee, and D. Work. "A high resolution method for quantifying resilience of urban road networks." submitted to IEEE Transactions on Intelligent Transportation Systems, 2016. Download:  preprint.
  5. X. Guan, C. Chen, and D. Work. "From Warnings to Awareness and Actions: Insights from Hurricane Sandy." submitted to the 2017 Transportation Research Board Annual Meeting, 2016.
  6. B. Donovan and D. Work. "Empirically quantifying city-scale transportation resilience to extreme events." submitted to Transportation Research Part C: Emerging Technologies, 2016. Download: preprint.
  7. X. Guan, C. Chen, and D. Work. "Tracking the Evolution of Infrastructure Systems and Mass Responses Using Publically Available Data." submitted to PLOS ONE, 2016.
  8. B. Donovan and D. Work. "Using coarse GPS data to quantify city-scale transportation system resilience to extreme events." presented at the Transportation Research Board 94th Annual Meeting, January 2015. Download: preprint, source code, data.