Comparing Vehicle Tracking Data from CIS, Vision and NextBus/TransSee

This page is a work-in-progress and is “off the grid” on this site in the sense that it is not part of the regular blog, but is here for interested parties.

For many years, I have tracked the behaviour of various TTC routes using data from the Communications and Information System (CIS) which has been around for over two decades. It had its problems, especially in early years before the TTC adopted GPS tracking, but the data are generally reliable.

Over the past few years, the TTC has rolled out a new system called VISION which includes tracking information. However, the data I have received from VISION are nowhere near as granular as those from CIS because VISION’s reporting is based on stop-level monitoring, not a continuous data stream. This is not an inherent problem with VISION itself, because the more detailed information goes to the NextBus feed (see below), but with the granularity of data made available for analysis. (This could also reflect the granularity at which data are archived.)

NextBus receives a continuous data feed from the TTC and uses this to predict vehicle arrivals at stops. This information is available through a public interface and is used by various apps to produce a tailored version of the NextBus info. One of these, TransSee, is the work of Darwin O’Connor, and he archives much of the NextBus data. Indirectly, this provides access to finer-grained data than is available from VISION.

This article compares data from these three sources and demonstrates the capabilities and limitations inherent in each source.

Conversion of Data to a Common Format

The CIS data stream is the most detailed, and over many years I have developed a suite of programs to digest and present these data. The process is described in detail in Methodology for Analysis of TTC’s Vehicle Tracking Data. That article is more or less up-to-date with the process as it now exists.

In order to present information from other sources in a comparable manner, the data streams are converted to the same format as the CIS data.

CIS Data:

  • Contains one record for each vehicle (usually) every 20 seconds (the CIS polling cycle) including the GPS co-ordinates.
  • Stationary vehicles transmit the same information on each cycle.

VISION Data:

  • Contains one record for each stop a vehicle passes giving the vehicle’s GPS co-ordinates plus the arrival and departure time at the stop. The GPS co-ordinates are not the same as the stop co-ordinates and can differ from them substantially. This implies that there is a boundary around the stop within which a vehicle is deemed to be “at” the stop. At some locations, this is rather generous to accommodate variations in stopping location as well as GPS errors caused by building reflections of signals.
  • Under certain circumstances, typically at terminals where operators often “sign off” of VISION during a layover, there is no record of arrival or departure at one or more near-terminal stops.

NextBus/TransSee Data

  • Contains one record for each cycle that TransSee polls NextBus and gets updated information. This is typically about 30 seconds. As with CIS data, the records include the GPS co-ordinates.
  • In an attempt to conserve archive space, TransSee does not retain records for vehicles that are stationary or nearly so.

In all cases, times are rounded down to 20-second intervals to match the CIS format. This is because “time” within my programs is measured in 20-second “ticks” for the purpose of interpolating vehicle locations between points where there are missing CIS records. This interpolation is particularly important when dealing with VISION data, which report vehicle times and locations only at stops, not in between.
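The rounding and interpolation described above can be sketched roughly as follows. This is an illustration only, not the actual programs behind these charts: the function names are hypothetical, and it assumes that positions are already expressed as a single number along the route and that straight-line (linear) interpolation between known points is acceptable.

```python
from datetime import datetime

TICK_SECONDS = 20  # the CIS polling cycle, per the text


def to_tick(ts: datetime) -> int:
    """Convert a timestamp to a tick index, rounding down to 20 seconds.

    Ticks are counted from midnight of the timestamp's own day.
    """
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    return seconds // TICK_SECONDS


def interpolate(records):
    """Fill in missing ticks between known (tick, position) records.

    `records` must be sorted by tick. Positions are assumed (for this
    sketch) to be distances along the route. Returns {tick: position}.
    """
    filled = {}
    for (t0, p0), (t1, p1) in zip(records, records[1:]):
        span = t1 - t0
        # range() is empty when two records share a tick, so no division by zero
        for t in range(t0, t1):
            filled[t] = p0 + (p1 - p0) * (t - t0) / span
    if records:
        last_tick, last_pos = records[-1]
        filled[last_tick] = last_pos
    return filled


if __name__ == "__main__":
    print(to_tick(datetime(2019, 6, 3, 16, 47, 39)))
    print(interpolate([(0, 0.0), (4, 40.0)]))
```

With stop-level data, the two known points may be several minutes apart, so every tick in between is an estimate rather than an observation.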

Records from all sources are checked for reasonableness of the GPS information because signal echo in “canyons” causes mis-reporting of locations. In the case of CIS, this usually corrects itself on the next polling cycle, and so the “missing” data cover only a short time interval. The echo problem shows up in various ways, ranging from a vehicle claiming to be a block away from its actual location to being in the wilds of Caledon or sailing on Lake Ontario. There were also some vehicles with notoriously inaccurate GPS units, although this appears to be less of a problem now than in past years.
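One simple form of reasonableness check is a bounding-box test: a record is kept only if its co-ordinates fall inside a box drawn around the route. The sketch below is a minimal illustration of that idea. The box co-ordinates are made-up values roughly surrounding Bay Street, not the ones used for the charts in this article, and the names are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """A latitude/longitude rectangle (degrees)."""
    south: float
    north: float
    west: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        return self.south <= lat <= self.north and self.west <= lon <= self.east


# Illustrative box only (an assumption for this sketch).
ROUTE_BOXES = [Box(south=43.63, north=43.68, west=-79.40, east=-79.36)]


def is_reasonable(lat: float, lon: float, boxes=ROUTE_BOXES) -> bool:
    """True if the point lies inside at least one of the route's boxes."""
    return any(box.contains(lat, lon) for box in boxes)


if __name__ == "__main__":
    print(is_reasonable(43.65, -79.38))   # a point near Bay Street
    print(is_reasonable(43.55, -79.38))   # a point out on Lake Ontario
```

A box test catches the gross errors (Caledon, the lake) but not a vehicle reported one block away from its true position. That kind of error remains in the data as scatter.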

With the data in a common format, vehicle locations are mapped into a linear space that “flattens” the route into a line where 1 unit is approximately equal to 10m. This permits display of vehicles and route behaviour in the format of a graphic timetable. (For a more detailed discussion of this process, see the article linked above.)
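To illustrate what “flattening” means, here is a minimal sketch that projects a GPS point onto a route drawn as a series of straight segments and reports the distance along the route in 10m units. This is not the box-based scheme used for the charts in this article (that is described in the methodology article). It swaps in a generic polyline projection with a flat-earth approximation, which is adequate over the length of a city route. All names are hypothetical.

```python
import math

METRES_PER_UNIT = 10.0        # 1 unit is approximately 10m, per the text
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def to_xy(lat, lon, lat0, lon0):
    """Approximate metres east and north of a reference point."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_M
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def linear_position(lat, lon, route):
    """Distance along `route` (a list of (lat, lon) points), in 10m units,
    of the point on the route closest to (lat, lon)."""
    lat0, lon0 = route[0]
    px, py = to_xy(lat, lon, lat0, lon0)
    pts = [to_xy(a, b, lat0, lon0) for a, b in route]
    best_dist, best_along = float("inf"), 0.0
    travelled = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        if seg_len == 0:
            continue  # skip duplicate route points
        # Fraction along this segment of the closest point, clamped to [0, 1]
        t = ((px - x0) * dx + (py - y0) * dy) / (seg_len ** 2)
        t = max(0.0, min(1.0, t))
        dist = math.hypot(px - (x0 + t * dx), py - (y0 + t * dy))
        if dist < best_dist:
            best_dist, best_along = dist, travelled + t * seg_len
        travelled += seg_len
    return best_along / METRES_PER_UNIT


if __name__ == "__main__":
    straight = [(43.00, -79.00), (43.01, -79.00)]  # about 1.1 km due north
    print(round(linear_position(43.005, -79.00, straight), 1))
```

A projection like this cannot tell the two directions of travel apart, and it behaves badly where a route doubles back on itself in a loop. Those cases need extra rules.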

The Raw Data

The difference in the level of detail available from each feed is shown in three sets of charts below which show one day’s data for the Bay bus.

CIS data are the most detailed because of the 20-second polling cycle and the fact that records are not dropped even if an operator “signs off” for a layover. This sample dates from Monday, November 6, 2017 before the 6 Bay bus was running on VISION.

The first chart shows the data from the north end of the route at Dupont south to about College Street. The dispersion of the dots from Bloor Street south is caused by GPS reflections and positioning errors in the Bay Street “canyon”. The effect gets worse further south. Where this is not an issue, as on Davenport, it is possible to see the eastbound and westbound points as separate tracks.

The loop via Davenport and Yonge used by the 6B short turn, which was dropped from the schedules on June 24, 2019, is clearly visible here. This shows up in the 2019 data from VISION and NextBus in an unexpected manner later.

Also visible here is some off-route travel. One bus made a trip south on Bedford to St. George Station, and another went down Avenue Road/University to Wellesley.

The second chart runs from College south to Queens Quay. The jog at Queen is about half way down. Through the financial district, GPS resolution wanders a lot.

On both charts, the boxes are the bounding areas for data that I include or exclude as well as part of the conversion scheme to “flatten” the route’s geography into a single dimension.

From the VISION system, the plot is much different. Here are the data for Monday, June 3, 2019. The red circles indicate the GPS position of the stops according to the VISION data feed. It is clear that vehicles considered to be “at” these stops can be at some distance, and this is likely a compromise made so that VISION can report activity at a stop level while allowing for variation in GPS locations. In particular, note the extended series of data points in the Dockside Drive loop at the southeast end of the route, all of which are considered by VISION to be “at” the stop even though they are actually between two official stop locations.

The fundamental point here is that there is no information about the behaviour of vehicles between stops. Specifically, if there is a congested point between stops, its location is invisible, and the problem shows up only as an extended travel time between stops where the vehicle location is reported. Similarly, if operators take layovers near terminals, but before VISION determines that they have “arrived” there, this layover would count as part of the travel time in any analysis. (The situation at terminals is further complicated by VISION simply not reporting vehicles arriving or leaving at terminal stops, as we will see later.)

Note that there are almost no data points on the 6B Davenport Loop. It turns out that on June 3, almost all buses operated north to Dupont even though the 6B short turn was still in peak period schedules.

Finally, here are the NextBus/TransSee data, also for Monday, June 3, 2019. As with the CIS data, there is continuous tracking along the route, but what is quite striking is how closely the data points align with the underlying streets. There is none of the scatter visible in the CIS or VISION data, implying that some sort of cleanup is applied to the NextBus data along the way.

As with the VISION data, there are few data points for the 6B Davenport Loop. This produces an interesting effect in the NextBus feed as we will see later.

Conversion to Graphic Timetables

In the following sections, data for various routes are shown. In each case, the graphic timetable format of the data is shown for the PM peak, but the full day’s data plot is available in a PDF as well.

6 Bay

CIS-based versions of these charts will be familiar to readers of this site. Here is a sample of the Bay bus data from November 6, 2017 rendered into this format. The lines are a bit ragged, but that is the inevitable effect of variation in traffic speed which affects the slope of the line. Where a line is horizontal, the bus is not moving.

Full day chart PDF

The chart from VISION data looks quite a bit different.

  • At both terminals, but particularly at the south end of the route, many vehicles are not reported at the “end of the line” but appear to turn around before reaching the terminal because the data points only cover stops near, but not at the termini.
  • The pink line southbound from Davenport at about 16:45 illustrates a problem with reporting only at the stop level. Normally, my routines would detect a “jump” where a bus travels some distance without any intervening data points on the route (this can occur due to a diversion, among other reasons). The data for this bus reports its location at Davenport at about 16:47 and then at King Street at 17:11. What it actually did in the intervening time is unknown. I have left this “jump” in the chart as a sample of the problems of stop-level location tracking.
  • Despite the fact that the 6B Davenport short turn is still part of the schedule, almost all runs are shown as operating north to Dupont Street. Construction on the loop affected service on this day.

Full day chart PDF

Finally, the same day with data from NextBus/TransSee. This took me a bit by surprise until I figured out what was happening because it shows the 6B Davenport short turns, in contrast to the VISION data above.

This is an example of a shortcoming in NextBus in that it tracks what it expects to see from the schedule, not what is actually happening. If a vehicle does something unexpected, it simply disappears from the feed. In this case, NextBus expects service to turn back at Davenport, and does not track the affected runs while they go to Dupont. They re-appear on their southbound trip.

However, the NextBus feed does a somewhat better job of tracking vehicles to termini (at least those that are scheduled to go there).

(This is a very serious problem during diversions such as TIFF when the TTC does not publish an alternate route structure for NextBus and predictions for vehicle arrivals are almost useless, just when they are most needed. Similarly, any extras that are assigned to a route, including shuttle buses for diversions, do not show up on NextBus because their run numbers are not part of the official schedule.)

The “jumping bus” from the VISION data is tracked by NextBus and shows a continuous data stream southbound. Somehow, VISION lost track of this vehicle, but NextBus didn’t. Note that NextBus shows the vehicle southbound from Bloor about 10 minutes later than the time implied by VISION, hence the “better behaved” slope of the line in the chart below.

Full day chart PDF

Both the VISION and NextBus/TransSee data show the problems inherent in using a filtered data feed rather than the raw data from the tracking system. Each of these feeds has “digested” the data to fit its reporting model, but in the process has removed or distorted information about actual vehicle behaviour.

A problem with the VISION stop-based data is that it can be difficult to distinguish between a gap between stops and a diversion off route if breaks of several minutes between reported vehicle locations are considered “normal”. This shows up particularly on express bus routes where the combination of widely-spaced stops and traffic congestion can cause what appear to be “gaps” in location reporting. For the purposes of charts in this article, I adjusted the charting logic so that only gaps of 30 minutes or more would produce a break in lines on the charts.
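The gap rule can be sketched as follows: a vehicle’s track is cut into separate line segments wherever consecutive data points are 30 minutes or more apart. This is a minimal illustration with hypothetical names, not the actual charting logic, and it assumes each point is a (seconds, position) pair sorted by time.

```python
GAP_LIMIT_SECONDS = 30 * 60  # gaps of 30 minutes or more break the line


def split_at_gaps(points, gap_limit=GAP_LIMIT_SECONDS):
    """Split a time-sorted list of (seconds, position) points into segments,
    starting a new segment whenever the time gap reaches `gap_limit`."""
    segments = []
    current = []
    for point in points:
        if current and point[0] - current[-1][0] >= gap_limit:
            segments.append(current)
            current = []
        current.append(point)
    if current:
        segments.append(current)
    return segments


if __name__ == "__main__":
    track = [(0, 10), (600, 55), (2400, 300), (2500, 310)]
    for segment in split_at_gaps(track):
        print(segment)
```

The trade-off is in the threshold. A short limit breaks the line for every slow trip between widely spaced stops, while a long limit draws a straight line through periods when the vehicle’s real behaviour is unknown.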

929 Dufferin Express VISION Data

The 929 Dufferin Express shows some of the same problems as the 6 Bay route with lost tracking at terminals, particularly at Wilson Station. In the VISION data feed, many buses do not report going north of Wilson Avenue. Similar problems exist with data for 29 Dufferin.

Something truly odd happens with the bus arriving at Wilson Station at about 17:45. It has an extremely ragged location track northbound from Lawrence to Yorkdale which is not appropriate for an “express” bus. Looking at the raw data, there are:

  • 3 records supposedly at Lawrence West, North Side stop
  • 23 records supposedly at Orfus Road
  • 7 at Jane Osler Boulevard (Yorkdale)
  • 4 at Wilson Avenue

All of these have different GPS vehicle locations despite being, in theory, records for only four stops. VISION is clearly quite confused about what this bus is doing.

Full day chart PDF

505 Dundas Bus VISION and CIS Data

For quite some time, 505 Dundas has operated as a bus route because of the streetcar shortage and because of water main construction working its way east along Dundas to Bay Street. Most of the tracking data for Dundas buses comes from VISION, but until the third week of June 2019 a batch of old buses (79xx series Orions dating from 2006) still using CIS was mixed in on the 505. This allows a direct comparison of data for the same day collected via the two systems.

Comparing the two charts for the PM Peak:

  • The CIS chart does a better job of resolving behaviour at the terminals than the chart with VISION data.
    • At Dundas West, there is no inbound stop near the station as there is at Broadview (Erindale at Broadview, and Broadview at Danforth), and so the first stop that reports an inbound trip is at Dundas and Roncesvalles. Buses spend a fair amount of time at the station, and then en route to that first inbound stop, but the split between layover and travel time cannot be determined from the data.
    • Similarly at Broadview, the last outbound stop before the station is northbound on Broadview at Wolfrey (unlike the situation at Dundas West where there is a northbound stop at Bloor). When this is combined with severe queuing problems for vehicles attempting to enter Broadview Station, the actual vehicle location as it approaches the station is unknown. Buses leave Wolfrey and eventually arrive at the station, but the time spent at various locations en route is unknown.
  • In many cases, the VISION data give the impression that vehicles do not spend long at the terminals, but this is not consistent with actual observed behaviour. At Broadview Station it is common to see so many 505 buses accumulate that there is no room on the platform for them. This is a direct result of very generous running times padded for construction delays.

Full day charts: VISION CIS

7 Bathurst Bus VISION Data

As on other routes, the VISION data for 7 Bathurst misrepresents vehicle behaviour at terminals, notably at Bathurst Station. Buses appear to take a long time either southbound from the stop north of the station (Barton) to the loop, or a long time for the same northbound trip. However, what is actually happening is that buses get a long layover at the station. Some vehicles are never shown as reaching the station at all (horizontal lines just above the x-axis).

Full day charts PDF

41 Keele and 941 Keele Express VISION Data

The Keele bus shows similar problems with terminal layover times as seen on other routes.

Full day charts PDF

Oddly enough, this problem does not affect the 941 Keele Express bus to the same degree, and terminal layovers are generally shown correctly with separate arrival and departure times.

Full day charts PDF

The Potential Effect on Scheduled Running Times

Many bus route schedules are being modified for “reliability” with increases in both driving and recovery times. According to TTC Planning, the new times are based on the 85th percentile of actual travel times.

However, it is clear from the VISION data on many routes that terminal-to-terminal times may be inflated because what is really layover time near terminals is counted as part of the journey. This raises the question of whether new schedules are based on erroneous travel time data.
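A small example shows how hidden layovers could shift an 85th percentile. The numbers below are invented for illustration and are not TTC data. The sketch uses the “nearest rank” definition of a percentile, and it is an assumption that this resembles whatever method TTC Planning actually uses.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    `pct` percent of the observations are less than or equal to it."""
    if not values:
        raise ValueError("no values supplied")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical one-way driving times in minutes (invented values).
driving = [50, 52, 55, 58, 60, 61, 63, 65, 70, 90]
# Hypothetical layover minutes wrongly counted as travel near a terminal.
hidden_layover = [0, 0, 5, 0, 8, 0, 10, 0, 6, 0]
reported = [d + extra for d, extra in zip(driving, hidden_layover)]

if __name__ == "__main__":
    print(percentile_nearest_rank(driving, 85))   # true driving times
    print(percentile_nearest_rank(reported, 85))  # with layovers mixed in
```

In this made-up case the 85th percentile rises from 70 to 76 minutes even though no trip actually took longer to drive.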

The new schedules achieve their ends in two ways:

  • Stretching headways with no additional vehicles so that “N” vehicles get a longer trip and/or recovery time.
  • Adding vehicles to a route so that there is more running time on a similar headway to the old schedule.

The first of these penalizes riders by making wait times longer, while the second adds to the number of vehicles required without providing additional service.
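The arithmetic behind both options is the same relationship: the headway equals the round trip time (driving plus recovery) divided by the number of vehicles. The figures in this sketch are hypothetical and chosen only to make the trade-off visible.

```python
import math


def headway(round_trip_min, vehicles):
    """Average headway in minutes for a given round trip time and fleet."""
    return round_trip_min / vehicles


def vehicles_needed(round_trip_min, headway_min):
    """Vehicles required to hold a headway over a given round trip time."""
    return math.ceil(round_trip_min / headway_min)


if __name__ == "__main__":
    # Hypothetical route: 60-minute round trip with 10 buses.
    print(headway(60, 10))
    # Option 1: stretch the round trip to 70 minutes with the same 10 buses.
    print(headway(70, 10))
    # Option 2: keep the original 6-minute headway on the 70-minute round trip.
    print(vehicles_needed(70, 6))
```

In this example, the first option widens the headway from 6 to 7 minutes, while the second needs 12 buses instead of 10 to provide the same service as before.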

I will review this issue in a future article.