The quality of service provided on Toronto’s streets and in the subway has been a major, long-running topic on this site. As reported last week, the TTC has just issued its third quarterly report on surface route reliability relative to a target of scheduled headway ±3 minutes. They acknowledge that the methodology behind these numbers is flawed, and seek a better way to track reliability from the riders’ point of view.
To that end, the TTC is looking at the “journey time metric” used in London, UK, which tracks an entire trip’s experience including access, waiting and transfer times. Leaving aside the need to define multiple trips both in location (downtown, suburban, in between) and in time (peak commutes, midday, evening, weekends), I believe that multiple metrics are required to flag problems at a level that is both meaningful and revealing of problem specifics.
What follows is a slightly reworked version of a proposal I made to the TTC recently on this subject.
1. Granular Analysis and Reporting

Whatever metrics are used, they should be calculated on a disaggregated basis: by time of day, by day of week (including weekends), and by portion of route. The outliers should be reported.
In other words, if there are 1,000 combinations of route, time period and location, and 500 of them are poor performers, then we have a large and pervasive problem. If there are only 100 really poor performers, then things may not be too bad. However, if these are concentrated by period or location, that shows an area needing attention.
If we find that only 20% of “Saturday” meets the standard, this is a major issue given the level of weekend demand, but the problem could be submerged in the much more plentiful weekday statistics by quarterly averaging. Route analyses published on this site show that evening and weekend services in many cases appear to operate with little or no supervision and completely unpredictable service quality.
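The kind of exception reporting described above can be sketched in a few lines. This is only an illustration with made-up records and route numbers, not the TTC's data or methodology: each record is a (route, period, location) combination with a pass/fail result, and failures are grouped by dimension to show whether problems are pervasive or concentrated.

```python
# Hypothetical disaggregated results: (route, period, location, passed).
# Routes, periods and pass/fail values here are invented for illustration.
from collections import Counter

results = [
    ("504", "AM peak",  "central",            True),
    ("504", "Saturday", "west of Dufferin",   False),
    ("504", "Saturday", "east of Parliament", False),
    ("29",  "evening",  "north end",          False),
    ("29",  "AM peak",  "north end",          True),
    ("36",  "Saturday", "west end",           False),
]

# Keep only the poor performers, then count them by time period to see
# whether failures cluster (here, on weekends).
failures = [r for r in results if not r[3]]
by_period = Counter(period for _, period, _, _ in failures)

print(f"{len(failures)} of {len(results)} combinations below standard")
print("failures by period:", dict(by_period))
```

With real data the same grouping could be run by route or by location, and it is the concentrations, not the overall average, that point to where attention is needed.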
Major routes should be subdivided for analysis so that good service on the central part of 504 King (say) does not mask problems on outer sections (e.g. west of Dufferin, east of Parliament).
Granular reporting allows the selection of an appropriate metric for the route, time and location.
2. Headway Adherence and On Time Performance
These stats remain valuable, but only with the granular analysis and reporting described above.
Where headways are greater than some policy threshold (say 15 minutes), on time performance (OTP) is more meaningful than headway adherence. Buses may be “properly” spaced, but they may not arrive when riders expect to see them. This can foul up both wait times and transfer connections.
This is particularly important for night services, but it can also affect major routes with scheduled short-turns or branches that leave wider headways on the outer sections. The metric appropriate for a line at Yonge Street in the peak period may not be appropriate for a branch well away from the core, especially in the evenings and on weekends.
Very short headways (bunching) do not necessarily represent acceptable service even if they fall within the ±3 minute rule. Riders will tend to cram onto the first bus or streetcar that appears. If that vehicle is short turned, at least a through vehicle may be close behind. A related problem is that not all vehicles in a bunch may be equally attractive to riders depending on their destination, even assuming that they all actually stop (buses in packs love to leapfrog).
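The way the two metrics can disagree is easy to show with a sketch. The arrival times below are invented, and the -1/+5 minute OTP window is an assumption for illustration, not a statement of the TTC's actual definition: a 15-minute route running uniformly seven minutes late passes headway adherence perfectly while failing OTP completely.

```python
# Hypothetical times, in minutes past the hour, on a 15-minute route.
scheduled = [0, 15, 30, 45]
actual    = [7, 22, 37, 52]   # perfectly spaced, but 7 minutes late

# Headway adherence: gaps between vehicles within scheduled headway ±3 min.
gaps = [b - a for a, b in zip(actual, actual[1:])]
headway_ok = [abs(g - 15) <= 3 for g in gaps]

# OTP (assumed window): each trip between 1 minute early and 5 minutes late.
otp_ok = [-1 <= a - s <= 5 for s, a in zip(scheduled, actual)]

print(f"headway adherence: {sum(headway_ok)}/{len(headway_ok)} gaps pass")
print(f"on-time performance: {sum(otp_ok)}/{len(otp_ok)} trips pass")
```

On a frequent downtown service the headway result is the one riders feel; on a wide-headway suburban branch at night, the OTP result is.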
3. Proportion of Service Operated
How many trips actually operate at a specific location and time period versus what is on the schedule? If all trips are present, but either the headway or OTP metric is low, then we know that it’s line conditions (traffic, operation, management) that are the likely culprits. If, however, trips are missing, this will cascade into wider headways and trips that are not just off schedule, but completely absent.
On frequent services, a missing vehicle (including short turns) may not trigger an exception to the headway metric. Half of the peak scheduled subway service could be missing and the route would still score 100% on the 3 minute rule. (The scheduled headway is 2’20”, and half the service would be 4’40”. The service provided would fall within the span of 0’00” to 5’50” allowed by the metric.)
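The arithmetic of that subway example can be checked directly. This is a sketch of the quirk in the metric as described above, using the 2'20" (140 second) scheduled headway and a ±3 minute (180 second) band:

```python
# Scheduled peak headway and the gap left if every second train is missing.
scheduled = 140          # seconds between trains as scheduled (2'20")
actual = 2 * scheduled   # half the service missing -> 4'40" gaps

# The ±3 minute band around the scheduled headway, floored at zero.
lower, upper = max(0, scheduled - 180), scheduled + 180   # 0" to 5'50"
within_band = lower <= actual <= upper

print(f"gap with half the trains missing: {actual} s")
print(f"allowed band: {lower}-{upper} s -> passes: {within_band}")
```

A route delivering half its scheduled service still scores a pass on every gap, which is why a separate "proportion of service operated" measure is needed at all.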
4. Consolidated Metrics
By now, readers familiar with such management tools as “scorecards” will know that the amount of detail this process describes will be substantial, and the typical manager’s eyes will glaze over even at the thought of a chart or spreadsheet with myriad details. Managers love to reduce their worlds to simple, one-dimensional numbers, or even something as simple as a traffic signal. This sort of attitude drives those of us who know that the system fails or succeeds at the detail level absolutely bonkers.
Having calculated all of the details, the goal should be to meet a compound standard but still at a granular level by location. For example, did service operate to reasonable headways and at the scheduled trip count?
Exception reporting is essential here. Assuming that service is as good as the TTC often claims it to be, then the number of instances of routes failing to meet criteria should be small. These are the ones to flag for attention, especially if they show up again and again, or if they score poorly on a consistent basis over a long period. Conversely, if only a handful of routes manages to stay off the “bad behaviour list” and every edition of this report lands with a thud (real or virtual), then there is something very wrong.
If the manager wants a simple metric, it should be to get that report down to one page, and not just by printing in teensy-weensy type on the largest available sheet of paper.
5. Ridership and Crowding
If a service is overcrowded, this is extremely unpleasant for riders and deters transit use. Riders may face full vehicles they cannot board, and even when they do squeeze on, their trip is in less than ideal circumstances. TTC Service Standards were relaxed as a budgetary measure, albeit a one-time fix that cannot be repeated, and the small amount of surplus capacity that had been designed into the bus network was removed. This leaves no room for growth, but keeps the budget hawks at Council happy with the supposed “efficiency” of the transit system. How this is supposed to attract motorists to transit is a mystery.
The TTC takes riding counts regularly, and is supposed to be equipping vehicles with automatic passenger counters to aid in the frequency and granularity of counts. This information should be reported, and it should not be consolidated into average loads per peak hour. If vehicle loads are uneven (as they often are with poorly spaced service), then an average load will mask what riders actually experience. Most riders are not on the half-empty vehicles. Would-be riders who give up and walk (or take a taxi) are not counted at all.
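The gap between an average load and what riders experience is worth making concrete. The loads below are invented: four buses cross a point in one hour, two packed and two nearly empty. The per-vehicle average looks comfortable, but weighting by the number of riders on each bus shows what the typical rider actually sees.

```python
# Hypothetical loads on four buses passing one point in an hour.
loads = [70, 10, 70, 10]

# Per-vehicle average: what a consolidated report would show.
simple_average = sum(loads) / len(loads)

# Per-rider average: weight each bus's load by the riders aboard it,
# since most riders are on the crowded vehicles, not the empty ones.
rider_average = sum(l * l for l in loads) / sum(loads)

print(f"average load per bus: {simple_average:.0f}")
print(f"load on the average rider's bus: {rider_average:.0f}")
```

The per-vehicle figure of 40 suggests all is well; the rider-weighted figure of over 60 is closer to the crowding people complain about, and neither counts the would-be riders who gave up.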
The fix may be better line management, traffic priority, more service or a combination of all three. With the political focus on big-ticket expansion programs, attention must be drawn to the shortcomings of the service provided today.
6. Journey Time
It will be impossible to construct every possible variation of trip in the network, but a representative set of journeys will certainly give a high level view. These journeys must reflect not just the peak period, core-oriented trips for which the system is optimized. Suburb-to-suburb trips, midday trips, and evening and weekend trips all need to be considered.
There is a “Catch 22” here. Suppose that a journey involves a walk to a transit route, a wait for a bus, a ride to a subway terminal, a walk through that terminal to the train platform, a wait for a train, a ride when the train arrives, and finally a walk from the final station to one’s destination. Transit planners know that these components are perceived very differently by riders.
For example, wait time, especially for an unpredictable service, is poisonous to the perception of a convenient service. There is anxiety associated with uncertainty about a vehicle’s arrival and the rider’s ability to board. The time is not spent doing anything productive (e.g. travelling), and the environment may be less than ideal even in a subway station. Wait times will be compounded, of course, if the service is overcrowded because a rider may not be able to board the first train. This sort of problem needs to be included in the construction of the metric.
Conversely, some parts of the trip (walking, riding) may have fairly consistent values, and these could swamp large swings in the more annoying components of waits and transfers.
While journey times might provide another way of looking at service quality, they are not a substitute for tracking service behaviour at the detailed level.
Some of the service quality metrics will interact and fixing one problem may bring improvement in multiple values. However, if we don’t know the details in time and location, and only vaguely sense that “the Finch West bus is a mess”, this feeds the TTC culture where “traffic congestion” and a “not our fault” attitude prevail.
Some issues — budget allocations for service and fleet levels, transit signal priority, parking bylaws and towing policies to keep streets clear — do require external assistance and difficult policy decisions by City Council. Average reports over three months’ operation do not, however, pinpoint the problems or show the degree to which they are shared across the city.
Any metrics describing TTC service must be at a level riders can understand. Too much consolidation, whether in time, location, or trip type, will hide valuable information that riders know from first-hand experience.
A goal that service will meet a standard 70% of the time is laughable. Such a goal guarantees that, on average, a typical 10-trip-a-week commuter will have three trips that do not meet the standard. This could be a small variation, or a major disruption. Averaged over three months, that 3-in-10 will be much worse on bad-weather days and on days when unusual events disrupt the system. An entire week without a problem will be rare.
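The 3-in-10 arithmetic is simple, and so is the claim that a trouble-free week will be rare. As a rough sketch, treat each of the ten weekly trips as independently meeting the standard with probability 0.7 (a simplifying assumption, since in practice failures cluster on bad days):

```python
# Simplified model: 10 trips a week, each independently meeting the
# standard with probability 0.7 (independence is an assumption here).
p = 0.7
trips = 10

expected_failures = trips * (1 - p)   # average sub-standard trips a week
perfect_week = p ** trips             # chance every trip meets standard

print(f"expected sub-standard trips per week: {expected_failures:.0f}")
print(f"chance of a trouble-free week: {perfect_week:.1%}")
```

Even under this generous independence assumption, a commuter sees a completely trouble-free week less than 3% of the time, which is what makes a 70% target so weak.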
That 3-in-10 goal has a long history at the TTC where the earlier metric — schedule adherence within three minutes — was the target. The service could only manage a 70% rating a good deal of the time, and the “target” was set to match actual behaviour. That’s no way to improve, only a way to say “we didn’t get any worse this week”.
Management has hidden behind the averages for too long. Granular reporting and multiple metrics are needed to ferret out the problems. There is no need to report every item, but without the details (visible in public reports), the value of any measurement exercise is dubious.
Much of this information can be calculated retroactively from vehicle tracking data and this would allow transition to new metrics by seeing the effect of a new scheme on periods for which old-style reports are already available. A trial on a few major routes could be used to evaluate various options and to understand how the new metrics behave.
The TTC keeps talking about its customers and how much more responsive they want to be to riders’ concerns. This means more than cleaning up subway stations. The organization needs to focus on the product they are actually selling — service.