Saturday, July 23, 2011

How Sightsplanner searches

In this article we will examine more of the inner workings of Sightsplanner - it's time to see how to construct a recommendation that consists of multiple items, each individually rated according to how well it matches the user's profile. There is a lot of ground to cover, so we will be glossing over some details.

We will begin by explaining how the recommendation is modeled as an optimization problem and look at the method of solving it - iterative search. Starting with a simple greedy search we will add features one by one, arriving at a more evolved meta-heuristic called GRASP. Also, an example of the search algorithm at work will be given.

The optimization problem

Each recommended item (a tourist attraction) has a predicted value, or utility, associated with it. Items are combined into candidate solutions, which in turn each have an aggregate score determined. We discussed approaches to doing that in the earlier articles "Rating the tours" and "The balancing act". The aggregate score is what we are ultimately seeking to maximize. Each item also has an associated cost - not only the static cost of visit time, but also the dynamic cost of traveling to the location and possibly waiting until the attraction opens. Finally, there are constraints - the main one being the total time available for the trip, but items can also have opening and closing times which form time windows.

The most general description that applies in our case is that we are attempting to solve a constrained optimization problem. We look at combinations that satisfy the given constraints, while attempting to minimize or maximize some cost function - in our case, the aggregate score of the solution. However, this is probably too general to be useful - instead, we should look at more specialized classes of problems, which also have more specialized, and therefore better suited, algorithms available.

The Traveling Salesman Problem (TSP) is a well-known one, although in its classical form of shortest-path optimization it does not quite apply to tourist recommender systems - rather, we have something more closely related to the TSP with profits, also called the orienteering problem: we are looking for the shortest path (so we can make the best use of our time) with the highest profit. Adding the constraint of time windows, we get an even more specific class of problems, namely the orienteering problem with time windows (OPTW).

Due to the presence of balance requirements that conflict with maximizing the individual item score (again, see here), the most naturally fitting model is the bi-objective orienteering problem (metric 1 is item score, metric 2 is category balance) - however, Sightsplanner currently avoids the multi-criteria optimization approach in order to present a single solution to the user.

The problem classification may appear to be a strictly academic exercise, but it is quite relevant, because each class of problems also has a range of solution methods being developed and refined - we can pick one of the established methods as a starting point, instead of re-inventing the wheel. However, in this article, we will do just that - re-invent the search algorithm in order to illustrate the logic behind its design.

Simplistic approaches

It should be common knowledge, but is nevertheless worth mentioning, that the most fundamental approach to solving optimization problems is trying all the possible combinations and comparing the results to see which one is best. Another bit of common knowledge is that this exhaustive search approach is usually too time-consuming to be practical. All practical approaches stem from examining only a part of this solution space - we try some combinations and ignore the vast majority of others, the tricky part being how to decide which combinations are worth looking at.

A simple search method, which is very fast and may give reasonable results, goes as follows. Start by examining all the items we can use in constructing our recommendation and pick the one that has the highest rating (and fits into the solution at the current step). Add this item to the solution, then examine the remaining items, seeking the one that has the next highest rating. Keep repeating this procedure until our solution is filled - this is called pure greedy search, the greedy part being that our heuristic for guiding the search is to always pick the item with the highest individual ranking, hoping that this will result in a good overall score. This search only requires one iteration, since repeated applications will produce identical results - hence the speed.
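As a sketch, pure greedy construction could look like the following Python fragment; the rating and fits callbacks and the budget parameter are illustrative stand-ins, not Sightsplanner's actual interfaces.

```python
# A minimal sketch of pure greedy search. The rating and fits callbacks
# and the budget parameter are hypothetical, for illustration only.
def greedy_search(items, rating, fits, budget):
    """Repeatedly add the highest-rated item that still fits the solution."""
    solution, remaining = [], list(items)
    while True:
        feasible = [it for it in remaining if fits(solution, it, budget)]
        if not feasible:
            break  # nothing fits any more - the solution is filled
        best = max(feasible, key=rating)
        solution.append(best)
        remaining.remove(best)
    return solution
```

Note that the feasibility check is re-run at every step, since what "fits" depends on what has already been added.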

The greedy search on steroids

With some modifications, the humble greedy search can be turned into a fairly sophisticated, yet simple to understand and easily tunable algorithm. First let's turn our attention to the most glaring weakness - only being able to examine one solution candidate. Each recommended item incurs a direct cost (in the case of trip planning, the time spent visiting the attraction), which is easy enough to handle even in pure greedy search - just adjust the item ranking by its cost, which gives an expected cost-benefit ratio. However, each item also incurs an indirect cost - by visiting one attraction to the south, we may cause the cost of items to the north to increase because of the increased travel time. So it is certainly possible that instead of going south to visit the lone highest-ranking item there, we might do better to head north and visit multiple items that also rank highly, just not the highest. There is really no way around this problem other than iterative search, meaning that we need to examine many alternative solutions.

The first modification we add is randomness. Instead of picking the highest-ranking item at each step, we pick a random item, preferably one near the top of the ranking list. Now we can repeat our cycle multiple times and get a different sequence of items each time. Note that we lose nothing compared to the original algorithm - we may run our first iteration picking the top-ranking item at each step and introduce randomization only in subsequent iterations, so we are guaranteed to get the same solution as the pure greedy search - plus a number of alternative solutions.

If we are willing to spend more time at each iteration, we can refine our search further. Until now it has been implied that we append items to the solution step-by-step, meaning that we start from location s, then add item A, then B and so on. This is computationally fastest, but not necessarily the best - as long as the transposition s, A, B → s, B, A is possible, it might be worth checking if it reduces the cost, as we may expect the value to be the same*. So, at each step of the iteration, we instead examine every possible slot in the solution we can insert the item into, and go with the one that incurs the lowest overall cost. By doing so we have introduced local search in our algorithm. As this includes the solutions that would have been produced by adding items sequentially, we are again not missing anything, compared to the previous version of the algorithm.
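To illustrate, the insertion step could be sketched as follows; the travel function (cost of moving between two stops) is an assumed ingredient, and the cases where the value changes with the order are ignored here.

```python
# A sketch of the best-insertion step. The travel(a, b) cost function is
# a hypothetical stand-in; in trip planning it would be travel time.
def best_insertion(route, item, travel):
    """Try every slot in the route; return (added_cost, position) for the
    cheapest way to insert `item`."""
    best = None
    for pos in range(len(route) + 1):
        prev = route[pos - 1] if pos > 0 else None
        nxt = route[pos] if pos < len(route) else None
        added = 0.0
        if prev is not None:
            added += travel(prev, item)
        if nxt is not None:
            added += travel(item, nxt)
            if prev is not None:
                added -= travel(prev, nxt)  # the leg we no longer take
        if best is None or added < best[0]:
            best = (added, pos)
    return best
```

For example, inserting a stop halfway between two existing stops adds nothing to the cost, while appending it at either end would add a full detour.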

Before introducing the final modification, let's return briefly to the discussion in "The Balancing Act". In short, there are multiple categories of items and, for the sake of diversity, we would like many categories to be represented in the final solution. At the same time, categories are not equal and some may consistently have higher-ranking items than others. To address that, we described a formula for calculating the score of a recommendation - the formula rewards including items from all the categories present in the user's profile, regardless of their individual ranking. Until this point, our search algorithm has been ignorant of this policy.

Observing that the category-based scoring model makes the utility of a candidate item dependent on the items already present in the solution, we are now ready to correct the search algorithm accordingly. Originally we ordered items by individual ranking, picking the top item. Then we modified this to pick the top item and random items near the top, but the ordering was still done by individual ranking. However, if at each step we calculate the utility for each item that is dependent on the solution under construction, we may use this value to order the candidates instead**. Here's the pseudo-code that includes all the improvements discussed so far:
BestSolution := empty
START ITERATION i
    initialize the Solution
    DO
        compute utility value and best place in Solution for each item
        IF i = 1 THEN
            NextItem := item with highest utility value
        ELSE
            NextItem := random item with a high utility value
        END IF
        insert NextItem in Solution at the best place
    WHILE Solution does not exceed total time available
    IF Solution > BestSolution THEN
        BestSolution := Solution
    END IF
END ITERATION
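The pseudo-code translates fairly directly into a runnable sketch. In the fragment below, items are (name, rating, cost) triples, the only constraint is a total time budget, and the best-insertion step and travel costs are omitted for brevity - none of this is Sightsplanner's actual code, just an illustration of the overall loop.

```python
import random

def grasp(items, budget, utility, iterations=30, alpha=0.3, seed=0):
    """Greedy first iteration, randomized greedy afterwards; keep the best."""
    rng = random.Random(seed)
    best_solution, best_score = [], float("-inf")
    for i in range(iterations):
        solution, remaining, used = [], list(items), 0.0
        while True:
            feasible = [it for it in remaining if used + it[2] <= budget]
            if not feasible:
                break
            # adaptive step: utilities depend on the solution built so far
            feasible.sort(key=lambda it: utility(solution, it), reverse=True)
            if i == 0:
                chosen = feasible[0]  # iteration 1: pure greedy
            else:
                top = utility(solution, feasible[0])
                low = utility(solution, feasible[-1])
                cut = top - alpha * (top - low)  # value-based RCL threshold
                rcl = [it for it in feasible if utility(solution, it) >= cut]
                chosen = rng.choice(rcl)
            solution.append(chosen)
            remaining.remove(chosen)
            used += chosen[2]
        score = sum(it[1] for it in solution)  # stand-in aggregate score
        if score > best_score:
            best_solution, best_score = solution, score
    return best_solution
```

The stand-in aggregate score here is just the sum of ratings; Sightsplanner's scoring model, as discussed above, also rewards category balance.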

* - glossing over the details here - in some cases the value won't be the same; for example, pool, restaurant is preferable to restaurant, pool.
** - this is not the only motivation for re-ranking the items between search steps, but it's the most pertinent to Sightsplanner.

GRASP

The algorithm we have constructed is an instance of the Greedy Randomized Adaptive Search Procedure (the full article from the Handbook of Metaheuristics is available here). The other parts of the name should be self-explanatory at this point - the word adaptive refers to re-calculating the utility value (called the heuristic value in GRASP terminology) at each step of the search.

Classical GRASP is somewhat different in that the local search is performed once per iteration, after the solution has been constructed - in Sightsplanner it is integrated into the heuristic value calculation. This approach was pioneered by another recommender (journal article) and seems to work well in practice.

So far we haven't really explained how the items are picked randomly while still favoring the high-ranking ones. If we take the queue of items sorted by their utility value and draw a line at either the top 30% of items, or at the items that have at least 70% of the utility value of the top item in the queue, we are saying that α=0.3. The next item is then chosen uniformly at random from the items above that line. As noted in the Handbook of Metaheuristics, α=0 gives us pure greedy search, while α=1 corresponds to random construction. This is a powerful feature of GRASP, allowing us to control the balance between greediness and randomness, even automatically. The list of items that the random selection is made from is called the restricted candidate list (RCL).
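Both RCL rules can be sketched in a few lines, assuming `ranked` is a best-first list of (item, utility) pairs. Note that the Handbook's value rule measures the threshold from the lowest utility in the queue, so "70% of the top value" corresponds to the case where the lowest utility is zero.

```python
import random

# Illustrative sketch of restricted candidate list construction;
# the function names and signatures here are not from any library.
def build_rcl(ranked, alpha=0.3, by_value=True):
    """ranked: (item, utility) pairs sorted best-first."""
    if by_value:
        top, low = ranked[0][1], ranked[-1][1]
        cut = top - alpha * (top - low)  # alpha=0: only the best; alpha=1: all
        return [item for item, u in ranked if u >= cut]
    # cardinality-based rule: keep the top alpha-fraction of the queue
    n = max(1, int(alpha * len(ranked)))
    return [item for item, _ in ranked[:n]]

def pick_next(ranked, alpha=0.3, rng=random):
    """Uniform random draw from the restricted candidate list."""
    return rng.choice(build_rcl(ranked, alpha))
```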

Sightsplanner's search at work

The following picture shows the first step of a full iteration in Sightsplanner's GRASP search. For illustrative purposes, we've taken up the task of recommending colored balls - the colors may be thought of as a metaphor for different categories of items. We begin by computing the heuristic values; these are then used to order the items and produce the restricted candidate list. Not all items necessarily make it to the candidate list at each step - a real-world example would be an attraction that is initially too far from the starting location for a visit to be viable, but becomes viable once our trip takes us closer. Here the blue ball rated at 0.8 is unusable initially.



When the second step begins, the heuristic values of the red balls are shifted down (while their individual item ratings remain intact) - this is because we'd like our recommended set to be as diverse as possible. The heuristic value calculation is capable of predicting that including too many red balls would be counter-productive.
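As a toy version of such a shift, a heuristic value could discount an item's rating by the number of same-category items already in the solution; the halving factor below is an arbitrary illustration, not Sightsplanner's actual formula.

```python
# Toy diversity-aware heuristic value: hypothetical, for illustration only.
def heuristic_value(solution, item):
    """item and solution entries are (rating, category) pairs."""
    rating, category = item
    repeats = sum(1 for _, cat in solution if cat == category)
    return rating * 0.5 ** repeats  # each repeat of a category halves the value
```

With a red ball already in the solution, a second red ball rated 1.0 would score only 0.5, so a blue ball rated 0.8 now outranks it.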



The third step is similar, except that green balls are now the only ones not yet represented in the solution, so everything else gets penalized. The size of the RCL has shrunk - as in the actual Sightsplanner program, its size is limited by value, and only two balls are above the alpha threshold. Here we also see that the insertion point of a new item may be between items already present in the solution.



If our initial goal was to find three balls, then our iteration is complete. We may compare this solution to the earlier ones and re-initialize the state for the next iteration. The current solution can either be cleared completely, or some items may be left in place to seed the next solution - this makes the iterations faster, but more prone to getting stuck in local maxima. For instance, if the red ball rated 0.9 is left in the solution, a promising alternative containing the red ball rated 1.0 is unlikely to be found, during the next iteration at least.

Summary

We've looked at a meta-heuristic search method called GRASP. The algorithm can be derived by taking a primitive greedy search and adding features until enough domain-specific knowledge is included to produce recommendations under the constraints of time windows, total available time, and a scoring model that emphasizes category balance. The algorithm is not yet widely used in trip planning recommender systems - we are only aware of Citytrip Planner, where the approach was originally tried.