Friday, July 29, 2022

Algorithmic problem solving: The importance of boundary values, edge cases, and a consistent model

I made the title sound almost like that of a computer science paper, although the confessions here are of a more mundane nature. Using a small set of problems that I sucked at solving, I want to capture gaps in my own problem solving approach and figure out patterns.

The right boundary conditions: Thief on a circular road

This is one of those famous problems. There is a circular road lined with houses, and each house has some cash in it. A thief is out thieving, but he cannot break into two adjacent houses on the same night, as then some automated system notifies the police or dials 111 or some such. (Why it takes two adjacent break-ins to raise an alarm is the question that perturbed me at a far deeper level all through the course of trying to solve this problem, but never mind.) The actual problem isn't all that hard to solve, but it has some interesting boundary conditions. If there are no houses, the maximum loot is 0. When there is exactly one house, the maximum loot is the money in that house. When there are two, the maximum loot is the larger of the two amounts, since on a circular road two houses are adjacent to each other both ways. The general algorithm only becomes applicable with three or more houses. Stated this way it sounds simple, but writing it cleanly is important. It is also worth spending a tiny bit of time trying to fold these special cases into the general case, but more often than not it doesn't work.
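
A minimal Python sketch of this, with the boundary conditions spelled out, might look like the following. The function names are mine, and it assumes the input is a list of non-negative cash amounts, one per house.

    def max_loot_line(cash):
        """Maximum loot on a straight (non-circular) row of houses."""
        prev, curr = 0, 0
        for amount in cash:
            # Either skip this house (curr) or rob it on top of prev.
            prev, curr = curr, max(curr, prev + amount)
        return curr

    def max_loot_circle(cash):
        # The boundary conditions discussed above.
        if not cash:                 # no houses: nothing to steal
            return 0
        if len(cash) == 1:           # one house: take what's there
            return cash[0]
        if len(cash) == 2:           # two mutually adjacent houses: pick the richer
            return max(cash)
        # Three or more: the first and last house are adjacent on the circle,
        # so at most one of them can be robbed. Solve two straight-line
        # subproblems, one excluding each end, and take the better result.
        return max(max_loot_line(cash[:-1]), max_loot_line(cash[1:]))

For example, max_loot_circle([2, 3, 2]) is 3: the first and last house are adjacent on the circle, so the thief does best by taking just the middle one.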

Defining structure: Finding zero-sum triplets in an array of integers

Given an array of integers, find all unique triplets that sum to zero. Obviously no such triplet exists in an array that doesn't have a mix of integers of opposite signs (or at least three zeros). The harder part is to get unique triplets out without any sort of post-processing.

First shot

Without the unique constraint, here was my flawed first shot. We sort the array and loop through it one element at a time. For each element, we look for two other elements that sum to its negative - and we shall have our triplet. We have thus reduced the problem of finding a triplet to the slightly easier problem of finding a pair of elements matching a constraint. We start by summing two elements, one from each end of the array. If their sum is less than the target, we move the left pointer up to the next higher element. If the sum exceeds the target, we move the right pointer down to the next lower element. Otherwise we have our triplet. This is good but inadequate - it does not eliminate duplicates.
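
A Python sketch of this first shot might look like the following (the names are mine; the one subtlety is making the two pointers skip over the fixed element itself). It finds every zero-sum triplet, but reports each one several times.

    def zero_sum_triplets_first_shot(nums):
        a = sorted(nums)
        n = len(a)
        found = []
        for i in range(n):
            target = -a[i]
            lo, hi = 0, n - 1
            while lo < hi:
                if lo == i:            # don't reuse the fixed element
                    lo += 1
                    continue
                if hi == i:
                    hi -= 1
                    continue
                s = a[lo] + a[hi]
                if s < target:
                    lo += 1            # need a bigger sum
                elif s > target:
                    hi -= 1            # need a smaller sum
                else:                  # found a zero-sum triplet
                    found.append((a[i], a[lo], a[hi]))
                    lo += 1
                    hi -= 1
        return found                   # full of duplicates

Running it on [-1, 0, 1] reports the same triplet three times (in different orders), once for each choice of the fixed element.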

Dedup, initial thoughts

One possible way would be to start with a sorted array of unique elements. But this means we modify the input or make a copy. Even that alone won't be enough, however. A little thinking leads to the observation that if a triplet comes from elements at positions i, j, and k in the sorted array, then we would hit this triplet three times - the first time when we picked a[i] and looked for a pair summing to -a[i], the second time when we picked a[j], and the third time when we picked a[k]. So if we pick a[0] and look for a pair with a matching sum of the opposite sign, then we should exclude a[0] from further searches, because any pairs involving it would already have been found. This means that having fixed a[i], we should always search for j and k such that i < j < k.

Dedup, the comprehensive way

The above would suffice in a sorted array of unique elements. But if we want to avoid that dedup chore and the extra copy it entails, we can still manage. Simply put, if an element is repeated, then we should skip all copies of it after the first. If a[1] equals a[0], we just skip the logic for a[1]. If a[j+1] equals a[j], we again skip the logic for a[j+1], and so on.
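
Pulling both ideas together - shrinking the search to indices beyond the fixed element, and skipping repeated values - a sketch of the full solution might look like this (again, the names are mine, and the input is sorted into a copy rather than deduplicated):

    def zero_sum_triplets(nums):
        a = sorted(nums)
        n = len(a)
        triplets = []
        for i in range(n - 2):
            if i > 0 and a[i] == a[i - 1]:
                continue                   # skip repeated fixed elements
            target = -a[i]
            lo, hi = i + 1, n - 1          # search only to the right of i
            while lo < hi:
                s = a[lo] + a[hi]
                if s < target:
                    lo += 1
                elif s > target:
                    hi -= 1
                else:
                    triplets.append((a[i], a[lo], a[hi]))
                    # skip repeated values at both ends of the pair
                    while lo < hi and a[lo] == a[lo + 1]:
                        lo += 1
                    while lo < hi and a[hi] == a[hi - 1]:
                        hi -= 1
                    lo += 1
                    hi -= 1
        return triplets

On [-1, 0, 1, 2, -1, -4] this yields exactly (-1, -1, 2) and (-1, 0, 1), with no post-processing.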

None of these ideas are difficult, but they need a bit of thinking about the structure of the array - for most people perhaps.

Establishing structures like these - shrinking the search space in each successive iteration, and skipping over duplicates - is the important part.

