Friday, December 18, 2020

Recent pitfalls at work

Spent unproductive hours debugging issues that should have taken less time. Why?
  1. I implemented something without experimenting enough, based on dodgy documentation (in fact, no good documentation was available).
  2. An oversight in my analysis / design. I got away with a small fix, but sometimes this can be costly.
  3. Didn't check if my code ran on all environments it was targeted at. For example, I introduced a dependency on OpenSSL without verifying if it can be fulfilled everywhere.

#1 is easy to address: make it a habit to run small experiments while building software. I do blame the dodgy documentation too - I eventually ran the experiments, and to be honest they took some effort.

#3 should also become a habit. Explicitly listing assumptions and environment requirements as I make them would be a good start.

#2 was disappointing because I pride myself on design thinking. It was a small detail that I missed through oversight. The way to catch this early would be to deliberately list all preconditions.


Monday, September 14, 2020

Enable autocomplete for C++ on vim

Assuming you have a recent version of vim (8.1 should do), this is a lot easier now.

  1. If you don't already have Vundle:
    git clone https://github.com/VundleVim/Vundle.vim.git ~/.vim/bundle/Vundle.vim
  2. Add the following section to ~/.vimrc. The vundle section may already be there, in which case you just add the Plugin line for Valloric/YouCompleteMe.
    set rtp+=~/.vim/bundle/Vundle.vim
    call vundle#begin()
    Plugin 'VundleVim/Vundle.vim'
    Plugin 'Valloric/YouCompleteMe'
    call vundle#end()
  3. Clone YouCompleteMe.
    git clone https://github.com/Valloric/YouCompleteMe.git ~/.vim/bundle/YouCompleteMe
  4. Update YouCompleteMe.
    cd ~/.vim/bundle/YouCompleteMe && git submodule update --init --recursive
  5. Install the clang completer. This works, but a better way may be to configure clangd instead.
    ./install.py --clang-completer
  6. Add a file .ycm_extra_conf.py to some ancestor directory of your C++ projects.
    def Settings( **kwargs ):
      return {
        'flags': [ '-x', 'c++', '-Wall', '-Wextra', '-Werror' ],
      }

    There are other ways of doing the above that can give you more flexibility in your own sandbox. You can check them here: https://github.com/ycm-core/YouCompleteMe#general-semantic-completion



Sunday, August 23, 2020

Algorithms: Two interesting problem solving techniques

I recently learned two interesting problem solving techniques that could be called non-obvious and ingenious. But I think there is a pattern to them worth recognizing, and hence this post. These problems come from one or another of the several good online platforms for practicing algorithmic problems. The code is mine.

Using BFS to solve a dynamic programming problem

(This section is identical to my quora answer: https://www.quora.com/Is-the-breadth-first-search-an-example-of-dynamic-programming)

Let's look at the first problem: You have N oranges. On any given day, you decide how many to eat. You can always eat just 1. But if the number of oranges you have is even, you can eat N/2 of them, and if it is divisible by 3, you can instead eat 2N/3. What is the minimum number of days in which all the oranges can be eaten?

This problem has the structure of a typical dynamic programming problem, and there is a fairly routine bottom-up solution to it. Here goes the dynamic programming solution first.

    int minDays(int n) {
        // minDays[i] == min days to eat i oranges
        std::vector<int> minDays(n+1);
        minDays[0] = 0;

        for (int i = 1; i <= n; ++i) {
            int minDay = 1 + minDays[i-1];
            if (i % 3 == 0) {
                minDay = std::min(minDay, 1 + minDays[i/3]);
            }
            if (i % 2 == 0) {
                minDay = std::min(minDay, 1 + minDays[i/2]);
            }
            minDays[i] = minDay;
        }
        return minDays[n];
    }

This is an O(n) algorithm. But it turns out that we can do much better. Consider N=12. On the first day, you could eat 8 oranges, or 6 oranges, or just 1, so there are three different paths you could take. For each of those there would be one, two, or three choices to make on the second day. This is the classic dynamic programming structure, but it is also a graph. By traversing this graph breadth-first, it is possible to evaluate these paths and find the first one to reach zero - the fastest path to zero oranges. The following is a BFS solution.

    int minDays(int n) {
        if (n <= 0) {
            return 0;
        }

        std::queue<int> bfsQueue;
        std::set<int> seen;
        bfsQueue.push(n);
        int days = 0;

        while (!bfsQueue.empty()) {
            // each BFS level corresponds to one day
            int size = bfsQueue.size();

            for (int i = 0; i < size; ++i) {
                int entry = bfsQueue.front();
                if (entry == 0) {
                    return days;
                }
                bfsQueue.pop();

                auto it = seen.insert(entry);
                if (!it.second) {
                    continue;  // already explored this count
                }

                // push the child entries
                if (entry % 3 == 0) {
                    bfsQueue.push(entry/3);
                }
                if (entry % 2 == 0) {
                    bfsQueue.push(entry/2);
                }
                bfsQueue.push(entry-1);
            }
            days += 1;
        }
        return days;
    }

I don't have a precise complexity bound for this. Had the child nodes all been distinct, the complexity would possibly have been O((3k)^d) for some constant k less than 1, where d is the minimum number of days required - and that by itself grows much faster than O(n). But the fact that many of the child nodes are actually the same - the overlapping sub-problems property of dynamic programming - possibly makes this O(d*log(n)) or something like that. And I could be way off the mark here; I am just speaking from observation and some very rudimentary reasoning.

Using binary search in an optimization problem

Here goes the second problem: You are given a list of positions on a straight line where you can place magnets. The attraction between two magnets is inversely proportional to the distance between them. You are given m magnets and want to arrange them so as to minimize the maximum possible attraction between any two. That is the same as saying you want to maximize the minimum distance between two successive magnets.

The given positions are a constraint. Discounting that for a minute, if you had four magnets that you could place anywhere between positions 1 and 10, how would you do it?

First magnet at 1, last magnet at 10 - that's a given. The second magnet could go at 4 and the third at 7. That would make the minimum distance between any two magnets 3. There is no arrangement of four magnets at positions 1-10 in which the smallest distance between two (necessarily successive) magnets is 4.

Now how on earth does one solve this? I thought of dynamic programming initially but couldn't frame it as one - maybe it is possible to solve it that way. But notice: if the range of positions is 1 through 10 and there are 4 magnets, an equitable distribution would space successive magnets at most (10-1)/(4-1) = 3 apart. Therefore the minimum distance between two magnets can never be more than 3. And of course, the least distance, when two magnets are right next to each other, is 1. So what we are really trying to find out is whether it is possible to place the magnets with a minimum distance of x between them, for 1 <= x <= 3.

Of course, instead of 1 <= x <= 3, the range could be arbitrarily large, say 1 <= x <= 1000000, and then you need to search that state space with something better than a linear scan from 1 through 1000000 (or the other way). Now, if it is possible to place the magnets with a minimum distance of x, then it goes without saying that it is possible for every value in [1, x]; what remains is to check whether it is also possible for some x' in (x, 1000000]. The feasibility check is monotonic, so with minor modifications to binary search we can find the largest x satisfying our constraint. Here goes the code.


    bool canFitWithMinGap(const std::vector<int>& position, int num_elems, int gap) {
        // Greedy check: fix the first magnet at position.front(), then place
        // each subsequent magnet at the first position at least `gap` away,
        // keeping room for the last magnet at position.back().
        auto begin = position.begin();
        auto start = *begin;
        int last = position.back();
        for (int i = 0; i < num_elems - 2; ++i) {
            begin = std::lower_bound(begin, position.end(), start + gap);
            if (begin == position.end() || (last - *begin) < gap) {
                return false;
            }
            start = *begin;
        }
        return true;
    }

    int maxDistance(std::vector<int>& position, int m) {
        if (position.empty() || m <= 1 || static_cast<int>(position.size()) < m) {
            return 0;
        }
        std::sort(position.begin(), position.end());
        int first = position.front(), last = position.back();
        if (m == 2) {
            return last - first;
        }

        int max_gap = (last - first)/(m-1);  // the equitable-distribution bound
        int min_gap = 1;
        int max_min_gap = -1;

        // Binary search over the answer: largest feasible minimum gap.
        while (min_gap <= max_gap) {
            auto cur_gap = min_gap + (max_gap - min_gap)/2;
            if (!canFitWithMinGap(position, m, cur_gap)) {
                max_gap = cur_gap - 1;
            } else {
                max_min_gap = cur_gap;
                min_gap = cur_gap + 1;
            }
        }
        return max_min_gap;
    }

The idea is simple but it takes a bit of thinking to see this as a viable approach.

