Monday, September 14, 2020

Enable autocomplete for C++ on vim

 Assuming you have a recent version of vim (8.1 should do), this is a lot easier now.

  1. If you don't already have Vundle:
    git clone https://github.com/VundleVim/Vundle.vim.git ~/.vim/bundle/Vundle.vim
  2. Add the following section to ~/.vimrc. The Vundle section may already be there, in which case you just add the Plugin line for Valloric/YouCompleteMe.
    set rtp+=~/.vim/bundle/Vundle.vim
    call vundle#begin()
    Plugin 'VundleVim/Vundle.vim'
    Plugin 'Valloric/YouCompleteMe'
    call vundle#end()
  3. Clone YouCompleteMe.
    git clone https://github.com/Valloric/YouCompleteMe.git ~/.vim/bundle/YouCompleteMe
  4. Update YouCompleteMe.
    cd ~/.vim/bundle/YouCompleteMe && git submodule update --init --recursive
  5. Install the clang completer. This works but a better way may be to configure clangd.
    ./install.py --clang-completer
  6. Add a file .ycm_extra_conf.py to some ancestor directory of your C++ projects.
    def Settings( **kwargs ):
      return {
        'flags': [ '-x', 'c++', '-Wall', '-Wextra', '-Werror' ],
      }

    There are other ways of doing the above that can give you more flexibility in your own sandbox. You can check them here: https://github.com/ycm-core/YouCompleteMe#general-semantic-completion
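A quick way to create the file from step 6 is a shell heredoc - a minimal sketch, where ~/src is just a hypothetical common ancestor directory of your C++ projects:

mkdir -p ~/src    # hypothetical common ancestor of your C++ projects
cat > ~/src/.ycm_extra_conf.py << 'EOF'
def Settings( **kwargs ):
  return {
    'flags': [ '-x', 'c++', '-Wall', '-Wextra', '-Werror' ],
  }
EOF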



Sunday, August 23, 2020

Algorithms: Two interesting problem solving techniques

I recently learned two interesting techniques for solving algorithmic problems that could be called non-obvious and ingenious. But I think there is a pattern to them that is worth recognizing, and hence this post. These problems come from one or another of the several good online platforms for practicing algorithmic problems. The code is mine.

Using BFS to solve a dynamic programming problem

(This section is identical to my quora answer: https://www.quora.com/Is-the-breadth-first-search-an-example-of-dynamic-programming)

Let's look at the first problem: You have N oranges. On any given day, you can decide how many to eat. You can always eat just 1. If the number of oranges you currently have is even, you can instead eat half of them. If it is divisible by 3, you can instead eat two-thirds of them. What is the minimum number of days in which all the oranges can be eaten?

This has the structure of a typical dynamic programming problem, and there is a fairly routine dynamic programming solution. Here it is first.

    int minDays(int n) {
        std::vector<int> minDays(n+1);
        // minDays[n] == min days for n oranges

        minDays[0] = 0;

        for (int i = 1; i <= n; ++i) {
            int minDay = 1 + minDays[i-1];
            if (i % 3 == 0) {
                minDay = std::min(minDay, 1 + minDays[i/3]);
            }
            if (i % 2 == 0) {
                minDay = std::min(minDay, 1 + minDays[i/2]);
            }
            minDays[i] = minDay;
        }
        return minDays[n];
    }

This is an O(n) algorithm. But it turns out that we can do much better. Consider N=12. On the first day, you could eat 8 oranges, or 6 oranges, or just 1. So there are three different paths you could take. For each of those, there would be one, two, or three choices to make on the second day. This is the classic dynamic programming structure, but it is also a graph. By traversing this graph breadth-first, it is possible to evaluate these paths and find the first one to reach zero - the fastest route to zero oranges. The following is a BFS solution.

    int minDays(int n) {
        if (n <= 0) {
            return 0;
        }

        std::queue<int> bfsQueue;
        std::set<int> seen;
        bfsQueue.push(n);
        int days = 0;

        while (!bfsQueue.empty()) {
            int size = bfsQueue.size();

            for (int i = 0; i < size; ++i) {
                int entry = bfsQueue.front();
                if (entry == 0) {
                    return days;
                }

                bfsQueue.pop();

                auto it = seen.insert(entry);
                if (!it.second) {
                    continue;
                }

                // push the child entries
                if (entry % 3 == 0) {
                    bfsQueue.push(entry/3);
                }
                if (entry % 2 == 0) {
                    bfsQueue.push(entry/2);
                }
                bfsQueue.push(entry-1);
            }
            days += 1;
        }
        return days;
    }

I don't have a precise complexity bound for this. Had the child nodes all been distinct, the complexity would possibly have been O((3k)^d) for some constant k less than 1, where d is the minimum number of days required - and that by itself grows much faster than O(n). But the fact that many of the child nodes are actually the same - the overlapping sub-problems property of dynamic programming - together with the seen set that prunes the repeats, possibly makes this O(d*log(n)) or something like that. I could be way off the mark here - I am just speaking from observation and some very rudimentary reasoning.

Using binary search in an optimization problem

Here goes the second problem: You are given a list of positions on a straight line where you can place magnets. The attraction between two magnets is inversely proportional to the distance between them. You are given m magnets and want to arrange them so as to minimize the maximum possible attraction between any two of them. That is the same as saying that you want to maximize the minimum distance between two successive magnets.

The given positions are a constraint. Discounting that for a minute, if you had four magnets that you could place anywhere between positions 1 and 10, how would you do it?

 First magnet at 1, last magnet at 10 - that's a given. The second magnet could be at 4, the third at 7. That makes the minimum distance between any two magnets 3. There is no arrangement of four magnets at positions 1-10 in which the smallest distance between two (necessarily successive) magnets is 4.

Now how on earth does one solve this? I thought of dynamic programming initially but couldn't frame it as one - maybe it is possible to solve it that way. But notice that if the range of positions is 1 through 10 and there are 4 magnets, an equitable distribution would require a distance of at most (10-1)/(4-1) = 3 between two successive magnets. Therefore the minimum distance between two magnets can never be more than 3. And of course, the least it can be is 1, when two magnets are right next to each other. So what we are really asking is: for which x, 1 <= x <= 3, is it possible to place the magnets with a minimum distance of x between them? Of course, instead of 1 <= x <= 3 the range could be arbitrarily large, say 1 <= x <= 1000000, and then you have to search that state space with something better than a linear scan from 1 through 1000000 (or the other way). Now if it is possible to place the magnets with a minimum distance of x between them, then it goes without saying that it is possible to do so for every value in [1, x]. So we are really interested in whether it is also possible for some x' in (x, 1000000]. By making minor modifications to the binary search process, we can find the largest x satisfying our constraint. Here goes the code.

 

    // Binary search on the answer: find the largest minimum gap with which
    // all m magnets can still be placed.
    int maxDistance(vector<int>& position, int m) {
        if (position.empty() || m <= 1 || (int)position.size() < m) {
            return 0;
        }
        std::sort(position.begin(), position.end());
        int first = position.front(), last = position.back();
        if (m == 2) {
            return last - first;
        }

        int max_gap = (last - first)/(m-1);
        int min_gap = 1;
        int max_min_gap = -1;

        while (min_gap <= max_gap) {
            auto cur_gap = min_gap + (max_gap - min_gap)/2;
            if (!canFitWithMinGap(position, m, cur_gap)) {
                max_gap = cur_gap - 1;
            } else {
                max_min_gap = cur_gap;
                min_gap = cur_gap + 1;
            }
        }
        return max_min_gap;
    }

    // Greedy feasibility check: place the first magnet at the first position,
    // each subsequent one at the first position at least 'gap' away, and make
    // sure the last magnet (at the final position) is also at least 'gap' away.
    bool canFitWithMinGap(const vector<int>& position, int num_elems, int gap) {
        auto begin = position.begin();
        auto start = *begin;
        int last = position.back();
        for (int i = 0; i < num_elems - 2; ++i) {
            begin = std::lower_bound(begin, position.end(), start + gap);
            if (begin == position.end() || (last - *begin) < gap) {
                return false;
            }
            start = *begin;
        }
        return true;
    }

The idea is simple but it takes a bit of thinking to see this as a viable approach.



Wednesday, April 03, 2019

An Inexact Introduction to what Envoy is / does

Backstory

Of late I have spent a bit of time on Envoy. They say it's the next big thing for cloud services. It changes how microservices are written and deployed. With that kind of interest and developer traction, you'd imagine that they'd have a fantastic set of tutorials in their docs to get any interested engineer started. Well, they do. But like everything else in this day and age, to read those fantastic docs you already have to know a fair bit about proxies and API gateways and other things that I feel not every Envoy newbie needs to know. What is a dummy to do? For one, persevere, and for two, help demystify it (aka help cut the crap). So this one is essentially about what I have managed to learn about Envoy so far. Precious little, but I'll try summarizing nonetheless.

When you build a software system these days, especially a client-server kind of app, you split it into a few (or many) small components that interact with each other to perform cohesive functions - microservices. If this system has to serve requests, as most systems do, then you may need to be able to operate without loss of functionality and responsiveness as the number of requests grows - a quality known as scalability. Building cloud services in terms of microservices is the norm today. Microservices confer a great deal of flexibility in how we address availability, responsiveness, and scalability of our cloud services. But leveraging all of it isn't always easy if you're a microservice author. Simply put, there are a lot of cross-cutting concerns that every microservice author has to think about - from transport layer security and authentication, to discovering peer services, to load balancing and rate limiting - concerns that are not central to the business logic of the microservice. Addressing them is hard enough. Once you consider that different microservices in the same cloud app could be written in different languages or frameworks, the problem becomes harder still.

First look at Envoy

In a nutshell, Envoy, developed at Lyft and written in modern C++, allows you to build microservices without bothering too much about how to route requests to other microservices, how to handle SSL connections, authenticate users, do load balancing, rate limiting, circuit breaking, and lots more (patience, I'll explain all the terms). In other words, it helps address all of the cross-cutting concerns mentioned earlier in a polyglot microservices environment. And it does so in an extensible way allowing anything from hard-coded static configuration to completely dynamic configuration for everything from the endpoints used to serve requests, the clusters serving them, to the load-balancing policies.

Now true to the promise above, I owe a short explanation of some terms I casually threw at you.

Load balancing: You have lots of requests coming in and you want to serve them all responsively and reliably. What you do is create multiple replicas of your service and then route requests to them spreading the load across the replicas. The exact strategy can vary, and usually depends on whether your services are stateful or stateless.

Circuit Breaking: Service A talks to service B in order to serve requests. If B is heavily loaded and unresponsive, or simply unavailable, A owes it to the user to degrade gracefully instead of hanging. A also owes it to a perhaps already-loaded B not to bombard it with even more requests in such a situation. Detecting such a situation, preventing requests from A to B for a short time, and then resuming them is what circuit breaking is about.

Rate limiting: You don't want overeager clients to swamp your service with more requests than you can reliably handle. Rate limiting guards against this using various strategies and algorithms: you reject requests beyond a threshold number per second, or throttle requests by introducing small random delays while routing them. You can put such checks at the client, at the server, or both.

Envoy in wee bit more detail

Ok, back to Envoy. So what does Envoy deal in? As in, what are the abstractions or domain objects in terms of which Envoy operates? I ask this because, without being able to satisfactorily answer this question about a given software system, I have noticed I never do a good job of making sense of the system itself. So here are the key abstractions.

1. Some concept of an endpoint - IP address + port + protocol - that downstream clients connect to and send requests on. Envoy calls them listeners.
2. Some concept of an upstream cluster of hosts running specific services. Envoy calls these clusters.
3. Each cluster has one or more hosts, and these hosts could be discovered via DNS lookups, or through API endpoints.
4. A concept of routing requests from endpoints to clusters. Envoy calls these routes and route_configs.
5. Some concept of a request URL that a client hits, consisting of a virtual host or domain name, an API path prefix, etc. This is used to determine which routing rules are invoked.
6. Some concept of pluggable middleware for intercepting requests and processing them. Envoy calls them filters and filter_chains. You can do all sorts of things in these filters, such as authentication, rate limiting, etc.
7. Policies around load balancing, rate limiting, circuit breaking, etc.

Envoy deployment

So how does Envoy run alongside your own services? Several ways are possible. Most commonly it is deployed to run in both of the following roles:

1. As an API gateway that handles and routes all incoming requests to different microservices. This is called the edge proxy because it sits on the edge of your app boundary. It's a gateway into your app, so to speak.
2. As a peer process of each service, intercepting, qualifying, checking, and routing all its incoming and outgoing data. This is called a service proxy. Imagine that you, the microservice writer, don't need to bother about TLS-secured connections, authentication, discovering service endpoints to talk to, etc. You just identify which other services to talk to and send requests and responses to your peer Envoy on some port on the local machine (technically, in the same network namespace). It takes care of routing those requests.

Now in the overwhelming majority of cases, Envoy would run as a Docker container. As a service proxy, it would likely run as a sidecar container alongside the service container. But it can also run as a standalone binary.
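For concreteness, here is a minimal sketch of running Envoy as an edge proxy container. The image tag, the port, and the envoy.yaml file are illustrative assumptions; envoy.yaml would contain the listeners, routes, and clusters described above, and /etc/envoy/envoy.yaml is where the official image looks for its config by default:

$ # Illustrative sketch: run Envoy with a static config mounted into the container.
$ docker run -d --name edge-envoy -p 10000:10000 \
    -v $(pwd)/envoy.yaml:/etc/envoy/envoy.yaml \
    envoyproxy/envoy:v1.18-latest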

Summary

Thus, Envoy serves as an edge and service proxy. It handles routing of incoming requests and service-to-service requests, and takes care of lots of common concerns. It allows you to write really simple microservices which practically need to do nothing more than get their own business logic right. Now the above is a deliberately dumbed-down version of the truth, because Envoy does a lot more. It can work at the TCP/UDP + SSL level (as an L3/L4 proxy), as well as at the HTTP level (L7). It can handle gRPC and HTTP/2. And there is much more to it. But at its core, it is a proxy for microservice-based apps that makes routing between services declarative and easy, and adds a whole host of useful services.



Saturday, May 19, 2018

Setting up a private insecure Docker registry

Setting up a private insecure Docker registry for your Kubernetes sandbox

Well, there's nothing specific to Kubernetes about this article. It just shows you how to quickly set up an insecure docker registry locally on one of your VMs. But if you do have a local Kubernetes setup on a set of VMs, as described in my last article, then setting up a local docker registry, and pushing to it all the images you intend to deploy to your k8s cluster, would save you precious bandwidth (and time).

Running the docker registry

Pick a node to run your docker registry. I usually pick the master node of my kubernetes cluster, somewhat arbitrarily, but it can really be any accessible node. The only thing you need to make sure of is that it has a fully-qualified domain name or a stable IP (statically assigned, or a DHCP address that your router keeps sticky based on MAC address).
The actual docker registry is best run as a container, using the registry:2 image. Bind mount a host directory to /var/lib/registry to store the images durably.
$ mkdir -p /var/lib/local-registry/registry
$ docker run -d -p5000:5000 --restart=always --name local-registry -v /var/lib/local-registry/registry:/var/lib/registry registry:2
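Once the container is up, a quick sanity check is to hit the registry's v2 API root - an empty JSON object is the expected response:
$ curl http://localhost:5000/v2/
{}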

Accessing the registry from your k8s nodes

The private registry you just configured needs to be accessible from your k8s nodes, so that they can all pull images from it. Because this registry is insecure and does not use TLS, you must tell the docker daemon of each individual node that needs access to treat it as an insecure registry, i.e. allow plain HTTP (or HTTPS without certificate verification) instead of requiring verified HTTPS.
Assuming that the host on which you're running the registry has a hostname of reghost (you could also use its IP address), the way to do that would be the following:
$ [ -w /etc/docker/daemon.json ] && \
  jq '."insecure-registries" = ((."insecure-registries" // []) + ["reghost:5000"])' /etc/docker/daemon.json >/tmp/daemon.json && \
  mv /tmp/daemon.json /etc/docker/daemon.json
The above assumes that you have the jq utility, which is a totally cool JSON tool that you should master (the // [] part makes it work even if the key isn't present yet). It would not have succeeded if the file didn't exist already. In that case:
$ [ -w /etc/docker -a -r /etc/docker -a ! -f /etc/docker/daemon.json ] && cat > /etc/docker/daemon.json << EOF
{
  "insecure-registries": ["reghost:5000"]
}
EOF
On older versions of Docker, the following seems to work on RedHat / Fedora / CentOS based systems:
$ [ -w /etc/sysconfig/docker ] && cat >> /etc/sysconfig/docker << EOF
INSECURE_REGISTRY='--insecure-registry reghost:5000 <<and-more>>'
EOF
Note that if you want to be able to refer to your registry without the port number, you would need to list that form separately in the insecure-registries key, in addition to the entry qualified with the port number. Following this, on every node where you added or edited /etc/docker/daemon.json, you need to restart the docker daemon:
$ systemctl daemon-reload
$ systemctl restart docker
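To confirm that the daemon actually picked up the setting, you can check docker info (output trimmed and illustrative; reghost:5000 as configured above):
$ docker info | grep -A 2 'Insecure Registries'
 Insecure Registries:
  reghost:5000
  127.0.0.0/8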

Pushing images

You should now be ready to push images to this registry. Assuming you created a local image called mywebserver, you could try this:
$ docker tag mywebserver reghost:5000/myuser/mywebserver:latest
$ docker push reghost:5000/myuser/mywebserver:latest
The above will push all the layers of your image to your registry. You should be able to verify the addition of new content on your registry under /var/lib/local-registry/registry (or whichever path you bind mounted in your registry container).
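You can also verify it over the registry's HTTP API - a minimal check, assuming the image pushed above (output shown for illustration):
$ curl http://reghost:5000/v2/_catalog
{"repositories":["myuser/mywebserver"]}
$ curl http://reghost:5000/v2/myuser/mywebserver/tags/list
{"name":"myuser/mywebserver","tags":["latest"]}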
Et voilà!


Saturday, February 17, 2018

Setting up vim autocomplete for golang

The key is knowing which keystrokes to use: Ctrl-X followed by Ctrl-O in insert mode brings up completion suggestions in a popup. If you want automatic popups as you type, YouCompleteMe is supposed to work. I couldn't get it to work, and it wouldn't work on some of my setups anyway because it requires a more recent version of vim (7.4.15xx) than I have.

So what was needed? Assuming your GOROOT and GOPATH are correctly set up (if not, see below), the following is all that you need to do:
mkdir -p ~/.vim/autoload
curl -LSso ~/.vim/autoload/pathogen.vim https://tpo.pe/pathogen.vim
echo "call pathogen#infect()" | cat - /etc/vimrc > ~/.vimrc.new
mv ~/.vimrc ~/.vimrc.old && mv ~/.vimrc.new ~/.vimrc
go get github.com/nsf/gocode
mkdir -p ~/.vim/bundle
git clone https://github.com/fatih/vim-go.git ~/.vim/bundle/vim-go
cd $GOPATH/src/github.com/nsf/gocode/vim
./update.sh
Then, open a vim session and type the following vim command:
:GoInstallBinaries
The above will install additional go command-line tools and you should be all set. Just one more thing. Open your /etc/vim/vimrc file (if you have root privileges) or your ~/.vimrc file (which you could copy from /etc/vim/vimrc) and add or uncomment the following lines:
filetype plugin on
if has("autocmd")
  filetype plugin indent on
endif

You can see what it should look like when you edit go code and press Ctrl-X followed by Ctrl-O in insert mode.


The screenshot gif was created on Ubuntu 16.04 using peek, along with gifski.

Setting up your Golang development environment

Installing the latest golang development environment (currently go 1.9) and setting it up involves the following steps:
curl -L https://storage.googleapis.com/golang/go1.9.linux-amd64.tar.gz > go1.9.linux-amd64.tar.gz
sudo tar xfz go1.9.linux-amd64.tar.gz -C /usr/local

GOROOT=/usr/local/go
GOPATH=~/devel/apporbit/go
PATH="$GOROOT/bin:$PATH:$GOPATH/bin"
export PATH GOROOT GOPATH
The key is to have GOPATH set correctly. Also set up glide for better vendoring / package dependencies.
go get github.com/Masterminds/glide
You create / check out all your projects under $GOPATH/src. If you clone a github project, it should live under $GOPATH/src/github.com/<user>/<repo>. Using go get to fetch go packages also clones the repos under these paths.
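For example, the gocode package fetched earlier ends up under the same layout:
go get github.com/nsf/gocode
ls $GOPATH/src/github.com/nsf/gocode    # the cloned sources land here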


Wednesday, November 29, 2017

Java alphabet soup


Trying a Java refresher, more specifically a Spring refresher, has so far been a source of mixed emotions. Having written a lot of Core Java in the past and used a smattering of Spring, I never doubted the utility of either. But having since developed tons of Ruby on Rails apps, and seen both their magic and their seamy sides, my perspective perhaps has more dimensions to it today.

I found the Java / Spring way of developing web applications a little too archaic by today's standards. All the annotations have still not exorcised much of the esoteric XML that you still cannot avoid. But Eclipse provides significant relief to the extent that you can get away without writing perhaps a single line of XML, using its Maven and Spring (Spring Tools Suite / Web Tools Platform) plugins instead to add most of the XML content.

You still need to manually configure Tomcat, manually configure data sources and JNDI for accessing your data sources in server specific ways. You still have to manually configure your web.xml. Those are hard, and you have to know at least what to search for in google, and then what to search for in the documentation that google lists. It is frustrating if you don't have a very good, precise tutorial or howto. It all shows the struggles of an old and evolving framework in trying to remain usable even as it remains solidly relevant. I suspect Spring Boot, something I am yet to explore, would bear signs of some true-north evolution. In the meantime, one hopes Java comes up with more modern web frameworks. Spring is mature, stable, mildly usable and very useful. But it feels a decade behind. Spring addressed usability problems of Java EE. But today, something else needs to do that for Spring. Spring Boot may be an evolutionary step in that direction, but more needs to be done.


Saturday, November 11, 2017

Setting up local repos for Ubuntu packages





Setting up repos of .deb packages for Ubuntu may not be something people often need to do on their own laptops. But I was recently doing something where it made sense. I was setting up a kubernetes cluster using VirtualBox on my Linux laptop and wanted to automate the whole process using vagrant and ansible. This meant that each time a VM was spun up, I would add apt repositories to it and then install the docker.io, kubeadm, kubectl, kubelet, and kubernetes-cni packages. All of these VMs were on my laptop, and each time they'd reach out to google's or docker's repos to pull these packages in. A sum total of around 70 MB isn't big, but I could be spinning up tens of VMs over the course of my experiments, and a fresh download off the web every time is a terribly inefficient use of bandwidth (certainly here in India). So I wanted to set up an apt repository locally on the host laptop, which runs Ubuntu 16.04 (xenial).

The plan

On the host: What we need to do is run a web server (Apache2 works fine) that serves a directory containing the packages we want to host. To make this secure, we need to generate a key pair using GnuPG and sign the packages we expose, and also publish the public key. The packages are .deb files (the analogues of .rpm on RedHat / CentOS, etc.), and in case you've already installed them on your host laptop, they might be available under the directory /var/cache/apt/archives. If not, you can download the packages without installing them (with apt download, as used below).

 
On the guests: On each guest, we'll need to edit the /etc/apt/sources.list file to include the repository from the host laptop. We'll also need to accept the public key from this repository.

The action

We shall follow the plan outlined above.

Actions on the host

Setting up the webserver and packages

  1. The following will install the Apache 2 webserver and other prerequisites.

    $ sudo apt install apache2 dpkg-dev dpkg-sig
     
  2. Apache's document root is /var/www/html by default. You should create the following directory tree under it: /var/www/html/pkgs/dists/$(lsb_release -s -c)/main/binary-amd64. To do that, run the command:

    $ mkdir -p /var/www/html/pkgs/dists/$(lsb_release -s -c)/main/binary-amd64
     
  3. Download the required packages in the directory you just created:
     
    $ cd /var/www/html/pkgs/dists/$(lsb_release -s -c)/main/binary-amd64
    $ apt download <package> [<package> ...]
    
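For the kubernetes sandbox that motivated all this, for instance, that last step would look something like the following - assuming the host already has the relevant docker and kubernetes apt repositories configured, since apt download fetches from whatever sources the host knows about:

    $ cd /var/www/html/pkgs/dists/$(lsb_release -s -c)/main/binary-amd64
    $ apt download docker.io kubeadm kubectl kubelet kubernetes-cni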

Setting up your key pair and signing packages

  1. If you haven't already, you should generate a key pair using GnuPG.
     
    $ gpg --gen-key
    

    In the prompts that follow, select the key type RSA (sign only), a key size of 4096 bits, and no expiry, and specify a unique name and a reasonable passphrase. Wait for the key to be generated.

  2. Once the key is created, run:
     
    $ gpg --list-keys
    

    Note the line starting with pub like this one:
     
    pub   4096R/B1B197AF 2017-11-11
    

    Note the number appearing after the size specifier (4096R/). Here it is B1B197AF. This would be your key identifier and would be used in the next step.

  3. Generate your public key file so that it is accessible through your web server:
    $ cd /var/www/html/pkgs
    $ sudo gpg --output keyFile --armor --export B1B197AF
     
  4. Finally, sign each of your packages thus:
     
    $ cd /var/www/html/pkgs/dists/$(lsb_release -s -c)/main/binary-amd64
    $ sudo dpkg-sig --sign builder <package-file>.deb
    

    Also, generate a Packages.gz file along with Release and InRelease files.
     
    $ cd /var/www/html/pkgs/dists/$(lsb_release -s -c)/main/binary-amd64
    $ dpkg-scanpackages . | gzip -9c > Packages.gz
    
    $ apt-ftparchive release . > Release
    $ gpg --clearsign -o InRelease Release
    $ gpg -abs -o Release.gpg Release
    
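At this point a quick sanity check, run on the host, is to make sure Apache actually serves what you just generated (paths as created above):

    $ curl -s http://localhost/pkgs/dists/$(lsb_release -s -c)/main/binary-amd64/Release | head -5
    $ curl -s http://localhost/pkgs/dists/$(lsb_release -s -c)/main/binary-amd64/Packages.gz | gzip -dc | grep -c '^Package:'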

Actions on the guest

  1. Run the following command to add the local repository:
     
    $ sudo su
    $ cat >> /etc/apt/sources.list << EOF
    > deb [ arch=amd64 ] http://laptop-host-ip/pkgs xenial main
    > EOF
    

    One point to understand here is that you would want your repository to be chosen preferentially over the others when a package is present in yours. So unless the packages of interest are absent from the other listed repositories, you should put this entry above the other repositories. The above command appends your repository at the end of the file; to move it up in the file, you may need to edit it manually.

  2. Add the laptop host repository's key:
     
    $ wget -qO - http://laptop-host-ip/pkgs/keyFile | sudo apt-key add -
     
  3. Now run the following:
     
    $ sudo apt update
    

    Once this step passes, you can run:
     
    $ sudo apt install <package-names>
    
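For the kubernetes sandbox described at the start, that last step would be something like this, assuming those packages were added to the repository earlier:

    $ sudo apt install docker.io kubeadm kubectl kubelet kubernetes-cni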
