Sergey Mikhanov  

Couple of good videos I watched lately — 2 (May 26, 2015)

Big Data is (at least) Three Different Problems. Michael Stonebraker, the most recent recipient of the ACM Turing Award, in his usual charismatic manner addresses the most common real (as opposed to imaginary) problems that arise when dealing with large amounts of data. Tons of valuable insights about modern databases.

Facebook’s iOS Architecture. Facebook people talk about implementing the Facebook app on iOS. You can clearly tell that the classical MVC approach is terribly outdated when you use it to build apps of the complexity you see in the 21st century. It’s not that mobile development is inherently difficult; it just gets entangled and messy very quickly as the apps grow.

The world’s most complicated software (May 19, 2015)

A typical software developer in a company possessing some level of technical sophistication routinely switches between abstraction levels during a working day. He or she may go from reasoning about product structure at the web page level to the intricacies of file allocation in their database system. Those dealing with some sort of message processing can switch from a byte-level layout of the protocol messages to a more general view of interconnected queues within the system; you get the drift. From the very early days of being in the profession, programmers are told that abstraction is the key to fighting the complexity natural to all software. It’s only after spending a few years in the profession that some may discover that a few domains are surprisingly resistant to abstraction alone. Without their only tool to fight complexity, developers are left to accept the difficulty of the field as a given. I’ll give an example of a truly difficult problem.

Meet the calendar, the world’s most complicated software.

I’m talking about a product like Microsoft Outlook (actually, its calendaring part and the server). At first sight, there’s nothing special about it, but once you think about it, complexity starts manifesting itself at a very basic level. The interaction protocol between participants trying to agree on a meeting is surprisingly hard to get right. For example:

  • When someone receives a meeting invitation, should it be shown to him if the meeting has already passed? How do we detect this (note that we should take time zones into consideration, including the cases when the participant is not in his default time zone)?
  • When someone proposes a new time for a meeting, but it can’t be sent because the participant was offline, should we try re-sending it when the participant goes online? What if the meeting time has already passed? What if it has not, but his proposed time has?
  • What about time zone changes? If the participant changed time zones, should we reschedule his events? What if all participants changed time zones? What would be the time zone of newly proposed events? Of changes to existing events?
  • Should we, once the participant goes online, notify him about changes to events that he did not accept because he was offline? What about the ones that have already passed?
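
To make just the first question concrete, here is a minimal Python sketch of the "has this meeting already passed?" check. It is an illustration only: the function name `has_passed` and the two fixed UTC offsets are made up for this example, and a real calendar would use proper zone rules (with daylight saving changes) rather than fixed offsets.

```python
from datetime import datetime, timedelta, timezone


def has_passed(meeting_end: datetime, now: datetime) -> bool:
    """Return True if the meeting ended before `now`.

    Both values must be timezone-aware. Aware datetimes compare as
    instants, so it does not matter which zone each one is expressed in.
    """
    if meeting_end.tzinfo is None or now.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; attach a time zone")
    return meeting_end < now


# Hypothetical fixed offsets standing in for real time zone rules.
BERLIN = timezone(timedelta(hours=2))
SAN_FRANCISCO = timezone(timedelta(hours=-7))

# Meeting scheduled by the organizer in Berlin, ending 15:00 local (13:00 UTC).
meeting_end = datetime(2015, 5, 19, 15, 0, tzinfo=BERLIN)

# The participant is travelling and reads the invitation at 07:00 in
# San Francisco (14:00 UTC), one hour after the meeting ended.
now_sf = datetime(2015, 5, 19, 7, 0, tzinfo=SAN_FRANCISCO)

print(has_passed(meeting_end, now_sf))
```

Even this easy part only works if every timestamp carries its zone; the questions about what to show, re-send, or reschedule are policy decisions that no comparison function answers.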

And so on, with the added inherent complexity of dealing with time zones. A calendar is a classical distributed system whose participants are people within the same organization using it simultaneously. Participants can be offline for extended periods of time; they must find consensus on the timing of group events using some reasonably robust protocol; they move around. You have probably noted already that part of the problem is the difficulty of specifying it correctly: you’ll have lots of fuzzy and vague sentences with “except” in your specification, rendering almost all your abstraction skills useless. The difficulty of developing a calendar suite differs from the difficulty of your typical job ad’s “hard and interesting problem” in the same way as your morning 2 km jog differs from doing an Ironman.

If you made a note to yourself never to work on calendar suites, here’s the second most complicated software in the world: the library dependency manager. It does not have to deal with people as participants, but is just as full of fuzzy specs: how to handle conflicting (or broken) transitive dependencies, non-mandatory ones, source vs. binary, etc.
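
A small Python sketch of the simplest of these cases, a conflicting transitive dependency. Everything here is hypothetical: the package names, the `INDEX` layout and the `find_conflicts` helper are made up for illustration, and dependencies are assumed to be pinned to exact versions, which real managers rarely get to assume.

```python
from collections import defaultdict

# Hypothetical package index: (package, version) -> exact-version dependencies.
INDEX = {
    ("app", "1.0"): [("web", "2.0"), ("orm", "1.4")],
    ("web", "2.0"): [("json", "1.1")],
    ("orm", "1.4"): [("json", "2.0")],
    ("json", "1.1"): [],
    ("json", "2.0"): [],
}


def find_conflicts(index, root):
    """Walk the transitive dependencies of `root` and return every package
    that is requested in more than one version."""
    requested = defaultdict(set)  # package name -> set of requested versions
    stack, seen = [root], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        name, version = node
        requested[name].add(version)
        # A package missing from the index (a "broken" dependency) is
        # treated here as having no dependencies of its own.
        stack.extend(index.get(node, []))
    return {name: sorted(versions)
            for name, versions in requested.items() if len(versions) > 1}


print(find_conflicts(INDEX, ("app", "1.0")))  # {'json': ['1.1', '2.0']}
```

Detecting the conflict is the easy part; the fuzzy spec begins with deciding what to do about it.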

Would you want to work on a calendar suite or a dependency manager?

Venice–San Francisco (May 11, 2015)

Everyone who has ever visited Venice as a tourist probably went there with some limited baggage of knowledge about the town. The former center of the very powerful Venetian Republic is, as the modern story goes, now full of tourists, almost abandoned by locals, and sinking. Reality is, of course, much less grim. When I stayed there in March (the low season there), Venice presented itself as a very lively small town. You stroll around and see kids returning from school across the bridge of Calle Bandi in Cannaregio and groups of students from the nearby Academy of Fine Arts playing guitar in Dorsoduro.

Venice Dorsoduro

One comparison stuck very firmly in my head the entire time I spent in Venice. What I saw around was, figuratively speaking, what San Francisco will look and feel like in three or four centuries. Bounded by water on almost every side, it’s now just as restrained from further growth, and almost as wealthy compared to its modern peer cities, as Venice was in its heyday. In the coming centuries technology, which San Francisco will ultimately represent, will only grow in importance in everyone’s life and will help the city prosper further. The best artists will lend their skills to making the future San Francisco the most refined city for those living there. The unstoppable gentrification will continue to purify the city fabric, eventually turning it into something as beautiful and uniform as Venice’s Centro storico.

Venice roofs

It will all be fine for San Franciscans until technology stops mattering (because everything eventually does). Maybe new human psycho-powers will be discovered in, say, Cologne (it will for sure be Germans with their sense of the irrational who do it), rendering the entire technology industry obsolete. After the period of San Francisco’s decline we’ll visit it, marvel at the hills and the architecture, and then all go looking for a “non-touristy typical San Franciscan restaurant”.

Apple Watch (May 4, 2015)

Can’t understand what the fuss is about Apple Watch overwhelming people with constant distracting notifications. Notifications can easily be disabled. This is what my (and any reasonable person’s) notification settings screen looks like.

Notifications

Calls, texts, things to do, navigation, updates. That’s all; Twitter mentions can wait, Instagram likes can wait, even email can wait. Below the fold there are something like a hundred and fifty apps, all of which wanted their share of my attention and which I decided to check on my own schedule. Because it’s the phone that belongs to me, not the other way around.

Deployment-driven development (November 10, 2014)

Die-hard fans of test-driven development advocate writing tests first, when no code of the actual program has been written. Their purpose at this stage is to serve as an (incomplete) specification of the program that also does not capture or express its actual semantics. To me, the described extreme case of TDD is meaningless, but recently I’ve discovered a real situation where one might want to write “code before actual code”. Let’s call it “deployment-driven development”.

The basic idea of DDD is that you always keep your program in a state where it can be deployed anywhere. This means you always start by writing deployment automation code before anything else. Want to work on a Python webapp? Instead of installing nginx and uWSGI locally, first write a Puppet manifest or a Chef recipe that sets them up on a virtual machine or in a Docker container. It may even take less time to do this than to perform an actual installation. Need to add a database or configure log rotation? Update the manifest and redeploy. As a result, you’re always dealing with software that is ready to be deployed anywhere and is not confined to your development machine.
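
To give a feel for the "update the manifest and redeploy" loop, here is a rough Python sketch. It is not a Puppet manifest or a Chef recipe: as a stand-in it renders a Dockerfile from a set of packages. The `render_dockerfile` helper, the `/srv/app` path, the `debian:stable` base image and the Debian package names are all assumptions chosen for this example.

```python
def render_dockerfile(packages, app_dir="/srv/app"):
    """Render a minimal Dockerfile that installs `packages` and copies the app.

    The set of packages plays the role of the "manifest": it is the only
    description of what the machine needs, and it lives next to the code.
    """
    lines = [
        "FROM debian:stable",
        "RUN apt-get update && apt-get install -y " + " ".join(sorted(packages)),
        f"COPY . {app_dir}",
        f"WORKDIR {app_dir}",
    ]
    return "\n".join(lines) + "\n"


# Day one: a Python webapp behind nginx and uWSGI (assumed Debian package names).
BASE = {"nginx", "uwsgi", "uwsgi-plugin-python3"}

# Later: log rotation is needed, so the manifest changes and the image is rebuilt.
dockerfile = render_dockerfile(BASE | {"logrotate"})
print(dockerfile)
```

The point is not the tool but the habit: the new requirement is recorded in the deployment description first, never installed by hand on the development machine.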

There are a number of advantages to starting even small projects with server automation first. When the product is ready to go live, there will be far fewer checkboxes to go through in the pre-launch checklist, and far fewer last-minute surprises. Exposure to deployment makes you think about the system architecture of your product at a very early stage and grow it as you code. You are always spawning an isolated machine to run your code, so library dependencies from different projects won’t interfere with each other.

What’s not to like about it?