Sunday, October 16, 2016

When closed source should become open

I've been thinking for some time about when closed source should become open, particularly when your core business is producing software. If your core business is to provide a service such as movies, as in the case of Netflix, then the dynamics are different. Because the core business is movies rather than software, you can simply go OSS and reap the benefits of having done so (as Netflix indeed have).

Before I start I should state that my views here don't describe the only reason to go with open software; there can be other reasons of course. Indeed there are many valid reasons to start with open as well. This post just investigates the closed to open transition, and when to make it.

When your business is about producing software, you're producing software assets that contain costly intellectual property. I'm a massive fan of open software and I've made many contributions in that space. However a software business also needs to make money of course.

I assert that there is a very limited window of opportunity for a software business to retain a software asset as closed, and that window is governed by the open competition that it faces. The job of the software business, then, is to stay ahead of the open curve, yet yield to open software when it starts to become a threat. This happened with Java when it was threatened by Apache Harmony. I believe that Harmony subsequently died precisely because of Sun OSS-ing Java.

I should state right now that my thoughts have been influenced by Danese Cooper, who gave a great talk on this very subject during Scala Days 2015. Danese discussed why open languages win, and I think her talk has a wider application.

When discussing the subject of open vs closed software with colleagues at Lightbend over the past year or two, I've described closed software as resting on tectonic plates. As these plates move around, the closed software at the edge falls off into the abyss of open software! I think that the analogy is mostly useful in order to illustrate that the world changes. Because of this you must regularly re-evaluate the competition that is open. If you have closed software solving a particularly useful/important problem then you can be fairly certain that open software will rise around it (again thinking of what Danese said here).

Open your commercial software and neutralise its open competition, also reaping the benefits of having gone open. Focus on adding higher level value building out from your core. Stay ahead of the game.

You certainly can't sit still.

Thursday, July 14, 2016

Microservices: from development to production

Let’s face it, microservices sound great, but they’re sure hard to set up and get going. There are service gateways to consider, setting up service discovery, consolidated logging, rolling updates, resiliency concerns… the list is almost endless. Distributed systems benefit the business, not so much the developer.

Until now.

Whatever you think of sbt, the primary build tool of Lagom, it is a powerful beast. As such we’ve made it do the heavy lifting of packaging, loading and running your entire Lagom system, including Cassandra, with just one simple command:

sbt> install

This “install” command will introspect your project and its sub-projects, generate configuration, package everything up, load it into a local ConductR cluster and then run it all. Just. One. Command. Try doing that with your >insert favourite build tool here<!

Lower level commands also remain available so that you can package, load and run individual services on a local ConductR cluster in support of getting everything right before pushing to production.

Lagom is aimed at making the developer productive when developing microservices. The ConductR integration now carries that same goal through to production.

Please watch the 8 minute video for a comprehensive demonstration, and be sure to visit the “Lagom for production” documentation in order to keep up to date with your production options. While we aim for Lagom to run with your favourite orchestration tool, we think you’ll find the build integration for ConductR hard to beat. Finally, you can focus on your business problem, and not the infrastructure to support it in production.


Tuesday, July 5, 2016

Developers need to care about resiliency

Disclaimer: I'm the technical lead for Lightbend ConductR - a tool that focuses on managing distributed systems with a key goal of resiliency.

I've been doing a reasonable amount of travelling over the past few years. Overall I enjoy it; I don't think that I'm away so much that it has become painful - maybe 3-6 international flights per year.

One of the things you hear about when travelling is missing an international connection. I've been fortunate in that this has happened just once; a couple of weeks ago in fact.

The airline was British Airways (BA), and they did a really good job of trying to make up time given that flights out of Heathrow were causing delays across Europe. Even so, my flight from Berlin TXL to London LHR was about two hours late, and I missed my Sydney SYD flight from LHR. The BA staff did a great job of putting me up in a hotel overnight and getting me on to the next available flight. Honestly, from a staff perspective, BA were fantastic.

What was frustrating though was that it took about two hours to arrange the accommodation and flight booking. This was despite the fact that I didn't have to queue for long and that a staff member attended to me in a reasonable time frame. The problem was the BA computer system.

Apparently BA have some new system. There were IT staff walking around helping the front-of-house staff get into the system and deal with its incapacity to handle any load. The IT staff were frustrated. The front-of-house staff were frustrated. I was frustrated (although not as frustrated as a Business Class passenger in front of me who felt that his ticket meant that BA should treat him like royalty!).

BA's computer system had an amazing effect on all concerned, except most likely the people that wrote it. In my opinion the original developers should be there supporting the front-of-house staff. They should feel the pain that they have inflicted.

I'm sure that you have similar stories to share, where computer systems have failed you miserably. Computer systems will of course fail, that's natural, but it is the fact that their degree of failure is considered acceptable that is the problem. Computer systems should not fail to the extent that they do. Your airline reservation system, online banking site, or whatever it is, should be more reliable than it probably has been.

The problem is that developers do not understand that building resilience into their software is more important than most other things. As my colleague Jonas Bonér has stated many times, "without resiliency nothing else matters". He's so right. Why is it then that developers just don't get this?

My answer to that is that many developers just see what they do as a job, and they don't really care about what they do. Putting that aside though, creating and then indeed managing distributed systems, a key requirement for resiliency, is harder than not doing so; not hard, but harder, and developers are lazy (btw: in case you don't realise this, I'm a developer!).

We need systems that are resilient. We therefore need developers to care about resiliency. The more that developers care about resiliency, the more tools and technologies we'll see appearing in support of it. I strongly feel that it all starts with the developer though.

I imagine a world where, given the inevitability of missing flight connections, I can wait in a queue for no longer than 10 minutes, be handled within another 10 minutes and then sleep off the tiredness and inconvenience of waiting another 24 hours for my next flight. The developer just needs to start caring in order for this to happen. Make developers responsible for managing their system in production and they'll start caring, I guarantee it.

Here's a language/tool agnostic starting point for you if you are a developer that cares enough to have read this far: The Reactive Manifesto.


Sunday, May 22, 2016

Why we created an orchestration tool for you

One question I have had to answer a few times as the tech lead of ConductR, and I think it is a healthy question, is why Lightbend created ConductR. This post is my personal attempt to describe the rationale for it two years ago, and why I think it is more relevant than ever.

Back then we wanted a tool that was focused on making the deployment and management of applications and services built on Java, Scala, Akka and Play as easy as it could be. We wanted ConductR to be to operations what Play is to web application developers; a "batteries included" approach to deploying and managing reactive applications and services.

Two years ago, there really wasn't anything else out there that we felt offered such a packaged approach to solving these new use-cases for operations people. The sentiment was that we had done a reasonable job with the Reactive Manifesto at that point, and that we'd definitely engaged developers, but we were quickly going to arrive at a situation where operational people were going to find it a challenge to manage these new distributed applications and services. We also wanted something that had the reactive DNA.

That's how it all started. So, what's changed, and why is ConductR relevant now?

There are a number of players emerging in the orchestration space presently. This certainly validates our being a player in this space from a needs perspective. If you're happy to roll your own orchestration (which actually remains what we're up against in terms of competition, and this hasn't changed much in two years), then be prepared to have two people spend at least a year tackling a problem that is harder than you think, and then realise that you have an operational cost in maintaining it. On top of this, there's the risk to your company regarding those individuals leaving... is it sufficiently documented in terms of others taking over? Nobody has won in the orchestration space, but there's enough to choose from that will trump the business risk to your company of rolling your own. My advice here, having been involved in designing and writing an orchestration tool (twice), is not to roll your own, and to focus on your core business.

While I personally think that the operational productivity culture that permeates through our design is still the single most important reason to consider ConductR, here are some other reasons:

  • a means to manage configuration distinctly from your packaged artifact;
  • consolidated logging across many nodes;
  • a supervisory system whereby if your service(s) terminate unexpectedly then they are automatically restarted;
  • the ability to scale up and down with ease and with speed;
  • handling of network failures, in particular those that can lead to a split brain scenario;
  • automated seed node discovery when requiring more than one instance of your service so that they may share a cluster;
  • the ability to perform rolling updates of your services;
  • support for your services being monitored across a cluster; and
  • the ability to test your services locally prior to them being deployed.
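To make the supervision item in the list above concrete, here is a minimal sketch of the restart-on-failure idea in plain Java. It is an illustration only and not ConductR's API or implementation: the `RestartSupervisor` class, its `supervise` method and the `maxRestarts` parameter are hypothetical names invented for this example, and a real supervisor would also deal with restart delays, process isolation and reporting.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical restart-on-failure supervisor; a sketch, not ConductR's API. */
public class RestartSupervisor {

    /** The value a supervised service finally produced, and the restarts it took. */
    public static final class Outcome<T> {
        public final T value;
        public final int restarts;

        Outcome(T value, int restarts) {
            this.value = value;
            this.restarts = restarts;
        }
    }

    /**
     * Runs the service and restarts it whenever it terminates with an exception,
     * giving up (and rethrowing the last failure) after maxRestarts restarts.
     */
    public static <T> Outcome<T> supervise(Callable<T> service, int maxRestarts) throws Exception {
        int restarts = 0;
        while (true) {
            try {
                return new Outcome<>(service.call(), restarts);
            } catch (Exception e) {
                if (restarts >= maxRestarts) {
                    throw e; // restart budget exhausted: surface the failure
                }
                restarts++; // unexpected termination: start the service again
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // A stand-in service that crashes on its first two starts.
        AtomicInteger starts = new AtomicInteger();
        Outcome<String> outcome = supervise(() -> {
            if (starts.incrementAndGet() < 3) {
                throw new IllegalStateException("crashed");
            }
            return "running";
        }, 5);
        // prints "running after 2 restarts"
        System.out.println(outcome.value + " after " + outcome.restarts + " restarts");
    }
}
```

The cap on restarts is a design choice of this sketch: it stops a service that can never start from spinning forever, at the cost of needing someone (or something) to notice when the budget runs out.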

Furthermore, ConductR is the complete manifestation of the entire stack of technologies that we at Lightbend both contribute to and support. It is a great example of an Akka-based distributed application that uses, in particular, akka-cluster, akka-distributed-data and akka-streams/http. It is also tightly integrated with our Akka monitoring based instrumentation, and the monitoring story around events, tracing and metrics is going to get stronger. If you like our stack, you should feel good about the way ConductR has been put together.

We have programmed ConductR in the spirit of the Reactive Manifesto, with resiliency and elasticity being a particular focus. There is no single point of failure and our ability to scale out is holding up.

One last point: we use ConductR for our own production environment at Lightbend hosting our websites and sales/marketing services. With any product out there, you should always look for this trait. If a supplier is not dependent on their own technologies in terms of running their core business then beware; they can lose enthusiasm very quickly.

ConductR is as relevant as it ever was, and with its batteries-included approach for operations, I'm sure it'll become even more relevant as the industry moves toward deploying and managing microservices.

One last tidbit: ConductR is becoming a framework for Mesos/DCOS. Exciting times!

Thanks for reading this far!

Monday, March 7, 2016

What the name "lightbend" means to me

I thought that it'd be useful to share my personal perspective on the meaning of our company name change. Here are the contents of an email that I sent out to everyone within Lightbend, and which was warmly received.


Hi fellow lightbenders,

I’m very excited about the Lightbend name, and want to provide my view on what it means to me.

About two years ago, I presented at YOW. YOW is a great conference with the characteristic that speakers get to talk to a cross section of our industry on three occasions: Melbourne, Brisbane and Sydney. One is therefore not preaching to the converted, but rather talking to what can be quite a hostile crowd!

My first talk was to a few hundred people in Melbourne - apparently the most hostile of the three cities. About ten minutes into the talk I had that sinking feeling that I’d lost everyone. My talk was about Akka streams and the importance of back pressure. Lots of blank looks all around. An interesting aspect of YOW is that you are scored by the audience. You guessed it, my scores were low.

Travelling up to Brisbane I felt that it was important to bring the talk back a bit. Instead of delving right into Akka streams, I felt that I should at least have a preamble around reactive streams and why we did that. The Brisbane talk went much better.

However given the nature of the questions asked after my talk I felt that I could do even better. So, for Sydney, my preamble included a discussion on “why reactive”. This set the scene for the remainder of the talk and my Sydney scores reflected that.

Coming away from YOW I realised how fringe Typesafe were - again this is two years ago. I certainly appreciated that we were nowhere near mainstream, but really, we were on another planet compared to where the IT industry was at.

Roll forward to today and you can see that we’ve come a long way. We have done so without deviating from our mission from a technical perspective. I would tell people that if you want to understand anything about our technical direction then simply read the Reactive Manifesto. You'll then see our DNA blueprint; the very fabric of what we are. Taking that further and quoting Jonas Bonér, “without resiliency, nothing else matters”. We have upheld the manifesto and, in particular, resiliency, like nothing else matters.

And now we are seeing the industry finally come our way. To highlight a few points: the industry received our new name well; it is excited about Lagom as a microservices framework for Java; and the enterprise-leading Spring framework is effectively adopting the Reactive Manifesto.

This is where the lightbend name kicks in for me.

I see lightbend as the gravitational force that is bending the light beam representing the direction of the industry at large. Gravity bends light.

To use my earlier analogy of Typesafe being on another planet, two years ago, we were light years away from where the industry was thinking. We are no longer. We have pulled the industry to the way we think software systems should be put together and managed.

We are now at an interesting juncture. As the company expands as it needs to, it would be easy to compromise our technical mission in order to gain further traction. However it is now more important than ever to stay on mission.

We need to be brave and continue to be bold. The industry doesn’t need more of the same; it needs more companies like us.

Thanks for reading!

Kind regards,