
DevOps for dummies


In May of last year, I published the blog "Docker for dummies" (https://bankloch.blogspot.com/2020/02/docker-for-dummies.html), explaining some basic Docker container concepts for non-technical people.
Another concept which is closely associated with Docker containers and which also strongly impacts the financial services industry, is DevOps. Here too, the non-technical impacts (on organization, security, processes…​) of the concept can be hard for managers and businesspeople to grasp, even though it is essential that everyone is aware of and involved in this transformation.

DevOps (a contraction of "development" and "operations") is a practice within software engineering, which aims to bring together the previously strongly separated worlds of software development (building the software) and software operations (keeping systems stable and running). The term was first introduced at a conference in Belgium in 2009, but really took off a bit more than 5 years ago, accelerated by other evolutions like Agile and Cloud computing.

DevOps aims to deliver software changes (value) to the end users in a faster, smarter, cheaper and better (repeatable, predictable, efficient, and fully audited) way, which ultimately benefits both the development and operations teams, but of course also the competitiveness of the entire organization.

This goal is achieved in multiple ways:

  • Automation (continuous organizational automation): automating all steps in the software delivery life cycle saves time (as no manual intervention is required anymore), but also improves quality (less room for human error) and security (less room for malicious intent). All this leads to increased speed of delivery, shortened bug-fix lead times and reduced downtimes (e.g. by releasing without service interruption and by a faster mean time to recovery in case of failure).

  • Better collaboration ("you build it, you run it") between teams, by breaking down the silos between development teams, operations teams and other groups such as the architecture and testing teams, i.e. all stakeholders work collaboratively towards a shared goal and have end-to-end accountability and ownership.

  • Shorter feedback cycles: by speeding up release cycles and monitoring system usage more (pro)actively, it becomes possible to reach a shorter time to market (faster time to value), reduce the failure rate of releases (as each release is much smaller) and get much faster feedback from users to adjust your organization’s course of action.
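As a rough illustration of these delivery metrics (speed of delivery, failure rate of releases, mean time to recovery), here is a minimal Python sketch that computes two of them from a hypothetical deployment log; the data structure and field names are assumptions for illustration only, not a real monitoring tool's API:

```python
from datetime import datetime, timedelta

# Hypothetical deployment log: each record notes when a release went out,
# whether it failed in production, and how long recovery took if it did.
deployments = [
    {"at": datetime(2021, 3, 1), "failed": False, "recovery": None},
    {"at": datetime(2021, 3, 3), "failed": True,  "recovery": timedelta(minutes=45)},
    {"at": datetime(2021, 3, 5), "failed": False, "recovery": None},
    {"at": datetime(2021, 3, 8), "failed": True,  "recovery": timedelta(minutes=15)},
]

def change_failure_rate(deps):
    """Share of deployments that caused a production failure."""
    return sum(d["failed"] for d in deps) / len(deps)

def mean_time_to_recovery(deps):
    """Average recovery time over the failed deployments."""
    times = [d["recovery"] for d in deps if d["failed"]]
    return sum(times, timedelta()) / len(times)

print(change_failure_rate(deployments))    # 0.5
print(mean_time_to_recovery(deployments))  # 0:30:00
```

Smaller, more frequent releases should push the failure rate down and, because each change is easier to diagnose, shorten the recovery time as well.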

Achieving this goal and implementing these methods is of course a long transformation journey, which cannot be completed overnight. Instead it is a never-ending process of continuous improvement, in which an organization can be at a different maturity level for each of these methods.

In more detail, it consists of implementing a number of tools and processes, but also of drastically changing the organizational structure and people’s way of thinking and working (requiring extensive change management). More specifically, the following processes need to be adapted or implemented:

  • Source code management: all source code should be checked in centrally (in a code repository like Git), giving everyone access to the latest version of the source code and a full history of all changes to it.

  • Code analysis: when code is checked in, it should be automatically analyzed for potential issues like bugs (cfr. FindBugs), bad practices (cfr. PMD), convention breaches (cfr. Checkstyle) and copy-paste duplication (cfr. CPD), and measured for code coverage (cfr. Cobertura)…​

  • Continuous Integration (CI): code should be automatically built and assembled after each code commit. This is done via a build (automation) tool (like Ant, Maven, Gradle…​) and a continuous integration tool (like Jenkins) respectively.
    This forces developers to make sure that the full code base still builds and executes correctly after each check-in. It reinforces end-to-end accountability: everyone is responsible for a successful integration of their code.

  • Continuous Testing (CT): after each code commit, there should also be an automatic execution of functional and non-functional tests (using tools like JMeter, LoadRunner, Selenium, Cucumber…), at different abstraction levels and with different focus areas, i.e. automated unit testing, assembly testing, end-to-end testing, stress testing, performance testing, security testing…​ Real-time dashboards and alerting should notify developers immediately when a recent code commit has introduced a (regression) issue.

  • Automated infrastructure: in order to automate all of the above processes, it should be possible to provision infrastructure in a very short time, e.g. spinning up a test environment fully automatically. This requires infrastructure to be treated as code, i.e. all configuration instructions to set up infrastructure should be stored in a code repository and versioned as well. Via Infrastructure as Code tools (such as Chef, Puppet, Terraform, Ansible…​) and containerization/container orchestration (i.e. Docker & Kubernetes), the infrastructure level can be fully automated and abstracted away.

  • Continuous Deployment (CD): each commit should be automatically released to a test environment and on a frequent basis (e.g. weekly) released to production. Important here is that the same automated pipelines and deployment steps used for the test environments should also be used for the production environment.

  • Continuous Monitoring (CM): continuously monitoring different metrics (infrastructure, application & business monitoring) via automated monitoring tools (like Kibana, New Relic, Datadog, Prometheus…) makes it possible to proactively monitor and intervene on a system and to rapidly correct issues in an automated way (e.g. an automatic roll-back of the last deployment).

  • Continuous Insights/Improvement: the metrics collected via continuous monitoring of end-user behavior should drive the product roadmap (i.e. data-driven product management). This means that small changes are deployed to a part of the user base (via canary releases or A/B testing) and their impact (value) is measured. If the impact is positive, the change can be deployed to a larger user base, otherwise a roll-back is done and a new iteration (experiment & fail fast) is launched.

  • Change Management: introduce a culture of collaboration and continuous improvement in the organization via:

    • Adapting the existing organizational structure, i.e. removing the silos between the development and operations teams (bridging the gaps) and instead working with integrated, cross-functional teams with end-to-end responsibility, supported by transversal teams which set up and improve the DevOps tooling and common infrastructure components

    • Empowering teams and making them as autonomous as possible, i.e. rewarding initiative and letting teams take decisions themselves, rather than promoting a top-down hierarchical decision process. This also means that a culture of rewarding success and avoiding blame (a failure is an opportunity to learn and improve) should be established.

    • Training people to become T-shaped, all-round profiles (cfr. full-stack engineers)
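The automated delivery pipeline described in the bullets above (continuous integration, testing and deployment) can be sketched as an ordered list of stages in which the first failing stage stops the run and notifies the team. This is a simplified Python illustration of that fail-fast principle, not a real CI tool; the stage names and the echo commands are placeholders:

```python
import subprocess

# Ordered pipeline stages: each maps a stage name to the shell command a CI
# tool (like Jenkins) would run. The commands here are placeholders.
PIPELINE = [
    ("build", "echo compiling sources"),
    ("unit-tests", "echo running unit tests"),
    ("deploy-test-env", "echo deploying to test environment"),
]

def run_pipeline(stages):
    """Run stages in order; stop at the first failure (fail fast)."""
    for name, command in stages:
        result = subprocess.run(command, shell=True)
        if result.returncode != 0:
            print(f"stage '{name}' failed - notifying the team")
            return False
        print(f"stage '{name}' passed")
    return True

if __name__ == "__main__":
    run_pipeline(PIPELINE)
```

Because the same stage definitions are executed for every commit and for every environment, the pipeline itself becomes versioned, repeatable and auditable, which is exactly the point of the automation bullet above.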

This transformation journey goes hand-in-hand with other evolutions, which mutually accelerate each other.

  • Agile: the Agile methodology also aims to deliver value to the business as quickly as possible (i.e. in every sprint), meaning that every sprint a deployment to at least a test environment (but preferably a production environment) should be planned. Furthermore the Agile methodology prescribes working with cross-functional teams, which are empowered to be maximally autonomous and to continually adjust the product backlog based on new insights (gained from measuring).

  • Migrate to (Public/Private) Cloud: cloud providers not only make automated infrastructure provisioning much simpler, but typically also provide a lot of out-of-the-box DevOps tooling, helping to automate the different steps in the software delivery life cycle.

  • Microservices: in a microservices-based architecture (cfr. my blog https://bankloch.blogspot.com/2020/02/microservices-yet-another-buzzword-or.html) the overall application landscape is split into autonomous, isolated components (called microservices). Such an architecture simplifies the application of DevOps principles, as microservices are usually built and maintained by fully autonomous teams which are end-to-end responsible for a microservice. Furthermore, due to the strong encapsulation of each microservice, continuous deployment becomes easier, as the impact of code changes can be isolated much more easily.

  • Open source: thanks to open source (cfr. my blog https://bankloch.blogspot.com/2020/02/banks-are-finally-embracing-open-source.html) hundreds of tools, libraries and best practices are freely available to help automate every step of the software delivery life cycle (at a fraction of the cost). Smaller organizations can profit here from the DevOps experience of large, very mature organizations like Facebook, Google, Uber, Netflix or LinkedIn. These components allow your organization to leapfrog in DevOps maturity.

  • Containerization: the rise of (Docker) containers and container orchestration (Kubernetes) also helps to apply DevOps principles, as the released package (i.e. the container) becomes more high-level and more easily portable. Furthermore, via the standard orchestration framework of Kubernetes, all kinds of deployment best practices (releasing without downtime, gradual roll-out strategies) and monitoring come practically out of the box.
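The canary-style gradual roll-out mentioned above (deploy to a small share of users, measure the impact, then expand or roll back) can be sketched in a few lines of Python; the traffic steps and the error-rate threshold are purely illustrative assumptions, and in practice an orchestrator or service mesh would do the traffic shifting:

```python
# Gradually shift traffic to the new version, checking the observed error
# rate at each step; roll back as soon as the canary misbehaves.
TRAFFIC_STEPS = [0.05, 0.25, 0.50, 1.0]  # share of users on the new version
ERROR_THRESHOLD = 0.02                   # max acceptable error rate (assumed)

def canary_rollout(observe_error_rate):
    """observe_error_rate(share) -> measured error rate at that traffic share."""
    for share in TRAFFIC_STEPS:
        rate = observe_error_rate(share)
        if rate > ERROR_THRESHOLD:
            print(f"error rate {rate:.1%} at {share:.0%} traffic - rolling back")
            return "rolled-back"
        print(f"error rate {rate:.1%} at {share:.0%} traffic - expanding")
    return "fully-deployed"

# Example: a healthy release keeps its error rate below the threshold.
print(canary_rollout(lambda share: 0.01))  # fully-deployed
```

The key design choice is that the roll-back decision is automated and taken per traffic step, so a bad release only ever reaches a small share of users before being withdrawn.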

Despite all the obvious benefits and the fact that banks and insurers have been working on introducing DevOps practices and all the other accelerators (like Agile, open source…​) for a while now, the maturity level (and thus the achieved benefits) of incumbent financial services firms is still relatively low. Obviously the complexity and historical legacy of a bank or insurer make it much more difficult to introduce DevOps than at a Fintech start-up, i.e.

  • A complex layering of processes: today in a bank or insurance company most activities are still subject to formal requests (strict reporting lines), rules, approvals, signoffs and strict schedules (often imposed by security, audit, risk and compliance departments). These restrictions are the result of the risk averse nature of the industry and the fact that the financial industry is highly regulated. The result is that it is typically very hard to get access to production environments, to install something (and experiment) on environments (even development environments), to rapidly release something to production or to introduce new technologies or new processes.

  • Hierarchical structures: most banks and insurance companies are still very hierarchically structured, with strictly defined IT departments. Such an organizational structure feeds silo thinking (as each department has its own objectives) and the complexity of the decision process.

  • Outsourced development and operations: as many banks have outsourced (a part of) their development and operations, silos are automatically created, as the goals of the financial services company and the outsourcing vendor are obviously not always aligned.

  • Shared service teams: in an attempt to find economies of scale and thus improve efficiency, many financial services companies have set up so-called shared service teams for developing/configuring and maintaining services which are used in almost every business application (such as database administration, or managing middleware such as message queues or application servers like WebSphere, JBoss or Tomcat). As these teams were initially flooded with demands, they have all introduced very rigid ticketing systems, which considerably reinforce silo thinking and result in a lack of ownership and end-to-end accountability (as a shared service engineer has little feeling for the bigger picture).

  • Technical legacy and technology sprawl: the complex and outdated application architecture of most incumbents in the financial services industry makes it very difficult to roll out modern DevOps practices. E.g. DevOps tooling is very difficult (if not impossible) to apply to mainframe systems.
    Furthermore, the ongoing digital transformation programs and regulatory projects typically already consume so much energy within these organizations that very little focus remains for implementing DevOps practices.

Despite these challenges, banks and insurance companies need to take the step of rolling out DevOps practices within their organization. In this transition, it’s everyone’s responsibility, not just the software developer’s, to make sure that user-centric and robust systems are delivered. A bank or insurer is essentially an IT firm, so its IT systems are key assets. Every (line) manager should therefore have a notion of those assets and of how DevOps practices can help improve them.
