Testing: overkill or an absolute necessity


The tech companies of Silicon Valley (with Facebook as one of the most notable proponents) embrace the attitude of "Move Fast and Break Things", "Sell/Ship First, Fix Later" and "Growth at any cost". This philosophy favours a fast time-to-market (and being first on the market) over quality. The idea is that if you aren’t breaking things, you’re delivering value too slowly. In a world of continuous evolution (especially in the tech sector), this seemed the right way forward.
This is in strong contrast with the large established financial service companies, which often have only a few releases a year, with multiple test phases and test cycles that often take months to complete.
As always in these types of evolutions, the 2 extremes (productivity/time-to-market versus predictability/quality) tend to converge to a middle ground. On the one hand, the incumbent banks are adopting Agile and DevOps methodologies and practices, allowing for faster, more gradual delivery with the possibility to roll back. On the other hand, the major tech companies with millions of users have also understood that even a small bug can have an enormous impact given the size of their user base. Not surprisingly, Facebook changed its motto "Move Fast and Break Things" in 2014 to "Move Fast with Stable Infra".
In the end there is no golden rule for how software should be delivered. The degree of testing required and the associated speed of delivery depend on a number of factors:
  • The impact of a bug (i.e. the cost of failure), determined by the criticality of the application, the number of impacted users and the nature of the consequences (e.g. impact on human health/human lives, financial impacts, legal/security implications…).
  • The speed at which feedback can be obtained from users. Typically, a front-end with good feedback mechanisms through monitoring allows faster feedback than a complex back-end process.
  • The ease of rolling back the change. E.g. an installation which modifies a database structure is much more complex to roll back than a new version of a front-end screen.
  • The possibility to roll back the resulting impact of the software. E.g. for an online view-only screen or report, the results of erroneous software do not need to be rolled back, while for software which sends out mails or SMSs, or which makes calculations and bulk updates in a database, rolling back the results can be very complex and difficult (even if the software itself can be easily rolled back).
  • The time needed to debug the code, i.e. how well the code is structured and documented, how much knowledge of the code base there is in the development team, how easily a specific situation can be reproduced in a test environment…
  • The cost of doing a test, i.e. how easily tests can be automated, which skills and coordination are needed, how easily test data can be set up…
Companies should therefore not have 1 unique policy and 1 release calendar applicable to all changes, but instead allow a categorization of changes based on the above criteria. Depending on the category, a faster/slower release schedule with less/more testing is needed.
Typically, you could work with 3 categories, i.e.
  • Low impact changes, for which the software and the resulting impact can be easily rolled back, could be deployed to production immediately without any manual verification (i.e. only some automated tests). These changes could also be deployed directly to the full user base.
  • Medium impact changes, for which the software and the resulting impact can be easily rolled back, can also be deployed with little to no manual testing, but require a gradual deployment/roll-out. As it is difficult to have a gradual roll-out for every small change, it is best to work with a lower release frequency, e.g. bi-weekly.
  • High impact changes (i.e. mission-critical) or changes for which the software or the resulting impact cannot be easily rolled back should get extensive testing cycles and a gradual roll-out (move slowly and steadily). This means that releasing on a bi-monthly basis is probably the maximum obtainable frequency.
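The 3 categories above can be sketched in code. This is only a minimal illustration, not a definitive implementation: the names `Change`, `ReleasePolicy`, `POLICIES` and `categorize`, as well as the "low"/"medium"/"high" impact rating, are assumptions made for this example. In practice the rating would follow from the criteria listed earlier (cost of failure, number of impacted users…).

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Change:
    # Hypothetical attributes, derived from the criteria in the text.
    impact: str                    # "low", "medium" or "high"
    software_rollback_easy: bool   # can the deployment itself be undone easily?
    results_rollback_easy: bool    # can the effects (mails, bulk updates...) be undone?


@dataclass(frozen=True)
class ReleasePolicy:
    manual_testing: str
    gradual_rollout: bool
    cadence: str


# One policy per category, paraphrasing the three bullets above.
POLICIES = {
    Category.LOW: ReleasePolicy("none (automated tests only)", False, "immediately"),
    Category.MEDIUM: ReleasePolicy("little to none", True, "bi-weekly"),
    Category.HIGH: ReleasePolicy("extensive test cycles", True, "bi-monthly at most"),
}


def categorize(change: Change) -> Category:
    """Assign a change to one of the three release categories."""
    if change.impact not in ("low", "medium", "high"):
        raise ValueError(f"unknown impact rating: {change.impact!r}")
    easy_rollback = change.software_rollback_easy and change.results_rollback_easy
    # Mission-critical changes, or changes that cannot be rolled back easily,
    # always end up in the slowest, most heavily tested category.
    if change.impact == "high" or not easy_rollback:
        return Category.HIGH
    if change.impact == "medium":
        return Category.MEDIUM
    return Category.LOW


# Example: a cosmetic front-end fix versus a bulk database update.
frontend_fix = Change("low", software_rollback_easy=True, results_rollback_easy=True)
bulk_update = Change("low", software_rollback_easy=True, results_rollback_easy=False)
print(categorize(frontend_fix), POLICIES[categorize(frontend_fix)].cadence)
print(categorize(bulk_update), POLICIES[categorize(bulk_update)].cadence)
```

Note that in this sketch the bulk update lands in the high category even though its impact rating is low, because its results cannot be easily rolled back.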
Of course, with these different release frequencies, it is difficult to execute the testing for a change in the last category, as the surrounding software can evolve during the test cycles. It is therefore critical to isolate software components as much as possible through encapsulation and to automate testing as much as possible, so that (regression) testing cycles can be repeated very quickly.
In big financial service companies, it is difficult to introduce the first 2 categories, as they contradict the DNA of a typical bank or insurance company, which aims to manage (avoid or mitigate) risks in the best possible way (i.e. the industry is very sensitive and the reputation of a financial service company as a safe haven is essential).
However, if banks and insurance companies want to become innovative companies again, they need to try out new things, measure the impact and learn from it (probe, sense and respond). This means that releasing should be a continuous effort (at least to the test environment, and for the first and second category of changes also to the production environment), which can be achieved via:
  • Advanced DevOps practices, like CI/CD pipelines, continuous monitoring, Infrastructure as Code, automated testing, fast and automated roll-backs…​
  • Deployment strategies, like feature flags, canary testing, A/B testing…​
  • Well-trained and motivated software engineers, who deliver high-quality code, have a good understanding of the business impacts and are able to analyse and fix potential bugs very quickly.
  • IT development teams having (almost) direct contact with end-users, so that feedback cycles are near-instantaneous and IT departments have a good connection with the business world (ownership and commitment). This is in contrast to the current way of working, where IT developers are often 4-5 layers away from the end-user (i.e. often separated by key users, business/product owners, business analysts, functional analysts, technical analysts…).
  • Good application architecture, where maximum encapsulation between modules is realized (e.g. via a microservices-based architecture) and where fault isolation is built into the system (cf. my blog on Building resilient systems - https://bankloch.blogspot.com/2020/02/building-resilient-systems-in-financial.html)
  • Good logging and monitoring, allowing automatic identification of issues and easy debugging via drill-down capabilities from business metrics to application metrics, all the way to log entries (cf. my blog on monitoring - https://bankloch.blogspot.com/2020/03/microservices-monitoring-disaggregate.html)
  • An agile governance structure, i.e. no complex quality governance with reviews, sign-off points, quality gates…, but instead a governance where end-to-end ownership is promoted.
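To make the deployment strategies above more concrete, here is a minimal sketch of a feature flag with a percentage-based (canary-style) roll-out. It is only one possible approach and not tied to any specific feature flag product: the function names `rollout_bucket` and `is_enabled` and the flag name `new-payment-screen` are hypothetical. The idea is that each user is hashed into a stable bucket, so that raising the roll-out percentage only adds users and never removes users who already had the feature.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if the feature is switched on for this user.

    rollout_percent = 0 switches the feature off for everybody (a fast
    "roll-back" without redeploying), 100 switches it on for everybody.
    """
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    return rollout_bucket(flag_name, user_id) < rollout_percent


# Example: start with 5% of the users, then widen the roll-out to 50%.
users = [f"user-{i}" for i in range(1000)]
canary = [u for u in users if is_enabled("new-payment-screen", u, 5)]
wider = [u for u in users if is_enabled("new-payment-screen", u, 50)]
print(len(canary), len(wider))
```

Because the bucket depends only on the flag name and the user identifier, a user keeps seeing the same variant between sessions, which keeps the feedback collected through monitoring consistent.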
The best way to realize this change is to accept that even the best-tested software still has bugs (i.e. it is impossible to create 100% defect-free software and impossible to test everything).
As a result of this realization, banks and insurance companies should focus less on trying to eliminate all bugs before going to production, and more on identifying bugs in production as quickly as possible (getting fast feedback from end-users and ensuring that this feedback reaches the IT development team as quickly as possible) and reacting to them as fast as possible (quick bug-fixing delivery cycles). Often a bug can even be a positive experience for an end-user, when the bug is fixed quickly and there is good communication around it. At that moment the end-user reporting the bug will feel appreciated and a deeper relationship can be built. Today, however, most banks and insurers sometimes leave production bugs unsolved for months (due to other priorities and the above-mentioned release cycles), which leads to enormous frustration among end-users.
Furthermore, one should realize that long release cycles come at a major cost. Due to the enormous cost of late delivery (i.e. missing the gate for a release and impacting all dependent projects), IT teams start to optimize their work for deadlines, which reduces agility and cooperation, ultimately lowering overall productivity.
But even if the above transformation is realized, the extensive testing phases and cycles are still required for the last category of changes. With testing typically consuming around 30% of the project budget, one should definitely look at testing as a possible candidate for cost reduction.
Testing is important and requires a specific skill set, but my personal experience is that testing teams often tend to lose themselves in methodology that doesn’t always deliver much result. Below are some reflections on possible cost reductions with regard to testing:
  • Unit testing: in the Waterfall model these types of tests were critical, as the time between writing the code and the functional testing (i.e. feedback on quality) was very long, meaning that the developer no longer had the code in mind, which resulted in high costs to fix a bug (increasing exponentially with every testing cycle).
    Today, with shorter test cycles and automated tests executed in CI/CD pipelines, certain functional tests might be executed only minutes after the code commit. One should therefore wonder if it is not better to skip unit tests and replace them with more elaborate functional end-to-end testing. Such end-to-end tests are more complete, easier to understand, require less maintenance and are easier to manage. Having seen too many projects where updating the unit tests following a small change took 2-3 times longer than the change itself, I consider this definitely a candidate for cost reduction.
    Developers often argue that writing unit tests also helps to better understand and design the code and makes it easier to maintain, as the unit tests can act as a structured way of documenting the code. While I fully agree with these arguments (I don’t think anybody is claiming that unit tests are a bad thing), they don’t make the business case (i.e. is the time spent on creating and maintaining unit tests ultimately recovered later on).
  • Dedicated testing teams: while testing definitely requires a specific skill set, which is different from design or development, it is (in my belief) a bad idea to work with separate testing teams (competence centres) which execute tests for different projects, as they often have insufficient knowledge of the business, the project and the applications. This lack of in-depth knowledge usually leads to too strong a focus on UX testing, resulting in dozens of small, cosmetic defects, while critical business bugs are overlooked. It is therefore better to work with a dedicated tester within the scrum team, who also assists in the design process, but has quality assurance within the team as primary focus.
  • Test preparations: good test preparation (and coordination between all involved parties) is essential to deliver good software. Nonetheless, I don’t think writing detailed test cases and test scripts is the right way to achieve this. These deliverables are typically very time-consuming to create and maintain and include a lot of repetitive information, which a tester who is well acquainted with the business and the application doesn’t need in order to execute the tests. Instead, it is much more efficient to create a test matrix, which contains the different combinations of flows/decision points. Such a matrix takes considerably less time to create and maintain and immediately shows the different areas to focus on during testing.
  • Test automation: test automation is not cheap (to create and to maintain) and can be quite complex (and therefore error-prone) to set up. As a general guideline, you could say that setting up an automated test takes 5 to 10 times more effort than manually executing the test. Automation is therefore only relevant when you know that the test will be executed at least 10 times. However, with CI/CD pipelines executing continuous testing, this number is often reached in a few days. Test automation is therefore intrinsically linked to Agile software delivery.
    The 2 most common areas for test automation are the automated calling of APIs and robots simulating user interactions on screens. If resources capable of test automation are scarce, one should focus first on automating API calls, as these are much easier to automate and less prone to change (i.e. tests are more regression-proof and APIs evolve less quickly than front-end screens).
  • User acceptance testing: try to involve end-users in the testing as early as possible (and not only in the final acceptance testing phase), as these users can give very valuable feedback in a very short time. Of course, the time of these users should be used as efficiently as possible and expectations should be managed carefully when they are involved early in the testing process.
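Two of the reflections above, the test matrix and the break-even point for test automation, can be illustrated with a small sketch. The function names `build_test_matrix` and `automation_pays_off` and the decision points of the example flow are assumptions made for this illustration. The break-even check is deliberately simplistic: it only compares the setup effort with the manual executions it replaces and ignores the maintenance cost of the automated test.

```python
from itertools import product


def build_test_matrix(decision_points: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return one row per combination of the given flows/decision points."""
    names = list(decision_points)
    return [
        dict(zip(names, combination))
        for combination in product(*decision_points.values())
    ]


def automation_pays_off(setup_factor: float, expected_runs: int) -> bool:
    """Rough break-even check for automating a test.

    setup_factor is the effort to automate the test, expressed as a multiple
    of the effort of one manual execution (5 to 10 according to the guideline
    above). Maintenance of the automated test is ignored in this sketch.
    """
    return expected_runs >= setup_factor


# Hypothetical decision points of a loan application flow.
matrix = build_test_matrix({
    "customer_type": ["retail", "corporate"],
    "channel": ["web", "branch"],
    "amount": ["below_limit", "above_limit"],
})
for row in matrix:
    print(row)

# A test run on every commit quickly exceeds the break-even point.
print(automation_pays_off(setup_factor=10, expected_runs=3))
print(automation_pays_off(setup_factor=10, expected_runs=40))
```

With 3 decision points of 2 options each, the matrix contains 2 × 2 × 2 = 8 rows, which immediately shows the areas to cover without writing out 8 detailed test scripts.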
Independent of the chosen approach and methodology, the human aspect should never be underestimated. Software engineers who have a feeling of ownership, commitment and a sense of urgency will deliver better software quality at a faster pace, which ultimately leads to more motivation and further reinforces the feeling of vibrancy around the company. This (r)evolution will however only happen when companies allow themselves to build imperfect systems (embrace and reward initiative and the failure that comes with it).
Check out all my blogs on https://bankloch.blogspot.com/
