Technical Debt: A Cautionary Tale!

Most people understand what debt is, but when I ask attendees about their understanding of technical debt during awareness sessions I often get blank faces. Technical debt is simply not something many business people understand. Part of the reason for writing this article is that a failure to comprehend and manage the technical debt held over after IT projects are closed is often a major root cause of information security breaches. The other reason is that I believe it’s a major reason why IT projects fail spectacularly in general. In this article, we’ll look at what technical debt is by going through a case study, why technical debt can be both good and bad, and what can be done to prevent catastrophic failures down the line when the project team have long disappeared…

Are you sitting comfortably? Then let us begin…

Technical debt is essentially the difference between what you originally needed in an IT system and what gets delivered at the end of the project. It’s the cost of compromise, the cost of short cuts, the cost of not putting enough money in upfront. Another way to describe it would be paying for some of the IT system’s functionality with a low-interest long-term loan and putting the rest on a high-interest credit card. Not making sense yet? Let’s work through an example.

Imagine an organisation is looking to replace its old in-house MySQL Customer Relationship Management (CRM) system with a modern CRM solution. The CIO has a general idea of what the business wants (but doesn’t write that down anywhere). The CIO may do some research (but it’s limited to a cursory Google search). They may even ask some peers what they are using (but not how they use their system or how much they paid). The CIO may even rely on a Gartner Magic Quadrant (but doesn’t read the detail). Eventually, after a couple of hours, this CIO settles on a cloud-based CRM solution offered by a vendor who gave them some tickets for that sold-out rugby final last month. After all, if they are that good at building client relationships, their product must be equally awesome! In fact, the vendor is so good at customer service they offer to project manage the whole implementation for the CIO too, incorporating those expensive setup costs into a monthly fee over 3 years. As the CIO’s organisation only uses an arbitrary monetary trigger for governance, these amortised costs fall below the minimum threshold to report the project to any of the governance committees. The CIO is ‘legitimately’ able to proceed without triggering any governance gateways…including information security and legal…which has massively sped up the CIO’s delivery time! The CIO signs a contract and project delivery commences!

Whilst the vendors are implementing the product, they explain to the CIO that they need Port 22 (SSH) on the cloud-based CRM servers open to the Internet: a lot of their engineers work remotely, so to support the CRM they need to be able to connect from anywhere. The vendor also needs to connect to the current CRM via a secure API to migrate the data. The CIO knows that if the vendor connects to the current DB via an API, information security will have to be informed, and that will open the floodgates for due-diligence work. As the CIO has already signed the contract and the business is expecting the CRM to be delivered on time, the CIO unilaterally agrees to the SSH connection request. To avoid needing a change request endorsed by InfoSec to open the corporate firewall to the vendor’s API, the CIO instead instructs one of the DBAs to send the vendor a CSV file containing all the customer data. The DBA complies, as the CIO is their boss and it’s coming up to bonus season. InfoSec is not aware of this environment, and the DBA used their legitimate access to export the data.

The main CRM databases are hosted on AWS. The vendor has deployed this solution ‘a thousand times before’ so there is little chance of anything going wrong. The vendor even has a CloudFormation template to make sure each deployment is consistent. This time, however, things don’t go to plan. Because of Brexit concerns, the CIO did have the foresight to ask that the CRM be deployed in the London region. Because the London region is relatively new, it doesn’t have the same features as the Ohio region assumed by the standard template. Without telling anyone else, the vendor engineer deletes parts of the CloudFormation template to get it to work. The engineer also creates some new AWS roles, security groups and an S3 bucket for storing the customer data CSV file ready for upload into the CRM. These manual changes are not recorded, nor is the cookie-cutter vendor documentation amended to reflect the new configuration. The CSV file is also used to populate the Dev and UAT environments in addition to the Production environment. This decision is taken by the CIO to make testing as ‘real’ as possible to ‘avoid problems down the line’. Once the data has been successfully uploaded into the CRM, the CSV is not deleted from the publicly accessible S3 bucket. When mapping the old data into the new CRM, the engineer takes a unilateral decision to use the customers’ email addresses as a primary key; after all, this is unique to each customer.

The CRM reaches the User Acceptance Testing (UAT) stage. The business units are provided with a test script by the vendor which doesn’t include tests for around 60% of the functionality that they need in the tool. Not understanding the implications of these omissions, and assuming the other functionality must have been tested by someone else, they dutifully complete the incomplete tests. Each business unit head signs off the tests as complete. The CRM gets approved to deploy into production with, so far, no involvement by information security or legal at all. But it’s ok as delivery is going even quicker than expected!

It’s in! The CRM tool goes live and the business units start to interact with the production environment during early life support. The business quickly realises that key functionality is still missing and starts to raise issues. The CIO goes back to the vendor who, having been so helpful before the contract was signed, is now far less amenable (they want to move on to the next project). “This is new functionality,” they say, “this will need a change request and will come at a cost.” Realising they are now stuck and have no further budget, the CIO asks their DBA to bodge a workaround. This bodge involves exporting data from the CRM into a CSV file on the S3 bucket and then into a MySQL database hosted elsewhere in AWS. Some manipulation is conducted within the MySQL database and then the processed data is uploaded back into the CRM via another CSV file. The business units have no idea this workaround exists; they are just happy the absent functionality is now available. Early life support ends and everyone gives the CIO a big pat on the back for getting this CRM in and working in record time. The DBA gets a bigger bonus for being especially helpful.

Because of the CIO’s newly acquired reputation for swift delivery, they are headhunted to another role outside the company. The business units in the CIO’s old company carry on, completely unaware of the risk they now hold. The CRM becomes integral to a number of revenue-generating projects: money is being made! 12 months after the CIO leaves, the organisation is subject to a major cyber-attack. The entire customer database is found on the deep web, including customers’ credit card information. The main CRM doesn’t appear to have been breached but it is clear the data originated from the CRM. In the aftermath, the new CIO finds out about all the workarounds from the DBA and so, too, does the Regulator, who slams the organisation with a major GDPR fine. All the gains made from the investment in the new CRM have been completely wiped out. In a similar way to the financial services institutions that went under during the Global Financial Crisis (GFC), the organisation was completely unaware it had been holding a material technical debt until it was too late.

Pay Day Loan: Straight Talking Money

To sum up the technical debt, the CIO borrowed money from the organisation to deliver a project at break-neck speed. What money did they borrow? The CIO borrowed additional budget from the organisation’s financial reserves, set aside for crises and emergencies, to pay for this project. When factoring in the fines and remediation action, the total cost of ownership for the new CRM tool is now several orders of magnitude over what the cost of implementing the new CRM in a controlled, well-governed manner would have been. The CIO has effectively bought the new CRM using a payday loan. When the organisation was fined, that loan was called in.

The business units also took on technical debt. They borrowed money against future profits and future system capability to get a system in quickly rather than paying for a better system upfront. They did this by failing to engage effectively with governance and oversight bodies such as information security, operational risk, legal and compliance. Sure, engaging may have taken longer, and sure, it may have meant the costs would be higher in the short term. Overall, though, the total cost of ownership would have been lower and the actual return on investment would have been materially higher!

The Good Technical Debt, The Bad…and the Grey…

“But most projects have ‘fudges’ to get them into production…right?”

When implementing new systems it is accepted that an organisation is highly unlikely to get everything it wants in a system within a limited budget so (unless you have limitless funds) there is always going to be a modicum of technical debt. Holding an amount of technical debt is not necessarily a bad thing. Like normal debt, there can be good technical debt and bad technical debt. As Martin Lewis of moneysavingexpert.com explains of normal debt, there is also ‘grey’ debt, and the same applies to technical debt. To paraphrase Mr Lewis: good debt is planned, it’s budgeted for, it’s affordable and, most importantly, it’s thought through. Conversely, bad debt is not planned, it’s taken on impulse, it’s not budgeted for, and it’s not affordable. Grey debt is where there are competing commitments in terms of benefit and risk and so a decision to take on such a debt essentially depends on risk appetite.

An example of grey technical debt in the case study could have been the costs associated with vendor engineer access to the system. The CIO had a choice: create a more controlled workaround for engineers wanting to connect to the environment from anywhere (good debt), or just allow access as and when, accepting that this exposed the organisation unnecessarily to an external attack on the CRM. Instead of full access, a CIS-hardened bastion host could be created which would allow engineers to connect into the AWS environment securely. This could then be combined with break-glass AD accounts for support which are disabled by default. The good technical debt comes in the form of (slightly) increased running costs and additional administration steps, but the technical debt burden is significantly lower than in the original scenario. As you can see, it’s thought through, it’s affordable and it’s planned.
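The difference between the two access options can be checked mechanically. The sketch below is illustrative only: the rule dictionaries are a simplified, hypothetical stand-in for firewall or security group ingress rules (not a real AWS API response), and the bastion address is taken from a documentation-only IP range. It flags any rule that exposes Port 22 to the whole Internet.

```python
# Hypothetical, simplified ingress rules; field names are assumptions
# made for this sketch, not a real cloud provider's schema.
WORLD_CIDRS = {"0.0.0.0/0", "::/0"}  # "anywhere" for IPv4 and IPv6
SSH_PORT = 22


def ssh_open_to_world(ingress_rules):
    """Return the rules that allow SSH from any address on the Internet."""
    findings = []
    for rule in ingress_rules:
        covers_ssh = rule["from_port"] <= SSH_PORT <= rule["to_port"]
        if covers_ssh and rule["cidr"] in WORLD_CIDRS:
            findings.append(rule)
    return findings


# What the CIO agreed to in the case study: SSH open to everyone.
case_study_rules = [{"from_port": 22, "to_port": 22, "cidr": "0.0.0.0/0"}]

# The bastion alternative: SSH only from one hardened host
# (203.0.113.10 is a documentation-only example address).
bastion_rules = [{"from_port": 22, "to_port": 22, "cidr": "203.0.113.10/32"}]

print(len(ssh_open_to_world(case_study_rules)), "exposed rule(s) in the case study")
print(len(ssh_open_to_world(bastion_rules)), "exposed rule(s) with a bastion host")
```

A check like this does not replace a proper security review, but it shows how cheaply the most obvious exposure in the case study could have been spotted.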

What the CIO did in the case study was outright bad debt. The CIO made decisions which were clearly not thought through or budgeted for. The CIO didn’t consciously comprehend the debt they were taking on, making decisions based on their unconscious attitude to risk instead of using the organisation’s risk appetite. This CIO actively chose a path that circumvented the organisation’s governance and oversight controls. One could also argue that the organisation’s risk management framework was broken because it gave the CIO the flexibility to unilaterally take on this debt without any appropriate governance or oversight.

What can an organisation do to understand their technical debt?

The first step is to ensure the Board and senior management teams understand what technical debt is and how much Project Sponsors are borrowing from their organisation in terms of technical debt on top of the officially approved budget. Organisations must formally incorporate a mechanism to estimate technical debt in financial terms and this debt must be signed off as being within the organisation’s risk appetite. There are many ways to do this and, of course, we at Fox Red Risk can help calculate how much technical debt a project is holding at any stage from inception through to the product or service end of life.

Once an organisation understands technical debt, the next step (and I would encourage this to be implemented as soon as practicable) is to ensure project governance and oversight is triggered by more than just arbitrary financial levels (e.g. if project spend is more than £50k, Information Security must be involved). IT projects can now be delivered using commoditised services, which means your existing financial triggers may never be breached. This could allow hundreds of projects to be delivered without any governance and oversight at all. Consider triggers as straightforward as “Does this project involve personal data?” or “Will any data be processed in an environment outside the corporate network?” or “Will any data be processed by non-employees?” or “Will any third party be provided access to the corporate network, either temporarily or permanently?”. If the answer to any of these questions is yes, governance bodies must be involved. If the answer to all of the questions is no, governance bodies must still be informed.
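To make the idea concrete, the triggers above can be sketched as a simple intake check. This is a minimal illustration, not a prescribed implementation: the class, field and function names are hypothetical, and the spend figure used for the CRM project is invented for the example. The £50k threshold is the example figure from the text.

```python
from dataclasses import dataclass

SPEND_THRESHOLD_GBP = 50_000  # the example financial trigger from the text


@dataclass
class ProjectProfile:
    """Hypothetical intake answers for a proposed project."""
    spend_gbp: float
    involves_personal_data: bool
    data_outside_corporate_network: bool
    data_processed_by_non_employees: bool
    third_party_network_access: bool


def governance_action(project: ProjectProfile) -> str:
    """Return 'involve' if any trigger fires, otherwise 'inform'.

    There is deliberately no 'do nothing' outcome: governance bodies
    are informed even when every answer is no.
    """
    triggers = [
        project.spend_gbp > SPEND_THRESHOLD_GBP,
        project.involves_personal_data,
        project.data_outside_corporate_network,
        project.data_processed_by_non_employees,
        project.third_party_network_access,
    ]
    return "involve" if any(triggers) else "inform"


# The case-study CRM: amortised spend (an invented figure) sits under
# the financial threshold, but the data-related triggers still fire.
crm_project = ProjectProfile(
    spend_gbp=30_000,
    involves_personal_data=True,
    data_outside_corporate_network=True,
    data_processed_by_non_employees=True,
    third_party_network_access=False,
)
print(governance_action(crm_project))
```

Under a purely financial trigger the CRM project would have slipped through; with the additional questions it is caught on the first non-financial answer.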

Ultimately, there should never be a technology or data-related project where information security has not been informed prior to inception. If more organisations understood the true amount of technical debt they held and enforced good governance, a significantly greater number of projects would be successful and far fewer fines would be issued.

About Fox Red Risk

Fox Red Risk is a boutique data protection and cybersecurity consultancy and Managed Security Service Provider which, amongst other things, helps client organisations with implementing control frameworks for resilience, data protection and information security risk management. Call us on 020 8242 6047 or contact us via the website to discuss your needs.
