Data is Risky Business: A Wicked Problem This Way Comes

In an article posted by The Data Administration Newsletter on September 6, 2023, Daragh O’Brien writes that “a recent data security incident in the Police Service of Northern Ireland (PSNI) got me thinking about the idea of wicked problems and data. The data security incident was the disclosure of the names, ranks, and job assignments of every officer and civilian support staff member in the PSNI. This happened due to ‘human error’ (according to the official response).”

But, adds O’Brien, “a human error is often the end result of a series of other systemic issues that need to be considered to help us identify the key root cause, or causes, that can help prevent the error and improve the system. This can often lead us to having to think about wicked problems in data and the people-side of data.”

A “wicked problem” is a complex social or cultural challenge that is difficult or impossible to solve due to incomplete or contradictory knowledge, the number of stakeholders (and their varying opinions), the economic cost of a solution, or the interconnectedness of one problem with other problems. According to O’Brien, “a key trait of a wicked problem is that it is often difficult to get a definitive scope or formulation of the problem, so it is often hard to know when to declare ‘Mission Accomplished’.”

The governance and management of data in many organizations represents a “wicked problem,” he says. “There are many stakeholders. There is often incomplete knowledge, or ‘silos’ leading to contradictory knowledge or perspectives. Often turning over one data rock to solve Problem A reveals something else. And how we scope the problem we’re solving can often lead to us declaring mission accomplished at the wrong time or for the wrong mission.”

While the PSNI incident is the largest wicked problem he has come across, “the last few days have seen other UK police forces admitting to data security breaches that are very similar. In each case, a common element is the use of Excel to present statistical data in response to an FOI query.”

Many people are asking how this happened: how did secure and confidential data get disclosed in an Excel spreadsheet published on a public website? O’Brien believes that a better question to start with, if we want to get the shape and measure of the ‘Wicked Problem’, is WHY.

“How did people unhide hidden tabs in a pivot table spreadsheet?” is the kind of how question O’Brien has in mind, “because, let’s face it, a key cause here is that statistical data seems to have been prepared in pivot tables in Excel.” That question, he says, “might lead us down a path to a solution of ‘we will only publish data in PDF form from now on.’ And that answer might not actually be a solution because of other stakeholder requirements and needs – for example there might be a legal barrier to using locked down PDF files to publish the data.”

Indeed, he notes, “the PSNI’s decision to only use PDF format to publish FOI responses is likely to be in breach of the UK’s Freedom of Information Act 2000. Wicked problems have many stakeholders and angles to consider.”

The use of Excel by police forces to respond to FOI requests “is almost certainly a result of decisions and choices that have been made historically about the design, storage, and handling of data. These ‘end-of-food chain’ processes often have to collate data from multiple sources to provide answers to queries, and they are often operating to legally defined deadlines. But often they fall foul of the technical debt and data debt elsewhere in the organization, usually because the requirements of safe, secure, and repeatable data reporting processes outside the organization haven’t been considered. After all, it’s only FOI request handling. What’s the worst that could happen?”

Perhaps using a business intelligence tool like PowerBI or Tableau would be a solution? But, notes O’Brien, “that then raises the question of licence costs and training costs (after all, our staff might know Excel but they probably won’t know PowerBI or Tableau). And it still might not stop the risk of unauthorized disclosure of data if staff are still able to include the raw data in the output reports (that will inevitably go to Excel as a publication format).”

So, he suggests, “perhaps there needs to be consideration given to how a reporting layer can be exposed in the organization that doesn’t provide access to the raw data to humble FOI request handlers? What if the queries related to something the organization might need to know on a regular basis? Like how many staff were employed at what grade? A simple solution would be to create materialized views or stored procedures in the database that could produce those aggregated facts without exposing the source data. But that might not have been considered in the design of the source systems. That means budget will need to be found and, potentially, someone very senior will have to admit they signed off on not delivering that kind of security-conscious reporting requirement. And, more often than not, that is not going to be simple.”
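To make the reporting-layer idea concrete, here is a minimal sketch of an aggregated view. It is an illustration under stated assumptions, not a description of any police force's systems: the `staff` table, its columns, the sample rows, and the view name `staff_count_by_grade` are all hypothetical. It uses Python's built-in sqlite3 module, and because SQLite has no materialized views or stored procedures, an ordinary view stands in for them. SQLite also has no user permissions, so in a production database the request handler's account would additionally need to be granted access to the view only, not to the underlying table.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical staff table
    and an aggregate-only view on top of it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical source table holding identifying details.
        CREATE TABLE staff (
            staff_id INTEGER PRIMARY KEY,
            surname  TEXT NOT NULL,
            grade    TEXT NOT NULL,
            unit     TEXT NOT NULL
        );

        -- Made-up sample rows, for illustration only.
        INSERT INTO staff (surname, grade, unit) VALUES
            ('Example-A', 'Constable', 'Unit 1'),
            ('Example-B', 'Constable', 'Unit 2'),
            ('Example-C', 'Sergeant',  'Unit 1');

        -- Reporting layer: exposes counts per grade and nothing else.
        -- A plain view stands in for a materialized view here.
        CREATE VIEW staff_count_by_grade AS
            SELECT grade, COUNT(*) AS headcount
            FROM staff
            GROUP BY grade;
        """
    )
    return conn


def headcount_by_grade(conn: sqlite3.Connection) -> dict:
    """Answer 'how many staff at what grade?' from the view alone."""
    rows = conn.execute(
        "SELECT grade, headcount FROM staff_count_by_grade ORDER BY grade"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    connection = build_demo_db()
    for grade, headcount in headcount_by_grade(connection).items():
        print(f"{grade}: {headcount}")
```

The point of the sketch is that the report is built from a query that never returns names or assignments, so there is no raw data to leave behind in a hidden tab of the published file.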

For more of O’Brien’s advice, see “Data is Risky Business: A Wicked Problem This Way Comes.”