This is part 2 of 3 in a series on Big Data and crime prevention. This part considers whether using Big Data in this way is responsible for unintended bias and prejudice in both law enforcement and the judicial system.

Whilst Big Data has proven itself to be an immense asset in crime prevention, unwanted consequences have arisen. Increasingly, individuals from certain racial and socio-economic backgrounds are being unfairly targeted. Profiling has long been an issue in the prevention and prosecution of crime, but it was always the result of human preconception. Now, however, it appears that machines can also make biased or unfair decisions.

Locating the source of this bias is not an easy task: it stems from a combination of problematic data and problematic analysis of that data by machines.

Problematic Data 

Issues in the data presented to decision-making machines can cause those machines to reach unintended or biased conclusions.

Data sets used for crime prevention often present what is known as a ‘signalling problem’. This occurs when certain communities provide either too little or too much data, leaving either a significant gap or an overflow in the data set. Bias in the data can also stem from existing human practices: for example, studies in the US show that, historically, prejudice in law enforcement meant that certain racial groups were more likely to be imprisoned for drug-related crimes. The high rate of imprisonment of these individuals has skewed criminal data. When analysed for crime prevention, this skewed data alerts police to a ‘hotspot’ of crime, often in a community populated by a specific racial group. Local police then allocate more resources to patrolling that area, increasing the probability of an arrest. Because of these ‘hotspot’ predictions, individuals’ transgressions in those regions are caught almost every time, no matter how big or small they may be.
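
To make this feedback loop concrete, here is a minimal simulation sketch in Python. All of the numbers are hypothetical assumptions for illustration, not figures from any real policing system: two areas offend at exactly the same true rate, patrols follow previously recorded crime, and an offence is only recorded where a patrol is present to catch it.

```python
import random

# A minimal, purely illustrative simulation of the feedback loop described
# above. All numbers are hypothetical. Both areas have the SAME underlying
# offence rate, but patrols are allocated in proportion to previously
# recorded crime, and an offence is only recorded if a patrol catches it.

random.seed(1)

TRUE_OFFENCES_PER_ROUND = 500        # identical in both areas
ROUNDS = 20

# Historically skewed records (e.g. from past over-policing of area A).
recorded = {"area_a": 700, "area_b": 300}

for _ in range(ROUNDS):
    total_recorded = sum(recorded.values())
    snapshot = list(recorded.items())             # shares before this round
    for area, count in snapshot:
        patrol_share = count / total_recorded     # the 'hotspot' gets more patrols
        detection_prob = 0.9 * patrol_share       # more patrols, more offences caught
        caught = sum(random.random() < detection_prob
                     for _ in range(TRUE_OFFENCES_PER_ROUND))
        recorded[area] += caught

share_a = recorded["area_a"] / sum(recorded.values())
print(f"Share of recorded crime attributed to area A: {share_a:.0%}")
# Despite identical true offence rates, the historical skew is never washed
# out: the data keeps 'confirming' that area A is the hotspot.
```

Under these assumptions the historically over-policed area continues to account for an inflated share of recorded crime indefinitely, because the data used to direct patrols is itself a product of where patrols were sent before.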

Consider this – how many times does an average driver run a red light or speed in their lifetime? How many times are they caught? Statistically, driving offences are discovered only 30% of the time. On the other hand, if you were to drive through an area under constant observation, your odds of being apprehended might jump to over 75%.
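
Using the figures above (roughly 30% versus over 75%), a quick sketch shows how observation alone changes what the data records. The number of offences per driver is an assumption made purely for illustration.

```python
# Hypothetical illustration of the detection rates quoted above: two drivers
# commit the same number of offences, but one drives through a heavily
# observed 'hotspot' area. The offence count is assumed for illustration.

offences_per_year = 20

typical_detection = 0.30     # offences discovered roughly 30% of the time
hotspot_detection = 0.75     # 'over 75%' under constant observation

print("Expected recorded offences, typical area:",
      offences_per_year * typical_detection)      # 6.0
print("Expected recorded offences, observed area:",
      offences_per_year * hotspot_detection)      # 15.0

# Identical behaviour, yet the observed driver's record looks 2.5 times worse.
```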

In the US, judges often rely on recidivism models when determining sentencing. These models may forecast certain groups as more likely to reoffend, leading to harsher criminal sentences. It has been argued that big data extrapolations have overshadowed a judge’s consideration of a suspect’s personal background and circumstances. Recently, a data analysis platform in the US mistakenly assigned exaggerated recidivism risk scores to an area’s black residents. Examples like this suggest caution should be exercised, as recidivism models have not been found to be wholly accurate. The regular presence of false positives and false negatives distorts predictions that have a profound impact on human lives and further exacerbates existing problems in the data.
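
A small sketch can show why those false positives matter. The outcome counts and risk labels below are entirely hypothetical and are not drawn from any real tool; the point is simply that the share of non-reoffenders wrongly flagged as ‘high risk’ can differ sharply between groups.

```python
# Hypothetical example: a risk tool can flag very different shares of
# non-reoffenders as 'high risk' in different groups. All counts below are
# invented purely for illustration.

def false_positive_rate(records):
    """Share of people who did NOT reoffend but were labelled high risk."""
    non_reoffenders = [r for r in records if not r["reoffended"]]
    flagged = [r for r in non_reoffenders if r["high_risk"]]
    return len(flagged) / len(non_reoffenders)

group_a = (
    [{"high_risk": True,  "reoffended": True}]  * 40    # correctly flagged
    + [{"high_risk": True,  "reoffended": False}] * 45  # false positives
    + [{"high_risk": False, "reoffended": False}] * 55
    + [{"high_risk": False, "reoffended": True}]  * 10  # false negatives
)
group_b = (
    [{"high_risk": True,  "reoffended": True}]  * 40
    + [{"high_risk": True,  "reoffended": False}] * 20  # far fewer false positives
    + [{"high_risk": False, "reoffended": False}] * 80
    + [{"high_risk": False, "reoffended": True}]  * 10
)

print(f"Group A false positive rate: {false_positive_rate(group_a):.0%}")  # 45%
print(f"Group B false positive rate: {false_positive_rate(group_b):.0%}")  # 20%

# People in group A who would never have reoffended are more than twice as
# likely to be labelled 'high risk' and so face harsher treatment.
```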

Problematic Machines

Whether for prevention, prosecution or parole, Big Data is being used extensively to inform decisions, but who, or what, is actually making these decisions? The very nature of Big Data means that technology is central to this analysis. Machine learning, artificial intelligence and neural networks are all being implemented by law enforcement and the judiciary to assist in, or replace, the decision-making process.

Prejudicial outcomes can be attributed to this automated decision-making, as machines are only as effective as their programming, what they are taught and what they learn – all of which can be influenced and biased. With this in mind, would transparency, and a means of determining how decisions are reached, help to root out and prevent incorrect outcomes?

This theory was tested in the US, where courts and prisons increasingly use machines to determine a defendant’s ‘risk’. This ‘risk’ can then be used to estimate the likelihood of recidivism, or even the probability that the defendant will not appear for their court date. Increasingly, this analysis and automated decision-making is being used to inform bail, sentencing and parole decisions. The machines making these decisions are generally developed by private corporations, meaning the technology and underlying algorithms are proprietary; the ability to understand how the software works is therefore very limited. This lack of transparency paints a worrying picture of an offender being unable to appeal a decision because the decision-making process cannot be clearly determined or is protected by intellectual property law.
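
To picture what the missing transparency could look like, here is a deliberately simple, hypothetical scoring model. It is not any vendor’s actual algorithm; it only illustrates the difference between a score whose inputs can be itemised and contested, and an opaque score delivered on its own.

```python
# A deliberately simple, hypothetical risk score: a weighted sum whose inputs
# and weights can be itemised, audited and challenged. This is NOT any
# vendor's actual algorithm; real proprietary tools expose only the score.

WEIGHTS = {                          # assumed weights, for illustration only
    "prior_arrests": 0.6,
    "age_under_25": 0.8,
    "failed_to_appear_before": 1.2,
}

def risk_score(defendant: dict) -> float:
    return sum(WEIGHTS[k] * defendant.get(k, 0) for k in WEIGHTS)

def explain(defendant: dict) -> None:
    """Print how much each input contributed to the final score."""
    for k in WEIGHTS:
        print(f"  {k}: {WEIGHTS[k] * defendant.get(k, 0):+.2f}")
    print(f"  total risk score: {risk_score(defendant):.2f}")

explain({"prior_arrests": 2, "age_under_25": 1, "failed_to_appear_before": 0})

# With an opaque commercial tool, only the final number is visible; none of
# the contributions above can be examined or contested by the defendant.
```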

These concerns were put to the test in the US case of Wisconsin v. Loomis, where defendant Eric Loomis was found guilty for his role in a drive-by shooting. Loomis answered questions that were entered into a risk assessment tool developed by a private corporation and used by the Wisconsin Department of Corrections. Based in part on the tool’s output, the trial judge handed down a lengthy sentence. Loomis challenged the sentence, arguing it was unfair that he was not permitted to examine the tool’s algorithm and decision-making process. Ruling against Loomis, the state Supreme Court set a concerning precedent by concluding that knowledge of the algorithm’s output was a sufficient level of transparency.

To read part 1 of this series click here