Humans Can’t Watch All the Surveillance Cameras Out There, so Computers Are

Computers equipped with artificial intelligence video analytics software are able to monitor footage in real time, flag unusual activities, and identify faces in crowds.

Look up in most cities and it won’t take long to spot a camera. They’re attached to lampposts, mounted outside shop doors, and stationed on the corner of nearly every building. They’re mounted to the dashboard in police cars. Whether you even notice these things anymore, you know you’re constantly being filmed. But you might also assume that all the resulting footage is just sitting in a database, unviewed unless a crime is committed in the area of the camera. What human could watch all that video?

Most of the time, a human isn’t watching all the footage – but, increasingly, software is. Computers equipped with artificial intelligence video analytics software are able to monitor footage in real time, flag unusual activities, and identify faces in crowds. These machines don’t get tired, and they don’t need to be paid.

And this isn’t at all hypothetical. In New York City, the police department partnered with Microsoft in 2013 to network at least 6,000 surveillance cameras to its Domain Awareness System, which can scan video footage from across the city and flag signs of suspicious activity that the police program it to look for, like a person’s clothing or hair color. Last October, Immigration and Customs Enforcement put out a request for vendors to provide a system that can scan videos for “triggers,” like photographs of people’s faces. For retail stores, video analytics startups are currently marketing the ability to spot potential shoplifters using facial recognition or body language alone.

New York City’s police department partnered with Microsoft in 2013 to network at least 6,000 surveillance cameras to its Domain Awareness System. Photo: Reuters

Rising demand

Demand is expected to increase. In 2018, the AI video analytics market was estimated to already be worth more than $3.2 billion, a value that’s projected to balloon to more than $8.5 billion in the next four years, according to research firm Markets and Markets. “Video is data. Data is valuable. And the value in video data is about to become accessible,” says Jay Stanley, a senior policy analyst at the American Civil Liberties Union who authored a report released earlier this month about video analytics technologies and the risks they pose when deployed without policies to prevent new forms of surveillance. For decades, surveillance cameras have produced more data than anyone’s been able to make use of, which is why video analytics systems are so appealing to police and data miners hoping to make use of all the footage that’s long been collected and left unanalysed.

One of the problems with this, Stanley writes, is that some of the purported uses of video analytics, like the ability to recognise someone’s emotional state, aren’t well-tested and are potentially bogus, but could still usher in new types of surveillance and ways to categorise people.

One company described in the report is Affectiva, which markets an array of cameras and sensors for ride-share and private car companies to put inside their vehicles. Affectiva claims its product “unobtrusively measures, in real time, complex and nuanced emotional and cognitive states from face and voice.”

Another company, Noldus, claims its emotion detection software can read up to six different facial expressions and their intensity levels – the company even has a product for classifying the emotional states of infants.

In 2014, when the Olympics were hosted in Sochi, Russia, officials deployed live video analytics from a company called VibraImage, which claimed to be able to read signs of agitation from faces in the crowd in order to detect potential security threats.

While this technology is primarily marketed in the US for commercial uses, such as letting brands detect whether a customer is reacting positively to an ad campaign, it’s not a stretch to imagine that law enforcement may ask for emotion recognition in the future.


Yet the idea that emotions manifest in fixed, observable states that can be categorised and labeled across diverse individuals has been challenged by psychologists and anthropologists alike, according to research published last year by the AI Now Institute.

Even when the applications offered by video analytics technologies are well tested, that doesn’t mean they’ve been found to work. One of the most popular uses of video analytics is the ability to identify a person captured in a moving image, commonly called facial recognition. Amazon has been working with the FBI and multiple police departments across the country to deploy its facial recognition software, Rekognition, which claims to be able to “process millions of photos a day” and to identify up to 100 people in a single image – a valuable tool for surveillance of large crowds, like at protests, in crowded department stores, or in subway stations.

The problem with law enforcement using this tech is that it doesn’t work that well, especially for people with darker skin tones. An MIT study released earlier this year found that Rekognition misidentified darker-skinned women as men 31% of the time, yet made no mistakes for lighter-skinned men. The potential for injustice here is real: If police decide to approach someone based on data retrieved from facial recognition software and the software is wrong, the result could lead to unwarranted questioning or, worse, the misapplied use of force. Amazon said in a statement to the New York Times that it offers clear guidelines on how law enforcement can review possible false matches and that the company has not seen its software used “to infringe on citizens’ civil liberties.”

Policies to provide oversight and curb misuse

Though law enforcement and private businesses are already experimenting with video analytics systems, Stanley says the technology isn’t so ubiquitous yet that it’s too late to put policies in place to provide oversight, curb misuse, and in some cases, even prevent deployment. In May, for example, San Francisco passed a law banning the use of facial recognition by law enforcement. And in cities like Nashville, Oakland, and Seattle, officials are required to hold public meetings before adopting any new surveillance technologies. These policies are promising, but they’re nowhere near commonplace across the country.

Law enforcement and private businesses are already experimenting with video analytics systems. Photo: Flickr/Sheila Scarborough CC BY 2.0

A good first step to creating regulations around how video analytics are used by law enforcement, according to Ángel Díaz, a technology and civil rights lawyer at the Brennan Center for Justice, is requiring local police departments and public agencies to be transparent about the surveillance technologies they use. “Once you have mandated disclosure, it’s possible for the public to come in and have experts testify and have a public forum to discuss how these systems work and whether we want to be deploying them at all in the first place,” says Díaz.

One way to jump-start that conversation, Díaz says, is for people to call on inspector general offices at police departments to audit their local police department’s use of video analytics and other surveillance technologies. Once it’s clear what types of technologies are being used, it’s easier to create other mechanisms of accountability, like task forces for evaluating automated systems adopted by city agencies to help ensure they’re not reproducing biases.

Once a piece of technology is already in the field, it’s much harder to pass rules that restrict how it’s used, which is why it’s so important for communities concerned about overbroad surveillance technologies to demand transparency and accountability now. We’re already being watched – the question now is whether we’re going to trust computers to do all the watching.

This piece was originally published on Future Tense, a partnership between Slate magazine, Arizona State University, and New America.