Turning technology against human traffickers
Last October, the White House released the National Action Plan to Combat Human Trafficking. The plan was motivated in part by a greater understanding of the pervasiveness of the crime. In 2019, 11,500 situations of human trafficking in the United States were identified through the National Human Trafficking Hotline, and the federal government estimates there are nearly 25 million victims globally.
This increasing awareness has also motivated MIT Lincoln Laboratory, a federally funded research and development center, to harness its technological expertise toward combatting human trafficking.
In recent years, researchers in the Humanitarian Assistance and Disaster Relief Systems Group have met with federal, state, and local agencies, nongovernmental organizations, and technology companies to understand the challenges in identifying, investigating, and prosecuting trafficking cases. In 2019, the team compiled their findings and 29 targeted technology recommendations into a roadmap for the federal government. This roadmap informed the Department of Homeland Security’s counter-trafficking strategy, released in 2020.
"Traffickers are using technology to gain efficiencies of scale, from online commercial sex marketplaces to complex internet-driven money laundering, and we must also leverage technology to counter them," says Matthew Daggett, who is leading this research at the Laboratory.
In July, Daggett testified at a congressional hearing about many of the current technology gaps and made several policy recommendations on the role of technology in countering trafficking. "Taking advantage of digital evidence can be overwhelming for investigators. There's not a lot of technology out there to pull it all together, and while there are pockets of tech activity, we see a lot of duplication of effort because this work is siloed across the community," he adds.
Breaking down these silos has been part of Daggett's goal. Most recently, he brought together almost 200 practitioners from 85 federal and state agencies, nongovernmental organizations, universities, and companies for the Counter–Human Trafficking Technology Workshop at Lincoln Laboratory. This first-of-its-kind virtual event prompted discussions of how technology is used today, where gaps exist, and what opportunities exist for new partnerships.
The workshop was also an opportunity for Laboratory researchers to present several advanced tools in development. "The goal is to come up with sustainable ways to partner on transitioning these prototypes out into the field," Daggett says.
Uncovering networks
One of the most mature capabilities at the Laboratory in countering human trafficking deals with the challenge of discovering large-scale, organized trafficking networks.
"We cannot just disrupt pieces of an organized network, because many networks recover easily. We need to uncover the entirety of the network and disrupt it as a whole," says Lin Li, a researcher in the Laboratory’s Artificial Intelligence Technology Group.
To help investigators do that, Li has been developing machine learning algorithms that automatically analyze online commercial sex ads to reveal whether they are likely associated with human trafficking activities and whether they belong to the same organization.
This task may have been easier only a few years ago, when a large percentage of trafficking-linked activities were advertised, and reported, from listings on Backpage.com. Backpage was the second largest classified ad listing service in the United States after Craigslist, and was seized in 2018 by a multi-agency federal investigation. A slew of new advertising sites has since appeared in its wake. "Now we have a very decentralized distributed information source, where people are cross posting on many web pages," Li says. Traffickers are also becoming more security-aware, Li says, often using burner cellular or internet phones that make it difficult to use "hard" links such as phone numbers to uncover organized crime.
So, the researchers have instead been leveraging "soft" indicators of organized activity, such as semantic similarities in the ad descriptions. They use natural language processing to extract unique phrases in content to create ad templates, and then find matches for those templates across hundreds of thousands of ads from multiple websites.
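The template idea can be illustrated with a minimal sketch. It is not the Laboratory's algorithm: it swaps in a much simpler technique, comparing ads by their overlapping three-word phrases (Jaccard similarity) and grouping ads that exceed a threshold. The function names, the threshold of 0.4, and the sample ad text are all hypothetical and chosen only for illustration.

```python
import re
from itertools import combinations


def shingles(text, n=3):
    """Return the set of word n-grams in a lowercased ad description."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Share of n-grams the two sets have in common (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_by_template(ads, threshold=0.4):
    """Group ad ids whose text overlaps enough to suggest a shared template.

    `ads` maps a hypothetical ad id to its description. Groups are merged
    with a small union-find, so similarity chains (A~B, B~C) end up together.
    """
    parent = {ad_id: ad_id for ad_id in ads}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    sets = {ad_id: shingles(text) for ad_id, text in ads.items()}
    for a, b in combinations(ads, 2):
        if jaccard(sets[a], sets[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for ad_id in ads:
        groups.setdefault(find(ad_id), set()).add(ad_id)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Invented placeholder text, not real ad content.
sample_ads = {
    "a1": "new in town sweet and friendly available all day call now",
    "a2": "new in town sweet and friendly available all night call now",
    "a3": "used bicycle for sale great condition pick up downtown",
}
print(group_by_template(sample_ads))
```

Comparing every pair of ads is quadratic, so a corpus of hundreds of thousands of ads would need an indexing scheme rather than this brute-force loop.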
"We've learned that each organization can have multiple templates that they use when they post their ads, and each template is more or less unique to the organization. By template matching, we essentially have an organization-discovery algorithm," Li says.
In this analysis process, the system also ranks the likelihood of an ad being associated with human trafficking. By definition, human trafficking involves compelling individuals to provide service or labor through the use of force, fraud, or coercion — and does not apply to all commercial sex work. The team trained a language model to learn terms related to race, age, and other marketplace vernacular in the context of the ad that may be indicative of potential trafficking.
To show the impact of this system, Li gives an example scenario in which an ad is reported to law enforcement as being linked to human trafficking. A traditional search to find other ads using the same phone number might yield 600 ads. But by applying template matching, approximately 900 additional ads could be identified, enabling the discovery of previously unassociated phone numbers.
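One way to picture this expansion is as a graph search that starts from the reported ad and follows both "hard" links (shared phone numbers) and "soft" links (shared templates). The sketch below is an assumption about how such a search could be structured, not a description of the Laboratory's system. The record fields `phone` and `template`, the function name, and the sample values are hypothetical.

```python
from collections import deque


def expand_from_seed(seed, ads):
    """Breadth-first search from one ad across shared phones and templates.

    `ads` maps an ad id to a dict with optional "phone" and "template" keys
    (hypothetical field names). Returns the linked ad ids and every phone
    number found among them.
    """
    # Index ads by each link value so neighbors can be looked up directly.
    by_key = {}
    for ad_id, record in ads.items():
        for kind in ("phone", "template"):
            value = record.get(kind)
            if value:
                by_key.setdefault((kind, value), set()).add(ad_id)

    seen = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for kind in ("phone", "template"):
            value = ads[current].get(kind)
            if not value:
                continue
            for neighbor in by_key[(kind, value)]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)

    phones = {ads[a]["phone"] for a in seen if ads[a].get("phone")}
    return seen, phones


# Invented records: "c" shares no phone with the seed but shares a template with "b".
sample = {
    "seed": {"phone": "555-0100", "template": "T1"},
    "b": {"phone": "555-0100", "template": "T2"},
    "c": {"phone": "555-0199", "template": "T2"},
    "d": {"phone": "555-0142", "template": "T9"},
}
linked, phones = expand_from_seed("seed", sample)
print(sorted(linked), sorted(phones))
```

In this toy data, a phone-number search alone would find only the seed and "b"; following the shared template also pulls in "c" and its previously unassociated number.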
"We then map out this network structure, showing links between ad template clusters and their locations. Suddenly, you see a transnational network," Li says. "It could be a very powerful way, starting with one ad, of discovering an organization's entire operation."
Analyzing digital evidence
Once a human trafficking investigation is underway, the process of analyzing evidence to find probable cause for warrants, corroborate victim statements, and build a case for prosecution can be very time- and human-intensive. A case folder might hold thousands of pieces of digital evidence — a conglomeration of business or government records, financial transactions, cell phone data, emails, photographs, social media profiles, audio or video recordings, and more.
"The wide range of data types and formats can make this process challenging. It's hard to understand the interconnectivity of it all and what pieces of evidence hold answers," Daggett says. "What investigators want is a way to search and visualize this data with the same ease they would a Google search."
The system Daggett and his team are prototyping takes all the data contained in an evidence folder and indexes it, extracting the information inside each file into three major buckets — text, imagery, and audio data. These three types of data are then passed through specialized software processes to structure and enrich them, making them more useful for answering investigative questions.
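As a rough illustration of the first step, the sketch below sorts file paths into the three buckets by file extension. This is a deliberately simplified assumption: the article does not say how the prototype classifies files, and a real system would inspect file contents rather than trust extensions. The extension table, the `unsorted` fallback, and the function name are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical extension-to-bucket table; illustrative, not exhaustive.
BUCKET_BY_EXTENSION = {
    ".txt": "text", ".eml": "text", ".csv": "text",
    ".jpg": "imagery", ".jpeg": "imagery", ".png": "imagery",
    ".wav": "audio", ".mp3": "audio",
}


def index_evidence(paths):
    """Sort evidence file paths into text, imagery, and audio buckets.

    Files with unrecognized extensions go to "unsorted" so nothing is
    silently dropped from the case folder.
    """
    index = {"text": [], "imagery": [], "audio": [], "unsorted": []}
    for path in paths:
        suffix = PurePosixPath(path).suffix.lower()
        index[BUCKET_BY_EXTENSION.get(suffix, "unsorted")].append(path)
    return index


print(index_evidence(["case/msgs.txt", "case/IMG_01.JPG", "case/call.wav", "case/notes.xyz"]))
```

Each bucket would then be handed to its own enrichment step, such as text extraction for imagery or transcription for audio.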
The image processor, for example, can recognize and extract text, faces, and objects from images. The processor can then detect near-duplicate images in the evidence, making a link between an image that appears on a sex advertisement and the cell phone that took it, even for images that have been heavily edited or filtered. They are also working on facial recognition algorithms that can identify the unique faces within a set of evidence, model them, and find them elsewhere within the evidence files, under widely different lighting conditions and shooting angles. These techniques are useful for identifying additional victims and corroborating who knows whom.
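Near-duplicate detection is often introduced through perceptual hashing, and the sketch below shows the simplest variant, an average hash. It is a swapped-in textbook technique, not the method the team uses, and it would not survive the heavy edits described above. To stay self-contained, it assumes each image has already been decoded, converted to grayscale, and shrunk to a small grid of brightness values. The function names and the distance threshold are hypothetical.

```python
def average_hash(pixels):
    """Hash a small grayscale grid: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming(a, b):
    """Count the positions where two equal-length hashes differ."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Treat two grids as near-duplicates if their hashes differ in few bits."""
    return hamming(average_hash(pixels_a), average_hash(pixels_b)) <= max_distance


# Invented 4x4 grids: a checkerboard, a uniformly brightened copy, and a different pattern.
original = [[10, 200, 10, 200], [200, 10, 200, 10], [10, 200, 10, 200], [200, 10, 200, 10]]
brightened = [[value + 30 for value in row] for row in original]
different = [[10, 10, 200, 200] for _ in range(4)]

print(is_near_duplicate(original, brightened))
print(is_near_duplicate(original, different))
```

Because the hash records only whether each pixel is above or below the image's own mean, a uniform brightness change leaves it untouched, which is the property that lets a filtered copy match its source.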
Another enrichment capability allows investigators to find "signatures" of trafficking in the data. These signatures can be specific vernacular used, for example, in text messages between suspects that refer to illicit activity. Other trafficking signatures can be image-based, such as if the picture was taken in a hotel room, contains certain objects such as cash, or shows specific types of tattoos that traffickers use to brand their victims. A deep learning model the team is working on now is specifically aimed at recognizing crown tattoos associated with trafficking. "The challenge is to train the model to identify the signature across a wide range of crown tattoos that look very different from one another, and we're seeing robust performance using this technique," Daggett says.
One particularly time-intensive process for investigators is analyzing thousands of jail phone calls from suspects who are awaiting trial, for indications of witness tampering or continuing illicit operations. The Laboratory has been leveraging automated speech recognition technology to develop a tool to allow investigators to partially transcribe and analyze the content of these conversations. This capability gives law enforcement a general idea of what a call might be about, helping them triage ones that should be prioritized for a closer look.
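The triage step can be sketched as a simple ranking: given partial transcripts, count how often terms of interest appear and sort the calls so the highest-scoring ones are reviewed first. This assumes the transcripts already exist as text and uses plain keyword counting, which is far cruder than whatever analysis the Laboratory's tool performs. The function name, watch terms, and sample transcripts are invented for illustration.

```python
import re


def triage_calls(transcripts, watch_terms):
    """Rank calls by how many watch terms their partial transcript contains.

    `transcripts` maps a hypothetical call id to transcript text. Returns
    (call_id, hit_count) pairs, highest count first, ties broken by id.
    """
    scored = []
    for call_id, text in transcripts.items():
        words = re.findall(r"[a-z']+", text.lower())
        hits = sum(1 for word in words if word in watch_terms)
        scored.append((call_id, hits))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Invented transcripts and watch terms.
calls = {
    "call_1": "tell her not to testify and do not talk to the lawyer",
    "call_2": "how are the kids doing",
    "call_3": "do not testify",
}
print(triage_calls(calls, {"testify", "lawyer"}))
```

A score like this only suggests which calls deserve a closer listen; it does not establish what was said or meant.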
Finally, the team has been developing a series of user-facing tools that use all of the processed data to enable investigators to search, discover, and visualize connections between evidentiary artifacts, explore geolocated information on a map, and automatically build evidence timelines.
“The prosecutors really like the timeline tool, as this is one of the most labor-intensive tasks when preparing for trial,” Daggett says.
When users click on a document, a map pin, or a timeline entry, they see a data card that links back to the original artifacts. "These tools point you back to the primary evidence that cases can be built on," Daggett says. "A lot of this prototyping is picking what might be called low-hanging fruit, but it's really more like fruit already on the ground that is useful and just isn't getting picked up."
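A timeline that points back to primary evidence can be sketched as a sorted list of entries that each carry the path of their source artifact. The structure below is an assumption made for illustration: the field names `timestamp`, `summary`, and `source`, and the sample records, are hypothetical rather than taken from the prototype.

```python
from datetime import datetime


def build_timeline(artifacts):
    """Order dated artifacts chronologically, keeping the link to each source.

    Each artifact is a dict with hypothetical keys "timestamp" (ISO 8601),
    "summary", and "source". Undated artifacts are left out of the timeline.
    """
    dated = [a for a in artifacts if a.get("timestamp")]
    return sorted(dated, key=lambda a: datetime.fromisoformat(a["timestamp"]))


# Invented records standing in for processed evidence.
evidence = [
    {"timestamp": "2020-03-02T09:30:00", "summary": "hotel receipt", "source": "case/receipt.pdf"},
    {"timestamp": None, "summary": "undated photo", "source": "case/IMG_07.jpg"},
    {"timestamp": "2020-03-01T14:05:00", "summary": "text message", "source": "case/msgs.txt"},
]
for entry in build_timeline(evidence):
    print(entry["timestamp"], entry["summary"], "->", entry["source"])
```

Keeping the `source` path on every entry is what lets a timeline item lead back to the original artifact instead of replacing it.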
Victim-centered training
These data analytics are especially useful for helping law enforcement corroborate victim statements. Victims may be fearful or unwilling to provide a full picture of their experience to investigators, or may have difficulty recalling traumatic events. The more nontestimonial evidence that prosecutors can use to tell the story to a jury, the less pressure prosecutors must place on victims to help secure a conviction. There is greater awareness of the retraumatization that can occur during the investigation and trial processes.
"In the last decade, there has been a greater shift toward a victim-centered approach to investigations," says Hayley Reynolds, an assistant leader in the Human Health and Performance Systems Group and one of the early leaders of counter–human trafficking research at the Laboratory. "There's a greater understanding that you can't bring the case to trial if a survivor's needs are not kept at the forefront."
Improving training for law enforcement, specifically in interacting with victims, was one of the team's recommendations in the trafficking technology roadmap. In this area, the Laboratory has been developing a scenario-based training capability that uses gameplay mechanics to inform law enforcement on aspects of trauma-informed victim interviewing. The training, called a “serious game,” helps officers experience how the approach they choose to gather information can build rapport and trust with a victim, or can reduce the feeling of safety and retraumatize victims. The capability is currently being evaluated by several organizations that specialize in victim-centered practitioner training. The Laboratory recently published a journal on serious games built for multiple mission areas over the last decade.
Daggett says that prototyping in partnership with the state and federal investigators and prosecutors that these tools are intended for is critical. "Everything we do must be user-centered," he says. "We study their existing workflows and processes in detail, present ideas for technologies that could improve their work, and they rate what would have the most operational utility. It's our way to methodically figure out how to solve the most critical problems."
When Daggett gave congressional testimony in July, he spoke of the need to establish a unified, interagency entity focused on R&D for countering human trafficking. Since then, some progress has been made toward that goal — the federal government has now launched the Center for Countering Human Trafficking, the first integrated center to support investigations and intelligence analysis, outreach and training activities, and victim assistance.
Daggett hopes that future collaborations will enable technologists to apply their work toward capabilities needed most by the community. "Thoughtfully designed technology can empower the collective counter–human trafficking community and disrupt these illicit operations. Increased R&D holds the potential to make a tremendous impact by accelerating justice and hastening the healing of victims."