Protecting data privacy

Most of us have increasing concerns about the use of personal data collected from our digital footprints. Computer scientist Abhradeep Guha Thakurta has developed commercial-scale methods that help ensure privacy while also providing access to valuable data. Credit: pxhere, CC0.

What if your cell phone could assess your risk for developing diabetes? For that to be possible, a predictive model would first need to be trained on vast amounts of user data. And that poses a problem, said Abhradeep Guha Thakurta, assistant professor of computer science and engineering.

“Access to this type of personal data is an insanely sensitive issue,” said Thakurta. “How do we harness such data while still protecting privacy?”

Thakurta has developed machine learning algorithms, or computational rulesets, to help solve this problem. His approach is based on introducing randomness into the data. Imagine your data being sent from your phone to a company server. While your actual data might reveal a family history of heart disease, Thakurta’s algorithms could, at random, replace this truth with its opposite: that you have no such history. “The data are intentionally a little bad,” he said.

Feeding models slightly bad data turns out to be good for them—they learn the general trends, without relying on any individual’s “real” data. The method maximizes data utility, while also maintaining user privacy.
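To make the idea concrete, here is a minimal sketch of randomized response, a classic textbook technique in which each answer is sometimes flipped before it leaves the device. It is an illustration of the general principle only, not Thakurta's algorithms or any company's implementation. The function names, the 75 percent truth probability, and the simulated 30 percent rate are all assumptions chosen for the example.

```python
import random
from typing import Sequence


def randomize_bit(true_value: bool, p_truth: float, rng: random.Random) -> bool:
    """Report the true answer with probability p_truth, otherwise its opposite."""
    return true_value if rng.random() < p_truth else not true_value


def estimate_true_rate(reports: Sequence[bool], p_truth: float) -> float:
    """Estimate the population-wide rate of 'yes' from the noisy reports.

    Expected observed rate = p_truth * rate + (1 - p_truth) * (1 - rate),
    so solving for rate undoes the known amount of noise.
    Requires p_truth != 0.5 (at exactly 0.5 the reports carry no signal).
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)


def simulate(n_users: int = 100_000, true_rate: float = 0.30,
             p_truth: float = 0.75, seed: int = 0) -> float:
    """Simulate hypothetical users and return the estimated rate."""
    rng = random.Random(seed)
    # Hypothetical ground truth: whether each user has the sensitive trait.
    truths = [rng.random() < true_rate for _ in range(n_users)]
    # Only these randomized reports would ever reach the server.
    reports = [randomize_bit(t, p_truth, rng) for t in truths]
    return estimate_true_rate(reports, p_truth)


if __name__ == "__main__":
    print(f"Estimated rate: {simulate():.3f}")
```

In this sketch, any single report is wrong about a quarter of the time, so no individual answer can be taken at face value. Across many users, though, the estimate should land close to the simulated 30 percent, which is the sense in which a model can learn the general trend without relying on anyone's real data.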

Several companies are working to adapt his approach to their technology, Thakurta said. Apple, for example, is already using it in all its devices to keep users’ keystrokes private.

Alison F. Takemura