Making sense of big data can be a big problem. Information comes from disparate sources with bias, collection and missing data issues, noted Lise Getoor, UC Santa Cruz professor of computer science. So how do you build methods that can integrate this data and make interesting (and valid) inferences?
A Fellow of the Association for the Advancement of Artificial Intelligence, Getoor founded a line of research called statistical relational learning. Her methods mix network information with statistical information, exploiting background knowledge and domain theory as well—mapping, for instance, the underlying power structures among corporate entities or predicting outbreaks of social unrest.
For such work, Getoor and her collaborators developed an open-source probabilistic soft logic (PSL) tool that can represent structure and uncertainties in networks and make inferences in a scalable way, orders of magnitude faster than competing approaches. The latest PSL release occurred this year.
“The key thing about my work,” said Getoor, “is that it not only makes use of probabilistic information, it also makes use of relational information, uncovering links and ties among all kinds of data.”
On the flip side, Getoor and her group also study privacy issues, learning how to prevent unwanted linkages.