A common sense approach to machine learning

Dr. Vered Shwartz points to a 2019 headline that spread the news: “Stevie Wonder announces he’ll be having kidney surgery during London concert”.

“One of the properties of languages is that we only say what we believe to be complementary to the listener’s knowledge,” says Dr. Shwartz. “I immediately understand that the announcement happened during the concert, but the surgery will be sometime later. It's common sense. Machines don’t have that ability yet.”

No human reader would assume that the musician would actually go under the knife on stage, but a natural language processing (NLP) program could interpret the headline differently.

Ambiguity in language is a hurdle that NLP cannot consistently clear. Dr. Shwartz, an assistant professor in the department of computer science, is working to solve that problem by building and training machine learning models that use natural language, with the goal of advancing towards human-level language understanding.

Most NLP systems today are based on something called a “language model,” a very large neural network trained on vast amounts of unlabelled text, such as the entire web. The model is trained to predict words from their context, or the next word in a sentence. As a result, it knows a lot about language, word meanings, and syntax. In the last four years, this approach has transformed NLP.
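To make next-word prediction concrete, here is a minimal sketch in Python. It is not a neural network: it simply counts which word tends to follow which, a far simpler stand-in for the large models described above. The function names and the three-sentence corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for each word, which words follow it in the training text."""
    follower_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follower_counts[current_word][next_word] += 1
    return follower_counts


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus standing in for "vast amounts of unlabelled text".
corpus = [
    "the singer announced the surgery",
    "the singer played the concert",
    "the singer played the piano",
]
model = train_bigram_model(corpus)
prediction = predict_next_word(model, "singer")
print(prediction)
```

No one labelled this text by hand: the "correct answer" for each word is just the word that comes next, which is why such models can be trained on raw, unlabelled text.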

What’s inside the black box?

Machine learning is not intuitive. In fact, the process is so opaque and incomprehensible that it’s been dubbed a “black box.”

“It doesn’t imitate the way that people represent language in their brain or the way that people process or acquire language,” says Dr. Shwartz. “You train the model to predict the correct answer. If it doesn’t, you tweak the parameters so that the next time it's going to answer correctly. A neural network doesn't know how to explain how it made its decisions, but researchers are working on post hoc analysis methods to try to interpret what the network learned.”
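The predict-then-tweak loop Dr. Shwartz describes can be sketched with a perceptron, one of the simplest trainable models. This is an illustrative assumption, not the kind of network her group trains: the task (learning logical AND), the function names, and the settings are all hypothetical.

```python
def predict(weights, bias, features):
    """Answer 1 if the weighted sum of the inputs is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Repeatedly predict; whenever the answer is wrong, nudge the parameters."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)
            if error != 0:
                # Tweak each parameter in the direction that reduces the error.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
    return weights, bias


# Hypothetical toy task: output 1 only when both inputs are 1 (logical AND).
examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(examples)
for features, label in examples:
    print(features, "->", predict(weights, bias, features), "expected", label)
```

Even here, the trained model is just a handful of numbers. Nothing in them states a rule like "both inputs must be 1," which hints at why far larger networks are hard to interpret.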

Some of a model’s mistakes can be avoided by giving it an element of common sense. One common resource for commonsense knowledge is ConceptNet, a multilingual knowledge graph. It’s connected to WordNet, a lexical resource that contains definitions of the different meanings of the same word.

“If you look for the entry for breakfast in English, it says that it contains pancakes, for example. But the same entry for Chinese says that it contains noodles,” Dr. Shwartz explains. “So it's also culturally aware.”

Data entry for ConceptNet is done through crowdsourcing: people are hired to perform tasks or answer questions. By collecting multiple answers for the same question, you gain the wisdom of the crowd. But crowdsourcing also opens the door for gender or racial bias to creep into the data, as does learning from the internet.
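One simple way to pool several answers to the same question is a majority vote. The sketch below assumes that approach purely for illustration; the article does not say how ConceptNet aggregates its answers, and the function name and sample answers are hypothetical.

```python
from collections import Counter


def aggregate_answers(answers):
    """Return the most common answer and the share of workers who gave it."""
    # Normalize case and whitespace so "Pancakes" and "pancakes " count together.
    counts = Counter(answer.strip().lower() for answer in answers)
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(answers)


# Hypothetical answers from five workers to "What do people eat for breakfast?"
worker_answers = ["Pancakes", "pancakes", "cereal", "pancakes ", "toast"]
answer, agreement = aggregate_answers(worker_answers)
print(answer, agreement)
```

A vote like this filters out individual slips, but it cannot remove a bias that most of the crowd shares.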

“There's a heated debate in the fairness community. There are people who say, ‘Well this is just a reflection of the data and you can’t do much about it. You've started from the data and the world is biased so that's what it's going to learn,’” says Dr. Shwartz. “But you can avoid mistakes by having some social norms and ethics in the system.”

At the Allen Institute for AI in Seattle, Dr. Shwartz collected social norms to train a model to reason about a particular social situation. Because the data was drawn from people across the U.S., there was a vast range of opinions on what constitutes normal in society.

“In cases where you collect large amounts of data, you can't verify everything,” she says. “You get a lot of things that may not reflect your own values. An NLP model trained on data from the web can generate racial slurs and other offensive language, so it's important to be aware of this when deploying these models for real-world applications.”

One example is the gender bias inherent in specific word representations such as “doctor” or “nurse.” While it’s statistically correct that in most countries doctors are usually male and nurses are usually female, this poses a problem for modeling. On the one hand, you want the model, or its representation of the words, to capture reality.

“But when you use these representations in downstream tasks it could be harmful,” says Dr. Shwartz. “If you build a resume filtering system, in a hypothetical hiring scenario you get two identical CVs, one from a male candidate and another from a female candidate where the names clearly indicate their gender. What if the system learns to prefer the male candidate because their representation of male is more similar to the advertised doctor position? If the NLP model is trained on biased data it can affect people's lives in sensitive applications.”
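The kind of skew described here can be measured by comparing how close a word's vector sits to gendered words. The sketch below uses hand-made two-dimensional vectors that are deliberately skewed; they are hypothetical and not taken from any real model, where vectors have hundreds of dimensions and are learned from text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-made vectors chosen to mimic a biased representation.
toy_vectors = {
    "doctor": [0.9, 0.3],
    "nurse": [0.3, 0.9],
    "he": [1.0, 0.1],
    "she": [0.1, 1.0],
}


def gender_gap(word):
    """Positive if `word` sits closer to "he", negative if closer to "she"."""
    vector = toy_vectors[word]
    return (cosine_similarity(vector, toy_vectors["he"])
            - cosine_similarity(vector, toy_vectors["she"]))


for word in ("doctor", "nurse"):
    print(word, round(gender_gap(word), 3))
```

A resume filter that ranks candidates by similarity to the word "doctor" would inherit this gap, which is the hypothetical hiring scenario in the quote above.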

Dr. Shwartz faced no such challenges with her own job application to UBC. She came to love the Pacific Northwest during her postdoctoral studies with the Allen Institute for AI, and when the time came to secure a faculty position, she chose UBC.

“I really liked the atmosphere in the computer science department when I did my remote interview,” she says. “Plus, several of the schools that I was considering in the U.S. had established NLP groups. At UBC there are other faculty working on NLP, but it’s a smaller group so I felt like there was more opportunity for growth. I think also that UBC really cares about teaching, probably more than the average of other universities.”

In the Spring 2022 semester, Dr. Shwartz will teach a graduate course on commonsense reasoning and NLP. Her students should appreciate her common-sense approach to learning: take the evenings and weekends off.

“Go see some nature,” she tells her students. “Don’t spend a few years here and not see anything and be closed up in the lab working on your research. It's not worth it. Grad school takes a few years, don't put your life on hold for it.”
