Join us online or in person at

MaRS Discovery District
101 College St, Toronto ON, M5G 1L7

Registration is required.

Please note that registration to attend in person has closed, but we welcome you to join us online.

Date and time

Wednesday, February 22, 2023

10:00 AM - 4:00 PM

About this event

The Vector Institute Research Symposium is a one-day event showcasing cutting-edge research from the Vector research community.

Join us for the fourth annual Research Symposium to celebrate the achievements of our research community. Come listen to spotlight and keynote presentations from Vector's Faculty Members and award-winning researchers, check out posters from Vector researchers, and spend some time chatting with old and new friends. This event is a great way to learn what your fellow researchers are working on and to connect with potential collaborators.

This year's Research Symposium will be hybrid: keynote presentations will be held in person and streamed online, while poster presentations will be in person only.

Agenda (subject to change)

9:15 AM

Registration and breakfast

10:00 AM

Opening remarks (hybrid)

10:15 AM

Spotlight presentation from Franziska Boenisch (hybrid)
Talk title: "What trust model is needed for federated learning to be private?"

10:45 AM

Keynote from Vered Shwartz (hybrid)
Talk title: "Incorporating Commonsense Reasoning into NLP Models"

12:00 PM

Lunch

1:00 PM

Keynote from Shai Ben-David (hybrid)
Talk title: "Can Fairness be retained under distribution shift"

1:55 PM

Keynote from Colin Raffel (hybrid)
Talk title: "Building Better Language Models"

2:45 PM

Poster presentations (in-person)

4:15 PM

Event concludes

Franziska Boenisch
Postdoctoral Fellow
Vector Institute

Vered Shwartz
Assistant Professor, University of British Columbia; Faculty Member, Vector Institute; Canada CIFAR Artificial Intelligence Chair

Shai Ben-David
Professor, University of Waterloo

Colin Raffel
Assistant Professor, UNC Chapel Hill; Faculty Researcher, Hugging Face

Franziska Boenisch

What trust model is needed for federated learning to be private?
Abstract: In federated learning (FL), data does not leave personal devices when they are jointly training a machine learning model. Instead, these devices share gradients with a central party (e.g., a company). Because data never "leaves" personal devices, FL was promoted as privacy-preserving. Yet, recently it was shown that this protection is but a thin facade, as even a passive attacker observing gradients can reconstruct data of individual users.

In this talk, I will explore the trust model required to implement practical privacy guarantees in FL by studying the protocol under the assumption of an untrusted central party. I will first show that in vanilla FL, when dealing with an untrusted central party, there is currently no way to provide meaningful privacy guarantees. I will depict how gradients of the shared model directly leak some individual training data points, and how this leakage can be amplified through small, targeted manipulations of the model weights. Thereby, the central party can directly and perfectly extract sensitive user data at near-zero computational cost. Then, I will move on to discuss defenses that implement privacy protection in FL. Here, I will show that an actively malicious central party can still have the upper hand on privacy leakage by introducing a novel practical attack against FL protected by secure aggregation and differential privacy, currently considered the most private instantiation of the protocol. I will conclude my talk with an outlook on what it will take to achieve privacy guarantees in practice.
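For context on the protocol Boenisch analyzes, the short sketch below illustrates one round of vanilla federated averaging in plain NumPy (the function names and the linear-regression setup are hypothetical, chosen only for concreteness, and are not code from the talk). The key point it shows: every client's raw gradient arrives at the central party in the clear, which is exactly the exposure the attacks described above exploit.

import numpy as np

def local_gradient(weights, X, y):
    # One client's squared-loss gradient, computed on its private data.
    residual = X @ weights - y
    return X.T @ residual / len(y)

def fl_round(weights, clients, lr=0.1):
    # The central party receives each client's raw gradient in the clear;
    # a curious or malicious server can inspect (or manipulate) these.
    grads = [local_gradient(weights, X, y) for X, y in clients]
    return weights - lr * np.mean(grads, axis=0)

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(8, 3)), rng.normal(size=8)) for _ in range(5)]
weights = np.zeros(3)
for _ in range(50):
    weights = fl_round(weights, clients)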

Vered Shwartz

Incorporating Commonsense Reasoning into NLP Models
Abstract: Human language is often ambiguous, underspecified, and grounded in the physical world and in social norms. As humans, we employ commonsense knowledge and reasoning abilities to fill in those gaps and understand others. Endowing NLP models with the same abilities is imperative for reaching human-level language understanding and generation skills. In this talk, I will present several lines of work in which we test NLP models on their commonsense reasoning abilities, develop commonsense reasoning models, and incorporate them into models to improve performance on NLP tasks.

Shai Ben-David

Can Fairness Be Retained Under Distribution Shift?
Abstract: We investigate the challenge of learning under the risk that the data distribution shifts between training and deployment. We examine the possibility of coming up with data representations that will guarantee the fairness of classifiers trained over them. We conclude that while no such guarantees can be obtained in general, there are certain circumstances under which this is not a lost cause.

Colin Raffel

Building Better Language Models
Abstract: The standard recipe for building and using large language models (LLMs) involves training a decoder-only Transformer on a large collection of unstructured web text and then adapting it to a downstream task via in-context learning. In this talk, I will present our work that challenges this recipe and ultimately provides much more effective and efficient ways of using LLMs. First, I'll discuss our early work demonstrating how multitask prompted training enables strong zero-shot and cross-lingual generalization. Then, I will present an empirical study showing that encoder-decoder models work dramatically better in this setting. Finally, I will discuss methods for few-shot training that work dramatically better than in-context learning and produce a much more efficient model.

Travelling to Toronto?

If you are travelling to Toronto for the Research Symposium, you may be eligible to have a portion of your travel costs covered if you are a student currently enrolled in a Vector-recognized master's program, a Vector researcher (i.e., a graduate student or postdoctoral fellow currently supervised by a Vector Faculty Member), a Vector Scholarship in AI recipient, or a graduate student or postdoctoral fellow supervised by a Vector Faculty Affiliate. See the Vector Institute Expense Policy for Eligible Attendees for details on what you may be reimbursed for. To be reimbursed, you must check in at the registration table on February 22, 2023.

If you are planning to claim a travel reimbursement, please add your contact information to the list. For any questions, please contact research@vectorinstitute.ai.

Hotel accommodations

The Vector Institute has a corporate rate at the Holiday Inn Downtown, located at 30 Carlton Street. Please note hotel accommodations are not covered under the travel policy. 

Room rates, subject to applicable taxes and availability, are:

  • Standard room: $149
  • One-bedroom suite: $249

Rates are for single/double occupancy. Additional occupancy is $20.00 per guest with a maximum of 4 people per room.

Reservations can be made by calling the hotel directly:

  • Reservations: (416) 977-6655
  • Toll free: 1 (877) 660-8550

When making a reservation please request the “Vector Institute” corporate rate. 

This event is open to Vector Sponsors, current Vector researchers, Vector Scholarship in Artificial Intelligence recipients, graduate students and postdoctoral fellows of Vector Faculty Affiliates, Vector Interns, Vector Postgraduate Affiliates, and Vector Alumni. Your registration will be cancelled if you do not meet these eligibility criteria.