The LAMA-WeST Seminar Series - Tag-Debias: Entity and Concept Typing for Social Bias Mitigation in PLMs
Pre-trained language models exhibit noticeable stereotypical biases in various downstream tasks. Consequently, it is imperative to explore methods aimed at addressing or mitigating social biases in these models. In this research, we propose novel gender tagging strategies to achieve a higher level of abstraction for sensitive attributes in the corpus. Subsequently, we fine-tune BERT-family models on this tagged corpus. Our method shows improved fairness compared to both the initial and the scrubbed models. Finally, we applied our proposed tagged model to the ranking of candidates' CVs, revealing a 15% improvement in fairness ranking compared to the initial model and 10% compared to state-of-the-art models.
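As a rough illustration of what tagging sensitive attributes can look like, the sketch below replaces gendered words with a single abstract tag. The word list, the `[GENDER]` tag, and the function name `tag_gender` are assumptions made for this example; the abstract does not specify the actual tagging strategies.

```python
import re

# Hypothetical, deliberately tiny lexicon of gendered terms (illustration only).
GENDERED_TERMS = {"he", "she", "him", "her", "his", "hers", "man", "woman", "men", "women"}

def tag_gender(text, tag="[GENDER]"):
    """Replace each gendered term with one abstract tag, leaving other tokens untouched."""
    def substitute(match):
        word = match.group(0)
        return tag if word.lower() in GENDERED_TERMS else word
    # Match whole alphabetic words so that punctuation and spacing are preserved.
    return re.sub(r"\b[A-Za-z]+\b", substitute, text)
```

A corpus processed this way could then be used for fine-tuning, so that the model sees the abstract tag rather than a specific gendered word.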
The LAMA-WeST Seminar Series - SORBET: a Siamese Network for Ontology Embeddings using a Distance-based Regression Loss and BERT
Ontology embedding methods have been popular in recent years, especially when it comes to representation learning algorithms for solving ontology-related tasks. Despite the impact of large language models on knowledge graph-related tasks, there has been less focus on adapting these models to construct ontology embeddings that are both semantically relevant and faithful to the ontological structure. In this paper, we present a novel ontology embedding method that encodes ontology classes into a pre-trained SBERT through random walks and then fine-tunes the embeddings using a distance-based regression loss. We benchmark our algorithm on four different datasets across two tasks and show the impact of transfer learning and our distance-based loss on the quality of the embeddings. Our results show that SORBET outperforms state-of-the-art ontology embedding techniques for the performed tasks.
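The two ingredients named above, random walks over the ontology and a loss tied to distance in the ontology, can be sketched on a toy graph. Everything here is an assumption for illustration: the toy classes, the use of shortest-path length as the distance, and the target `1 / (1 + distance)`. The abstract does not state how SORBET defines its walks or its regression target.

```python
import random
from collections import deque

# Toy ontology as an undirected adjacency list (hypothetical classes).
ONTOLOGY = {
    "Thing": ["Animal", "Plant"],
    "Animal": ["Thing", "Dog", "Cat"],
    "Plant": ["Thing"],
    "Dog": ["Animal"],
    "Cat": ["Animal"],
}

def random_walk(graph, start, length, rng):
    """Return a walk of up to `length` classes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbours = graph[walk[-1]]
        if not neighbours:
            break
        walk.append(rng.choice(neighbours))
    return walk

def graph_distance(graph, source, target):
    """Shortest-path length between two classes, found by breadth-first search."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # target is unreachable from source

def distance_regression_loss(similarity, distance):
    """Squared error between a predicted similarity and a distance-derived target."""
    target = 1.0 / (1.0 + distance)  # assumed mapping: closer classes get higher targets
    return (similarity - target) ** 2
```

In a real system the walks would be turned into text for the sentence encoder and the similarity would come from the learned embeddings; here both are left out to keep the sketch self-contained.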
The LAMA-WeST Seminar Series - TwiRGCN: Temporally Weighted Graph Convolution for Question Answering over Temporal Knowledge Graphs
Recent years have witnessed interest in Temporal Question Answering over Knowledge Graphs (TKGQA), resulting in the development of multiple methods. However, these are highly engineered, thereby limiting their generalizability, and they do not automatically discover relevant parts of the KG during multi-hop reasoning. Relational graph convolutional networks (RGCN) provide an opportunity to address both these challenges – we explore this direction in the talk. Specifically, we propose a novel, intuitive and interpretable scheme to modulate the messages passed through a KG edge during convolution based on the relevance of its associated period to the question. We also introduce a gating device to predict if the answer to a complex temporal question is likely to be a KG entity or time and use this prediction to guide our scoring mechanism. We evaluate the resulting system, which we call TwiRGCN, on a recent challenging dataset for multi-hop complex temporal QA called TimeQuestions. We show that TwiRGCN significantly outperforms state-of-the-art models on this dataset across diverse question types. Interestingly, TwiRGCN improves accuracy by 9–10 percentage points for the most difficult ordinal and implicit question types.
The LAMA-WeST Seminar Series - Natural Language to SPARQL Query Generation: A Comprehensive Evaluation of the Copy Mechanism and its Generalization Capabilities
In recent years, the field of neural machine translation (NMT) for SPARQL query generation has witnessed significant growth. Recently, the addition of the copy mechanism to traditional encoder-decoder architectures and the use of pre-trained models have set new performance benchmarks. These state-of-the-art models have reached almost perfect query generation for simple datasets. However, such progress raises the question of the ability of these models to generalize and deal with unseen questions and entities. This work presents a large variety of experiments that replicate and expand upon recent NMT-based SPARQL generation studies, comparing pre-trained and non-pre-trained models, question annotation formats, and the use of a copy mechanism for non-pre-trained and pre-trained models. This work then evaluates the ability of several models to handle unknown question-query pairs and out-of-vocabulary URIs.
The LAMA-WeST Seminar Series - Unsupervised learning to cluster events labeled as “other” in the NSIR-RT incident learning database
Purpose: In radiotherapy, clinical staff are encouraged to report incidents in an incident learning system (ILS). An investigator is then assigned to follow up on the incident and label it according to radiation oncology-specific taxonomies. However, as new incidents occur, existing labels may become insufficient to correctly categorize all of them, or investigators may be unsure which label is most appropriate and choose the catch-all “other” label. As a result, many incidents get labeled as “other” in the ILS, limiting the opportunity for learning and quality improvement they would/should otherwise provide. In this project, we aimed to automatically relabel some of these “other” incidents using already existing labels based on closer inspection of their narrative texts using NLP and unsupervised ML techniques. Method: Over 6,000 incident reports were gathered from the Canadian National System for Incident Reporting-Radiation Treatment (NSIR-RT) as well as our local ILS, which uses the NSIR-RT taxonomy. Incident descriptions from these reports were processed using various NLP techniques to obtain their vectorized representations. The processed reports carrying expert-generated labels, excluding the 2,618 incidents labeled “other”, were clustered using the k-means clustering algorithm based on their incident description data. Each cluster was automatically assigned a label based on the frequency of expert-generated labels within the cluster. Incidents labeled as “other” were then introduced to the latent space to check if they fell within the range of an existing cluster. If they did, the corresponding cluster label was used to relabel the “other” incident. If they did not, we classified them as new incident types that will require a new label. Results: Out of 2,618 incidents labeled as “other”, our pipeline relabeled 1,928 incidents with existing labels, and 690 were labeled as new incident types.
Future work will attempt to validate the relabeling of the “other” incidents and will cluster the events labeled as new incident types to identify groupings of events that may give rise to new labels.
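The clustering and relabeling steps described in the Method can be sketched in a few functions. This is a minimal sketch under stated assumptions, not the authors' pipeline: the initial centroids are supplied by the caller, the "range" of a cluster is taken to be the distance from its centroid to its farthest labeled member, and all function names are hypothetical. The abstract does not say how the range test was actually defined.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def assign(point, centroids):
    """Index of the centroid nearest to `point`."""
    return min(range(len(centroids)), key=lambda i: euclidean(point, centroids[i]))

def kmeans(points, init_centroids, iterations=10):
    """Plain Lloyd's algorithm with caller-supplied initial centroids."""
    centroids = list(init_centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[assign(p, centroids)].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids

def label_clusters(points, labels, centroids):
    """Give each cluster its most frequent expert label and a radius.

    The radius (assumed definition) is the largest member-to-centroid distance.
    """
    votes = [Counter() for _ in centroids]
    radii = [0.0 for _ in centroids]
    for p, lab in zip(points, labels):
        i = assign(p, centroids)
        votes[i][lab] += 1
        radii[i] = max(radii[i], euclidean(p, centroids[i]))
    names = [v.most_common(1)[0][0] if v else None for v in votes]
    return names, radii

def relabel_other(point, centroids, names, radii):
    """Return the nearest cluster's label if the point is within its radius, else 'new type'."""
    i = assign(point, centroids)
    if names[i] is not None and euclidean(point, centroids[i]) <= radii[i]:
        return names[i]
    return "new type"
```

In practice the vectors would be the NLP-derived representations of the incident descriptions; two-dimensional points are enough to exercise the logic.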
The LAMA-WeST Seminar Series - Applying Deep Learning for Avionic Security: An Experiment Design Study
In this presentation, we will explore the use of deep learning in the development of an intrusion detection system for avionics. The focus will be on the experiment design and strategies used to create a state-of-the-art system that can detect security threats in real time. This presentation will also provide the opportunity for an open discussion to validate the methodology and to share ideas and insights on how to further improve the approach.
The LAMA-WeST Seminar Series - Entity Typing with Natural Language Inference for Fine-grained Named Entity Recognition
Fine-grained Named Entity Recognition (FgNER) consists of detecting named entity mentions (mention detection) and typing them (entity typing, ET) with types from a relatively large set. Typing mentions becomes harder as the number of types increases. Also, traditional NER architectures are built around a fixed-size classifier, which cannot be adapted to a bigger type set. Recent prompt-based ET systems achieve good performance in few-shot learning over large type sets without any fixed classifier. Integrating these ET systems with various techniques for FgNER might improve the performance achieved on FgNER.
The LAMA-WeST Seminar Series - A Copy Mechanism for Handling Knowledge Base Elements in SPARQL Neural Machine Translation
Neural Machine Translation (NMT) models from English to SPARQL are a promising development for SPARQL query generation. However, current architectures are unable to integrate the knowledge base (KB) schema and handle questions on knowledge resources, classes, and properties unseen during training, rendering them unusable outside the scope of topics covered in the training set. Inspired by the performance gains in natural language processing tasks, we propose to integrate a copy mechanism for neural SPARQL query generation as a way to tackle this issue. We illustrate our proposal by adding a copy layer and a dynamic knowledge base vocabulary to two Seq2Seq architectures (CNNs and Transformers). This layer makes the models copy KB elements directly from the questions instead of generating them. We evaluate our approach on state-of-the-art datasets, including datasets referencing unknown KB elements, and measure the accuracy of the copy-augmented architectures. Our results show a considerable increase in performance on all datasets compared to non-copy architectures.
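One common way to let a decoder copy tokens from its input is pointer-generator style mixing, where the output distribution blends a generation distribution over a fixed vocabulary with a copy distribution given by attention over the source tokens. The sketch below shows that blending only; it is a swapped-in, generic formulation and not necessarily the copy layer used in this work. The function name, the parameter `p_gen`, and the example token `dbr:Dune` are assumptions for illustration.

```python
def copy_augmented_distribution(vocab_probs, attention, source_tokens, p_gen):
    """Mix a generation distribution with a copy distribution over source tokens.

    vocab_probs: dict mapping each fixed-vocabulary token to its probability.
    attention: attention weights, one per source token, summing to 1.
    source_tokens: question tokens, possibly including KB elements unseen in training.
    p_gen: probability of generating from the vocabulary instead of copying.
    """
    # Scale the generation distribution by the probability of generating.
    final = {tok: p_gen * p for tok, p in vocab_probs.items()}
    # Add copy mass for every source token, including out-of-vocabulary ones.
    for tok, weight in zip(source_tokens, attention):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * weight
    return final
```

With this mixing, a KB element that never appeared in training can still receive probability mass, because it is reachable through the copy term as long as it appears in the question.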
The LAMA-WeST Seminar Series - Structural Embeddings with BERT Matcher: A Representation Learning Approach for Schema Matching
The schema matching task consists of finding different types of relations between two ontologies. Algorithms that find these relations often need a combination of semantic, structural, and lexical inputs from the ontologies. Structural Embeddings with BERT Matcher (SEBMatcher) is a system that leverages all of these inputs by using random walks as its foundation. It employs a two-step approach: an unsupervised pretraining of a BERT model with masked language modeling on random walks, followed by a supervised training of a BERT classifier on positive and negative mappings. During its participation in the Ontology Alignment Evaluation Initiative (OAEI), SEBMatcher obtained promising results in the tracks it entered.
The LAMA-WeST Seminar Series - Digital Twinning to predict radiotherapy replanning for head and neck cancer patients
Head and neck cancer patients undergoing radiotherapy often experience weight loss over the course of treatment due to the effects of radiation. This weight loss can result in significant anatomical changes that require the patient’s treatment to be replanned to ensure that an acceptable dose of radiation is being delivered to the tumour and the nearby radiosensitive organs. Unfortunately, the decision to replan a patient is typically made on short notice to the planning team, which can significantly disrupt the workflow and consequently affect the timeline of other patient treatments. Our goal is therefore to pre-emptively determine if and when a patient will need replanning by predicting how a patient’s anatomy will change over the course of treatment. The proposed project will be carried out in three main steps. First, a variational autoencoder will be trained on patients’ cone-beam CT (CBCT) scans that are taken throughout treatment to learn latent space representations of the data. Next, the trajectory of each patient’s change in CBCT scans will be mapped in latent space such that a new patient’s trajectory can be predicted based on the trends of past patients who neighbour them in latent space. Finally, we aim to incorporate a digital twin framework whereby patient trajectories will be dynamically updated based on new data collected over the course of treatment.
The LAMA-WeST Seminar Series - Language Understanding and the Ubiquity of Local Structure
Recent research has shown that neural language models are surprisingly insensitive to text perturbations, such as shuffling the order of words. If the order of words is unnecessary to perform natural language understanding on many tasks, what is? We empirically demonstrate that local structure is always relied upon by neural language models to build understanding, and global structure is often unused. These results hold for over 400 different languages. We use this property of neural language models to automatically detect which of those 400 languages are not well understood by the current crop of pretrained cross-lingual models, thus providing visibility into where our efforts should go as a research community.
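The distinction between local and global structure can be made concrete with two simple perturbations. The sketch below is an assumption about how such perturbations could be defined, not the definitions used in the talk: one function scrambles word order inside fixed-size windows (damaging local structure while keeping the windows in place), and the other keeps each window intact but reorders the windows (damaging global structure). The function names and the window parameter are hypothetical.

```python
import random

def shuffle_within_windows(tokens, window, rng):
    """Perturb local structure: shuffle tokens inside each consecutive window."""
    out = []
    for i in range(0, len(tokens), window):
        chunk = tokens[i:i + window]  # slicing copies, so the input is not mutated
        rng.shuffle(chunk)
        out.extend(chunk)
    return out

def shuffle_windows(tokens, window, rng):
    """Perturb global structure: keep each window intact but shuffle window order."""
    chunks = [tokens[i:i + window] for i in range(0, len(tokens), window)]
    rng.shuffle(chunks)
    return [tok for chunk in chunks for tok in chunk]
```

Comparing a model's performance on text perturbed by each function gives a rough sense of which kind of structure it depends on.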