COHORT.4

2022 – 2026

Ajay Kumar M

Student

Project Title

Low power DNN Processor for Embedded AI in IoT sensors

Project Description

Deep neural networks (DNNs), which are loosely modelled after the human brain, have been shown to be
remarkably successful in recognizing and interpreting patterns in many applications, such as image and video
processing. However, due to high computational complexity, power, and resource requirements, DNNs are not
well explored for low power applications such as Internet of Things (IoT) sensors, wearable healthcare, etc.
IoT devices have stringent energy and resource constraints and, in many cases, deal with one-dimensional
time series data.

This project will investigate DNN hardware accelerator architectures for low-power edge implementation. The
project will address the challenges of implementing sparse DNN models in energy-efficient IoT edge sensors.
The DNN accelerator hardware developed will be based on a RISC-V CPU and will be scalable,
programmable, and easily adaptable to different DNN model types. The accelerator developed will be
demonstrated in an FPGA (or ASIC) for a wearable biomedical application such as heartbeat classification.

Asad Ullah

Student

Project Title

Intermediate Speech Representations for Low Resource Speech Models

Project Description

Feature extraction and intermediate speech representations are important components of speech processing tasks. Many approaches exist, e.g. wav2vec2, Mockingjay, TERA and autoregressive methods. Beyond taking the raw input waveform, basic feature extraction methods (e.g. MFCC, log mel spectrogram) are widely used in models. Some approaches also create intermediate representations that are useful for self-supervised learning, allowing base models trained without labels to be fine-tuned and applied to a variety of prediction tasks and target outputs. Why these methods work well on some downstream tasks but not others has not been well studied across different speech objectives such as phoneme recognition, speaker identification, speech recognition, language identification, spoken language understanding, speech translation, emotion recognition, voice conversion and speech synthesis.
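As a rough illustration of the basic features mentioned above, the following sketch shows the frequency warping and log compression behind a log mel spectrogram. It assumes the widely used HTK-style mel formula (other variants exist), and the function names are hypothetical rather than taken from the project.

```python
import math

# Assumption: HTK-style mel scale; the project may use a different variant.
def hz_to_mel(freq_hz):
    """Map a frequency in Hz onto the mel scale."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_centres(n_bands, f_min_hz, f_max_hz):
    """Centre frequencies (Hz) of n_bands filters spaced evenly in mel."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_bands)]

def log_compress(energies, floor=1e-10):
    """Natural log of band energies, floored to avoid log(0)."""
    return [math.log(max(e, floor)) for e in energies]

if __name__ == "__main__":
    # Band centres crowd together at low frequencies and spread out higher up.
    print([round(f) for f in mel_band_centres(5, 0.0, 8000.0)])
```

A full pipeline would also need framing, windowing and a Fourier transform, which are omitted here.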

This project will adapt state-of-the-art deep learning architectures to improve existing speech representation methods. New methods emerging from the fields of computer vision (CV) and natural language processing (NLP) will be reviewed for cross-domain inspiration. Datasets will be sourced to fine-tune models with varying amounts of labelled data. This will inform the relationship between fine-tuning dataset size and the chosen representation, highlighting the potential for application to low-resource speech tasks. A better understanding of the relationship between the representations and the training data for the initial frozen model and the fine-tuned models will help inform model and data choices across different classes of speech models.

Conor O'Sullivan

Student

Project Title

Monitoring Ireland’s coastal areas from satellite imagery using deep neural networks

Project Description

Monitoring coastal evolution linked to climate change is a difficult task. Coastlines and water bodies are impacted by climate change and local weather conditions. It is essential to be able to monitor changes occurring along coastlines, because these changes impact populations living in coastal areas. In this PhD project, we intend to explore deep neural networks on open-access Copernicus Sentinel-2 satellite image data to develop machine learning methodologies to monitor climate change-induced coastal evolution along the Irish coast and in inland regions. The thesis involves forecasting Irish sea levels using long short-term memory (LSTM)-based neural networks and classifying wetlands and other inland objects from satellite imagery.
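Sequence models such as LSTMs are usually trained on fixed-length windows cut from a longer series. The sketch below shows one common way to prepare such windows for forecasting. The function name, window length and horizon are illustrative assumptions, not details of the project.

```python
def make_windows(series, window, horizon=1):
    """Split a series into (input window, future target) training pairs.

    Assumption: each target is the single value `horizon` steps after the
    end of its input window; multi-step targets would need a small change.
    """
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    inputs, targets = [], []
    for start in range(len(series) - window - horizon + 1):
        inputs.append(series[start:start + window])
        targets.append(series[start + window + horizon - 1])
    return inputs, targets

if __name__ == "__main__":
    # Hypothetical sea-level readings (arbitrary units).
    levels = [1.0, 2.0, 3.0, 4.0, 5.0]
    x, y = make_windows(levels, window=3)
    print(x, y)
```

Each input window would then be fed to the network as one sequence, with the paired target as the value to predict.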

Davide Serramazza

Student

Project Title

Explanation Methods for Multivariate Time Series Classification

Project Description

Multivariate time series classification is an important computational task arising in applications where data is recorded over time and over multiple channels. For example, a mobile phone or a smartwatch can record the acceleration and orientation of a person’s motion, and these signals are recorded as multivariate time series. One can classify this data to understand and predict human movement and various conditions such as fitness levels. In many applications, however, classification alone is not enough: we also need to understand what the model learns (e.g. why was a prediction given, and based on what information in the data?).

The main focus of this project is on understanding, developing and evaluating explanation methods tailored to Multivariate Time Series Classification (MTSC).

Some of the main research questions are:

  1. What are the current explanation methods for MTSC, and what are their strengths and limitations?
  2. How do we objectively evaluate and compare multiple explanation methods for MTSC, especially for scenarios where their outputs disagree (e.g. for saliency-based methods, different explanations provide very different saliency maps)?
  3. How can we integrate the insights gained from the explanation methods to improve the accuracy of the classification algorithms (e.g. an iterative optimisation approach)?
  4. How can we integrate the insights gained from the explanation methods to improve or reduce the input data (e.g. by removing noisy data, selecting good features)?
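
One common way to compare saliency-based explanations, as raised in question 2, is perturbation: mask the points an explanation ranks as most important and measure how much the classifier's score drops. The sketch below is a minimal illustration under stated assumptions. The toy model, the zero fill value and all function names are hypothetical, and this is not necessarily the evaluation used in the project.

```python
def mask_top_k(series, saliency, k, fill=0.0):
    """Copy a multivariate series, replacing its k most salient points.

    series and saliency are lists of channels, each a list of floats,
    and must have the same shape.
    """
    ranked = sorted(
        ((saliency[c][t], c, t)
         for c in range(len(series))
         for t in range(len(series[c]))),
        reverse=True,
    )
    masked = [list(channel) for channel in series]
    for _, c, t in ranked[:k]:
        masked[c][t] = fill
    return masked

def score_drop(model, series, saliency, k):
    """Drop in model score after masking the k most salient points."""
    return model(series) - model(mask_top_k(series, saliency, k))

if __name__ == "__main__":
    # Toy stand-in for a classifier score: it only looks at channel 0.
    toy_model = lambda s: sum(s[0])
    series = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    good = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]  # highlights channel 0
    bad = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]   # highlights channel 1
    print(score_drop(toy_model, series, good, 3))
    print(score_drop(toy_model, series, bad, 3))
```

Under this measure, an explanation that points at the data the model really uses produces a larger drop than one that does not.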

Eanna Curran

Student

Project Title

Neural Network Architecture for Solving Geometric Optimization Problems

Project Description

Combinatorial optimization problems arise naturally in many areas of computer science and other disciplines, such as business analytics, operations research, bioinformatics and electronic commerce. Since many of these optimization problems are NP-hard, applications typically rely on meta-heuristic frameworks, approximation algorithms and carefully designed heuristics for specific instance classes to solve them efficiently. However, the resultant solutions can be very far from optimal, and the development of good algorithms often requires significant human effort. The goal of this PhD project is to augment the human ability to design good algorithms and data structures by using machine learning techniques to explore the search space efficiently.

Over the last two decades, a large number of neural network architectures have been proposed to deal with a range of NLP and image processing tasks. However, these architectures (e.g., CNN) rely heavily on temporal and/or spatial coherence in the input sequence. In recent years, researchers have attempted to adapt these frameworks for solving combinatorial optimization problems, with limited success. Combinatorial optimization problems often have long-range and complex correlations in the input element sequence. Therefore, the traditional architectures do not generalize as well as they do for NLP and image processing tasks. In this project, we plan to consider restricted domains of combinatorial optimization problems (e.g., geometric optimization problems) and design neural network architectures that can learn efficient features for problems from the domain and leverage those features to find good solutions for large instances of the problems.

Fangyijie Wang

Student

Project Title

Automatic imaging biomarkers in foetal development using multi-task deep learning framework

Project Description

Ultrasonography is widely used in obstetrics as a way of assessing the state of foetal development (e.g. gestational age (GA) estimation, foetal weight (FW) estimation) and the safety of the pregnancy. The operator usually performs an array of measurements during ultrasound (US) scan sessions. The scans are safe and can be carried out at any stage of pregnancy. Over the years, researchers have demonstrated that deep learning algorithms can reduce operator-dependent errors and improve the accuracy of foetal well-being assessment, but the use of AI is still in its infancy.

Foetal biometric measurement is a standard examination during pregnancy, used for monitoring foetal growth and estimating gestational age. The most important foetal measurements are biparietal diameter (BPD), head circumference (HC), femur length (FL) and abdominal circumference (AC). Obtaining these measurements properly requires the use of standardised planes. The operator needs substantial knowledge and experience to identify the standardised planes, namely the foetal abdomen (FASP), brain (FBSP) and femur (FFESP) standard planes. However, expert resources are scarce, especially in underdeveloped countries. We believe that deep learning algorithms can be a valuable tool to tackle these challenges. Biometry measurements are performed on ultrasound images that show standardised planes. These biometric parameters can then be used to evaluate foetal growth and to estimate GA and FW.

The first phase of this research is a literature review to understand recent research results related to deep learning applications in foetal ultrasound. The second phase explores multi-task neural networks for automatically classifying and segmenting foetal body parts in 2D ultrasound images. Foetal body parts include the foetal head, abdomen and femur. The third phase is the development of a novel deep learning framework for measuring the foetal biometrics BPD, HC, FL and AC. These foetal biometrics are used to assess foetal growth.
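
To make the measurement step concrete, the sketch below shows how a head circumference could be derived once a segmented head outline has been approximated by an ellipse. The ellipse fit, the pixel spacing and the function names are assumptions made for illustration; the description does not specify how the framework computes HC. The perimeter uses Ramanujan's approximation.

```python
import math

def ellipse_circumference(semi_major, semi_minor):
    """Ramanujan's approximation to the perimeter of an ellipse."""
    a, b = semi_major, semi_minor
    return math.pi * (3.0 * (a + b) - math.sqrt((3.0 * a + b) * (a + 3.0 * b)))

def head_circumference_mm(semi_major_px, semi_minor_px, mm_per_pixel):
    """Convert ellipse semi-axes in pixels to a circumference in mm.

    Assumption: pixels are square, so one scale factor applies to both axes.
    """
    return ellipse_circumference(
        semi_major_px * mm_per_pixel, semi_minor_px * mm_per_pixel
    )

if __name__ == "__main__":
    # Hypothetical fitted semi-axes and pixel spacing.
    print(round(head_circumference_mm(300.0, 240.0, 0.2), 1))
```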

The results of this research may reduce the rate of misdiagnosis in foetal growth monitoring, thereby contributing to the quality of medical services and ultimately benefiting patients.

Jiwei Zhang

Student

Project Title

Explainable Natural Language Processing for Legal Text Analysis

Project Description

As a result of developments in machine learning, particularly neural networks, a growing number of state-of-the-art technologies are employing deep learning to identify solutions to real-world problems. Due to the complexity of its real-world data, Natural Language Processing (NLP) is a domain in which deep learning techniques have become dominant, particularly for tasks dealing with long text documents.

The document-level classification task is a significant challenge in the research community of NLP because it has a wide range of practical applications, including legal text analysis, sentiment analysis and mapping labels for news articles. A key difficulty for document-level classification tasks is to understand the relations between sentences, which is not easily achievable by traditional approaches like regression models. To achieve document-level understanding, current approaches typically rely heavily on transformer-based neural network modules, such as BERT and its variants (e.g. DocBERT and RoBERTa), XLNet and GPT-3.

However, as the implementation of deep learning neural networks becomes more widespread, additional obstacles emerge. In the majority of instances, when a neural network is employed for downstream tasks, users are only privy to the predicted results but not reasons for those predictions. Neural networks are often referred to as “black boxes” since it is impossible to interpret the actual meaning of the weight matrix. In other words, people have difficulty interpreting the relationship between the inputs and the outputs. Even if the accuracy of the prediction outcomes may be the most important feature in some disciplines, the prediction method must be transparent, understandable, and interpretable.

For legal text classification, it is common that documents are long and domain-specific in terms of their vocabularies. For example, in the legal AI community, there is much research focusing on tasks like categorising legal cases based on legal opinions and legal regulation classification. Classification models generally have complex architectures and consist of several embedding modules and neural network modules, limiting interpretability. Therefore, it is rare for industries or legal departments to make use of these results directly in the real world because of a lack of understanding of why predictions have been made. Legal and business leaders are typically reluctant to rely on opaque models of this type in their decision-making processes.

To find a solution to this problem, the most vibrant area of research is eXplainable Artificial Intelligence (XAI). In the context of this project, we will investigate XAI approaches for long-text document classification tasks in the legal domain. Our research will include but not be limited to current reasoning approaches, state-of-the-art models for long-text classification tasks, interpretability of neural networks, machine learning on weakly-labelled data, and the application of these technologies within the legal domain. With this research project, we aim to maximise the advantages that deep learning approaches have brought, by achieving transparency and interpretability in long-text classification tasks in the legal domain, thus providing a more reliable base upon which decisions can be made.

Naoise McNally

Student

Project Title

Auditing Algorithms: Investigations of the Facebook News Feed Recommender System

Project Description

In the past decade, digital intermediaries such as Facebook, Google and Twitter have assumed a pivotal role in the information space as central distribution channels for all forms of content, most notably news. The distribution and (in)visibility of information on these platforms are dictated by algorithmic recommender systems, which are often described as black boxes, in reference to the opaque nature of their decision-making systems. The information delivery outcomes of such systems are understood to have profound implications for public discourse, yet oversight and transparency have been severely limited.

Efforts to understand the effects of such algorithmic systems have resulted in an emerging area of research: algorithmic auditing. Sandvig et al. (2014) set out the initial understanding of algorithmic audits as a variety of methods used to uncover issues within the decision-making structure of an algorithm. As this is a nascent area of research, methods have yet to be standardised and encompass a wide variety of techniques for investigating algorithmic systems by indirect means. Such audits by academics, journalists and activists in recent years have uncovered evidence of harmful algorithmic mechanics on various platforms, including issues of racial bias, discrimination, misjudgement and misattribution.

The proposed research uses an empirical approach to investigate the effects of algorithmic governance of information on the Facebook platform, by auditing the Facebook News Feed recommender system. The research design includes using a parametrized timeline of known strategic changes to the Facebook News Feed recommender system, in conjunction with media content including a corpus from The Guardian newspaper (2011-2020), and utilising CrowdTangle to access Facebook engagement metrics for such content. The proposed method is to build a model based on the documented changes to Facebook algorithms, which is subsequently analysed with cross-correlation temporal analysis, Augmented Dickey–Fuller and Granger causality tests, and finally the Seasonal Hybrid ESD (S-H-ESD) algorithm. This study presents a proof-of-concept audit of the Facebook News Feed and forms the basis of an extended set of further investigations aimed at contributing to the understanding of algorithmic governance on digital intermediaries.
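
As a rough illustration of the cross-correlation step only, the sketch below computes the Pearson correlation between one series and a lagged copy of another, then picks the lag with the strongest correlation. The series, the lag range and the function names are hypothetical. The stationarity, Granger causality and S-H-ESD steps are not shown and would normally rely on a statistics library.

```python
from statistics import mean

def lagged_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag], for lag >= 0."""
    if lag < 0 or lag >= len(x):
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    a, b = x[:len(x) - lag], y[lag:]
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / (var_a * var_b) ** 0.5

def best_lag(x, y, max_lag):
    """Lag in 0..max_lag at which y follows x most closely."""
    return max(range(max_lag + 1), key=lambda k: lagged_correlation(x, y, k))

if __name__ == "__main__":
    # Hypothetical data: engagement echoes an indicator two steps later.
    indicator = [0, 1, 0, 3, 0, 1, 0, 5, 0, 2, 0, 0]
    engagement = [0, 0] + indicator[:-2]
    print(best_lag(indicator, engagement, max_lag=4))
```

A high correlation at some lag does not by itself establish causation, which is why such an analysis is typically combined with further tests.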

Nils Hohing

Student

Project Title

Grounded Language Understanding

Project Description

Natural language understanding systems rely on word statistics computed by language models. These approaches capture an approximation of language semantics, but they exhibit many known failure cases, such as poor understanding of causal relationships, sensitivity to the rephrasing of sentences, and syntactically convincing but semantically questionable outputs.

Many of these issues can be attributed to language models’ lack of grounding in reality. Humans know which concepts from our world the words of a text correspond to, e.g. for the word “tree” what a tree looks like, which sounds it produces and which tactile impressions are associated with it. This knowledge gives us an edge over current language understanding systems in reasoning which is implicitly required for all language understanding tasks.

Existing research has contributed a variety of benchmarks to measure the alignment between different modalities like vision, language and audio. The best way to test for alignment is with retrieval benchmarks like Winoground, where the model is tasked with retrieving the items (e.g. images) from a large database that best match a key, e.g. a given text. Image or text generation benchmarks, in contrast, have unreliable automatic evaluation, because defining a sensible distance metric between a ground truth image or text and a generated one is very hard.
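
A retrieval-style evaluation of the kind described above can be sketched as follows: each text embedding is compared with every image embedding, and the score is the fraction of texts whose matching image ranks in the top k. The embeddings, the cosine similarity measure and the function names are illustrative assumptions, and this is not the scoring protocol of any specific benchmark.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recall_at_k(text_embs, image_embs, k):
    """Fraction of texts whose paired image (same index) is in the top k."""
    hits = 0
    for i, text in enumerate(text_embs):
        ranked = sorted(
            range(len(image_embs)),
            key=lambda j: cosine(text, image_embs[j]),
            reverse=True,
        )
        if i in ranked[:k]:
            hits += 1
    return hits / len(text_embs)

if __name__ == "__main__":
    # Hypothetical 2-D embeddings; real models use hundreds of dimensions.
    texts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    images = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.1]]
    for k in (1, 2, 3):
        print(k, recall_at_k(texts, images, k))
```

Because the score depends only on rankings, it avoids the need for a distance metric between generated and ground truth outputs.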

Learning the alignment currently works very well for higher-level concepts, for example understanding the visual differences between a wolf and a bear, but it fails in the details. For example, simple spatial understanding, like discerning between left and right, surprisingly often does not work. Unusual compositions are also rarely understood well. For the prompt “a cup on a spoon”, the image generation model DALL-E 2 generates only spoons in cups. This reveals serious deficits in the model’s language understanding.

This project aims to overcome failure cases of those existing solutions by improving models that understand the relationship between words and images (possibly also videos), as measured by existing benchmarks. Additionally, new benchmarks will be created to measure performance in those areas more precisely. Finally, the aim is to demonstrate that these image-text multimodal models can outperform language models in purely textual domains (when there is no visual information available at inference time).

As a first step, we consider three possible sources of the aforementioned problems with image-text models: the model architecture, the data and the learning strategy.

- The initial experiments have shown that the image and text processing architectures are capable of learning basic physical relations like “left” and “right”.

- Since the datasets used in this domain contain millions up to billions of image-text pairs, a lack of data also seems unlikely.

- Therefore, either the quality of the data or the learning strategies must be the problem.

To start, the goal of this project is therefore to examine the data quality for the specific purposes and to improve image-text alignment via novel curriculum learning strategies.

The main challenges will be working with very big datasets and doing meaningful evaluation.

Reyhaneh Kheirbakhsh

Student

Project Title

Semantic Deep-Learning Approach for Medical Image Analytics

Project Description

Gliomas are among the most aggressive primary tumours that occur in the brain and spinal cord. A glioma can affect brain function and be life-threatening, depending on its location and rate of growth. Gliomas are classified according to the type of glial cell involved in the tumour, as well as the tumour’s genetic features, which can help predict how the tumour will behave over time and the treatments most likely to work. Therefore, identifying the type of glioma will help determine appropriate treatment and prognosis. In this project, we propose to use data analytics to identify the type of glioma tumour using an elaborate mining process. The process requires data collection, data pre-processing and segmentation, application of deep learning and interpretation of the results. The innovative part of this project is its ability to incorporate semantic elements in the learning process so that the results are reliable and highly accurate. This study will be extended to other types of medical images to identify other types of tumours and injuries.

Svetoslav Nizhnichenkov

Student

Project Title

Human-in-the-loop model training with explanations and feedback

Project Description

This PhD will work on human-compatible AI systems with a focus on trust and user-induced feedback loops. Recent advances in explainability and interactive AI have opened up avenues for users to influence AI systems; however, this presents many challenges, such as how to quantify knock-on effects, how to seek feedback from users, how to efficiently incorporate user feedback, and how to help AI systems negotiate a common objective and understanding between the system, the domain expert user and the AI system designer. This PhD will explore research challenges in this space with a view to solving real-world problems that AI system designers face.