COHORT.4

2022 – 2026

Alexandria Mulligan

Student

Project Title

ROP-StuDIO

Project Description

ROP-StuDIO is a research project centred on medical applications of machine learning for retinopathy of prematurity (ROP). At present, ROP is the leading cause of preventable childhood blindness and affects premature newborns. The more premature the birth, the greater the risk of ROP, and the greater the risk that medical scanning poses to the child’s health. ROP sometimes resolves naturally, but in other cases it is severe and requires treatment. Current ROP treatment can save part of the vision of a child who would otherwise go blind. It is therefore critical for ROP screening methods to predict the risk level of each newborn, both to avoid unnecessarily limiting the sight of a child who may fully recover and to avoid missing treatment for a child who could have had some of their vision saved but ends up with none. In addition, ROP screening methods require a specialised medical practitioner, who is not always available in remote locations, and the methods themselves are invasive for premature babies in a critical state.

Research Aim:

This project aims to explore and address two research questions. The first is: “Can machine learning applications use clinical data from premature babies and their mothers to assess the risk of infants developing ROP?”

The second is: “Can deep learning approaches be used to track ROP development using retinal fundus images?” Exploring this question includes developing a set of guidelines on how to acquire optimal retinal scans for analysis. This secondary outcome is intended to reduce the number of images and scans necessary for each baby.

Research Scope and Objectives:

The scope of this project is agile, with seven main sprints or deliverable packages. The first is achieving domain understanding through a literature review and interaction with domain experts. One of these experts is the internal advisory supervisor for this project, who has access to a substantial image dataset at Cork University Hospital. The second sprint focuses on this dataset and requires more domain training to understand the retinal scans in the context of ROP. The third sprint is to clean the data and perform the necessary image pre-processing for the fourth sprint, which seeks to develop a deep learning model for image evaluation. The discoveries from this fourth work package will be prepared for publication.

The fifth sprint is model development and testing on clinical data from the babies and their mothers. Acquiring this data may be challenging, and access will need to be applied for. The resulting model should identify premature babies as high risk or low risk for ROP based on the clinical data, building on the current practice of using gestational age and weight as risk identifiers. The feature behaviours and the model developed in this stage will also be prepared for publication.

Work package six evaluates the models from packages four and five alongside clinical experts. The aim is to determine whether the models produce results that are acceptable in clinical practice. The final package is the thesis write-up and submission, combining the results and discoveries from all prior work packages.

Andrea Heaney

Student

Project Title

A machine learning approach to women+ centred health across the lifecycle

Project Description

Historically, women+ have been significantly underrepresented in medical research (Merone, Tsey, Darren, Nagle, 2022). This is seen in policies relating to women+ health and in clinical trial research, as well as in the presentation and management of conditions in clinical settings (Merone, Tsey, Darren, Nagle, 2022) (Mauvais-Jarvis, Merz et al., 2020). For example, in 1977 the Food and Drug Administration (FDA) in the US released policy guidelines for clinical trials, General Considerations for the Clinical Evaluation of Drugs, which were based on data that excluded all “Women of Childbearing Potential” from phase 1 clinical trials, regardless of whether they were on birth control, were single, or had a husband who had had a vasectomy (U.S. FDA, 1977). It wasn’t until 1993 that the FDA revised its 1977 publication, allowing women+ to be included in all stages of clinical trials if certain criteria were met (U.S. FDA, 1993). This was done because of a “growing concern that the drug development process does not produce adequate information about the effects of drugs on women” (U.S. FDA, 1993).

In addition to the exclusion of women+ from clinical trials and health data collection processes, the presentation and prevalence of health conditions can vary according to gender (Mauvais-Jarvis, Merz et al., 2020), but this is not always considered in the clinical management of female patients. For example, clinical evidence shows that women+ are half as likely as their male counterparts to receive interventional medicine for coronary artery disease (Weisz, Gusmano, Rodwin, 2004).

Although there is clear evidence of exclusion and bias in women+ healthcare, one must first know where the deficiencies and biases lie within particular conditions to be able to address them appropriately. Machine learning has the potential to make a vast impact on women+ centred health by analysing health-related data for various conditions, identifying these key deficiencies and biases, and addressing them to develop a more appropriate approach to the management of these conditions.

Research Aim: The aim of this research is to explore the deficiencies in the management of women+’s health conditions from presentation to diagnosis and treatment across the lifecycle and investigate issues of bias and exclusion.

Objectives: The first steps in this research will involve a qualitative study to explore the key deficiencies in the management of women+’s health with key experts in the areas of health (GPs and pharmacists) and health-related policy makers. The data collected will be analysed using thematic analysis (Braun and Clarke, 2006), and the results will then directly inform subsequent data-driven explorations of particular health conditions where deficiencies and biases have been identified.

Data Exploration – Some of the main sources of data initially identified include:

  • The Autoimmune Association (charity/research organisation)
  • DAISy PCOS (research organisation)
  • SWAN datasets (open source data)

Statistical Analysis – Data from stage 2 will be analysed to determine the critical factors that have the largest impact on health outcomes for women+.

Mazhar Qureshi

Student

Project Title

XAI for hate speech monitoring on social media

Project Description

In this age of uninterrupted social media access, the extent of connectivity for an ordinary individual has reached unprecedented levels, allowing the spread of ideas to increase manyfold and turning the world into a global village of public views and opinions. Social media’s low-cost and high-speed connectivity has made it a favourable avenue for alternative or alt views that may be underrepresented in mainstream media. Hate speech on social media has been an active research topic for many years under the broader umbrella of Internet Sciences. Most researchers define ‘hate speech’ as derogatory remarks towards an individual, race, religion, gender, or sexual orientation. However, what constitutes a derogatory remark and what does not remains a much-debated topic.

Several techniques have been proposed in the artificial intelligence community to identify hate speech and disinformation. Several benchmark datasets contain data gathered from popular social media platforms, e.g., Twitter and Facebook, and from other platforms such as Gab and Whisper. This data is often biased due to the source of labelling, and models trained on these datasets fail to identify multiple hate-speech incidents. Conversely, many excerpts are falsely identified as hate speech. The lack of consistency in the definitions, labels and classification of hate speech, along with a lack of explainability behind classification, causes mistrust between social media networks and users.

The explainability of AI models refers to the degree to which an ML model is understandable to its stakeholders. Explainable AI (XAI) aims to improve the user experience by increasing trust in the decision-making capabilities of a system. The introduction of explainability to purpose-built AI and ML solutions has been a sought-after concept for several years now. Around the world, policymakers have pushed for more explainable and transparent solutions to enable effective policy making using AI in critical systems. An explainable model for hate speech detection can likewise provide insights to policymakers for legislation on hate speech control on social media. Furthermore, explainable solutions improve the public understanding of these frameworks and algorithms.

There is a need to address the lack of explainability in monitoring hate speech on social media. Establishing a certain level of trust between the system and its stakeholders is a genuine requirement. Similarly, explainability enables policymakers to develop better policies with an increased understanding of the system. The following research objectives define the scope of the project:

  • To analyse the current systems in place for hate speech detection and their implementation, and to review the relevant literature.
  • To develop a hate speech detection system that can classify hateful statements with minimum biases.
  • To explain hate speech classifications by applying and adopting various effective XAI approaches that bring transparency to policy-violation decisions.
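
As an illustration of the kind of explanation the third objective refers to, the sketch below applies leave-one-out attribution to a text classifier: each token is removed in turn and the drop in the classifier's score is recorded. This is only a minimal sketch and not the project's method. The `toy_score` function, its lexicon and its weights are hypothetical stand-ins for a trained model.

```python
from typing import Callable, Dict, List

# Hypothetical lexicon with made-up weights; a real system would use a trained model.
TOY_WEIGHTS: Dict[str, float] = {"stupid": 0.8, "people": 0.1}


def toy_score(tokens: List[str]) -> float:
    """Stand-in classifier: sums per-token weights and caps the result at 1.0."""
    return min(1.0, sum(TOY_WEIGHTS.get(t.lower(), 0.0) for t in tokens))


def leave_one_out(
    tokens: List[str], score: Callable[[List[str]], float]
) -> Dict[int, float]:
    """Attribute the score to each token position by removing it and measuring the drop."""
    full = score(tokens)
    return {
        i: full - score(tokens[:i] + tokens[i + 1:])
        for i in range(len(tokens))
    }


tokens = "those people are stupid".split()
attributions = leave_one_out(tokens, toy_score)

# Position of the token whose removal lowers the score the most.
top = max(attributions, key=attributions.get)
print(tokens[top], attributions)
```

A larger attribution means the token contributed more to the flagged score, which is the kind of evidence a moderator or user could inspect when a post is classified as a violation.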

Nighat Bibi

Student

Project Title

Explainable Artificial Intelligence (XAI) in Healthcare

Project Description

The brain is the body’s command centre; it controls the function of each organ. The effects of any disruption in the brain can be disastrous. Therefore, it is essential to find brain illnesses early, before they worsen. Brain tumours, Alzheimer’s disease, autism, and other common brain conditions must be recognised in the early stages; otherwise, the outcome may be worse.

Brain tumours occur because of the abnormal development of cells in the brain. They are one of the significant causes of death in adults around the globe. Millions of deaths can be prevented through the early detection of brain tumours. MRI images are considered helpful for detecting and localising tumours.

Alzheimer’s disease is a degenerative neurological condition that causes the brain to atrophy (shrink) and brain cells to die. Early-onset Alzheimer’s can affect people between their 30s and their mid-60s, and the disease affects 5.8 million people in the United States who are 65 years or older. It is a common cause of dementia. Sadly, Alzheimer’s is incurable and can cause death and a severe loss of brain function. Therefore, it must be detected early and treated.

Autism spectrum disorder (ASD) is a neurological disorder that impacts how people connect with others, communicate, learn, and behave. It first manifests in early childhood, evolves throughout life, and needs to be caught early to speed up therapy and recovery. Medical brain imaging techniques may be used to identify these impairments.

There are different biomedical imaging techniques. However, MRI provides clear images of the brain that can support an accurate diagnosis of brain diseases.

Many AI-based approaches already exist for diagnosing brain diseases; however, black-box approaches are not considered sufficiently reliable in the healthcare field, so the explainability of AI-based models is crucial in disease diagnosis. Explainable Artificial Intelligence helps researchers justify their models with transparent results, which builds trust among clinicians, doctors, and patients.

Objectives:

We aim to provide explainability for the diagnosis of brain diseases, namely brain tumours, Alzheimer’s disease, and autism, from MRI images. The fundamental objectives of this research are to:

  • Provide accurate, fast, and early detection of brain diseases
  • Provide a transparent/trustworthy/explainable diagnosis of brain diseases (why and how our model predicts these results)
  • Detect more than one type of brain disease from MRI images
  • Demonstrate that AI-based models are trustworthy for the diagnosis of diseases from MRI images

Approach

In this research, machine learning and deep learning models will be employed to diagnose brain diseases (brain tumours, Alzheimer’s disease, and autism) with high accuracy from MRI images, and XAI methods (such as SHAP, LIME, and LRP) will be used to provide transparency into the models and the reasoning behind their decisions (outputs).
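
To give a feel for the idea that SHAP builds on, the sketch below computes exact Shapley values for a tiny model by enumerating every feature subset. It is a minimal illustration under stated assumptions, not the SHAP library and not the project's pipeline: `toy_model`, the input `x` and the all-zero `baseline` are hypothetical, and exact enumeration is only feasible for a handful of features, far fewer than the pixels of an MRI image.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Set


def shapley_values(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley value of each feature of x, relative to a baseline input."""
    n = len(x)

    def value(subset: Set[int]) -> float:
        # Features in `subset` keep their real value; the rest take the baseline value.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def toy_model(features: Sequence[float]) -> float:
    """Hypothetical model: one additive term plus one interaction term."""
    a, b, c = features
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the model output for `x` and for the baseline, and the interaction term is shared equally between the two features involved. Properties like these are what make such attributions usable as an explanation of a single prediction.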

Van Hoang

Student

Project Title

Style and Personalisation in Situated Interaction

Project Description

Since the appearance of Siri from Apple, dialogue systems have become more and more prominent in our lives. Recently, there has been increasing interest in the Natural Language Processing (NLP) community in designing adaptive systems. Initial research has shown that stylized and personalized conversations, tailored to users’ needs and preferences, help strengthen the connections between dialogue systems and human users. As a result, personalized content improves user engagement in conversations, increases communication effectiveness, and develops trust in the systems.

For the selection of user-centred content, psychologically motivated concepts such as emotions and personality have been investigated and incorporated into the development of human-like conversational dialogue systems. In contrast to short-lived emotions and affective states, personality traits are more stable and enduring over time. Therefore, personality is better suited to modelling long-term user preferences, while emotions are better suited to short- and mid-term preferences. Injecting human traits into a system should start with understanding real human interactions. However, there is seemingly a lack of insights from other disciplines in popular research literature. The definitions of “emotions” and “personality” are often data-driven, and so are the responses of the systems to the users. For example, in the PERSONA-CHAT dataset, a persona, or personality, is a list of five random characteristics.

There are three key challenges in delivering stylized and personalized content to users by emotion- and personality-aware dialogue systems.

The first one involves the automatic detection of the user’s affective states and personality traits to build their models. Which emotion and personality inventories should be selected? And what could be used as feature cues for the detection (e.g. texts, speech, body language)? Secondly, with the user data from the previous step, the generated responses should be personalized and stylized to user preferences. Furthermore, the personality of the systems should be consistent throughout the conversations. The last challenge is about the ethical aspects of the dialogue systems. For example, given the users’ distressing emotional states, how should the systems respond? More importantly, when should they try to change user behaviours, and when not? 

Using established theories from both psychology and linguistics, and the latest model architectures from NLP, the project aims to partly address the second and third challenges. In the Style Transfer task, the GAN architecture has been utilised extensively for the conversion of texts from one style to another according to users’ preferences. This method rests on the assumption that style and content can be separated completely. However, recent work has shown that such a clear separation is difficult, if not impossible, to attain, depending greatly on the domain. These findings have motivated us to examine other frameworks for a deeper understanding of their respective strengths and weaknesses. Taking a data-centric approach to the challenges, we work to develop flexible ML applications in NLP that can deliver these goals.