COHORT.1

2019 – 2023

Alan Cowap

Student

Project Title

Project Description

Both humans and machines have difficulty detecting fake news, identifying bias, and recognising emotions. These problems take on a new dimension with the advent of ever more sophisticated machine-generated text, such as GPT-2 and Grover, which were both released this year. We now face the additional difficulty of detecting machine-generated synthetic text. Such is the sophistication of machine-generated text that there is ongoing work on release strategies for text generators in order to avoid misuse. GPT-2 underwent a staged release approach and was fully released publicly only last week, whereas Grover's authors plan to release it because they found that "the best defense against Grover turns out to be Grover itself". There is ongoing work to detect synthetic text and bias; these approaches can be categorised into human detection, automated ML-based detection, and human-machine teaming. Metadata-based prevention (e.g. time taken to write text, the social graph of participants) provides another tool for detecting synthetic text. My initial scan of the work to date shows no inclusion of emotion classification in these approaches. In one example of controlled generation, Grover was prompted with the headline "Timing of May's 'festival of Britain' risks Irish anger" (note the emotion "anger" in the prompt) and tasked with writing an article. The human-authored article includes emotional words such as "fear", "attacks", "hostility", and "mocked", whereas Grover's generated article is relatively light on emotion. Other useful applications for emotion classification include assisting people who have difficulty detecting emotions, e.g. people with Asperger's syndrome, and allowing people to filter content based on emotions. Can emotion help detect synthetic text?

There are several schemes for classifying emotions. The IBM Watson Tone Analyzer detects seven tones in written text: anger, fear, joy, sadness, confident, analytical, and tentative. Other approaches use the six emotions of Ekman's model: happiness, sadness, surprise, disgust, anger, and fear. Several other models exist for classifying emotions.

This proposal is to investigate the role emotion classification can play in synthetic text detection. A possible roadmap is to establish the state of the art, partner with OpenAI (who are reaching out for partners), choose the most appropriate emotion set(s) and benchmark(s), and then compare human and synthetic text using, for example, a classifier trained on both human and synthetic text. The results of this proposal should prove useful and could stand alone, or form part of the wider analysis of neural network interpretability and explainable AI.
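
As a concrete illustration of the comparison the roadmap describes, the following is a minimal sketch of a classifier that tries to separate human-written from machine-generated text using counts of emotion-bearing words as features. The toy lexicon, example texts, and labels are illustrative assumptions only, not part of the proposal.

```python
# Minimal sketch: emotion-word counts as features for human vs. synthetic text.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy Ekman-style lexicon; a real study would use a curated emotion lexicon.
EMOTION_LEXICON = ["anger", "fear", "joy", "sadness", "disgust", "surprise",
                   "hostility", "mocked", "attacks"]

# Toy corpus for illustration; a real study would use matched human/Grover articles.
texts = [
    "The attacks sparked fear and anger, and critics mocked the plan.",    # human-written
    "The festival of Britain will include events across several cities.",  # machine-generated
]
labels = [0, 1]  # 0 = human-written, 1 = synthetic

# Features: how often each emotion word appears in an article.
clf = make_pipeline(CountVectorizer(vocabulary=EMOTION_LEXICON),
                    LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["Residents voiced hostility and fear over the decision."]))
```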

Bashayar Al Mukhaini

Student

Project Title

Machine learning and AI to optimise the cost of ownership for small-scale reverse osmosis processes

Project Description

The demand for agricultural, industrial, and potable water for domestic use has increased continuously over the last thirty years, reportedly increasing by 1% year on year since the 1980s (UN Water report, 2019). By 2050 consumption is expected to exceed current usage by 20 to 30%, leaving many countries experiencing severe water stress. It is evident that effective and efficient management of this vital resource is critical. Desalination technologies are becoming increasingly necessary to meet water demand, with reverse osmosis being the most prevalent technology, accounting for greater than 60% of installed global capacity (Desal data, 2016). Reverse osmosis, in conjunction with its necessary pre-treatment processes, is resource intensive, particularly in terms of energy, chemicals, and membranes. Economies of scale mitigate operating costs somewhat for large seawater desalination plants. However, smaller-scale systems are becoming more common to treat low volume saline water for industrial and agro-industrial applications, and these smaller systems pose specific challenges in terms of process and operational cost optimisation. ML techniques such as support vector machines and artificial neural networks have been applied to model various desalination processes that pose multivariate and time series challenges. However, it is unclear whether these approaches are optimal for smaller-scale industrial seawater treatment. The aim of this project is to develop models using AI and ML techniques to optimise the cost of ownership in small-scale desalination and water treatment processes. An instrumented and automated reverse osmosis rig will be used to collect data under different operating conditions. Using a combination of existing reverse osmosis operational data and results from experimental work, AI/ML techniques will be applied based on current methodologies and engineering techniques to establish the benefits and limitations of computational intelligence and propose methods for optimisation of small-scale desalination processes.
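
To make the modelling idea concrete, the sketch below shows the kind of model described above: an artificial neural network regressor mapping reverse-osmosis operating conditions to a cost-related target such as specific energy consumption. The file name, feature names, and target column are hypothetical placeholders, not real rig data.

```python
# Minimal sketch: ANN regression on reverse-osmosis operating data (scikit-learn).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

# Hypothetical log exported from the instrumented reverse osmosis rig.
df = pd.read_csv("ro_rig_runs.csv")
features = ["feed_pressure_bar", "feed_temp_c", "feed_salinity_ppm", "recovery_pct"]
target = "specific_energy_kwh_per_m3"

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df[target], test_size=0.2, random_state=0)

# Standardise inputs, then fit a small neural network regressor.
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000))
model.fit(X_train, y_train)
print("R^2 on held-out runs:", model.score(X_test, y_test))
```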

Carles Garcia-Cabrera

Student

Project Title

Semi-supervised Learning for CMR Segmentation using ECGs

Project Description

Cardiac Magnetic Resonance imaging is one of the most widely used scanning methods for acquiring data from patients for a variety of medical conditions. Similarly, electrocardiograms provide a great deal of information about different parts of the heart and its cycle. While the adoption of machine learning techniques in medical image processing has been slower than in other domains, this is a growing area given the potential of machine learning to assist in diagnosis and reduce costs. There are already examples of machine learning algorithms using each of these sources individually to identify and locate issues. Using both signals at the same time is an interesting research direction, as it could lead to performance improvements while enhancing the explainability that this kind of algorithm usually lacks. Data is a key challenge when trying to use off-the-shelf algorithms in this area, specifically the amount of annotated data and its quality. Many researchers report in the literature how they struggle to achieve good results with existing annotated data, especially when working with open datasets. Furthermore, in some cases there is a large amount of annotated data, but the labels are noisy and lead to very poor accuracy outside the training datasets. For this reason, in my PhD project I would like to address this challenge by using semi-supervised learning, not just to overcome the lack of labelled data but also to improve performance on test sets and to enhance the explainability of the algorithm. To achieve this goal I will use data provided by collaborators in Tampere and perhaps some of the available open datasets. With these data, my goal will be to bring to this field some of the current state-of-the-art techniques in computer vision for similar problems and to extend them based on the lessons learned. I strongly believe that succeeding in this objective will have an impact on the medical and health sciences, making diagnosis better and cheaper, and improving the explainability of state-of-the-art machine learning algorithms.
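
As an illustration of one common semi-supervised approach that could be applied here, the sketch below shows a pseudo-labelling training step for a segmentation network in PyTorch: the model's own high-confidence predictions on unlabelled CMR slices are reused as training targets. The model, batches, confidence threshold, and loss weighting are assumptions for illustration only.

```python
# Minimal sketch of pseudo-labelling for semi-supervised segmentation (PyTorch).
import torch
import torch.nn.functional as F

def semi_supervised_step(model, optimiser, labelled_batch, unlabelled_batch,
                         threshold=0.9, unsup_weight=0.5):
    images, masks = labelled_batch      # masks: (B, H, W) class indices
    unlab_images = unlabelled_batch

    # Supervised loss on the labelled slices.
    sup_loss = F.cross_entropy(model(images), masks)

    # Pseudo-labels: keep only high-confidence pixels from the model's own predictions.
    with torch.no_grad():
        probs = torch.softmax(model(unlab_images), dim=1)
        conf, pseudo = probs.max(dim=1)
        pseudo[conf < threshold] = -1   # mark low-confidence pixels to be ignored

    unsup_loss = F.cross_entropy(model(unlab_images), pseudo, ignore_index=-1)

    loss = sup_loss + unsup_weight * unsup_loss
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```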

Chenyang Lyu

Student

Project Title

Injecting Structured Knowledge into Pretrained Language Models

Project Description

Pre-trained language models such as BERT (Devlin et al., 2018) and XLNet (Yang et al., 2019) have greatly improved the performance of many NLP tasks. These models can capture rich patterns from large-scale corpora and learn good representations for text. However, such models have shortcomings: they underperform on complicated, noisy text (Xiong et al., 2019) or on text that requires inference and external knowledge to be understood. Niven et al. (2019) found that BERT performs well on a reasoning task because it exploits spurious statistical cues in the dataset, highlighting its limited capability to truly understand natural language. Liu et al. (2019) proposed incorporating knowledge graphs into BERT to aid understanding. We build on this work and conjecture that incorporating structured knowledge, such as entity relations or linguistic information, can improve such models' performance on some NLP tasks.

Specifically, we aim to explore how to inject structured knowledge into large-scale pre-trained models, focusing on two tasks: Question Answering and Sentiment Analysis. Our main focus will be on English, but other languages such as Chinese will be explored, resources permitting, as well as cross-lingual representations. Two side questions we also expect to address are 1) how to reduce model size by incorporating structured knowledge, and 2) what role particular pretraining objectives play.
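
As one simple illustration of injecting structured knowledge, the sketch below fuses external entity embeddings with BERT's sentence representation before classification. The entity vocabulary, the fusion-by-concatenation design, and the classification head are illustrative assumptions, not the method of Liu et al. (2019).

```python
# Minimal sketch: fusing entity-knowledge embeddings with a BERT encoder (PyTorch).
import torch
import torch.nn as nn
from transformers import BertModel

class KnowledgeFusedClassifier(nn.Module):
    def __init__(self, num_entities, entity_dim=100, num_labels=2):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        # Hypothetical embedding table for entities linked to the input text.
        self.entity_emb = nn.Embedding(num_entities, entity_dim, padding_idx=0)
        hidden = self.bert.config.hidden_size
        self.classifier = nn.Linear(hidden + entity_dim, num_labels)

    def forward(self, input_ids, attention_mask, entity_ids):
        # Sentence representation from BERT's [CLS] token.
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls_vec = out.last_hidden_state[:, 0]               # (B, hidden)
        # Simple mean over the linked entities' embeddings (padding ids map to zeros).
        ent_vec = self.entity_emb(entity_ids).mean(dim=1)   # (B, entity_dim)
        return self.classifier(torch.cat([cls_vec, ent_vec], dim=-1))
```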

Phuc Le Khac Hong

Student

Project Title

Using Generative Forms of Media to Summarise Video

Project Description

What is the area of the project?
Deep Learning has made enormous progress in the performance and capability of understanding multimedia content in the past few years. Sitting at the intersection of Computer Vision and Natural Language Processing, video poses many interesting research challenges and opportunities.

Why is it important?
An enormous amount of video content is being generated daily. Processing and making sense of these information streams can provide tremendous commercial value. From a theoretical point of view, being able to process and understand video content, compared with images and text alone, is a step toward more advanced and general AI.

What are the vectors of attack?
The audio and visual information in video makes it a great testbed for multi-modal Deep Learning. Highly structured and sequential in nature, it also represents fertile ground for self-supervised and unsupervised learning methods. In addition, truly generative video content is under-explored compared to generative content in the form of text and images.
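
As an example of the self-supervised direction mentioned above, the sketch below implements a simple temporal-order verification pretext task: a small network is trained to decide whether three frames of a clip are in their original order or have been shuffled, so the supervision signal comes from the video itself. The three-frame setting, frame size, and tiny encoder are illustrative assumptions.

```python
# Minimal sketch: temporal-order verification as a self-supervised video task (PyTorch).
import torch
import torch.nn as nn

class OrderVerifier(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        # Per-frame encoder: (3, 64, 64) image -> feature vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # Classifier over the concatenated features of the 3 frames.
        self.head = nn.Linear(3 * feat_dim, 2)  # in-order vs. shuffled

    def forward(self, frames):                  # frames: (B, 3, 3, 64, 64)
        feats = [self.encoder(frames[:, i]) for i in range(3)]
        return self.head(torch.cat(feats, dim=-1))

# The training signal comes for free: label 1 for clips left in order,
# label 0 for clips whose frames were randomly permuted.
```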

What is the expected outcome?
Major challenges in video can be roughly divided into a few sub-tasks related to video structuring, video description, video shortening, video rating/ranking, and video generation.

This project will focus on advancing the state-of-the-art in different video-related tasks and explore how generative forms of multimedia content can be created as summaries of video.

Wandri Jooste

Student

Project Title

Knowledge Distillation: Building Fast, Compact and Deployable Deep Neural Networks for Resource-Constrained Environments

Project Description

Deep neural networks (DNNs) underpin state-of-the-art applications of artificial intelligence (AI) in almost all fields, such as image, speech, and natural language processing. However, DNN architectures are often data, compute, space, power, and energy hungry, typically requiring powerful GPUs or large-scale clusters to train and deploy, and have consequently been viewed as a "non-green" technology. Furthermore, the best performing models are often ensembles of hundreds or thousands of base-level models. The space required to store these cumbersome models, and the time required to execute them at run-time, severely limit their use in applications with restricted memory, storage space, or computational power, such as mobile devices or sensor networks, and in applications where real-time predictions are needed.

Knowledge distillation, a cutting-edge model compression method for deep neural networks, transfers the knowledge of a teacher network (a cumbersome model) to a student network (a small model), making it a promising technique to disrupt the current situation in NLP, where almost all systems rely on cumbersome DNN architectures. Knowledge distillation techniques have been successfully adapted to the state-of-the-art speech synthesis model WaveNet, which generates realistic-sounding voices for the Google Assistant; the resulting production model is more than 1,000 times faster than the original, with higher quality. However, for NLP tasks that use cumbersome DNNs (e.g. neural machine translation), distilling knowledge is more challenging and differs from the speech task.
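
For reference, the sketch below shows the classic soft-target distillation objective in the style of Hinton et al.: the student is trained to match the teacher's temperature-softened output distribution while also fitting the ground-truth labels. The temperature and weighting values are illustrative assumptions, not the framework proposed here.

```python
# Minimal sketch of a soft-target knowledge distillation loss (PyTorch).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between the softened teacher and student distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```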

Therefore, our goal in this proposal is to develop a more efficient and effective knowledge distillation framework to build fast and compact DNN models for NLP tasks, and to deploy them in resource-constrained environments without quality loss and with low latency. To meet this goal, we have three specific questions to address:

(1) the architecture of the student model: it needs to be simple and small, suitable for parallel computing during both training and inference, and suitable for deployment in resource-constrained environments;

(2) the kind of knowledge that needs to be transferred or distilled: the original model memorises the whole dataset and learns different kinds of knowledge, so how do we design the objective function to transfer the required knowledge to the student model?

(3) the balance between model size and performance: we need to carefully design the architecture, the knowledge to be distilled, and the objective function to strike a better balance between model size and system performance, based on the deployment and run-time requirements.

Yasser Abdelaziz Dahou Djilali

Student

Project Title

Graph Neural Networks

Project Description

Many scientific fields study data with an underlying structure that is a non-Euclidean space. Examples include social networks in computational social science, sensor networks in communications, functional networks in brain imaging, and regulatory networks in genetics. For this reason, Graph Neural Networks (GNNs) have recently emerged as an interesting methodology for analysing graphs while leveraging the power of deep learning. Given their ability to fit many real-world datasets that have an inherent graph structure, GNNs have found applications in many different domains, including:

Computer Vision Applications in computer vision include scene graph generation, point cloud classification and segmentation, and action recognition.

Recognising semantic relationships between objects facilitates the understanding of the meaning of a visual scene. Scene graph generation models aim to parse an image into a semantic graph which consists of objects and their semantic relationships. Another application reverses the process by generating realistic images given scene graphs. This hints at the intriguing possibility of synthesising images given textual descriptions.

Traffic Accurately forecasting traffic speed, volume, or road density in traffic networks is fundamentally important in a smart transportation system. Recent work addresses the traffic prediction problem using spatio-temporal GNNs.

Recommender Systems Graph-based recommender systems consider items and users as nodes.

Chemistry In the field of chemistry, researchers apply GNNs to study the graph structure of molecules/compounds.
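
To make the underlying mechanism concrete, the sketch below implements a single graph convolutional layer in the style of Kipf and Welling: each node aggregates its neighbours' features through a symmetrically normalised adjacency matrix before a learned linear transform. The dense-matrix formulation is an illustrative simplification.

```python
# Minimal sketch of one graph convolutional (GCN) layer (PyTorch, dense adjacency).
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (N, in_dim) node features, adj: (N, N) adjacency matrix.
        a_hat = adj + torch.eye(adj.size(0))          # add self-loops
        deg = a_hat.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.pow(-0.5))
        norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt    # D^-1/2 (A+I) D^-1/2
        return torch.relu(norm_adj @ self.linear(x))  # aggregate, then transform
```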

Though GNNs have proven their power in learning graph data, challenges still exist due to the complexity of graphs. This PhD project aims to target some urgent challenges and issues facing the generalisation of GNNs:

Model Depth: the performance of a ConvGNN drops dramatically with an increase in the number of graph convolutional layers. This raises the question of whether going deep is still a good strategy for learning graph data.

Scalability Trade-off The scalability of GNNs is achieved at the price of corrupting graph completeness. To perform the pooling operation that coarsens graphs, some works use sampling and others use clustering; in both approaches the model loses part of the graph information. By sampling, a node may miss its influential neighbours; by clustering, a graph may be deprived of a distinct structural pattern. How to trade off algorithm scalability and graph integrity could be a future research direction.

The aim of this PhD is to develop new algorithms that tackle these challenges and extend GNNs to new uses. This will be demonstrated by showing performance improvements in distinct application domains.