Understanding residential electricity consumers’ demand needs and renewable energy supply capabilities using explainable machine learning models
EU directives set out targets for renewable electricity, heat and transport. Low carbon technologies (LCTs) such as heat pumps (HPs) and electric vehicles (EVs) are components of the Government of Ireland’s Climate Action Plan to decarbonise heating and transport. The rate of adoption of LCTs and of other technologies such as domestic photovoltaic (PV) generation is uncertain. The impact of geographic clustering, uncertainty in weather, and the range of technology options for EVs, HPs and PV further complicates the evaluation of the impacts on the low voltage (LV) electricity distribution network. This poses considerable challenges to the management and operation of future distribution networks and smart grids. Similarly, renewable energy and LCTs have high potential to contribute to developing countries such as Uganda. These opportunities come with significant challenges. Currently, only 1.1% of Uganda’s energy needs are served by electricity, with 90% of total primary energy consumption in the form of firewood, charcoal, or crop residues. Solar offers potential in rural electrification schemes, while urban electricity networks are currently unreliable, with regular load shedding. The Government of Uganda’s national plans aim to use renewable energy to support social and economic development in an environmentally sustainable manner. The challenges and opportunities in Uganda differ from those in Ireland, but are linked by the potential for Machine Learning to identify solutions and recommendations. This project responds to the need to identify how energy systems can be transformed to be secure (reliable), clean (green and sustainable), and fair (ensuring the citizen is at the centre of, and benefits from, the transformed system). The project aims to use Machine Learning to support the evolving design demands of LV networks in urban areas, and to explore the potential for Machine Learning to support the development of renewable energy communities.
It will focus on the impacts of LCTs, and the potential opportunities for local energy communities. The methodology and specific research questions will be defined in detail. Sample research themes include: ML to support (rural & solar) energy communities; ML to support LV network design; ML to support Smart Grid – fair load management in urban grids. The main deliverable will build on ML Fundamentals (particularly sequential data) to inform future LV electricity distribution network design decisions, and will address ML in Society to support fair operations of renewable energy communities.
Machine Learning for Combinatorial Optimization problems in Network Design
Combinatorial optimization problems arise in many areas of computer science and other disciplines, such as business analytics, artificial intelligence and operations research. Prominent examples are tasks such as finding shortest or cheapest round trips in graphs, graph pattern matching, scheduling, time-tabling and resource allocation. These problems typically involve finding groupings, orderings or assignments of a discrete, finite set of elements that satisfy certain conditions or constraints. A large number of optimization problems are NP-hard. As a result, applications typically rely on approximation algorithms, carefully designed heuristics for specific instance classes, or metaheuristic frameworks to solve them efficiently. While the first two require significant human effort, the last set of techniques can produce solutions that are very far from optimal. Finding good optimization solutions efficiently has been a long-standing quest, and such solutions are likely to have very high impact in a range of applications. In recent years, researchers have been exploring whether machine learning techniques can be used to learn solutions directly from the input. Early results based on deep-learning techniques, which modelled these problems as sequence-to-sequence learning tasks, have shown that it is indeed viable to do so, as there are universal motifs present in graphs (across datasets and scales) that can be leveraged to learn effective models for optimization problems. However, these deep-learning based techniques do not seem to generalize well, require a lot of training data (which requires solving a large number of NP-hard problem instances) and are not interpretable. This PhD project will explore machine learning techniques to design novel heuristics that can achieve optimal (or close-to-optimal) solutions for combinatorial optimization problems.
The goal will be to use simple interpretable models relying only on local features of elements, such that these heuristics can potentially be mathematically analyzed for optimality guarantees on specific input distributions. In terms of learning technique, we will use a learning-to-prune framework, reinforcement learning and graph neural networks for this purpose. The success of this project will greatly augment the human ability to design algorithms for combinatorial optimization problems in industry.
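As a minimal sketch of the learning-to-prune idea, the following prunes low-scoring elements of a minimum-vertex-cover instance before a greedy heuristic solves the reduced problem. A fixed degree-based rule stands in for a trained classifier over local features, and the toy graph is invented for illustration.

```python
# Learning-to-prune sketch for minimum vertex cover: score vertices from local
# features (here, degree, as a stand-in for a trained model), prune the rest,
# then solve the reduced instance with a greedy heuristic.

def degree(graph, v):
    return len(graph[v])

def prune(graph, keep_fraction=0.5):
    """Keep only the highest-scoring vertices (proxy for a learned classifier)."""
    ranked = sorted(graph, key=lambda v: degree(graph, v), reverse=True)
    k = max(1, int(len(ranked) * keep_fraction))
    return set(ranked[:k])

def greedy_vertex_cover(graph, candidates):
    """Greedily cover all edges, preferring unpruned (candidate) vertices."""
    uncovered = {frozenset((u, v)) for u in graph for v in graph[u]}
    cover = set()
    while uncovered:
        gain = lambda v: sum(1 for e in uncovered if v in e)
        best = max(candidates, key=gain)
        if gain(best) == 0:            # fall back to all vertices if needed
            best = max(graph, key=gain)
        cover.add(best)
        uncovered = {e for e in uncovered if best not in e}
    return cover

# Toy star graph: centre 0 connected to 1..4; the optimal cover is {0}.
graph = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
cover = greedy_vertex_cover(graph, prune(graph))
```

The pruning step is where a learned, interpretable model would plug in; the rest of the pipeline stays a conventional heuristic, which is what makes optimality analysis on specific input distributions plausible.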
A credit rating evaluates the creditworthiness of an entity that is seeking to borrow money, with regard to a particular financial obligation (Investopedia, 2020). Credit ratings for corporates or governments are normally supplied by a credit rating agency, such as Fitch, Standard and Poor’s, or Moody’s. A credit rating also determines the cost of borrowing for the entity issuing a financial instrument. However, the rating agencies have a significant conflict of interest in evaluating the creditworthiness of an entity, as it is the entity that pays for the rating. The consequences of this conflict of interest were seen in the financial crisis of 2007-2008, when the highest ratings were assigned to financial products that were of significantly poorer quality (Stirier, 2008). Additionally, credit ratings are usually expensive due to the amount of labour required to produce them. An inaccurate or out-of-date credit score can result in a company, especially a small to medium-sized enterprise (SME), having a higher cost of borrowing. Thus, in my PhD, I would like to design a method, based on explainable machine learning (explainable ML), that would accurately evaluate an entity’s credit rating. A well-performing model would tackle the issue of rating agency conflict of interest by providing an unbiased evaluation of the creditworthiness of an entity. Academic literature has explored both statistical modelling and machine learning methods, finding that deep learning models overall yield better performance (Dastile et al., 2020). However, deep learning lacks transparency, which is required for credit rating models under the Basel II accord (BIS, 2006). As a result, researchers have been hesitant to implement machine learning models that do not satisfy the legal transparency requirements in credit research. Dastile et al., in a literature survey conducted in 2020, found that only 8% of the investigated studies used transparency techniques.
In my research, I will focus on creating an explainable machine learning approach that would yield performance comparable to that of deep learning models. In line with the Basel II accord, it would provide a sufficient level of detail on how the model arrives at a decision. Some success has been achieved using explainable ML in credit scoring to date (Fahner, 2018; Bussman et al., 2020). However, significant progress is yet to be made in the field in terms of the trade-off between model performance and explainability in corporate credit scoring.
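As a toy illustration of the kind of transparency the Basel II accord demands, the sketch below trains a scorecard-style logistic regression whose decision decomposes into additive per-feature contributions to the log-odds of default. The features, data and learning rate are invented; this is a transparent baseline for comparison, not the proposed method.

```python
import math

def train_logistic(X, y, lr=0.5, epochs=500):
    """Per-sample gradient descent on the logistic log-loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1 / (1 + math.exp(-z))
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def explain(w, b, x, names):
    """Additive per-feature contributions to the log-odds of default."""
    return {n: wj * xj for n, wj, xj in zip(names, w, x)}, b

# Invented data: [debt_ratio, trading_history (both scaled to 0-1)]; label 1 = default.
X = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 1.0], [0.7, 0.3], [0.3, 0.7]]
y = [1, 1, 0, 0, 1, 0]
names = ["debt_ratio", "trading_history"]
w, b = train_logistic(X, y)
contribs, bias = explain(w, b, [0.85, 0.2], names)
```

Each entry of `contribs` states exactly how much a feature pushed the score up or down, which is the level of detail a regulator can audit; the research question is how close such transparent models can get to deep learning performance.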
Deep Learning approaches for Group Anomaly Detection in Cyber security
Today, the frequency and scale of cyber-attacks and cyber fraud are increasing every year. Incidents related to cybercrime are becoming more sophisticated, complex and multi-faceted. For example, cybercriminals rely on tools and processes already installed on the system for their attack campaigns, because these tools are normally used by administrators, directors and security analysts for legitimate purposes in their routine tasks. On the defensive side, the detection of patterns that differ from typical behaviour is critically important for detecting new threats or fraud patterns. This requirement has been addressed by popular machine learning-based algorithms that are capable of detecting point anomalies. However, many such approaches cannot detect the variety of deviations that are evident in group datasets. For example, the activity of a domain admin on a machine can be similar to a cybercriminal’s activity, confusing any point anomaly detector. Identifying attacker activities in this case requires more specialised techniques for robustly differentiating such behaviour. Group Anomaly Detection, by contrast, aims to identify groups that deviate from the regular group pattern. Generally, a group consists of a collection of two or more points, and group behaviour can be described by a greater number of observations. Group Anomaly Detection has been studied in various domains to find group anomalies where point-wise methods fail. Recently, Group Anomaly Detection has been applied in cyber security with simple Deep Learning (DL) models, such as adversarial autoencoders, to detect targeted cybercriminals who hide their activity. However, such simple DL models are still limited in detecting sophisticated activities in cyber systems.
Hence, in this research, we are looking at developing a new Deep Learning model for Group Anomaly Detection in cyber systems in the presence of sophisticated activities such as multiple attack groups and new cyber-fraud patterns. This new Deep Learning model should be capable of multi-class classification. The new model will be evaluated on both open-source datasets and real-world datasets from the cyber security industry.
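The core intuition behind Group Anomaly Detection can be shown with a simple statistical sketch (invented data, not the proposed DL model): a group whose members are individually typical can still be collectively anomalous, for example scripted bot activity that is suspiciously uniform.

```python
import math

def group_stats(group):
    """Mean and variance as a crude group-level feature vector."""
    m = sum(group) / len(group)
    var = sum((x - m) ** 2 for x in group) / len(group)
    return m, var

def group_anomaly_scores(groups):
    """Score each group by the distance of its (mean, variance) from the others."""
    feats = [group_stats(g) for g in groups]
    scores = []
    for i, (m, v) in enumerate(feats):
        others = [f for j, f in enumerate(feats) if j != i]
        om = sum(f[0] for f in others) / len(others)
        ov = sum(f[1] for f in others) / len(others)
        scores.append(math.hypot(m - om, v - ov))
    return scores

# Three normal groups, plus one whose member values are individually plausible
# but collectively far too uniform (e.g. scripted activity): zero variance.
groups = [
    [1.0, 2.0, 3.0, 4.0],
    [1.5, 2.5, 3.5, 4.5],
    [1.2, 2.2, 3.2, 4.2],
    [2.5, 2.5, 2.5, 2.5],
]
scores = group_anomaly_scores(groups)
```

A point anomaly detector sees nothing unusual about any member of the last group; only a group-level representation exposes it, which is the gap the proposed deep model targets with richer learned group features.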
To say that AI is all-pervasive is now trite. However, despite its inescapable presence, society has yet to identify an effective way to oversee and control this technology. AI solutions which have an overall social benefit but which are also safe, trustworthy and legal are needed. “Control” in this context can take many forms, from specific company guidelines through to legislative interventions, from soft regulation to criminal law sanctions. The lack of compliance with, or absence of, standards, regulation and laws in AI impacts trustworthiness. This is a weakness in the adoption and usage of AI. There has been criticism of the use of AI models in the justice and healthcare systems, among others. The use of the COMPAS decision support tool in sentencing to assess recidivism, for example, is controversial. Dressel (2018) found that the tool’s accuracy was not dissimilar to predictions made by people without any criminological experience, and that race bias was a significant failing. Carter (2020) reviewed the ethical, legal and social implications of using AI in breast cancer care and emphasised the need for detailed discussion on when and what kind of AI should be deployed. On a global scale, the challenge being faced is how to control the development and deployment of AI solutions in a way that facilitates both progress and protection. This research investigates how a regulatory framework could be applied to AI and how we could best design a system to maximise adoption, oversight and compliance. Adopting an interdisciplinary approach, this research will test existing regulatory frameworks against the field of computer science. The research will require bridging the technical feasibility of measuring AI trustworthiness with socio-legal and regulatory practices and frameworks, using and combining methods from a variety of disciplines.
It will evaluate the hypothesis that a bottom-up regulation approach will provide trustworthy AI solutions with measurable compliance in a simpler and more legally sound way than a top-down product certification approach. This is an emerging and urgent challenge facing policy makers, and this project will provide an integrated perspective on the techno-socio-legal challenge.
Spoken Open Domain Dialogue Systems for Non-native Speakers
Open-domain dialog systems have been receiving a great deal of attention from both academia and industry, resulting in many applications. Chatbots like Alexa are designed for information retrieval and chit-chat purposes, while others like Replika are built for companionship and emotional support. There is also growing interest in using open-domain chatbots as language educators, as practising conversations is an effective way for non-native speakers to acquire a new language. However, building a chatbot that can interact with non-native speakers is challenging because: (1) Errors propagated from automatic speech recognition (ASR) systems lead to unexpected responses; (2) Non-native speech may contain many disfluencies and grammatical errors, which degrade the chatbot’s performance; (3) Most non-native speakers are not good conversationalists because of their limited linguistic ability, so chatbots must be highly interactive and engaging, and able to lead and create a meaningful conversation. This PhD will improve upon state-of-the-art open-domain chatbots, making it possible for them to create smooth and engaging conversations with non-native speakers. In particular, the PhD will focus on: (1) Making open-domain dialog systems robust to noisy inputs (e.g. ASR errors and disfluencies) by using multiple ASR decoding outputs or enabling the system to ask clarifying questions; (2) Making open-domain dialog systems more engaging by leveraging a user’s personalized data such as interests, goals, beliefs, and values. This information is necessary for the chatbot to choose the right topic for the conversation and to avoid discussing things outside the user’s interests. The chatbot will learn to do this by looking at dialogue examples in which two people who already know about each other have an engaging, meaningful conversation.
For direction (1), we will modify the transformer-based seq2seq model, allowing it to encode not only the dialog history but also additional ASR information at the current turn. The model will learn either to generate an appropriate response or to ask a clarifying question by looking at spoken dialogue examples. For direction (2), we will create a dataset called DeepConversations, consisting of many engaging, meaningful dialogues. We also propose a transformer-based memory network that encodes each piece of the user’s personalized data as an individual memory representation, and then generates the engaging response word by word. The outcome of this PhD will benefit not only non-native speakers but also native speakers who use chatbots for entertainment purposes.
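As a minimal illustration of exploiting multiple ASR decoding outputs, the sketch below fuses an n-best list by per-position word voting, a deliberate simplification of ROVER-style hypothesis combination that assumes the hypotheses are already aligned and of equal length; the hypotheses themselves are invented.

```python
from collections import Counter

def vote_hypotheses(hypotheses):
    """Return the word sequence picked by majority vote at each position.

    Simplifying assumption: all hypotheses are aligned and the same length
    (a real system would first align them, e.g. via a confusion network).
    """
    tokenised = [h.split() for h in hypotheses]
    assert len({len(t) for t in tokenised}) == 1, "sketch assumes aligned hypotheses"
    fused = []
    for words in zip(*tokenised):
        fused.append(Counter(words).most_common(1)[0][0])
    return " ".join(fused)

# Invented 3-best list: each hypothesis carries a different recognition error.
nbest = [
    "i want to book a flight",
    "i want to look a flight",
    "i want to book a fight",
]
fused = vote_hypotheses(nbest)
```

Errors that differ across hypotheses are voted out, while errors shared by all hypotheses survive; the latter case is where letting the dialogue model ask a clarifying question becomes the better strategy.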
Challenges
Even though many Machine Learning (ML) and Deep Learning (DL) object identification and classification methods have equalled or surpassed human-level performance, they still face adoption challenges, especially in the health sector. Adoption of AI in medical image analysis depends heavily on the trust users have in an automated system. Making AI transparent is one way to increase its adoption. The typical accuracy of a radiologist interpreting ligament injury from an MRI is 94%. ML and DL based classifiers can achieve similar levels of accuracy, but their black-box nature means that we do not know whether they are diagnosing using relevant features. Studies by Longoni et al. have shown that consumers/patients in healthcare are concerned that ML and DL based solutions would not account for their unique injury features as much as humans would. This can be true for DL models used in medical image classification, which are trained on large datasets and may not be able to look for unique features in a patient’s imaging output. This suggests that interpretability, human-in-the-loop (HITL) feature labelling, and unlearning inappropriate features could be combined to move towards more personalised models.
Objectives
The main objective of this project is to develop transparent medical image analysis solutions in which domain experts can participate in the model building process, by combining model interpretability and HITL ML techniques. These solutions will be driven by and demonstrated in the important application of Anterior Cruciate Ligament (ACL) injury classification. Models with prototype layers will be used to develop interpretable classifiers. The proposed project will involve designing user interfaces that help users navigate a trained model’s decision process and enable them to interact with the model by reporting back inaccurate image features the model may have picked up in its training phase. This will be followed by unlearning incorrect features that the trained model may be using to reach classification decisions, which in turn improves classification performance. Methods developed will be released to radiologists for evaluation. The resulting interpretability and unlearning solutions should be transferable across different knee joint injury classification problems, as well as to other body parts and imaging modalities. This project will have three main contributions: (1) Prototype-layer-based medical image classification and feature visualization; (2) A user interface for HITL feature labelling feedback; (3) Improved system performance through unlearning incorrect features extracted by a trained model.
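The prototype-layer idea can be sketched in a few lines. This is a hand-built 2-D stand-in for a trained prototype network (the prototypes, feature vectors and log-activation form are illustrative assumptions, not the project's model): classification is by similarity to class prototypes, and the same similarity scores double as the explanation shown to the expert.

```python
import math

def similarity(x, proto):
    """Log-scaled similarity: large when x is close to the prototype."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, proto))
    return math.log((d2 + 1) / (d2 + 1e-4))

def classify(x, prototypes):
    """prototypes: {class_label: [prototype vectors]} -> (label, evidence)."""
    evidence = {c: max(similarity(x, p) for p in ps) for c, ps in prototypes.items()}
    label = max(evidence, key=evidence.get)
    return label, evidence

# Invented 2-D "feature space" prototypes standing in for learned image patches.
prototypes = {"torn_ACL": [[1.0, 0.0]], "intact_ACL": [[0.0, 1.0]]}
label, evidence = classify([0.9, 0.1], prototypes)
```

Because the decision is literally "this case looks like that prototype", the HITL loop has a natural handle: a radiologist who flags a prototype as clinically irrelevant identifies exactly which component the unlearning step should remove or retrain.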
MRI Classification, Automatic Report Generation and Modelling of Sensor Data for Musculoskeletal Injury Management
The main objective of this project is to develop machine learning techniques for musculoskeletal injury management. The themes of the project are computer vision, language in machine learning and machine learning for sequential data. The project will consist of two phases. The first phase will focus on medical image analysis of Magnetic Resonance Imaging (MRI) in musculoskeletal regions such as the calf muscle. The aim of this retrospective analysis is to detect and classify injuries. The project may also extend to classifying injuries based on Diffusion Tensor Imaging (DTI). Applications for machine learning in medical image analysis extend beyond classification problems. Automatic generation of the radiologist report is a useful application of machine learning, as analysing a medical image and writing a standardised report can be both time consuming and tedious. Many notable approaches have been developed for automating the generation of radiology reports for X-rays of the chest. However, there is a scarcity of literature applying such techniques to 3D medical images such as MRIs. Phase one of the project will apply or further develop approaches to automate radiologist reports for MRIs of musculoskeletal areas. The phase one model will be trained with multi- and cross-modal inputs. Multiple images from an MRI, taken from different angles, will be input into the model, along with the radiologist report, in order to train the model to generate the report. Structured data such as injury statistics and demographic data will also be used as input to aid the generation of the radiologist report. Methods of combining multi- and cross-modal input will be explored. The objective of phase two is to develop machine learning techniques for musculoskeletal injury rehabilitation. This will involve analysing sensor data to ensure physiotherapy or exercise is performed correctly.
Electromyography (EMG) and/or accelerometry are some of the possible types of sensor data that will be used in phase two.
Bayesian approaches to identifying and ameliorating systematic bias in machine learning algorithms and data
Researchers have identified a number of ways in which various standard machine learning approaches can produce systematic bias against underrepresented or minority groups (or, more generally, against categories only present in small subsets of data). This project will look at ways in which such systematic bias can arise in Bayesian inference (a fundamental normative model behind many machine learning approaches). This project will also propose techniques for mitigating this bias, and will implement, test and validate these techniques. The aim of this project is to produce objective measures of the degree of systematic bias produced by standard Bayesian approaches for data sets with particular characteristics, and to produce measures of the degree to which extensions of the Bayesian approach influence or mitigate such systematic bias. The aim is not just to address the origins of bias in the approximate approaches implemented in various machine learning techniques, but also to investigate bias in general, by looking at normatively correct models of reasoning such as full Bayesian inference (models which underlie these approximate approaches).
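A worked example of one such bias mechanism, using exact Beta-Binomial inference (the numbers are invented): when a well-represented group and an underrepresented group share the same true rate, the posterior estimate for the small group is shrunk toward the prior far more, producing a systematic estimation error for the minority even though the inference itself is normatively correct.

```python
def beta_posterior_mean(successes, trials, alpha=1.0, beta=1.0):
    """Posterior mean of a Beta(alpha, beta)-Binomial model."""
    return (alpha + successes) / (alpha + beta + trials)

# Both groups have the same true rate of 0.8, observed without noise,
# but one group is represented by 1000 observations and the other by 5.
true_rate = 0.8
majority = beta_posterior_mean(successes=800, trials=1000)  # 800/1000 observed
minority = beta_posterior_mean(successes=4, trials=5)       # 4/5 observed

majority_error = abs(majority - true_rate)
minority_error = abs(minority - true_rate)
```

Here the majority estimate is essentially exact while the minority estimate is pulled noticeably toward the prior mean of 0.5; measuring and mitigating exactly this kind of data-size-dependent distortion is what the project proposes to do systematically.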
Quantum computing is a potential game-changer, as the field promises an exponential increase in computing power which will enable breakthrough applications in areas as diverse as vaccine and drug discovery, climate modelling, protein folding modelling, financial services and artificial intelligence, among others. Equal 1 Laboratories Ireland Limited (Equal1) is an innovative start-up creating a paradigm shift in quantum computing by developing disruptive, scalable and cost-effective quantum computing technology. Equal1 currently has a number of quantum computers operating at 3 kelvin, with one currently on site at UCD that will be available for conducting experiments as part of this PhD project. A key goal will be the use of variational hybrid quantum-classical algorithms. This class of algorithms enhances classical machine learning algorithms with quantum machine learning algorithms, for example quantum Boltzmann machines, which can be used to learn binary probability distributions. This type of algorithm is very promising for gaining an advantage over a classical computer in the Noisy Intermediate-Scale Quantum (NISQ) era. Different large-scale generative and optimization tasks in high impact domains (e.g. medical image processing) can be approached. The student will participate in a collaborative project with Equal1 to explore new data-driven and machine learning-based algorithms in the field of Quantum Artificial Intelligence. This will require the student to tackle open problems like the input problem and output problem, the comparison of gate-based and adiabatic quantum computers, and the analysis and development of new approaches that make best use of the underlying hardware capabilities (e.g. native gates and their qubit connectivity, error characteristics).
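The variational hybrid quantum-classical loop can be sketched as follows, with the quantum step simulated classically on a single qubit (a deliberately minimal stand-in, not Equal1's hardware or toolchain): a one-parameter circuit prepares cos(theta/2)|0> + sin(theta/2)|1>, the measured expectation <Z> = cos(theta) is returned to a classical optimizer, and the parameter-shift rule supplies the gradient.

```python
import math

def expectation_z(theta):
    """Stand-in for running the parameterised circuit and measuring <Z>.

    For the state cos(theta/2)|0> + sin(theta/2)|1>, <Z> = cos(theta).
    """
    return math.cos(theta)

def optimise(theta=0.1, lr=0.2, steps=200):
    """Classical gradient descent driving the (simulated) quantum evaluations."""
    for _ in range(steps):
        # parameter-shift rule: d<Z>/dtheta = (E(theta + pi/2) - E(theta - pi/2)) / 2
        grad = (expectation_z(theta + math.pi / 2)
                - expectation_z(theta - math.pi / 2)) / 2
        theta -= lr * grad
    return theta

theta_opt = optimise()
energy = expectation_z(theta_opt)  # minimum of <Z> is -1, attained at theta = pi
```

Only `expectation_z` would run on quantum hardware; everything else is classical, which is why the approach suits noisy NISQ devices. The input problem and output problem mentioned above correspond to loading data into, and reading results out of, exactly this inner call.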
Event-driven AI techniques and hardware implementation for IoT wearable devices
Wireless biomedical sensors should dramatically reduce the costs and risks associated with personal health care while being increasingly exploited by telemedicine and efficient e-health systems. However, because of the large power consumption of continuous wireless transmission, the battery life of the sensors is reduced in long-term use. Sub-Nyquist continuous-time discrete-amplitude (CTDA) sampling approaches using level-crossing analog-to-digital converters (ADCs) have been developed to reduce the sampling rate and energy consumption of the sensors. However, traditional machine learning techniques and architectures are not compatible with the non-uniformly sampled data obtained from level-crossing ADCs. This project aims to develop analog algorithms, circuits, and systems for the implementation of machine learning techniques on CTDA sampled data in wireless biomedical sensors. This “near-sensor computing” approach will help reduce the wireless transmission rate and therefore the power consumption of the sensor. The output rate of the CTDA is directly proportional to the activity of the analog signal at the input of the sensor. Therefore, artificial intelligence hardware that processes CTDA data should consume significantly less energy. The project involves algorithm development, circuit/chip implementation of the event-driven AI, and testing and verification.
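The level-crossing (CTDA) principle can be illustrated with a short simulation (the test signal and level spacing are invented): samples are emitted only when the input crosses an amplitude level, so the output rate follows signal activity and falls to zero when the input is quiet.

```python
import math

def level_crossing_sample(signal, times, delta=0.25):
    """Emit (time, quantised level) events whenever the signal moves by >= delta.

    This is a behavioural model of a level-crossing ADC: the output is
    non-uniform in time, which is exactly what makes conventional
    uniformly-clocked ML pipelines a poor fit for CTDA data.
    """
    events = [(times[0], round(signal[0] / delta) * delta)]
    last = events[0][1]
    for t, x in zip(times[1:], signal[1:]):
        level = round(x / delta) * delta
        if abs(level - last) >= delta:
            events.append((t, level))
            last = level
    return events

# A 5 Hz sine burst for the first second, then silence: events cluster
# in the active region and stop entirely once the signal is flat.
times = [i / 1000 for i in range(2000)]
signal = [math.sin(2 * math.pi * 5 * t) if t < 1.0 else 0.0 for t in times]
events = level_crossing_sample(signal, times)
```

The uniform sampler produces 2000 samples regardless of content, whereas the level-crossing output is a small, activity-dependent event stream; the project's event-driven AI hardware is designed to consume streams of this kind directly.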
Reasoning with Cases and Knowledge Graphs to Uncover Relationships in Financial Markets
The stochastic nature of financial markets reflects a complex network of interactions, making them a challenging target for analysis and prediction. Within this application domain, identifying meaningful relationships between financial assets is a difficult but important problem for various financial applications, including portfolio optimization, benchmarking company performance, identifying peers and competitors, and quantifying market share. However, with recent research, particularly that using machine learning (ML) and deep learning (DL) techniques, focused mostly on returns forecasting, the literature investigating the modelling of asset correlations has lagged somewhat. To address this, the focus of this work is on developing novel ML and DL frameworks to successfully uncover relationships between financial assets. These frameworks will leverage multiple data modalities, and the efficacy of the learned relationships will be demonstrated on several downstream tasks in the financial domain, including portfolio optimization, returns forecasting and sector classification.
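As a single-modality baseline for the kind of relationship discovery this work targets (tickers and returns are invented; the proposed frameworks would replace this with learned, multi-modal representations), pairwise Pearson correlation of returns can be thresholded into a relationship graph:

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length return series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def relationship_graph(returns, threshold=0.8):
    """Connect asset pairs whose return correlation exceeds the threshold."""
    names = list(returns)
    edges = set()
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if pearson(returns[u], returns[v]) >= threshold:
                edges.add((u, v))
    return edges

# Invented daily returns: the two banks move together; the oil name moves against them.
returns = {
    "BANK_A": [0.01, -0.02, 0.015, -0.01, 0.02],
    "BANK_B": [0.012, -0.018, 0.014, -0.008, 0.019],
    "OIL_C":  [-0.01, 0.02, -0.02, 0.015, -0.02],
}
edges = relationship_graph(returns)
```

Correlation of raw returns is noisy, backward-looking and blind to non-price information such as filings or news, which is precisely the gap that multi-modal learned representations, validated on the downstream tasks above, are intended to close.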
Analysis of Aspects of ML Algorithms that Lead to Bias
Issues of algorithmic fairness/bias have received a lot of attention in AI & ML research in recent years. There are two main sources of bias in ML. Negative legacy: the bias is already present in the training data, due to poor sampling, incorrect labeling or discriminatory practices in the past. Underestimation: the classifier underfits the data, thereby focusing on strong signals in the data and missing more subtle phenomena. In most cases the data (negative legacy) rather than the algorithm itself is the source of bias. Fairness research focuses on fair outcomes no matter the source of the problem, so the underestimation side of algorithmic bias has not received a lot of attention. However, the algorithmic side of algorithmic bias is important because it is inextricably tied to regularisation, i.e. the extent to which the model fits (or overfits) the data. Overfitting occurs when the model fits to noise in the training data, thus reducing generalisation, and ML practitioners expend a lot of effort avoiding it. This PhD research will focus on the algorithmic aspect of algorithmic bias and the relationship between model fitting and underestimation. An initial paper on this research is available on arXiv: “Algorithmic Bias and Regularisation in Machine Learning”, Pádraig Cunningham and Sarah Jane Delany, https://arxiv.org/abs/2005.09052. For a wider perspective on research relating to fairness in ML, have a look at the papers published at the ACM FAccT conferences: https://facctconference.org.
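Underestimation can be reproduced in a few lines (the data is invented): a deliberately simple, underfitting model latches onto the strong majority-group signal and ignores the feature that predicts the label for a minority group, so minority accuracy suffers even though the training data itself is unbiased.

```python
# Each row: (group, feature_a, feature_b, label). For the majority group the
# label follows feature_a; for the minority group it follows feature_b.
# Labels are "correct" for both groups, so there is no negative legacy here.
data = (
    [("maj", a, b, a) for a in (0, 1) for b in (0, 1) for _ in range(25)]
    + [("min", a, b, b) for a in (0, 1) for b in (0, 1) for _ in range(2)]
)

def stump_accuracy(rows, feature_index):
    """Accuracy of the one-feature rule 'predict label = feature'."""
    return sum(r[3] == r[feature_index] for r in rows) / len(rows)

# An underfitting model: pick the single feature with the best overall accuracy.
# feature_a wins on aggregate accuracy, so feature_b's minority signal is lost.
best = max((1, 2), key=lambda i: stump_accuracy(data, i))
minority = [r for r in data if r[0] == "min"]
majority = [r for r in data if r[0] == "maj"]
```

The chosen stump scores well overall yet performs at chance on the minority group: the subtle phenomenon is missed not because the data is biased, but because the model is too strongly constrained, which is exactly the regularisation-underestimation relationship this research investigates.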
Orchestration of Microservices on the Edge: A Machine Learning-Based Approach
International Data Corporation predicts that the collective sum of the world’s data will grow to 175 ZB by 2025, out of which 90 ZB will be created on IoT and edge nodes. Offering processing and storage services at the cloud level is costly. Furthermore, in some applications, requirements such as low latency, privacy, and scalability are not satisfied if all data is uploaded to the cloud [Mechalikh 2019]. To address these issues, Edge (or Fog) Computing is a new paradigm that allows computation and storage to happen at the edge of the network [Bonomi 2012]. Edge nodes are heterogeneous and operate in volatile and dynamic environments, with high degrees of mobility and geo-distribution. Additionally, the limited storage and computational power of edge nodes requires applications’ software to be developed as a set of lightweight, independent, executable modules called microservices. Therefore, in such an environment, edge nodes require an orchestrator or a set of orchestrators to support the orchestration of microservices on demand based on available resources. Some of the challenges in these environments are as follows [LM Vaquero 2019][K Velasquez 2018]:
• Mobility and dynamism: Due to the uncertainty in mobile and dynamic environments, a network of adaptive orchestrators must be designed so that they can adapt to changes in the environment, respond to end-users’ demands on the fly, and satisfy the required quality of service.
• Heterogeneity: The orchestrator must employ compatible techniques, such as container-based techniques, to work with a wide range of heterogeneous software and hardware infrastructure at the same time.
• Functionality chaining: Arranging and scheduling heterogeneous microservices with different Service Level Agreements and resource requirements, which do not have a standard specification, is still an open research challenge.
• Data streaming and data scattering: Identifying data sources that are frequently accessed by microservices and offering a data scattering mechanism that improves bandwidth utilization and reduces data access time.
The proposed research aims to design a distributed network of orchestrators that utilizes Machine Learning techniques so that it can: (a) operate in heterogeneous, mobile, and dynamic environments by adapting services based on the availability of resources in the environment; (b) provide a dynamic arrangement of functionality chaining to ensure the interoperability of the heterogeneous microservices and the required resources; and (c) identify and predict frequently used data to facilitate data accessibility and improve the efficiency of the system.
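As a minimal sketch of goal (c) (the item names and decay scheme are illustrative assumptions, not the proposed design), an orchestrator could predict frequently used data with an exponentially decayed access count and replicate the hottest items to edge nodes ahead of demand:

```python
class AccessPredictor:
    """Track data-item popularity with an exponentially decayed access count,
    so recent accesses weigh more than old ones and stale popularity fades."""

    def __init__(self, decay=0.9):
        self.decay = decay
        self.scores = {}

    def record(self, item):
        # decay every score, then credit the item just accessed
        for k in self.scores:
            self.scores[k] *= self.decay
        self.scores[item] = self.scores.get(item, 0.0) + 1.0

    def hottest(self, n=1):
        """The n items most likely to be requested next, by decayed frequency."""
        return sorted(self.scores, key=self.scores.get, reverse=True)[:n]

# Invented access trace from microservices on an edge node.
pred = AccessPredictor()
for item in ["sensor_log"] * 5 + ["map_tiles"] * 2 + ["sensor_log"] * 3:
    pred.record(item)
```

A learned predictor would replace this fixed decay rule with a model trained on observed access patterns, but the orchestration decision it feeds, which items to scatter where, is the same.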