DISA

Centre for Data Intensive Sciences and Applications

Welcome to PhD-seminar September 2024

Posted on 29th August, 2024, 10:28 by Diana Unander

When? Friday September 6th 14-16
Where? Onsite: D2272 at Linnaeus University in Växjö and online
Registration: Please sign up for the PhD-seminar via this link https://forms.gle/xtG9s5Qhs4SFd98E7 by September 4th (especially important if you plan on attending onsite so we have fika for everyone)

Agenda

14.00-14.10 Welcome and practical information from Welf Löwe
14.10-14.55 Presentation and discussion: Reuse of health data, combining the best of two worlds – Machine learning driven knowledge discovery from real world health data in collaboration with domain experts – Olle Björneld, Industrial PhD-student Region Kalmar
14.55-15.05 Coffee break
15.05-15.50 Presentation and discussion: Remaining useful life prediction of batteries based on historical loading-unloading cycle logs – Zijie Feng, Industrial PhD-student Micropower
15.50-16.00 Sum up and plan for our next seminar on October 4th and other ongoing activities.

Abstracts

Reuse of health data, combining the best of two worlds – Machine learning driven knowledge discovery from real world health data in collaboration with domain experts – Olle Björneld, Industrial PhD-student Region Kalmar

The main research question of the PhD project is: “How can data-driven knowledge discovery in databases (KDD), performed in the medical research domain and supported by domain knowledge, be made more effective and efficient?” To answer this question, the following work has been performed:

Knowledge discovery from real-world data in health care can be demanding due to unstructured data and low registration quality in electronic health records (EHRs). Close collaboration between domain experts and data scientists is essential. New variables, referred to as features, are generated by domain experts and computer scientists in collaboration with medical researchers. This process is named knowledge-driven feature engineering (KDFE). (Study A, published)

A case study comprising two projects (P1 and P2) was performed to evaluate the effectiveness of manual KDFE (mKDFE). Effectiveness was measured as classification performance, more precisely the area under the receiver operating characteristic curve (AUROC). The results show that it is valuable for medical researchers to involve a data scientist when conducting medical research based on real-world medical data. When mKDFE was compared to the baseline, the average classification performance measured by AUROC for the engineered features rose for P1 from 0.62 to 0.82 and for P2 from 0.61 to 0.89 (p-values << 0.001). (Study B, published)
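As a rough illustration of the metric used in Study B, AUROC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. The labels and scores are invented toy values, not data from the study, and the names `baseline_scores` and `engineered_scores` are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = outcome present, 0 = outcome absent.
labels = [1, 1, 1, 0, 0, 0]
baseline_scores = [0.6, 0.4, 0.5, 0.5, 0.3, 0.7]      # hypothetical model on raw features
engineered_scores = [0.9, 0.7, 0.6, 0.4, 0.65, 0.2]   # hypothetical model on engineered features

print("baseline AUROC:  ", auroc(labels, baseline_scores))
print("engineered AUROC:", auroc(labels, engineered_scores))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why a rise from about 0.6 to above 0.8 is a substantial improvement.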

To perform KDD more effectively and efficiently, a framework for automatic Knowledge Driven Feature Engineering (aKDFE) was developed. Central to aKDFE is automated feature engineering (FE), i.e., the automated construction of new, highly informative features from those directly observed and recorded, e.g., in EHRs. The framework learns and aggregates domain knowledge to generate features that are more informative than those recorded in EHRs or manually engineered (manual FE), as is done in many medical research projects today.

aKDFE is (i) more efficient than manual FE since it automates the manual knowledge discovery and FE processes. It is (ii) more effective due to its higher predictive power compared to manual KDFE. Finally, aKDFE (iii) applies and describes data pivoting and feature generation as explicit and transparent operation sequences on EHR features. (Study C, published)
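To give a feel for what "data pivoting and feature generation as explicit operation sequences" might look like, here is a minimal sketch. It is not the aKDFE implementation: the event records, the codes `drug_A` to `drug_C`, and the derived feature `n_distinct_codes` are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical long-format EHR extract: one (patient_id, event_code) row per event.
events = [
    ("p1", "drug_A"), ("p1", "drug_A"), ("p1", "drug_B"),
    ("p2", "drug_B"),
    ("p3", "drug_A"), ("p3", "drug_C"), ("p3", "drug_C"),
]


def pivot_counts(rows):
    """Step 1 (pivot): turn long-format rows into one count table per patient."""
    table = defaultdict(lambda: defaultdict(int))
    for patient_id, code in rows:
        table[patient_id][code] += 1
    return {pid: dict(counts) for pid, counts in table.items()}


def add_distinct_code_count(wide):
    """Step 2 (generate): add the number of distinct event codes per patient."""
    return {
        pid: {**counts, "n_distinct_codes": len(counts)}
        for pid, counts in wide.items()
    }


# The operation sequence is explicit: pivot first, then generate the new feature.
wide = pivot_counts(events)
features = add_distinct_code_count(wide)

for pid, row in features.items():
    print(pid, row)
```

Keeping each step as a named function makes the sequence easy to inspect and replay, which is one way to read the transparency claim in (iii).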

Domain expert knowledge can be found in knowledge databases or expert knowledge decision support systems, in which derived and distilled knowledge has been manually entered and can be represented as risk scores or indexes. To leverage the effect of expert knowledge in aKDFE, we will investigate the following question: “How do decision support scores impact the effectiveness of aKDFE?” (Study D/E, under construction)

aKDFE saves time and resources for medical researchers and produces more informative features. However, there is still room for future enhancements: (i) evaluating more sophisticated time-series-oriented models, (ii) using LLMs to collect and structure domain knowledge, and (iii) evaluating multi-agent knowledge discovery.

Remaining useful life prediction of batteries based on historical loading-unloading cycle logs – Zijie Feng, Industrial PhD-student Micropower

As technology advances, battery usage has become increasingly prevalent in daily life. Many traditionally fuel-powered mechanical devices, such as forklifts and automated guided vehicles, are now powered by batteries. Concurrently, concerns about safety and efficiency have heightened the focus on monitoring the condition of batteries in these large devices.

During usage, the battery’s actual capacity diminishes gradually. When the capacity falls to a certain threshold, the battery becomes unusable. In general, we can determine the remaining useful life (RUL) of a battery in two ways: directly, by measuring the physical and chemical characteristics of the battery, and indirectly, by using data-driven models. Since direct measurement of batteries is very inconvenient, RUL prediction based on data-driven models is a promising research direction. RUL is typically estimated from the battery’s condition and the customer’s usage. However, both factors are influenced by numerous variables, which introduces uncertainty into the estimated RUL and, consequently, significant fluctuations in the RUL curve.
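The threshold-based definition of RUL can be made concrete with a deliberately simple data-driven sketch: fit a straight line to logged capacity values and extrapolate to the point where it crosses the threshold. This is only an illustration under stated assumptions. The linear fade model, the 80 % threshold, and the capacity values are invented for the example and are not taken from the presentation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def remaining_useful_life(cycles, capacities, threshold):
    """Cycles left until the fitted capacity line reaches the threshold."""
    slope, intercept = fit_line(cycles, capacities)
    if slope >= 0:
        raise ValueError("capacity is not decreasing; cannot extrapolate")
    end_of_life_cycle = (threshold - intercept) / slope
    # RUL is counted from the last logged cycle and cannot be negative.
    return max(0.0, end_of_life_cycle - cycles[-1])


# Invented log: capacity in percent of nominal, sampled every 100 cycles.
cycles = [0, 100, 200, 300, 400]
capacities = [100.0, 98.0, 96.0, 94.0, 92.0]

rul = remaining_useful_life(cycles, capacities, threshold=80.0)  # threshold is an assumption
print(f"estimated RUL: {rul:.0f} cycles")
```

Real capacity fade is rarely linear and depends on usage, which is exactly the source of uncertainty the abstract describes.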

In this presentation, we will share the progress we’ve made at Micropower in developing a workflow that predicts battery RUL with confidence intervals using machine learning algorithms on historical battery cycle logs. These results will help battery owners and suppliers plan maintenance and replacements in advance. We will also introduce our ongoing project on anomaly detection within battery logs.
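One generic way to attach an interval to an RUL estimate is a percentile bootstrap: resample the logged points, re-estimate RUL each time, and read off the spread. The sketch below uses that swapped-in technique with a simple linear fade model. It is not the Micropower workflow, and the log values, the 80 % threshold, and all function names are assumptions for illustration.

```python
import random


def rul_estimate(points, threshold, current_cycle):
    """Linear-extrapolation RUL from (cycle, capacity) points, or None if unusable."""
    n = len(points)
    mean_x = sum(c for c, _ in points) / n
    mean_y = sum(q for _, q in points) / n
    sxx = sum((c - mean_x) ** 2 for c, _ in points)
    if sxx == 0:
        return None  # all resampled points share one cycle; no line can be fitted
    slope = sum((c - mean_x) * (q - mean_y) for c, q in points) / sxx
    if slope >= 0:
        return None  # no downward trend to extrapolate
    intercept = mean_y - slope * mean_x
    return max(0.0, (threshold - intercept) / slope - current_cycle)


def bootstrap_rul_interval(points, threshold, n_resamples=500, level=0.9, seed=0):
    """Percentile bootstrap interval for RUL, resampling points with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    current_cycle = max(c for c, _ in points)
    estimates = []
    for _ in range(n_resamples):
        sample = [rng.choice(points) for _ in points]
        estimate = rul_estimate(sample, threshold, current_cycle)
        if estimate is not None:
            estimates.append(estimate)
    if not estimates:
        raise ValueError("no usable bootstrap resamples")
    estimates.sort()
    lo_idx = int((1 - level) / 2 * (len(estimates) - 1))
    hi_idx = int((1 + level) / 2 * (len(estimates) - 1))
    return estimates[lo_idx], estimates[hi_idx]


# Invented noisy log: (cycle, capacity in percent of nominal).
log = [(0, 100.0), (100, 98.3), (200, 95.6), (300, 94.4),
       (400, 91.7), (500, 90.2), (600, 87.9), (700, 86.1)]

low, high = bootstrap_rul_interval(log, threshold=80.0)
print(f"90% bootstrap interval for RUL: {low:.0f} to {high:.0f} cycles")
```

A wide interval signals that the log does not yet pin down the fade rate, which is useful information when planning maintenance in advance.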

This entry was posted on August 29th, 2024, 10:28 and is filed under General