Representation Learning and Parallelization for Machine Learning Applications with Graph, Tabular, and Time-Series Data
Time: Mon 2024-10-21, 09:00
Location: Sal C, Electrum, Kistagången 16
Video link: https://kth-se.zoom.us/s/63322131109
Language: English
Subject area: Computer Science
Doctoral student: Tianze Wang, Software and Computer Systems, SCS
Opponent: Associate Professor Farzaneh Etminani, Halmstad University, Sweden
Supervisor: Professor Vladimir Vlassov, Software and Computer Systems, SCS; Amir H. Payberah, Software and Computer Systems, SCS; Jim Dowling, Hopsworks AB
Abstract
Machine Learning (ML) models have achieved significant success in representation learning across domains like vision, language, graphs, and tabular data. Constructing effective ML models hinges on several critical considerations: (1) data representation: how to represent the input data in a meaningful and effective way; (2) learning objectives: how to define the desired prediction target for a specific downstream task; (3) model architecture: which representation learning architecture, i.e., which type of neural network, is most appropriate for the given downstream task; and (4) training strategy: how to train the selected ML model effectively for better feature extraction and representation quality.
This thesis explores representation learning and parallelization in machine learning, addressing how to improve model accuracy and reduce training time. We investigate several approaches to improving the efficiency and effectiveness of ML applications on graph, tabular, and time-series data, with contributions to combinatorial optimization, parallel training, and ML methods across these data types. First, we study representation learning in combinatorial optimization and integrate a constraint-based exact solver with a predictive ML model to enhance problem-solving efficiency. We demonstrate that combining an exact solver with a predictive model that estimates the optimal solution cost significantly reduces the search space and accelerates solution times. Second, we employ graph Transformer models to leverage topological and semantic node similarities in the input data, resulting in superior node representations and improved downstream task performance. Third, we empirically study the choice of model architecture for learning from tabular data and show that tabular Transformer models applied to large datasets can produce features with high predictive power. Fourth, we use Transformer models for detailed user behavior modeling from time-series data, illustrating their effectiveness in capturing fine-grained patterns. Finally, turning to the training strategy, we investigate graph traversal strategies for device placement in deep learning model parallelization and show that an optimized traversal order speeds up parallel training. Collectively, these findings advance the understanding and application of representation learning and parallelization in diverse ML contexts.
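To make the first contribution above concrete, the following is a minimal sketch of the general idea: an exact search whose pruning bound is seeded by a predicted optimal cost. The toy assignment problem, the greedy heuristic standing in for a trained cost predictor, the function names (predict_optimal_cost, solve_assignment), and the 5% safety margin are all illustrative assumptions, not the specific method, solver, or model used in the thesis.

```python
# Sketch: exact branch-and-bound search seeded with a predicted optimal cost.
# In the thesis setting the bound would come from a trained ML model; here a
# greedy heuristic stands in for it (its cost is always >= the true optimum,
# so the bound never prunes away the optimal solution).
import random


def predict_optimal_cost(cost_matrix):
    """Stand-in for an ML cost predictor: greedy assignment cost plus margin."""
    n = len(cost_matrix)
    used, total = set(), 0.0
    for i in range(n):
        j = min((c for c in range(n) if c not in used),
                key=lambda c: cost_matrix[i][c])
        used.add(j)
        total += cost_matrix[i][j]
    return 1.05 * total  # small safety margin on the predicted cost


def solve_assignment(cost_matrix, upper_bound=float("inf")):
    """Exact depth-first branch and bound for a min-cost assignment problem.
    Any branch whose partial cost reaches the incumbent bound is pruned."""
    n = len(cost_matrix)
    best_cost, best_assignment = upper_bound, None

    def dfs(row, used, partial_cost, assignment):
        nonlocal best_cost, best_assignment
        if partial_cost >= best_cost:
            return  # prune: cannot beat the incumbent / predicted bound
        if row == n:
            best_cost, best_assignment = partial_cost, assignment[:]
            return
        for col in range(n):
            if col not in used:
                used.add(col)
                assignment.append(col)
                dfs(row + 1, used, partial_cost + cost_matrix[row][col],
                    assignment)
                assignment.pop()
                used.remove(col)

    dfs(0, set(), 0.0, [])
    return best_cost, best_assignment


if __name__ == "__main__":
    random.seed(0)
    n = 8
    costs = [[random.uniform(1, 10) for _ in range(n)] for _ in range(n)]
    bound = predict_optimal_cost(costs)  # "ML-estimated" cost bound
    cost, assignment = solve_assignment(costs, upper_bound=bound)
    print(f"predicted bound: {bound:.2f}, exact optimum: {cost:.2f}")
```

The tighter the predicted bound, the more of the search tree is pruned before the solver starts, which is the intuition behind pairing an exact solver with a learned cost estimate.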
This thesis enhances representation learning and parallelization in ML models, addressing key challenges in representation quality. Our methods advance combinatorial optimization, parallel training, and ML on graph, tabular, and time-series data. Additionally, our findings contribute to a better understanding of Transformer models, leading to more accurate predictions and improved performance across various domains.