Online learning algorithms specialist

About Me


I completed my undergraduate and Master's studies (levels 5–7) at the School of Mathematics, Statistics & Actuarial Science (SMSAS), University of Kent, Canterbury, UK, and the Department of Computer Science, Royal Holloway, University of London, Egham, UK. My PhD was funded by the European Commission under Horizon 2020 Grant 687691 for a project on online machine learning, and was awarded by Bournemouth University under the supervision of Abdelhamid Bouchachia. My research focuses on the design and analysis of online learning algorithms. During and after my PhD I worked as a machine learning researcher and as a quant for asset and wealth managers respectively. I am now a Post-Doc at Bournemouth University.
 


Publications

Iterative Ridge Regression using the Aggregating Algorithm

In this paper, regularised regression for sequential data is investigated and a new ridge regression algorithm is proposed. It uses the Aggregating Algorithm (AA) to devise an iterative version of ridge regression (IRR); the resulting algorithm is called AAIRR. A competitive analysis shows that the performance guarantee of AAIRR is better than those of known online ridge regression algorithms. Moreover, an empirical study on real-world datasets demonstrates its superior performance over state-of-the-art algorithms.

AAIRR
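AAIRR itself is derived via the Aggregating Algorithm and is not reproduced here; as a point of reference, the sketch below shows plain online ridge regression, the baseline this line of work improves on. The class name `OnlineRidge` and the parameter `a` (the regularisation strength) are illustrative choices, not identifiers from the paper.

```python
import numpy as np

class OnlineRidge:
    """Online ridge regression: at each step, predict with the ridge
    solution computed over all examples seen so far, maintained
    incrementally via the regularised Gram matrix."""

    def __init__(self, dim, a=1.0):
        self.A = a * np.eye(dim)   # a*I + sum of x x^T over past examples
        self.b = np.zeros(dim)     # sum of y * x over past examples

    def predict(self, x):
        # current ridge weights: w = A^{-1} b
        w = np.linalg.solve(self.A, self.b)
        return float(w @ x)

    def update(self, x, y):
        # fold the revealed example (x, y) into the sufficient statistics
        self.A += np.outer(x, x)
        self.b += y * x
```

A typical protocol is: receive `x`, call `predict`, observe `y`, call `update`, and accumulate the squared loss; on stationary linear data the per-round loss shrinks quickly.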

Online Bayesian Shrinkage Regression

The present work introduces a new online regression method that extends shrinkage via limit of Gibbs sampler (SLOG) to the online learning setting. In particular, we theoretically show how the proposed online SLOG (OSLOG) is obtained within the Bayesian framework without resorting to the Gibbs sampler or to a hierarchical representation. Moreover, to establish the performance guarantee of OSLOG, we derive an upper bound on the cumulative squared loss; OSLOG is the only online regression algorithm with sparsity that gives logarithmic regret. Furthermore, we conduct an empirical comparison with two state-of-the-art algorithms along three aspects — normality, sparsity and multicollinearity — showing that OSLOG achieves an excellent trade-off between these properties.

OBSR

Competitive Normalized Least-Squares Regression

Online learning has attracted increasing interest in recent years due to its low computational requirements and its relevance to a broad range of streaming applications. In this brief, we focus on online regularized regression. We propose a novel, efficient online regression algorithm, called online normalized least-squares (ONLS). We perform a theoretical analysis comparing the total loss of ONLS against the normalized gradient descent (NGD) algorithm and the best off-line LS predictor. We show, in particular, that ONLS allows for a better bias-variance tradeoff than state-of-the-art gradient descent-based LS algorithms, as well as better control over the level of shrinkage of the features toward the null. Finally, we conduct an empirical study illustrating the strong performance of ONLS against state-of-the-art algorithms on real-world data.

CNLSR
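ONLS itself is specified in the paper; for orientation, the sketch below implements the classic normalised least-mean-squares (NLMS) update, a standard member of the gradient-descent LS family that ONLS is compared against. The function name `nlms` and the step size `mu` are illustrative, not from the paper; this is not ONLS.

```python
import numpy as np

def nlms(stream, dim, mu=0.5, eps=1e-8):
    """Normalised least-mean-squares: an online LS learner whose
    gradient step is scaled by the squared norm of the current input,
    which makes the step size invariant to input scaling."""
    w = np.zeros(dim)
    preds = []
    for x, y in stream:
        p = float(w @ x)
        preds.append(p)
        # normalised gradient step on the squared loss (y - w.x)^2
        w += mu * (y - p) * x / (x @ x + eps)
    return w, preds
```

The normalisation by `x @ x` is what distinguishes this family from plain gradient descent: without it, a poorly chosen fixed step size can diverge on large inputs.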

Competitive Regularised Regression

Regularised regression uses sparsity and variance to reduce the complexity and over-fitting of a regression model. The present paper introduces two novel regularised linear regression algorithms: Competitive Iterative Ridge Regression (CIRR) and Online Shrinkage via Limit of Gibbs Sampler (OSLOG), for fast and reliable prediction on “Big Data” without making distributional assumptions on the data. We use the technique of competitive analysis to design them and show their strong theoretical guarantees. Furthermore, we compare their performance against recent regularised regression methods such as Online Ridge Regression (ORR) and the Aggregating Algorithm for Regression (AAR). The comparison is done theoretically, focusing on the guarantees on cumulative loss, and empirically, to show the advantages of CIRR and OSLOG.

CRR

Model selection in online learning for time series forecasting

This paper discusses the problem of selecting model parameters in time series forecasting using aggregation. It proposes a new algorithm that relies on the paradigm of prediction with expert advice, where online and offline autoregressive models are regarded as experts. The goal of the proposed aggregation-based algorithm is to perform not worse than the best expert in hindsight. The theoretical analysis shows that the algorithm has a guarantee that holds for any data sequence. Moreover, the empirical evaluation shows that the algorithm outperforms popular model selection criteria such as the Akaike and Bayesian information criteria on cyclically behaving time series.

MSOTS
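The expert-advice paradigm behind this work can be sketched with the standard exponentially weighted average forecaster: each expert keeps a weight that shrinks exponentially with its cumulative loss, and the master predicts with the weighted mean. This is a generic instance of prediction with expert advice, not the paper's exact algorithm; the function name `aggregate` and the learning rate `eta` are illustrative.

```python
import numpy as np

def aggregate(expert_preds, outcomes, eta=2.0):
    """Exponentially weighted average forecaster: after each round,
    multiply every expert's weight by exp(-eta * squared loss),
    renormalise, and predict with the weighted mean of the experts."""
    n_experts = expert_preds.shape[1]
    w = np.ones(n_experts) / n_experts   # uniform prior over experts
    master = []
    for preds, y in zip(expert_preds, outcomes):
        master.append(float(w @ preds))          # master's prediction
        w *= np.exp(-eta * (preds - y) ** 2)     # penalise losses
        w /= w.sum()                             # renormalise weights
    return np.array(master)
```

With autoregressive models of different orders playing the role of experts, this scheme tracks the best model in hindsight: weight concentrates on whichever expert accumulates the least loss.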

Scalable online learning for Flink: SOLMA library

Driven by the need to expand Flink's offline engine into a hybrid one, a new machine learning (ML) library, called SOLMA, is proposed. This library aims to cover online learning algorithms for data streams, where data are processed sequentially, example by example. SOLMA, which is under development, currently contains two classes of algorithms: (i) basic streaming routines such as online sampling, online PCA and online statistical moments, and (ii) advanced online ML algorithms covering in particular classification, regression, and drift/anomaly detection and handling. This paper briefly highlights the concepts underlying SOLMA.

SOLMA

Aggregation algorithm vs. average for time series prediction

Learning with expert advice, a scheme of online learning, has been applied very successfully to various learning problems thanks to its strong theoretical basis. In this paper, we investigate the application of the Aggregation Algorithm, a generalisation of the famous weighted majority algorithm, to time series prediction. The experimental results show that the Aggregation Algorithm performs very well in comparison to simple averaging.

AAvsAVG

Projects

ExtremeXP

Experiment Driven and User Experience Oriented Analytics for Extremely Precise Outcomes and Decisions.

ExtremeXP

PROTEUS

Scalable online machine learning for predictive analytics and real-time interactive visualization.

PROTEUS

ETI

The High Frequency Appliance Disaggregation Analysis project analysed real world data from the ETI's Home Energy Management System in five homes to gather detailed energy data from water, gas and electricity use.

ETI

Contact


P325a
Talbot Campus
Poole BH12 5BB
wjamil@yjw.info