I completed my Level 5-7 studies at the School of Mathematics, Statistics & Actuarial Science (SMSAS), University of Kent, Canterbury, UK, and the Department of Computer Science, Royal Holloway, University of London, Egham, UK. My PhD was funded by the European Commission under Horizon 2020 grant 687691 for a project on online machine learning. The PhD was awarded by Bournemouth University under the supervision of Abdelhamid Bouchachia. My research is mainly on the design and analysis of online learning algorithms. During and after my PhD I worked as a machine learning researcher and a quant for asset and wealth managers respectively. I am now a post-doctoral researcher at Bournemouth University.
AAIRR: In this paper, regularised regression for sequential data is investigated and a new ridge regression algorithm is proposed. It uses the Aggregating Algorithm (AA) to devise an iterative version of ridge regression (IRR); the resulting algorithm is called AAIRR. A competitive analysis is conducted to show that the performance guarantee of AAIRR is better than those of the known online ridge regression algorithms. Moreover, an empirical study on real-world datasets demonstrates its superior performance over those state-of-the-art algorithms.
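The AA-based approach described above builds on the classical Aggregating Algorithm for Regression (the Vovk-Azoury-Warmuth forecaster). As a minimal sketch of that baseline — not AAIRR's iterative refinement itself, and with the function name and ridge parameter a chosen here for illustration — the forecaster includes the current input in the regularised Gram matrix before predicting:

```python
import numpy as np

def aar_predictions(X, y, a=1.0):
    """Sequential predictions of the Aggregating Algorithm for Regression
    (Vovk-Azoury-Warmuth forecaster); a is the ridge parameter."""
    n, d = X.shape
    A = a * np.eye(d)          # regularised Gram matrix
    b = np.zeros(d)            # running sum of y_s * x_s
    preds = np.empty(n)
    for t in range(n):
        x = X[t]
        A += np.outer(x, x)    # include x_t before predicting (the AA trick)
        preds[t] = x @ np.linalg.solve(A, b)
        b += y[t] * x          # observe the true label afterwards
    return preds
```

Including x_t in A before predicting is what distinguishes this forecaster from plain online ridge regression and underlies its logarithmic regret guarantee.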
OBSR: The present work introduces a novel online regression method that extends shrinkage via limit of Gibbs sampler (SLOG) to the online learning setting. In particular, we show theoretically how the proposed online SLOG (OSLOG) is obtained within the Bayesian framework without resorting to the Gibbs sampler or a hierarchical representation. Moreover, to establish the performance guarantee of OSLOG, we derive an upper bound on the cumulative squared loss; OSLOG is the only online regression algorithm with sparsity that gives logarithmic regret. Furthermore, we carry out an empirical comparison with two state-of-the-art algorithms to illustrate the performance of OSLOG on three aspects — normality, sparsity and multicollinearity — showing an excellent trade-off between these properties.
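OSLOG's exact update is derived in the paper; as a generic illustration of the sparsity-via-shrinkage idea the abstract refers to (not the OSLOG update itself), many sparse regression methods rely on the coordinate-wise soft-thresholding operator:

```python
import numpy as np

def soft_threshold(w, lam):
    """Coordinate-wise shrinkage: pulls each weight toward zero by lam
    and zeroes out any weight whose magnitude is below lam."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)
```

Applied after a gradient step, this operator produces exactly sparse weight vectors, which is the behaviour the sparsity claims above concern.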
CNLSR: Online learning has attracted increasing interest in recent years due to its low computational requirements and its relevance to a broad range of streaming applications. In this brief, we focus on online regularised regression and propose a novel efficient online regression algorithm, called online normalised least-squares (ONLS). We perform a theoretical analysis comparing the total loss of ONLS against the normalised gradient descent (NGD) algorithm and the best off-line LS predictor. We show, in particular, that ONLS allows for a better bias-variance trade-off than state-of-the-art gradient-descent-based LS algorithms, as well as better control of the level of shrinkage of the features toward the null. Finally, we conduct an empirical study on real-world data to illustrate the strong performance of ONLS against some state-of-the-art algorithms.
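The NGD baseline mentioned above can be sketched as online least squares with the gradient step scaled by the squared norm of the current feature vector — a common normalisation, shown here only as an assumed illustration of the comparator; ONLS itself differs and is defined in the paper:

```python
import numpy as np

def ngd_least_squares(X, y, eta=0.5):
    """Online least squares with a normalised gradient step."""
    n, d = X.shape
    w = np.zeros(d)
    preds = np.empty(n)
    for t in range(n):
        x = X[t]
        preds[t] = w @ x                          # predict before seeing y_t
        err = preds[t] - y[t]
        w -= eta * err * x / max(x @ x, 1e-12)    # step scaled by ||x_t||^2
    return preds, w
```

Normalising by the feature norm makes the step size scale-invariant, which is why NGD is a natural comparator for shrinkage-controlled methods.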
CRR: Regularised regression uses sparsity and variance to reduce the complexity and over-fitting of a regression model. The present paper introduces two novel regularised linear regression algorithms, Competitive Iterative Ridge Regression (CIRR) and Online Shrinkage via Limit of Gibbs Sampler (OSLOG), for fast and reliable prediction on “Big Data” without making distributional assumptions on the data. We design them using the technique of competitive analysis and show their strong theoretical guarantees. Furthermore, we compare their performance against recent regularised regression methods such as Online Ridge Regression (ORR) and the Aggregating Algorithm for Regression (AAR). The comparison is done both theoretically, focusing on the guarantees on cumulative loss, and empirically, to show the advantages of CIRR and OSLOG.
Msots: This paper discusses the problem of selecting model parameters in time series forecasting using aggregation. It proposes a new algorithm that relies on the paradigm of prediction with expert advice, where online and offline autoregressive models are regarded as experts. The goal of the proposed aggregation-based algorithm is to perform no worse than the best expert in hindsight. The theoretical analysis shows that the algorithm has a guarantee that holds for any data sequence. Moreover, the empirical evaluation shows that the algorithm outperforms popular model selection criteria such as the Akaike and Bayesian information criteria on cyclically behaving time series.
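The prediction-with-expert-advice paradigm underlying the algorithm can be sketched as an exponential-weights mixture over expert forecasts. This is the plain weighted-mean version for square loss, assumed here as background; the paper's AA-based substitution step is more refined:

```python
import numpy as np

def aggregate_experts(expert_preds, y, eta=1.0):
    """Exponential-weights aggregation: expert_preds is a (T, K) array of
    K experts' forecasts; weights decay with each expert's square loss."""
    T, K = expert_preds.shape
    log_w = np.zeros(K)                 # log-weights for numerical stability
    preds = np.empty(T)
    for t in range(T):
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        preds[t] = w @ expert_preds[t]              # weighted-mean forecast
        log_w -= eta * (expert_preds[t] - y[t])**2  # square-loss update
    return preds
```

With autoregressive models of different orders as the experts, the mixture concentrates on the best-performing order, which is what replaces classical model selection criteria here.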
SOLMA: Driven by the need for Flink to extend its offline engine into a hybrid one, a new machine learning (ML) library, called SOLMA, is proposed. This library aims to cover online learning algorithms for data streams, where data are processed sequentially, example by example. SOLMA, which is under development, currently contains two classes of algorithms: (i) basic streaming routines such as online sampling, online PCA and online statistical moments, and (ii) advanced online ML algorithms covering, in particular, classification, regression and drift/anomaly detection and handling. This paper briefly highlights the concepts underlying SOLMA.
AAvsAVG: Learning with expert advice, as a scheme of online learning, has been applied very successfully to various learning problems due to its strong theoretical basis. In this paper, for the purpose of time series prediction, we investigate the application of the Aggregating Algorithm, a generalisation of the famous weighted majority algorithm. The experimental results show that the Aggregating Algorithm performs very well in comparison to the average of the experts.
ExtremeXP: Experiment Driven and User Experience Oriented Analytics for Extremely Precise Outcomes and Decisions.
PROTEUS: Scalable online machine learning for predictive analytics and real-time interactive visualization.
ETI: The High Frequency Appliance Disaggregation Analysis project analysed real-world data from the ETI's Home Energy Management System in five homes, gathering detailed energy data on water, gas and electricity use.
P325a
Talbot Campus
Poole BH12 5BB
wjamil@yjw.info