BigDataFr recommends: Scalable and Accurate Online Feature Selection for Big Data
Feature selection is important in many big data applications, and it faces at least two critical challenges. First, in many applications the dimensionality is extremely high, in the millions, and keeps growing. Second, feature selection has to be highly scalable, preferably in an online manner such that each feature can be processed in a sequential scan.
In this paper, we develop SAOLA, a Scalable and Accurate OnLine Approach for feature selection. With a theoretical analysis on bounds of the pairwise correlations between features, SAOLA employs novel online pairwise comparison techniques to address the two challenges and maintain a parsimonious model over time in an online manner.
Furthermore, to handle features that arrive in groups, we extend the SAOLA algorithm and propose a novel group-SAOLA algorithm for online group feature selection. […]
By Kui Yu, Xindong Wu, Wei Ding, Jian Pei
Source: arxiv.org
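
To make the idea of online pairwise comparison more concrete, here is a minimal Python sketch of streaming feature selection. It is an illustration under stated assumptions, not the authors' implementation: absolute Pearson correlation stands in for whatever correlation measure the paper uses, and the names `abs_corr`, `online_select` and the `min_relevance` threshold are hypothetical. Each arriving feature is checked once for relevance to the target, then compared pairwise against the features kept so far, so the selected set stays small as the stream grows.

```python
from math import sqrt


def abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / sqrt(sxx * syy)


def online_select(feature_stream, target, min_relevance=0.1):
    """Process (name, values) pairs one at a time and return the kept names.

    Assumption: min_relevance is an illustrative threshold, not a value
    taken from the paper.
    """
    selected = {}  # name -> (values, relevance to the target)
    for name, values in feature_stream:
        rel = abs_corr(values, target)
        if rel < min_relevance:
            continue  # too weakly related to the target: discard
        keep = True
        for other, (other_vals, other_rel) in list(selected.items()):
            pair = abs_corr(values, other_vals)
            if other_rel >= rel and pair >= rel:
                # A kept feature is at least as relevant and explains the
                # new one at least as well as the target does: redundant.
                keep = False
                break
            if rel > other_rel and pair >= other_rel:
                # The new feature is more relevant and makes the kept
                # feature redundant: drop the kept one.
                del selected[other]
        if keep:
            selected[name] = (values, rel)
    return list(selected)
```

A call such as `online_select([("f1", col1), ("f2", col2)], labels)` returns the names of the features retained after one sequential pass; because features are compared only in pairs, the cost of each arrival scales with the size of the current selection rather than with the total number of features seen.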

