Big Data Subspace Clustering
Mar 09, 2022 Projects
Project description/goals
Large-scale subspace clustering via k-factorization
KDD’21 (The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining). 2021.
Importance/impact, challenges/pain points
1. Clustering 100k data points in 1 minute
2. Online clustering
Solution description
Group-sparse matrix factorization for clustering
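To make the idea concrete, here is a minimal sketch of a group-sparse factorization X ≈ DC used for clustering. It is an illustration under stated assumptions, not the KDD'21 algorithm, whose details are not given on this page. The dictionary D is split into k blocks, each column of C may be nonzero in only one block, and the index of that block is the cluster label. The sketch enforces this with a hard one-block-per-column rule (a k-subspaces-style alternation) rather than a sparsity regularizer. The function name `group_sparse_factorize` and the parameters `k`, `d`, `n_iter`, and `seed` are hypothetical.

```python
import numpy as np

def group_sparse_factorize(X, k, d, n_iter=30, seed=0):
    """Illustrative sketch only (not the published method).

    X: (m, n) array whose columns are data points.
    Returns D (m, k*d), C (k*d, n) and labels (n,), where each column
    of C is nonzero in at most one of the k blocks of d rows.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Assumption: start from a random orthonormal basis per block.
    bases = [np.linalg.qr(rng.standard_normal((m, d)))[0] for _ in range(k)]
    for _ in range(n_iter):
        # Coefficient step: project every point onto every block.
        coeffs = np.stack([B.T @ X for B in bases])      # (k, d, n)
        # With orthonormal bases, the largest projected energy
        # corresponds to the smallest reconstruction residual.
        labels = (coeffs ** 2).sum(axis=1).argmax(axis=0)
        # Dictionary step: refit each block to its assigned points.
        for i in range(k):
            Xi = X[:, labels == i]
            if Xi.shape[1] >= d:  # keep the old basis if too few points
                U, _, _ = np.linalg.svd(Xi, full_matrices=False)
                bases[i] = U[:, :d]
    # Final assignment and assembly of the group-sparse matrix C.
    coeffs = np.stack([B.T @ X for B in bases])
    labels = (coeffs ** 2).sum(axis=1).argmax(axis=0)
    D = np.hstack(bases)
    C = np.zeros((k * d, n))
    for i in range(k):
        idx = np.flatnonzero(labels == i)
        C[i * d:(i + 1) * d, idx] = coeffs[i][:, idx]
    return D, C, labels
```

In this sketch the coefficient step costs a fixed amount of work per data point, so one pass over the data is linear in the number of points. Like other alternating schemes, it can settle in a poor local solution depending on the random start.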
Key contribution/commercial implication
1. Linear time and space complexity
2. High clustering accuracy on large-scale datasets
3. Handles outliers and missing values
Team/contributors
Jicong Fan
Numerical results