Tree ensemble models are traditionally favored for classification and regression on tabular data [4]. In particular, gradient-boosted decision trees (GBDTs) such as XGBoost, LightGBM, and CatBoost dominate tabular applications due to their strong out-of-the-box performance.
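To make the baseline concrete, here is a minimal sketch of training a GBDT on tabular data with the scikit-learn-style XGBoost API. The synthetic dataset and the specific hyperparameter values are illustrative choices, not taken from any of the cited studies.

```python
# Minimal GBDT baseline on (synthetic) tabular data.
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))           # 1000 rows, 20 numeric features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # synthetic binary target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Near-default hyperparameters often yield a strong tabular baseline.
model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```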
Deep learning models struggle with tabular data for several reasons. Tabular data often mixes feature types, is sparse, and lacks the spatial or sequential structure inherent in images and text. Deep models also demand extensive hyperparameter tuning and are computationally expensive compared to traditional models like XGBoost. Together, these factors make it difficult for deep learning models to consistently outperform XGBoost across a wide range of tabular datasets.
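The mixed-feature-type point is worth illustrating. A GBDT can consume categorical columns and missing values with little preprocessing, whereas a neural network typically needs embeddings, encoding, and imputation first. The sketch below assumes a recent XGBoost version with native categorical support (`enable_categorical=True` with the `hist` tree method); the tiny DataFrame is purely illustrative.

```python
# Sketch: mixed feature types and missing values handled directly by a GBDT.
import pandas as pd
from xgboost import XGBClassifier

df = pd.DataFrame({
    "age": [25, 40, 31, 58],                           # numeric
    "city": pd.Categorical(["NY", "SF", "NY", "LA"]),  # categorical
    "income": [40_000, 90_000, None, 120_000],         # numeric with a missing value
})
y = [0, 1, 0, 1]

# XGBoost splits on categorical levels and routes missing values natively,
# avoiding the embedding layers and imputation a neural network would need.
model = XGBClassifier(tree_method="hist", enable_categorical=True)
model.fit(df, y)
print(model.predict(df))
```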
In Intel's study, XGBoost outperformed deep learning models across a variety of datasets, including the datasets used in the papers that originally proposed those deep models [1]. XGBoost also required significantly less tuning and was more efficient during hyperparameter optimization. However, an ensemble of the deep models and XGBoost performed better than XGBoost alone.
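The ensembling idea can be sketched as averaging the predicted probabilities of a deep model and XGBoost. The sketch below is a simplified stand-in for the study's setup: it uses a scikit-learn `MLPClassifier` in place of the specialized tabular deep models and assumes uniform ensemble weights, neither of which is specified by the source.

```python
# Sketch: uniform-weight probability averaging of a neural net and XGBoost.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

xgb = XGBClassifier(n_estimators=300, max_depth=4).fit(X_tr, y_tr)
mlp = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                    random_state=0).fit(X_tr, y_tr)

p_xgb = xgb.predict_proba(X_te)[:, 1]
p_mlp = mlp.predict_proba(X_te)[:, 1]
p_ens = 0.5 * p_xgb + 0.5 * p_mlp  # simple average of predicted probabilities

for name, p in [("xgboost", p_xgb), ("mlp", p_mlp), ("ensemble", p_ens)]:
    print(f"{name}: AUC = {roc_auc_score(y_te, p):.4f}")
```

Averaging probabilities works here because the two model families make different kinds of errors, so the combined predictions can beat either model alone, which is consistent with the study's finding.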