Multimodality and Stacking Ensemble Models in Demand Prediction

View this project on my GitHub: Link Here

Classified advertisement platforms such as Avito connect millions of users daily and play a central role in online commerce. This project predicts the likelihood that a listing ends in a successful deal, using multi-modal data from Avito’s dataset: tabular, text, and image data. We trained LightGBM models and stacking ensembles to test how combining these modalities affects predictive accuracy. Feature engineering included text representations from FastText, spaCy, and TF-IDF and image embeddings from ResNet50. The models were combined through stacking and tuned with Bayesian hyperparameter optimization. The best-performing model combined tabular features, text embeddings, and image embeddings and achieved a significant improvement in predictive accuracy, which indicates that the modalities carry complementary signal for demand prediction in dynamic online marketplaces.
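To make the stacking idea concrete, here is a minimal sketch in plain Python. It is not the project's code. It rests on several simplifying assumptions: each modality is reduced to one hypothetical scalar feature (standing in for tabular features, text embeddings, and image embeddings), a one-dimensional least-squares fit stands in for each LightGBM base model, the data is synthetic, and RMSE is assumed as the metric. All function names are made up for illustration. What it does show is the core mechanism: each base model produces out-of-fold predictions, and a meta-learner is then fit on those predictions.

```python
import random
from statistics import fmean


def k_fold_indices(n, k, seed=0):
    """Shuffle row indices and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_linear_1d(xs, ys):
    """Least-squares fit y ~ a*x + b (stand-in for a per-modality base model)."""
    mx, my = fmean(xs), fmean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    b = my - a * mx
    return lambda x: a * x + b


def out_of_fold_predictions(xs, ys, k=5):
    """Predict every row with a model that never saw that row in training."""
    n = len(xs)
    oof = [0.0] * n
    for fold in k_fold_indices(n, k):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        model = fit_linear_1d([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            oof[i] = model(xs[i])
    return oof


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_meta(pred_columns, ys):
    """Meta-learner: least-squares weights plus intercept over base predictions."""
    rows = [[1.0, *p] for p in zip(*pred_columns)]
    d = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(d)]
    w = solve(A, b)
    return lambda preds: w[0] + sum(wi * p for wi, p in zip(w[1:], preds))


def rmse(pred, ys):
    return (sum((p - y) ** 2 for p, y in zip(pred, ys)) / len(ys)) ** 0.5


# Synthetic data: one scalar per modality, each carrying part of the signal.
rng = random.Random(42)
n = 300
modalities = {
    "tabular": [rng.random() for _ in range(n)],
    "text": [rng.random() for _ in range(n)],
    "image": [rng.random() for _ in range(n)],
}
deal_prob = [
    min(1.0, max(0.0, 0.5 * t + 0.3 * x + 0.2 * im + rng.gauss(0, 0.05)))
    for t, x, im in zip(*modalities.values())
]

# Level 0: out-of-fold predictions per modality. Level 1: meta-learner on top.
oof = {name: out_of_fold_predictions(col, deal_prob) for name, col in modalities.items()}
meta = fit_meta(list(oof.values()), deal_prob)
stacked = [meta(p) for p in zip(*oof.values())]

single_rmse = {name: rmse(p, deal_prob) for name, p in oof.items()}
stacked_rmse = rmse(stacked, deal_prob)
print(single_rmse, stacked_rmse)
```

Because each synthetic modality carries a different part of the target, the stacked prediction has a lower error than any single base model, which mirrors the complementary effect described above. Note that the meta-learner here is scored on the same rows it was fit on, so the stacked error is optimistic; a real evaluation would use a separate held-out set.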


Ellie Yang
