Collective Bayesian Matrix factorization Hashing for cross-modal retrieval

Loubna Karbil; Ahmad Sani; Imane Daoudi

doi:https://doi.org/10.14445/22315373/IJMTT-V67I3P508

International Journal of Mathematics Trends and Technology


Volume 67 | Issue 3 | Year 2021 | Article Id. IJMTT-V67I3P508 | DOI : https://doi.org/10.14445/22315373/IJMTT-V67I3P508

Citation:

Loubna Karbil, Ahmad Sani, Imane Daoudi, "Collective Bayesian Matrix factorization Hashing for cross-modal retrieval," International Journal of Mathematics Trends and Technology (IJMTT), vol. 67, no. 3, pp. 58-69, 2021. Crossref, https://doi.org/10.14445/22315373/IJMTT-V67I3P508

Abstract

Matrix factorization hashing approaches have been widely applied in large-scale cross-modality visual search because they preserve similarities among multimodal features effectively. In this paper, we propose a novel cross-modality hashing technique based on Bayesian matrix factorization, which factorizes all modalities into a shared latent semantic space using Bayesian inference. To achieve better search performance, we measure similarity using the cosine distance. Experiments show that the proposed method outperforms many known methods on cross-modal retrieval tasks.
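
The idea of factorizing two modalities into one shared latent space and comparing items by cosine similarity can be illustrated with a minimal sketch. This is not the authors' method: it swaps the Bayesian inference for plain alternating least squares (point estimates with a ridge penalty), and it runs on synthetic data. All function names, the regularization weight `lam`, the code length `k`, and the mean-threshold binarization are assumptions made for illustration only.

```python
import numpy as np

def fit_collective_mf(X1, X2, k=8, lam=1e-2, n_iter=50, seed=0):
    """Alternating least squares for X1 ~ U1 @ V and X2 ~ U2 @ V with a shared V.

    Assumption: a non-Bayesian stand-in for the factorization in the abstract.
    """
    rng = np.random.default_rng(seed)
    n = X1.shape[1]
    V = rng.standard_normal((k, n))
    ridge = lam * np.eye(k)
    for _ in range(n_iter):
        # Update each modality's basis with V fixed (ridge regression).
        G = np.linalg.inv(V @ V.T + ridge)
        U1 = X1 @ V.T @ G
        U2 = X2 @ V.T @ G
        # Update the shared latent factors with both bases fixed.
        A = U1.T @ U1 + U2.T @ U2 + ridge
        V = np.linalg.solve(A, U1.T @ X1 + U2.T @ X2)
    return U1, U2, V

def fit_projection(X, V, lam=1e-2):
    """Ridge-regression map P with P @ X ~ V, so a single modality can be encoded."""
    d = X.shape[0]
    return V @ X.T @ np.linalg.inv(X @ X.T + lam * np.eye(d))

def cosine_similarity(A, B):
    """Cosine similarity between every column of A and every column of B."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    B = B / np.linalg.norm(B, axis=0, keepdims=True)
    return A.T @ B

def to_hash_codes(Z):
    """Binarize latent vectors by thresholding each dimension at its mean."""
    return np.where(Z >= Z.mean(axis=1, keepdims=True), 1, -1)

# Synthetic paired data: column i of X1 and column i of X2 describe the same item.
rng = np.random.default_rng(42)
n, k, d1, d2 = 200, 8, 20, 15
V_true = rng.standard_normal((k, n))
X1 = rng.standard_normal((d1, k)) @ V_true + 0.01 * rng.standard_normal((d1, n))
X2 = rng.standard_normal((d2, k)) @ V_true + 0.01 * rng.standard_normal((d2, n))

U1, U2, V = fit_collective_mf(X1, X2, k=k)
P1, P2 = fit_projection(X1, V), fit_projection(X2, V)
Z1, Z2 = P1 @ X1, P2 @ X2          # latent representations of each modality

# Cross-modal retrieval: for each item in modality 1, rank modality-2 items.
sim = cosine_similarity(Z1, Z2)
top1_accuracy = float(np.mean(sim.argmax(axis=1) == np.arange(n)))
rel_error = float(np.linalg.norm(X1 - U1 @ V) / np.linalg.norm(X1))
codes1 = to_hash_codes(Z1)          # k-bit codes with entries in {-1, +1}
```

In this toy setting, `top1_accuracy` measures how often the paired item in the other modality is ranked first, and `codes1` shows one simple way to turn latent vectors into binary codes. A Bayesian treatment would instead place priors on the factors and infer their posterior rather than computing point estimates.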

Keywords

Cross-modal retrieval, matrix factorization, Bayesian hashing, hash function, multimodal hashing
