
SLT-Based ELM for Big Social Data Analysis

Published in Cognitive Computation.

Abstract

Social networks and other forms of online media have been attracting growing interest from both the scientific and the business worlds, driving the rapid development of opinion mining and sentiment analysis. Coping with the huge amount of information available on the Web is a crucial task that calls for efficient models. To this end, we propose an efficient approach to support emotion recognition and polarity detection in natural language text. In particular, we show how recent advances in statistical learning theory (SLT) can support the development of an efficient extreme learning machine (ELM) and the assessment of the resulting model's performance when applied to big social data analysis. ELM, developed to overcome some issues of back-propagation networks, is a powerful learning tool; however, when the number of available samples is large, both training and the assessment of generalization performance must be handled carefully. For this reason, we propose an ELM implementation that exploits Spark's distributed in-memory technology and show how to take advantage of SLT results in order to select the ELM hyperparameters that provide the best generalization performance.



Author information

Correspondence to Erik Cambria.

Ethics declarations

Conflict of Interest

The authors received no grants. All authors declare that they have no conflict of interest.

Additional information

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

About this article

Cite this article

Oneto, L., Bisio, F., Cambria, E. et al. SLT-Based ELM for Big Social Data Analysis. Cogn Comput 9, 259–274 (2017). https://doi.org/10.1007/s12559-016-9440-6
