Abstract
Distributed learning in expert referral networks is an emerging challenge at the intersection of Active Learning and Multi-Agent Reinforcement Learning, where experts—humans or automated agents—have varying skills across different topics and can redirect difficult problem instances to connected colleagues with more appropriate expertise. The learning-to-refer challenge involves estimating colleagues' topic-conditioned skills in order to make appropriate referrals. Prior research has investigated reinforcement learning algorithms with both uninformative priors and partially available (potentially noisy) priors. However, most human experts expect mutually rewarding referrals—return referrals in their own areas of expertise—so that both (or all) parties benefit from networking, rather than a one-sided referral flow. This paper analyzes the extent of referral reciprocity imbalance in high-performance referral-learning algorithms, specifically multi-armed bandit (MAB) methods from two broad categories, frequentist and Bayesian, and demonstrates that both suffer considerably from reciprocity imbalance. The paper proposes modifications that enable distributed learning methods to better balance referral reciprocity and thus make referral networks win-win for all parties. Extensive empirical evaluations demonstrate substantial improvement in mitigating reciprocity imbalance while maintaining reasonably high overall solution performance.
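The Bayesian MAB setting the abstract describes can be sketched with Thompson sampling: each colleague's topic-conditioned skill is modeled as a Beta posterior, and a reciprocity term down-weights colleagues toward whom referral flow is one-sided. The penalty form, class name, and parameters below are illustrative assumptions for exposition, not the paper's actual modification.

```python
import random

class ReciprocityAwareReferrer:
    """Thompson-sampling referral choice with a hypothetical reciprocity penalty.

    Each colleague's skill on a topic is modeled as Beta(a, b); `sent` and
    `received` track referral flow so that colleagues who never refer back
    can be down-weighted. The penalty form is an assumption for illustration.
    """

    def __init__(self, colleagues, penalty=0.1, rng=None):
        self.rng = rng or random.Random(0)
        self.post = {c: [1, 1] for c in colleagues}   # Beta(1,1) uninformative priors
        self.sent = {c: 0 for c in colleagues}        # referrals sent to c
        self.received = {c: 0 for c in colleagues}    # referrals received from c
        self.penalty = penalty

    def imbalance(self, c):
        # Normalized net outflow of referrals toward colleague c, in [-1, 1].
        total = self.sent[c] + self.received[c]
        return 0.0 if total == 0 else (self.sent[c] - self.received[c]) / total

    def choose(self):
        # Sample a skill estimate from each posterior, subtract the
        # reciprocity penalty, and refer to the highest-scoring colleague.
        def score(c):
            a, b = self.post[c]
            return self.rng.betavariate(a, b) - self.penalty * self.imbalance(c)
        return max(self.post, key=score)

    def update(self, c, solved):
        # Record the outcome of a referral sent to colleague c.
        self.sent[c] += 1
        self.post[c][0 if solved else 1] += 1

    def record_incoming(self, c):
        # Record a return referral received from colleague c.
        self.received[c] += 1
```

With `penalty=0`, this reduces to plain Thompson sampling; increasing it trades some expected solution quality for more balanced referral flow, which is the trade-off the empirical evaluations examine.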
Notes
1. Additionally, we present experimental results in Table 3 indicating that the performance is not sensitive to the choice of C over a reasonable set of values.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
KhudaBukhsh, A.R., Carbonell, J.G. (2019). Toward Reciprocity-Aware Distributed Learning in Referral Networks. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science(), vol 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29910-1
Online ISBN: 978-3-030-29911-8
eBook Packages: Computer Science (R0)