
Exploring Non-Human Traffic in Online Digital Advertisements: Analysis and Prediction

  • Conference paper
Computational Collective Intelligence (ICCCI 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11684)

Abstract

Ad click fraud occurs when a user or a bot clicks on an advertisement (ad) with malicious intent, forcing the advertiser to pay for fake clicks. Click fraud is a serious problem for the online advertising industry. Our study demonstrates a hybrid approach that uses a two-level fingerprint to detect illegitimate bots that commit ad click fraud. The approach consists of two detection phases: (1) a rule-based phase and (2) a machine learning-based phase. The first level of the fingerprint is used in the rule-based detection phase; it is generated from immutable information about the user and from the user's traversal of the website's pages. The second level of the fingerprint is generated from ad click behavioral patterns and is used in the machine learning-based detection phase. Several traditional classification algorithms were evaluated for use in the machine learning-based detection phase. To test our approach, we used the access log of the web server of Waseet, a real commercial website for ads, as the dataset for our experiments. The results show that the proposed hybrid approach is promising.
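The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the fingerprint fields, the rules, or the chosen classifier, so everything concrete here is an assumption made for illustration. In particular, the level-1 fingerprint is assumed to be a hash of IP address and User-Agent, the rule list (`SUSPICIOUS_AGENT_TOKENS`) is hypothetical, the level-2 features (click count and mean gap between clicks) are invented examples of "behavioral patterns", and `threshold_classifier` is a hand-written stand-in for a trained classification model.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Sequence

# Hypothetical rule list: substrings that mark a self-declared automated client.
SUSPICIOUS_AGENT_TOKENS = ("curl", "bot", "spider")


@dataclass
class Click:
    ip: str
    user_agent: str
    timestamp: float  # seconds


def level1_fingerprint(click: Click) -> str:
    """Level 1 (assumed fields): hash of IP address and User-Agent."""
    raw = f"{click.ip}|{click.user_agent}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def rule_based_is_bot(click: Click) -> bool:
    """Phase 1: flag an empty User-Agent or one matching a listed token."""
    agent = click.user_agent.lower()
    return not agent or any(token in agent for token in SUSPICIOUS_AGENT_TOKENS)


def level2_fingerprint(clicks: Sequence[Click]) -> List[float]:
    """Level 2 (assumed features): click count and mean gap between clicks."""
    times = sorted(c.timestamp for c in clicks)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return [float(len(times)), mean(gaps) if gaps else 0.0]


def threshold_classifier(features: List[float]) -> bool:
    """Stand-in for a trained model: many clicks with short gaps means bot."""
    count, mean_gap = features
    return count >= 5 and mean_gap < 1.0


def detect_bots(
    clicks: Sequence[Click],
    classifier: Callable[[List[float]], bool] = threshold_classifier,
) -> Dict[str, bool]:
    """Group clicks by level-1 fingerprint, then apply phase 1 and phase 2."""
    groups: Dict[str, List[Click]] = defaultdict(list)
    for click in clicks:
        groups[level1_fingerprint(click)].append(click)

    verdicts: Dict[str, bool] = {}
    for fingerprint, group in groups.items():
        if any(rule_based_is_bot(c) for c in group):
            verdicts[fingerprint] = True  # caught by the rule-based phase
        else:
            verdicts[fingerprint] = classifier(level2_fingerprint(group))
    return verdicts
```

Passing the classifier as a callable keeps the rule-based phase independent of the model, so any trained classifier that maps a feature vector to a boolean could replace the threshold stand-in.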

Corresponding author

Correspondence to Bashar Al-Shboul.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Almahmoud, S., Hammo, B., Al-Shboul, B. (2019). Exploring Non-Human Traffic in Online Digital Advertisements: Analysis and Prediction. In: Nguyen, N., Chbeir, R., Exposito, E., Aniorté, P., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2019. Lecture Notes in Computer Science, vol 11684. Springer, Cham. https://doi.org/10.1007/978-3-030-28374-2_57

  • DOI: https://doi.org/10.1007/978-3-030-28374-2_57

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-28373-5

  • Online ISBN: 978-3-030-28374-2

  • eBook Packages: Computer Science, Computer Science (R0)
