Cyber Defense Advisors

The psychological and strategic challenges posed by AI-enhanced cyberattacks and influence campaigns

The world will likely soon witness malware campaigns fully augmented and shaped by artificial intelligence (AI). Citing an arms race logic, cybersecurity luminary Mikko Hyppönen said in a recent CSO article that the use of AI to enhance all aspects of disruptive cyber operations is virtually inevitable. As attackers have begun to use large language models (LLMs), deepfakes, and machine learning tools to craft sophisticated attacks at speed, cyber defenders have also turned to AI to keep up. In the face of quickening reaction times and automated obstacles to interference, the obvious response for would-be attackers is to double down on AI.

What does this near-term transformation of AI-centered cyber campaigns mean for national security and cybersecurity planners? Hyppönen highlighted human-side challenges of spiraling AI usage that stem from the black box problem. As malicious cyber and information operations (IO) become more potent, defenders face a challenge that attackers don't: letting a deep learning model loose as a defensive guardian will often produce actions that are difficult to explain. This complicates client coordination, defensive analytics, and more, and it makes the threat of bigger, smarter, faster AI-augmented influence campaigns feel all the more ominous.

Such techno-logistical developments stemming from AI-driven and -triggered influence activities are valid concerns. That said, novel information activities along these lines will also likely bring novel socio-psychological, strategic, and reputational risks for Western industry and public-sector planners, particularly with regard to malign influence activities. After all, while it's tempting to think about the AI-ification of IO purely in terms of heightened potency (i.e., the future will see "bigger, smarter, faster" versions of the interference we're already so familiar with), history suggests that insecurity will also be driven by how society reacts to a development so unprecedented. Fortunately, research into the psychology and strategy of novel technological insecurities offers insights into what we might expect.

The human impact of AI: Caring less and accepting less security

Ethnographic research into malign influence activities, artificial intelligence systems, and cyber threats provides a good baseline for what to expect from the augmentation of IO with machine-learning techniques. In particular, the past four years have seen scientists walk back a foundational assumption about how individuals respond to novel threats. For nearly three decades, pundits, experts, and policymakers alike have described forthcoming digital threats as having unique disruptive potential for democratic societies, a claim often called the "cyber doom" hypothesis. The hypothesis runs as follows: the general public recurrently encounters unprecedented security scenarios (e.g., the downing of electrical grids in Ukraine in 2015), and then it panics. In this view, every augmentation of technological insecurity opens far more space for dread, anxiety, and irrational response than more conventional threats do.

Recent scholarship tells us that the general public does respond this way to truly novel threats like AI-augmented IO, but only for a short time. Familiarity with digital technology in either a personal or professional setting, which is extremely commonplace, allows people to rationalize disruptive threats after just a small amount of exposure. This means that AI-augmented influence activities are unlikely to turn society on its head simply by dint of their sudden appearance.

However, it would be disingenuous to suggest that the average citizen and consumer in advanced economies is well placed to assess the disruptive potential that the AI-ification of influence activities might bring. Research suggests a troubling set of psychological reactions to AI based on both exposure to AI systems and trust in information technologies. While those with limited exposure to AI trust it less (in line with cyber doom research findings), it takes an enormous amount of familiarity and knowledge to think objectively about how the technology works and is being used. In something resembling the Dunning-Kruger effect, the vast majority of people in between these extremes are prone to an automation bias that manifests as overconfidence in the potency of AI for all manner of activities.

This outcome of recent research pairs worryingly with other findings suggesting that overexposure to deepfakes and other influence activities dulls the critical participatory preferences of the average citizen. Where it becomes exceedingly hard to separate fact from fiction, the value of attempting to do so diminishes. Automation bias compounds this inclination by bolstering confidence in technologies' defensive potential and, unfortunately, lowering the quality threshold at which defensive measures win support.

Yet another study, for instance, has illustrated how concerned decision-makers, when presented with a potentially problematic AI system, cease to be so worried once told that a human in the loop is taking steps to ensure AI robustness. This result holds regardless of how extensive or obviously impactful that human input is, strongly implying that most people will value even ineffective security measures so long as they involve a human.

With AI-augmented cyber and influence operations, the risk inherent in these cyber-psychological findings is that industry defenders and developers will be hard-pressed to overcome hygiene issues while market incentives push them toward half-measures. For instance, if the trends outlined above hold amid a deluge of AI-altered or AI-created content, a social media platform that relies on a deep-learning defensive model trained against user flags will see substantially greater variability in model accuracy and a rise in false positives.

Simply relying on more limited training regimes for defensive tools risks model overfit, making it easier for adversaries to adapt and avoid detection. At the same time, turning to hybrid approaches to content verification for effective IO defense implies extensive costs to maintain human oversight, just as demand for any such products (not necessarily the best versions of them) skyrockets and motivates convenience rather than best practice.
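A minimal, purely illustrative sketch of that failure mode follows (Python with scikit-learn assumed; the toy posts and labels are hypothetical): a classifier trained on the surface features of user-flagged content can lose confidence once the same narratives are rephrased in the fluent, low-signal style an LLM produces at scale, which is one way accuracy varies and errors proliferate under distribution shift.

```python
# Toy sketch with hypothetical data: a flag-trained text classifier and what
# happens when the content distribution shifts toward AI-paraphrased text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Stand-ins for user-flagged posts (label 1) vs. ordinary posts (label 0).
train_texts = [
    "BREAKING!!! the election was stolen share before they delete this",
    "wake up sheeple the grid failure was a false flag operation",
    "had a great weekend hiking with the kids",
    "anyone have a good recipe for sourdough starter?",
]
train_labels = [1, 1, 0, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_texts, train_labels)

# The same narratives rephrased in fluent, neutral prose; the surface features
# the classifier keyed on are largely gone.
shifted_texts = [
    "Several analysts have raised questions about irregularities in the vote count.",
    "It is worth asking whether the outage was entirely accidental.",
]
# Inspect how flagged-class confidence changes for the rephrased content.
print(model.predict_proba(shifted_texts))
```

The point is not the toy model but the dependency: a defense trained narrowly on yesterday's flags inherits yesterday's style, which is exactly what AI-generated content erodes first.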

The ironic centrality of cyber conflict

This dynamic, wherein defense against IO becomes more challenging with AI while the market simultaneously works to incentivize convenient solutions over effective ones, is compounded by the fact that government deterrence of IO is likely to continue to center on the strategic employment of cyber operations. This stance is motivated by two conditions. First, a side effect of the increased use of AI systems across all areas of national security practice and industry is that cyberspace becomes the central avenue through which interference with AI systems and the injection of AI-created content (code, disinformation, etc.) occurs. As such, a continuing uptick in offensive cyber activity on the part of malicious state actors is almost inevitable, especially as the urgency of employing AI defenses to combat AI-produced malware campaigns rises. In turn, continued emphasis on national cyber missions, which in the West have centered more on cyber effect operations than on the resiliency of national information environments, seems likely.

Government preference for cyber-operations responses to digital threats also stems from shortcomings in our ability to detect malign digital interference activities. There is debate among academics and other researchers about how readily we can differentiate organized malign influence activities from normal social media discourse. The challenges are varied, but two points bear mention. Most importantly, the primary feature that distinguishes influence operations from the organic character of sociopolitical discourse is their coordination by an outside force. As such, techniques for detecting bot activity or trolls at the level of individual accounts or small networks of accounts are insufficient for interdicting large-scale foreign-backed IO.

Second, influence activities on large social media platforms are not uniform in either their input vectors (e.g., bot vs. human-controlled accounts) or their level of persistence (e.g., the tempo or style of engagement with normal platform users). This second quality makes IO particularly hard to grapple with. Just as with extremist discourse, adversary accounts will not always use messaging that is clearly identifiable as linked to the ideology or perspective being pushed. Much activity, particularly that of human-controlled accounts, takes the form of normal chatter as IO operators try to build networks to generate and amplify influence. This has been particularly true of recent non-Russian IO, including campaigns by Chinese, Saudi Arabian, and Iranian threat actors that appear to see interference as a natural, low-intensity form of permanent engagement.

One need only talk to ChatGPT for a few minutes to see how likely it is that this dynamic will continue with the onset of AI-augmented IO alongside novel cyber campaigns, a dynamic that will likely prime policymakers toward deterrents that are punitive rather than detection-oriented. If this occurs, then even the rise of AI-shaped malign political interference over the next decade will likely only gradually shift focus and funding toward the need for better informational defense. If detection is difficult, investment in deterrence that doesn't require such high-fidelity attribution seems prudent. This is somewhat ironic, of course, as IO is far from just a cybersecurity threat, a lesson ostensibly learned by many from the Western failure to forecast the widespread manipulation of social web processes by authoritarian regimes in the 2010s.

How to mitigate the risk of AI-enhanced cyber and influence campaigns?

Clearly, countering AI-augmented cyber and influence activities will require better detection than we have today. For malware, the task is somewhat more straightforward than for malicious influence, though still subject to the arms race logic Hyppönen cited. For IO, some small successes in using machine learning to find coordinated malign influence campaigns do exist. However, challenges remain. Unsupervised methods for campaign detection often rely on community detection as the primary means of addressing the coordination challenge noted above.

Unfortunately, the practical requirements of community detection dictate that analysis occurs only after the activity has taken place, making it unhelpful for deterrent operations. Moreover, these methods aren't generally scalable, particularly as malign IO campaigns use diverse vectors for engagement and often seek to co-opt existing, interacting networks rather than build new ones. By comparison, supervised models that rely on past input data to train a classifier lack the external touchpoints (e.g., knowledge of strategy or guiding interests) needed to track something that changes over time and circumstance, as IO fundamentally does. Research that has tried to combine approaches, such as Alizadeh et al.'s content-based community detection, doesn't provide the ability to distinguish between clearly different IO campaigns, generally because attribution is required ahead of time.
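To make the coordination point concrete, here is a hedged sketch of the kind of unsupervised pipeline described above (Python with networkx assumed; the data fields and thresholds are hypothetical): accounts are linked when they push the same URL within a narrow time window, and community detection then surfaces unusually dense clusters for analyst review.

```python
# Sketch of coordination detection via a co-sharing graph. Assumes posts are
# available as (account_id, shared_url, unix_timestamp) tuples.
from itertools import combinations
from collections import defaultdict
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def coordination_graph(posts, window_seconds=300):
    """Link accounts that shared the same URL within a narrow time window."""
    by_url = defaultdict(list)
    for account, url, ts in posts:
        by_url[url].append((account, ts))

    g = nx.Graph()
    for url, shares in by_url.items():
        shares.sort(key=lambda s: s[1])
        # Tight temporal co-sharing is a common proxy for coordination,
        # though by no means proof of it.
        for (a1, t1), (a2, t2) in combinations(shares, 2):
            if a1 != a2 and abs(t1 - t2) <= window_seconds:
                w = g.get_edge_data(a1, a2, {"weight": 0})["weight"]
                g.add_edge(a1, a2, weight=w + 1)
    return g

def suspicious_communities(g, min_size=5):
    # Dense, tightly synchronized clusters become candidates for review; note
    # that this is inherently retrospective, as discussed above.
    return [c for c in greedy_modularity_communities(g, weight="weight")
            if len(c) >= min_size]
```

As the preceding paragraph notes, this kind of analysis only works after the activity has occurred, and it struggles when operators graft themselves onto existing communities rather than building conspicuous new ones.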

So where might a better capacity to leverage strategic gains from enhanced detection of IO come from? One advantage that defenders gain from the AI augmentation of influence activities lies in the altered or directly generated outputs of LLMs themselves. While LLMs are rapidly becoming better at pairing inputs with outputs that are difficult to detect as AI-generated, the best results will continue to emerge from AI hacking: skilled human input will be required to get the best from AI systems. Added to existing capabilities for finding generic AI content, this strongly implies that a structured operator perspective will still underlie malign content generation and can thus still be mapped (perhaps by the use of ontological models, as has recently been suggested) to provide novel detection capabilities.
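For illustration, one widely used signal for "generic" machine-written text is how predictable it looks to a language model, i.e., its perplexity. The sketch below (Python with Hugging Face transformers and PyTorch assumed; the choice of gpt2 is an assumption for illustration) shows the idea; it is a coarse heuristic that skilled AI hacking of the kind described above can defeat, which is why operator-level mapping still matters.

```python
# Hedged sketch: score text by language-model perplexity as one coarse signal
# for generic machine-written content. Model choice (gpt2) is illustrative only.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)
    return float(torch.exp(out.loss))

# Lower scores suggest text the model finds unusually predictable; any threshold
# would have to be calibrated per platform and per language.
print(perplexity("The quick brown fox jumps over the lazy dog."))
```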

Given that consumers and government might react to the threat of AI-augmented IO in idiosyncratic fashion, technology companies and other vendors may be set to see greater volatility in the reputational environment than in eras past. Even so, it seems likely that adding machine-learning coordination and content generation to a style of threat now so familiar to Western consumers won't polarize sentiment against industry. Responsibility for dealing with malign influence threats will likely remain mixed, with consumers both concerned about intrusive content and sympathetic to technology vendors over the difficulties of keeping up.

Maintaining this mixed position and preventing hostile backlash against technology vendors strongly implies that industry initiatives should emphasize customer and client transparency about how AI is being deployed in defense. To prevent the situation Hyppönen noted, industry clients must recognize that variability is inherent in attempting to fit defensive models and oversight to evolving offensive doctrines. The same is true for the public, whose sympathy with technology developers has long been anchored in an understanding of novel innovations as inherently complex. Continued openness about just how AI is being arrayed for defense against its offensive applications underlines a necessary point for national security planning: malign information operations are a society-level threat, and only openness and collaboration sustain avenues for progress.

Advanced Persistent Threats, Cyberattacks, Generative AI