Automated Security Incident Detection and Response with HPC and AI-Assisted Code Generation

Scientific/governmental/private partners involved:

WeLoData is a fast-growing company specializing in real-time data management, predictive analytics, and machine learning integration. The firm works with clients across multiple industries to build scalable data infrastructures, ensuring reliable insights, high performance, and adaptability to growing computational demands. In its projects, WeLoData faces the increasing challenge of securing massive volumes of customer data while ensuring compliance and rapid responsiveness to threats.

Technical/scientific Challenge:

Organizations today face a radically transformed cybersecurity landscape. The expansion of cloud computing, mobile devices, and remote work has multiplied the number of systems generating logs and security events. A single enterprise may collect data from:

firewalls, intrusion detection/prevention systems, and endpoint protection platforms,
cloud service providers and SaaS applications,
IoT sensors and industrial control systems,
user authentication systems, VPNs, and remote access gateways.

The result is a continuous torrent of security events measured in millions per second. Handling this volume manually or with conventional tools is not feasible. Several interconnected challenges emerge:

Overwhelming data volumes

Security analysts are often confronted with massive event streams that exceed the processing capacity of traditional SOC platforms. Important signals of an attack may be hidden within routine noise, leading to delayed detection or complete oversight.

Evolving and obfuscated threats

Cyber adversaries adapt quickly, using low-and-slow tactics, advanced obfuscation, and multi-stage attacks. Standard rule-based detection, which depends on static signatures, fails against novel malware, insider threats, or zero-day exploits.

High false positive rates

Static correlation rules often trigger excessive alarms, most of which turn out to be benign. Analysts become overwhelmed by “alert fatigue,” increasing the risk of missing genuine incidents.

Slow incident response

Even when an anomaly is detected, the time between detection and containment can stretch into hours. Manual investigation, scripting, and coordination among tools slow down response, while adversaries may already be moving laterally or exfiltrating sensitive data.

Fragmentation of tools and processes

Most SOC environments use multiple monitoring and response tools, each operating in silos. Without seamless integration, analysts are forced to manually extract data, write ad-hoc scripts, and coordinate workflows across disparate systems. This fragmentation increases complexity, costs, and the likelihood of human error.

Compliance and regulatory pressure

Strict regulations such as GDPR, NIS2, and industry-specific standards (HIPAA, PCI DSS) require timely detection, reporting, and documentation of incidents. Meeting these requirements is difficult when workflows are slow, manual, and error-prone.

Scalability constraints

As data volumes grow, conventional SOC infrastructures cannot scale cost-effectively. Adding more servers or licenses does not solve the fundamental issue: the need for parallel, high-performance data analytics that can operate at scale in real time.

Solution:

The new approach transforms the way a Security Operations Center (SOC) operates by combining artificial intelligence with high-performance computing (HPC). Instead of relying on human experts to manually program detection rules, correlation queries, or machine learning models, the system shifts this programming burden to AI assistants that generate and deploy code automatically. Analysts interact with the platform through natural language instructions, while the underlying HPC and high-performance data analytics (HPDA) infrastructure ensures that the resulting models and rules can operate in real time, even when dealing with millions of security events per second.

The workflow begins at the ingestion layer. Security logs and telemetry data are continuously collected from firewalls, cloud services, VPN gateways, and endpoint devices. These event streams are pushed into Apache Kafka, which acts as a high-throughput buffer capable of handling hundreds of thousands of messages per second and of scaling further as partitions and brokers are added. From there, Spark Structured Streaming and Flink pipelines, deployed on HPC clusters, process events in parallel: Spark in micro-batches, Flink record by record. This parallelism is crucial because it allows correlations to be detected across heterogeneous sources, for example when a suspicious login on a cloud account coincides with abnormal data transfers from a corporate endpoint.
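
The cross-source correlation described above can be illustrated with a minimal sketch. It is plain Python rather than the Spark or Flink job the platform would run, and the Event fields, the source and kind labels, and the 300-second window are all assumptions made for illustration, not part of the WeLoData system. In a streaming engine the same logic would typically be a windowed join keyed by user.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: float     # event time in seconds (hypothetical field)
    user: str     # account the event belongs to
    source: str   # assumed labels: "cloud_auth" or "endpoint"
    kind: str     # assumed labels: "suspicious_login", "abnormal_transfer"


def correlate(events, window_s=300.0):
    """Pair suspicious cloud logins with abnormal endpoint transfers.

    Returns (user, login_ts, transfer_ts) for every pair that belongs to
    the same user and lies at most window_s seconds apart.
    """
    logins = defaultdict(list)
    transfers = defaultdict(list)
    for e in events:
        if e.source == "cloud_auth" and e.kind == "suspicious_login":
            logins[e.user].append(e.ts)
        elif e.source == "endpoint" and e.kind == "abnormal_transfer":
            transfers[e.user].append(e.ts)

    matches = []
    for user, login_times in logins.items():
        for login_ts in sorted(login_times):
            for transfer_ts in sorted(transfers.get(user, [])):
                if abs(transfer_ts - login_ts) <= window_s:
                    matches.append((user, login_ts, transfer_ts))
    return matches
```

The nested loops keep the sketch readable; a production pipeline would rely on the engine's partitioning by key so that each user's events are joined in parallel.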

On top of this high-performance data layer, the system introduces AI-assisted code generation. Instead of writing detection logic manually, an analyst formulates requirements in plain English. For example: “Detect any user who logs in from two countries within five minutes, combined with more than three failed login attempts in the same period.” The AI engine translates this request into an executable Spark SQL query or SIEM detection script, optimizes it for performance, and injects it into the live pipeline. What previously took hours of manual scripting and debugging is now produced in seconds.
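
To make the example request concrete, the sketch below shows one way the generated logic could look. It is written in plain Python for readability, not as the Spark SQL or SIEM script the AI engine would actually emit. The event tuple layout and the function name detect are hypothetical, and the sketch assumes that "logs in from two countries" means successful logins from at least two distinct countries.

```python
from collections import defaultdict, deque

WINDOW_S = 300  # five minutes, as in the example request


def detect(events, window_s=WINDOW_S, max_failed=3):
    """Flag users matching the example rule.

    events: iterable of (ts, user, country, success) tuples sorted by ts.
    Returns a list of (user, ts) alerts, one per event at which the
    sliding window holds successful logins from two or more countries
    and more than max_failed failed attempts.
    """
    recent = defaultdict(deque)  # per-user sliding window of events
    alerts = []
    for ts, user, country, success in events:
        window = recent[user]
        window.append((ts, country, success))
        # Drop events that are older than the window.
        while window and ts - window[0][0] > window_s:
            window.popleft()
        countries = {c for _, c, ok in window if ok}
        failed = sum(1 for _, _, ok in window if not ok)
        if len(countries) >= 2 and failed > max_failed:
            alerts.append((user, ts))
    return alerts
```

The sketch raises an alert on every event for which the condition holds; suppressing repeated alerts for the same user is left out for brevity.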

The same principle applies to machine learning. Analysts are not expected to design or tune models themselves. If they request anomaly detection on login activity, the AI assistant generates a complete ML workflow, including feature extraction, data normalization, model selection, and training code. Depending on the data, it may select an Isolation Forest for isolating outliers, an Autoencoder for detecting abnormal patterns in high-dimensional data, or an LSTM neural network for time-series log sequences. The model is deployed directly into the streaming pipeline and executed on GPU-accelerated nodes, ensuring inference within milliseconds. Retraining is automated: whenever the system detects feature drift, it schedules new training jobs on the HPC cluster, maintaining accuracy as attack patterns evolve.
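
The source does not say how feature drift is measured. As one possible illustration, the sketch below uses the Population Stability Index (PSI), a common drift statistic that compares the binned distribution of a feature at training time with its current distribution. The choice of PSI, the function names, the bin count, and the 0.25 threshold (a frequently quoted rule of thumb) are all assumptions, not details of the described system.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two non-empty numeric samples.

    Bin edges are taken from the reference sample; values outside its
    range are clamped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def needs_retraining(reference, current, threshold=0.25):
    """Return True when drift exceeds the (assumed) PSI threshold."""
    return psi(reference, current) > threshold
```

In a setup like the one described, a True result would be the trigger for scheduling a new training job on the HPC cluster.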

Incident response is handled with the same degree of automation. When a model or detection rule flags a potential attack, the AI assembles a response workflow from predefined actions without requiring any human scripting. For example, an endpoint showing suspicious lateral movement can be automatically quarantined from the network, forensic logs can be captured into secure storage, and compliance teams can be notified with a structured incident report. Analysts may review or override these steps, but in most cases the workflow runs without delay, reducing the time to response from hours to under a minute.
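
A response workflow of this kind can be sketched as an ordered list of actions with an optional analyst approval hook. The action functions below only record what they would do; real integrations with network, storage, and ticketing systems are out of scope. The Incident fields, the PLAYBOOKS mapping, and the approve callback are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    host: str
    reason: str                      # assumed label, e.g. "lateral_movement"
    log: list = field(default_factory=list)


def quarantine(inc):
    inc.log.append(f"quarantined {inc.host}")


def capture_forensics(inc):
    inc.log.append(f"captured forensic logs for {inc.host}")


def notify_compliance(inc):
    inc.log.append(f"notified compliance about {inc.host}")


# Ordered response steps per incident type (illustrative only).
PLAYBOOKS = {
    "lateral_movement": [quarantine, capture_forensics, notify_compliance],
}


def respond(inc, approve=lambda step_name, inc: True):
    """Run the playbook for inc.reason and return the action log.

    approve is called before each step; returning False skips the step,
    which models an analyst override. By default every step runs.
    """
    for step in PLAYBOOKS.get(inc.reason, []):
        if approve(step.__name__, inc):
            step(inc)
        else:
            inc.log.append(f"skipped {step.__name__} (analyst override)")
    return inc.log
```

Keeping the override as a callback lets the default path run without delay while still leaving a single place where human review can be inserted.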

Transparency remains central to this design. Each AI-generated rule or model comes with an explanation of why it was generated, which features it considers, and what actions it will take. This ensures that analysts remain in control, even though they no longer perform the manual programming. In this way, the SOC becomes more adaptive, scalable, and proactive: AI handles the repetitive, code-intensive tasks, while HPC infrastructure guarantees real-time performance across massive event volumes, and human experts focus on oversight and strategic decision-making.
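
One simple way to carry such an explanation alongside each generated artefact is a structured record that can be serialized for audit purposes. The sketch below is an assumption about what such a record might contain, based on the three elements named above (rationale, features, actions); the class and field names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GeneratedRule:
    rule_id: str
    prompt: str        # the analyst's natural language request
    rationale: str     # why the rule was generated
    features: tuple    # input features the rule considers
    actions: tuple     # response actions the rule may trigger


def audit_record(rule):
    """Serialize a rule's explanation as a stable JSON string."""
    return json.dumps(asdict(rule), sort_keys=True)


rule = GeneratedRule(
    rule_id="rule-001",
    prompt="Detect logins from two countries within five minutes",
    rationale="Geographically implausible logins suggest stolen credentials",
    features=("country", "login_success", "timestamp"),
    actions=("notify_analyst",),
)
print(audit_record(rule))
```

Sorting the keys keeps the serialized form stable, which makes records easier to compare and archive.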

Scientific impact:

The scientific contribution of this work can be outlined in several key directions:

Natural language as an interface – the approach demonstrates how plain-text descriptions of threats can be reliably transformed into executable detection logic and machine learning workflows, reducing the need for manual programming.
High-performance data analytics – parallel processing on HPC/HPDA infrastructure enables real-time anomaly detection across very large event streams, overcoming the scalability limitations of conventional SOC platforms.
Adaptive machine learning – the system is capable of automatically selecting and retraining models (e.g., Isolation Forests, Autoencoders, LSTMs) when data patterns change, ensuring long-term detection accuracy.
Transparency and compliance – all generated rules and models are explainable and auditable, supporting accountability and alignment with regulatory requirements.
Generalisability – while developed for cybersecurity, the methodology can also be applied to other domains such as fraud detection, healthcare monitoring, or industrial IoT.

Benefits

Faster detection and response – incidents are identified and contained within minutes rather than hours.
Lower barrier to adoption – no programming skills are required; AI translates plain-language instructions into executable detection and ML code.
Operational efficiency – automation reduces repetitive manual tasks, freeing analysts to focus on strategic decision-making.
Scalability and performance – the system processes millions of security events per second without data loss, ensuring reliable real-time monitoring.
Adaptive defence – models retrain automatically as attack techniques evolve, minimizing the risk of outdated defences.
Cost savings – reduced time for scripting, triage, and investigation leads to significant optimization of SOC resources.
Regulatory alignment – automated reporting and transparent workflows simplify compliance with data protection and cybersecurity directives.
Stronger resilience – organizations build a more proactive security posture, supported by both AI automation and human oversight.

Success story highlights:

AI replaces manual programming – threat detection rules and ML models are generated automatically from natural language descriptions.
High-performance real-time analytics – HPC infrastructure processes millions of security events per second, ensuring no signal is missed.
Adaptive and transparent defence – ML pipelines retrain automatically to follow evolving threats, while explainable AI keeps decisions auditable.
Faster, more cost-efficient SOC operations – incident detection and response times are reduced from hours to minutes, with lower operational costs.

Figure 1: Automated security incident detection and response pipeline

Metric	Manual SOC	AI SOC
Events processed per second	~10,000	1,000,000+
Time to detection	Hours	Minutes
Time to response	2–3 hours	Less than one minute
False positives	High	Reduced

Table 1: Manual SOC vs. AI SOC – Key Metrics Comparison. The comparison highlights the improvements achieved by the AI SOC over traditional manual operations. While conventional systems handle around 10,000 events per second, the new architecture scales to over one million, so that important signals are not lost in real-time processing. Detection and response are also transformed: manual workflows often take hours, while the AI SOC cuts detection to minutes and response to less than one minute. Another advantage is the reduction of false positives, a common challenge for security teams. Adaptive machine learning models filter out noise more effectively, allowing analysts to focus on high-confidence alerts.

Contact:

Assist. Dr. Ivona Velkova, [email protected]
Prof. Kamelia Stefanova, [email protected]
Prof. Valentin Kisimov, [email protected]

University of National and World Economy, Bulgaria