Navigating the Future of SIEM Detections: Balancing Signature-Based and AI-Driven Approaches

Tim Nary, Chief Product Officer and Co-Founder at SnapAttack

In the early days of cybersecurity, implementing a Security Information and Event Management (SIEM) system was akin to constructing a house from scratch. The SIEM was a blank slate, and transforming raw data into actionable insights was a long and arduous journey. It began with the daunting task of ingesting data from various disparate sources and formats. From there, security teams had to craft detections — rules designed to identify malicious or suspicious activity. Visualization came next, with the creation of dashboards to bring alerts to life, providing the context needed for informed action. Only then could the SIEM be put to use, though not without ongoing tweaking and tuning. Often, this process revealed the need for further investments to handle the sheer volume of alerts and automate responses (👀  looking at you, SOAR).

Enterprise software already has long sales cycles, and vendors quickly realized this multi-step process created significant friction in the buying process. Reducing the time-to-value became critical for customer adoption and minimizing churn. As a result, the evolution of SIEM detections began to accelerate.

The Emergence of Out-of-the-Box Security Content

In response to these early challenges, modern SIEM solutions have evolved to include out-of-the-box (OOTB) templates, detections, and playbooks that security teams can deploy immediately. Vendors have also seized this opportunity to showcase their expertise, often highlighting the efforts of their dedicated threat research teams. By producing insightful blog posts and comprehensive threat intelligence reports, they reassure customers that they are proactively defending against emerging threats. This not only enhances the perceived value of their products, but also fosters a deeper sense of trust and confidence among their customers.

A look at some of the OOTB detections available to customers today from leading SIEM vendors:

Splunk: Boasting over 2,000 detections, analytic stories, and playbooks, Splunk’s offerings are available on Splunk Research and in the ESCU app. These are meticulously crafted and maintained by the Splunk Threat Research Team (STRT), who also contribute significantly to open source projects like Attack Range and Attack Data and serve as maintainers of Atomic Red Team. Their comprehensive approach and active community involvement make Splunk a formidable player in the SIEM market.

Google: Google offers curated detections that can be enabled in SecOps (formerly Chronicle) with a simple toggle, covering Cloud, Linux, Windows, and other threats. For those seeking more, Mandiant’s extensive library can be purchased via the separate Breach Analytics for Chronicle or as a managed service with Mandiant Hunt for Chronicle. This integration brings together Google’s robust infrastructure with Mandiant’s deep threat intelligence expertise.

Microsoft: Microsoft allows discovery and on-demand installation of OOTB content and solutions for Sentinel from the content hub. This efficient approach provides users with a wide array of tools to enhance their security posture, leveraging Microsoft’s vast ecosystem and regular updates.

Sumo Logic: Its built-in Cloud SIEM rules provide a straightforward, no-nonsense approach. This simplicity is ideal for users seeking quick deployment, though advanced users may need to tailor these global rules to better fit their specific environments.

LogRhythm / Exabeam: Both vendors independently provide OOTB content bundles and content libraries. However, with their impending merger, it remains to be seen how their offerings will be integrated.

Securonix: Their Threat Labs shares valuable research through blog posts and maintains the Autonomous Threat Sweeper IOC feed on GitHub. There is some room for improvement, especially for a company whose foundation was built around behavioral analytics.

Elastic: Elastic publishes their detection rules on GitHub under the Elastic License v2. Elastic customers can quickly operationalize these rules themselves; however, the license restricts MSSPs from using the detections in managed services and limits companies like SnapAttack from helping joint customers operationalize them. This restrictive licensing is a significant drawback in an otherwise open-source-friendly ecosystem.

Panther: Panther offers managed detection packs, which are helpful given the dual requirement for analysts to write SQL queries and Python rules. However, it highlights a broader issue: the troubling number of SOC analysts who lack proficiency in either language, diminishing their effectiveness in the role. This, unfortunately, speaks to a larger skills gap within the industry.

Google Curated Detections

In many cases, the problem now shifts from the availability of the detections to the adoption and operationalization of them.

When detections are stored completely outside of the platform, such as on GitHub, this can create a significant barrier for users. Awareness can be an issue; many users may not even know these resources exist, as evidenced by the relatively low number of stars on some GitHub repositories. Additionally, getting these detections into the SIEM can be cumbersome. If it’s not a simple copy-and-paste process, it might not get done. And if integrating them requires scripting or using an API, it becomes even less likely.
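As a rough illustration of why API-based import raises the bar, the sketch below stages Sigma-style rule dictionaries as payloads for a generic rule-import endpoint. The URL, payload schema, and field names are assumptions for illustration, not any vendor’s actual API.

```python
import json

# Hypothetical sketch: staging GitHub-hosted, Sigma-style rules for import
# into a SIEM. The endpoint and payload schema below are illustrative
# assumptions, not any vendor's real API.
SIEM_IMPORT_URL = "https://siem.example.com/api/v1/rules"  # placeholder

def build_rule_payload(rule: dict) -> dict:
    """Map a minimal rule dict onto a generic import payload."""
    return {
        "name": rule["title"],
        "query": rule["detection"],       # vendor-specific query string
        "severity": rule.get("level", "medium"),
        "enabled": False,                 # stage disabled so analysts review first
        "tags": rule.get("tags", []),
    }

def stage_rules(rules: list) -> list:
    """Build one payload per rule; the actual HTTP POST is left out."""
    return [build_rule_payload(r) for r in rules]

if __name__ == "__main__":
    repo_rules = [{
        "title": "Suspicious Encoded PowerShell",
        "detection": 'process="powershell.exe" AND encoded=true',
        "level": "high",
        "tags": ["attack.t1059.001"],
    }]
    print(json.dumps(stage_rules(repo_rules), indent=2))
```

Even a small amount of glue code like this is more effort than most teams will invest, which is exactly why out-of-platform detections so often go unused.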

GitHub Detection Repositories

Conversely, if detections are available within the platform, the integration process is much smoother. However, even something as minor as requiring users to manually activate a rule can hinder adoption. Imagine if your antivirus or EDR vendor required you to turn on every new rule or content pack! Streamlining this process, ideally through automation, can significantly reduce friction and increase usage.

Once the detections are activated, analysts face another challenge: understanding and effectively using them. Since they didn’t write the detection rules themselves, they might not fully grasp how they work, how to triage alerts, or how to tune them to minimize false positives. In such cases, comprehensive documentation becomes crucial. Unfortunately, many times, only the detection rules are provided, leaving organizations to create their own documentation as they operationalize these rules.

One of the most significant issues is determining what data sources are required to make a detection work and whether that data is available in the SIEM. At the SANS Purple Team Summit in 2021, I discussed this in my presentation, “Don’t Fear the Zero: A Test-Driven Approach to Analytic Development.” The question of what zero hits mean is vital — does it indicate the absence of a threat, or is there a problem with the data pipeline? Being able to test and validate that a rule is functioning correctly once it is activated is crucial for effective security operations, and it is something largely missing from the SIEM ecosystem today.
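A minimal sketch of that test-driven mindset, under the assumption that both a rule’s hit count and its underlying source’s event count can be queried:

```python
def triage_zero_hits(rule_hits: int, source_event_count: int) -> str:
    """Classify what zero hits on a detection likely means.

    Illustrative logic only: a zero-hit rule over a source that is still
    ingesting events is plausibly 'quiet' (no threat observed), while a
    zero-hit rule over a source producing no events at all points to a
    broken data pipeline rather than an absence of threats.
    """
    if source_event_count == 0:
        return "pipeline-gap"   # the data the rule depends on never arrived
    if rule_hits == 0:
        return "quiet"          # data is flowing; the rule simply didn't match
    return "firing"
```

In practice, a check like this can run on a schedule, surfacing rules whose required data sources have silently gone dark.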

The New Detection Paradigm for EDR

When Endpoint Detection and Response (EDR) emerged, it revolutionized the cybersecurity landscape with its unique, vertically integrated design. EDR vendors controlled everything from the sensors capturing telemetry to the entire data model and the platform where detection and response activities took place. This comprehensive control allowed them to create a seamless and highly effective security solution. If they needed to capture new telemetry to alert on emerging threats, they could simply update their sensors to hook into the relevant operating system APIs. This flexibility meant there was virtually no limit to what EDR solutions could capture and alert on.

Because EDRs managed the entire data collection process and owned the entire data pipeline, they could offer detections and alerts that “just work.” Customers didn’t need to enable individual rules or even create them. As new threats emerged, the EDR systems would automatically roll out updated detections. This frictionless value proposition was incredibly appealing: install the agent, and the protection was immediate and comprehensive.

For EDRs to be deployed as widely as they are, their detections must maintain an extremely low false positive rate. What works for one organization may not be effective for another. But finding the common ground across thousands of organizations isn’t “a rising tide lifts all boats”; rather, it’s more like “our detections are only as good as the ones in the worst-performing customer networks.” When a detection must achieve a very low false positive rate, it often means compromising on the breadth and flexibility of detection capabilities to ensure universal applicability across diverse environments.

But because most alerts were generally high confidence and actionable, trust in EDR vendors grew significantly over time. Although users might not see all the detections, they trusted that the major threats were covered. Marketing materials boasted claims like “we stop breaches,” and without enough data points to contradict these assertions, customers were inclined to believe them. High scores in the MITRE ATT&CK evaluations further bolstered this trust, offering empirical evidence of their detection capabilities, despite the fact that these evaluations were based on well-documented open-source TTPs, and vendors often “studied” for these tests, knowing the scenarios well in advance. The old adage, “Nobody ever got fired for buying IBM,” was becoming applicable to major EDR vendors as well.

However, for those attempting to measure detection coverage, the task proved challenging. EDRs typically utilized “hidden detections” that only became visible when triggered. This approach led many to assume coverage was comprehensive until proven otherwise. Yet, EDRs had their blind spots, especially when it came to detecting living-off-the-land techniques that could produce numerous false positives or keeping up with new CVEs and celebrity threats. Breach and Attack Simulation (BAS) tools offered some insights into coverage but couldn’t encompass the full range of an adversary’s tactics, particularly those highly malicious actions no sane person would run in a production environment or the true “unknown unknowns” intended for threat hunters to discover. In truth, the measurable detection coverage provided by EDRs could be quite limited.

Measuring detection coverage can be difficult…

With EDR, customers grew accustomed to not seeing the logic behind their detections, or even knowing which detections existed in the first place. The rationale often given is, “We’re doing this to protect the integrity of our detections — if you can’t see them, the adversary can’t see them either, so they can’t as easily get around them.” Make no mistake — this “fog of coverage” benefits vendors more than customers. Customers can have significant gaps in detection without being aware they are unprotected. They cannot easily verify or disprove that vendors have them adequately covered.

The dangers associated with hidden detections become even more perilous in the SIEM space. Historically, SIEMs have not been vertically integrated like EDRs. They ingest whatever data they can get their hands on. SIEMs are the plumbers of the security world — manipulating, normalizing, and enriching the data as it flows through the pipeline. However, with hidden detections, there can be multiple “leaks” in this pipeline:

  • Duplicated Efforts: When users can’t see the built-in SIEM rules, they often end up writing their own versions to ensure they’re covered against specific threats. This duplication wastes valuable time and resources and increases the likelihood of dealing with multiple alerts for the same incident.
 
  • Operational Drift: Hidden detections prevent users from seeing rule changes. What if a previously effective detection stops working? Would the user even notice? Without visibility into the underlying detection logic changes, a vendor could inadvertently break a detection during routine maintenance or over-tune it to the point of ineffectiveness. This lack of transparency makes it difficult to trust coverage and identify breakages.
 
  • Non-Fungible Data: Not all data is equal or interchangeable. A detection that works for Okta likely won’t work for Azure AD, just as an AWS CloudTrail detection won’t work for GCP, even if they perform similar functions. This issue is particularly pronounced in the EDR world. For instance, one vendor might log a registry key as HKEY_LOCAL_MACHINE while another uses HKLM, or one might capture a file path as \Device\HarddiskVolume3 while another logs it as C:\. While normalization and data models can address some of these discrepancies, they can’t compensate for vendors that simply don’t capture the same data. In such cases, those detections will never work effectively.
 
These issues strongly suggest that platforms and consolidation will likely prevail, with the best experiences and support coming from SIEMs that are vertically integrated with their logging and telemetry, similar to EDRs.
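A toy sketch of the normalization layer described above (the alias table and volume map are illustrative; real data models handle far more variants):

```python
# Illustrative normalization sketch: collapse vendor-specific spellings of
# the same artifact so one detection matches either form. The alias table
# and volume map are toy examples, not a complete data model.
REG_HIVE_ALIASES = {
    "HKEY_LOCAL_MACHINE": "HKLM",
    "HKEY_CURRENT_USER": "HKCU",
    "HKEY_CLASSES_ROOT": "HKCR",
    "HKEY_USERS": "HKU",
}

def normalize_registry_path(path: str) -> str:
    """Rewrite long hive names to their short form (HKLM, HKCU, ...)."""
    hive, sep, rest = path.partition("\\")
    return REG_HIVE_ALIASES.get(hive.upper(), hive.upper()) + sep + rest

def normalize_device_path(path: str, volume_map: dict) -> str:
    """Rewrite NT device paths to drive letters using a host-specific map."""
    for device, drive in volume_map.items():
        if path.startswith(device):
            return drive + path[len(device):]
    return path
```

Note that the device-path mapping requires per-host context (which volume is which drive), underscoring the point: normalization can reconcile spellings, but it cannot conjure data a sensor never captured.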
 

As EDR vendors like CrowdStrike and SentinelOne enter the SIEM market, the question remains: will they bring this hidden detection paradigm with them? There’s already precedent with Google SecOps, which offers “curated detections” where users only see alerts, not the underlying logic. My sincere hope — and expectation — is that customers, accustomed to seeing and interacting with SIEM rules, will not tolerate these “hidden detections” in the SIEM environment. Transparency and interaction are crucial for maintaining trust and ensuring comprehensive threat coverage.

The Rise of AI and Machine Learning Detections

Over the last decade, every major security tool has incorporated a machine learning detection layer. Unlike traditional signature-based alerting, these models enabled security tools to detect threats they hadn’t encountered before or didn’t have existing signatures for. This innovation allowed for more proactive threat detection, identifying suspicious activities that might have previously slipped through the cracks. Machine learning models are now widely used for tasks such as classifying emails as phishing or spam and determining whether an unknown binary is malicious.

Building and training these models required large amounts of data. The process involved selecting the most relevant features of the data, weighting them appropriately, and using these inputs to train the model. Model selection varied based on the data’s characteristics and the problem at hand — for example, a classification algorithm might mark emails as “spam” or “not spam,” while a clustering algorithm might group similar types of events together to identify patterns of attacks without predefined labels. Models could be developed for specific tasks or chained together to analyze different facets of the data.
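To make the clustering case concrete, here is a deliberately tiny one-dimensional k-means (k=2) in pure Python, grouping events by a single feature such as bytes transferred, with no labels involved. Real pipelines use richer features and proper ML libraries; this is only a sketch of the idea.

```python
def two_means(values, iters=20):
    """Tiny 1-D k-means with k=2: group unlabeled events by one feature.

    Toy example only. Centroids start at the extremes, then the loop
    alternates between assigning each value to its nearest centroid and
    recomputing each centroid as its group's mean.
    """
    centroids = [min(values), max(values)]
    groups = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for v in values:
            nearest = abs(v - centroids[1]) < abs(v - centroids[0])
            groups[nearest].append(v)  # bool indexes group 0 or 1
        centroids = [
            sum(g) / len(g) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids, groups

# Example: session byte counts split into a "normal" and an "outlier" cluster.
centroids, clusters = two_means([1, 2, 3, 100, 110, 120])
```

The algorithm never needs anyone to say which sessions are malicious; it simply surfaces structure in the data for an analyst to interpret.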

This approach, while groundbreaking, wasn’t without its challenges. Insufficient training data would lead to models that couldn’t learn patterns effectively, resulting in inaccurate predictions, or to overfitting on the training data and failing to generalize to new data. Labeled data was absent more often than not, ruling out supervised learning; in those cases, the best option was often to cluster the data — for example, grouping similar network traffic patterns without explicitly labeling them as “benign” or “malicious.” Sometimes, labels existed but were too biased, producing models that over-indexed on certain threats while neglecting others, reducing detection accuracy and trustworthiness.

Additionally, machine learning outputs needed to be explainable to provide value. But many of these models couldn’t do that. Neural networks are often considered black boxes due to their complex architectures with many layers and parameters, making it difficult to interpret how individual features influence predictions. This opacity brought back the hidden detection problem: we often don’t know what features the model is keying in on or why it considers something malicious. Users might only see a confidence score indicating that an event is malicious with “medium” or “high” confidence, but without understanding the reasoning behind it.

Large Language Models (LLMs) and traditional ML models share foundational principles of machine learning, but differ significantly in scale, complexity, and applications. LLMs offer broad generalization and versatility in tasks, whereas traditional ML models are often more specialized and interpretable for specific use cases. It’s still too early to determine what the “killer apps” for LLMs will be, or the limitations that will affect their effectiveness and reach. Today, we know that LLMs can be unpredictable, may hallucinate facts, and make confident yet incorrect assertions, often leaving users unsure whether to trust the output.

Despite these challenges, machine learning will undoubtedly play a significant role in the future of SIEMs. The data fusion capabilities of SIEMs, centralizing all data for model access, will be a game changer when executed correctly. Inferencing models as data flows into the SIEM will allow it to detect subtle features that an analyst might overlook or that are difficult to capture with signature-based detection. These models can analyze different events and data sources, understanding and making sense of complex relationships. Even if some outputs remain unexplainable, well-trained models can be highly trustworthy.
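A sketch of what streaming inference could look like, with a rolling z-score standing in for a trained model. The class name, window size, and threshold are placeholder assumptions; the point is the pattern of scoring each event against learned state as it arrives.

```python
from collections import deque
import math

class StreamScorer:
    """Score events as they arrive using a rolling baseline (z-score).

    Stand-in for 'inferencing models as data flows into the SIEM': any
    pre-trained model could replace the z-score; the per-event scoring
    pattern is what matters.
    """
    def __init__(self, window=100, threshold=3.0):
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def score(self, value: float) -> bool:
        """Return True if the event deviates sharply from the recent baseline."""
        flagged = False
        if len(self.window) >= 10:  # need a minimal baseline before scoring
            mean = sum(self.window) / len(self.window)
            var = sum((x - mean) ** 2 for x in self.window) / len(self.window)
            std = math.sqrt(var) or 1.0  # guard against a zero-variance window
            flagged = abs(value - mean) / std > self.threshold
        self.window.append(value)
        return flagged
```

Swapping the z-score for a real model changes nothing about the plumbing: the SIEM feeds each normalized event to the scorer and routes flagged events to the alert queue.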

Lessons Learned from the Antivirus Market

Antivirus (AV) is perhaps the oldest security tool, initially relying entirely on signature-based detections. Regular content updates protected customers as new signatures were added to the database. It has always been, and largely still is, a cat-and-mouse game between attackers and vendors. Someone had to be “patient zero” and get infected before a signature could be created to protect others.

Machine learning eventually entered the AV scene, with some vendors like Cylance diving in head-first, claiming to understand the “DNA of malware.” This approach meant they could extract specific features from malware samples and train models to predict if a new sample was malicious or benign. ML models showed promising results and often eliminated the need for a “patient zero.” However, these models needed to be continuously updated and retrained as new attacks and variants emerged. In some cases, there was a higher return on investment in sticking with traditional methods — it is incredibly fast and cheap to create a static signature. Today, most AV solutions remain largely reactive and signature-based, with little innovation because the existing methods work “well enough.”

History has a way of repeating itself. What does this mean for SIEM? Will we see a rise in fully-autonomous, automated SIEM detections based on machine learning models and AI? Or will we continue to rely on creating signatures as a stop-gap measure? As we navigate these questions, the future of SIEM may hinge on balancing the proven effectiveness of signature-based detections with the potential of cutting-edge machine learning technologies.

 

The Future of SIEM Detections

While I don’t have a crystal ball, observing how the industry has evolved over the last 20 years provides some insights into what the future holds:

  • Signature-based detections will likely always have a place in SIEMs. There is no quicker, easier, or cheaper way to alert on something — especially compared to the time required to adjust a machine learning model. AI, and more specifically Large Language Models (LLMs), will likely play a part in accelerating time-to-detection and creating more robust signatures.
 
  • Detections need to be transparent and explainable. Whether rejecting built-in, “hidden detections” or demanding more from future ML models, analysts need to understand how a detection works to tune and troubleshoot it, measure its robustness and coverage, and determine if an alert is a false positive.
 
  • Machine learning will play a significant role in the future of SIEMs. A truly next-gen SIEM capable of centralizing and labeling data, training or refining models per customer, and inferencing models as events stream into the system will be a game-changer.
 

Machine learning has revolutionized threat detection, allowing security tools to identify previously unseen threats and adapt to new attack vectors. While there are challenges in terms of data requirements, model explainability, and potential biases, the benefits of machine learning in enhancing SIEM capabilities are undeniable. As we continue to refine these technologies and integrate them more seamlessly into SIEM platforms, we can expect a future where threat detection is more proactive, accurate, and comprehensive than ever before. The fusion of vast amounts of data with sophisticated machine learning models holds the promise of transforming how we approach cybersecurity, making our defenses smarter and more resilient in the face of evolving threats.

About SnapAttack: SnapAttack is an innovator in proactive, threat-informed security solutions. The SnapAttack platform helps organizations answer their most pressing question: “Are we protected against the threats that matter?”

By rolling threat intelligence, adversary emulation, detection engineering, threat hunting, and purple teaming into a single, easy-to-use product with a no-code interface, SnapAttack enables companies to get more from their tools and more from their teams so they can finally stay ahead of the threat.