Checkmarx's New SAST Engine Isn't About the LLM: It's About What Happens After
Understanding post-scan intelligence and how to leverage modern SAST architectures for actionable security outcomes.
1. Why This Matters
The major static application security testing (SAST) vendors are now wrapping large language models around their legacy scanning engines and marketing the result as revolutionary AI-powered security. Here is what most of that marketing glosses over: Checkmarx's new SAST engine isn't about the LLM. It's about what happens after the initial scan completes.
The fundamental problem with traditional SAST tools hasn't been detection capability. Most mature scanners catch the majority of common vulnerabilities. The real pain points have always been:
- False positive fatigue: Security teams drowning in alerts that turn out to be non-issues
- Lack of context: Findings that don't explain exploitability or business impact
- Remediation gaps: Developers receiving warnings without actionable fix guidance
- Integration friction: Results that don't flow into existing workflows
What truly matters in modern SAST isn't the scanning engine itself. It's the post-scan intelligence layer that transforms raw findings into prioritized, contextual, and actionable security insights. This guide walks you through implementing and optimizing this critical layer in your security pipeline.
2. Prerequisites
Before implementing post-scan intelligence workflows, ensure you have:
Technical Requirements:
- Access to a SAST platform with API capabilities (Checkmarx One, Checkmarx SAST 9.x+)
- CI/CD pipeline with webhook support (Jenkins, GitLab CI, or GitHub Actions)
- Python 3.9+ for automation scripts
- Access to your source code repository with read permissions
Knowledge Requirements:
- Basic understanding of REST APIs
- Familiarity with JSON data structures
- Understanding of common vulnerability types (CWE classifications)
- Experience with your organization's ticketing system
Access and Credentials:
- SAST platform API tokens with scan results read permissions
- Repository access tokens for code context retrieval
- Ticketing system API credentials (Jira, ServiceNow, or Azure DevOps)
3. Step-by-Step Instructions
Step 1: Configure API Access to Your SAST Platform
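First, establish programmatic access to retrieve scan results. This example is modeled on Checkmarx One's REST API; treat the endpoint paths and the bearer-token header as illustrative and confirm them against your platform's API reference: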
import os
from typing import Dict, List

import requests


class SASTResultsClient:
    """Thin wrapper around the SAST platform's results API."""

    def __init__(self, base_url: str, api_key: str, timeout: float = 30.0):
        self.base_url = base_url.rstrip('/')
        # Seconds; keeps a stalled request from hanging a CI job indefinitely
        self.timeout = timeout
        self.headers = {
            'Authorization': f'Bearer {api_key}',
            'Content-Type': 'application/json'
        }

    def get_scan_results(self, scan_id: str) -> Dict:
        """Retrieve raw scan results from the SAST platform."""
        endpoint = f"{self.base_url}/api/results/{scan_id}"
        response = requests.get(endpoint, headers=self.headers,
                                timeout=self.timeout)
        response.raise_for_status()
        return response.json()

    def get_vulnerability_details(self, result_id: str) -> Dict:
        """Fetch detailed information for a specific finding."""
        endpoint = f"{self.base_url}/api/results/{result_id}/details"
        response = requests.get(endpoint, headers=self.headers,
                                timeout=self.timeout)
        response.raise_for_status()
        return response.json()


# Initialize the client
client = SASTResultsClient(
    base_url=os.environ['SAST_API_URL'],
    api_key=os.environ['SAST_API_KEY']
)
This client forms the foundation for all post-scan operations. Store credentials securely using HashiCorp Vault or your CI/CD platform's secret management.
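Whichever secret store you use, the client ultimately reads its settings from the environment, so it helps to fail fast with one clear message when a variable is missing. The following is a minimal sketch under that assumption; `SASTConfig` and `load_sast_config` are hypothetical names, not part of any SAST vendor's SDK, and the only variables assumed are the two used above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SASTConfig:
    """Hypothetical container for the two settings the client needs."""
    base_url: str
    api_key: str


def load_sast_config(env: Mapping[str, str] = os.environ) -> SASTConfig:
    """Read SAST_API_URL and SAST_API_KEY, reporting every missing name at once."""
    required = ("SAST_API_URL", "SAST_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    # Strip a trailing slash so endpoint paths can be appended safely
    return SASTConfig(
        base_url=env["SAST_API_URL"].rstrip("/"),
        api_key=env["SAST_API_KEY"],
    )
```

Passing the mapping as a parameter keeps the function easy to exercise in tests without touching the real process environment.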
Step 2: Implement the Post-Scan Intelligence Layer
The critical differentiation happens here. Build a processing layer that enriches raw findings with context:
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class ExploitabilityRating(Enum):
    CONFIRMED = "confirmed"
    LIKELY = "likely"
    POSSIBLE = "possible"
    UNLIKELY = "unlikely"


@dataclass
class EnrichedFinding:
    original_finding: Dict
    exploitability: ExploitabilityRating
    data_flow_context: str
    remediation_snippet: str
    business_context: Optional[str]
    priority_score: float


class PostScanIntelligence:
    def __init__(self, sast_client: SASTResultsClient):
        self.client = sast_client
        self.context_cache = {}

    def analyze_data_flow(self, finding: Dict) -> str:
        """
        Analyze the complete data flow from source to sink.
        This is where the real intelligence happens.
        """
        source_file = finding.get('sourceFile')
        source_line = finding.get('sourceLine')
        sink_file = finding.get('sinkFile')
        sink_line = finding.get('sinkLine')
        # Retrieve code context around source and sink
        source_context = self._get_code_context(source_file, source_line, 5)
        sink_context = self._get_code_context(sink_file, sink_line, 5)
        # Build human-readable data flow description
        data_flow = f"""
USER INPUT ENTERS AT:
File: {source_file}:{source_line}
{source_context}
VULNERABILITY TRIGGERED AT:
File: {sink_file}:{sink_line}
{sink_context}
FLOW PATH: {' -> '.join(finding.get('flowPath', []))}
"""
        return data_flow

    def _get_code_context(self, file_path: Optional[str],
                          line_num: Optional[int],
                          context_lines: int) -> str:
        """Retrieve surrounding code for context."""
        # Findings without a location cannot be resolved to source lines
        if not file_path or line_num is None:
            return "[Code context unavailable: finding has no file or line]"
        try:
            with open(file_path, 'r', encoding='utf-8', errors='replace') as f:
                lines = f.readlines()
        except OSError:
            return f"[Code context unavailable for {file_path}]"
        start = max(0, line_num - context_lines - 1)
        end = min(len(lines), line_num + context_lines)
        context = []
        for i, line in enumerate(lines[start:end], start=start + 1):
            # Mark the reported line so it stands out in the excerpt
            marker = ">>>" if i == line_num else "   "
            context.append(f"{marker} {i}: {line.rstrip()}")
        return '\n'.join(context)

    def calculate_priority(self, finding: Dict) -> float:
        """
        Generate a normalized priority score (0-100) based on
        multiple factors beyond just severity.
        """
        base_score = {
            'critical': 90,
            'high': 70,
            'medium': 50,
            'low': 30,
            'info': 10
        }.get((finding.get('severity') or 'medium').lower(), 50)
        modifiers = 0
        # Increase priority for publicly accessible endpoints
        if self._is_public_endpoint(finding):
            modifiers += 15
        # Increase for authentication/authorization related files
        if self._is_security_critical_path(finding):
            modifiers += 10
        # Decrease for test files
        if self._is_test_file(finding):
            modifiers -= 30
        # Factor in exploitability
        exploitability = self._assess_exploitability(finding)
        if exploitability == ExploitabilityRating.CONFIRMED:
            modifiers += 20
        elif exploitability == ExploitabilityRating.UNLIKELY:
            modifiers -= 20
        # Clamp to the documented 0-100 range
        return float(min(100, max(0, base_score + modifiers)))

    def _is_public_endpoint(self, finding: Dict) -> bool:
        """Check if the vulnerable code handles external requests."""
        public_patterns = ['controller', 'handler', 'endpoint',
                           'api', 'route', 'view']
        file_path = (finding.get('sourceFile') or '').lower()
        return any(pattern in file_path for pattern in public_patterns)

    def _is_security_critical_path(self, finding: Dict) -> bool:
        """Identify security-sensitive code paths."""
        critical_patterns = ['auth', 'login', 'session', 'token',
                             'password', 'credential', 'permission']
        file_path = (finding.get('sourceFile') or '').lower()
        return any(pattern in file_path for pattern in critical_patterns)

    def _is_test_file(self, finding: Dict) -> bool:
        """Detect test files that should be deprioritized."""
        # Substring matching is a heuristic: 'test' also matches paths
        # such as 'latest/' or 'contest.py', so tune these for your repo.
        test_patterns = ['test', 'spec', 'mock', '__tests__', 'fixtures']
        file_path = (finding.get('sourceFile') or '').lower()
        return any(pattern in file_path for pattern in test_patterns)

    def _assess_exploitability(self, finding: Dict) -> ExploitabilityRating:
        """Determine how easily the vulnerability can be exploited."""
        # Check for direct user input to dangerous function
        if finding.get('directUserInput') and finding.get('noSanitization'):
            return ExploitabilityRating.CONFIRMED
        # Check for validation present but bypassable
        if finding.get('partialValidation'):
            return ExploitabilityRating.LIKELY
        # Check for multiple prerequisites needed
        if finding.get('requiresAuthentication') and finding.get('requiresPrivilege'):
            return ExploitabilityRating.UNLIKELY
        return ExploitabilityRating.POSSIBLE
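To see how the modifiers interact, it helps to work a few scores by hand. The sketch below reproduces the base scores and modifiers from `calculate_priority` in a standalone function; `score_finding` and its boolean flags are hypothetical stand-ins for the file-path heuristics, introduced only so the arithmetic can be run in isolation.

```python
BASE_SCORES = {"critical": 90, "high": 70, "medium": 50, "low": 30, "info": 10}


def score_finding(severity: str, *, public: bool = False,
                  security_critical: bool = False, test_file: bool = False,
                  exploitability: str = "possible") -> int:
    """Apply the same base scores and modifiers as calculate_priority."""
    score = BASE_SCORES.get(severity.lower(), 50)
    if public:
        score += 15
    if security_critical:
        score += 10
    if test_file:
        score -= 30
    if exploitability == "confirmed":
        score += 20
    elif exploitability == "unlikely":
        score -= 20
    # Clamp to the 0-100 range
    return min(100, max(0, score))


# High severity, public endpoint, confirmed: 70 + 15 + 20 = 105, clamped to 100
print(score_finding("high", public=True, exploitability="confirmed"))
# Critical severity in a test file, unlikely to be exploited: 90 - 30 - 20 = 40
print(score_finding("critical", test_file=True, exploitability="unlikely"))
# Medium severity in an auth module: 50 + 10 = 60
print(score_finding("medium", security_critical=True))
```

Two consequences are worth noting. A critical finding in a test file can land well below a ticketing threshold of 70, which is the intended deprioritization. And because of the clamp, several different combinations tie at 100, so the score alone cannot rank the very top of the list.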
Step 3: Generate Contextual Remediation Guidance
Move beyond generic fix suggestions to code-aware remediation:
from typing import Dict


class RemediationGenerator:
    def __init__(self):
        self.remediation_templates = self._load_templates()

    def _load_templates(self) -> Dict[str, str]:
        """Placeholder: template loading is not shown in this guide."""
        return {}

    def generate_fix(self, finding: Dict, code_context: str) -> str:
        """
        Generate a specific fix based on the vulnerability type
        and the actual code pattern detected.
        """
        vuln_type = finding.get('vulnerabilityType', '')
        language = finding.get('language', 'unknown')
        if vuln_type == 'SQL_Injection':
            return self._generate_sqli_fix(finding, language)
        elif vuln_type == 'XSS':
            return self._generate_xss_fix(finding, language)
        elif vuln_type == 'Path_Traversal':
            return self._generate_path_traversal_fix(finding, language)
        else:
            return self._get_generic_remediation(vuln_type)

    def _get_generic_remediation(self, vuln_type: str) -> str:
        """Placeholder fallback; the generic text is not shown in this guide."""
        return self.remediation_templates.get(
            vuln_type, f"No specific remediation template for {vuln_type}.")

    # Placeholders: the XSS and path traversal generators are not shown here.
    def _generate_xss_fix(self, finding: Dict, language: str) -> str:
        return self._get_generic_remediation('XSS')

    def _generate_path_traversal_fix(self, finding: Dict, language: str) -> str:
        return self._get_generic_remediation('Path_Traversal')

    def _generate_sqli_fix(self, finding: Dict, language: str) -> str:
        """Generate SQL injection specific remediation."""
        if language.lower() == 'python':
            return """
RECOMMENDED FIX (Python):
Replace dynamic query construction:

    # VULNERABLE - Do not use
    query = f"SELECT * FROM users WHERE id = {user_input}"
    cursor.execute(query)

    # SECURE - Use parameterized queries
    query = "SELECT * FROM users WHERE id = %s"
    cursor.execute(query, (user_input,))

If using an ORM like SQLAlchemy:

    # SECURE - ORM with bound parameters
    user = session.query(User).filter(User.id == user_input).first()

ADDITIONAL STEPS:
1. Implement input validation using allow-lists where possible
2. Apply principle of least privilege to database accounts
3. Consider using stored procedures for complex queries
"""
        elif language.lower() in ['java', 'kotlin']:
            return """
RECOMMENDED FIX (Java):
Replace string concatenation with PreparedStatement:

    // VULNERABLE - Do not use
    String query = "SELECT * FROM users WHERE id = " + userInput;
    Statement stmt = connection.createStatement();
    ResultSet rs = stmt.executeQuery(query);

    // SECURE - Use PreparedStatement
    String query = "SELECT * FROM users WHERE id = ?";
    PreparedStatement pstmt = connection.prepareStatement(query);
    pstmt.setString(1, userInput);
    ResultSet rs = pstmt.executeQuery();
"""
        return self._get_generic_remediation('SQL_Injection')
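The difference between the vulnerable and parameterized forms in the Python template can be demonstrated end to end with the standard library. This is a minimal sketch using an in-memory SQLite database; the table, the function names, and the payload are illustrative assumptions. Note that `sqlite3` uses `?` placeholders, whereas the `%s` style in the template applies to drivers such as psycopg2.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two demo users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany("INSERT INTO users (username) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


def lookup_vulnerable(conn: sqlite3.Connection, username: str) -> list:
    # VULNERABLE: user input is spliced into the SQL text itself
    query = f"SELECT username FROM users WHERE username = '{username}'"
    return sorted(row[0] for row in conn.execute(query))


def lookup_parameterized(conn: sqlite3.Connection, username: str) -> list:
    # SECURE: the driver binds the value, so it is never parsed as SQL
    query = "SELECT username FROM users WHERE username = ?"
    return sorted(row[0] for row in conn.execute(query, (username,)))


conn = build_demo_db()
payload = "' OR '1'='1"
# The spliced query becomes: ... WHERE username = '' OR '1'='1'
print(lookup_vulnerable(conn, payload))     # every row matches
print(lookup_parameterized(conn, payload))  # no user has that literal name
```

With the payload above, the concatenated query returns both users while the parameterized query returns none, which is the behavior a generated remediation snippet should steer developers toward.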
Step 4: Integrate Into Your CI/CD Pipeline
Create a GitHub Actions workflow that processes results after scans:
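# .github/workflows/sast-post-processing.yml
name: SAST Post-Scan Intelligence
on:
  workflow_run:
    workflows: ["Security Scan"]
    types:
      - completed
jobs:
  process-results:
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          # workflow_run jobs check out the default branch unless told otherwise;
          # pin to the commit that was actually scanned
          ref: ${{ github.event.workflow_run.head_sha }}
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          pip install requests pyyaml jira
      # NOTE: workflow_run.id is the GitHub Actions run ID of the "Security Scan"
      # workflow, not a SAST scan ID. post_scan_processor.py is assumed to map it
      # to the platform's scan ID (for example, via an artifact the scan job uploads).
      - name: Process SAST Results
        env:
          SAST_API_URL: ${{ secrets.SAST_API_URL }}
          SAST_API_KEY: ${{ secrets.SAST_API_KEY }}
        run: |
          mkdir -p results
          python scripts/post_scan_processor.py \
            --scan-id ${{ github.event.workflow_run.id }} \
            --priority-threshold 70 \
            --output results/enriched_findings.json
      - name: Create Issues for Critical Findings
        # env is scoped per step, so the Jira credentials are declared here
        env:
          JIRA_URL: ${{ secrets.JIRA_URL }}
          JIRA_TOKEN: ${{ secrets.JIRA_TOKEN }}
        run: |
          python scripts/create_security_tickets.py \
            --input results/enriched_findings.json \
            --min-priority 80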
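The workflow calls `scripts/post_scan_processor.py`, which this guide does not show. As a rough sketch of its command-line surface only, the following parses the three flags used in the workflow and writes the findings that meet the threshold. The function names and the assumption that each finding is a dict with a numeric `priority_score` key are hypothetical.

```python
import argparse
import json
from pathlib import Path
from typing import Dict, List, Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the flags the workflow passes to the processor."""
    parser = argparse.ArgumentParser(
        description="Filter enriched SAST findings by priority.")
    parser.add_argument("--scan-id", required=True)
    parser.add_argument("--priority-threshold", type=float, default=70.0)
    parser.add_argument("--output", type=Path, required=True)
    return parser.parse_args(argv)


def write_findings(findings: List[Dict], threshold: float, output: Path) -> int:
    """Keep findings at or above the threshold, highest first, and write JSON."""
    kept = sorted(
        (f for f in findings if f.get("priority_score", 0) >= threshold),
        key=lambda f: f["priority_score"],
        reverse=True,
    )
    # Create results/ (or any parent directory) if it does not exist yet
    output.parent.mkdir(parents=True, exist_ok=True)
    output.write_text(json.dumps(kept, indent=2), encoding="utf-8")
    return len(kept)
```

A `main()` would call `parse_args()`, fetch and enrich the results for `args.scan_id`, and then pass them to `write_findings(enriched, args.priority_threshold, args.output)`.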
Step 5: Configure Automated Ticket Creation
Route enriched findings to your tracking system:
from typing import List

from jira import JIRA


def create_security_tickets(enriched_findings: List[EnrichedFinding],
                            jira_client: JIRA,
                            project_key: str,
                            min_priority: float = 70) -> None:
    """Create detailed tickets for actionable findings."""
    for finding in enriched_findings:
        # Skip findings below the ticketing threshold
        if finding.priority_score < min_priority:
            continue
        original = finding.original_finding
        # Jira wiki headings ("h2.") must start at column 0, so the
        # description is assembled from unindented lines.
        description = '\n'.join([
            "h2. Vulnerability Details",
            f"*Type:* {original.get('vulnerabilityType')}",
            f"*Severity:* {original.get('severity')}",
            f"*Priority Score:* {finding.priority_score}/100",
            f"*Exploitability:* {finding.exploitability.value}",
            "h2. Data Flow Analysis",
            finding.data_flow_context,
            "h2. Remediation Guidance",
            finding.remediation_snippet,
            "h2. Business Context",
            finding.business_context or 'N/A',
        ])
        issue_dict = {
            'project': {'key': project_key},
            'summary': f"[Security] {original.get('vulnerabilityType')} in {original.get('sourceFile')}",
            'description': description,
            'issuetype': {'name': 'Bug'},
            'priority': {'name': 'High' if finding.priority_score >= 80 else 'Medium'},
            'labels': ['security', 'sast', 'automated']
        }
        jira_client.create_issue(fields=issue_dict)
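Before pointing this at a real Jira project, it can be useful to dry-run the routing logic against a stand-in client. The sketch below is one way to do that, under stated assumptions: `RecordingJira` is a hypothetical stub that only mimics a `create_issue(fields=...)` call, findings are simplified to plain dicts, and the 80-point cutoff for "High" mirrors the ticket code above.

```python
from typing import Any, Dict, List


def jira_priority(score: float) -> str:
    """Map the 0-100 priority score to a Jira priority name."""
    return "High" if score >= 80 else "Medium"


class RecordingJira:
    """Hypothetical stand-in that records payloads instead of sending them."""

    def __init__(self) -> None:
        self.created: List[Dict[str, Any]] = []

    def create_issue(self, fields: Dict[str, Any]) -> Dict[str, Any]:
        self.created.append(fields)
        return fields


def file_tickets(findings: List[Dict[str, Any]], client: Any,
                 project_key: str, min_priority: float = 70.0) -> int:
    """File one ticket per finding at or above min_priority; return the count."""
    filed = 0
    for item in findings:
        if item["priority_score"] < min_priority:
            continue
        client.create_issue(fields={
            "project": {"key": project_key},
            "summary": f"[Security] {item['vulnerabilityType']} in {item['sourceFile']}",
            "issuetype": {"name": "Bug"},
            "priority": {"name": jira_priority(item["priority_score"])},
            "labels": ["security", "sast", "automated"],
        })
        filed += 1
    return filed
```

Running `file_tickets` with a `RecordingJira` instance lets you inspect `client.created` and confirm which findings would be filed, and at what priority, before any real issue is created.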