
Checkmarx SAST Engine: Post-Scan Intelligence


Checkmarx's New SAST Engine Isn't About the LLM: It's About What Happens After

Understanding post-scan intelligence and how to leverage modern SAST architectures for actionable security outcomes.

1. Why This Matters

The major static application security testing (SAST) vendors are now wrapping large language models around their legacy scanning engines and marketing the result as revolutionary AI-powered security. What most of that marketing glosses over is that Checkmarx's new SAST engine isn't about the LLM: it's about what happens after the initial scan completes.

The fundamental problem with traditional SAST tools hasn't been detection capability. Most mature scanners catch the majority of common vulnerabilities. The real pain points have always been:

3. Step-by-Step Instructions

Step 1: Configure API Access to Your SAST Platform

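First, establish programmatic access to retrieve scan results. This example is modeled on Checkmarx One's REST API; the endpoint paths and bearer-token handling shown are illustrative, so confirm them against your tenant's API reference: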

import requests
from typing import Dict, List
import os

class SASTResultsClient:
    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url.rstrip('/')
        self.headers = {
            'Authorization': f'Bearer {api_key}',
            'Content-Type': 'application/json'
        }
    
    def get_scan_results(self, scan_id: str) -> Dict:
        """Retrieve raw scan results from the SAST platform."""
        endpoint = f"{self.base_url}/api/results/{scan_id}"
        # Always set a timeout; requests waits indefinitely by default
        response = requests.get(endpoint, headers=self.headers, timeout=30)
        response.raise_for_status()
        return response.json()
    
    def get_vulnerability_details(self, result_id: str) -> Dict:
        """Fetch detailed information for a specific finding."""
        endpoint = f"{self.base_url}/api/results/{result_id}/details"
        response = requests.get(endpoint, headers=self.headers, timeout=30)
        response.raise_for_status()
        return response.json()

# Initialize the client
client = SASTResultsClient(
    base_url=os.environ['SAST_API_URL'],
    api_key=os.environ['SAST_API_KEY']
)

This client forms the foundation for all post-scan operations. Store credentials securely using HashiCorp Vault or your CI/CD platform's secret management.
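
The Step 2 code reads flat keys such as sourceFile, sinkLine, and flowPath, and the raw API payload may not arrive in that shape. The following is a minimal sketch of a normalizer, assuming a hypothetical raw result that carries a source-to-sink ordered `nodes` list with `fileName`, `line`, and `name` fields plus a `queryName` field. Those raw field names are assumptions for illustration, not a confirmed Checkmarx schema, so adjust them to what your platform actually returns.

```python
from typing import Any, Dict, List


def normalize_finding(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten a raw result into the keys the Step 2 code reads.

    ASSUMPTION: raw["nodes"] is ordered from source to sink and each node
    has hypothetical `fileName`, `line`, and `name` fields.
    """
    nodes: List[Dict[str, Any]] = raw.get("nodes") or []
    source = nodes[0] if nodes else {}
    sink = nodes[-1] if nodes else {}
    return {
        "vulnerabilityType": raw.get("queryName", ""),  # hypothetical field
        "severity": (raw.get("severity") or "medium").lower(),
        "sourceFile": source.get("fileName", ""),
        "sourceLine": source.get("line"),
        "sinkFile": sink.get("fileName", ""),
        "sinkLine": sink.get("line"),
        "flowPath": [str(node.get("name", "?")) for node in nodes],
    }


# Made-up payload used only to exercise the function
raw_result = {
    "queryName": "SQL_Injection",
    "severity": "HIGH",
    "nodes": [
        {"fileName": "app/routes/users.py", "line": 12, "name": "request.args"},
        {"fileName": "app/db/queries.py", "line": 40, "name": "cursor.execute"},
    ],
}
finding = normalize_finding(raw_result)
print(finding["sourceFile"], "->", finding["sinkFile"])
```

Keeping this mapping in one function means a change in the upstream payload only touches one place, while the enrichment code keeps working against a stable shape.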

Step 2: Implement the Post-Scan Intelligence Layer

The critical differentiation happens here. Build a processing layer that enriches raw findings with context:

from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional

class ExploitabilityRating(Enum):
    CONFIRMED = "confirmed"
    LIKELY = "likely"
    POSSIBLE = "possible"
    UNLIKELY = "unlikely"

@dataclass
class EnrichedFinding:
    original_finding: Dict
    exploitability: ExploitabilityRating
    data_flow_context: str
    remediation_snippet: str
    business_context: Optional[str]
    priority_score: float

class PostScanIntelligence:
    def __init__(self, sast_client: SASTResultsClient):
        self.client = sast_client
        self.context_cache = {}
    
    def analyze_data_flow(self, finding: Dict) -> str:
        """
        Analyze the complete data flow from source to sink.
        This is where the real intelligence happens.
        """
        source_file = finding.get('sourceFile')
        source_line = finding.get('sourceLine')
        sink_file = finding.get('sinkFile')
        sink_line = finding.get('sinkLine')
        
        # Retrieve code context around source and sink
        source_context = self._get_code_context(source_file, source_line, 5)
        sink_context = self._get_code_context(sink_file, sink_line, 5)
        
        # Build human-readable data flow description
        data_flow = f"""
        USER INPUT ENTERS AT:
        File: {source_file}:{source_line}
        {source_context}
        
        VULNERABILITY TRIGGERED AT:
        File: {sink_file}:{sink_line}
        {sink_context}
        
        FLOW PATH: {' -> '.join(finding.get('flowPath', []))}
        """
        return data_flow
    
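    def _get_code_context(self, file_path: Optional[str], line_num: Optional[int], 
                          context_lines: int) -> str:
        """Retrieve surrounding code for context."""
        # Findings may lack location data; return a placeholder instead of
        # raising TypeError on a missing path or line number
        if not file_path or line_num is None:
            return "[Code context unavailable: missing file or line]"
        try:
            with open(file_path, 'r', encoding='utf-8', errors='replace') as f:
                lines = f.readlines()
            
            # line_num is 1-based; clamp the window to the file bounds
            start = max(0, line_num - context_lines - 1)
            end = min(len(lines), line_num + context_lines)
            
            context = []
            for i, line in enumerate(lines[start:end], start=start+1):
                marker = ">>>" if i == line_num else "   "
                context.append(f"{marker} {i}: {line.rstrip()}")
            
            return '\n'.join(context)
        except OSError:
            # Covers missing files, directories, and permission errors
            return f"[Code context unavailable for {file_path}]"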
    
    def calculate_priority(self, finding: Dict) -> float:
        """
        Generate a normalized priority score (0-100) based on
        multiple factors beyond just severity.
        """
        base_score = {
            'critical': 90,
            'high': 70,
            'medium': 50,
            'low': 30,
            'info': 10
        }.get(finding.get('severity', 'medium').lower(), 50)
        
        modifiers = 0
        
        # Increase priority for publicly accessible endpoints
        if self._is_public_endpoint(finding):
            modifiers += 15
        
        # Increase for authentication/authorization related files
        if self._is_security_critical_path(finding):
            modifiers += 10
        
        # Decrease for test files
        if self._is_test_file(finding):
            modifiers -= 30
        
        # Factor in exploitability
        exploitability = self._assess_exploitability(finding)
        if exploitability == ExploitabilityRating.CONFIRMED:
            modifiers += 20
        elif exploitability == ExploitabilityRating.UNLIKELY:
            modifiers -= 20
        
        return min(100, max(0, base_score + modifiers))
    
    def _is_public_endpoint(self, finding: Dict) -> bool:
        """Check if the vulnerable code handles external requests."""
        public_patterns = ['controller', 'handler', 'endpoint', 
                          'api', 'route', 'view']
        file_path = finding.get('sourceFile', '').lower()
        return any(pattern in file_path for pattern in public_patterns)
    
    def _is_security_critical_path(self, finding: Dict) -> bool:
        """Identify security-sensitive code paths."""
        critical_patterns = ['auth', 'login', 'session', 'token', 
                            'password', 'credential', 'permission']
        file_path = finding.get('sourceFile', '').lower()
        return any(pattern in file_path for pattern in critical_patterns)
    
    def _is_test_file(self, finding: Dict) -> bool:
        """Detect test files that should be deprioritized."""
        test_patterns = ['test', 'spec', 'mock', '__tests__', 'fixtures']
        file_path = finding.get('sourceFile', '').lower()
        return any(pattern in file_path for pattern in test_patterns)
    
    def _assess_exploitability(self, finding: Dict) -> ExploitabilityRating:
        """Determine how easily the vulnerability can be exploited."""
        # Check for direct user input to dangerous function
        if finding.get('directUserInput') and finding.get('noSanitization'):
            return ExploitabilityRating.CONFIRMED
        
        # Check for validation present but bypassable
        if finding.get('partialValidation'):
            return ExploitabilityRating.LIKELY
        
        # Check for multiple prerequisites needed
        if finding.get('requiresAuthentication') and finding.get('requiresPrivilege'):
            return ExploitabilityRating.UNLIKELY
        
        return ExploitabilityRating.POSSIBLE
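
The path heuristics above use substring checks, so 'test' also matches a path containing "latest" and 'api' matches "capital". One possible refinement, sketched below, is to match whole path tokens instead. The helper names `path_tokens` and `matches_any` are hypothetical and not part of any SAST product.

```python
import re
from typing import Iterable, Set


def path_tokens(file_path: str) -> Set[str]:
    """Split a path into lowercase word tokens.

    Tokens are split on non-alphanumeric characters and on camelCase
    boundaries, and a naive singular form is added for plural tokens.
    """
    # Insert a space where a lowercase letter or digit meets a capital
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", file_path)
    tokens = {t for t in re.split(r"[^A-Za-z0-9]+", spaced.lower()) if t}
    # Naive singularization so "tests" and "controllers" still match
    tokens |= {t[:-1] for t in tokens if t.endswith("s") and len(t) > 3}
    return tokens


def matches_any(file_path: str, patterns: Iterable[str]) -> bool:
    """Return True if any pattern equals a whole token of the path."""
    tokens = path_tokens(file_path)
    return any(pattern in tokens for pattern in patterns)


TEST_PATTERNS = ("test", "spec", "mock", "fixture")

for path in ("src/utils/latest_release.py",
             "tests/test_login.py",
             "src/__tests__/cart.spec.ts"):
    print(path, matches_any(path, TEST_PATTERNS))
```

The trade-off is that token matching misses names where the keyword is fused without a separator or case change, so treat it as one heuristic among several and not as a definitive classifier.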

Step 3: Generate Contextual Remediation Guidance

Move beyond generic fix suggestions to code-aware remediation:

from typing import Dict


class RemediationGenerator:
    def __init__(self):
        self.remediation_templates = self._load_templates()
    
    def generate_fix(self, finding: Dict, code_context: str) -> str:
        """
        Generate a specific fix based on the vulnerability type
        and the actual code pattern detected.
        """
        vuln_type = finding.get('vulnerabilityType', '')
        language = finding.get('language', 'unknown')
        
        # _generate_xss_fix and _generate_path_traversal_fix follow the same
        # pattern as _generate_sqli_fix and are not shown here
        if vuln_type == 'SQL_Injection':
            return self._generate_sqli_fix(finding, language)
        elif vuln_type == 'XSS':
            return self._generate_xss_fix(finding, language)
        elif vuln_type == 'Path_Traversal':
            return self._generate_path_traversal_fix(finding, language)
        else:
            return self._get_generic_remediation(vuln_type)
    
    def _load_templates(self) -> Dict[str, str]:
        """Placeholder: load generic remediation text keyed by vulnerability type."""
        return {}
    
    def _get_generic_remediation(self, vuln_type: str) -> str:
        """Placeholder fallback used when no language-specific fix exists."""
        return self.remediation_templates.get(
            vuln_type, f"No specific remediation template for {vuln_type}.")
    
    def _generate_sqli_fix(self, finding: Dict, language: str) -> str:
        """Generate SQL injection specific remediation."""
        if language.lower() == 'python':
            return """
RECOMMENDED FIX (Python):

Replace dynamic query construction:

    # VULNERABLE - Do not use
    query = f"SELECT * FROM users WHERE id = {user_input}"
    cursor.execute(query)

    # SECURE - Use parameterized queries
    query = "SELECT * FROM users WHERE id = %s"
    cursor.execute(query, (user_input,))

If using an ORM like SQLAlchemy:

    # SECURE - ORM with bound parameters
    user = session.query(User).filter(User.id == user_input).first()

ADDITIONAL STEPS:
1. Implement input validation using allow-lists where possible
2. Apply principle of least privilege to database accounts
3. Consider using stored procedures for complex queries
"""
        elif language.lower() in ['java', 'kotlin']:
            return """
RECOMMENDED FIX (Java):

Replace string concatenation with PreparedStatement:

    // VULNERABLE - Do not use
    String query = "SELECT * FROM users WHERE id = " + userInput;
    Statement stmt = connection.createStatement();
    ResultSet rs = stmt.executeQuery(query);

    // SECURE - Use PreparedStatement
    String query = "SELECT * FROM users WHERE id = ?";
    PreparedStatement pstmt = connection.prepareStatement(query);
    pstmt.setString(1, userInput);
    ResultSet rs = pstmt.executeQuery();

"""
        return self._get_generic_remediation('SQL_Injection')
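
As the number of vulnerability types and languages grows, the if/elif chain gets harder to maintain. One alternative, sketched here as a design option and not as how any vendor implements it, is a registry keyed by (vulnerability type, language). The `register` decorator, the builder function names, and the fallback message are all hypothetical.

```python
from typing import Callable, Dict, Tuple

FixBuilder = Callable[[dict], str]

# Maps (vulnerability type, lowercase language) to a fix builder
_REGISTRY: Dict[Tuple[str, str], FixBuilder] = {}


def register(vuln_type: str, *languages: str):
    """Decorator that registers a builder for one type and several languages."""
    def decorator(fn: FixBuilder) -> FixBuilder:
        for language in languages:
            _REGISTRY[(vuln_type, language.lower())] = fn
        return fn
    return decorator


@register("SQL_Injection", "python")
def sqli_python(finding: dict) -> str:
    location = f"{finding.get('sinkFile')}:{finding.get('sinkLine')}"
    return f"Use a parameterized query at {location}."


@register("SQL_Injection", "java", "kotlin")
def sqli_jvm(finding: dict) -> str:
    location = f"{finding.get('sinkFile')}:{finding.get('sinkLine')}"
    return f"Use PreparedStatement with bound parameters at {location}."


def generate_fix(finding: dict) -> str:
    """Look up a builder; fall back to a generic message when none is registered."""
    vuln_type = finding.get("vulnerabilityType", "")
    language = (finding.get("language") or "unknown").lower()
    builder = _REGISTRY.get((vuln_type, language))
    if builder is None:
        return f"No language-specific guidance registered for {vuln_type} ({language})."
    return builder(finding)


print(generate_fix({
    "vulnerabilityType": "SQL_Injection",
    "language": "Kotlin",
    "sinkFile": "UserDao.kt",
    "sinkLine": 88,
}))
```

Adding support for a new type or language then means adding one decorated function, with no change to the dispatch logic.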

Step 4: Integrate Into Your CI/CD Pipeline

Create a GitHub Actions workflow that processes results after scans:

# .github/workflows/sast-post-processing.yml
name: SAST Post-Scan Intelligence

on:
  workflow_run:
    workflows: ["Security Scan"]
    types:
      - completed

jobs:
  process-results:
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      
      - name: Install dependencies
        run: |
          pip install requests pyyaml jira
      
      - name: Process SAST Results
        env:
          SAST_API_URL: ${{ secrets.SAST_API_URL }}
          SAST_API_KEY: ${{ secrets.SAST_API_KEY }}
        run: |
          # Note: workflow_run.id is the GitHub Actions run ID, not a SAST scan ID;
          # the script is expected to resolve it to the platform's scan ID.
          python scripts/post_scan_processor.py \
            --scan-id ${{ github.event.workflow_run.id }} \
            --priority-threshold 70 \
            --output results/enriched_findings.json
      
      - name: Create Issues for Critical Findings
        env:
          JIRA_URL: ${{ secrets.JIRA_URL }}
          JIRA_TOKEN: ${{ secrets.JIRA_TOKEN }}
        run: |
          python scripts/create_security_tickets.py \
            --input results/enriched_findings.json \
            --min-priority 80
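
The workflow calls scripts/post_scan_processor.py without showing it. Below is a minimal sketch of its command-line shell, assuming the three flags used in the workflow and a hypothetical JSON shape in which each finding carries a `priority_score` key. Fetching and enriching results for the scan is left out; the demo at the bottom runs on made-up findings and writes to a temporary directory.

```python
import argparse
import json
import tempfile
from pathlib import Path
from typing import Dict, List, Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the flags the workflow passes to the script."""
    parser = argparse.ArgumentParser(description="Filter enriched SAST findings")
    parser.add_argument("--scan-id", required=True)
    parser.add_argument("--priority-threshold", type=float, default=70.0)
    parser.add_argument("--output", type=Path, required=True)
    return parser.parse_args(argv)


def filter_by_priority(findings: List[Dict], threshold: float) -> List[Dict]:
    """Keep findings at or above the threshold, highest priority first."""
    kept = [f for f in findings if f.get("priority_score", 0) >= threshold]
    return sorted(kept, key=lambda f: f["priority_score"], reverse=True)


def write_findings(findings: List[Dict], output: Path) -> None:
    """Write findings as JSON, creating the output directory if needed."""
    output.parent.mkdir(parents=True, exist_ok=True)
    output.write_text(json.dumps(findings, indent=2), encoding="utf-8")


# Demo with made-up findings; a real script would fetch and enrich results
# for args.scan_id using the Step 1 and Step 2 classes instead.
sample = [
    {"id": "a", "priority_score": 95},
    {"id": "b", "priority_score": 40},
    {"id": "c", "priority_score": 70},
]
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "results" / "enriched_findings.json"
    args = parse_args(["--scan-id", "demo", "--priority-threshold", "70",
                       "--output", str(out_path)])
    write_findings(filter_by_priority(sample, args.priority_threshold), args.output)
    saved = json.loads(out_path.read_text(encoding="utf-8"))

print([f["id"] for f in saved])
```

Creating the output directory inside the script matters here because results/ does not exist in a fresh checkout on the runner.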

Step 5: Configure Automated Ticket Creation

Route enriched findings to your tracking system:

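from typing import List

from jira import JIRA

# EnrichedFinding is the dataclass defined in Step 2; import it from wherever
# that code lives in your project.

def create_security_tickets(enriched_findings: List["EnrichedFinding"], 
                           jira_client: JIRA,
                           project_key: str,
                           min_priority: float = 70):
    """Create detailed tickets for actionable findings."""
    
    for finding in enriched_findings:
        # Skip findings below the ticketing threshold
        if finding.priority_score < min_priority:
            continue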
        
        description = f"""
h2. Vulnerability Details
*Type:* {finding.original_finding.get('vulnerabilityType')}
*Severity:* {finding.original_finding.get('severity')}
*Priority Score:* {finding.priority_score}/100
*Exploitability:* {finding.exploitability.value}

h2. Data Flow Analysis
{finding.data_flow_context}

h2. Remediation Guidance
{finding.remediation_snippet}

h2. Business Context
{finding.business_context or 'N/A'}
        """
        
        issue_dict = {
            'project': project_key,
            'summary': f"[Security] {finding.original_finding.get('vulnerabilityType')} in {finding.original_finding.get('sourceFile')}",
            'description': description,
            'issuetype': {'name': 'Bug'},
            'priority': {'name': 'High' if finding.priority_score >= 80 else 'Medium'},
            'labels': ['security', 'sast', 'automated']
        }
        
        jira_client.create_issue(fields=issue_dict)
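
The ticket description uses Jira wiki markup (the h2. headings), so code inside the remediation text will render as ordinary prose unless it is wrapped in a code macro. The sketch below shows one way to do that with the `{code}` macro. The helper names `jira_code_block` and `jira_sections` are hypothetical, and which language names the macro accepts depends on your Jira version.

```python
from typing import Dict


def jira_code_block(code: str, language: str = "") -> str:
    """Wrap code in Jira wiki markup's {code} macro."""
    header = f"{{code:{language}}}" if language else "{code}"
    return f"{header}\n{code.strip()}\n{{code}}"


def jira_sections(sections: Dict[str, str]) -> str:
    """Join titled sections using h2. headings, as in the ticket template."""
    return "\n\n".join(f"h2. {title}\n{body}" for title, body in sections.items())


snippet = (
    'query = "SELECT * FROM users WHERE id = %s"\n'
    "cursor.execute(query, (user_input,))"
)
description = jira_sections({
    "Vulnerability Details": "*Type:* SQL_Injection",
    "Remediation Guidance": jira_code_block(snippet, "python"),
})
print(description)
```

Building the description from small pure functions also makes the ticket text easy to unit test without a live Jira connection.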

4. Common Pitfalls & How to Avoid Them

Pitfall 1: Processing All Findings Equally

Not every vulnerability deserves immediate attention. Implement tiered processing where critical findings get immediate
