NeuroAgent

Port Scanning Detection: Network Traffic Analysis

Enhanced port scanning detection algorithm in C++. Analysis of temporal patterns, TCP flags, and behavioral features for effective network traffic monitoring.

Question

How can I detect port scanning from network traffic logs?

I’m a student developing a C++ network sniffer for monitoring traffic and detecting suspicious activity. For port scanning analysis, I’ve implemented the following algorithm:

cpp
#pragma once
#include "IAnalyzer.hpp"
#include "PacketInfo.h"
#include <string>
#include <unordered_map>
#include <set>
#include <unordered_set>
#include <vector>

struct ScanResult {
    std::string srcIp;
    std::string dstIp;
    int uniqueDstPorts;
    bool suspicious;
};

class PortScanningAnalyzer : public IAnalyzer
{
public:
    nlohmann::json analyze(const std::vector<PacketInfo>& packets) override;

private:
    std::vector<ScanResult> detectPortScanning(const std::vector<PacketInfo>& packets, int threshold);
};

nlohmann::json PortScanningAnalyzer::analyze(const std::vector<PacketInfo>& packets) {
    nlohmann::json result;
    result["rule"] = "portScan";

    const int threshold = 100;

    std::vector<ScanResult> scanResults = detectPortScanning(packets, threshold);

    nlohmann::json counts = nlohmann::json::object();
    std::set<std::string> suspiciousIps;

    for (const auto& r : scanResults) {
        counts[r.srcIp] = counts.contains(r.srcIp) ? counts[r.srcIp].get<int>() + r.uniqueDstPorts : r.uniqueDstPorts;

        if (r.suspicious) {
            suspiciousIps.insert(r.srcIp);
        }
    }

    result["dst_ports_count"] = counts;
    result["suspicious_ips"] = suspiciousIps;
    result["threshold"] = threshold;

    return result;
}

std::vector<ScanResult> PortScanningAnalyzer::detectPortScanning(const std::vector<PacketInfo>& packets, int threshold) {
    std::unordered_map<std::string, std::unordered_map<std::string, std::set<int>>> portsMap;

    for (const auto& p : packets) {
        if (p.protocol == "TCP" || p.protocol == "UDP") {
            portsMap[p.srcIp][p.dstIp].insert(p.dstPort);
        }
    }

    std::vector<ScanResult> results;

    for (const auto& [srcIp, dstMap] : portsMap) {
        for (const auto& [dstIp, portSet] : dstMap) {
            ScanResult r;
            r.srcIp = srcIp;
            r.dstIp = dstIp;
            r.uniqueDstPorts = static_cast<int>(portSet.size());
            r.suspicious = r.uniqueDstPorts > threshold;
            results.push_back(r);
        }
    }

    return results;
}

The problem: when testing the algorithm, not all port scanning attempts are being detected. For example, when scanning from a mobile device, the unique ports didn’t exceed 4-10, and when scanning from another machine, the counter stopped at around 100 ports instead of the actual 1000 requests.

Questions:

  1. What are your comments on the current port scanning detection algorithm?
  2. What other indicators suggest port scanning besides the number of unique destination ports? (for example, broadcast packets)

P.S. I’m not a cybersecurity specialist, so please explain in an accessible way without excessive jargon.

NeuroAgent

The main problem with your algorithm is that it is too simple: it only counts unique destination ports, which is not enough to detect the different kinds of port scan. Let’s examine this in more detail.

Problems with the Current Algorithm

Your current algorithm has several significant drawbacks:

1. Ignoring different types of scanning

  • You process all TCP/UDP packets identically, although there are different types of scanning with varying degrees of detectability
  • For example, SYN scanning only sends SYN packets and doesn’t establish full connections, making it less noticeable
  • FIN-, XMAS-, and NULL-scanning use unusual flag combinations that can bypass basic detection systems

2. Lack of temporal analysis

  • Your algorithm doesn’t consider when requests occur, or how quickly they follow each other
  • Slow scans can be spread over weeks, as noted in the research on slow port scans (see Sources)
  • When testing from a mobile device, the number of unique ports was small (4-10); a scan spread over a long period produces exactly this kind of low count in any single capture

3. Insufficient consideration of traffic characteristics

  • Research shows that port scanners create many small data streams, while normal traffic consists of larger streams with more variable sizes
  • Your algorithm doesn’t analyze packet sizes or request sequences

4. Simple threshold approach

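  • Using a single fixed threshold (100 ports) is too crude
  • The comparison is strict (uniqueDstPorts > threshold), so a source that touches exactly 100 ports is not flagged
  • Ports are counted per source/destination pair, so a scan spread across many hosts stays below the threshold for every pair
  • Legitimate traffic can also touch many ports, so a fixed threshold produces false alarms
  • There’s no adaptation to normal network behavior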

Additional Port Scanning Indicators

In addition to the number of unique ports, the following indicators point to scanning:

1. Temporal Patterns

  • Request rhythm: Constant intervals between packets to different ports
  • High frequency: Many requests in a short period
  • Abnormal time windows: Activity during non-working hours or unusual temporal patterns
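
The "request rhythm" indicator can be estimated from packet timestamps alone. The sketch below is a minimal illustration, assuming timestamps are stored as seconds in a double and are already sorted; the function name interArrivalCv is hypothetical. It computes the coefficient of variation of the gaps between packets, where a value near 0 means almost constant intervals, which is typical of automated tools.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Coefficient of variation (stddev / mean) of the gaps between consecutive
// packets. Assumes `timestamps` holds seconds in ascending order.
// Returns -1.0 when there are fewer than three packets or the mean gap is zero.
double interArrivalCv(const std::vector<double>& timestamps) {
    if (timestamps.size() < 3) return -1.0;
    const std::size_t gaps = timestamps.size() - 1;

    double mean = 0.0;
    for (std::size_t i = 1; i < timestamps.size(); ++i)
        mean += timestamps[i] - timestamps[i - 1];
    mean /= static_cast<double>(gaps);
    if (mean <= 0.0) return -1.0;

    double variance = 0.0;
    for (std::size_t i = 1; i < timestamps.size(); ++i) {
        const double d = (timestamps[i] - timestamps[i - 1]) - mean;
        variance += d * d;
    }
    variance /= static_cast<double>(gaps);
    return std::sqrt(variance) / mean;
}
```

A low value on its own proves nothing (keep-alive traffic is also regular), so this is best used as one input among several rather than as a verdict.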

2. TCP Flag Combinations

  • FIN scanning: Packets with only the FIN flag set
  • XMAS scanning: Packets with FIN, PSH, and URG flags set
  • NULL scanning: Packets with no flags set
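
The flag combinations above translate directly into a small classifier. This is a sketch under assumptions: TcpFlags and classifyProbe are hypothetical names, and your sniffer would need to fill the six flag bits from the TCP header.

```cpp
#include <string>

// Hypothetical flag holder; mirrors the six classic TCP control bits.
struct TcpFlags {
    bool syn = false, ack = false, fin = false;
    bool rst = false, psh = false, urg = false;
};

// Labels a single TCP packet by the probe type its flags resemble.
std::string classifyProbe(const TcpFlags& f) {
    const bool any = f.syn || f.ack || f.fin || f.rst || f.psh || f.urg;
    if (!any) return "NULL";                                   // no flags at all
    if (f.fin && f.psh && f.urg && !f.syn && !f.ack && !f.rst)
        return "XMAS";                                         // FIN + PSH + URG
    if (f.fin && !f.syn && !f.ack && !f.rst && !f.psh && !f.urg)
        return "FIN";                                          // FIN only
    if (f.syn && !f.ack && !f.fin && !f.rst)
        return "SYN";                                          // bare SYN
    return "OTHER";
}
```

NULL, XMAS and FIN-only packets rarely appear in normal traffic, so even a handful is worth flagging. A bare SYN, by contrast, starts every normal connection, so the "SYN" label only becomes meaningful when counted across many ports.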

3. Connection Behavior

  • Incomplete connections: Many SYN packets without responses or without establishing full connections
  • Timeouts: Many requests ending in timeout
  • Retries: Many repeated requests to the same ports
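
One way to measure "incomplete connections" is to compare the SYNs a client sends with the SYN-ACK replies it receives. The following is a rough sketch, assuming the capture contains both directions of the traffic; TcpPacket and unansweredSynRatio are hypothetical names, not part of your existing PacketInfo.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal packet view for this sketch.
struct TcpPacket {
    std::string srcIp, dstIp;
    int srcPort = 0, dstPort = 0;
    bool syn = false, ack = false;
};

// Fraction of distinct (server IP, server port) targets that `client` sent a
// bare SYN to and that never answered with SYN-ACK. Returns 0.0 if the client
// sent no SYN at all.
double unansweredSynRatio(const std::vector<TcpPacket>& packets,
                          const std::string& client) {
    using Target = std::pair<std::string, int>;
    std::set<Target> attempted, answered;

    for (const auto& p : packets) {
        if (p.srcIp == client && p.syn && !p.ack)
            attempted.emplace(p.dstIp, p.dstPort);   // client -> server SYN
        else if (p.dstIp == client && p.syn && p.ack)
            answered.emplace(p.srcIp, p.srcPort);    // server -> client SYN-ACK
    }
    if (attempted.empty()) return 0.0;

    std::size_t unanswered = 0;
    for (const auto& t : attempted)
        if (answered.count(t) == 0) ++unanswered;
    return static_cast<double>(unanswered) / static_cast<double>(attempted.size());
}
```

A normal client mostly connects to ports that are open, so its ratio stays low; a scanner hits mostly closed or filtered ports, so its ratio approaches 1.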

4. Statistical Anomalies

  • Sequential vs random access: Simple scanners often walk ports in order (1,2,3…). Some tools, such as Nmap by default, randomize the order, so the absence of a sequence does not rule out a scan
  • Access patterns: Unusual distribution of requests across ports (e.g., only ports above 1024)
  • Deviation from norm: Behavior that differs from what is typical for this source

5. Protocol Characteristics

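  • ICMP scanning: Many ping requests (ICMP echo requests) to different addresses, used for host discovery
  • ARP scanning: In local networks, many ARP requests for different addresses. ARP requests are broadcast packets, so this is where broadcast traffic becomes a useful indicator
  • DNS requests: Unusual DNS requests for service discovery

These indicate host discovery rather than port scanning itself, but host discovery often precedes a port scan.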

Improved Detection Approach

Research shows that effective port scan detection should use multiple methods simultaneously:

Data Stream Analysis Methods

  • Stream size analysis: Port scanners create many small streams, while normal traffic has larger and more variable stream sizes [source]
  • Sequential hypothesis testing: A statistical test that updates its evidence after every connection attempt (did it succeed or fail?) and flags a source once enough failures have accumulated, which allows detection after only a few probes
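
Sequential hypothesis testing can be sketched as a running log-likelihood ratio per source: each failed connection attempt pushes the ratio towards "scanner", each successful one towards "benign", and a verdict is reached when either bound is crossed. The class below is a minimal illustration, not a tuned detector; the class name and all four probabilities are assumptions chosen for the example.

```cpp
#include <cmath>

enum class Verdict { Pending, Benign, Scanner };

// Sequential probability ratio test over connection outcomes for one source.
// All four constants below are illustrative assumptions, not tuned values.
class SequentialScanTest {
public:
    // Feed one connection attempt; `succeeded` is true if the target answered.
    // Once a verdict is reached it is returned unchanged for later calls.
    Verdict observe(bool succeeded) {
        if (verdict_ != Verdict::Pending) return verdict_;

        logRatio_ += succeeded
            ? std::log(kScanSuccess / kBenignSuccess)
            : std::log((1.0 - kScanSuccess) / (1.0 - kBenignSuccess));

        if (logRatio_ >= std::log(kDetect / kFalseAlarm))
            verdict_ = Verdict::Scanner;
        else if (logRatio_ <= std::log((1.0 - kDetect) / (1.0 - kFalseAlarm)))
            verdict_ = Verdict::Benign;
        return verdict_;
    }

private:
    static constexpr double kBenignSuccess = 0.8;   // P(attempt succeeds | benign)
    static constexpr double kScanSuccess   = 0.2;   // P(attempt succeeds | scanner)
    static constexpr double kDetect        = 0.99;  // desired detection probability
    static constexpr double kFalseAlarm    = 0.01;  // tolerated false-alarm probability

    double logRatio_ = 0.0;
    Verdict verdict_ = Verdict::Pending;
};
```

With these example values a source is flagged after four consecutive failed attempts, far fewer than a 100-port threshold requires, which is why this approach suits slow scans.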

Time Window Analysis

  • Sliding windows: Traffic analysis in windows of specific size (e.g., 60 seconds)
  • Adaptive thresholds: Thresholds that depend on normal network behavior
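
A sliding window can be kept per source with a queue of recent events plus a per-port counter, so that old packets drop out as new ones arrive. This is a sketch under assumptions: SlidingPortWindow is a hypothetical class, timestamps are seconds in a double, and packets are fed in timestamp order.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>
#include <utility>

// Tracks, per source IP, how many distinct destination ports were touched
// within the last `windowSeconds`. Packets must be added in timestamp order.
class SlidingPortWindow {
public:
    explicit SlidingPortWindow(double windowSeconds) : window_(windowSeconds) {}

    // Records one packet and returns the source's current distinct-port count.
    std::size_t add(const std::string& srcIp, int dstPort, double timestamp) {
        State& s = bySource_[srcIp];
        s.events.emplace_back(timestamp, dstPort);
        ++s.portCounts[dstPort];

        // Drop events that have fallen out of the window.
        while (!s.events.empty() &&
               s.events.front().first <= timestamp - window_) {
            auto it = s.portCounts.find(s.events.front().second);
            if (--(it->second) == 0) s.portCounts.erase(it);
            s.events.pop_front();
        }
        return s.portCounts.size();
    }

private:
    struct State {
        std::deque<std::pair<double, int>> events;  // (timestamp, port), oldest first
        std::unordered_map<int, int> portCounts;    // port -> packets still in window
    };

    double window_;
    std::unordered_map<std::string, State> bySource_;
};
```

The returned count can be compared against a threshold on every packet, so a fast scan is flagged while it is still running instead of after the whole capture has been processed.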

Combined Indicators

  • Weighting system: Different indicators give different scores (number of ports, speed, patterns)
  • State machine: Tracking connection states and behavior patterns

Practical Implementation Recommendations

1. Add TCP flag analysis

cpp
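struct PacketFlags {
    bool syn = false;
    bool ack = false;
    bool fin = false;
    bool rst = false;
    bool psh = false;
    bool urg = false;
};

// Assumption: timestamps are stored as seconds (with fractions) since the epoch
using timestamp_t = double;

// In your PacketInfo structure
struct PacketInfo {
    // ... existing fields
    PacketFlags flags;
    int packetSize = 0;        // bytes
    timestamp_t timestamp = 0; // capture time
};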

2. Implement temporal analysis

cpp
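// Assumes timestamp_t is a numeric type (e.g. seconds as double)
struct TimeWindow {
    std::vector<PacketInfo> packets;
    timestamp_t startTime;
    timestamp_t endTime;

    // Packets per time unit; returns 0 for an empty or zero-length window
    double getPacketRate() const {
        const double duration = static_cast<double>(endTime - startTime);
        if (packets.empty() || duration <= 0.0) return 0.0;
        return static_cast<double>(packets.size()) / duration;
    }

    // Mean packet size; returns 0 for an empty window
    double getAveragePacketSize() const {
        if (packets.empty()) return 0.0;
        double totalSize = 0.0;
        for (const auto& p : packets) {
            totalSize += p.packetSize;
        }
        return totalSize / static_cast<double>(packets.size());
    }
};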

3. Add port pattern detector

cpp
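class PortPatternDetector {
public:
    enum class PatternType {
        SEQUENTIAL,
        RANDOM,
        COMMON_PORTS,
        HIGH_RANGE
    };

    // Heuristic sketch. The 0.8 cut-offs and the "below 1024" rule for
    // COMMON_PORTS are illustrative assumptions, not tuned values.
    // A std::set loses arrival order, so SEQUENTIAL here means "the ports
    // form a mostly contiguous range", not "they arrived in order".
    PatternType detectPattern(const std::set<int>& ports) const {
        if (ports.empty()) return PatternType::RANDOM;
        int adjacent = 0, wellKnown = 0, high = 0, prev = -2;
        for (int port : ports) {          // std::set iterates in ascending order
            if (port == prev + 1) ++adjacent;
            if (port < 1024) ++wellKnown; else ++high;
            prev = port;
        }
        const double n = static_cast<double>(ports.size());
        if (ports.size() > 1 && adjacent / (n - 1) >= 0.8) return PatternType::SEQUENTIAL;
        if (high / n >= 0.8) return PatternType::HIGH_RANGE;
        if (wellKnown / n >= 0.8) return PatternType::COMMON_PORTS;
        return PatternType::RANDOM;
    }
};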

4. Implement scoring system

cpp
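class PortScanScorer {
public:
    // Weighted sum of indicators. The weights are illustrative. Each helper
    // should return a value scaled to [0, 1]; otherwise a raw port count in
    // the hundreds would swamp every other indicator.
    double calculateSuspicionScore(const std::vector<PacketInfo>& packets) const {
        double score = 0.0;

        // Basic indicators
        score += countUniquePorts(packets) * 0.3;
        score += getPacketRate(packets) * 0.2;
        score += getFlagAnomalies(packets) * 0.4;
        score += getTimePatternScore(packets) * 0.1;

        return score;
    }

private:
    // Helper methods for indicator calculation (declarations only; to be implemented)
    double countUniquePorts(const std::vector<PacketInfo>& packets) const;
    double getPacketRate(const std::vector<PacketInfo>& packets) const;
    double getFlagAnomalies(const std::vector<PacketInfo>& packets) const;
    double getTimePatternScore(const std::vector<PacketInfo>& packets) const;
};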
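
For a weighted score to be meaningful, every indicator has to be on the same scale before the weights are applied. The sketch below shows one way to do that, assuming the indicators have already been measured for one source; ScanIndicators, normalize, suspicionScore, the caps (100 ports, 50 packets per second) and the weights are all hypothetical values chosen for illustration.

```cpp
#include <algorithm>

// Hypothetical container for indicators already measured for one source.
struct ScanIndicators {
    int uniquePorts = 0;            // distinct destination ports in the window
    double packetsPerSecond = 0.0;  // request rate in the window
    double flagAnomalyRatio = 0.0;  // share of FIN/XMAS/NULL packets, 0..1
    double rhythmScore = 0.0;       // 0..1, higher = more regular timing
};

// Scales `value` to [0, 1], saturating at `cap`.
inline double normalize(double value, double cap) {
    return std::min(std::max(value / cap, 0.0), 1.0);
}

// Weighted sum in [0, 1]. Caps and weights are illustrative assumptions.
inline double suspicionScore(const ScanIndicators& in) {
    return 0.3 * normalize(in.uniquePorts, 100.0)
         + 0.2 * normalize(in.packetsPerSecond, 50.0)
         + 0.4 * normalize(in.flagAnomalyRatio, 1.0)
         + 0.1 * normalize(in.rhythmScore, 1.0);
}
```

Because the result always lies between 0 and 1, a single cut-off such as 0.5 can be chosen and adjusted after looking at scores from known-good traffic on your own network.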

5. Improve the main algorithm

cpp
// Outline only: groupIntoTimeWindows, getUniqueSources, filterBySource and the
// calculate*/detect*/create* helpers are placeholders you would implement.
std::vector<ScanResult> PortScanningAnalyzer::detectPortScanning(
    const std::vector<PacketInfo>& packets, int threshold) {

    // Grouping by time windows
    std::vector<TimeWindow> timeWindows = groupIntoTimeWindows(packets, 60); // 60 seconds

    std::vector<ScanResult> results;

    for (const auto& window : timeWindows) {
        // Group by source only, so a scan spread across many hosts is still seen
        for (const auto& srcIp : getUniqueSources(window)) {
            auto srcPackets = filterBySource(window, srcIp);

            // Calculate various indicators
            int uniquePorts = countUniquePorts(srcPackets);
            double packetRate = calculatePacketRate(srcPackets);
            double avgPacketSize = calculateAveragePacketSize(srcPackets);
            PortPatternDetector::PatternType pattern = detectPortPattern(srcPackets);

            // Combined evaluation; `threshold` can serve to scale uniquePorts
            double suspicionScore = calculateSuspicionScore({
                uniquePorts, packetRate, avgPacketSize, pattern
            });

            // SUSPICION_THRESHOLD is a tunable constant you define
            if (suspicionScore > SUSPICION_THRESHOLD) {
                results.push_back(createScanResult(srcIp, suspicionScore));
            }
        }
    }

    return results;
}

Conclusion

Your current algorithm is a good starting point, but for effective port scan detection, you need to consider several factors:

  1. Multiple indicators: Use a combination of port count, temporal patterns, packet characteristics, and behavioral patterns
  2. Adaptive thresholds: Instead of a fixed threshold, use dynamic thresholds based on normal network behavior
  3. Temporal analysis: Add traffic analysis in time windows to detect slow scans
  4. Variety of scan types: Consider various scanning methods (SYN, FIN, XMAS, etc.)

Start by adding TCP flag analysis and temporal patterns, then gradually extend the algorithm with further indicators. This approach allows detection of both fast and slow scans, as well as their various types.

Sources

  1. Detection of slow port scans in flow-based network traffic - PMC
  2. PD-CPS: A practical scheme for detecting covert port scans in high-speed networks - ScienceDirect
  3. What is a Port Scan + How to Detect It - Vectra
  4. Detecting and Understanding Online Port Scans - ExtraHop
  5. The Ultimate Port Scanning Guide - SecOps Solution