How can I detect port scanning from network traffic logs?
I’m a student developing a C++ network sniffer for monitoring traffic and detecting suspicious activity. For port scanning analysis, I’ve implemented the following algorithm:
#pragma once
#include "IAnalyzer.hpp"
#include "PacketInfo.h"
#include <nlohmann/json.hpp> // needed for nlohmann::json unless IAnalyzer.hpp already includes it
#include <string>
#include <unordered_map>
#include <set>
#include <unordered_set>
#include <vector>

struct ScanResult {
    std::string srcIp;
    std::string dstIp;
    int uniqueDstPorts;
    bool suspicious;
};

class PortScanningAnalyzer : public IAnalyzer
{
public:
    nlohmann::json analyze(const std::vector<PacketInfo>& packets) override;
private:
    std::vector<ScanResult> detectPortScanning(const std::vector<PacketInfo>& packets, int threshold);
};
nlohmann::json PortScanningAnalyzer::analyze(const std::vector<PacketInfo>& packets) {
    nlohmann::json result;
    result["rule"] = "portScan";
    const int threshold = 100;
    std::vector<ScanResult> scanResults = detectPortScanning(packets, threshold);
    nlohmann::json counts = nlohmann::json::object();
    std::set<std::string> suspiciousIps;
    for (const auto& r : scanResults) {
        counts[r.srcIp] = counts.contains(r.srcIp) ? counts[r.srcIp].get<int>() + r.uniqueDstPorts : r.uniqueDstPorts;
        if (r.suspicious) {
            suspiciousIps.insert(r.srcIp);
        }
    }
    result["dst_ports_count"] = counts;
    result["suspicious_ips"] = suspiciousIps;
    result["threshold"] = threshold;
    return result;
}

std::vector<ScanResult> PortScanningAnalyzer::detectPortScanning(const std::vector<PacketInfo>& packets, int threshold) {
    std::unordered_map<std::string, std::unordered_map<std::string, std::set<int>>> portsMap;
    for (const auto& p : packets) {
        if (p.protocol == "TCP" || p.protocol == "UDP") {
            portsMap[p.srcIp][p.dstIp].insert(p.dstPort);
        }
    }
    std::vector<ScanResult> results;
    for (const auto& [srcIp, dstMap] : portsMap) {
        for (const auto& [dstIp, portSet] : dstMap) {
            ScanResult r;
            r.srcIp = srcIp;
            r.dstIp = dstIp;
            r.uniqueDstPorts = static_cast<int>(portSet.size());
            r.suspicious = r.uniqueDstPorts > threshold;
            results.push_back(r);
        }
    }
    return results;
}
The problem: when testing the algorithm, not all port scanning attempts are being detected. For example, when scanning from a mobile device, the unique ports didn’t exceed 4-10, and when scanning from another machine, the counter stopped at around 100 ports instead of the actual 1000 requests.
Questions:
- What are your comments on the current port scanning detection algorithm?
- What other indicators suggest port scanning besides the number of unique destination ports? (for example, broadcast packets)
P.S. I’m not a cybersecurity specialist, so please explain in an accessible way without excessive jargon.
The main problem with your algorithm is that it is too simple: it only counts unique destination ports, which is not enough to detect the different types of port scans. Let’s examine this in more detail.
Table of Contents
- Problems with the Current Algorithm
- Additional Port Scanning Indicators
- Improved Detection Approach
- Practical Implementation Recommendations
- Conclusion
Problems with the Current Algorithm
Your current algorithm has several significant drawbacks:
1. Ignoring different types of scanning
- You process all TCP/UDP packets identically, although there are different types of scanning with varying degrees of detectability
- For example, SYN scanning sends only SYN packets and never completes the connection, so it often does not appear in logs that record only established connections
- FIN-, XMAS-, and NULL-scanning use unusual flag combinations that can bypass basic detection systems
2. Lack of temporal analysis
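- Your algorithm doesn’t consider when requests occur, so a burst of probes and a slow trickle look the same
- Slow scans spread their probes over long periods (weeks, according to the research listed in Sources), so only a few ports are touched in any short interval
- When testing from a mobile device, the number of ports was small (4-10), but the probes could have been spread over an extended period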
3. Insufficient consideration of traffic characteristics
- Research shows that port scanners create many small flows (streams), while normal traffic consists of larger flows whose sizes vary more
- Your algorithm doesn’t analyze packet sizes or request sequences
4. Simple threshold approach
- Using a single threshold (100 ports) is too crude
- Legitimate hosts can also contact many different ports, so a fixed count produces false positives
- There’s no adaptation to normal network behavior
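One way to address the last point is to derive the threshold from what your network normally looks like. The sketch below assumes you have already collected the number of unique destination ports per source during periods of normal traffic; `adaptiveThreshold`, `exceedsBaseline`, and the default `k = 3.0` are hypothetical names and values chosen for illustration, and "mean plus k standard deviations" is only one common heuristic.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Threshold = mean + k * standard deviation of the unique-port counts
// observed during normal operation. The name and the default k are
// illustrative assumptions, not part of the original analyzer.
double adaptiveThreshold(const std::vector<int>& baselineCounts, double k = 3.0) {
    assert(!baselineCounts.empty());

    double mean = 0.0;
    for (int count : baselineCounts) mean += count;
    mean /= static_cast<double>(baselineCounts.size());

    double variance = 0.0;
    for (int count : baselineCounts) variance += (count - mean) * (count - mean);
    variance /= static_cast<double>(baselineCounts.size());

    return mean + k * std::sqrt(variance);
}

// True when a source touched more unique ports than the baseline explains.
bool exceedsBaseline(int uniquePorts, const std::vector<int>& baselineCounts, double k = 3.0) {
    return uniquePorts > adaptiveThreshold(baselineCounts, k);
}
```

In `analyze`, the constant `threshold = 100` could then be replaced by a value computed from earlier captures. A larger `k` gives fewer false alarms but misses smaller scans.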
Additional Port Scanning Indicators
In addition to the number of unique ports, the following indicators point to scanning:
1. Temporal Patterns
- Request rhythm: Constant intervals between packets to different ports
- High frequency: Many requests in a short period
- Abnormal time windows: Activity during non-working hours or unusual temporal patterns
2. TCP Flag Combinations
- FIN scanning: Packets with only the FIN flag set
- XMAS scanning: Packets with FIN, PSH, and URG flags set
- NULL scanning: Packets with no flags set
3. Connection Behavior
- Incomplete connections: Many SYN packets without responses or without establishing full connections
- Timeouts: Many requests ending in timeout
- Retries: Many repeated requests to the same ports
4. Statistical Anomalies
- Sequential vs random access: Scanning often occurs sequentially (1,2,3…) rather than randomly
- Access patterns: Unusual distribution of requests across ports (e.g., only ports above 1024)
- Deviation from norm: Behavior that differs from what is typical for this source
5. Protocol Characteristics
- ICMP scanning: Many ping requests (ICMP echo requests) for host discovery
- ARP scanning: In local networks - many ARP requests
- DNS requests: Unusual DNS requests for service discovery
Improved Detection Approach
Research shows that effective port scan detection should use multiple methods simultaneously:
Data Stream Analysis Methods
- Stream size analysis: Port scanners create many small streams, while normal traffic has larger and more variable stream sizes [source]
- Sequential hypothesis testing: Updates a likelihood ratio after every connection attempt (success or failure), which allows a scanner to be detected after only a few failed attempts
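Sequential hypothesis testing can be sketched in a few lines. Each connection attempt from a source either succeeds or fails; a scanner fails far more often than a normal client, so every failure pushes a running log-likelihood ratio toward "scanner" and every success pushes it toward "benign". The class name, the success probabilities (0.8 and 0.2), and the error rates (0.01) below are illustrative assumptions that you would need to tune for your network.

```cpp
#include <cassert>
#include <cmath>

// Sequential probability ratio test for one source address.
// All default parameters are assumptions chosen for illustration.
class SequentialScanTest {
public:
    enum class Verdict { Pending, Benign, Scanner };

    SequentialScanTest(double pSuccessBenign = 0.8, double pSuccessScanner = 0.2,
                       double falsePositiveRate = 0.01, double missRate = 0.01)
        : logFail_(std::log((1.0 - pSuccessScanner) / (1.0 - pSuccessBenign))),
          logSuccess_(std::log(pSuccessScanner / pSuccessBenign)),
          upper_(std::log((1.0 - missRate) / falsePositiveRate)),
          lower_(std::log(missRate / (1.0 - falsePositiveRate))) {
        assert(pSuccessBenign > pSuccessScanner);
    }

    // Feed the outcome of one connection attempt; returns the current verdict.
    Verdict observe(bool connectionSucceeded) {
        if (verdict_ != Verdict::Pending) return verdict_;  // decision already made
        logRatio_ += connectionSucceeded ? logSuccess_ : logFail_;
        if (logRatio_ >= upper_) verdict_ = Verdict::Scanner;
        else if (logRatio_ <= lower_) verdict_ = Verdict::Benign;
        return verdict_;
    }

private:
    double logFail_;     // added on a failed attempt (positive)
    double logSuccess_;  // added on a successful attempt (negative)
    double upper_;       // crossing this means "scanner"
    double lower_;       // crossing this means "benign"
    double logRatio_ = 0.0;
    Verdict verdict_ = Verdict::Pending;
};
```

With these defaults, four failed attempts in a row are enough to reach the `Scanner` verdict, which is why this method can flag a scan long before a threshold of 100 unique ports is reached. You would keep one `SequentialScanTest` per source IP.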
Time Window Analysis
- Sliding windows: Traffic analysis in windows of specific size (e.g., 60 seconds)
- Adaptive thresholds: Thresholds that depend on normal network behavior
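A sliding window can be maintained incrementally instead of regrouping all packets each time. The sketch below keeps the packets of the last `windowSeconds` seconds for one source and reports how many distinct destination ports they cover. `SlidingPortWindow` is a hypothetical helper, and it assumes timestamps are given in seconds and arrive in non-decreasing order.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <unordered_map>
#include <utility>

// Counts distinct destination ports seen within the last windowSeconds.
// Intended to be instantiated once per source IP (an assumption of this sketch).
class SlidingPortWindow {
public:
    explicit SlidingPortWindow(double windowSeconds) : window_(windowSeconds) {
        assert(windowSeconds > 0.0);
    }

    // Records one packet and returns the number of distinct ports in the window.
    std::size_t add(double timestamp, int dstPort) {
        events_.emplace_back(timestamp, dstPort);
        ++portCounts_[dstPort];

        // Drop packets that are older than the window.
        while (!events_.empty() && events_.front().first <= timestamp - window_) {
            auto it = portCounts_.find(events_.front().second);
            if (--(it->second) == 0) portCounts_.erase(it);
            events_.pop_front();
        }
        return portCounts_.size();
    }

private:
    double window_;
    std::deque<std::pair<double, int>> events_;        // (timestamp, port), oldest first
    std::unordered_map<int, std::size_t> portCounts_;  // packets per port inside the window
};
```

A 60-second window like this catches fast scans. To catch slow ones you would run a second instance with a much longer window (hours or days) and a correspondingly different threshold.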
Combined Indicators
- Weighting system: Different indicators give different scores (number of ports, speed, patterns)
- State machine: Tracking connection states and behavior patterns
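As a minimal sketch of the state-machine idea, the class below tracks whether each connection opened by a source ever completes the TCP handshake. It is deliberately simplified: it looks only at packets sent by the suspected source, identifies a connection by destination IP and port (ignoring the source port), and knows just two states. `HandshakeTracker` and `ConnState` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

enum class ConnState { SynSent, Established };

// Tracks handshake completion for the connections opened by one source.
class HandshakeTracker {
public:
    // Call for every TCP packet sent by the tracked source.
    void onClientPacket(const std::string& dstIp, int dstPort, bool syn, bool ack) {
        assert(dstPort >= 0 && dstPort <= 65535);
        const std::string key = dstIp + ":" + std::to_string(dstPort);
        if (syn && !ack) {
            states_.emplace(key, ConnState::SynSent);  // keeps an existing entry unchanged
        } else if (ack && !syn) {
            auto it = states_.find(key);
            if (it != states_.end()) it->second = ConnState::Established;
        }
    }

    // Connections that were opened with a SYN but never completed.
    std::size_t halfOpenCount() const {
        std::size_t count = 0;
        for (const auto& entry : states_) {
            if (entry.second == ConnState::SynSent) ++count;
        }
        return count;
    }

    // Share of half-open connections among all tracked connections (0.0 if none).
    double halfOpenRatio() const {
        if (states_.empty()) return 0.0;
        return static_cast<double>(halfOpenCount()) / static_cast<double>(states_.size());
    }

private:
    std::unordered_map<std::string, ConnState> states_;
};
```

A normal client completes most of its handshakes, so its ratio stays low. A source whose ratio is close to 1.0 across many ports is behaving like a scanner, and this ratio can feed into the weighting system as one of the indicators.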
Practical Implementation Recommendations
1. Add TCP flag analysis
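// TCP control flags read from the TCP header of each packet
struct PacketFlags {
    bool syn = false;
    bool ack = false;
    bool fin = false;
    bool rst = false;
    bool psh = false;
    bool urg = false;
};

// Assumed timestamp type: seconds since the epoch; use whatever your capture code provides
using timestamp_t = double;

// In your PacketInfo structure
struct PacketInfo {
    // ... existing fields
    PacketFlags flags;
    int packetSize;
    timestamp_t timestamp;
};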
2. Implement temporal analysis
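struct TimeWindow {
    std::vector<PacketInfo> packets;
    timestamp_t startTime;
    timestamp_t endTime;

    // Packets per second; assumes timestamp_t is a numeric type holding seconds
    double getPacketRate() const {
        double duration = endTime - startTime;
        if (duration <= 0.0) return 0.0; // avoid division by zero for a zero-length window
        return packets.size() / duration;
    }

    double getAveragePacketSize() const {
        if (packets.empty()) return 0.0; // avoid division by zero for an empty window
        double totalSize = 0;
        for (const auto& p : packets) {
            totalSize += p.packetSize;
        }
        return totalSize / packets.size();
    }
};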
3. Add port pattern detector
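#include <cstddef>
#include <set>

class PortPatternDetector {
public:
    enum class PatternType {
        SEQUENTIAL,
        RANDOM,
        COMMON_PORTS,
        HIGH_RANGE
    };
    // Heuristic sketch: the cutoffs (half of the ports consecutive, port 1023) are assumptions to tune
    PatternType detectPattern(const std::set<int>& ports) const {
        if (ports.size() < 2) return PatternType::RANDOM;
        // Port sequence analysis: std::set iterates in ascending order
        std::size_t consecutive = 0;
        int previous = -2;
        for (int port : ports) {
            if (port == previous + 1) ++consecutive;
            previous = port;
        }
        if (2 * consecutive >= ports.size() - 1) return PatternType::SEQUENTIAL;
        // Pattern search in ranges: every port lies above the well-known range
        if (*ports.begin() > 1023) return PatternType::HIGH_RANGE;
        // Detection of popular ports: every port lies in the well-known range (0-1023)
        if (*ports.rbegin() <= 1023) return PatternType::COMMON_PORTS;
        return PatternType::RANDOM;
    }
};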
4. Implement scoring system
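class PortScanScorer {
public:
    // Weighted sum of indicators. Each helper is expected to return a value
    // normalized to [0, 1]; otherwise a raw port count would dominate the score.
    double calculateSuspicionScore(const std::vector<PacketInfo>& packets) const {
        double score = 0.0;
        // Basic indicators (the weights are illustrative and sum to 1.0)
        score += countUniquePorts(packets) * 0.3;
        score += getPacketRate(packets) * 0.2;
        score += getFlagAnomalies(packets) * 0.4;
        score += getTimePatternScore(packets) * 0.1;
        return score;
    }
private:
    // Helper methods for indicator calculation (declarations only, still to be implemented)
    double countUniquePorts(const std::vector<PacketInfo>& packets) const;
    double getPacketRate(const std::vector<PacketInfo>& packets) const;
    double getFlagAnomalies(const std::vector<PacketInfo>& packets) const;
    double getTimePatternScore(const std::vector<PacketInfo>& packets) const;
};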
5. Improve the main algorithm
// Outline of the combined approach. groupIntoTimeWindows, getUniqueSources, filterBySource,
// the count/calculate/detect helpers, createScanResult and SUSPICION_THRESHOLD are
// placeholders that you still have to implement.
std::vector<ScanResult> PortScanningAnalyzer::detectPortScanning(
    const std::vector<PacketInfo>& packets, int /* threshold: superseded by the score */) {
    // Grouping by time windows
    std::vector<TimeWindow> timeWindows = groupIntoTimeWindows(packets, 60); // 60 seconds
    std::vector<ScanResult> results;
    for (const auto& window : timeWindows) {
        for (const auto& srcIp : getUniqueSources(window)) {
            auto srcPackets = filterBySource(window, srcIp);
            // Calculate various indicators
            int uniquePorts = countUniquePorts(srcPackets);
            double packetRate = calculatePacketRate(srcPackets);
            double avgPacketSize = calculateAveragePacketSize(srcPackets);
            PortPatternDetector::PatternType pattern = detectPortPattern(srcPackets);
            // Combined evaluation
            double suspicionScore = calculateSuspicionScore(
                uniquePorts, packetRate, avgPacketSize, pattern);
            if (suspicionScore > SUSPICION_THRESHOLD) {
                results.push_back(createScanResult(srcIp, suspicionScore));
            }
        }
    }
    return results;
}
Conclusion
Your current algorithm is a good starting point, but for effective port scan detection, you need to consider several factors:
- Multiple indicators: Use a combination of port count, temporal patterns, packet characteristics, and behavioral patterns
- Adaptive thresholds: Instead of a fixed threshold, use dynamic thresholds based on normal network behavior
- Temporal analysis: Analyze traffic in time windows; short windows catch fast scans, while slow scans require much longer windows
- Variety of scan types: Consider various scanning methods (SYN, FIN, XMAS, etc.)
Start by adding TCP flag analysis and temporal patterns, then gradually extend the algorithm with additional indicators. This approach will allow detection of both fast and slow scans, as well as their various types.
Sources
- Detection of slow port scans in flow-based network traffic - PMC
- PD-CPS: A practical scheme for detecting covert port scans in high-speed networks - ScienceDirect
- What is a Port Scan + How to Detect It - Vectra
- Detecting and Understanding Online Port Scans - ExtraHop
- The Ultimate Port Scanning Guide - SecOps Solution