From Narrative to Knowledge Graph | LLM-Driven Information Extraction in Cyber Threat Intelligence

This post evaluates using large language models to extract and contextualize information from CTI reports, turning narrative into structured JSON entities and a unified knowledge graph for downstream defense workflows. It describes a three‑phase workflow (sanitization, LLM extraction, knowledge‑graph assembly), experimental results across multiple LLMs, and operational trade‑offs in accuracy, abstention, ensembling, and data‑model design. #GPT4_1 #ClaudeSonnet4_5

Keypoints

Proposes a three‑phase CTI extraction pipeline: report ingestion and sanitization, LLM‑based extractors (Infrastructure, Executables, Playbook), and knowledge‑graph assembly linking IOCs, steps, playbooks, and threat actors.
Uses custom extractor data models (including 12 IOC contextual attributes) and structured prompts with an evidence‑grading policy (High/Medium/Low) and an explicit abstention class (None) to constrain inference.
Evaluates off‑the‑shelf LLMs (GPT‑4.1, GPT‑5, GPT‑5.2, Claude Sonnet 4.5, Claude Opus 4.5) on a ground truth of 343 atomic IOCs and 1,859 labeled IOC attribute instances, emphasizing feasibility rather than definitive model ranking.
LLM extractors delivered large speedups versus human analysts (human baseline ~41 minutes per report; LLMs ~3.3 minutes on average, >18× speed‑up) while trading off coverage and correctness depending on settings.
Extraction quality depends strongly on report formatting, label/context cues, OCR coverage for embedded images, prompt‑model fit, and deliberate data‑model wording that guides LLM attention.
Discusses ensemble strategies, abstention error metrics (FDR, FNR), and the need for representative, flexible ground truth that can capture genuine ambiguity via multiple valid labels.
Recommends deliberate operational planning, continuous evaluation, and tailored data‑model and prompt design to balance accuracy, coverage, latency, and downstream reliability.

MITRE Techniques

Indicators of Compromise

[Domain ] selective extraction of attacker‑owned versus benign domains – malicious-example[.]com, attacker-domain[.]net
[IP address ] infrastructure indicators described in reports and IOC tables – 192.0.2.1, 198.51.100.23
[File hash ] executable identifiers (MD5, SHA‑1, SHA‑256) for malicious or attacker‑used binaries – d41d8cd98f00b204e9800998ecf8427e (MD5), e3b0c44298fc1c149afbf4c8996fb92427ae41e4… (SHA‑256), and 2 more hashes
[File path ] extracted open‑text attributes for reported artifacts on host systems – C:WindowsTempmal.exe, /usr/bin/evil
[Command line ] command‑line arguments associated with executables – mal.exe -s -c, python exploit.py –target 10.0.0.1
[Injected process ] names of processes reported as injection targets – explorer.exe, svchost.exe
[Certificate fingerprint ] pivotable artifacts used for correlation and blocking decisions – SHA1:AB:CD:EF:12:34:56:78:90:AB:CD:EF:12:34:56:78:90, cert-fingerprint:12:34:56:78:90:AB

SHARE THIS STORY

WhatsApp X (Twitter)Telegram Bluesky Facebook LinkedIn Threads Email Print