Command injection is a direct path to remote code execution. An attacker who can inject OS commands into your application can read secrets, exfiltrate data, establish persistence, or pivot to internal systems. The vulnerability is not subtle: it appears when application code constructs a shell command by concatenating user input. This article covers the mechanics, the common patterns developers reach for that introduce the bug, and the safe alternatives.
How Command Injection Works
Applications sometimes need to run system commands: convert an image, call a CLI tool, check a network address, run a report. The most direct way to do this is to construct a shell string and execute it.
# Vulnerable Python
import os
def ping_host(hostname):
result = os.system(f"ping -c 1 {hostname}")
return result
If hostname comes from user input and an attacker supplies google.com; cat /etc/passwd, the shell interprets this as two commands: ping -c 1 google.com followed by cat /etc/passwd. The semicolon is a shell command separator.
Other shell metacharacters that enable injection:
| Character | Effect |
|---|---|
; | Command separator |
&& | Run second command if first succeeds |
|| | Run second command if first fails |
| | Pipe output to another command |
` | Command substitution |
$(...) | Command substitution |
\n | Newline as separator in some shells |
>, >> | Output redirection |
Any of these in user-controlled data breaks out of the intended command.
Common Vulnerable Patterns
Python: os.system and subprocess with shell=True
import subprocess, os
# Vulnerable patterns
os.system(f"convert {user_filename} output.jpg")
subprocess.run(f"ffmpeg -i {input_file} {output_file}", shell=True)
subprocess.getoutput(f"whois {domain}")
shell=True passes the string to /bin/sh -c, which interprets all shell metacharacters. When user input reaches any of these calls, injection is possible.
Node.js: exec and template literals
const { exec } = require('child_process');
// Vulnerable
function generateThumbnail(filename) {
exec(`convert ${filename} -resize 200x200 thumb.jpg`, (err, stdout) => {
// ...
});
}
// Also vulnerable
const util = require('util');
const execAsync = util.promisify(exec);
await execAsync(`git log --oneline ${userBranch}`);
Go: passing user input to exec.Command with sh -c
// Vulnerable
func runDig(domain string) (string, error) {
out, err := exec.Command("sh", "-c", "dig +short " + domain).Output()
return string(out), err
}
The pattern is always the same: user-controlled data is concatenated into a string that gets passed to a shell interpreter.
The Fix: Avoid the Shell
The fundamental fix is to pass arguments as separate values rather than as a single shell string. This bypasses the shell entirely. The program is launched directly; no metacharacter interpretation occurs.
Python: subprocess with argument list
import subprocess
def ping_host(hostname):
# Validate input first
import re
if not re.match(r'^[a-zA-Z0-9.\-]+$', hostname):
raise ValueError("Invalid hostname")
result = subprocess.run(
['ping', '-c', '1', hostname], # List form -- no shell
capture_output=True,
text=True,
timeout=5
)
return result.stdout
When you pass a list to subprocess.run, Python invokes the binary directly using execvp. The OS never sees a shell. hostname is passed as a literal argument to ping, regardless of what characters it contains.
Node.js: execFile or spawn with argument arrays
const { execFile, spawn } = require('child_process');
const { promisify } = require('util');
const execFileAsync = promisify(execFile);
// Safe: execFile does not invoke a shell
async function generateThumbnail(filename) {
// Validate filename doesn't contain path traversal
const path = require('path');
const safe = path.basename(filename);
if (safe !== filename) throw new Error('Invalid filename');
const { stdout } = await execFileAsync('convert', [
safe, '-resize', '200x200', 'thumb.jpg'
]);
return stdout;
}
// Safe: spawn with argument array
function streamConvert(inputFile, outputFile) {
const proc = spawn('ffmpeg', ['-i', inputFile, outputFile]);
// handle proc.stdout / proc.stderr
return proc;
}
execFile and spawn without {shell: true} do not invoke a shell. User data is passed verbatim as process arguments.
Go: exec.Command with separate arguments
import (
"os/exec"
"regexp"
)
func runDig(domain string) (string, error) {
// Validate input
valid := regexp.MustCompile(`^[a-zA-Z0-9.\-]+$`)
if !valid.MatchString(domain) {
return "", fmt.Errorf("invalid domain")
}
// Separate arguments -- no shell
out, err := exec.Command("dig", "+short", domain).Output()
return string(out), err
}
exec.Command does not invoke a shell when the program path and arguments are supplied separately. The shell is only invoked when you explicitly use "sh", "-c", someString.
Input Validation as Defence in Depth
Bypassing the shell removes the structural vulnerability. Validating input adds a second layer that limits damage even if an error introduces a shell path later.
For common argument types:
import re
from pathlib import Path
# Hostname: alphanumeric, dots, hyphens only
def validate_hostname(s: str) -> str:
if not re.match(r'^[a-zA-Z0-9.\-]{1,253}$', s):
raise ValueError(f"Invalid hostname: {s!r}")
return s
# File paths: reject traversal, enforce extension, use basename
def validate_image_path(s: str) -> Path:
p = Path(s)
if p.name != s or '..' in s:
raise ValueError("Invalid path")
if p.suffix.lower() not in {'.jpg', '.jpeg', '.png', '.gif', '.webp'}:
raise ValueError("Unsupported extension")
return p
# Integer arguments: parse as int, never pass raw string
def validate_port(s: str) -> int:
port = int(s) # ValueError if not numeric
if not 1 <= port <= 65535:
raise ValueError("Invalid port")
return port
Never build a blocklist of “dangerous” characters. Blocklists get bypassed. Use an allowlist of what is valid for each argument type.
When You Cannot Avoid a Shell
Some rare cases need shell features: pipes, redirects, shell builtins. If you must use a shell, escape arguments explicitly rather than embedding them raw.
In Python, shlex.quote wraps a single argument in single quotes and escapes any embedded single quotes:
import shlex, subprocess
def run_with_pipe(input_file, output_file):
# Validate filenames are plain names (no path separators)
for f in (input_file, output_file):
if '/' in f or '..' in f:
raise ValueError(f"Invalid filename: {f}")
# shlex.quote makes arguments shell-safe
cmd = f"cat {shlex.quote(input_file)} | wc -l > {shlex.quote(output_file)}"
subprocess.run(cmd, shell=True, check=True)
shlex.quote('foo; rm -rf /') produces 'foo; rm -rf /' — wrapped in single quotes, so the shell treats the entire string as a literal argument rather than executing the injected commands.
This is a last resort. Prefer the argument-list approach whenever possible.
Detecting Command Injection in Your Codebase
Grep for the high-risk patterns:
# Python: shell=True usage
grep -rn "shell=True" src/
# Python: os.system, os.popen, getoutput
grep -rn "os\.system\|os\.popen\|getoutput\|getstatusoutput" src/
# Node.js: exec (uses shell by default)
grep -rn "\.exec(" src/
# Go: sh -c pattern
grep -rn '"sh".*"-c"' .
For each hit, audit whether any part of the command string is derived from user input, environment variables, or database values. Even data you populated yourself can become injection-capable if it was stored unsanitised earlier.
Summary
Command injection is prevented by keeping user data out of shell strings entirely. Pass arguments as arrays or separate parameters so the shell is never involved. When validation is needed, use strict allowlists per argument type. The three-line fix: replace shell=True with an argument list, remove string concatenation, and add type-appropriate input validation.