Command Injection: Shell Escape to Code Execution

Command injection is a direct path to remote code execution. An attacker who can inject OS commands into your application can read secrets, exfiltrate data, establish persistence, or pivot to internal systems. The vulnerability is not subtle: it appears when application code constructs a shell command by concatenating user input. This article covers the mechanics, the common patterns developers reach for that introduce the bug, and the safe alternatives.

How Command Injection Works

Applications sometimes need to run system commands: convert an image, call a CLI tool, check a network address, run a report. The most direct way to do this is to construct a shell string and execute it.

# Vulnerable Python
import os

def ping_host(hostname):
    result = os.system(f"ping -c 1 {hostname}")
    return result

If hostname comes from user input and an attacker supplies google.com; cat /etc/passwd, the shell interprets this as two commands: ping -c 1 google.com followed by cat /etc/passwd. The semicolon is a shell command separator.

Other shell metacharacters that enable injection:

Character	Effect
`;`	Command separator
`&&`	Run second command if first succeeds
`\|\|`	Run second command if first fails
`\|`	Pipe output to another command
`	Command substitution
`$(...)`	Command substitution
`\n`	Newline as separator in some shells
`>`, `>>`	Output redirection

Any of these in user-controlled data breaks out of the intended command.

Common Vulnerable Patterns

Python: `os.system` and `subprocess` with `shell=True`

import subprocess, os

# Vulnerable patterns
os.system(f"convert {user_filename} output.jpg")
subprocess.run(f"ffmpeg -i {input_file} {output_file}", shell=True)
subprocess.getoutput(f"whois {domain}")

shell=True passes the string to /bin/sh -c, which interprets all shell metacharacters. When user input reaches any of these calls, injection is possible.

Node.js: `exec` and template literals

const { exec } = require('child_process');

// Vulnerable
function generateThumbnail(filename) {
  exec(`convert ${filename} -resize 200x200 thumb.jpg`, (err, stdout) => {
    // ...
  });
}

// Also vulnerable
const util = require('util');
const execAsync = util.promisify(exec);
await execAsync(`git log --oneline ${userBranch}`);

Go: passing user input to `exec.Command` with `sh -c`

// Vulnerable
func runDig(domain string) (string, error) {
    out, err := exec.Command("sh", "-c", "dig +short " + domain).Output()
    return string(out), err
}

The pattern is always the same: user-controlled data is concatenated into a string that gets passed to a shell interpreter.

The Fix: Avoid the Shell

The fundamental fix is to pass arguments as separate values rather than as a single shell string. This bypasses the shell entirely. The program is launched directly; no metacharacter interpretation occurs.

Python: `subprocess` with argument list

import subprocess

def ping_host(hostname):
    # Validate input first
    import re
    if not re.match(r'^[a-zA-Z0-9.\-]+$', hostname):
        raise ValueError("Invalid hostname")

    result = subprocess.run(
        ['ping', '-c', '1', hostname],  # List form  --  no shell
        capture_output=True,
        text=True,
        timeout=5
    )
    return result.stdout

When you pass a list to subprocess.run, Python invokes the binary directly using execvp. The OS never sees a shell. hostname is passed as a literal argument to ping, regardless of what characters it contains.

Node.js: `execFile` or `spawn` with argument arrays

const { execFile, spawn } = require('child_process');
const { promisify } = require('util');
const execFileAsync = promisify(execFile);

// Safe: execFile does not invoke a shell
async function generateThumbnail(filename) {
  // Validate filename doesn't contain path traversal
  const path = require('path');
  const safe = path.basename(filename);
  if (safe !== filename) throw new Error('Invalid filename');

  const { stdout } = await execFileAsync('convert', [
    safe, '-resize', '200x200', 'thumb.jpg'
  ]);
  return stdout;
}

// Safe: spawn with argument array
function streamConvert(inputFile, outputFile) {
  const proc = spawn('ffmpeg', ['-i', inputFile, outputFile]);
  // handle proc.stdout / proc.stderr
  return proc;
}

execFile and spawn without {shell: true} do not invoke a shell. User data is passed verbatim as process arguments.

Go: `exec.Command` with separate arguments

import (
    "os/exec"
    "regexp"
)

func runDig(domain string) (string, error) {
    // Validate input
    valid := regexp.MustCompile(`^[a-zA-Z0-9.\-]+$`)
    if !valid.MatchString(domain) {
        return "", fmt.Errorf("invalid domain")
    }

    // Separate arguments  --  no shell
    out, err := exec.Command("dig", "+short", domain).Output()
    return string(out), err
}

exec.Command does not invoke a shell when the program path and arguments are supplied separately. The shell is only invoked when you explicitly use "sh", "-c", someString.

Input Validation as Defence in Depth

Bypassing the shell removes the structural vulnerability. Validating input adds a second layer that limits damage even if an error introduces a shell path later.

For common argument types:

import re
from pathlib import Path

# Hostname: alphanumeric, dots, hyphens only
def validate_hostname(s: str) -> str:
    if not re.match(r'^[a-zA-Z0-9.\-]{1,253}$', s):
        raise ValueError(f"Invalid hostname: {s!r}")
    return s

# File paths: reject traversal, enforce extension, use basename
def validate_image_path(s: str) -> Path:
    p = Path(s)
    if p.name != s or '..' in s:
        raise ValueError("Invalid path")
    if p.suffix.lower() not in {'.jpg', '.jpeg', '.png', '.gif', '.webp'}:
        raise ValueError("Unsupported extension")
    return p

# Integer arguments: parse as int, never pass raw string
def validate_port(s: str) -> int:
    port = int(s)  # ValueError if not numeric
    if not 1 <= port <= 65535:
        raise ValueError("Invalid port")
    return port

Never build a blocklist of “dangerous” characters. Blocklists get bypassed. Use an allowlist of what is valid for each argument type.

When You Cannot Avoid a Shell

Some rare cases need shell features: pipes, redirects, shell builtins. If you must use a shell, escape arguments explicitly rather than embedding them raw.

In Python, shlex.quote wraps a single argument in single quotes and escapes any embedded single quotes:

import shlex, subprocess

def run_with_pipe(input_file, output_file):
    # Validate filenames are plain names (no path separators)
    for f in (input_file, output_file):
        if '/' in f or '..' in f:
            raise ValueError(f"Invalid filename: {f}")

    # shlex.quote makes arguments shell-safe
    cmd = f"cat {shlex.quote(input_file)} | wc -l > {shlex.quote(output_file)}"
    subprocess.run(cmd, shell=True, check=True)

shlex.quote('foo; rm -rf /') produces 'foo; rm -rf /' — wrapped in single quotes, so the shell treats the entire string as a literal argument rather than executing the injected commands.

This is a last resort. Prefer the argument-list approach whenever possible.

Detecting Command Injection in Your Codebase

Grep for the high-risk patterns:

# Python: shell=True usage
grep -rn "shell=True" src/

# Python: os.system, os.popen, getoutput
grep -rn "os\.system\|os\.popen\|getoutput\|getstatusoutput" src/

# Node.js: exec (uses shell by default)
grep -rn "\.exec(" src/

# Go: sh -c pattern
grep -rn '"sh".*"-c"' .

For each hit, audit whether any part of the command string is derived from user input, environment variables, or database values. Even data you populated yourself can become injection-capable if it was stored unsanitised earlier.

Summary

Command injection is prevented by keeping user data out of shell strings entirely. Pass arguments as arrays or separate parameters so the shell is never involved. When validation is needed, use strict allowlists per argument type. The three-line fix: replace shell=True with an argument list, remove string concatenation, and add type-appropriate input validation.