system_probe

Consolidated system probes for the Marianne daemon.

Provides a single SystemProbe class that encapsulates the "try psutil → fallback to /proc" pattern used across the daemon for:

  • Memory usage (RSS)
  • Child process counting
  • Zombie detection and reaping
  • Process group member counting

Extracted from monitor.py and pgroup.py (FIX-16) to eliminate 5× duplication of the psutil/proc fallback logic. All methods are static — callers import the class and call methods directly.
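The "try psutil, fall back to /proc" pattern can be sketched as follows. This is a minimal illustration, not the module's actual code: the optional-import guard that produces `_psutil` is an assumption (the page only shows `_psutil` being compared to `None`), and `probe_pid` is a hypothetical helper chosen because its result is easy to check.

```python
from __future__ import annotations

try:  # psutil is optional; without it the probe relies on /proc alone
    import psutil as _psutil
except ImportError:
    _psutil = None


def probe_pid() -> int | None:
    """Hypothetical probe: return this process's PID, or None if every probe fails."""
    if _psutil is not None:
        try:
            return _psutil.Process().pid
        except (_psutil.Error, OSError):
            pass  # fall through to the /proc path
    # Fallback: first field of /proc/self/stat (Linux only)
    try:
        with open("/proc/self/stat") as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None  # all probes failed
```

Keeping the guard at module level means each probe only needs an `is not None` check before taking the psutil path.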

Classes

SystemProbe

Consolidated system resource probes.

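Each method tries psutil first, then falls back to /proc on Linux. The probes with an optional return type (get_memory_mb, get_child_count, count_group_members) return None when all probes fail; callers should treat that as a critical condition (fail-closed). The zombie probes (get_zombies, reap_zombies) always return a list.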
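A fail-closed caller might look like the sketch below. The function name `within_memory_limit` and the idea of a megabyte limit are assumptions made for illustration; the page does not show how the daemon consumes these values.

```python
from __future__ import annotations


def within_memory_limit(rss_mb: float | None, limit_mb: float) -> bool:
    """Hypothetical check: an unknown reading (None) counts as over the limit."""
    if rss_mb is None:
        return False  # fail closed: no data is treated as a critical condition
    return rss_mb <= limit_mb
```

A caller would pass the probe result straight through, for example `within_memory_limit(SystemProbe.get_memory_mb(), 512.0)`, so that a failed probe blocks the action instead of silently allowing it.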

Functions
get_memory_mb staticmethod
get_memory_mb()

Get current process RSS memory in MB.

Uses psutil.Process().memory_info().rss if available, falls back to reading VmRSS from /proc/self/status.

Returns:

Type          Description
float | None  RSS in megabytes, or None when all probes fail.

Source code in src/marianne/daemon/system_probe.py
@staticmethod
def get_memory_mb() -> float | None:
    """Get current process RSS memory in MB.

    Uses ``psutil.Process().memory_info().rss`` if available,
    falls back to reading ``VmRSS`` from ``/proc/self/status``.

    Returns:
        RSS in megabytes, or ``None`` when all probes fail.
    """
    if _psutil is not None:
        try:
            proc = _psutil.Process()
            rss_bytes: int = proc.memory_info().rss
            return rss_bytes / (1024 * 1024)
        except (_psutil.NoSuchProcess, _psutil.AccessDenied, OSError, AttributeError):
            _logger.debug("psutil_memory_probe_failed", exc_info=True)
    # Fallback: /proc/self/status (Linux only)
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1]) / 1024  # kB -> MB
    except (OSError, ValueError):
        pass
    # No exception is active at this point, so exc_info is omitted.
    _logger.warning("memory_probe_failed_all_methods")
    return None
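The /proc fallback above boils down to parsing one line of text. The sketch below isolates that step so it can be checked without a Linux host; `parse_vmrss_mb` and the sample text are illustrative and not part of the module.

```python
from __future__ import annotations


def parse_vmrss_mb(status_text: str) -> float | None:
    """Extract VmRSS from /proc/<pid>/status content and convert kB to MB."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:\t    2048 kB"; the value is in kB.
            return int(line.split()[1]) / 1024
    return None  # no VmRSS line present


sample = "Name:\tpython3\nVmPeak:\t   10240 kB\nVmRSS:\t    2048 kB\n"
print(parse_vmrss_mb(sample))  # 2.0
```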
get_child_count staticmethod
get_child_count()

Count child processes of the current process (recursive).

Uses psutil.Process().children(recursive=True) if available, falls back to scanning /proc/*/status for matching PPid.

Returns:

Type        Description
int | None  Number of child processes, or None when all probes fail.

Source code in src/marianne/daemon/system_probe.py
@staticmethod
def get_child_count() -> int | None:
    """Count child processes of the current process (recursive).

    Uses ``psutil.Process().children(recursive=True)`` if available,
    falls back to scanning ``/proc/*/status`` for matching PPid.

    Returns:
        Number of child processes, or ``None`` when all probes fail.
    """
    if _psutil is not None:
        try:
            current = _psutil.Process()
            return len(current.children(recursive=True))
        except (_psutil.NoSuchProcess, _psutil.AccessDenied, OSError, AttributeError):
            _logger.debug("psutil_child_count_probe_failed", exc_info=True)
    return SystemProbe._proc_child_count()
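The body of `_proc_child_count` is not shown on this page. As a rough sketch of what "scanning /proc/*/status for matching PPid" could look like, the hypothetical helper below counts direct children only (it does not recurse into grandchildren) and takes a `proc_root` parameter purely so it can be pointed at a test directory.

```python
from __future__ import annotations

import os


def count_direct_children(ppid: int, proc_root: str = "/proc") -> int | None:
    """Hypothetical: count processes whose PPid equals ``ppid``."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return None  # /proc is unavailable (for example, not Linux)
    count = 0
    for name in entries:
        if not name.isdigit():
            continue  # skip non-PID entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, name, "status")) as f:
                for line in f:
                    if line.startswith("PPid:"):
                        if int(line.split()[1]) == ppid:
                            count += 1
                        break  # only one PPid line per status file
        except (OSError, ValueError, IndexError):
            continue  # process exited mid-scan or entry is unreadable
    return count
```

Processes can exit between the directory listing and the file read, which is why per-entry errors are skipped instead of failing the whole scan.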
get_zombies staticmethod
get_zombies()

Detect zombie child processes (without reaping).

Returns:

Type       Description
list[int]  List of detected zombie PIDs. This call does not reap them.

Source code in src/marianne/daemon/system_probe.py
@staticmethod
def get_zombies() -> list[int]:
    """Detect zombie child processes (without reaping).

    Returns:
        List of detected zombie PIDs. This call does not reap them.
    """
    return SystemProbe._scan_zombies(reap=False)
reap_zombies staticmethod
reap_zombies()

Detect and reap zombie child processes.

Returns:

Type       Description
list[int]  List of PIDs that were reaped.

Source code in src/marianne/daemon/system_probe.py
@staticmethod
def reap_zombies() -> list[int]:
    """Detect and reap zombie child processes.

    Returns:
        List of PIDs that were reaped.
    """
    return SystemProbe._scan_zombies(reap=True)
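`_scan_zombies` is not shown on this page, so the following is only a sketch of one common way to detect a zombie on Linux: reading the state letter from a `/proc/<pid>/stat` line, where `Z` marks a zombie. The helper name `parse_stat_state` is hypothetical. Reaping a zombie child is conventionally done with `os.waitpid(pid, os.WNOHANG)`; whether the module does exactly that is an assumption.

```python
from __future__ import annotations


def parse_stat_state(stat_line: str) -> str | None:
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name (field 2) is parenthesised and may contain spaces
    # or ")", so split on the last ")" instead of on whitespace.
    _, sep, rest = stat_line.rpartition(")")
    fields = rest.split()
    if not sep or not fields:
        return None  # malformed line
    return fields[0]


print(parse_stat_state("4242 (my worker) Z 100 4242 4242 0 -1"))  # Z
print(parse_stat_state("4243 (sh) S 100 4242 4242 0 -1"))  # S
```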
count_group_members staticmethod
count_group_members(pgid, exclude_pid=0)

Count processes in a process group, excluding one PID.

If psutil is available, iterates psutil.process_iter(["pid"]) and compares os.getpgid(pid) for each process; otherwise falls back to reading field 5 (0-indexed 4) of /proc/*/stat.

Note: psutil.process_iter(["pgid"]) raises ValueError because pgid is not a valid as_dict attribute. We use per-process os.getpgid(proc.pid) instead.

Parameters:

Name         Type  Description                                  Default
pgid         int   Process group ID to count members of.        required
exclude_pid  int   PID to exclude from count (typically self).  0

Returns:

Type        Description
int | None  Number of matching processes, or None if all probes fail.

Source code in src/marianne/daemon/system_probe.py
@staticmethod
def count_group_members(pgid: int, exclude_pid: int = 0) -> int | None:
    """Count processes in a process group, excluding one PID.

    If psutil is available, iterates ``psutil.process_iter(["pid"])``
    and compares ``os.getpgid(pid)`` per process; otherwise falls
    back to reading ``/proc/*/stat`` field 5 (0-indexed 4).

    Note: ``psutil.process_iter(["pgid"])`` raises ``ValueError``
    because ``pgid`` is not a valid ``as_dict`` attribute.  We use
    per-process ``os.getpgid(proc.pid)`` instead.

    Args:
        pgid: Process group ID to count members of.
        exclude_pid: PID to exclude from count (typically self).

    Returns:
        Number of matching processes, or None if all probes fail.
    """
    if _psutil is not None:
        try:
            count = 0
            for proc in _psutil.process_iter(["pid"]):
                try:
                    pid = proc.info["pid"]
                    if pid == exclude_pid:
                        continue
                    if os.getpgid(pid) == pgid:
                        count += 1
                except (_psutil.NoSuchProcess, _psutil.AccessDenied, OSError, AttributeError):
                    continue
            return count
        except (_psutil.NoSuchProcess, _psutil.AccessDenied, OSError, AttributeError):
            _logger.debug("psutil_group_count_failed", exc_info=True)
    return SystemProbe._proc_group_count(pgid, exclude_pid)
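The per-process `os.getpgid` comparison used in the psutil path can be exercised on its own. The sketch below takes an explicit list of PIDs in place of `psutil.process_iter`, so it runs without psutil on any Unix-like system; `count_in_group` is a hypothetical name, not part of the module.

```python
import os
from collections.abc import Iterable


def count_in_group(pids: Iterable[int], pgid: int, exclude_pid: int = 0) -> int:
    """Hypothetical: count the given PIDs that belong to process group ``pgid``."""
    count = 0
    for pid in pids:
        if pid == exclude_pid:
            continue
        try:
            if os.getpgid(pid) == pgid:
                count += 1
        except OSError:
            continue  # process vanished or is not accessible
    return count


me = os.getpid()
my_group = os.getpgid(0)  # 0 means "the calling process"
print(count_in_group([me], my_group))  # 1
print(count_in_group([me], my_group, exclude_pid=me))  # 0
```

Passing the daemon's own PID as `exclude_pid` answers the question "is anything else still alive in my group?", which matches the "typically self" note on the parameter.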

Functions