Article ยท March 2026

First-Order Logic in Software Engineering

Software engineering has a reasoning gap. Every bug that escapes to production was a logical relationship that nobody verified โ€” not because humans are lazy, but because exhaustive verification is computationally hard without the right tool.

Large language models have made that gap worse, not better. They fill it with plausibility โ€” answers that sound right, pass code review, and break in production. First-order logic fills it with proof โ€” answers that are mathematically guaranteed to be correct.

This article explores 17 concrete places in the software lifecycle where first-order logic โ€” implemented via a Prolog runtime โ€” outperforms probabilistic reasoning. Not in theory. In practice.

By Umur Ozkul ยท CTO, Rust developer, 20+ years of functional programming ยท github.com/umuro/prolog-mcp

The Reasoning Gap

LLMs generate plausible answers

They reason by pattern matching. "This looks like a valid config" is not the same as "I proved every constraint is satisfied." The difference is bugs.

Prolog generates provable answers

It explores the complete solution space. Every answer is a mathematical proof. If Prolog says "no solution exists," there is no solution.

Foundational: Graph & Relational Reasoning
๐Ÿ•ธ๏ธ

1. Dependency Graph Detection

Model any system as facts: depends_on(service_a, database_b). Query for cycles, transitive dependencies, or circular references. LLMs see names and guess relationships โ€” Prolog proves them.

What Prolog finds:

โŒ CYCLE: api โ†’ auth โ†’ user_db โ†’ api
โŒ TRANSITIVE: api โ†’ cache โ†’ auth (hidden 2-step dependency)
Fix: break cycle with async queue at user_db

LLMs guess who depends on whom. Prolog proves it โ€” including the hidden 3-step chains nobody noticed.

๐Ÿ”€

2. Message Routing & Dispatch

Express routing rules as Prolog clauses: handles(billing, Channel). Query for the correct channel given a message type, payload constraints, or user context. Deterministic, not probabilistic.

What Prolog resolves:

handles(billing, Priority) โ†’ Priority = urgent_queue
handles(refund, Default) โ†’ Default = finance_channel
โŒ UNREACHABLE: no_rule_for(emergency_refund) โ€” gap found!

LLMs say "route to the finance team." Prolog proves which queue, with which priority, and finds routing gaps before they become dead messages.

๐Ÿ—“๏ธ

3. Scheduling & Resource Contention

Encode schedules, capacity constraints, and time windows as Prolog facts. Query for valid assignments that satisfy every constraint simultaneously.

What Prolog solves:

โœ… VALID: worker_1(shift_1, room_A), worker_2(shift_1, room_B)
โŒ CONFLICT: worker_1(shift_1, room_A), worker_1(shift_1, room_C) โ€” same person, same shift, different rooms
โŒ CAPACITY: room_A capacity=2, assigned=3

Constraint satisfaction is what Prolog was built for. LLMs propose schedules; Prolog proves they're valid.

๐Ÿง 

4. Agent Self-Knowledge & Persistent Memory

Agents accumulate facts over time: user_preference(alice, theme, dark), session_context(bob, project, prolog_mcp). Query the agent's own knowledge base to make context-aware decisions. Survives restarts.

What Prolog enables:

user_preference(alice, theme, dark).
user_preference(alice, lang, rust).
recommend(alice, Resource) :- user_preference(alice, lang, rust), is_rust_example(Resource).
โ†’ recommend(alice) โ†’ Functional Rust examples

LLMs forget between sessions. Prolog facts persist โ€” the agent remembers what the user prefers, across every conversation.

Phase: Requirements
๐Ÿ“‹

5. Requirements Consistency Checking

You have 15 requirements written by 3 teams. An LLM says "they look fine." But requirement #4 says "All users must have a unique email" and #12 says "Users can share emails across departments."

What Prolog does:

โŒ CONTRADICTION: req_4 (unique_email) โ†” req_12 (shared_email)
Fix: exclude department-shared accounts from uniqueness axiom.

LLMs read text. Prolog proves consistency โ€” or finds the exact pair that conflicts. Before a single line of code is written.

๐Ÿ”—

6. Cross-Team Interface Contracts

Team A produces JSON with user_id: string. Team B expects user_id: integer. They discover the mismatch in production on Friday at 23:00.

What Prolog catches at design time:

โŒ CONTRACT VIOLATION: team_a.produces(user_id, string) โ‰  team_b.expects(user_id, integer)
Location: api-gateway โ†’ order-service boundary

Express every team's input/output schema as Prolog facts. Query all_interfaces_valid before the teams even meet.

๐Ÿ“

7. Acceptance Criteria as Provable Contracts

Instead of "the system should handle edge cases" โ€” write acceptance criteria as executable logic:

// "An order is valid only if..."

valid_order(User, Items) :-

has_permission(User, create_order),

all_items_in_stock(Items),

total_below_credit_limit(User, Items),

not(has_pending_dispute(User)).

Every test case passes or fails deterministically. No "looks good to me." The spec is the test oracle.

Phase: Architecture & Design
โš™๏ธ

8. Configuration Constraint Satisfaction

"Service A needs port 8080. Service B needs 8080. They can't share. C depends on A but not B. D depends on both. What's the valid port assignment?"

What Prolog returns:

A=8080, B=8081, C=8082, D=8083
A=8090, B=8091, C=8092, D=8093
... (all 12 valid configurations)

LLMs guess. Prolog searches the entire solution space and returns every valid configuration.

๐Ÿ”

9. Access Control Matrix Verification

RBAC with 8 roles, 40 permissions, 12 resource types, and escalation rules. How do you know no role can accidentally access admin functions?

What Prolog proves:

โœ… VERIFIED: No non-admin role can reach delete_user.
โŒ ESCALATION BUG: editor + on_leave + emergency_override โ†’ admin access
Path: editor โ†’ emergency_override โ†’ bypass_approval โ†’ admin_function

Security audits by LLMs say "this looks secure." Prolog explores every role-path combination and finds the one that breaks your model.

๐Ÿ”„

10. State Machine Invariant Verification

"Can the system ever reach a state where a payment is captured but no order exists?"

What Prolog explores:

โŒ VIOLATION: capture_payment(S3) โ†’ cancel_order(S4) โ†’ no_order_with_payment(S5)
Fix: cancel_order must refund first, or prevent cancellation after payment.

LLMs can't reason about all state transitions. Prolog explores the transition graph and finds the exact sequence that breaks the invariant.

Phase: Implementation
๐Ÿงช

11. Exhaustive Test Case Generation

Precondition: user is authenticated AND has role (admin OR editor) AND item is not locked AND item belongs to user's workspace AND workspace is active.

Prolog generates minimal test matrix:

โœ“ auth+admin+unlocked+own+active โ†’ PASS
โœ— auth+admin+locked+own+active โ†’ FAIL (item locked)
โœ— auth+admin+unlocked+other+active โ†’ FAIL (wrong workspace)
... (8 cases, 0 missed edge cases)

LLMs generate "likely" test cases and miss the ones that break your system. Prolog generates every combination.

๐Ÿ”’

12. Data Flow Integrity โ€” Can Sensitive Data Leak?

GDPR says PII must never reach analytics. Your system has 40 services. Does any data path carry PII from the user service to the analytics pipeline?

What Prolog traces:

โŒ LEAK: user_service โ†’ order_transformer โ†’ analytics_pipeline
Path carries: email (PII) โ†’ order_context โ†’ analytics_log
Fix: add anonymization step at order_transformer.

Query pii_leak(source, sink) and get the actual path. Not "review your data flow." The actual path.

โ™ป๏ธ

13. Refactoring Safety โ€” Does Behavior Change?

You're refactoring a 10-year-old function. The LLM says "looks equivalent." But does it handle every input the same way?

What Prolog checks:

โœ… EQUIVALENT for 47 of 48 input classes
โŒ DIFFERS: old returns empty_list when input = [], new returns null
Regression: parse_config("") would crash downstream

Query equivalent(old, new, Input) for every input class. Find the one case where your refactor changes behavior.

Phase: Deployment & Operations
๐Ÿš€

14. Deployment Ordering & Dependency Resolution

25 microservices. What's the optimal deployment order that minimizes downtime and never starts a service before its dependencies?

Deploy order: [auth_db โ†’ auth_service โ†’ user_db โ†’ user_service โ†’ ...]
Parallel group 1: [logging, metrics]
โŒ CYCLE: service_a โ†’ service_b โ†’ service_c โ†’ service_a
Fix: break cycle at service_c (use async queue)

Topological sort with constraints. Prolog finds the order, parallelizes where safe, and detects cycles.

๐Ÿ“Š

15. Cron Conflict Detection

47 cron jobs across 12 servers. Some share databases. Some share rate-limited APIs. Which pairs can never run simultaneously?

โœ… 42 jobs: no conflicts
โŒ CONFLICT: nightly_etl (2:00-04:00) โˆฉ report_generator (03:00-03:30) โ†’ shared db lock
Fix: shift report_generator to 04:30

LLMs "estimate" conflicts. Prolog checks every time overlap against shared resource rules.

Phase: Acceptance & Compliance
โš–๏ธ

16. Regulatory Compliance as Provable Rules

GDPR, HIPAA, SOC2 โ€” express regulations as Prolog rules, assert your workflow facts, query for violations:

% GDPR: data must be deletable within 30 days

compliant(Workflow) :-

has_deletion_mechanism(Workflow),

max_retention_days(Workflow, 30),

audit_trail_for(Workflow, deletion).

Query compliant(my_workflow)? Provable yes/no. Show this to an auditor.

๐Ÿ—๏ธ

17. Architectural Debt Detection

Your architecture docs say "no service should bypass the API gateway." Six months later, 3 services call each other directly. Which services violate the architectural rule?

What Prolog finds:

โŒ VIOLATION: order_service โ†’ inventory_service (direct call, bypasses gateway)
โŒ VIOLATION: inventory_service โ†’ notification_service (direct call)
Fix: route both through api_gateway, or add exception to architecture rule

Architectural rules as Prolog clauses. Query the actual codebase facts. Find violations automatically โ€” before the architecture document becomes fiction.

Software engineering has a reasoning gap.

LLMs fill it with plausibility.

Prolog fills it with proof.

Every bug that escapes to production was a logical relationship that nobody verified. Not because humans are lazy โ€” but because exhaustive verification is computationally hard without the right tool.

Prolog MCP gives your AI agent that tool. Not to replace it โ€” to make it trustworthy.

๐Ÿ”ฎ github.com/umuro/prolog-mcp

Published by High Tech Mind ยท More AI articles โ†’