Agentic Philosophy Through Adversarial Self-Review (v1)

Andy Southgate

Human Created by Andy Southgate · Created: 2026-03-01

Version 1, March 2026

Abstract

Large language models can produce philosophical text that both experts and non-experts find difficult to distinguish from human output, yet existing systems generate content in single passes without sustained review or revision. We present The Unfinishable Map, a continuously operating system that produces and evolves a philosophical knowledge base through tenet-constrained generation and multi-layer adversarial self-review. Five explicit philosophical commitments function as hard constraints on all content, applying constitutional AI principles to knowledge production rather than to safety alignment. An evolution loop orchestrates generation, review, and maintenance tasks, while independent review layers (pessimistic, optimistic, deep, outer, and cross-review) surface logical gaps, unsupported claims, and internal contradictions. Over roughly two months of continuous operation, the system completed about 3,000 automated sessions, produced 505 articles and 238 research notes across five content types, and generated about 1,300 review reports, accumulating about 4,500 tracked revisions in a public repository. Review cycles identified and resolved fabricated citations, systematic misattributions, and cross-article contradictions that single-pass generation had retained. We describe the system architecture, report these observations, and discuss practical solutions to the human-AI co-authorship problem. The architecture is domain-agnostic: although it is instantiated here for dualist philosophy of mind, the underlying infrastructure could be reseeded with any set of foundational commitments.

Keywords: AI-assisted knowledge production, adversarial self-review, constrained generation, human-AI co-authorship, agent-first content architecture
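
The loop described in the abstract (hard tenet constraints, then several review layers, then revision) can be sketched roughly as follows. This is a toy illustration only, not the system's actual code: the names `TENETS`, `Article`, `violates_tenets`, `review`, and `evolve` are hypothetical, the five tenets are stand-in labels because they are not listed here, and the string-matching "reviewers" stand in for what would presumably be model-driven review passes.

```python
from dataclasses import dataclass, field
from typing import List

# Placeholder labels: the abstract says there are five tenets but does not name them.
TENETS = [f"tenet-{i}" for i in range(1, 6)]

# Review layer names are taken from the abstract.
REVIEW_LAYERS = ["pessimistic", "optimistic", "deep", "outer", "cross-review"]


@dataclass
class Article:
    title: str
    body: str
    revisions: int = 0
    findings: List[str] = field(default_factory=list)


def violates_tenets(body: str, tenets: List[str]) -> List[str]:
    """Toy constraint check: a draft violates a tenet if it contains 'NOT <tenet>'."""
    return [t for t in tenets if f"NOT {t}" in body]


def review(article: Article, layer: str) -> List[str]:
    """Toy reviewer: every layer flags the literal marker '[unsupported]'."""
    if "[unsupported]" in article.body:
        return [f"{layer}: unsupported claim"]
    return []


def evolve(article: Article, max_cycles: int = 3) -> Article:
    """Run constraint check, review, and revision until no layer reports findings."""
    for _ in range(max_cycles):
        broken = violates_tenets(article.body, TENETS)
        if broken:
            # Tenets act as hard constraints: the draft is rejected, not revised.
            raise ValueError(f"draft violates hard constraints: {broken}")
        findings = [f for layer in REVIEW_LAYERS for f in review(article, layer)]
        if not findings:
            break
        article.findings.extend(findings)
        # Toy revision step: drop the flagged marker and record a tracked revision.
        article.body = article.body.replace("[unsupported]", "").strip()
        article.revisions += 1
    return article
```

In this sketch, calling `evolve(Article("Example", "Claim A. [unsupported]"))` records one finding per review layer and a single revision, while a draft containing `NOT tenet-1` is rejected outright. Keeping the tenet check separate from the review layers reflects the abstract's distinction between hard constraints and adversarial review, though how the real system draws that line is an assumption here.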

Citation

APA

Southgate, A. (2026). The Unfinishable Map: Agentic philosophy through adversarial self-review. Preprint. https://unfinishablemap.org/papers/agentic-philosophy-v1/

BibTeX

@article{southgate2026unfinishablemap,
  title={The Unfinishable Map: Agentic Philosophy Through Adversarial Self-Review},
  author={Southgate, Andy},
  year={2026},
  url={https://unfinishablemap.org/papers/agentic-philosophy-v1/},
  note={Preprint}
}

Version Notes

Initial preprint describing the system architecture, two-month operational observations, and the human-AI co-authorship framework.