Codalog Validation API

Experimental code quality API for agents, bots, and pipelines.

Codalog is an API-first layer for inspecting submitted code against curated good-code exemplars, storing samples, and preparing for stronger semantic retrieval later. It scores structure and maintainability now, while emitting fingerprints you can feed into a larger exemplar memory over time.

Codalog is not a definitive human-vs-AI detector. It estimates how closely a snippet aligns with curated good-code examples and highlights places that should be reviewed more carefully.

Current focus: internal validation of ranking usefulness. The signal is still experimental and should be used as review input, not as a standalone quality judgment.

Working Views

These Playwright-captured images come from the live static Codalog page so the landing page can show actual working states without turning into an app UI.

Codalog API hero screenshot — The API-first framing, endpoint rail, and human-vs-AI caveat stay visible above the fold.

API Surface At A Glance

Codalog is positioned as a machine-first, validation-led code quality layer, not a chat demo and not a black-box detector.

Public catalog and analysis routes
Authenticated sample persistence route
Static docs-style framing for agents and CI

Codalog request examples screenshot — Curl examples are designed to be copied directly into CI, bots, or other agents.

Copy-Paste Request Examples

The page documents the smallest useful request shapes instead of hiding the contract behind a UI.

Analyze a snippet directly
Store a sample into private memory
Check current language coverage when needed

Codalog example response screenshot — The response shape is stable and machine-readable: summary, heuristics, fingerprint, exemplar alignment, similarity, and storage hints.

Response Shape You Can Build Against

The site shows the exact structure agents should expect, including the curated exemplar verdict and the STM-ready (semantic tensor memory) fingerprint emitted for future semantic matching.

Quality classification and confidence
Exemplar alignment verdict and nearest matches
Fingerprints for later retrieval
Storage hints for owner-scoped memory

Codalog exemplar coverage screenshot — The landing page stays focused on curated exemplar coverage instead of exposing internal sourcing notes as marketing copy.

Curated Exemplar Coverage

The public page now shows what matters to users: exemplar-backed scoring, current language coverage, and how to treat outliers that do not match the good-code pool.

Language-by-language curated starter pool
Review recommendation when alignment is weak
Coverage metadata kept lightweight and secondary

What V1 Does

Codalog is intentionally thin right now: API endpoints, stable response shapes, a starter corpus of curated good-code exemplars, and an internal validation loop. There is no dashboard, no sample management console, and no automatic mirroring of third-party datasets.

Analysis Response Shape

Every analysis response includes `summary`, `heuristics`, `fingerprint`, `signals`, `similarity`, `exemplar_alignment`, and `storage_hints`. The example response also carries two top-level metadata fields, `analysis_kind` and `disclaimer`.

`storage_hints` carries the default namespace (`namespace`), normalized tags (`tags`), text query terms (`search_query`), and an STM-ready fingerprint (`stm_ready_fingerprint`).
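
A client that builds against this shape may want to fail fast when a key is missing. The following is a minimal sketch, not part of Codalog itself: the helper name `missing_response_keys` is hypothetical, and it checks only the top-level keys named above.

```python
# Top-level keys this page says every analysis response includes.
REQUIRED_KEYS = (
    "summary",
    "heuristics",
    "fingerprint",
    "signals",
    "similarity",
    "exemplar_alignment",
    "storage_hints",
)


def missing_response_keys(response: dict) -> list[str]:
    """Return the documented top-level keys that are absent from a response.

    Hypothetical client-side helper; Codalog does not ship this function.
    """
    return [key for key in REQUIRED_KEYS if key not in response]
```

An empty list means the response carries every documented section; anything else is worth logging before the rest of a pipeline reads nested fields.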

Persistence Model

Samples are stored in existing owner-scoped memory. Default namespaces follow `codalog:samples:<language>`.

Ownerless self-serve agent tokens must use private visibility when storing samples.
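
The two rules above can be mirrored on the client before a request is sent. This is a sketch under stated assumptions: the function names and the `ownerless_token` flag are hypothetical, the payload fields are the ones used in the storage example on this page, and the server remains the authority on what it accepts.

```python
def default_namespace(language: str) -> str:
    """Build the default namespace, following `codalog:samples:<language>`."""
    return f"codalog:samples:{language}"


def build_sample_payload(
    language: str,
    filename: str,
    code: str,
    *,
    ownerless_token: bool,
    visibility: str = "private",
) -> dict:
    """Assemble a sample payload, rejecting non-private visibility locally
    when the caller holds an ownerless self-serve agent token.

    Hypothetical helper: `ownerless_token` is a client-side flag, not an
    API field, and is not included in the returned payload.
    """
    if ownerless_token and visibility != "private":
        raise ValueError("ownerless agent tokens must store samples as private")
    return {
        "visibility": visibility,
        "language": language,
        "filename": filename,
        "code": code,
    }
```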

Curated Exemplar Path

V1 defaults to a curated good-code corpus by language. Public-source expansion stays a backend concern, not a landing-page feature.

The emitted fingerprint is designed so semantic tensor memory or vector ranking can be layered in later without redesigning the external API.
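
To illustrate how a ranking step could sit on top of the emitted fingerprint, here is a minimal sketch that compares the `identifiers` lists of two `stm_ready_fingerprint` objects with Jaccard overlap. This is an assumption chosen for illustration, not Codalog's similarity method, and `identifier_overlap` is a hypothetical name.

```python
def identifier_overlap(fingerprint_a: dict, fingerprint_b: dict) -> float:
    """Jaccard overlap between the `identifiers` lists of two fingerprints.

    Illustrative only: Codalog's own similarity scoring is not described
    on this page and may work differently.
    """
    a = set(fingerprint_a.get("identifiers", []))
    b = set(fingerprint_b.get("identifiers", []))
    if not a and not b:
        return 0.0  # Nothing to compare, so report no overlap.
    return len(a & b) / len(a | b)
```

Because the function reads only the fingerprint, a stronger retrieval backend could replace it later without touching the callers that store or fetch samples.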

Example Requests

These examples are enough to wire Codalog into CI, a repo bot, a browser extension, or another agent.

Inspect Coverage Metadata

curl -s https://mnemolog.com/api/codalog/catalog

Analyze Submitted Code

curl -s https://mnemolog.com/api/codalog/analyze \
  -H 'Content-Type: application/json' \
  -d '{
    "language": "typescript",
    "filename": "collect.ts",
    "code": "export function collectVisibleItems(items: Item[]): Item[] {\n  return items.filter((item) => item.isVisible);\n}"
  }'
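
For agents that are not shell-based, the same request can be sketched with the Python standard library. The URL and JSON fields come from the curl example above; the function names and the timeout default are assumptions, and error handling is left out for brevity.

```python
import json
import urllib.request

ANALYZE_URL = "https://mnemolog.com/api/codalog/analyze"


def build_analyze_request(language: str, filename: str, code: str) -> urllib.request.Request:
    """Build the POST request without sending it.

    json.dumps escapes newlines and quotes in `code`, so the snippet can be
    passed as an ordinary Python string.
    """
    body = json.dumps(
        {"language": language, "filename": filename, "code": code}
    ).encode("utf-8")
    return urllib.request.Request(
        ANALYZE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(language: str, filename: str, code: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response (performs network I/O)."""
    request = build_analyze_request(language, filename, code)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect in tests without calling the live endpoint.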

Store a Private Sample

curl -s https://mnemolog.com/api/codalog/samples \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $MNA_TOKEN" \
  -d '{
    "visibility": "private",
    "language": "python",
    "filename": "chunked.py",
    "code": "def chunked(items, size):\n    if size <= 0:\n        raise ValueError(\"size must be positive\")\n    return [items[i:i + size] for i in range(0, len(items), size)]"
  }'

Example Response

Codalog returns a stable shape intended for machines first and humans second.

{
  "analysis_kind": "style_quality_inference",
  "disclaimer": "Codalog scores code quality and style consistency. It does not prove human authorship or AI authorship.",
  "summary": {
    "language": "typescript",
    "filename": "collect.ts",
    "classification": "clean_human_leaning",
    "clean_human_code_score": 86,
    "confidence": 0.82
  },
  "heuristics": {
    "readability": 88,
    "consistency": 84,
    "maintainability": 85,
    "human_likeness": 84
  },
  "fingerprint": {
    "line_count": 4,
    "non_empty_line_count": 4,
    "average_line_length": 34.5,
    "blank_line_ratio": 0,
    "comment_density": 0,
    "long_line_count": 0,
    "duplicate_line_ratio": 0,
    "function_count": 1,
    "import_count": 0,
    "indent_style": "spaces",
    "mixed_indent_lines": 0,
    "max_indent": 2,
    "identifier_stats": {
      "unique_count": 6,
      "total_count": 9,
      "snake_case": 0,
      "camel_case": 2,
      "pascal_case": 1,
      "single_letter": 0,
      "top_identifiers": [
        { "identifier": "items", "count": 2 },
        { "identifier": "collectVisibleItems", "count": 1 }
      ]
    },
    "first_signal_line": "export function collectVisibleItems(items: Item[]): Item[] {"
  },
  "signals": {
    "strengths": [
      "Identifier naming is mostly consistent.",
      "Low repeated-line ratio suggests the code is not heavily copy-pasted."
    ],
    "concerns": []
  },
  "similarity": {
    "compared_reference_count": 3,
    "reference_strategy": "curated_good_corpus",
    "nearest_references": [
      {
        "id": "good-ts-format-latency",
        "label": "Typed formatting helper",
        "source_kind": "curated",
        "similarity": 0.412,
        "token_similarity": 0.391,
        "structure_similarity": 0.462
      }
    ]
  },
  "exemplar_alignment": {
    "pool_size": 3,
    "compared_exemplar_count": 3,
    "coverage_confidence": "low",
    "alignment_score": 0.412,
    "verdict": "partially_aligned",
    "nearest_good_examples": [
      {
        "id": "good-ts-format-latency",
        "label": "Typed formatting helper",
        "source_kind": "curated",
        "similarity": 0.412,
        "token_similarity": 0.391,
        "structure_similarity": 0.462
      }
    ]
  },
  "storage_hints": {
    "namespace": "codalog:samples:typescript",
    "tags": ["codalog", "language:typescript", "quality:clean_human_leaning", "alignment:partially_aligned"],
    "search_query": "items collectvisibleitems",
    "stm_ready_fingerprint": {
      "language": "typescript",
      "identifiers": ["items", "collectvisibleitems"],
      "line_shapes": ["A A(A: A[]): A[] {", "A A.A((A) => A.A);", "}"]
    }
  }
}
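
Since the signal is meant as review input, one way a CI step or bot might consume this response is to collect reasons for a closer look instead of passing or failing the build. The sketch below is illustrative: `review_reasons` is a hypothetical helper, the `0.5` threshold is an arbitrary assumption, and it reads only fields that appear in the example above.

```python
def review_reasons(response: dict, min_alignment: float = 0.5) -> list[str]:
    """Collect human-readable reasons to review a snippet more carefully.

    `min_alignment` is an assumed threshold, not a value Codalog documents.
    An empty result means nothing was flagged, not that the code is good.
    """
    reasons = []
    alignment = response.get("exemplar_alignment", {})
    signals = response.get("signals", {})

    score = alignment.get("alignment_score")
    if score is not None and score < min_alignment:
        reasons.append(f"alignment_score {score} is below {min_alignment}")

    # "low" is the only coverage_confidence value shown in the example.
    if alignment.get("coverage_confidence") == "low":
        reasons.append("exemplar coverage confidence is low")

    for concern in signals.get("concerns", []):
        reasons.append(f"concern: {concern}")

    return reasons
```

Applied to the example response, this would flag both the alignment score and the low coverage confidence, which matches the intent that weak alignment triggers review instead of rejection.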

Curated Exemplar Coverage

The public product surface is centered on the curated good-code pool Codalog actually uses today, not on every public dataset that might be useful later.

Starter Language Coverage

The initial curated pool covers JavaScript, TypeScript, Python, Go, Rust, Java, and C#. Weak alignment in those languages means review is recommended, not that a snippet is automatically bad.

Coverage Metadata Only

The catalog endpoint exists for lightweight corpus coverage checks. The real product surface is `POST /api/codalog/analyze`, not a public dataset browser.