Guide · A buyer's diagnostic, not a glossary

Every ATS in 2026 says transparent. Almost none can hand you the row.

Walk into any AI ATS demo this year and the slide deck has the same three words: transparent, explainable, audit-ready. Nobody says what the words mean. The 2026 hiring law cluster (NYC Local Law 144, Illinois HB 3773, Colorado CAIA, EU AI Act Annex III) does not grade vendors on adjectives. It grades them on a data shape: can the system produce, on demand, a per-decision record for one candidate that names which AI claims drove the call, which spans of the resume were the evidence, what the human reviewer changed, and when. This page is a 9-question diagnostic for telling a vendor's transparency claim from the slogan, with the row shape that decides each question.

Matthew Diakonov
11 min read

Direct answer · Verified 2026-05-06

How to tell a real ATS transparency claim from a marketing one, in one move.

Ask the vendor to produce one real candidate's per-decision audit record, queried by candidate_id, in the live product. If the response is a confidence score and a free-text summary, the transparency claim is a slogan. If the response is a per-claim ledger with weights, classifications, evidence pointers, and a separate override row carrying reviewer_id and timestamp, the claim is a data structure. The shape is the test. Verified against the public NYC DCWP Local Law 144 final rules and EU AI Act Annex III conformity-assessment language as of May 2026.

The transparency word means three different things in three vendor decks

Vendor decks in 2026 use transparency to mean three things, often on the same slide. The first is that the model is explainable, in the narrow sense that it can produce a paragraph describing why it scored a candidate the way it did. The second is that the workflow has a human in the loop, which usually means a recruiter clicks through stages in a Kanban view. The third is that the vendor commits to publishing a bias audit summary once a year. Each of those is a real thing. None of them is what the 2026 hiring law cluster is grading.

What the laws grade is whether the vendor can hand a regulator a per-candidate audit record after a complaint. The record has to name the AI tool and its version, the claims the AI inferred, the spans of the resume the AI used, the weight on each claim, the recruiter's override with timestamp and reviewer ID, and the candidate notice receipt. None of those are paragraphs. All of them are rows. A vendor whose transparency claim does not produce a row will fail the test the moment the test arrives.

The mismatch is mechanical. The vendor markets the paragraph because the paragraph is what shows up in the demo. The law cares about the row because the row is what survives the request. The diagnostic below is a way to turn the demo into the row.

Marketing transparency vs data-layer transparency

Read the right column as the diagnostic. Every question in the 9-point check below maps back to one of these rows.

| Feature | Marketing transparency (fails) | Data-layer transparency (passes) |
| --- | --- | --- |
| What the vendor produces on request | A regenerated paragraph and a 0-100 fit score per candidate. | A stored claim ledger plus an override log, queryable by candidate_id. |
| Where the rubric lives | Implicit inside the model; weights are not visible. | Explicit as 5-15 weighted claims, each with classification. |
| What evidence looks like | Sentences in a summary, recomputed at request time. | Span text, span offset, source, and a hash that survives deletion. |
| How recruiter overrides are stored | Free-text comment field or in-place edit of AI output. | Separate logged row with reviewer_id, role, prior, new, timestamp. |
| What survives candidate-data deletion | Nothing scoped to the decision; the row was the resume. | Claim text, classification, weight, span hash, override row. |
| How candidate notice is recorded | An email log in a marketing automation tool. | A receipt row tied to the candidate_id and the AEDT version. |
| Refresh of the bias audit | Annual PDF posted somewhere on the marketing site. | Annual public summary plus the per-decision data behind it. |
| Failure mode in front of a regulator | Confidence score and login log; the AI is left undefended. | Eleven rows of claim, two override entries, one notice receipt. |

Both columns coexist in the market. The left column is what most ATS vendors call transparency in 2026. The right column is what 2026 hiring law actually rewards.

The row a transparent ATS has to be able to produce

The schema below is the minimum viable shape. It is the anchor the entire diagnostic depends on. A vendor that can map every question on the 9-point check to a column in this schema has a transparent system. A vendor that cannot is selling adjectives.

claim_row.ts
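A minimal TypeScript sketch of that shape, with field names taken from the prose on this page and the FAQ; treat the exact names and types as illustrative rather than a shipped schema.

```typescript
// A minimal sketch of the per-decision claim row; names and types follow the
// prose on this page and are illustrative, not a shipped schema.

type Classification = "must_have" | "nice_to_have" | "red_flag";

interface EvidenceSpan {
  source: string;            // e.g. "resume" or "cover_letter"
  span_offset: number;       // where in the source the span starts
  span_text: string | null;  // nulled after a GDPR Article 17 erasure of the source
  span_hash: string;         // survives deletion of the source text
}

interface ClaimRow {
  claim_id: string;
  candidate_id: string;
  job_id: string;
  aedt_version: string;      // which tool version produced the claim
  claim_text: string;        // one testable claim from the JD and hiring criteria
  classification: Classification;
  weight: number;            // visible numeric weight, not an implicit model weight
  evidence: EvidenceSpan;
}

// The override is a separate logged row pointing back at the claim,
// never an in-place edit of the AI's output.
interface HumanOverride {
  claim_id: string;
  reviewer_id: string;
  role: string;
  prior_weight: number;
  new_weight: number;
  at: string;                // ISO-8601 timestamp
}
```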

Three things in that schema do most of the work. The evidence span hash survives resume deletion: when the candidate exercises a GDPR Article 17 right to erasure on the resume itself, the claim row keeps the proof of what the AI saw without retaining the source text. The human_override is a separate row, not a field replacement, so the original AI decision and the human decision both stay in the ledger. The classification is encoded as must_have, nice_to_have, or red_flag, not as a numeric heuristic, so a reviewer can argue with the rubric instead of arguing with a number.
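A sketch of how that erasure property can work, assuming the hash is a SHA-256 digest of the span text taken at scoring time; the digest choice is an assumption, not something the schema above dictates.

```typescript
import { createHash } from "node:crypto";

// Hash the evidence span when the claim row is written, before any erasure request.
function hashSpan(spanText: string): string {
  return createHash("sha256").update(spanText, "utf8").digest("hex");
}

// On a GDPR Article 17 request, drop the source text but keep the claim,
// its classification and weight, the hash, and the override history.
function eraseSpanText<T extends { evidence: { span_text: string | null; span_hash: string } }>(
  row: T
): T {
  return { ...row, evidence: { ...row.evidence, span_text: null } };
}
```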

Most ATS systems today produce one floating-point match score per candidate plus a paragraph. That output is a marketing artifact dressed up as an audit artifact. When a candidate complaint arrives and the regulator asks which claims drove the rejection, the floating point has no answer. The ledger has eleven.


The audit that fails in 2026 is the one where the regulator asks for the row that explains the rejection and the ATS hands over a confidence score and a SOC 2 report.

Pattern across NYC LL144 candidate complaints and emerging Illinois HB 3773 cases

The two API responses, side by side

Below is the literal call you should ask any AI ATS vendor to make in their live product against one real candidate from last quarter. The first response passes the diagnostic. The second response is what most vendors hand you when you push past the slide deck.

Pass: data-layer transparency
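A sketch of what a passing response might look like, trimmed to one claim; the endpoint, identifiers, and values are made up to show the shape described on this page, not any specific vendor's API.

```typescript
// GET /audit/decisions?candidate_id=cand_8241   (illustrative endpoint)
const passResponse = {
  candidate_id: "cand_8241",
  job_id: "req_117",
  aedt_version: "2026.03",
  claims: [
    {
      claim_id: "clm_01",
      claim_text: "Has run a production Kubernetes migration",
      classification: "must_have",
      weight: 0.18,
      evidence: {
        source: "resume",
        span_offset: 412,
        span_text: "led the migration of 40 services to Kubernetes",
        span_hash: "9f2c44a1", // truncated for the example
      },
    },
  ],
  overrides: [
    {
      claim_id: "clm_01",
      reviewer_id: "rev_44",
      role: "recruiter",
      prior_weight: 0.18,
      new_weight: 0.25,
      at: "2026-02-11T14:03:22Z",
    },
  ],
  notice_receipt: {
    candidate_id: "cand_8241",
    aedt_version: "2026.03",
    sent_at: "2026-02-09T09:10:00Z",
  },
};
```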
Fail: paragraph plus a number
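And the failing shape, sketched for contrast: a score and a paragraph regenerated at request time, plus a stage change with nothing tied to a stored claim, span, or reviewer.

```typescript
// The same request answered by the failing shape (illustrative values).
const failResponse = {
  candidate_id: "cand_8241",
  fit_score: 87,
  summary:
    "Strong overall match based on relevant experience and demonstrated " +
    "leadership in similar roles.", // recomputed now, not a stored audit record
  stage_history: [{ from: "applied", to: "rejected", at: "2026-02-11T14:05:40Z" }],
  // no claims, no weights, no evidence spans, no reviewer_id, no notice receipt
};
```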

The pass response answers a regulator's per-candidate question by handing over the rubric, the evidence pointers, the override log, and the notice receipt. The fail response answers nothing the regulator asked. It regenerates a paragraph at request time, which is not a stored audit record, and offers a stage transition with no link to which AI claim caused it.

The 9-question diagnostic for any ATS transparency claim

  • Show me one real candidate's per-decision audit record from last quarter, queried by candidate_id, in the live product. Pass: a stored ledger comes back. Fail: a paragraph regenerates on request.
  • Show me where the rubric is stored. Point at the table, the column, and one row. Pass: 5 to 15 weighted claims with classifications. Fail: the rubric is implicit in model weights.
  • Show me the evidence object on one claim. I want span_text, span_offset, source, and span_hash. Pass: those fields exist. Fail: the evidence is sentences in a summary.
  • Show me a recruiter override on a real claim. I want a separate row, not a soft delete or an in-place edit. Pass: prior weight, new weight, reviewer_id, timestamp. Fail: a free-text comment.
  • Tell me what your data shape does after the candidate exercises GDPR Article 17 erasure on the resume. Pass: the claim row plus span_hash survives. Fail: the audit trail dies with the resume.
  • Show me the candidate notice receipt for that candidate. Pass: a row tied to candidate_id and the AEDT version with a timestamp. Fail: an email log in a marketing tool, or no record.
  • Show me the bias audit summary you publish under NYC Local Law 144 and the per-decision data behind it. Pass: both. Fail: an annual PDF and no decision-level data, or vice versa.
  • Tell me what the override row looks like when a recruiter approves the AI's score without changing anything. Pass: still a distinct row, with reviewer_id and timestamp. Fail: doing nothing and actively approving are stored identically.
  • Tell me which fields a regulator can ask for without forcing a full data export. Pass: at minimum claim_id, evidence.span_hash, override.reviewer_id, override.at (a projection of exactly those fields is sketched after this list). Fail: vague references to internal logs.
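A sketch of that question-9 projection against the row shape sketched earlier on this page; the names follow this page's prose, and the point is that the slice answers the request without exporting the resume or the rest of the candidate profile.

```typescript
// Illustrative: the regulator-facing slice of one decision, nothing else.
interface RegulatorView {
  claim_id: string;
  span_hash: string;
  reviewer_id: string | null;   // null when no override row exists for the claim
  overridden_at: string | null;
}

function toRegulatorView(row: {
  claim_id: string;
  evidence: { span_hash: string };
  override?: { reviewer_id: string; at: string };
}): RegulatorView {
  return {
    claim_id: row.claim_id,
    span_hash: row.evidence.span_hash,
    reviewer_id: row.override?.reviewer_id ?? null,
    overridden_at: row.override?.at ?? null,
  };
}
```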

Marketing transparency

Adjectives at the top of the deck

Transparent. Explainable. Audit-ready. Human in the loop. Words on a slide that point at no specific row, no specific table, and no specific column. Survives a procurement security review. Does not survive a candidate complaint. Costs nothing to the vendor and nothing to the buyer until the request arrives.

Data-layer transparency

A row a regulator can ask for

Per-decision claim ledger, override row with reviewer_id and timestamp, evidence span hash that survives resume deletion, candidate notice receipt scoped to the AEDT version. The shape that answers an NYC DCWP request, an Illinois Department of Human Rights inquiry, a Colorado AG impact assessment, and an EU national authority Annex III review with the same query.

The honest counterargument, and why it does not save you at renewal

A reasonable procurement officer can argue that 2026 enforcement of these laws has been thinner than the statutory text suggests. NYC Local Law 144 enforcement has been concentrated on a small number of complaints. Illinois HB 3773, effective January 1, 2026, is new. Colorado SB 24-205 was amended during the 2025 session and lands with portions phased into 2026. The EU AI Act's hiring obligations are still phasing in through 2026 and 2027. The argument is that the per-decision audit shape is a leading indicator, not a failing-grade question this quarter.

The argument is correct on enforcement volume today and wrong on procurement direction. Boards accountable for hiring data and customers running procurement security reviews are already asking the per-decision question because the law is public, the shape is set, and a vendor that cannot produce the artifact today has to ship a roadmap to do so by renewal. A 2026 ATS purchase with a 12-month contract is a 2027 renewal conversation. The transparency-claim diagnostic is what protects the buyer at renewal, not what protects the buyer this Tuesday.

The shorter version: transparency-as-slogan was free while the laws were proposed. It is not free now that the laws are statutes and the data shape is in the statutory text.

Where 10xats sits on the diagnostic

Match Rating is the claim-ledger agent. It extracts 5 to 15 testable claims from each JD plus the org's hiring criteria, attaches a visible numeric weight to each, classifies must-have versus nice-to-have versus red flag, pulls the resume span that supports each claim, and records every recruiter override as a separate logged row with reviewer_id, prior weight, new weight, and timestamp. The approval queue covers the rest of the agents (sourcing drafts, scheduling drafts) on the same row-per-action shape.

The product is in development. Visitors join the waitlist on 10xats.com for first access and the first published price. Pricing is published, not gated behind a demo: Starter $0 up to three open reqs, Growth $99 founding ($399 after), Enterprise custom. Every feature is available on every plan, including the MCP server and the Claude / ChatGPT integrations. The contractual posture in the privacy policy is no AI training on customer data. The terms require human oversight on every AI-influenced decision.

If you are evaluating an AI ATS this quarter, run the 9-question diagnostic on whoever you are about to sign with. If they pass, sign. If they fail, the renewal conversation in 12 months will be unpleasant.

Run the diagnostic on your shortlist before you renew.

Bring the AI ATS vendors on your shortlist. We walk the 9 questions on each of them, in the live product, against one real candidate from last quarter. No deck.

Frequently asked questions

How do I evaluate an ATS vendor's transparency claim in one move?

Ask the vendor to produce one real candidate's per-decision audit record, queried by candidate_id, in the actual product (not a marketing screenshot). If the response is a confidence score, a similarity percentage, or a free-text notes field, the transparency claim is a slogan. If the response is a list of testable claims with weights, classifications, evidence pointers, and a separate override row carrying reviewer_id and timestamp, the claim is real. The whole diagnostic on this page is variations on that one move.

What is the difference between transparency and explainability in AI hiring?

Most vendor decks use the words interchangeably and most laws do not. Explainability is a property of the model: can the model produce a human-readable account of why it scored a candidate the way it did. Transparency is a property of the workflow: are the rubric, the score, the evidence, and the human override visible as data the recruiter, the candidate, and the regulator can each query. A model can be explainable in a vacuum (it returns a paragraph) and still produce an opaque workflow (the paragraph is not stored, the rubric is not visible, the override is a free-text comment). The 2026 hiring law cluster grades transparency, not explainability. The data shape is what survives a request.

Which laws actually require this kind of transparency?

Four anchors and a thickening pile around them. NYC Local Law 144 requires a per-deployment bias audit summary, candidate notice that an automated employment decision tool is in use, and a regulator-readable record per decision. Illinois HB 3773 makes discriminatory AI hiring an Illinois Human Rights Act violation effective January 1, 2026, with a notice obligation. Colorado SB 24-205 imposes deployer obligations including documented impact assessments and consumer notice on high-risk hiring AI. EU AI Act Annex III lists hiring as high-risk and triggers conformity assessment, transparency, and post-market monitoring. None of those four are satisfied by a SOC 2 report and a confidence score.

If a vendor says explainable AI in their demo, what do I ask next?

Two questions, in this order. First: show me the data structure your explanation lives in. If the answer is a paragraph rendered at request time, the explanation is recomputed and not auditable. If the answer is a stored per-claim record, keep going. Second: show me one candidate's record from last quarter. Not a hypothetical, not a demo seed, a real candidate. The vendor that can pull a six-month-old per-decision record on demand has a transparent system. The vendor that cannot has a marketing claim.

Why is one similarity score not transparency?

Because a single number compresses every signal that drove it into a value with no internal structure. The recruiter cannot argue with the rubric. The candidate cannot ask which claim disqualified them. The regulator cannot reconstruct which spans of the resume the model used. The hiring manager cannot override one weight without overriding the whole score. The number is the result of the rubric; the rubric is what transparency is supposed to expose. Surfacing the result and hiding the rubric is the inverse of transparency.

What is a claim ledger and why does the diagnostic anchor on it?

A claim ledger is the data structure a transparent AI ATS produces every time it scores a candidate against a job. One ledger per (candidate, job) pair. Each row is one testable claim extracted from the job description and the org's hiring criteria. Six fields: claim_id, claim_text, classification (must-have, nice-to-have, or red flag), a visible numeric weight, an evidence object pointing at the resume span the AI relied on, and a separate human_override row with reviewer_id, timestamp, and decision. The diagnostic anchors on this shape because every question a 2026 regulator will ask after a candidate complaint maps to a column in that schema. If the schema is not there, the answer to the regulator is not there either.

Is human in the loop a transparency claim worth anything?

Only if the loop is a row, not a slogan. Most vendors call any reviewer presence in the workflow human in the loop. The 2026 laws care about a narrower thing: did a named human have the information and the authority to override, did the override actually control the outgoing decision, and is the override stored as a separate logged row tied back to the AI's first answer. If the override is a soft delete, an in-place edit, or a free-text comment, the human is advisory and the loop does not count. If the override is its own row with reviewer_id and timestamp, the loop counts. The shape decides, not the slogan.
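A sketch of what that no-change approval can look like as its own row; the decision label is illustrative, and the point is that the prior and new weights match while the reviewer and timestamp are still recorded.

```typescript
// Illustrative: approval with no change is still its own logged row.
const approvalRow = {
  claim_id: "clm_07",
  reviewer_id: "rev_44",
  role: "hiring_manager",
  prior_weight: 0.12,
  new_weight: 0.12,               // unchanged, but the act of approving is recorded
  decision: "approved_no_change", // label is illustrative
  at: "2026-03-02T10:41:05Z",
};
```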

Does claim-by-claim transparency slow recruiters down?

It compresses the slow part. The slow part of recruiting on a 12- to 25-req team is defending a rejection in a debrief, answering a candidate's why, or drafting a notice during a regulator request. Each of those costs hours when the artifact is a confidence score. Each costs minutes when the artifact is an eleven-row ledger with two reviewer overrides. The first time a recruiter pastes a JD and gets a proposed rubric, the edit pass takes about ten minutes; after a few reqs the proposed weights start landing close and the edit drops under three minutes. Scoring inbound applicants against the saved ledger is one batched call, not per-candidate clicking.

What about open-source AI scoring tools that claim transparency?

Source code visibility is a useful but separate property. A model whose weights are public can still produce an opaque workflow if the application that wraps it does not store the rubric, the evidence span, or the override. Conversely, a closed-source model wrapped by an application that writes the per-claim ledger and the override row is transparent at the workflow layer the laws actually grade. Treat open-source as a procurement-friendly bonus, not a substitute for the claim ledger and override log. The audit asks about the shape of the record, not the license of the inference engine.

Where does 10xats sit in this diagnostic?

Match Rating is the claim-ledger agent. It extracts 5 to 15 testable claims from each JD plus the org's hiring criteria, attaches a visible numeric weight to each, classifies must-have versus nice-to-have versus red flag, finds the resume span that supports the claim, and records the recruiter override as a separate logged row with reviewer_id and timestamp. The approval queue is the human override surface across the rest of the agents (sourcing, scheduling). Pricing is published, not gated behind a demo: Starter $0 up to three open reqs, Growth $99 founding, Enterprise custom. The product is in development; visitors join the waitlist on 10xats.com for first access.