Testing Guide

Automated vs manual testing: what each catches

Automated tools catch 30-50% of accessibility issues. Here's what that means for your testing strategy — and why both approaches are necessary.

The Data

The ~40% reality

Industry research consistently shows that automated accessibility testing has significant limitations.

  • 30-50% of accessibility issues are found by automated testing (industry research consensus)
  • ~70% of WCAG criteria require human judgment to properly evaluate (UsableNet analysis)
  • 86 total success criteria in WCAG 2.2, Levels A, AA, and AAA combined (W3C specification)

Why the gap exists

Automated tools are excellent at checking objective, technical criteria: Is there alt text? Does color contrast meet ratios? Is there a form label? But accessibility is ultimately about whether real people can use your content — and that often requires human judgment.

Questions like "Is this alt text actually meaningful?" or "Can a screen reader user understand this interaction flow?" can't be answered by checking code alone. They require context, interpretation, and testing with actual assistive technologies.

Automated Testing

What automated tools reliably catch

These are the categories where automated scanning excels — issues with clear, objective pass/fail criteria.

Missing text alternatives

Automated tools reliably detect when required text is missing entirely.

  • Images without alt attributes
  • Form inputs without associated labels
  • Buttons and links without accessible names
  • ARIA labels that reference missing IDs
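The checks above boil down to "is there any accessible name at all?" Below is a much-simplified sketch of that idea; the `ElementInfo` shape and `hasAccessibleName` helper are invented for illustration, and real engines such as axe-core implement the full W3C accessible-name computation against the live DOM.

```typescript
// Simplified accessible-name presence check, loosely modeled on the
// W3C accname algorithm. ElementInfo is a made-up shape for illustration.
interface ElementInfo {
  tag: string;
  alt?: string;             // images: alt="" deliberately marks decorative
  ariaLabel?: string;
  labelledByIds?: string[]; // aria-labelledby references
  text?: string;            // visible text content
}

function hasAccessibleName(el: ElementInfo, existingIds: Set<string>): boolean {
  // aria-labelledby wins, but only if every referenced ID actually exists
  if (el.labelledByIds?.length) {
    return el.labelledByIds.every((id) => existingIds.has(id));
  }
  if (el.ariaLabel?.trim()) return true;
  if (el.tag === "img") return el.alt !== undefined; // missing alt entirely fails
  return Boolean(el.text?.trim()); // buttons/links: visible text names them
}
```

Note that `hasAccessibleName({ tag: "img" }, new Set())` fails while an explicit `alt: ""` passes: automated tools can tell "missing" from "intentionally empty", but not whether non-empty alt text is any good.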

Color contrast failures

Programmatic color analysis can measure precise contrast ratios against WCAG requirements.

  • Text that fails WCAG contrast ratios (4.5:1 for normal text, 3:1 for large text)
  • Link text indistinguishable from surrounding text
  • Focus indicators with insufficient contrast
  • Graphics and UI components below 3:1 ratio
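These checks are fully mechanical because WCAG defines the math. A sketch of the spec's relative-luminance and contrast-ratio formulas, with colors as `[r, g, b]` values in 0-255:

```typescript
// WCAG 2.x relative luminance: sRGB channels are linearized, then
// weighted. Formula is taken directly from the WCAG definition.
function relativeLuminance([r, g, b]: number[]): number {
  const lin = (c: number) => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : ((s + 0.055) / 1.055) ** 2.4;
  };
  return 0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b);
}

// Contrast ratio = (lighter + 0.05) / (darker + 0.05), from 1:1 to 21:1.
function contrastRatio(fg: number[], bg: number[]): number {
  const [hi, lo] = [relativeLuminance(fg), relativeLuminance(bg)].sort((x, y) => y - x);
  return (hi + 0.05) / (lo + 0.05);
}
```

Black on white yields exactly 21:1, and #767676 on white comes out around 4.54:1, just over the 4.5:1 threshold for normal text.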

Structural HTML issues

DOM analysis reveals structural problems that affect assistive technology parsing.

  • Heading hierarchy problems (skipped levels)
  • Missing language declarations
  • Duplicate IDs causing ARIA reference failures
  • Tables without proper headers
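The skipped-level rule, for instance, is a purely structural check. A minimal sketch, with heading levels as numbers (1 for h1, and so on):

```typescript
// Flag skipped heading levels: each heading may go at most one level
// deeper than the previous one (h1 -> h2 is fine; h1 -> h3 skips h2).
// Going back up by any amount is allowed. Returns offending indices.
function skippedHeadings(levels: number[]): number[] {
  const skips: number[] = [];
  for (let i = 1; i < levels.length; i++) {
    if (levels[i] > levels[i - 1] + 1) skips.push(i);
  }
  return skips;
}
```

So `[1, 2, 4]` flags the h4 at index 2, while `[1, 2, 3, 2, 3]` is clean. Many checkers additionally warn when a page doesn't start at h1; that variant is omitted here.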

ARIA implementation errors

Automated tools can validate ARIA syntax and structure against the specification.

  • Invalid ARIA attribute values
  • Required ARIA properties that are missing
  • Conflicting roles and properties
  • ARIA references pointing to non-existent elements
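Value validation in particular is a straightforward table lookup. The sketch below covers just three attributes; the WAI-ARIA spec defines allowed tokens for every attribute, and `invalidAriaValues` is a made-up helper, not a real tool's API:

```typescript
// Allowed token values for a tiny subset of ARIA attributes.
// (The spec also permits omitting these attributes entirely; only
// present-but-invalid values are flagged here.)
const ALLOWED: Record<string, string[]> = {
  "aria-checked": ["true", "false", "mixed"],
  "aria-expanded": ["true", "false"],
  "aria-haspopup": ["false", "true", "menu", "listbox", "tree", "grid", "dialog"],
};

// Return the names of attributes whose value isn't in the allowed set.
function invalidAriaValues(attrs: Record<string, string>): string[] {
  return Object.entries(attrs)
    .filter(([name, value]) => ALLOWED[name] && !ALLOWED[name].includes(value))
    .map(([name]) => name);
}
```

For example, `aria-checked="yes"` is flagged as invalid, while `aria-expanded="true"` passes and attributes outside the table are ignored.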

Automated scanning is valuable precisely because it catches these issues quickly, consistently, and at scale.

Manual Testing

What requires human judgment

These areas can't be fully evaluated by automated tools — they require testing with real assistive technologies and human assessment.

Keyboard navigation flow

Can users navigate logically through the page using only a keyboard?

  • Tab order follows visual reading order
  • Focus doesn't get trapped in modals or widgets
  • Custom components are fully keyboard operable
  • Skip links work and target correct content
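One common reason tab order drifts from reading order is HTML's sequential-focus rule: elements with a positive tabindex jump the queue ahead of everything else. A simplified sketch of that ordering (it ignores tabindex="-1" and which elements are naturally focusable):

```typescript
// Effective keyboard tab order per HTML's sequential-focus rules:
// positive tabindex values come first in ascending order (ties keep
// DOM order; Array.sort is stable in modern engines), then tabindex=0
// elements in DOM order.
interface Focusable { id: string; tabindex: number; } // 0 = natural order

function tabOrder(domOrder: Focusable[]): string[] {
  const positive = domOrder
    .filter((e) => e.tabindex > 0)
    .sort((a, b) => a.tabindex - b.tabindex);
  const natural = domOrder.filter((e) => e.tabindex === 0);
  return [...positive, ...natural].map((e) => e.id);
}
```

A tool can compute this order mechanically; whether the resulting sequence matches the visual reading order on a rendered page is exactly the part that still needs a human.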

Why manual: Automated tools can detect if elements are technically focusable, but can't assess whether the navigation experience makes sense to a real user.

Screen reader coherence

Does the content make sense when read aloud sequentially?

  • Content flows logically without visual context
  • Interactive elements announce their purpose clearly
  • Dynamic content changes are communicated appropriately
  • Complex layouts maintain meaning when linearized

Why manual: Screen readers interpret pages differently than visual rendering. Only testing with actual screen readers reveals how content is experienced.

Focus management

Is focus handled correctly during dynamic interactions?

  • Focus moves to modal when opened
  • Focus returns appropriately when modal closes
  • Focus moves to error messages after form validation
  • Single-page app navigation manages focus on route changes
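The bookkeeping behind "focus returns appropriately" is often just a stack. A toy sketch using element IDs as stand-ins for real DOM elements; `FocusStack` is an invented name for the pattern, not a standard API:

```typescript
// Minimal focus-restore bookkeeping for layered dialogs: remember what
// had focus before each modal opened, and hand it back on close.
class FocusStack {
  private stack: string[] = [];

  open(previouslyFocusedId: string): void {
    // record before moving focus into the modal
    this.stack.push(previouslyFocusedId);
  }

  close(): string | undefined {
    // the element to re-focus once the modal closes
    return this.stack.pop();
  }
}
```

Nested dialogs unwind in reverse order: the inner modal restores focus first, the outer one last. Whether that restored focus actually makes sense in context is the manual part.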

Why manual: Proper focus management requires understanding user intent and interaction context — something automated tools cannot infer.

Dynamic content accessibility

Are changes to the page communicated to assistive technology users?

  • Loading states announced via live regions
  • Form validation errors read aloud
  • Notifications don't interrupt user tasks inappropriately
  • Infinite scroll or lazy loading handled accessibly

Why manual: The timing, frequency, and appropriateness of announcements requires human judgment about user experience.

Cognitive load assessment

Is the content understandable and the interface predictable?

  • Instructions are clear before complex interactions
  • Error messages help users understand how to fix issues
  • Navigation is consistent across pages
  • Time limits are appropriate or can be extended

Why manual: Cognitive accessibility is about comprehension and mental load — concepts that require human evaluation.

Alt text quality

Does alternative text actually convey the image's meaning?

  • Alt text describes the image's purpose in context
  • Decorative images are marked as such
  • Complex images have adequate descriptions
  • Alt text isn't redundant with surrounding content

Why manual: Automated tools can detect presence of alt text, but can't evaluate whether it's meaningful, accurate, or appropriately concise.

WCAG 2.2

New criteria: automatable or not?

WCAG 2.2 added 9 new success criteria: six at Levels A and AA, plus three at Level AAA. Here's how the Level A and AA criteria break down for automated testing.

2.4.11 Focus Not Obscured (Minimum) · Level AA · Automatable: Partial

Can detect some cases where sticky elements might obscure focus, but can't reliably test all scroll/focus combinations.

2.5.7 Dragging Movements · Level AA · Automatable: No

Requires testing whether single-pointer alternatives exist and work correctly; this needs manual interaction testing.

2.5.8 Target Size (Minimum) · Level AA · Automatable: Yes

CSS dimensions can be measured programmatically. One of the more automatable new criteria.

3.2.6 Consistent Help · Level A · Automatable: Partial

Can detect the presence of help mechanisms, but determining "same relative order" across pages requires human verification.

3.3.7 Redundant Entry · Level A · Automatable: No

Requires understanding a form's purpose and whether data should be auto-populated; this is a context-dependent assessment.

3.3.8 Accessible Authentication (Minimum) · Level AA · Automatable: No

Evaluating whether authentication requires "cognitive function tests" needs human judgment about the task's nature.
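The size check itself is simple, which is part of why 2.5.8 automates well. The sketch below checks only the 24x24 CSS-pixel floor and omits the criterion's spacing, inline, and equivalent-control exceptions; `meetsMinimumTargetSize` is a hypothetical helper, not taken from any tool:

```typescript
// WCAG 2.2 SC 2.5.8 Target Size (Minimum): pointer targets should be
// at least 24x24 CSS pixels. Real tools read these dimensions from
// computed layout; the exceptions (spacing, inline text links, etc.)
// are what push the criterion toward "automatable with caveats".
function meetsMinimumTargetSize(widthPx: number, heightPx: number): boolean {
  return widthPx >= 24 && heightPx >= 24;
}
```

A 44x44 button passes comfortably, 24x24 sits exactly on the floor, and a 20px-wide icon button fails.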

Key takeaway: Of the 6 new Level A and AA criteria in WCAG 2.2, only one (Target Size) is fully automatable. The rest require partial or full manual testing. This pattern — where newer criteria often address more nuanced accessibility concerns — suggests that manual testing will remain essential even as automated tools improve.

Strategy

A practical approach

How to combine automated and manual testing for effective accessibility coverage.

Use automated scanning for continuous monitoring

Run automated scans regularly — ideally as part of your CI/CD pipeline. This catches regressions quickly and ensures new code doesn't introduce obvious accessibility issues. It's efficient, consistent, and scalable.

  • Catches ~40% of issues automatically
  • Provides consistent baseline across pages
  • Identifies issues to prioritize for manual review

Target manual testing at high-impact areas

You can't manually test everything with limited resources. Focus manual testing on critical user paths, complex interactive components, and pages that automated scanning flags as problematic.

  • Test key user journeys (signup, checkout, core features)
  • Review complex widgets (modals, carousels, custom forms)
  • Validate with actual screen readers (NVDA, VoiceOver, JAWS)

Document and track everything

Maintain records of both automated scan results and manual testing findings. This creates an audit trail that demonstrates ongoing compliance efforts — important for legal teams and regulators.

  • Track issues from discovery through remediation
  • Document known limitations and planned fixes
  • Show progress over time with historical data

Our Approach

Why we built inclly this way

Our tool philosophy is shaped by what automated testing can and can't do.

Automated scanning to catch the ~40%

We use axe-core, the industry-standard accessibility testing engine, to scan for issues that can be reliably detected programmatically. This catches missing alt text, contrast failures, structural issues, and ARIA errors.

AI-powered prioritization for what needs attention

Not all issues are equally important. We help prioritize based on severity, impact, and frequency — so you know where to focus manual testing efforts and remediation work.

Honest flagging of what requires human judgment

We don't pretend automated tools can catch everything. Our reports clearly indicate which issues are confirmed violations versus which areas need manual review. We tell you what we can't test.

Automated scanning is the essential first step — you can't fix what you don't know about. But it's not the complete solution. inclly is built to fit into a broader accessibility strategy, not to replace one.

Frequently asked questions

Common questions about accessibility testing approaches.

If automated tools only catch 30-50%, are they worth using?

Catching 30-50% of issues automatically, at scale, with no manual effort is valuable. Automated scanning catches the low-hanging fruit quickly and consistently, freeing your team to focus manual testing on the areas that actually need human judgment. The tools complement each other.

Which automated accessibility testing tools are best?

axe-core (by Deque) is the industry standard and powers most commercial accessibility scanners including inclly. It has the lowest false positive rate and most comprehensive rule coverage. WAVE, Lighthouse, and Pa11y are also reputable tools with different strengths.

How often should I run automated scans?

Ideally, as part of every deployment through CI/CD integration. At minimum, scan weekly or before major releases. Continuous scanning catches regressions early when they're cheapest to fix. Point-in-time audits miss issues introduced between scans.

What screen readers should I test with manually?

For comprehensive coverage: NVDA (free, Windows) with Firefox or Chrome, VoiceOver (built into macOS/iOS) with Safari, and JAWS (paid, Windows) if your audience includes enterprise users. At minimum, test with one screen reader on desktop and one on mobile.

Can AI fully automate accessibility testing?

Not yet. AI can help with tasks like suggesting alt text or identifying potential issues, but accessibility ultimately requires understanding human experience and context. AI tools can augment testing but can't replace the judgment calls that manual testing provides.

Should I aim for zero automated test failures before manual testing?

Fixing automated test failures first is efficient — these are often the easiest issues to address. But don't wait for perfection. Manual testing can uncover more serious issues that automated tools miss. A balanced approach addresses both in parallel.

Powered by axe-core

Start with automated scanning

Catch the issues that automated tools can find. Get clear reports that tell you what's broken, what's flagged for manual review, and where to focus your efforts.