Automated vs. manual testing: what each catches
Automated tools catch 30-50% of accessibility issues. Here's what that means for your testing strategy — and why both approaches are necessary.
The ~40% reality
Industry research consistently shows that automated accessibility testing has significant limitations:
- 30-50% of accessibility issues are found by automated testing (industry research consensus)
- A majority of WCAG criteria require human judgment to properly evaluate (UsableNet analysis)
- 86 total WCAG 2.2 success criteria, Levels A, AA, and AAA combined (W3C specification)
Why the gap exists
Automated tools are excellent at checking objective, technical criteria: Is there alt text? Does color contrast meet ratios? Is there a form label? But accessibility is ultimately about whether real people can use your content — and that often requires human judgment.
Questions like "Is this alt text actually meaningful?" or "Can a screen reader user understand this interaction flow?" can't be answered by checking code alone. They require context, interpretation, and testing with actual assistive technologies.
What automated tools reliably catch
These are the categories where automated scanning excels — issues with clear, objective pass/fail criteria.
Missing text alternatives
Automated tools reliably detect when required text is missing entirely.
- Images without alt attributes
- Form inputs without associated labels
- Buttons and links without accessible names
- ARIA labels that reference missing IDs
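To make the "clear, objective pass/fail" point concrete, here is a minimal sketch of the kind of check these tools run, using Python's standard `html.parser`. Real engines such as axe-core handle many more cases (CSS-hidden images, `role="presentation"`, ARIA fallbacks); this only detects `<img>` elements with no `alt` attribute at all, treating `alt=""` as intentionally decorative.

```python
from html.parser import HTMLParser

class AltChecker(HTMLParser):
    """Collects <img> tags that lack an alt attribute entirely."""
    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # alt="" is a valid way to mark a decorative image, so only
        # flag images where the attribute is absent altogether.
        if tag == "img" and "alt" not in attrs:
            self.missing.append(attrs.get("src", "<no src>"))

def images_missing_alt(html: str) -> list[str]:
    checker = AltChecker()
    checker.feed(html)
    return checker.missing
```

Because the rule is purely structural, the tool never has to understand the image: presence or absence of an attribute is a binary fact.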
Color contrast failures
Programmatic color analysis can measure precise contrast ratios against WCAG requirements.
- Text that fails WCAG contrast ratios (4.5:1 for normal text, 3:1 for large text)
- Link text indistinguishable from surrounding text
- Focus indicators with insufficient contrast
- Graphics and UI components below 3:1 ratio
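Contrast is the clearest example of an objectively automatable check, because WCAG defines the math exactly. The sketch below implements the WCAG relative-luminance and contrast-ratio formulas for sRGB colors; scanners apply the same arithmetic to computed styles.

```python
def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """WCAG relative luminance for an sRGB color given as 0-255 ints."""
    def channel(c: int) -> float:
        c = c / 255
        # Piecewise sRGB linearization defined by WCAG
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """WCAG contrast ratio: (L_lighter + 0.05) / (L_darker + 0.05)."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)
```

Black on white yields exactly 21:1, and the familiar gray #767676 on white just clears the 4.5:1 threshold for normal text.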
Structural HTML issues
DOM analysis reveals structural problems that affect assistive technology parsing.
- Heading hierarchy problems (skipped levels)
- Missing language declarations
- Duplicate IDs causing ARIA reference failures
- Tables without proper headers
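Heading-hierarchy checks are a good illustration of DOM analysis: the scanner only needs the sequence of heading levels, not the content. A simplified sketch, again with the standard `html.parser` (real tools also account for headings hidden from assistive technology and `aria-level`):

```python
from html.parser import HTMLParser

class HeadingChecker(HTMLParser):
    """Records h1-h6 heading levels in document order."""
    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))

def skipped_heading_levels(html: str) -> list[tuple[int, int]]:
    """Return (previous, current) pairs where a level was skipped,
    e.g. an h2 followed directly by an h4."""
    checker = HeadingChecker()
    checker.feed(html)
    return [
        (a, b)
        for a, b in zip(checker.levels, checker.levels[1:])
        if b > a + 1
    ]
```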
ARIA implementation errors
Automated tools can validate ARIA syntax and structure against the specification.
- Invalid ARIA attribute values
- Required ARIA properties that are missing
- Conflicting roles and properties
- ARIA references pointing to non-existent elements
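Dangling ARIA references are another mechanical check: collect every declared `id`, collect every ID reference, and report the difference. A sketch covering `aria-labelledby` and `aria-describedby` (real validators also cover `aria-controls`, `aria-owns`, and other IDREF attributes):

```python
from html.parser import HTMLParser

class AriaRefChecker(HTMLParser):
    """Collects declared ids and aria-labelledby/aria-describedby references."""
    def __init__(self):
        super().__init__()
        self.ids = set()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "id" in attrs:
            self.ids.add(attrs["id"])
        for name in ("aria-labelledby", "aria-describedby"):
            if name in attrs:
                # These attributes hold a space-separated list of IDs
                self.refs.extend(attrs[name].split())

def dangling_aria_refs(html: str) -> list[str]:
    """Return referenced IDs that no element in the document declares."""
    checker = AriaRefChecker()
    checker.feed(html)
    return [ref for ref in checker.refs if ref not in checker.ids]
```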
Automated scanning is valuable precisely because it catches these issues quickly, consistently, and at scale.
What requires human judgment
These areas can't be fully evaluated by automated tools — they require testing with real assistive technologies and human assessment.
Keyboard navigation flow
Can users navigate logically through the page using only a keyboard?
- Tab order follows visual reading order
- Focus doesn't get trapped in modals or widgets
- Custom components are fully keyboard operable
- Skip links work and target correct content
Why manual: Automated tools can detect if elements are technically focusable, but can't assess whether the navigation experience makes sense to a real user.
Screen reader coherence
Does the content make sense when read aloud sequentially?
- Content flows logically without visual context
- Interactive elements announce their purpose clearly
- Dynamic content changes are communicated appropriately
- Complex layouts maintain meaning when linearized
Why manual: Screen readers interpret pages differently than visual rendering. Only testing with actual screen readers reveals how content is experienced.
Focus management
Is focus handled correctly during dynamic interactions?
- Focus moves to modal when opened
- Focus returns appropriately when modal closes
- Focus moves to error messages after form validation
- Single-page app navigation manages focus on route changes
Why manual: Proper focus management requires understanding user intent and interaction context — something automated tools cannot infer.
Dynamic content accessibility
Are changes to the page communicated to assistive technology users?
- Loading states announced via live regions
- Form validation errors read aloud
- Notifications don't interrupt user tasks inappropriately
- Infinite scroll or lazy loading handled accessibly
Why manual: The timing, frequency, and appropriateness of announcements requires human judgment about user experience.
Cognitive load assessment
Is the content understandable and the interface predictable?
- Instructions are clear before complex interactions
- Error messages help users understand how to fix issues
- Navigation is consistent across pages
- Time limits are appropriate or can be extended
Why manual: Cognitive accessibility is about comprehension and mental load — concepts that require human evaluation.
Alt text quality
Does alternative text actually convey the image's meaning?
- Alt text describes the image's purpose in context
- Decorative images are marked as such
- Complex images have adequate descriptions
- Alt text isn't redundant with surrounding content
Why manual: Automated tools can detect presence of alt text, but can't evaluate whether it's meaningful, accurate, or appropriately concise.
New criteria: automatable or not?
WCAG 2.2 added 9 new success criteria; the 6 at Levels A and AA are shown below with how they break down for automated testing (the other 3 are Level AAA).
| Criterion | Level | Automatable? | Notes |
|---|---|---|---|
| 2.4.11 Focus Not Obscured (Minimum) | AA | Partial | Can detect some cases where sticky elements might obscure focus, but can't reliably test all scroll/focus combinations. |
| 2.5.7 Dragging Movements | AA | No | Requires testing whether single-pointer alternatives exist and work correctly — needs manual interaction testing. |
| 2.5.8 Target Size (Minimum) | AA | Yes | CSS dimensions can be measured programmatically. One of the more automatable new criteria. |
| 3.2.6 Consistent Help | A | Partial | Can detect presence of help mechanisms, but determining "same relative order" across pages requires human verification. |
| 3.3.7 Redundant Entry | A | No | Requires understanding form purpose and whether data should be auto-populated — context-dependent assessment. |
| 3.3.8 Accessible Authentication (Minimum) | AA | No | Evaluating whether authentication requires "cognitive function tests" needs human judgment about the task's nature. |
Key takeaway: Of the 6 new Level A and AA criteria in WCAG 2.2, only one (Target Size) is fully automatable. The rest require partial or full manual testing. This pattern — where newer criteria often address more nuanced accessibility concerns — suggests that manual testing will remain essential even as automated tools improve.
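Target Size is the fully automatable one precisely because CSS dimensions are measurable facts. A deliberately simplified sketch of the core rule (the real criterion has exceptions for spacing, equivalent targets, inline text links, and essential presentation that this ignores):

```python
MIN_TARGET_PX = 24  # WCAG 2.2 SC 2.5.8 Target Size (Minimum), Level AA

def target_size_ok(width_px: float, height_px: float) -> bool:
    """Simplified check: the target's bounding box is at least
    24x24 CSS pixels. Exceptions in the real criterion (spacing,
    equivalent targets, inline links) are not modeled here."""
    return width_px >= MIN_TARGET_PX and height_px >= MIN_TARGET_PX
```

In practice a scanner feeds in the rendered dimensions of each interactive element; a 44x20 px button fails, a 24x24 px icon button passes.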
A practical approach
How to combine automated and manual testing for effective accessibility coverage.
Use automated scanning for continuous monitoring
Run automated scans regularly — ideally as part of your CI/CD pipeline. This catches regressions quickly and ensures new code doesn't introduce obvious accessibility issues. It's efficient, consistent, and scalable.
- Catches ~40% of issues automatically
- Provides consistent baseline across pages
- Identifies issues to prioritize for manual review
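A common CI/CD pattern is to fail the build only on high-impact findings. The sketch below assumes scan results in axe-core's JSON output shape, where `violations` is a list of objects with an `id`, an `impact` level (`minor`, `moderate`, `serious`, `critical`), and the affected `nodes`; the threshold choice is illustrative, not a recommendation.

```python
def ci_gate(results: dict, fail_on=("critical", "serious")) -> list[str]:
    """Return failure messages for violations at the chosen impact
    levels. `results` follows axe-core's results format:
    {"violations": [{"id": ..., "impact": ..., "nodes": [...]}, ...]}."""
    failures = []
    for violation in results.get("violations", []):
        if violation.get("impact") in fail_on:
            count = len(violation.get("nodes", []))
            failures.append(
                f"{violation['id']} ({violation['impact']}): {count} instance(s)"
            )
    return failures
```

Lower-impact findings still appear in reports for triage; only the serious and critical ones block the deploy.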
Target manual testing at high-impact areas
With limited resources you can't manually test everything. Focus manual testing on critical user paths, complex interactive components, and pages that automated scanning flags as problematic.
- Test key user journeys (signup, checkout, core features)
- Review complex widgets (modals, carousels, custom forms)
- Validate with actual screen readers (NVDA, VoiceOver, JAWS)
Document and track everything
Maintain records of both automated scan results and manual testing findings. This creates an audit trail that demonstrates ongoing compliance efforts — important for legal teams and regulators.
- Track issues from discovery through remediation
- Document known limitations and planned fixes
- Show progress over time with historical data
Why we built inclly this way
Our tool philosophy is shaped by what automated testing can and can't do.
Automated scanning to catch the ~40%
We use axe-core, the industry-standard accessibility testing engine, to scan for issues that can be reliably detected programmatically. This catches missing alt text, contrast failures, structural issues, and ARIA errors.
AI-powered prioritization for what needs attention
Not all issues are equally important. We help prioritize based on severity, impact, and frequency — so you know where to focus manual testing efforts and remediation work.
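As a toy illustration of severity-times-frequency ranking (the weights and the scoring model here are invented for the example; inclly's actual prioritization is not described in this document):

```python
# Illustrative impact weights -- NOT inclly's real scoring model.
IMPACT_WEIGHT = {"critical": 10, "serious": 5, "moderate": 2, "minor": 1}

def priority_score(impact: str, affected_pages: int, occurrences: int) -> int:
    """Toy severity x frequency score for deciding what to triage first."""
    return IMPACT_WEIGHT.get(impact, 1) * affected_pages * occurrences

# (rule id, impact, pages affected, occurrences per page)
issues = [
    ("color-contrast", "serious", 40, 3),
    ("image-alt", "critical", 5, 2),
    ("html-has-lang", "moderate", 50, 1),
]
ranked = sorted(issues, key=lambda i: priority_score(*i[1:]), reverse=True)
```

Note how a "serious" issue on 40 pages can outrank a "critical" one on 5: frequency matters alongside severity.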
Honest flagging of what requires human judgment
We don't pretend automated tools can catch everything. Our reports clearly indicate which issues are confirmed violations versus which areas need manual review. We tell you what we can't test.
Automated scanning is the essential first step — you can't fix what you don't know about. But it's not the complete solution. inclly is designed to fit into a broader accessibility strategy, not to replace one.
Frequently asked questions
Common questions about accessibility testing approaches.
If automated tools only catch 30-50%, are they worth using?
Catching 30-50% of issues automatically, at scale, with no manual effort is valuable. Automated scanning catches the low-hanging fruit quickly and consistently, freeing your team to focus manual testing on the areas that actually need human judgment. The tools complement each other.
Which automated accessibility testing tools are best?
axe-core (by Deque) is the industry standard and powers most commercial accessibility scanners including inclly. It has the lowest false positive rate and most comprehensive rule coverage. WAVE, Lighthouse, and Pa11y are also reputable tools with different strengths.
How often should I run automated scans?
Ideally, as part of every deployment through CI/CD integration. At minimum, scan weekly or before major releases. Continuous scanning catches regressions early when they're cheapest to fix. Point-in-time audits miss issues introduced between scans.
What screen readers should I test with manually?
For comprehensive coverage: NVDA (free, Windows) with Firefox or Chrome, VoiceOver (built into macOS/iOS) with Safari, and JAWS (paid, Windows) if your audience includes enterprise users. At minimum, test with one screen reader on desktop and one on mobile.
Can AI fully automate accessibility testing?
Not yet. AI can help with tasks like suggesting alt text or identifying potential issues, but accessibility ultimately requires understanding human experience and context. AI tools can augment testing but can't replace the judgment calls that manual testing provides.
Should I aim for zero automated test failures before manual testing?
Fixing automated test failures first is efficient — these are often the easiest issues to address. But don't wait for perfection. Manual testing can uncover more serious issues that automated tools miss. A balanced approach addresses both in parallel.
Continue Learning
Explore related guides to build a complete accessibility testing strategy.
Testing Tools Comparison
Compare axe, WAVE, Lighthouse, Pa11y and other accessibility testing tools.
Manual Testing Guide
5 essential manual accessibility tests every developer should know.
WCAG 2.2 AA Checklist
All 55 Level A and AA success criteria with testing indicators.
Accessibility Audit Trail Guide
How to document compliance efforts your legal team will actually use.
Start with automated scanning
Catch the issues that automated tools can find. Get clear reports that tell you what's broken, what's flagged for manual review, and where to focus your efforts.