Third-Party AI Vendor Risk: The CISO's Due Diligence Checklist
Your standard vendor security questionnaire was designed for SaaS applications — not models that hallucinate, drift, and embed your organization's data in their next training run. Here's the AI-specific due diligence framework that surfaces the risks it misses.
May 12, 2026
9 min read
Patrick Parker
The procurement email arrives. A business unit lead wants to deploy a third-party AI vendor for contract review, clinical decision support, or customer communication. They've already done a demo. The team is excited. Now it's on your desk for security sign-off.
You send the standard vendor security questionnaire. The vendor returns a SOC 2 Type II report, a clean penetration test, and a data processing agreement. On paper, it looks fine. You approve it.
This is where mid-market AI procurement goes wrong. SOC 2 tells you whether the vendor's infrastructure is appropriately controlled. It tells you nothing about whether the model itself is reliable, whether your data is being used to train future versions, whether the model's outputs will drift over time, or what happens when it makes a confidently wrong decision that someone acts on.
68% of AI vendors have had at least one material model failure in the last 12 months.
41% of CISOs discovered AI data handling violations post-deployment.
$4.2M: average cost of an AI-related third-party incident in regulated industries.
Benchmarks sourced from Gartner AI Risk Management Survey 2026, Ponemon Institute AI Incident Cost Report, and Altiri client engagement data across healthcare, financial services, and defense.
Why SOC 2 Isn't Enough for AI Vendors
SOC 2 was designed to evaluate whether a software vendor has appropriate controls around access management, change management, availability, and data confidentiality. These controls matter — but they evaluate the infrastructure the model runs on, not the model itself.
AI systems introduce risks that have no equivalent in traditional SaaS:
Model drift — The model's behavior changes over time as the vendor updates weights, changes fine-tuning data, or deploys new versions. A model that was safe to use in January may produce materially different outputs in July with no notification to you. Version pinning, illustrated in the sketch at the end of this section, is the mitigation to ask for.
Training data contamination — Your data, uploaded to the vendor's platform, may be used to train or fine-tune future model versions. Your intellectual property and customer data could become part of another customer's model.
Hallucination liability — AI models produce false information with confident presentation. In regulated industries, an AI vendor's hallucination is your compliance exposure, not theirs.
Algorithmic bias — Models trained on historical data carry historical biases. In healthcare, lending, or hiring contexts, this is a discrimination liability with regulatory teeth.
Vendor lock-in and model access — If the vendor shuts down or changes pricing, can you export your data and model configurations? Or do you lose years of fine-tuning?
The illusion of coverage. A vendor with ISO 27001, SOC 2 Type II, and a signed DPA has checked the infrastructure-security boxes. None of those certifications address whether the model is reliable, whether your data trains their next version, or whether they'll notify you when model behavior changes. You need a separate AI-specific due diligence track — not a replacement for the standard one.
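To make the drift problem and its mitigation concrete, here is a minimal illustration using the OpenAI Python SDK, where a dated snapshot name pins the model while a bare alias silently tracks whatever the vendor currently serves. The model names are examples; the pattern applies to any vendor that exposes versioned models, and the question for the checklist below is whether the vendor exposes versions at all.

```python
# Minimal illustration of version pinning with the OpenAI Python SDK.
# A bare alias floats to whatever the vendor currently serves; a dated
# snapshot name stays fixed until you deliberately migrate.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Floating alias: the vendor can re-point this to a new model at any time,
# changing behavior underneath you with no action on your side.
floating = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this clause."}],
)

# Pinned snapshot: outputs stay reproducible until you opt into an upgrade.
pinned = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Summarize this clause."}],
)
```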
The AI Vendor Risk Scoring Framework
Before you apply the detailed checklist, score each AI vendor on the five dimensions that determine how much scrutiny it requires. Higher scores mean higher scrutiny before approval — not automatic rejection.
| Risk Dimension | Low (1) | Medium (2) | High (3) |
| --- | --- | --- | --- |
| Data sensitivity | No PII, no regulated data | Internal data, low PII | PHI, PII, financial data, IP |
| Decision impact | Suggestions only, human decides | Recommendations inform decisions | Automated decisions, clinical/financial |
| Model opacity | Explainable, auditable outputs | Some explanation available | Black-box, no explainability |
| Vendor maturity | Enterprise-grade, established | Growing company, some enterprise track record | Early-stage startup, new to market |
| Integration depth | Standalone tool, no system access | API integration, limited access | Deep integration, broad data access |
A vendor scoring 5–7 may pass with an expedited review; 8–10 warrants the standard review; 11–15 requires full due diligence — executive sign-off, legal review of AI-specific terms, and a formal risk acceptance before deployment. Most AI vendors touching regulated data will score 9 or above.
Use the score to calibrate effort, not to skip steps. A score of 6 means a lighter-touch review — not a review that skips model behavior, data handling, or incident notification. Every AI vendor gets the AI-specific questions. The score determines the depth of follow-up.
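As a minimal sketch of how the score and routing might be operationalized: the dimension names and thresholds come from the table above, while the function names and data shapes are illustrative rather than prescribed.

```python
# Illustrative sketch of the five-dimension risk score and review routing.
# Dimension names and thresholds mirror the scoring table; everything else
# (function names, data shapes) is hypothetical.

DIMENSIONS = (
    "data_sensitivity",
    "decision_impact",
    "model_opacity",
    "vendor_maturity",
    "integration_depth",
)

def risk_score(ratings: dict[str, int]) -> int:
    """Sum the 1-3 ratings across all five dimensions (range: 5-15)."""
    missing = [d for d in DIMENSIONS if d not in ratings]
    if missing:
        raise ValueError(f"unrated dimensions: {missing}")
    if any(not 1 <= ratings[d] <= 3 for d in DIMENSIONS):
        raise ValueError("each dimension must be rated 1, 2, or 3")
    return sum(ratings[d] for d in DIMENSIONS)

def review_track(score: int) -> str:
    """Route to a review depth. Every track still asks the AI-specific
    questions; the score only calibrates the depth of follow-up."""
    if score >= 11:
        return "full due diligence: executive sign-off, legal review, risk acceptance"
    if score >= 8:
        return "standard review"
    return "expedited review"

# Example: a clinical decision-support vendor touching PHI.
ratings = {
    "data_sensitivity": 3,   # PHI
    "decision_impact": 3,    # clinical decisions
    "model_opacity": 2,      # some explanation available
    "vendor_maturity": 2,    # growing company
    "integration_depth": 3,  # deep EHR integration
}
score = risk_score(ratings)               # 13
print(score, "->", review_track(score))   # 13 -> full due diligence: ...
```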
The Three-Phase Due Diligence Process
AI vendor due diligence runs in three phases, each building on the previous. Unlike the traditional vendor review — which is largely documentation-driven — the AI review includes behavioral verification: you're not just reading what the vendor says, you're testing what the model actually does.
Phase 1: Model and Data Governance Review
This phase addresses the AI-specific risks that standard security questionnaires miss entirely. You're asking about model lineage, training data sources, data handling during inference, fine-tuning policies, and version change notification procedures. Most vendors are not prepared for these questions — and the quality of their answers tells you a great deal about their governance maturity.
Phase 2: Infrastructure and Compliance Verification
Standard security due diligence — SOC 2, penetration testing, data residency, access controls, incident response — but with AI-specific additions: model access controls, inference audit logging, output monitoring, and breach notification procedures that explicitly cover AI incidents (not just data breaches). Request a copy of their AI incident register if they claim to have one.
Phase 3: Behavioral Testing and Contractual Alignment
Deploy a proof-of-concept in a sandboxed environment with synthetic or anonymized data. Test for accuracy, consistency, and edge-case failures specific to your use case. Simultaneously, your legal team reviews the AI-specific contractual provisions: ownership of outputs, data use rights, version change notification obligations, liability for model failures, and exit terms including data deletion and model configuration portability.
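As a sketch of what the Phase 3 behavioral test could look like in practice: the vendor client and its generate() call are hypothetical stand-ins for whatever API the vendor actually exposes. The test pattern itself, repeated calls on the same synthetic input with agreement measured against a labeled expectation, is the point.

```python
# Hypothetical sketch of a Phase 3 behavioral test: accuracy and consistency
# on synthetic, labeled inputs. `client.generate()` is a stand-in for the
# vendor's real API; run this only in a sandbox with non-production data.
from collections import Counter

def consistency_check(client, prompt: str, expected: str, runs: int = 10) -> dict:
    """Send the same synthetic input repeatedly; report accuracy and output
    stability. High variance on identical inputs is itself a finding."""
    outputs = [client.generate(prompt) for _ in range(runs)]
    counts = Counter(outputs)
    modal_output, freq = counts.most_common(1)[0]
    return {
        "accuracy": sum(o == expected for o in outputs) / runs,
        "stability": freq / runs,          # 1.0 = identical output every run
        "distinct_outputs": len(counts),
        "modal_output": modal_output,
    }

# Usage against a sandboxed deployment with anonymized test cases:
#   results = [consistency_check(client, case.prompt, case.label)
#              for case in synthetic_test_set]
# Flag any case where accuracy or stability falls below your threshold.
```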
The Master Due Diligence Checklist
This checklist covers every dimension of AI vendor risk across model governance, data handling, infrastructure security, contractual terms, and operational controls. Use it as the framework for Phase 1 and Phase 2 of your due diligence — and flag any question the vendor refuses to answer (a minimal structure for tracking answers and refusals follows the checklist).
AI Vendor Due Diligence Master Checklist
Model Governance
What foundation model(s) power the product? Are they proprietary, open-source, or licensed from a third-party provider?
How frequently is the model updated or retrained? What triggers a version change?
What is the notification procedure for material model updates? How much lead time do customers receive?
Can customers pin to a specific model version? What is the version support lifecycle?
What explainability and auditability mechanisms exist for model outputs?
Has the model been evaluated for bias against demographic groups relevant to your use case?
What is the documented hallucination rate for your specific use case category?
Data Handling & Training
Is customer data (inputs, outputs, documents) used to train or fine-tune the model, now or in the future?
Is there an explicit opt-out from training data use? Is it default-off or default-on?
Where is data stored during inference? For how long? In what jurisdiction?
Does the vendor use subprocessors for model inference (e.g., OpenAI, AWS Bedrock)? Are these disclosed?
What is the data deletion procedure? How quickly is inference data purged post-session?
Is the model available in a private deployment option (on-premises or dedicated cloud instance)?
Infrastructure & Security
SOC 2 Type II — which trust service criteria? What is the audit period covered?
Does the vendor maintain an AI-specific incident register separate from general security incidents?
What access controls govern who at the vendor can access customer data or model outputs?
Is inference fully logged? Can you access inference logs for audit purposes?
What is the vendor's breach notification SLA, and does it explicitly cover AI model incidents?
What is the documented uptime SLA, and what are the remediation terms for SLA breaches?
Contractual & Legal
Who owns the outputs generated by the model when using your data?
Does the DPA explicitly address AI inference, model training opt-outs, and AI incident notification?
Is the vendor willing to accept liability provisions for material model failures in your use case?
What are the exit terms? Can you export all data and configurations? What is the data deletion SLA post-termination?
Does the contract require notification of material changes to the model, infrastructure, or subprocessors?
Operational Controls
What human-in-the-loop controls does the product provide for high-stakes decisions?
Can you configure output filtering, confidence thresholds, or escalation triggers?
Does the vendor provide a usage dashboard showing model accuracy metrics over time?
What is the escalation path when the model produces a harmful or inaccurate output?
Is there a vendor AI ethics or responsible AI policy published and externally reviewed?
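To operationalize the checklist, one illustrative option (names and structure are hypothetical, not a prescribed tool) is to track each question's status and evidence per vendor, so refusals surface automatically for the red-flag review below.

```python
# Illustrative structure for tracking checklist answers per vendor.
# Categories mirror the checklist above; field names are hypothetical.
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    ANSWERED = "answered"   # specific answer, backed by documentation
    PARTIAL = "partial"     # vague or undocumented answer
    REFUSED = "refused"     # declined, redirected, or "confidential"

@dataclass
class ChecklistItem:
    category: str           # e.g. "Data Handling & Training"
    question: str
    status: Status = Status.REFUSED  # unanswered counts as refused until evidence arrives
    evidence: str = ""      # link to documentation, contract clause, or email

def refusals(items: list[ChecklistItem]) -> list[ChecklistItem]:
    """Surface every refusal; each one feeds the red-flag review below."""
    return [i for i in items if i.status is Status.REFUSED]
```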
Red Flags and Green Flags
How a vendor responds tells you more than the answers themselves. The pattern of what a vendor can and cannot answer quickly is itself a signal of governance maturity. These are the flags that change a procurement decision.
✓ Green Flags
Immediate, specific answers to training data use questions — with documentation
Training opt-out is default-on, not buried in enterprise settings
Published model card with evaluation methodology and bias assessment
Version pinning available; change notifications contractually guaranteed
AI incident register maintained and available on request
Inference logs accessible to customer for audit purposes
Willing to accept liability provisions for material model failures
Third-party AI safety audit or red-teaming results available
✗ Red Flags
Vague or redirected answers to training data use questions ("we take privacy seriously")
Training opt-out requires a support ticket or is enterprise-plan-only
No documentation of model evaluation methodology or bias testing
Model updates with no versioning, no notification SLA, no rollback option
"Our legal team will need to review" when asked about output ownership
Subprocessor list not disclosed or "confidential"
No AI-specific provisions in DPA — standard SaaS template only
Exit data export not technically feasible or takes more than 30 days
One red flag is a conversation. Multiple red flags are a rejection. In regulated industries, an AI vendor who can't answer the training data question is a vendor whose product should not touch your PHI, financial records, or confidential IP. The regulatory exposure from a training data breach — where your data appears in another organization's model outputs — is not something a SOC 2 certificate will cover.
Building the AI Procurement Gate
The due diligence checklist is only useful if it runs before deployment. The organizational challenge is that most AI tool procurement happens through the path of least resistance: department head approves, credit card goes down, tool deploys. By the time security sees it, there are 40 users and months of customer data already in the vendor's system.
Three structural controls that make the gate work without becoming a bureaucratic wall:
1. A two-tier intake process
Tier 1: Any AI tool touching business data requires a 24-hour security intake — not a full review, just a risk score and routing decision. Tier 2: Tools scoring above threshold go through full due diligence. The goal is speed at Tier 1 and thoroughness at Tier 2 — not slowness everywhere. Most AI tools will clear Tier 1 in under a day. The ones that need a full review should wait for it.
2. An approved vendor register
Once a vendor clears due diligence, add them to an approved AI vendor register with the risk score, approval date, conditions (e.g., "approved for use cases that do not involve PHI"), and renewal review date. Business units can deploy from the approved list without triggering a new review. This removes the incentive to bypass security — approval is fast once it's done once. A minimal sketch of such a register entry follows this list.
3. Annual re-review for high-risk approvals
AI vendors change their data handling policies, update models, and get acquired — sometimes without notification. High-risk approvals (scores of 11+) get a mandatory annual re-review. Trigger the re-review off the approved vendor register renewal date, not off an ad-hoc request. It takes 2–3 hours per vendor per year. The cost of not doing it is the $4.2M incident number at the top of this article.
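As a minimal sketch of a register entry with the renewal trigger wired in: the fields mirror the register described above, the annual cadence for scores of 11+ comes from this section, and the two-year cadence for lower-risk vendors is an assumption you should set to your own policy.

```python
# Illustrative approved-vendor register entry with a renewal trigger.
# Field names are hypothetical; the register itself can live in a GRC tool,
# a ticketing system, or even a regularly reviewed spreadsheet.
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class ApprovedVendor:
    name: str
    risk_score: int    # 5-15, from the scoring framework above
    approved_on: date
    conditions: str    # e.g. "approved for use cases that do not involve PHI"
    renewal_due: date

def register_vendor(name: str, risk_score: int, conditions: str,
                    approved_on: date | None = None) -> ApprovedVendor:
    approved_on = approved_on or date.today()
    # Scores of 11+ get the mandatory annual re-review; the two-year cadence
    # for lower scores is an assumption, not something this article mandates.
    cadence = timedelta(days=365) if risk_score >= 11 else timedelta(days=730)
    return ApprovedVendor(name, risk_score, approved_on, conditions,
                          renewal_due=approved_on + cadence)

def due_for_rereview(register: list[ApprovedVendor],
                     today: date | None = None) -> list[ApprovedVendor]:
    """Drive re-reviews off the register's renewal date, not ad-hoc requests."""
    today = today or date.today()
    return [v for v in register if v.renewal_due <= today]
```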
Regulated Industry Considerations
The checklist above applies to all mid-market organizations. In regulated industries, additional requirements layer on top — and the AI vendor needs to be evaluated against them explicitly.
Healthcare (HIPAA/FDA): AI vendors processing PHI must sign a Business Associate Agreement. The BAA must explicitly cover AI inference — not just data storage. FDA's AI/ML-based Software as a Medical Device (SaMD) framework applies to any vendor whose model is used in clinical decision support. Verify FDA clearance status before deploying clinical AI.
Financial Services (SR 11-7/OCC): Model risk management guidance treats third-party AI models as vendor models subject to validation. The CISO and model risk team must co-own the due diligence process. Any AI model used in credit, fraud, or customer-facing decisions requires model validation documentation from the vendor.
Defense (CMMC/NIST SP 800-171): AI vendors touching CUI (Controlled Unclassified Information) must demonstrate compliance with NIST SP 800-171 or hold the appropriate CMMC certification level. Inference logs may be subject to DFARS 252.204-7012 incident reporting requirements — verify with your contracting officer.
EU AI Act (applicable August 2026): If the AI vendor's product falls under the high-risk system classification in Annex III — which includes healthcare, employment, credit, and critical infrastructure — they must provide a conformity assessment and be registered in the EU AI database. Non-EU vendors selling into European markets are not exempt.
If you don't have an AI vendor due diligence process today, the fastest way to close the gap is to run the checklist retroactively against your three highest-risk current AI vendors. You'll find gaps — data handling questions that were never asked, contractual provisions that weren't addressed, model change notification procedures that don't exist.
That exercise tells you exactly what remediation your existing vendor relationships need, and it builds the institutional knowledge to run a proper process on the next procurement request.
The goal isn't to slow down AI adoption. The goal is to adopt AI in a way that doesn't create a liability that surfaces 18 months later when a regulator asks whether you knew the vendor was training on your customer data.
The CISO who builds the gate wins twice. You reduce risk on new deployments, and you build the organizational credibility to be included earlier — before the credit card goes down. The moment business units start bringing AI vendors to you at the evaluation stage instead of the post-deployment cleanup stage, the posture has fundamentally changed.