The Study
A new academic study evaluated leading AI models against two benchmarks: AAOIFI-defined Islamic finance terminology and a CSAA-based (Certified Shariah Advisor and Auditor) professional examination. The researchers tested whether general-purpose AI models can match the precision required for institutional Shariah compliance work.
The Results
- 93.75% of AI-generated definitions had low conceptual similarity to AAOIFI standards
- 64% was ChatGPT's score on the CSAA-style examination
- 74.63% of human test-takers were outperformed by ChatGPT, whose score still fell below the qualified advisor threshold
The results reveal a paradox. Generic AI is capable enough to outperform most human test-takers, but not precise enough for the standard the industry actually requires. In a field where a misinterpreted clause in AAOIFI FAS 28 or an imprecise characterisation of gharar can invalidate a product structure, "close enough" is a liability.
Why Terminology Drift Matters
When the researchers compared AI-generated definitions to AAOIFI's official glossary, 93.75% showed low conceptual similarity. This is not a minor calibration issue. Islamic finance has a precise technical vocabulary. Terms like mudarabah, musharakah, ijara, and tawarruq carry specific legal and contractual meanings defined by AAOIFI standards. When an AI model uses these terms loosely, drawing from general internet training data rather than authenticated standards, the output may read plausibly but mean something different from what the standard specifies.
For compliance officers preparing board documentation, or auditors verifying product structures, this drift is dangerous precisely because it is hard to detect. A hallucinated regulation is obviously wrong. A subtly imprecise interpretation of AAOIFI SS 13 clause 3.6 looks correct until someone checks the source.
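The article does not say how the study measured conceptual similarity. As a rough illustration of how definition drift could be flagged automatically, here is a minimal sketch that assumes a plain bag-of-words cosine similarity and an arbitrary cut-off. The function names, the threshold, and the sample strings are all hypothetical, and lexical overlap is only a crude proxy for conceptual agreement.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count alphabetic word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two texts."""
    va, vb = tokenize(a), tokenize(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical cut-off; the study's own definition of "low" is not given here.
LOW_SIMILARITY_THRESHOLD = 0.5


def flag_drift(generated: str, reference: str,
               threshold: float = LOW_SIMILARITY_THRESHOLD) -> bool:
    """Return True when the generated definition should go to a human reviewer."""
    return cosine_similarity(generated, reference) < threshold


# Illustrative strings only; neither is AAOIFI text.
print(flag_drift("profit sharing partnership", "lease of an asset"))
```

In practice the reference text would come from the licensed standard itself, and a flagged definition would be routed to a reviewer rather than corrected automatically.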
The Precision Gap Is Architectural, Not Intellectual
The study's findings point to an architectural problem, not an intelligence problem. Generic models were not trained on authenticated Islamic financial jurisprudence. They are interpolating from general knowledge. The researchers explicitly recommend domain-specific, Shariah-aligned models with human-in-the-loop governance.
Three architectural requirements
- 01. Source attribution: Every answer must cite the specific standard, clause, and version. Not "based on AAOIFI standards." The exact reference, independently verifiable.
- 02. Domain-specific knowledge bases: Training on authenticated AAOIFI, IFSB, and jurisdiction-specific standards, not web-scraped approximations.
- 03. Human oversight by design: AI augments scholarly judgment. It does not replace it.
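One way to picture the first and third requirements together is a record type that refuses to release an answer without a complete citation and a named human approver. The sketch below is an assumption about how such a gate might look, not a description of any particular product. `Citation`, `DraftAnswer`, `review_status`, and the version string are hypothetical names and placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Citation:
    standard: str  # e.g. "AAOIFI SS 13"
    clause: str    # e.g. "3.6"
    version: str   # edition identifier; the value used below is a placeholder

    def is_complete(self) -> bool:
        """A citation counts only if standard, clause, and version are all present."""
        return all(part.strip() for part in (self.standard, self.clause, self.version))


@dataclass
class DraftAnswer:
    text: str
    citations: List[Citation] = field(default_factory=list)
    approved_by: Optional[str] = None  # name of the reviewing scholar, if any


def review_status(answer: DraftAnswer) -> str:
    """Gate an answer: citations first, then human sign-off."""
    if not answer.citations or not all(c.is_complete() for c in answer.citations):
        return "rejected: missing or incomplete citation"
    if answer.approved_by is None:
        return "pending scholar review"
    return "approved"


draft = DraftAnswer(
    text="(model output would go here)",
    citations=[Citation("AAOIFI SS 13", "3.6", "edition-placeholder")],
)
print(review_status(draft))  # pending scholar review
```

The ordering is the design choice: an uncited answer never reaches a scholar's queue, and a cited answer is never released without one.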
What This Means for Islamic Financial Institutions
Institutions evaluating AI tools for Shariah compliance should ask:
1. Does the tool cite specific standards and clauses, or does it speak in generalities?
2. Was the model trained on authenticated AAOIFI and IFSB standards, or general internet data?
3. Does the system distinguish between what it knows (verified sources) and what it does not know (knowledge boundaries)?
4. Is human scholarly oversight built into the workflow, or bolted on as an afterthought?
5. Can every output be independently verified against the cited source?
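Questions 3 and 5 can be made concrete with a small sketch. It assumes a store of verified clause texts keyed by standard, clause, and version. The store layout, the function names, and the placeholder clause text are hypothetical, and the real clause wording is deliberately not reproduced.

```python
from typing import Dict, Tuple

ClauseKey = Tuple[str, str, str]  # (standard, clause, version)


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def answer_with_boundary(store: Dict[ClauseKey, str], key: ClauseKey) -> str:
    """Return the verified text with its reference, or state the knowledge boundary."""
    source = store.get(key)
    standard, clause, version = key
    if source is None:
        return (f"UNKNOWN: no verified text for {standard} clause {clause} "
                f"({version}); refer to a scholar.")
    return f"{source} [source: {standard} clause {clause}, {version}]"


def quote_matches_source(store: Dict[ClauseKey, str], key: ClauseKey, quoted: str) -> bool:
    """Check that a quoted passage really appears in the cited clause."""
    source = store.get(key)
    if source is None or not quoted.strip():
        return False
    return _normalize(quoted) in _normalize(source)


# Placeholder content only; a real store would hold the licensed standard text.
KEY: ClauseKey = ("AAOIFI SS 13", "3.6", "edition-placeholder")
demo_store: Dict[ClauseKey, str] = {KEY: "Placeholder clause text, not the real standard."}

print(answer_with_boundary(demo_store, ("AAOIFI SS 13", "9.9", "edition-placeholder")))
print(quote_matches_source(demo_store, KEY, "placeholder clause text"))
```

A substring check only verifies quotations. Whether a paraphrase is faithful to the clause remains a question for a qualified reviewer.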
Generic AI scored 64% on a professional exam. Institutional compliance requires something closer to 100%. The gap is real, and it is architectural.
Frequently asked questions
The study is an academic paper published on SSRN that evaluates leading AI models (including ChatGPT) against AAOIFI-defined Islamic finance terminology and a CSAA-based professional examination. It measures whether generic AI can match the precision required for institutional Shariah compliance.
When researchers compared AI-generated definitions of Islamic finance terms to AAOIFI's official definitions, 93.75% of the AI outputs showed low conceptual similarity. The AI uses the right words but assigns imprecise or incorrect meanings, creating risk for compliance professionals who rely on those definitions.
ChatGPT scored 64% on a CSAA-style examination, outperforming 74.63% of human test-takers. However, this score falls below the threshold required for a qualified Shariah advisor, demonstrating that generic AI is powerful but insufficiently precise for institutional compliance work.
The study recommends domain-specific, Shariah-aligned AI models with human-in-the-loop governance. Key requirements include source attribution on every answer, training on authenticated AAOIFI and IFSB standards, clear knowledge boundaries, and integration with human scholarly oversight.
Islamic finance has a precise technical vocabulary defined by AAOIFI standards. Terms like mudarabah, musharakah, ijara, and tawarruq carry specific legal and contractual meanings. When AI uses these terms loosely, the output reads plausibly but means something different from the standard. For compliance officers preparing board documentation, this drift is dangerous because it is hard to detect.
Institutions evaluating AI for Shariah compliance should ask five questions: Does it cite specific standards and clauses? Was it trained on authenticated standards? Does it distinguish what it knows from what it does not? Is human scholarly oversight built into the workflow? Can every output be independently verified against the cited source? Generic AI fails on all five.