Black-Box AI: When a lack of transparency becomes a legal risk for companies
AI systems often produce results without revealing their internal decision-making processes. This is the so-called “black box” problem, one of the biggest challenges in the use of AI. When it becomes impossible to understand how a result is arrived at, transparency, fairness, accountability, and effective human oversight are all compromised. For businesses, this has long been more than just a technical issue.
Content
- What is meant by a “Black Box”
- Why opacity is a governance problem
- Where the GDPR becomes particularly relevant
- Why automated decisions are especially difficult
- Why human oversight must be substantive
- What the AI Act additionally requires
- What companies should do now
- What the healthcare sector shows
- Conclusion
- Overview of our AI advisory services
What is meant by a “Black Box”
A system is referred to as a black box when the inputs and outputs are visible, but the path in between largely remains hidden. In other words, you can see which data goes into a system and what comes out at the end, but you cannot reliably determine exactly how the model arrived at its result.
A model does not need to be completely incomprehensible for this to be the case. It is enough if key information about how it functions is missing or can barely be understood. In that case, it often remains unclear how the system actually operates—for users, supervisory authorities, and sometimes even for the developers themselves.
This is particularly common in powerful and complex AI systems. There are usually several causes behind it at once: mathematical complexity, strong dependence on training data, incomplete documentation, and deliberate opacity.
- Complex model architectures: Deep neural networks and other high-capacity models operate with a large number of parameters, layers, and nonlinear transformations. As model size increases, it becomes increasingly difficult to make both the inference process and the training dynamics understandable step by step.
- Dependence on training data: A model’s behavior depends heavily on the data on which it was trained. If the composition, gaps, or biases in that data are only partially known, the output also becomes difficult to explain. Even small changes in individual data sources or features can significantly alter model behavior.
- Gaps in documentation: Even internally, there is often no clean or complete description of which data was used, which assumptions are built into the model, and where its limitations lie.
- Deliberate opacity: In some cases, a system remains opaque because its architecture, training data, or code is treated as a trade secret. This makes external scrutiny even more difficult.
The black-box problem therefore has a very concrete impact on how responsibility, control, and legal certainty can be organized in AI projects.
Why opacity is a governance problem
The less understandable an AI system is, the more difficult it becomes to review outcomes, identify errors, and clarify responsibilities.
Opaque systems can produce flawed outputs while also concealing risks that may only become visible at a late stage: bias, data protection violations, security vulnerabilities, or problematic decisions whose origins no one can properly explain anymore.
In addition, AI-supported decisions often involve multiple actors: data providers, model developers, vendors, deploying companies, and internal specialist departments. If a problematic outcome then occurs, it quickly becomes unclear who is responsible for what. This creates accountability gaps in which many parties are involved, without the decision being clearly attributable to any one of them.
For companies, this is particularly sensitive. In the end, the key question is not how complex a system was, but whether its use can be classified, explained, and controlled in a legally sound manner.
Where the GDPR becomes particularly relevant
The black-box problem affects several fundamental principles of data protection law at the same time. Four points are especially important: transparency, accountability, fairness, and explainability in decision-making contexts.
- Transparency: Anyone processing personal data must be able to explain in a clear and intelligible way which data is used, for what purpose, and how automated processing works. That becomes particularly difficult in the case of AI systems that are hard to explain.
- Accountability: Companies must be able to demonstrate that their processing is lawful. If it can no longer be clearly explained which features, data sources, or model logic shaped a given result, that demonstration becomes significantly more difficult.
- Fairness: Opaque models make it harder to identify and correct biased or discriminatory effects. Where the relevant influencing factors remain in the dark, ensuring fair processing also becomes more difficult.
- Explainability in decisions: Article 22 GDPR protects individuals against decisions based solely on automated processing that produce legal effects or similarly significant effects. At the same time, Articles 13 to 15 GDPR and Recital 71 require meaningful information about the logic involved, as well as the significance and envisaged consequences of the processing. It is precisely here that black-box systems reach their limits, when even controllers can barely explain the model logic in any meaningful way.
The difficulties often begin even earlier. Opaque models make informed consent harder to obtain, can render privacy notices and responses to data subject access requests incomplete, and make it more difficult for companies to identify when they are already operating in the area of automated decisions with significant effects. The GDPR does not require total technical disclosure. But it does require a degree of intelligibility that actually enables the exercise of rights and effective oversight.
Why automated decisions are especially difficult
The issue becomes particularly sensitive where AI is not merely supportive, but prepares or makes decisions with tangible consequences—in other words, in situations involving benefits, access, opportunities, or burdens for individuals.
In such cases, the question arises whether affected individuals can meaningfully challenge a decision at all. That is exactly what black-box systems make more difficult. Anyone who does not know which criteria were decisive will struggle to identify errors and can only assert their own position to a limited extent.
There is also a clear imbalance of power between affected individuals and organizations. If the decision-making logic remains hidden, that imbalance becomes even greater. In disputes, it is often already difficult to prove that a decision was in fact made solely on an automated basis. This is another critical point: formal human involvement can be claimed easily, even if there is little or no genuine review in practice.
Why human oversight must be substantive
Effective human oversight requires that the responsible person understands enough to identify anomalies, has sufficient authority to change or reject outcomes, and knows the specific case well enough that the review is more than a mere formality. It also requires the ability, where necessary, to challenge a machine-generated output. This is particularly important in the case of opaque systems, because they tend to encourage automation bias—the tendency to assume too quickly that machine-generated results are correct.
Where these conditions are lacking, human control remains superficial. In that case, referring to a “human in the loop” offers little real protection—neither for affected individuals nor for the company.
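As a purely illustrative sketch, the routing logic below shows one way to keep a human reviewer in a position to genuinely intervene: only low-impact, high-confidence outputs are handled automatically, while everything else is escalated to a person with authority to override. The thresholds, field names, and notion of "impact" are hypothetical assumptions, not a legal standard.

```python
# Illustrative sketch only: routing machine outputs so that human review
# remains substantive. All fields and thresholds are hypothetical.
from dataclasses import dataclass

@dataclass
class ModelOutput:
    decision: str       # e.g. "approve" / "reject"
    confidence: float   # the model's own score in [0, 1]
    impact: str         # e.g. "low" / "high" for the affected person

def route(output: ModelOutput) -> str:
    """Send high-impact or low-confidence cases to a human reviewer
    who has real authority to change or reject the outcome."""
    if output.impact == "high" or output.confidence < 0.9:
        return "human_review"
    return "auto"

print(route(ModelOutput("reject", 0.97, "high")))   # -> human_review
print(route(ModelOutput("approve", 0.95, "low")))   # -> auto
```

The point of such a gate is less the specific threshold than the organizational commitment behind it: cases that reach the reviewer must arrive with enough context, and the reviewer with enough authority, for the review to be more than a formality.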
What the AI Act additionally requires
The GDPR focuses primarily on the rights of affected individuals and the lawfulness of specific processing activities. The AI Act places greater emphasis on systemic transparency, documentation, and traceability. For black-box AI, that distinction is crucial.
For high-risk AI, the following requirements are particularly important:
- Technical documentation: Before placing a system on the market, providers must prepare documentation setting out the system’s logic, architecture, algorithms, and potential discriminatory impacts.
- Instructions for use: High-risk systems must be described in such a way that deployers can properly interpret outputs, understand accuracy and limitations, and identify risky deployment scenarios.
- Logging and traceability: Events must be documented throughout the lifecycle so that the system’s behavior can later be traced and monitored (a minimal sketch of what such an event record might look like follows this list).
- Transparency for affected individuals: For certain decisions based on high-risk AI systems, the AI Act grants a right to an explanation of the individual decision-making process. This includes clear and meaningful explanations of the role of the AI and the main elements of the decision.
- Access for supervisory authorities: Market surveillance authorities may obtain access to documentation, training data, and, in narrowly defined cases, even source code where this is necessary for conformity assessment.
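By way of illustration, the following is a minimal sketch of what a structured lifecycle event record could look like in practice. The field names and the JSON-lines format are assumptions made for this example; the AI Act itself does not prescribe a concrete schema.

```python
# A minimal sketch of structured lifecycle logging for traceability.
# The event fields and JSON-lines format are illustrative assumptions.
import json
from datetime import datetime, timezone

def log_event(path: str, system_id: str, event_type: str, details: dict) -> None:
    """Append one structured event record so behavior can be traced later."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "system_id": system_id,
        "event_type": event_type,  # e.g. "inference", "model_update"
        "details": details,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Example: recording a single inference with an input reference and output.
log_event(
    "ai_audit.log",
    system_id="credit-scoring-v3",
    event_type="inference",
    details={"input_ref": "case-4711", "output": "reject", "model_version": "3.2.1"},
)
```

Append-only records of this kind, kept consistently across model versions, are what later allow a specific output to be reconstructed and attributed during an audit.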
The AI Act does not solve the black-box problem as such, but it does create a framework within which the use of such systems can be reviewed and limited more effectively. How robust that framework will prove in practice depends heavily on how technical standards later specify these requirements.
What companies should do now
When dealing with black-box AI, a handful of things matter above all: an appropriate level of explainability, robust testing and validation, proper documentation, transparency toward affected individuals, and resilient organizational safeguards.
- Explainability should always be aligned with the use context. The higher the risk to affected individuals and the greater the potential impact, the higher the requirements for transparency, documentation, and oversight.
- Black-box systems should only be used where their opacity does not itself become the central risk. Methods such as SHAP, LIME, or counterfactual explanations can help make certain aspects visible (see the sketch after this list), but they do not provide a complete solution to the transparency problem.
- Systematic testing, validation, and bias controls are just as important. Models should not only work from a technical perspective, but should also be tested for bias, failure modes, and other risky effects. Proper documentation creates a trail of transparency that facilitates audits, makes responsibilities more visible, and supports internal learning processes.
- Affected individuals should be informed early and comprehensively about data processing activities in which a black-box issue may arise. In particular, this information should be provided to an appropriate extent in privacy notices.
- Finally, companies need organizational measures that work in day-to-day practice: trained personnel, clear responsibilities, genuine intervention powers, and sufficient resources for oversight and review.
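To make the SHAP point above concrete, here is a minimal sketch of how post-hoc feature attributions could be generated for an individual prediction. The model, feature names, and data are hypothetical placeholders invented for this example.

```python
# Illustrative sketch: post-hoc feature attributions with SHAP for a
# hypothetical tree-based scoring model. Data and features are invented.
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical tabular data: applicant features and a binary outcome.
X = pd.DataFrame({
    "income": [42_000, 55_000, 31_000, 78_000],
    "loan_amount": [10_000, 25_000, 15_000, 30_000],
    "years_employed": [3, 12, 1, 20],
})
y = [1, 1, 0, 1]

model = GradientBoostingClassifier().fit(X, y)

# TreeExplainer computes per-feature contributions for each prediction.
explainer = shap.TreeExplainer(model)
attributions = explainer(X)

# For a single case, this shows which features pushed the score up or
# down -- one building block for a meaningful, case-specific explanation.
for name, value in zip(X.columns, attributions.values[0]):
    print(f"{name}: {value:+.3f}")
```

Attribution methods of this kind explain individual predictions relative to a baseline; they do not reveal the model’s full internal logic, which is why they complement rather than replace the documentation and oversight measures described above.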
What the healthcare sector shows
The black-box problem becomes especially clear in the healthcare sector. On the one hand, from a regulatory perspective, AI systems used in statutory health insurance or in decisions on healthcare services can qualify as high-risk AI. In such contexts, transparency is not only a compliance issue, but also a matter of trust in essential services.
On the other hand, medical imaging shows that more explainable systems can indeed support clinical decision-making. More understandable outputs help classify results more quickly, communicate diagnoses more clearly, and involve patients more closely in decisions.
At the same time, this field also highlights the limits of explainable AI. Uniform standards for interpretability are lacking, problems relating to data quality and generalizability remain, and questions of responsibility do not resolve themselves automatically either. Explanations can even create a new form of misplaced trust if they are misunderstood or accepted uncritically.
Conclusion
The black-box problem in the use of AI confronts companies with technical, legal, and organizational issues all at once. Especially in the case of automated decisions with significant effects, companies need systems whose use remains understandable, reviewable, and justifiable. That is the aim of both the GDPR and the AI Act, even though they differ in focus and in the interests they seek to protect. Ultimately, the issue is less about complete disclosure than about a form of transparency that genuinely supports rights, oversight, and accountability in practice.
Overview of our AI advisory services
- Regulatory Mapping: Identification of relevant legal requirements through detailed mapping in accordance with various national requirements and EU data regulations.
- Data & AI Governance: Development and adaptation of governance structures, identification of relevant requirements, and preparation for compliance with the AI Act.
- Training: Workshops on the scope and implementation of the AI Act, including AI literacy training in accordance with Article 4 AI Act for executives, product teams, and developers.
- AI Inventory: Support in creating an overview of all AI systems used within the company, including determining whether a system qualifies as an AI system or not.
- Contract Drafting: Drafting contracts in connection with AI projects, such as development agreements, AI-as-a-Service agreements (AIaaS), and related contracts.
- Advice on External AI Applications: Advice and guidance on the use of external AI applications and the assessment of third-party tools.
- Anonymization & Pseudonymization: Design of and advice on anonymization and pseudonymization concepts.
- Risk Assessments: Advice on risk assessments in the context of data protection impact assessments and fundamental rights impact assessments relating to AI systems.
- Copyright Advice: Advice on copyright-related implications in connection with GenAI (e.g. rights relating to data input, the protectability of prompts, and output).
- Compliant Data Use: Advice on the legally compliant use of big data, machine learning, and generative AI in connection with data protection law, trade secrets, and database rights.
- Advice on AI Development: Holistic advice on contract management, compliance, and other legal aspects relating to AI development projects.