You’ve built and exited multiple companies before founding CNTXT AI. Looking back, what common thread connects all the businesses you’ve built?
Every company started with the same observation: something in this region was not working, and no one was fixing it because the solution had to be built here, not imported.
The edtech venture came after I saw schools managing student data on spreadsheets that could not scale. The audio platform started because I wanted Arabic content to be as accessible as English content, and it was not. CNTXT AI is the same logic at a larger scale. The region had abundant data and genuine AI ambition, but the foundational layer was still missing at scale: clean, structured, culturally grounded Arabic data that AI could actually use.
We build for problems we see here. That principle has not changed. The scale has.
Most people discovered AI in the last few years, but you’ve been working on Arabic automation, data, and language technologies long before the current wave. What convinced you early on that this was a problem worth solving?
We set out to solve our own problems. When we were building content platforms and education tools, the existing tools simply did not work in Arabic. Dialects were ignored, data was missing, and everything was built for English first. We had no choice but to build the infrastructure ourselves.
The conviction came from necessity. Every company we built forced us deeper into Arabic data, Arabic automation, Arabic language systems. By the time we were working alongside G42 and TII on large language model development, we had already been building the foundations of Arabic AI for years without calling it that. The problem had been obvious to us from the start because we were living it.
When you started building in this space, what were the biggest challenges that others didn’t see yet?
The biggest challenge was data. The entire global AI industry was built on English data, and Arabic was treated as a translation problem rather than a primary language.
That meant two things. First, the data did not exist. There were no large-scale Arabic datasets, no annotated dialect corpora, no infrastructure for labeling Arabic content at scale. We had to build that from scratch. Second, the talent did not exist in the form the market understood. The people who could build Arabic AI were not the people global labs were hiring. They were linguists, dialect specialists, local engineers who understood context, not just code.
Most teams were trying to adapt English models to Arabic. We were building Arabic AI from the ground up. That is a completely different problem. It required a completely different team, a completely different data strategy, and a different definition of what good performance looks like.
Many founders entered AI after the technology became mainstream. How did being early shape the company you’ve built today?
Being early meant we had to build things that did not exist. That forced us to own the full stack.
We built our own data infrastructure, model pipelines, testing frameworks, and agent technology, each one because nothing available handled Arabic the way we needed. Today that ownership is our advantage. We do not depend on a foreign API that might change its terms or decide our region is not a priority. We control the data layer, the model layer, the validation layer, and the application layer.
We also work closely with NVIDIA on compute and model optimization, Oracle and AWS on sovereign cloud deployment, and Figure AI on physical AI and robotics. When a government asks for sovereign AI, we can deliver it because we already built it.
What lessons from your previous ventures proved most valuable when building CNTXT AI?
Two things I learned the hard way and built into CNTXT AI from day one.
The first: traction is not a business. I have built and sold companies across education, audio, and content. In every one of them, the moment we scaled fast without the right infrastructure underneath, things broke. At CNTXT AI, we validate before we scale. Our testing framework exists because I have already paid the price of launching fast and fixing in production.
The second: the team you build is the company you build. Technical talent is not enough. You need people who understand the local context, the language, the culture.
More than 400 million people speak Arabic, yet the language has often been underserved by global technology platforms. Why do you think that gap existed for so long?
Because Arabic is hard, and the global tech industry optimizes for easy.
English is one language with consistent grammar and massive clean datasets. Arabic is dialects, complex morphology, code-switching with English and French, and a writing system that changes shape depending on where a letter sits in a word. Building for Arabic requires linguists, not just engineers, and local data collection, not just scraping the internet.
Global platforms did not ignore Arabic because they were biased. They ignored it because it was not profitable to solve. The datasets were small, the talent was scattered, and the market was fragmented.
That gap existed because no one wanted to do the hard work. We did.
What makes building AI for Arabic fundamentally different from building for English or other major languages?
Three things that compound on each other.
Data scarcity. English has decades of digitized content. Arabic has a fraction of that, and what exists is mostly formal Modern Standard Arabic, not the dialects people actually speak.
Dialect complexity. English has accents. Arabic has entirely different grammatical systems across regions. A model trained on Gulf Arabic fails in Morocco. Handling 25+ dialects is not a product feature. It is the minimum requirement.
Cultural context. AI is not just about understanding words. It is about understanding who is speaking and what they mean. Our team reviews outputs not just for accuracy but for appropriateness. They do not just understand what was said. They understand who said it, and why.
Do you see Arabic AI as a regional opportunity, or as a global opportunity that has simply been overlooked?
Both, and I think they play out on different timelines.
Regional because 400 million Arabic speakers need it now. The GCC is deploying AI across government, finance, and enterprise at a pace most markets are not matching. That creates immediate demand for AI that works in Arabic, not AI that tolerates it.
Global because the world is waking up to the fact that AI cannot be built on English alone. When a global company wants to enter the Arab world, they will need Arabic AI. When they realize how hard it is to build, they will come to the companies that already did. We intend to be that company.
How important is it that AI systems understand not just language, but cultural and regional context as well?
It is the difference between a tool that works and a tool that causes problems. Language is identity, humor, respect, hierarchy, and history. An AI that understands words but gets the cultural register wrong is not just inaccurate. In a government or enterprise context, it is a liability.
This is why we do not just train on Arabic data. We train on Arabic context. Get the cultural register wrong and it does not matter how accurate the model is.
What impact could better Arabic AI have on education, media, government, and business across the region?
The impact is not incremental. It is structural.
In education, a student should be able to ask a question in their dialect and get a native answer. Munsit makes that possible today, in production.
In media, better Arabic AI means transcription, translation, and content generation that preserves dialect and tone.
In government, if AI agents do not understand the dialect of the people they serve, they fail them.
In enterprise and industrial environments, we are already working on active robotics projects: training data, evaluation, and Arabic voice interfaces for physical AI systems. Arabic-speaking operators should be able to command systems in their language. That gap is closing.
The companies that move first will win. The ones that wait will not recover the ground they lose.
We’ve heard a lot recently about “sovereign AI.” What does that term actually mean, and why has it become such an important priority for governments and enterprises?
Sovereign AI means three things: your data stays in your country, your models are trained on your data, and your systems are controlled by you, not a foreign platform that can change its terms or decide your region is not a priority.
It became a priority because the last decade taught a hard lesson. When you build on someone else’s infrastructure, you do not own your future. You rent it.
For governments, it is a national security issue. For enterprises, it is a competitive one. Your customer data is your intellectual property. If it lives under foreign jurisdiction, you are giving away your advantage. The UAE understood this early. Sovereign AI is not a compliance checkbox. It is how you stay in control of your own future.
How is CNTXT AI approaching sovereign AI differently from global providers?
Global providers offer sovereign AI as a feature. We offer it as architecture.
They will open a data center in the region and call it sovereign. But the model was trained on English data, the pipeline was built for Western use cases, and the system does not understand your dialect. That is localization, not sovereignty.
We built the entire stack from Arabic data upward: collection, labeling, annotation, evaluation, and deployment, across text, audio, image, video, and multimodal data. Our network of Arabic-speaking contributors through CNTXT AI Connect gives us data depth no foreign provider can replicate. We deploy on infrastructure built for this region, with Oracle, AWS, and NVIDIA as our technical partners.
Sovereign AI is not only where your data sits. It is who built the system that uses it.
What do you think investors are recognizing now that perhaps the market overlooked a few years ago?
A few years ago everyone was betting on models. Bigger, faster, more parameters. The assumption was that the largest server farm would win. That turned out to be only part of the story.
The winners are the ones with the right data. Clean, labeled, contextually accurate, built for the specific problem. You can spin up compute overnight. You cannot replicate years of data collection, curation, and domain expertise. That is the moat.
They are also recognizing that the Middle East is not just consuming AI. It is building it. The capital is here, the infrastructure is moving, and the region is positioning to produce the next wave.
We were doing the hard data work years before it became interesting to investors. The stack is proven, the deployments are real, and that took years to build. That is what the investment reflects.
If we’re sitting here five years from now, what would success look like for CNTXT AI, and for the broader Arabic AI ecosystem you’re helping build?
For CNTXT AI, success means being the AI layer for the region: voice, vision, agents, robotics, every modality, every sector that needs sovereign production-grade AI built and hosted here. Governments, banks, hospitals, telcos, and industrial operators across the region running on our stack.
For the ecosystem, success means the Middle East stops importing AI and starts exporting it. And personally, it means the next generation of founders looks at what we built and says: if they did it from here, so can I.
Where can readers connect with you and find out more about CNTXT AI?
The best place to follow our work is LinkedIn. Our company page shares what we are building, what we are learning, and where we think Arabic AI is going. You can also connect with me directly on my LinkedIn page.
For CNTXT AI and our product portfolio, visit cntxt.tech. If you are working on something in the region and want to explore what is possible with Arabic AI, reach out directly.
