Colossyan — The Enterprise L&D AI Video Platform Built by the People Who Started a Deepfake Detector
The most interesting founders often arrive at their final idea via something that nearly worked.
In 2018, three Hungarian engineers who had met at the Technical University of Denmark in Copenhagen were watching a viral deepfake video of Barack Obama and asking a different question than most people. Not “how was this made?” but “how do we prove it’s fake?” They founded Defudger — a startup that used computer vision, machine learning, and blockchain to detect synthetic media. The company made it to the Venture Cup competition finals in January 2019. It did not become a business.
The problem was not the technology. The problem was the customer. Media companies, the logical buyers of deepfake detection, declined to pay for it. Detection infrastructure was a cost with no revenue attached. The market said: interesting problem, not our budget.
So Dominik Mate Kovacs, Zoltan Kovacs, and Kristof Szabo made an uncomfortable decision. If detection wasn’t a business, perhaps creation was. The same deep fluency in synthetic faces and video generation they’d built trying to catch fakes could be redirected toward making legitimate ones. By late 2019 they had pivoted. By 2021 they had a company: Colossyan.
By 2026, Colossyan has raised $28.2 million, served 35,000 business accounts across 46+ countries, grown revenue 155% year-over-year in 2024, and built a platform that competes directly with Synthesia for enterprise learning and development budgets. Its customers include Novartis, Johnson & Johnson, Porsche, Jaguar Land Rover, Vodafone, Cisco, UPS, and Paramount.
The origin story matters here more than with most AI video companies because it explains the technical posture. Colossyan was not built to make marketing content or social media clips. It was built by people who had spent years thinking about how synthetic faces work at a foundational level — and who made a deliberate choice to target the one corporate use case where the combination of video production quality, interactive learning features, and regulatory compliance would matter most.
We research AI tools from public sources and documentation. We do not test them hands-on.
The Founders and the Defudger Pivot
Dominik Mate Kovacs (CEO) studied General Engineering in Cyber Systems at the Technical University of Denmark, with a specialization in reinforcement learning and AI-based computer vision during an exchange at the University of Hong Kong. He was named to the Forbes 30 Under 30 Europe (Technology) list in 2024. Co-founders Zoltan Kovacs and Kristof Szabo — the latter four years older with prior video company experience — rounded out the founding team. All three are Hungarian nationals who met during their studies in Denmark.
The near-death experience the company had in summer 2022 is part of the public record. After pivoting from Defudger, the founding team raised a seed round of approximately €1 million (July 2021, led by Day One Capital with Oktogon Ventures and APX) and began building. When they returned to raise their next round in summer 2022, VCs declined during a compressed fundraising environment. Bridge financing from Day One, APX, and Oktogon kept the company alive.
That moment forced a strategic clarification. Rather than trying to be a general-purpose AI video platform competing across all use cases, the team sharpened the focus onto enterprise learning and development. The pivot was tactical — they needed a beachhead vertical with clear enterprise procurement cycles, obvious ROI metrics, and willingness to pay for compliance. L&D fit all three criteria. The $5M pre-Series A (LAUNCHub Ventures, February 2023) and $22M Series A (Lakestar leading, February 2024) followed the sharpened focus.
“The only platform combining video, courses, localization, and delivery in one” is how Colossyan positions itself. The implied contrast is with Synthesia, the market leader, which built its platform primarily around video creation and required customers to integrate separately for LMS delivery, interactive elements, and localization workflows.
Funding History
Colossyan has raised $28.2 million across three rounds.
| Round | Date | Amount | Lead Investor | Notable Co-Investors |
|---|---|---|---|---|
| Seed | July 2021 | ~$1.18M (€1M) | Day One Capital | Oktogon Ventures, APX (Axel Springer/Porsche) |
| Pre-Series A | February 2023 | $5M | LAUNCHub Ventures | Emerge Education, returning investors |
| Series A | February 2024 | $22M | Lakestar | LAUNCHub Ventures, Day One Capital, Emerge Education, Oktogon Ventures |
The 2023 round has been labeled both “Series A” and “Seed extension” in different sources; the 2024 Lakestar-led $22M is the definitive Series A. The bridge financing that saved the company in summer 2022 came from existing investors at undisclosed amounts.
For context: Synthesia has raised at least roughly $150 million and holds a $4 billion valuation. Colossyan’s $28.2M total is less than a fifth of that, which has downstream implications for hiring, R&D scope, and enterprise sales capacity.
Revenue: Third-party revenue estimator Latka.com reported $6.6M for 2023. Colossyan’s own announcement cited 155% YoY growth for 2024, implying a figure in the $15–17M ARR range. One estimate (September 2025) placed ARR at approximately $15M. The company has not publicly disclosed official ARR figures.
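The implied 2024 range can be checked with one line of arithmetic. This is a minimal sketch, assuming the Latka estimate of $6.6M is the correct 2023 base and that “155% growth” means the 2024 figure is 2.55 times the 2023 one. Both are assumptions, since the company has not disclosed official figures, and `implied_revenue` is a helper name invented for this example.

```python
def implied_revenue(base_musd: float, growth_pct: float) -> float:
    """Return the figure implied by growing `base_musd` by `growth_pct` percent."""
    return base_musd * (1 + growth_pct / 100)

# Assumption: Latka's $6.6M estimate for 2023 is the base year.
estimate_2024 = implied_revenue(6.6, 155)
print(f"Implied 2024 figure: ${estimate_2024:.2f}M")  # Implied 2024 figure: $16.83M
```

The result sits at the top of the $15–17M range, which is consistent with the separate ~$15M estimate only if the real 2023 base was somewhat lower than Latka’s number.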
Scale: 35,000 business accounts as of early 2025, up 61% year-over-year. Nearly 1 million videos created on the platform. Present in 46+ countries across six continents.
The Products: What Colossyan Actually Offers
Colossyan Creator
The core platform. Users input a text script (or upload a document — PDF, PPT, DOC, TXT), select an avatar, choose a voice, and generate a video. The workflow is designed to minimize the steps between “I have a script” and “I have a usable training video.”
Key features as of mid-2026:
- Text-to-video and script-to-video: The foundational function. Script goes in, talking-head video comes out.
- Document-to-Video: Automatic conversion of uploaded documents into a video with AI-generated scripts and avatar narration. Reduces the content repurposing workflow for L&D teams sitting on libraries of existing written material.
- PowerPoint import: Slide decks can be imported and edited directly within the platform, with AI avatars added to each slide.
- Screen recording with narration: Software training content and process documentation can be produced without leaving the platform.
- AI script assistant: Generates and refines scripts from prompts or document content.
- Prompt-to-video: A natural-language prompt produces a complete video without an intermediate script-editing step.
- Brand kits: Company logos applied to avatar outfits and presentation materials for visual consistency.
- Team collaboration: Multiple editors can work on a project simultaneously with review workflows.
Branching Scenarios and Interactive Learning
This is where Colossyan’s L&D differentiation is most concrete. The platform supports branching scenarios — interactive decision-tree learning paths where learners make choices and are routed to different video branches based on their responses. Multiple-choice quizzes with customizable feedback are embedded natively, not bolted on through third-party tools.
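Structurally, a branching scenario is a small decision tree. The sketch below shows one way to model it and is not Colossyan’s internal representation: `SceneNode`, `walk`, and the sample scenes are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class SceneNode:
    """One video scene; `choices` maps a learner's answer to the next scene id."""
    scene_id: str
    prompt: str
    choices: dict[str, str] = field(default_factory=dict)  # empty = terminal scene

def walk(scenes: dict[str, SceneNode], start: str, answers: list[str]) -> list[str]:
    """Follow a learner's answers through the tree and return the scenes visited."""
    path = [start]
    current = scenes[start]
    for answer in answers:
        if answer not in current.choices:
            break  # unknown answer or terminal scene: stop routing
        current = scenes[current.choices[answer]]
        path.append(current.scene_id)
    return path

# Hypothetical "angry customer" role-play with two branches.
scenes = {
    "intro": SceneNode("intro", "A customer complains about a late delivery.",
                       {"apologize": "calm", "argue": "escalate"}),
    "calm": SceneNode("calm", "The customer calms down and asks for options."),
    "escalate": SceneNode("escalate", "The customer asks for a manager."),
}

print(walk(scenes, "intro", ["apologize"]))  # ['intro', 'calm']
```

Keeping the routing in a plain mapping makes it easy to see which branches a learner can reach and which scenes are dead ends.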
Up to four AI avatars can appear in a single scene, enabling simulated conversations and role-play scenarios (a learner dealing with an angry customer, a manager delivering feedback, a sales call). Video analytics track views, average watch time, and quiz scores per viewer — data that L&D teams use to measure training effectiveness.
Synthesia competes in interactive video but requires add-ons for some of these features. The degree to which Colossyan’s native implementation is more seamless is a product experience question that depends on specific use cases.
Coming soon: Hotspots, sorting exercises, hyperlinks, and matching activities. The Forge rendering engine (announced, not yet shipped) will deliver 2x faster generation and support videos up to one hour.
SCORM Export
Colossyan exports SCORM 1.2 and SCORM 2004 4th edition — the standard formats for delivering and tracking training content in learning management systems. This means a video produced in Colossyan can be packaged into a SCORM module and dropped directly into Cornerstone, Workday Learning, SAP SuccessFactors, Moodle, or any other LMS with SCORM support.
SCORM completion events trigger webhook notifications, allowing LMS administrators to track which learners finished which modules and at what quiz scores.
The competitive significance: Synthesia restricts SCORM export to its Enterprise tier (custom pricing). Colossyan’s pricing page currently places SCORM at Enterprise as well — sources from earlier in 2024 suggested Business tier included it, which may reflect a pricing change. Prospective buyers should verify current tier availability directly.
LMS Integrations
Beyond SCORM, Colossyan has direct integrations with:
- 360Learning (native)
- Articulate (import AI videos into Rise 360 or Storyline modules)
- Elucidat
- EasyGenerator
- UQualio
- MicroBuilder by ELB Learning
- ClickLearn (exclusive API integration partnership, 2024 — ClickLearn users can create Colossyan videos inside ClickLearn’s platform)
- Zapier (7,000+ app automation via no-code)
- VideoAsk, Tolstoy (interactive video)
- YouTube
The ClickLearn partnership is notable because ClickLearn specializes in software adoption and process documentation — a high-value L&D niche where video narration of screen workflows is particularly valuable.
NEO 2 Avatar Engine
NEO 2 is Colossyan’s rebuilt avatar rendering system. The most significant change over NEO 1 is the removal of industry restrictions: NEO 1 blocked healthcare, biotech, and financial services companies from using stock avatars (a policy Synthesia also maintains). NEO 2 removes those restrictions entirely, meaning a pharmaceutical company can create training content using any avatar in the library.
NEO 2 improvements:
- Full-body movement with natural weight shifts
- Hand gestures timed to speech emphasis
- Head movements and maintained eye contact
- Facial expressions that adjust to content tone
- Synchronized lip movements with improved realism
- 40 languages with enhanced voice quality
The 300+ stock avatars available span diverse ethnicities, ages, genders, and professional styles. Custom avatars — digital twins built from a short recording session — can be created for any person who consents. Instant Avatars convert a 15-second video clip into an AI avatar, with voice cloning available in 30+ languages. Branded avatars can be configured to wear company-specific visual identity.
Scenario avatars are positioned in real-world environmental backgrounds: offices, warehouse floors, retail environments. These are designed for role-play simulations where realistic context matters — a safety training scenario set in an actual warehouse reads differently than the same content against a white background.
G2 user reviews as of 2025 show “Avatar Limitations” as one of the most frequently cited issues (54 mentions), with specific complaints about hand gestures (“not yet great” on many avatars) and facial expressions that can feel stiff. These are known limitations that the NEO 2 engine is iterating against. The company’s in-house research team, led by Director of Research Shahzaib Aslam, is focused on neural video synthesis and human behavior understanding.
Video Generation Speed
Colossyan improved video generation speed by 33% as of its February 2025 Winter Release. G2 reviews note that a typical 1-minute training video currently takes approximately 10 minutes to render — which is meaningful friction on the 15-minute monthly allowance in the Starter plan, and something to account for in high-volume production workflows.
The Forge engine (coming soon) promises 2x generation speed and support for videos up to one hour — addressing both the speed and length limitations in the current system.
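To make the friction concrete, here is a back-of-the-envelope calculation. It is a sketch under two assumptions: render time scales linearly with video length, and Forge’s promised 2x speedup applies uniformly. The function name and constants are illustrative only.

```python
RENDER_MIN_PER_VIDEO_MIN = 10  # from the G2 reviews cited above: ~10 min render per 1 min of video

def monthly_render_minutes(video_minutes: float, speedup: float = 1.0) -> float:
    """Total render wait for `video_minutes` of output at a given engine speedup."""
    return video_minutes * RENDER_MIN_PER_VIDEO_MIN / speedup

starter_allowance = 15  # minutes of video per month on the Starter plan
print(monthly_render_minutes(starter_allowance))             # 150.0
print(monthly_render_minutes(starter_allowance, speedup=2))  # 75.0
```

Using the full Starter allowance therefore means about two and a half hours of render waiting per month on the current engine, or about 75 minutes if the 2x claim holds.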
Conversational Avatars (Waitlist)
Colossyan has announced AI Agents — real-time interactive conversational avatars trained on uploaded knowledge bases. Use cases include 24/7 customer support, sales coaching, soft skills training simulations, and interactive onboarding.
As of May 2026, this product is in waitlist/early access. The platform is using an invite-based queue system. This puts Colossyan behind D-ID (Visual Agents, shipping), Tavus (CVI, shipping with Phoenix-4/Raven-1 model stack), and HeyGen (Avatar in Motion) in real-time conversational avatar delivery. It is a meaningful gap for enterprise customers who want to evaluate real-time AI avatars alongside async video production from a single vendor.
API (Version 2.0)
The Colossyan REST API supports programmatic video generation with JSON payloads. Capabilities include avatar selection, voice cloning, brand kit application, SCORM export, and webhook push notifications for completion tracking. API 2.0 supports bulk personalized video — generating different versions of the same video with variable fields (different learner names, scores, department-specific content) without manual editing.
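The bulk personalization pattern can be sketched without touching the real API. The example below only builds request bodies locally; every key name (`title`, `avatar`, `script`) and the avatar identifier are hypothetical and do not reflect Colossyan’s actual schema, which should be taken from the official documentation.

```python
import json
from string import Template

# Hypothetical script template; $name, $department and $score are the variable fields.
SCRIPT = Template("Hi $name, welcome to the $department safety refresher. "
                  "Your last quiz score was $score percent.")

learners = [
    {"name": "Ana", "department": "Warehouse", "score": 82},
    {"name": "Ben", "department": "Retail", "score": 91},
]

def build_payloads(template: Template, rows: list[dict]) -> list[dict]:
    """Return one request body per learner. Keys are illustrative, not Colossyan's schema."""
    return [
        {
            "title": f"Safety refresher for {row['name']}",
            "avatar": "placeholder-avatar-id",   # hypothetical identifier
            "script": template.substitute(row),  # raises KeyError if a field is missing
        }
        for row in rows
    ]

payloads = build_payloads(SCRIPT, learners)
print(json.dumps(payloads[0], indent=2))
```

Using `Template.substitute` rather than `safe_substitute` is deliberate: a missing field fails loudly before any video is generated with a literal placeholder in the narration.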
Documentation is at docs.colossyan.com.
MCP Server Status
No official Colossyan MCP server exists as of May 2026. No community or third-party MCP integrations for Colossyan were found in public repositories or MCP server directories.
Colossyan provides a REST API and Zapier integration for automation workflows, but no MCP-native tooling exists for teams using Claude Code, Claude Desktop, or other MCP-compatible AI systems. This is a growing gap as competitors begin exploring MCP integration — though notably, Synthesia and D-ID also lack official MCP servers.
Compliance and Security
Certifications:
- SOC 2 Type II (independently audited annually)
- GDPR compliant — data processing agreements available, 72-hour breach notification
Infrastructure:
- AWS, EU and US regions
- TLS 1.3 in transit / AES-256 at rest
- 99.9% uptime SLA (1-hour RTO/RPO)
- Annual third-party penetration testing
- UpGuard attack surface monitoring (score maintained above 900)
- SAML and OIDC SSO
- Custom roles and permissions
- Multi-workspace architecture
- Customer content is never used to train AI models; vendors are contractually prohibited from training on customer data
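The 99.9% uptime figure in the list above translates into a concrete downtime budget. The calculation below is a generic sketch, assuming a 30-day month and a 365-day year; actual SLA contracts define their own measurement windows and exclusions, and `downtime_budget_minutes` is a name invented for this example.

```python
def downtime_budget_minutes(uptime_pct: float, period_hours: float) -> float:
    """Minutes of allowed downtime in a period for a given uptime percentage."""
    return period_hours * 60 * (1 - uptime_pct / 100)

# Assumption: a 30-day month and a 365-day year.
per_month = downtime_budget_minutes(99.9, 30 * 24)
per_year_hours = downtime_budget_minutes(99.9, 365 * 24) / 60
print(f"{per_month:.1f} min/month, {per_year_hours:.2f} h/year")  # 43.2 min/month, 8.76 h/year
```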
Gaps:
- No HIPAA certification — a significant gap for US healthcare enterprise customers. (Colossyan creates training videos for healthcare organizations but the platform itself is not HIPAA-certified.)
- No ISO 42001 — Synthesia holds this certification (first in category). Colossyan does not.
For most enterprise L&D buyers, SOC 2 Type II and GDPR compliance satisfy procurement requirements. Healthcare and life sciences companies with strict HIPAA requirements will need to verify with Colossyan’s enterprise team or consider alternatives.
Pricing
| Plan | Monthly (billed monthly) | Annual | Video Minutes | Key Features |
|---|---|---|---|---|
| Free | $0 | $0 | 3 min/month | 20+ avatars, 1 instant avatar, 100+ languages, unlimited viewers |
| Starter | $27/month | $19/month | 15 min/month | 70+ avatars, 3 custom avatars + 1 voice, AI script, prompt-to-video |
| Business | $88/month | $70/month | Unlimited | 170+ avatars, 10 custom avatars, NEO 2, 4 interactive videos/month, 10 auto-translations/month, brand kits, 3 editors |
| Enterprise | Custom | Custom | Unlimited | 200+ avatars, unlimited interactive/translations, SAML/SSO, SCORM export, 24/7 support, dedicated success manager |
A 14-day free trial is available.
The 3-minute free tier and 15-minute Starter plan are notable limitations. Given that 1-minute videos currently take ~10 minutes to render, a 15-minute/month cap on Starter creates meaningful friction for teams evaluating the platform seriously. The Business tier, with unlimited video minutes and NEO 2 access at $70–88/month, is the practical entry point for any organization with recurring training video production.
SCORM export is Enterprise-only in the current pricing. Interactive video features (branching, quizzes) are currently listed as Enterprise or limited (4 interactive videos/month on Business). Buyers should verify current tier limits before committing.
Notable Customers and Case Studies
Named enterprise clients:
- Pharmaceutical/healthcare: Novartis, Johnson & Johnson, Hoya
- Automotive: Porsche, Jaguar Land Rover, Continental AG
- Technology/telecom: Vodafone, Cisco, Ericsson
- Logistics: UPS, DSV
- Financial services: AmeriSave
- Media: Paramount
- Engineering/consulting: WSP
- Hospitality: Sonesta (reported 80% cost reduction)
- Other: Under Armour, State of New Mexico
Published case study highlights:
- Sonesta (hospitality): 80% cost reduction in training video production
- AmeriSave (financial services): doubled training content output
- AFNB: reduced content creation time by 90%
The breadth of verticals (pharma, automotive, telecom, logistics, government) reflects Colossyan’s decision with NEO 2 to remove industry restrictions. Historically, avatar video platforms that blocked healthcare and financial services from stock avatars limited their own addressable market. Colossyan made an explicit policy decision not to do that.
Competition: Where Colossyan Sits
vs. Synthesia
This is the primary competitive frame. Both companies are targeting enterprise L&D budgets with AI avatar video platforms. The differences:
| | Colossyan | Synthesia |
|---|---|---|
| Total raised | $28.2M | ~$150M+ |
| ARR (est.) | ~$15M | ~$100M |
| Customers | 35,000 business accounts | Claimed 60,000+ companies |
| Fortune 100 penetration | Not disclosed | 90%+ |
| Avatars | 300+ | 240+ |
| Languages | 100+ | 140+ |
| SCORM export | Enterprise | Enterprise |
| ISO 42001 | No | Yes (first in category) |
| HIPAA | No | Not confirmed either |
| Industry restrictions on stock avatars | None (NEO 2) | Yes — healthcare/biotech/financial blocked |
| Human content moderation | No (instant publish) | Yes (adds hours/days per video) |
| Interactive video | Native (branching, quizzes) | Available but requires more setup |
| MCP server | No | No |
The core Colossyan argument: we deliver comparable video quality with fewer workflow restrictions, faster publishing, better native L&D interactivity, and a lower entry price. The core Synthesia counterargument: we have more languages, more enterprise brand recognition, far greater scale, and ISO 42001 certification that matters to regulated procurement teams.
Both arguments are defensible. For a large enterprise with a substantial budget and a strict compliance checklist, Synthesia’s scale and certifications may win. For an L&D team that needs to produce multilingual training content fast and doesn’t want to wait for human content review on every video, Colossyan’s frictionless approach is a genuine advantage.
vs. HeyGen
HeyGen leads on creative flexibility, voice cloning breadth (175+ languages), and social/marketing content. Its Avatar IV delivers micro-expressions and emotional responsiveness in a different segment of the market. HeyGen has an official MCP server; Colossyan does not. Colossyan leads on L&D-specific features (SCORM, branching, quizzes) and is more explicitly positioned for corporate training. These are partially different markets.
vs. D-ID
D-ID is focused more on marketing video production, real-time conversational avatars (Visual Agents, V4 Expressive), and the novel Agentic Videos product (conversational AI embedded in a video player). D-ID’s simpleshow acquisition ($60M, 500+ Fortune 1000 customers) gives it significant enterprise L&D distribution — but simpleshow’s product is explainer video, not avatar video. D-ID’s platform doesn’t currently compete on SCORM, branching, or LMS integration depth. These are different L&D tools.
vs. Tavus
Tavus is a real-time Conversational Video Interface (CVI) platform with sub-500ms latency and the Phoenix-4/Raven-1 model stack. It competes in personalized video at scale and conversational AI, not in async training content production. The overlap with Colossyan is minimal except for large enterprises evaluating both async training video and real-time avatar use cases from a single vendor.
Limitations and Known Issues
From user review data (G2, Capterra):
- Avatar naturalness: Hand gestures remain the most-cited weakness. NEO 2 has improved this but hasn’t resolved it entirely. Facial expressions can feel mechanical in longer videos.
- Render times: ~10 minutes per 1-minute video. Makes the Starter plan’s 15-minute monthly allowance painful in practice. Forge engine (coming soon) should address this.
- Conversational avatars are still on a waitlist: For enterprises that want real-time AI agents alongside async video, Colossyan cannot deliver today. D-ID and Tavus can.
- Template library: Approximately 30 pre-made templates versus 60–75 at Synthesia and HeyGen. More starting-point design work required.
- Project stability: Some reports of save failures and project corruption on larger videos. An issue for production teams with tight deadlines.
- Custom voice quality: Voice clones described as “too monotonous” by some reviewers compared to standard AI-generated voices.
- Content moderation edge cases: Some users report unexpected content flags — a healthcare blockchain use case reportedly triggered a financial products policy. The filtering logic may have edge cases worth testing before committing to Enterprise.
Structural limitations:
- No HIPAA certification: disqualifying for some US healthcare enterprise buyers
- No ISO 42001: procurement checklists at large regulated enterprises may require this; Synthesia has it
- No MCP server: the REST API and Zapier remain available, but there is no MCP-native integration path for Claude, ChatGPT, or other MCP-compatible workflows
- Language coverage: 100+ languages vs. Synthesia’s 140+ and HeyGen’s 175+; a gap for multilingual global enterprises
- Smaller R&D scale: $28.2M total funding vs. Synthesia’s $150M+ limits the pace of model improvement
What Colossyan Gets Right
Several of Colossyan’s choices are structurally correct for its target market:
Removing industry restrictions. NEO 1’s blocks on healthcare, biotech, and financial services were a self-imposed constraint that limited addressable market without a clear safety benefit. NEO 2 removed them. Novartis and J&J on the customer list suggest the decision is working.
Instant publish without human moderation. Synthesia’s content moderation queue adds hours or days to every video. For L&D teams working against training deadlines, this friction is real. Colossyan’s instant publish after render is a genuine workflow advantage — with the caveat that automated content filtering has edge cases, as some user reports indicate.
Native interactivity without third-party tools. Building branching scenarios and quizzes into the platform, rather than requiring Articulate or other authoring tools, reduces stack complexity for L&D teams. How complete these features are compared with standalone authoring tools (Storyline, Rise 360) is a question for prospective buyers to evaluate based on their specific scenarios.
Document-to-Video. L&D teams have enormous libraries of existing written content — compliance documentation, onboarding materials, process guides. The ability to ingest a PDF and generate a narrated video with minimal editing is a direct productivity multiplier for this specific use case.
Scenario avatars with environmental context. A warehouse safety training video set in a warehouse looks different from one shot against a white background. The contextual realism matters for learner engagement in certain training categories.
Rating: 4/5
Colossyan has built a coherent, well-differentiated enterprise L&D platform with a clear product philosophy and a credible customer roster. The pivot from deepfake detection to AI video creation led to a company that thought deeply about its target market rather than trying to be a general-purpose video tool. The NEO 2 engine’s removal of industry restrictions, the native branching scenario and quiz functionality, the LMS integration depth, and the instant publish workflow are all genuine advantages over specific competitors in specific use cases.
The limitations are real but not disqualifying for most buyers. Colossyan is significantly smaller than Synthesia in funding, estimated revenue, language coverage, and enterprise brand recognition. The conversational avatar product is on waitlist while competitors are shipping. There is no HIPAA certification, no ISO 42001, and no MCP server. Render times on the current engine are slow enough to create friction.
The ideal Colossyan customer is an enterprise L&D team that needs to produce multilingual training videos at speed, wants native branching and quiz features without additional tools, works in a regulated industry where Synthesia’s stock avatar restrictions would be a problem, and doesn’t require HIPAA certification or real-time conversational avatars.
The Synthesia customer is the large enterprise with a strict compliance checklist where ISO 42001 and brand recognition in procurement conversations matter, or where 140+ language coverage is required.
Both can be right depending on what the buyer is optimizing for.
For a platform with $28.2M raised and ~$15M ARR competing against a $4B-valued market leader, Colossyan’s ability to win the Novartis, Porsche, and Jaguar Land Rover accounts suggests the product is credible at the enterprise level. That’s a four-star outcome — strong, differentiated, and genuinely useful for its target market, with known limitations that are worth understanding before committing.
Colossyan is headquartered in London with offices in New York and Budapest. Founded 2020. colossyan.com
We research AI tools from public sources and documentation. We do not receive compensation from vendors reviewed on this site. About ChatForest