DeepSeek's new generation of models has become the focus of technological debate with a very clear proposal: context of up to one million tokens and an architecture of more than one trillion parameters Designed to be efficient and, above all, much cheaper than closed-loop alternatives in the United States, the Chinese company has gone all in with the V4, a family that combines open weights, a huge context window, and an aggressive pricing strategy.
This move comes at a time when Europe and Spain are scrutinizing the cost and technological sovereignty of AI. DeepSeek V4 presents itself as an attractive option for European startups, SMEs and large companies that need frontier-level capabilities, but cannot—or do not want to—rely entirely on expensive proprietary APIs or exclusive hardware like the most sought-after NVIDIA GPUs.
A V4 family centered on 1T of parameters and a context of 1M tokens

DeepSeek has announced the arrival of DeepSeek-V4 Preview as a family of open models that revolves around two ideas: a context window of up to 1 million tokens and giant architectures based on Mixture-of-Experts (MoE)Within this family, two main variants stand out: DeepSeek-V4-Pro and DeepSeek-V4-Flash, both with that 1M context as a hallmark.
At the most ambitious end, V4-Pro operates in the figures of up to 1,6 trillion total parameters (1,6T), although it only activates between 32 and 49 billion parameters in each inference step thanks to the MoE scheme, which is crucial for maintaining efficiency. In parallel, the company has introduced lighter variants, such as V4-Flash and V4-Lite, with around 284-285 billion total parameters and about 13 billion active parameters, designed for deployments where speed and cost are the priorities.
The total number of parameters places the V4 family at the top of the market, but the important detail is that Only a fraction of those experts are token-activated.This allows it to behave like a gigantic model in terms of capacity, but with computing power consumption closer to that of much smaller models. It's an approach that fits with DeepSeek's narrative: competing with large, closed-source models without skyrocketing the cost of use.
The company has also released preliminary variants like V4-Lite, which serve as technical validation, and has been adjusting the deployment schedule. Although V4 is still in limited testing phase In some contexts, the V4 Preview family can already be used in the official chatbot and through the company's updated API, with the 1M context as the default value in its services.
Hybrid architecture and a mix of experts to make the long-term context viable
The key to DeepSeek's ability to offer a context window of one million tokens without the inference cost skyrocketing lies in its architecture. The manufacturer explains that V4 introduces a combination of hybrid care, Mixture-of-Experts and compression techniques designed to work with very long sequences, reducing both FLOPs per token and the memory required.
Among the technical components that the company mentions, the following stand out: MLA (Multi-Head Latent Attention), DSA or DeepSeek Sparse Attention, and conditional memory mechanisms such as EngramTogether, these components aim to reduce the burden of attention calculation, especially when the model has to handle hundreds of thousands or a million tokens in a single pass.
According to data shared by the company itself, in scenarios of 1M tokens DeepSeek-V4-Pro may require around 27% of the FLOPs per token and only 10% of the KV cache compared to previous versions like DeepSeek-V3.2Lighter variants, such as V4-Flash, further reduce these figures, positioning themselves as fast inference solutions for applications where latency is critical.
These types of improvements are not just theoretical: the company claims that the combination of MoE, scattered attention, and context comprehension allows operating with ultra-long context in less extreme hardware already a cost per million tokens significantly lower than that of many closed models with windows of 128K or 200K tokens.
Performance in reasoning, programming, and agentic tasks
DeepSeek doesn't just want to stand out because of its size and context. In its internal comparisons, the company insists that V4-Pro and its variants have been specially optimized for complex reasoning, programming, and agentsThese three areas currently account for a significant portion of business demand. Benchmarks such as SWE-bench, designed to measure the capacity of Understanding and modifying code repositoriesThere is talk of figures above 80% accuracy, in line with leading closed models.
In more general reasoning—including mathematics, STEM disciplines, and chain-of-thought problems—the company places V4-Pro as one of the strongest open modelsand argues that it approaches the level of closed-border proposals. In terms of global awareness, internal data places it at the forefront of the open ecosystem and only behind a few very specific proprietary models, such as certain advanced variants of Gemini.
Beyond the numbers, the emphasis on agentic tasks It points to a use that goes far beyond basic chat. DeepSeek claims that V4 already drives its own infrastructure of code agents and systems that chain multiple stepsThey access tools and work on extensive repositories or document databases. This approach aligns with the current industry trend, where many companies are no longer just looking for a chatbot, but for assistants capable of operating as “digital colleagues” within complex workflows.
These comparisons should be taken with a grain of salt: as with almost all recent AI releases, Much of the data comes from the company itself and from tests in controlled environments.Even so, the combination of long context, efficient architecture, and competitive performance is generating attention among European developers who are comparing costs and capabilities against options such as GPT, Claude, Llama, or Mistral.
Open models, published weights, and compatibility with popular APIs
One of the key factors that has brought DeepSeek its notoriety is its commitment to the open ecosystem. With V4, the company reinforces this approach: has published the technical report and released open weights of the family on platforms such as Hugging Faceallowing researchers, companies and public administrations to download the models and run them on their own infrastructure.
This open-weights approach, in contrast to the completely closed proposals of many US laboratories, has clear implications for Spain and the European Union. The possibility of deploying these models in data centers within the EUunder frameworks such as the GDPR and the EU's future AI regulationIt offers a way to maintain greater control over data without sacrificing top-tier capabilities.
In terms of practical integration, DeepSeek has opted to reduce friction: The API maintains the same base_url and is compatible with OpenAI's ChatCompletions schemes and with the Anthropic interfacesFor many development teams, this means that migrating tests or parts of the traffic to V4 is essentially limited to changing the model identifier to deepseek-v4-pro or deepseek-v4-flash and adjusting a few parameters.
At the same time, the company has set a timeline for retiring older models, such as deepseek-chat and deepseek-reasoner. They will be discontinued and redirected to V4-Flash until their complete withdrawal, which forces those who used them to start preparing for the migration. It's a clear way to concentrate the offering on the new generation and avoid fragmenting the user base into too many legacy variants.
Contained inference costs and focus on economic efficiency
DeepSeek's narrative has revolved around efficiency since its inception. With V4, that discourse is reinforced by a combination of MoE architecture, distributed attention, and hardware optimization that aims to... lower the cost per million tokens to levels well below those of the most well-known premium APIsSome external analyses mention figures around $0,30 per million entry tokens for certain configurations, a fraction of what high-end closed models charge.
In the European context, where infrastructure and energy costs are relevant, this focus on efficiency fits well with the needs of startups and SMEs. Processing extensive legal documents, lengthy medical records, or entire software repositories It ceases to be a luxury reserved for companies with almost unlimited budgets and becomes part of affordable scenarios for emerging projects.
Some AI infrastructure providers already offer early access to DeepSeek V4-based nodes as part of their catalogs, making it easier for European companies they can evaluate real performance and costs without having to build their own infrastructure from scratch.For many organizations, this testing phase is the preliminary step before deciding whether to continue with an outsourced model or opt for on-premise deployments.
Meanwhile, the company's partial silence regarding the exact training costs and the specific hardware used has raised doubts within some sectors. Since 2025, suspicions have circulated about the true volume of resources required to train its models, including estimates pointing to tens of thousands of high-end GPUs. DeepSeek insists it has achieved a new stage of "profitable long-term context"But it has not yet completely cleared up the unknowns about the material scale of its operations.
Impact on startups and companies in Spain and Europe
For the European entrepreneurial ecosystem, and in particular for technology startups in Spain, the emergence of models like DeepSeek V4 opens up options that until recently were difficult to consider. Access a model with over a trillion parameters within the context of 1M tokens and open weights It allows you to explore advanced products without relying exclusively on Silicon Valley suppliers.
In regulated sectors—finance, health, legal, public administration—the possibility of run the model in data centers within the EU or even in your own facilities This is especially relevant. Compliance with the GDPR and national data protection regulations becomes more manageable when information does not have to leave European jurisdictions to be processed by an AI model.
Spanish startups that work with large volumes of documents, such as legaltech, healthtech, or developer tools, can leverage the 1M tokens context to analyzing complete files, very long medical histories, or monolithic code repositories without needing to divide them into multiple pieces and design complicated recovery systems. This reduces technical complexity and, in many cases, latency as well.
At the same time, it's important to keep the risks in mind: the ecosystem of tools surrounding DeepSeek is younger than that of other open models like Llama, and The documentation and community support are still maturing.Furthermore, the fact that it is a Chinese company introduces a geopolitical component that some European organizations view with caution, especially in projects linked to administrations or critical infrastructure.
A move that puts pressure on high-cost, closed models
Beyond its specific specifications, DeepSeek V4 is interpreted within the sector as a further step in the competitive pressure on the most expensive closed models on the marketBy establishing the 1M token context as standard across its official services and accompanying it with open weights, the Chinese company sends a clear message: the ultra-long context no longer needs to be an exclusive feature of a few high-priced proprietary models.
For large Western laboratories, this poses a challenge. OpenAI, Anthropic, and Google have historically used a combination of higher quality, broader context and proprietary ecosystem as a value proposition. The emergence of an open alternative with an even superior context in some cases and very low costs forces a rethinking of product and pricing strategies, especially in segments where the margin of user companies is tight.
In the Spanish-speaking world, where many startups operate with much more modest budgets than their counterparts in the United States, competitive pressure works in their favor. The more powerful and open models available, the greater the ability technical teams will have to choose based on price, regulatory compliance, and use case.and not just from the brand behind the API.
At the same time, DeepSeek knows that its bet is not without challenges: most of the benchmarks and comparisons come from its own documentation or from tests in preview phases, and the market is still waiting to see how the V4 models perform when deployed massively in demanding production environments, including European ones.
Overall, the arrival of DeepSeek V4 consolidates a trend that had been developing for some time: Cutting-edge AI models are no longer the exclusive domain of a few companies with closed systems and astronomical budgets.With a combination of over 1T of parameters, a context of 1M tokens, open weights, and a discourse focused on efficiency, the Chinese company introduces an alternative that companies and developers in Spain and Europe will hardly be able to ignore in their upcoming plans for adopting and renewing AI infrastructure.