State of the Art Neural Network Compression and Acceleration

One Stop Compression Shop for Cloud and Edge AI

Massive Compute Cost Savings

Save a ton of OPEX by offloading your full-time inference onto edge devices, so you only run your larger cloud models when needed.

Post Training

We don't inject ourselves in your training or ML Ops pipeline or ask you to change it just for us. Do what you normally do and we'll take it from there.  

Data Free and Secure

Is your training data confidential or proprietary? No worries. We don't need it to optimize. Just a few hundred images of inference data will suffice.

Never locked in

We don't ask you to switch inference engines, use a proprietary one, or change how you train your network. Stick with what you are familiar with. For the majority of our techniques, we intervene post training and data free.

Optimized for your use case

Need more speed? Need higher accuracy? Need a smaller footprint? You decide what to optimize for. We take care of the rest.

The Right Formats

We've compressed models for a variety of target hardware including STMicroelectronics, NXP, Intel, Nvidia, TI, Raspberry Pi, Renesas and others. Get the right model format for the right hardware.

The Numbers


Footprint reduction

Our unique Context Adaptation method can compress your model from 50M parameters down to 100K.


Gain in speed

Run a highly accurate 100K-parameter model at 5-7 FPS on a Cortex-M7 MCU


State of the Art Methods

That can all be combined for unrivalled compression, speed and accuracy performance

1 min

To quantize

Save the hours and days you would otherwise spend running compression techniques on your own, manually or programmatically.


loss in accuracy

For one of our leading int8 quantization techniques, making it negligible compared to the 4x size reduction we offer

The Methods


Context Adaptation

Offload 99% of your cloud AI runtime onto the edge. Compress a model by up to 330x with 95% accuracy retention for 20x gains in speed. When you need to squeeze a lot of fast inference onto a tiny device for non-critical applications, or save on cloud compute or bandwidth, this method is for you! And the best part is that it's GDPR compliant!

  • Big Tech

    As a Big Tech company, are you running large models on your cloud at enormous energy and compute costs? Take advantage of the edge to offload the majority of your inference runtime and run your large model only as needed. The best part is that, with your inference running on the edge, it's GDPR compliant!

  • Consumer IoT

    Squeeze highly accurate, lightweight models onto cost-effective and energy-efficient hardware. Gone are the days of having to deploy costly GPUs to run AI on the edge. Cut your CAPEX by over 10x by deploying AI on your average CPU and single digit microcontrollers.

  • Silicon Providers

    Get value from the entire range of your hardware platform. Clients can now deploy the smaller 100K-parameter adapted models on energy-efficient microcontrollers while keeping the larger 50M-parameter trained networks on an MPU or GPU hub, or even in the cloud. The larger model runs intermittently, as needed, to update the precision of the smaller satellite models, sending only the new weights (and not an entire new model) when conditions change.


Quantization

Go from 32-bit floating point (FP32) to int8, int4 or beyond while keeping accuracy retention at its maximum, if not lossless, so you can deploy on the latest hardware. We quantize post training and need no training data, for ease of use.

  • Industrial AI teams

    For small, medium and large companies that have their own deep learning and machine learning teams working to implement as much vision and sound inference as possible on constrained or quantized hardware.

  • Machine Vision Solution Providers

    For companies that license machine vision solutions and wish to minimize their clients' deployment CAPEX by allowing them to install the solution on cost-effective CPUs and MCUs.

  • Semiconductor Companies

    For the semiconductor provider wishing to address a wider market for their past, current and future chipsets. Let Datakalab take your chip capabilities further through its compression capabilities.
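To make the FP32-to-int8 idea concrete, here is a minimal sketch of generic symmetric, per-tensor post-training quantization. It is a textbook illustration only and an assumption on our part, not Datakalab's own technique; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
# Generic symmetric per-tensor int8 quantization (illustrative sketch only).

def quantize_int8(weights):
    """Map float weights to int8 codes plus a single float scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]

# Hypothetical FP32 weights of one layer.
weights = [0.82, -1.27, 0.05, 0.40, -0.63]

codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Worst-case rounding error is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage: 4 bytes per FP32 weight versus 1 byte per int8 code.
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)

print("int8 codes:", codes)
print("max reconstruction error:", max_error)
print("size ratio:", fp32_bytes / int8_bytes)
```

The 4-bytes-versus-1-byte storage ratio is where the 4x size reduction of int8 comes from. How much accuracy survives depends on how the scale is chosen, which is where production methods differ from this sketch.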



Pruning

We take a deep dive into your network to tell you which neurons to keep and which ones you can trim off without impacting the end results. Reduce latency by up to 60% and footprint by 50%. We offer best-in-class performance and a technique that is compatible with sparsity.

  • Industrial AI teams

    For small, medium and large companies whose deep learning and machine learning teams want to get into the nitty-gritty of their networks and go beyond quantization.

  • Machine Vision Solution Providers

    For companies that license machine vision solutions and wish to minimize their clients' deployment CAPEX by allowing them to install the solution on cost-effective CPUs and MCUs.

  • Semiconductor Companies

    For the semiconductor provider wishing to address a wider market for their past, current and future chipsets and test how structured and unstructured pruning performs on their silicon. 
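As a rough illustration of what "which neurons to keep" can mean, the sketch below applies a generic structured magnitude criterion: rank each neuron (one row of a layer's weight matrix) by its L1 norm and keep the strongest ones. The criterion, the helper name `prune_neurons` and the toy weights are assumptions for illustration, not Datakalab's actual selection method.

```python
# Generic structured magnitude pruning (illustrative sketch only).

def prune_neurons(weight_rows, keep_ratio):
    """Keep the neurons (rows) with the largest L1 norm.

    Returns the indices of the kept neurons and the reduced weight matrix.
    """
    n_keep = max(1, round(len(weight_rows) * keep_ratio))
    norms = [sum(abs(w) for w in row) for row in weight_rows]
    ranked = sorted(range(len(weight_rows)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # preserve the original neuron order
    return kept, [weight_rows[i] for i in kept]

# Hypothetical layer with 4 neurons and 3 inputs each.
layer = [
    [0.90, -0.70, 0.30],
    [0.01, 0.02, -0.01],
    [-0.50, 0.60, 0.40],
    [0.03, -0.02, 0.00],
]

kept, pruned = prune_neurons(layer, keep_ratio=0.5)

params_before = sum(len(row) for row in layer)
params_after = sum(len(row) for row in pruned)

print("kept neurons:", kept)
print("parameters:", params_before, "->", params_after)
```

Removing whole neurons shrinks the matrix itself, so the saving shows up on ordinary hardware. Unstructured pruning instead zeroes individual weights and relies on sparsity support to turn those zeros into speed.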

Batch Norm Folding

If you haven't already folded your own model, we can provide an easy-to-use library that folds the batch norm layers needed for training into the neighbouring layers, removing them from the inference calculations for up to 26% improvement in speed without any degradation in accuracy. BNF can be used with other acceleration strategies on all types of deep learning tasks.

  • For Everyone

    The benefit of BatchNorm folding is that it can significantly reduce the computational cost and memory footprint of the neural network without sacrificing accuracy. This makes it easier to deploy the network on devices with limited resources, such as mobile phones or embedded devices. Additionally, BatchNorm folding can help to improve the overall efficiency of the network, as the folded weights can be optimized for hardware acceleration, leading to faster inference times.
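The reason folding costs no accuracy is that, at inference time, batch norm is just a fixed per-output scale and shift, which can be absorbed into the weights and bias of the preceding layer. The sketch below shows this standard algebra for a linear layer in plain Python. It is not Datakalab's library; the helper names and the numeric values are hypothetical.

```python
import math

# Standard batch norm folding into a preceding linear layer (sketch only).

def linear(x, W, b):
    """y = W x + b for one input vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch norm using stored running statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]

def fold_batch_norm(W, b, gamma, beta, mean, var, eps=1e-5):
    """Absorb the batch norm scale and shift into W and b."""
    W_folded, b_folded = [], []
    for row, bi, g, bt, m, s in zip(W, b, gamma, beta, mean, var):
        k = g / math.sqrt(s + eps)         # per-output scale
        W_folded.append([k * w for w in row])
        b_folded.append(k * (bi - m) + bt)  # per-output shift
    return W_folded, b_folded

# Hypothetical layer parameters and batch norm statistics.
W = [[0.5, -1.0], [2.0, 0.25]]
b = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.05, -0.3]
mean, var = [0.2, -0.1], [0.9, 1.6]
x = [1.0, 2.0]

reference = batch_norm(linear(x, W, b), gamma, beta, mean, var)
W_folded, b_folded = fold_batch_norm(W, b, gamma, beta, mean, var)
folded = linear(x, W_folded, b_folded)

# The two paths agree up to floating-point rounding.
max_diff = max(abs(r - f) for r, f in zip(reference, folded))
print("max difference:", max_diff)
```

After folding, the batch norm step disappears from the inference graph entirely: one layer does the work of two, with the same outputs.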



Machine Vision AI

Machine vision has been an integral part of our DNA since Datakalab started 6 years ago. As a research lab spin-off that originally worked on opt-in emotional analysis, we know the intricate details of TensorFlow, ONNX, Caffe, Keras, PyTorch and Darknet/YOLO. 
Maybe you're also working with different architectures like MobileNet, EfficientNet, ResNet, etc.
No matter the format or architecture you work with, odds are that we are very familiar with it. Ping us today to see how we can boost your inference speed or save you money on your embedded hardware spend or cloud costs!

Audio AI

While audio AI models may have some specificities compared to machine vision models, we know they share a whole bunch of similarities, too! And at the end of the day, compression and acceleration are about the math.
As a team that is proud of our research and loves a good challenge, we've ported our compression methods over to audio AI models and have successfully quantized audio models to full int8 with only a 0.001% drop in accuracy.
If you are working on voice assistants or sound detection, contact us today to squeeze your model on tiny devices without having to sacrifice precision!

The Benefits

Turnkey APIs and CLIs 

Use our turnkey APIs and Command Line Interfaces. You train, we compress. 

High Performance

We push compression and accuracy to new industry levels. You focus on your AI use case, we'll let you deploy it on cost effective hardware without having to sacrifice precision or speed. 

Data Free

You've invested heavily in your confidential training data and you don't want to share it. No problem! We'll compress your network without needing your training data.

Post Training

We won't ask you to change your inference engine or get involved in your training pipeline. Give us your model once it's been trained and we'll take it from there. 

Opex Savings

Save not only on video bandwidth and transmission costs but also on energy to run the networks!

Capex Savings

Move from cloud to GPU instances or GPUs to CPUs or CPUs to MCUs. Save upwards of 10x on your deployment!

Results Delivery

You tell us which variables you are optimising for (speed, accuracy, size) and we'll make it happen!

State of the Art

Speed and accuracy in one integrated approach. Use one or all four of our methods together, depending on your use case and objectives.

Success Stories


Counting output on the edge

  • "Datakalab's proven compression and AI acceleration techniques have allowed us to move our vision based output counting solution from the cloud to the edge providing us with major cost savings on video bandwidth transmission and faster data processing as our intelligence sits literally next to the camera sensors."

    Director of Innovation at Industrial Client.


Trustworthy AI for critical systems

  • Datakalab was selected as one of 12 Deep Tech start-ups, alongside 30 large industrial organisations, SMEs, and research laboratories, to collectively address issues of trusted artificial intelligence (AI) for critical systems and industrial applications. The collective of industrial and academic partners began using their compression, simulation, human-machine interaction, testing, and explainability technologies to develop a sovereign platform of trusted AI engineering for critical systems. The programme, funded by the French Future Investment Programme, aims to make France a leader in trusted AI.


Lossless Quantization

  • "Our team had almost given up on quantization as a reliable compression technique. We tried everything but saw that the output either lost too much accuracy or the method itself was too complicated and unreliable... until we met Datakalab. Datakalab was able to offer us a reliable and mathematically proven lossless quantization method that just worked and was easy for the team to use repeatedly!"

    Industrial Client in the aviation industry.

Cutting edge R&D is our DNA


Speed and precision of the future... today

With 20+ research papers and 3 patents pending, our scientists and researchers work non-stop to push the boundaries in lossless AI acceleration and compression.

Check out our latest papers accepted at leading international AI conferences, including NeurIPS, CVPR, ICCV and AAAI. And if you think you can help us push the boundaries even further, hit us up on our careers page!

Latest News reveals the 12 winning start-ups and innovative SMEs from its call for expressions of interest

Datakalab is selected to participate in a French community to design and industrialise trustworthy AI-based critical systems.


Datakalab demonstrated that 91% of participants wore their masks throughout the concert organised by AP-HP and PRODISS.

Award "Best Service Provider"

Datakalab and IN Groupe present their partnership to meet the requirements of the States and promote a French security system solution.

Have questions?

Give us your objectives and constraints. We'll show you what we can do.