One Stop Compression Shop for Cloud and Edge AI
Massive Compute Cost Savings
Save a ton of OPEX by offloading your full-time inference onto edge devices, so you only run your larger cloud models when needed.
We don't inject ourselves in your training or ML Ops pipeline or ask you to change it just for us. Do what you normally do and we'll take it from there.
Data Free and Secure
Is your training data confidential or proprietary? No worries. We don't need it to optimize. Just a few hundred images of inference data will suffice.
Never locked in
We don't ask you to switch inference engines, use a proprietary one, or change how you train your network. Stick with what you are familiar with. For the majority of our techniques, we intervene post-training and without needing your data.
Optimized for your use case
Need more speed? Need higher accuracy? Need a smaller footprint? You decide what to optimize for. We take care of the rest.
The Right Formats
We've compressed models for a variety of target hardware including ST Microelectronics, NXP, Intel, Nvidia, TI, Raspberry Pi, Renesas and others. Get the right model format for the right hardware.
Gain in speed
State of the Art Methods
loss in accuracy
Offload 99% of your cloud AI runtime onto the edge. Compress a model by up to 330x with 95% accuracy retention for 20x gains in speed. When you need to squeeze a lot of fast inference onto a tiny device for non-critical applications, or save on cloud compute or bandwidth, this method is for you! And the best part is that it's GDPR compliant!
Are you a Big Tech company running large models in the cloud at enormous energy and compute costs? Take advantage of the edge to offload the majority of your inference runtime and run your large model only as needed. The best part is that, with your inference running on the edge, it's GDPR compliant!
Squeeze highly accurate, lightweight models onto cost-effective and energy-efficient hardware. Gone are the days of having to deploy costly GPUs to run AI on the edge. Cut your CAPEX by over 10x by deploying AI on your average CPU and single digit microcontrollers.
Value the entire range of your hardware platform. Clients can now deploy the smaller 100k adapted models on energy-efficient microcontrollers while keeping the larger 50M trained networks on an MPU or GPU hub, or even in the cloud. The larger model runs intermittently, as needed, to update the precision of the smaller satellite models, sending only the new weights (and not an entire new model) when conditions change.
Go from 32-bit floating point (FP32) to int8, int4 or beyond while retaining as much accuracy as possible, if not all of it, so you can deploy on the latest hardware. We quantize post-training, with no training data necessary, for ease of use.
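To make the idea of going from FP32 to int8 concrete, here is a minimal sketch of generic symmetric per-tensor post-training quantization. It is an illustration under stated assumptions, not Datakalab's actual method: the function names `quantize_int8` and `dequantize` are hypothetical, and the weights are random stand-ins for a trained layer.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization of FP32 weights to int8.

    Generic textbook scheme (an assumption for illustration):
    one scale per tensor, zero point fixed at 0.
    """
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map int8 values back to approximate FP32 values."""
    return q.astype(np.float32) * scale

# Random stand-in for a trained layer's weights (no training data involved).
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(64, 128)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.max(np.abs(w - w_hat)))

print(f"storage: {w.nbytes} bytes -> {q.nbytes} bytes")
print(f"max reconstruction error: {max_err:.6f} (scale = {scale:.6f})")
```

The int8 tensor takes a quarter of the FP32 storage, and the rounding error per weight is bounded by about half the scale. Production schemes typically go further (per-channel scales, activation calibration), which this sketch leaves out.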
Industrial AI teams
For small, medium and large companies whose own deep learning and machine learning teams are working to implement as much vision and sound inference as possible on constrained or quantized hardware.
Machine Vision Solution Providers
For companies that license machine vision solutions and wish to minimize their clients' deployment CAPEX by allowing them to install the solution on cost-effective CPUs and MCUs.
For the semiconductor provider wishing to address a wider market for their past, current and future chipsets. Let Datakalab take your chip capabilities further through its compression capabilities.
We take a deep dive into your network to tell you which neurons to keep and which ones you can trim off without impacting the end results. Reduce latency by up to 60% and footprint by 50%. We offer best in class performance and a technique that is compatible with sparsity.
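As a rough illustration of trimming neurons without changing the end result, the sketch below applies generic magnitude-based structured pruning to a two-layer network. The importance criterion (L2 norm of each neuron's incoming weights) and the names `prune_neurons` and `forward` are assumptions made for this example; the criterion Datakalab actually uses is not described here.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, w1, b1, w2):
    """Two-layer network: hidden = relu(x @ w1 + b1), output = hidden @ w2."""
    return relu(x @ w1 + b1) @ w2

def prune_neurons(w1, b1, w2, keep_ratio):
    """Keep the hidden neurons with the largest incoming-weight L2 norm.

    w1: (inputs, hidden), b1: (hidden,), w2: (hidden, outputs).
    Removing a neuron drops a column of w1, an entry of b1 and a row of w2.
    """
    n_keep = max(1, int(round(w1.shape[1] * keep_ratio)))
    importance = np.linalg.norm(w1, axis=0)
    kept = np.sort(np.argsort(importance)[-n_keep:])
    return w1[:, kept], b1[kept], w2[kept, :], kept

rng = np.random.default_rng(0)
w1 = rng.normal(size=(16, 8))
b1 = rng.normal(size=8)
w2 = rng.normal(size=(8, 4))

# Simulate two "dead" neurons that contribute nothing to the output.
w1[:, [1, 5]] = 0.0
b1[[1, 5]] = 0.0

w1_p, b1_p, w2_p, kept = prune_neurons(w1, b1, w2, keep_ratio=0.75)

x = rng.normal(size=(10, 16))
same_output = np.allclose(forward(x, w1, b1, w2), forward(x, w1_p, b1_p, w2_p))
print("kept neurons:", kept.tolist())
print("hidden size:", w1.shape[1], "->", w1_p.shape[1])
print("output unchanged:", same_output)
```

Because whole neurons are removed, the pruned layers are simply smaller dense matrices, so the speed-up needs no special sparse kernels. In a real network the low-importance neurons are rarely exactly zero, so some accuracy check after pruning is normally required.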
Industrial AI teams
For small, medium and large companies whose own deep learning and machine learning teams want to get into the nitty-gritty of their networks and go beyond quantization.
Machine Vision Solution Providers
For companies that license machine vision solutions and wish to minimize their clients' deployment CAPEX by allowing them to install the solution on cost-effective CPUs and MCUs.
For the semiconductor provider wishing to address a wider market for their past, current and future chipsets and test how structured and unstructured pruning performs on their silicon.
Batch Norm Folding
If you haven't already folded your own model, we can provide an easy-to-use library that removes the batch norm layers needed for training from the inference calculations, for up to 26% improvement in speed without any degradation in accuracy. BNF can be used with other acceleration strategies on all types of deep learning tasks.
The benefit of BatchNorm folding is that it reduces the computational cost and memory footprint of the neural network without sacrificing accuracy. This makes it easier to deploy the network on devices with limited resources, such as mobile phones or embedded devices. Additionally, BatchNorm folding can help to improve the overall efficiency of the network, as the folded weights can be optimized for hardware acceleration, leading to faster inference times.
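For readers who want to see why folding costs no accuracy, here is a minimal sketch of the standard batch norm folding arithmetic for a dense layer. It is a generic illustration, not Datakalab's library: the function name `fold_batchnorm` and all values are hypothetical. At inference time batch norm is a fixed affine transform, so it can be merged into the preceding layer's weights and bias.

```python
import numpy as np

def fold_batchnorm(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time batch norm into the preceding dense layer.

    Layer:      y = x @ w.T + b                      (w has shape (out, in))
    Batch norm: z = gamma * (y - mean) / sqrt(var + eps) + beta
    Folded:     z = x @ w_folded.T + b_folded
    """
    scale = gamma / np.sqrt(var + eps)   # one factor per output channel
    w_folded = w * scale[:, None]
    b_folded = (b - mean) * scale + beta
    return w_folded, b_folded

rng = np.random.default_rng(0)
n_in, n_out = 16, 8
w = rng.normal(size=(n_out, n_in))
b = rng.normal(size=n_out)
gamma = rng.normal(size=n_out)
beta = rng.normal(size=n_out)
mean = rng.normal(size=n_out)               # running mean from training
var = rng.uniform(0.5, 2.0, size=n_out)     # running variance from training
eps = 1e-5

x = rng.normal(size=(4, n_in))

# Reference: dense layer followed by a separate batch norm step.
y = x @ w.T + b
reference = gamma * (y - mean) / np.sqrt(var + eps) + beta

# Folded: a single dense layer, no batch norm step at inference.
w_f, b_f = fold_batchnorm(w, b, gamma, beta, mean, var, eps)
folded = x @ w_f.T + b_f

outputs_match = np.allclose(reference, folded)
print("outputs match:", outputs_match)
```

The same per-output-channel scaling applies to convolution kernels; only the broadcasting of `scale` over the kernel dimensions changes.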
Machine Vision AI
Machine vision has been an integral part of our DNA since Datakalab started 6 years ago. As a research lab spin-off that originally worked on opt-in emotional analysis, we know the intricate details of TensorFlow, ONNX, Caffe, Keras, PyTorch and Darknet/YOLO.
Maybe you're also working with different architectures like MobileNet, EfficientNet, ResNet, etc.
No matter the format or architecture you work with, odds are that we are very familiar with it. Ping us today to see how we can boost your inference speed or save you money on your embedded hardware spend or cloud costs!
While audio AI models have some specificities compared to machine vision models, they share a whole bunch of similarities, too! And at the end of the day, compression and acceleration are about the math.
As a team that is proud of its research and loves a good challenge, we've ported our compression methods over to audio AI models and have successfully quantized audio models to full int8 with only a 0.001% drop in accuracy.
If you are working on voice assistants or sound detection, contact us today to squeeze your model on tiny devices without having to sacrifice precision!
Turnkey APIs and CLIs
Use our turnkey APIs and Command Line Interfaces. You train, we compress.
We push compression and accuracy to new industry levels. You focus on your AI use case, we'll let you deploy it on cost effective hardware without having to sacrifice precision or speed.
You've invested heavily in your confidential training data and you don't want to share it. No problem! We'll compress your network without needing your training data.
We won't ask you to change your inference engine or get involved in your training pipeline. Give us your model once it's been trained and we'll take it from there.
Save not only on video bandwidth and transmission costs but also on energy to run the networks!
Move from cloud to GPU instances or GPUs to CPUs or CPUs to MCUs. Save upwards of 10x on your deployment!
You tell us which variables you are optimising for (speed, accuracy, size) and we'll make it happen!
State of the Art
Speed and accuracy in one integrated approach. Use one or all four of our methods together, depending on your use case and objectives.
Speed and precision of the future... today
With over 20 research papers and 3 patents pending, our scientists and researchers work non-stop to push the boundaries in lossless AI acceleration and compression.
Check out our latest papers accepted at leading international AI conferences, including NeurIPS, CVPR, ICCV and AAAI. And if you think you can help us push the boundaries even further, hit us up on our career page!