Quantized Model Deployment

INT8 and FP16 Compression for Mobile Acceleration

Language: English
Format: Paperback
Book: Quantized Model Deployment, Clara Whiskers
Libristo code: 52388434
Publisher: Independently published, May 2026

What if the only thing standing between your neural network and real-time mobile performance is the precision you refuse to give up?
Your model ran flawlessly in PyTorch: 400 MB of FP32 weights, a 350-watt GPU, and all the thermal headroom in the world. Then you deployed it to a phone. It stuttered. It heated up. The OS killed it before it produced a single inference. The market no longer asks whether AI can run on mobile. It asks why your AI is slower and less accurate than the cloud version. The answer is not your architecture. It is your precision.
This book is the field manual for engineers who refuse to accept the old compromise of smaller models and weaker accuracy. Inside, you will learn:
• Why INT8 and FP16 are not arbitrary format choices, but hardware-mandated keys to dedicated acceleration paths on Snapdragon, Apple Neural Engine, and MediaTek APU
• How naïve post-training quantization can crater accuracy by double-digit percentages, and the calibration, range estimation, and outlier handling techniques that prevent it
• The exact deployment architecture for TensorFlow Lite, Core ML, ONNX Runtime Mobile, and NNAPI, including operator fusion and numerical equivalence testing
• Why quantization is the only optimization that simultaneously improves latency, accuracy, and power consumption, and how to combine it with pruning and knowledge distillation for wearables and IoT
Stop accepting the compromise between speed and accuracy. Build models that run cooler, faster, and sharper on the devices already in your users' pockets. The precision you can no longer afford is the precision you can finally reclaim.
