Magika by Google | Best AI for Data | Find AI Tools & Apps

Magika is an AI-powered file type detection tool that uses recent advances in deep learning to detect file types accurately. Under the hood, Magika employs a custom, highly optimized Keras model that weighs only about 1MB and identifies a file within milliseconds, even when running on a single CPU. In an evaluation on over 1M files spanning more than 100 content types (both binary and textual file formats), Magika achieves 99%+ precision and recall. Magika is used at scale to help improve Google users’ safety by routing Gmail, Drive, and Safe Browsing files to the proper security and content policy scanners. You can try Magika without installing anything by using our web demo, which runs locally in your browser!

Key Features:
<ul>
<li>Available as a Python command-line tool, a Python API, and an experimental TFJS version (which powers our web demo).</li>
<li>Trained on a dataset of over 25M files across more than 100 content types.</li>
<li>On our evaluation, Magika achieves 99%+ average precision and recall, outperforming existing approaches.</li>
<li>Supports more than 100 content types.</li>
<li>Batching: You can pass multiple files to the command line and the API at the same time, and Magika batches them to speed up inference.</li>
<li>Near-constant inference time regardless of file size; Magika reads only a limited subset of the file's bytes.</li>
<li>Supports three prediction modes that adjust the tolerance to errors: high-confidence, medium-confidence, and best-guess.</li>
<li>Open source with more enhancements in the pipeline.</li>
</ul>
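Two of the features above, near-constant inference time and the three prediction modes, can be illustrated with a minimal Python sketch. This is not Magika's code or API: the function names `sample_bytes` and `apply_mode`, the 512-byte head and tail sizes, and the threshold values are all hypothetical, chosen only to show the idea. The first function reads a bounded number of bytes however large the input is; the second returns a label only when its score clears the threshold for the chosen mode.

```python
import io
from typing import BinaryIO

# Hypothetical sizes: the text above only says "a limited subset of the
# file's bytes", so these numbers are assumptions for illustration.
HEAD_BYTES = 512
TAIL_BYTES = 512

# Hypothetical thresholds for the three prediction modes named above.
MODE_THRESHOLDS = {
    "high-confidence": 0.95,
    "medium-confidence": 0.50,
    "best-guess": 0.0,
}


def sample_bytes(stream: BinaryIO, head: int = HEAD_BYTES, tail: int = TAIL_BYTES) -> bytes:
    """Return at most head + tail bytes from a seekable binary stream."""
    size = stream.seek(0, io.SEEK_END)  # seek() returns the new absolute position
    stream.seek(0)
    if size <= head + tail:
        # Small input: read everything.
        return stream.read()
    first = stream.read(head)
    stream.seek(size - tail)
    last = stream.read(tail)
    return first + last


def apply_mode(label: str, score: float, mode: str = "high-confidence",
               fallback: str = "unknown") -> str:
    """Keep the predicted label only if its score meets the mode's threshold."""
    return label if score >= MODE_THRESHOLDS[mode] else fallback


if __name__ == "__main__":
    big = io.BytesIO(bytes(range(256)) * 4096)  # about 1 MiB of data
    print(len(sample_bytes(big)))                # bounded by HEAD_BYTES + TAIL_BYTES
    print(apply_mode("pdf", 0.7, "high-confidence"))
    print(apply_mode("pdf", 0.7, "medium-confidence"))
```

Because the amount of data handed to a model is capped this way, the cost of a prediction would not grow with file size, and a stricter mode trades coverage for fewer wrong labels.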

