Prizmo Go Review: Smarter OCR with the iPhone’s Camera

I've long been using Prizmo to quickly extract text contained in photos using the iPhone's camera. Developed by Creaceed, Prizmo has always stood out among iOS scanner apps thanks to its accurate and fast OCR. While most scanner apps focus on digitizing documents and exporting PDFs, Prizmo complemented that functionality with the ability to recognize and share text with just a couple of taps. Prizmo could be used as a scanner app for paperless workflows, but I preferred to keep it on my devices as a dedicated utility to effortlessly extract and share text.

With Prizmo Go, released today on the App Store, Creaceed is doubling down on Prizmo's best feature with a separate app that's been entirely designed with OCR and sharing text in mind. While OCR was a feature of Prizmo, it becomes the cornerstone of the experience in Prizmo Go, which takes advantage of impressive new OCR technologies to make character recognition smarter, faster, and better integrated with other iOS apps.

I've been using Prizmo Go for the past couple of weeks, and it's one of the most intriguing apps I've tried in a while because it genuinely offers something new. Unlike its predecessor, Prizmo Go uses Microsoft's Cognitive Services to perform cloud-based OCR. Local, on-device character recognition built on Creaceed's engine still exists, with support for 10 languages, offline mode, and automatic language detection, but Cloud OCR (as the feature is called in the app) is what differentiates Prizmo Go from Prizmo.

By relying on Microsoft's Computer Vision API, Cloud OCR in Prizmo Go can automatically detect text in 22 languages, including ones that aren't supported by the built-in, non-cloud OCR, such as Chinese, Japanese, and Arabic. To make Cloud OCR work in the app, Creaceed had to cleverly optimize the app's engine for additional options: for instance, Prizmo Go supports both horizontal and vertical lines of text in Japanese via Cloud OCR – which required adjusting the app's document layout engine to make the camera and API work together. There are other fascinating technical bits under the hood for both local and Cloud OCR: Prizmo Go offers image stabilization through sharpness tracking, and it pre-processes the image while you're holding the iPhone's camera over a document, applying perspective correction and a dynamic rescale to ensure the app is properly tracking the text you meant to identify.
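Creaceed hasn't published how its sharpness tracking works, so the following is only a rough illustration of the general idea. It scores frames with the variance of a Laplacian filter, a common textbook focus measure that I'm swapping in here, and keeps the sharpest one. The function names and the tiny grayscale frames are made up for the example; this is not Prizmo Go's engine.

```python
from statistics import pvariance


def laplacian_variance(frame):
    """Focus score for a grayscale frame given as a list of rows of numbers.

    Applies the 4-neighbor Laplacian to every interior pixel and returns the
    variance of the responses. Blurry frames have weak edges, so they score low.
    """
    height, width = len(frame), len(frame[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                frame[y - 1][x] + frame[y + 1][x]
                + frame[y][x - 1] + frame[y][x + 1]
                - 4 * frame[y][x]
            )
    return pvariance(responses)


def sharpest_frame(frames):
    """Return the frame with the highest focus score."""
    return max(frames, key=laplacian_variance)


# Hypothetical 4x4 frames: a hard black-to-white edge vs. a smooth gradient.
crisp = [[0, 0, 255, 255]] * 4
soft = [[0, 85, 170, 255]] * 4

best = sharpest_frame([soft, crisp])
print("crisp frame wins" if best is crisp else "soft frame wins")
```

A camera app could run a score like this on each incoming frame and only hand a frame to OCR once the score stops improving, which is one plausible reading of "stabilization through sharpness tracking".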

All of this results in an innovative text scanner where Cloud OCR has been seamlessly integrated with iOS hardware. As soon as you point the camera at something that has text in it (a document, a business card, another screen – whatever you want), you'll get real-time text highlights directly in the camera view. Even with Cloud OCR enabled (which you can confirm with the cloud icon at the top of the screen), it only takes a second for text to be processed and highlighted inside the camera.

The visual effect is remarkable: it feels like the iPhone's camera is capable of parsing entire paragraphs of text in less than two seconds, which is even more impressive when you consider that OCR is happening in the cloud. I've never seen anything like Prizmo Go's real-time OCR in the camera before.

Prizmo Go's real-time text recognition via Cloud OCR.



There are some options you can test in the camera view. You can enable the camera flash and image stabilization, or import an image from Photos or other document providers if you don't want to take a new picture using Prizmo Go. If you already have an image in your clipboard, tapping the Image button in the lower left corner will offer a shortcut to import the image you've copied – useful for extracting text out of images you've copied from Safari or other apps.

In addition, you can disable Cloud OCR at any time and switch to the app's built-in recognition, or you can open the Settings and tweak a few other preferences. These include a special low power mode to disable visual effects like the real-time text overlay to save on battery (though low power mode does not affect the quality of text recognition), and a toggle to enable QR code detection in the camera.

What happens after taking a picture in Prizmo Go is equally innovative and unique. After the real-time preview, Prizmo Go will perform its final processing and bring up a split-screen view with the image at the top and recognized text in the lower half of the display. Recognized text is highlighted in blue over the original image, and you can select the extracted plain text at the bottom. However, you can also tweak the selection of text by swiping over the highlights and choosing different bits of text to extract. As you swipe across underlined words to select them, subtle haptic feedback will confirm your selection and put words in the text card underneath the image. Want to extract two non-contiguous sentences from a scanned photo? Just tap & hold the screen until the crosshair loupe appears, select what you need, and you'll get extracted text at the bottom, ready to be shared.

This combination of iOS interface conventions (blue text highlights, magnification loupe) with new technologies such as haptic feedback and Cloud OCR in such an intuitive experience is what makes Prizmo Go one of my favorite app debuts from the past few months. Once text has been extracted, it can also be shared with extensions or copied – it couldn't be easier.

Prizmo Go's Accessibility options.


There's an important accessibility angle to Prizmo Go, too: a Reader option is prominently featured among actions for extracted text, which will speak the captured text in the associated language. Words are highlighted in yellow as they're spoken by iOS' VoiceOver; these are the same voices that you can find in Settings > General > Accessibility > Speech > Voices, including their enhanced variants. There are even playback controls to pause and resume text-to-speech and a slider to tweak the voice's speed. I can only imagine the potential of Prizmo Go for visually impaired users or people who simply can't read fine print and other small text labels; now, using the iPhone's camera, any image can be transformed in a matter of seconds into text that can be spoken aloud, copied, and shared.

Prizmo Go doesn't disappoint from an automation standpoint either. In this first release, the folks at Creaceed have included an x-callback-url compliant URL scheme that enables integration with other apps to pass images and receive extracted text as output. In a nutshell: you can launch Prizmo Go in two modes (take a new picture or use a picture from the clipboard) and specify what you want to do once text has been recognized and extracted. Text can be sent back to another app via x-callback-url, allowing you to set up powerful chains of automations between multiple apps and Prizmo Go.
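To make the mechanics concrete, here is a minimal sketch of how a calling app might assemble such a URL. The `scheme://x-callback-url/action?...` layout and the `x-success`/`x-error` parameters come from the x-callback-url convention itself. The `prizmogo` scheme, the `captureText` action, the `source` and `language` parameter names, and the `myapp` callback are placeholders I invented for the example, so check Creaceed's automation page for the real ones.

```python
from urllib.parse import quote, urlencode


def build_callback_url(scheme, action, params, on_success, on_error=None):
    """Assemble scheme://x-callback-url/action?params per the x-callback-url convention."""
    query = dict(params)
    query["x-success"] = on_success  # URL the target app opens when it finishes
    if on_error is not None:
        query["x-error"] = on_error  # URL the target app opens on failure
    # quote_via=quote percent-encodes spaces as %20 (not "+") and, with
    # urlencode's default safe="", also escapes ":" and "/" in nested URLs.
    return f"{scheme}://x-callback-url/{action}?{urlencode(query, quote_via=quote)}"


# Hypothetical names throughout: "prizmogo", "captureText", "source",
# "language", and the "myapp" callback are placeholders, not documented values.
url = build_callback_url(
    scheme="prizmogo",
    action="captureText",
    params={"source": "clipboard", "language": "cloud"},
    on_success="myapp://x-callback-url/receiveText",
)
print(url)
```

The important detail is that the callback URL is itself percent-encoded before being embedded as a parameter; otherwise its own `://` and `/` characters would be misread as part of the outer URL.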

I've put together a workflow that demonstrates how Prizmo Go can be integrated with third-party apps and other iOS features. First, the workflow will ask you to pick a mode – whether you want to take a new image in Prizmo Go, or if you want to send a previously copied image to the app; if you choose the Clipboard option, you'll be presented with a native photo picker, and the image you select will be copied to the clipboard. Then, the workflow will ask you to choose a type of OCR – I left cloud, en, and it as options; all of these flags are documented in the app's automation page. Finally, I added a menu that lets you save extracted text into the clipboard or as a new note in DEVONthink using the app's new automation features I detailed earlier today.



After choosing these options, Prizmo Go will launch, extract the text and let you validate it, and then either copy the text to the clipboard or send it to DEVONthink – all in a single automation flow that can even be triggered from the Workflow widget.

I've been using this workflow several times a week to quickly extract bits of text from images I want to archive in DEVONthink. You can get it here.[1]

I've been using Prizmo Go for a couple of weeks to extract text from a variety of sources: business cards and other pamphlets[2], prescriptions for our new puppies, and even screenshots of apps that don't let you select text (I'm looking at you, App Store app descriptions). In my experience, Cloud OCR has been fast and reliable, with overall quality superior to Prizmo's built-in OCR. Cloud OCR isn't perfect – it gets the occasional accented character wrong, or it can't recognize an uppercase letter in certain typefaces – but its minor issues don't detract from an otherwise incredible integration in an app that can scan text in over 20 languages within a couple of seconds.

Prizmo Go is also based on a novel business model, and I'm a fan of what Creaceed is attempting here. Prizmo Go is free to download on the App Store, and there are two kinds of In-App Purchases in the app. The first one is a one-time $4.99 In-App Purchase to unlock Export options; these are clipboard and share operations for extracted text, plus interactions with data detectors such as links and phone numbers. The Export Pack does not include VoiceOver, which is always unlocked for Accessibility purposes.

Cloud OCR units are the second type of In-App Purchases in Prizmo Go. Essentially, extracting text through Cloud OCR consumes a unit. Units can be refilled with packs: a 100-unit pack is $0.99, while 1000 units are $4.99. These consumable In-App Purchases are synced across devices with iCloud (so you won't have to buy multiple packs on all your devices), and there's a free 10-unit In-App "Purchase" you can unlock to test Cloud OCR before committing to a paid pack. Microsoft's Cognitive Services APIs are not free for developers, and I think Creaceed found a good compromise: instead of forcing users to pay a recurring subscription, they're treating successful cloud conversions as consumable units that can be easily refilled over time, which is smart.
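The per-scan cost implied by those prices is easy to work out. The short calculation below uses only the pack sizes and prices quoted above, expressed in cents to avoid rounding noise; the helper name is mine.

```python
# Pack size in units -> price in US cents, as quoted in this review.
PACKS = {100: 99, 1000: 499}


def cents_per_unit(units, price_cents):
    """Price of a single Cloud OCR unit, in cents."""
    return price_cents / units


for units, price_cents in PACKS.items():
    per_unit = cents_per_unit(units, price_cents)
    print(f"{units}-unit pack: {per_unit:.3f} cents per Cloud OCR scan")
```

In other words, a scan costs roughly one cent with the small pack and about half a cent with the large one.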

On the first episode of AppStories, I asked John whether the App Store could still surprise us today. Prizmo Go is the perfect example of how an app category can be reinvented with fresh approaches and modern tech. Neither OCR nor scanning through the iPhone's camera is a new idea; highlighting recognized text in real-time through a Cloud OCR engine that supports 22 languages, however, is an experience I'd never had on iOS before.

Despite the complex technologies they're using, Creaceed shipped a polished, intuitive app that feels like having a superpower inside the iPhone's camera. Prizmo Go is an excellent implementation of computer vision and iOS Camera features, and it's gained a permanent spot on my iPhone and iPad.

Prizmo Go is available for free on the App Store.


  1. In my tests, Prizmo Go's URL scheme sometimes failed to recognize images from the clipboard passed by Workflow. It also would have been nice to pass images directly to the URL scheme via base64 encoding – exactly like OmniFocus, Ulysses, and DEVONthink do. I believe Creaceed is working on improving both aspects for a future release.
  2. Prizmo Go has data detectors for phone numbers, URLs, and times, which makes it easy to initiate actions directly from the extracted text just by tapping links.
