Welcome to the Reading Class!

The ultimate guide to Huawei ML Kit's Text Recognition


Long time no see, friends. Happy New Year, everybody 🥳. We spent most of our time at home in 2020, and that made us understand how short and precious life is. I hope 2021 will not make us grateful for the cursed 2020. Good wishes aside, welcome to another Mr.Roboto series article for the new year. In this article, I'm going to show you how to get text from Bitmaps via ML Kit's Text Recognition library, and I'll explain the topic with the help of a sample app. It's a simple app that shows how things work together. You can find the GitHub repository at the end of the article. So, what are we waiting for? Let's dive in.

## Text Recognition

The text recognition service can extract text from images of receipts, business cards, and documents. This service is useful for industries such as printing, education, and logistics. You can use it to create apps that handle data entry and check tasks.

This service can run on the cloud or on the device, but the supported languages differ between the two scenarios. The on-device API can recognize text in Simplified Chinese, Japanese, Korean, and Latin-based languages (including English, Spanish, Portuguese, Italian, German, French, and Russian, plus special characters; for details about the supported special characters, please refer to Latin Script Supported by On-device Text Recognition), while the on-cloud API supports many more languages, such as Simplified Chinese, English, Spanish, Portuguese, Italian, German, French, Russian, Japanese, Korean, Polish, Finnish, Norwegian, Swedish, Danish, Turkish, Thai, Arabic, Hindi, and Indonesian.

The text recognition service is able to recognize text in both static images and dynamic camera streams with a host of APIs, which you can call synchronously or asynchronously to build your text recognition-enabled apps.

This lengthy but neat explanation comes straight from the official ML Kit Text Recognition documentation.

## Development

First things first: before you test anything out, please scroll down to the Test section and complete the preliminary setup.

The usage is fairly straightforward. There are two ways to recognize text: the first is selecting a bitmap from the gallery, and the second triggers text recognition while the camera stream is on.

The sample app uses both the on-device (offline) and the on-cloud (online) analysis processes, depending on conditions. If the device is connected to the network, it uses the on-cloud method, which gives more accurate results than the on-device method. If not, it falls back to the on-device method, which is also great but limited compared to the on-cloud one. Don't worry, though: the two methods differ only subtly in result accuracy.
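
How the app decides between the two could look something like the sketch below; isOnline is a hypothetical helper I'm using for illustration, not part of ML Kit:

```kotlin
import android.content.Context
import android.net.ConnectivityManager
import android.net.NetworkCapabilities

// Hypothetical helper: true when the active network can reach the internet.
fun isOnline(context: Context): Boolean {
    val cm = context.getSystemService(Context.CONNECTIVITY_SERVICE) as ConnectivityManager
    val network = cm.activeNetwork ?: return false
    val capabilities = cm.getNetworkCapabilities(network) ?: return false
    return capabilities.hasCapability(NetworkCapabilities.NET_CAPABILITY_INTERNET)
}

// The sample then branches: on-cloud analyzer when online, on-device otherwise.
```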

The app flow starts with the initialization of stream text recognition, since the app opens on the camera preview.
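
A minimal sketch of that stream setup, assuming it runs inside an Activity; the preview dimensions and frame rate are illustrative:

```kotlin
import com.huawei.hms.mlsdk.MLAnalyzerFactory
import com.huawei.hms.mlsdk.common.LensEngine

// Create an on-device text analyzer and attach the transactor that
// will receive recognition results from the camera stream.
val analyzer = MLAnalyzerFactory.getInstance().localTextAnalyzer
analyzer.setTransactor(OcrDetectorProcessor { text ->
    // React to recognized text here.
})

// LensEngine feeds camera frames into the analyzer.
val lensEngine = LensEngine.Creator(applicationContext, analyzer)
    .setLensType(LensEngine.BACK_LENS)
    .applyDisplayDimension(1440, 1080) // illustrative preview size
    .applyFps(30.0f)
    .enableAutomaticFocus(true)
    .create()
```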

Then analyzeStream() is invoked by the start-recognition click event.
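
analyzeStream() itself can be as simple as starting the LensEngine on the preview surface; a sketch, assuming surfaceView is the camera preview from the layout:

```kotlin
import java.io.IOException

// Starts feeding camera frames into the analyzer; results arrive in
// OcrDetectorProcessor.transactResult(...).
private fun analyzeStream() {
    try {
        lensEngine.run(surfaceView.holder)
    } catch (e: IOException) {
        lensEngine.release()
    }
}
```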

OcrDetectorProcessor is our class that implements MLTransactor to receive output text from the camera stream. Its transactResult(...) method is triggered at short intervals. Notice that resultAction is a lambda variable that lets us react to the recognized output.
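
A minimal sketch of that transactor; the class name and resultAction come from the sample, while the body simply concatenates the recognized blocks:

```kotlin
import android.util.SparseArray
import com.huawei.hms.mlsdk.common.MLAnalyzer
import com.huawei.hms.mlsdk.text.MLText

class OcrDetectorProcessor(
    private val resultAction: (String) -> Unit
) : MLAnalyzer.MLTransactor<MLText.Block> {

    // Triggered at short intervals with the blocks recognized in recent frames.
    override fun transactResult(results: MLAnalyzer.Result<MLText.Block>) {
        val items: SparseArray<MLText.Block> = results.analyseList
        val text = buildString {
            for (i in 0 until items.size()) {
                appendLine(items.valueAt(i).stringValue)
            }
        }
        resultAction(text)
    }

    // Called when the analyzer is stopped; release resources here if needed.
    override fun destroy() = Unit
}
```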

Whenever the user selects Recognize from the gallery, the initialization of static text recognition kicks in.
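
The initialization differs between the two methods. Here's a sketch of both analyzer setups; the language values are illustrative and would come from the TextLanguage selection described below:

```kotlin
import com.huawei.hms.mlsdk.MLAnalyzerFactory
import com.huawei.hms.mlsdk.text.MLLocalTextSetting
import com.huawei.hms.mlsdk.text.MLRemoteTextSetting

// On-device (offline) analyzer, limited to a single language at a time.
val localSetting = MLLocalTextSetting.Factory()
    .setOCRMode(MLLocalTextSetting.OCR_DETECT_MODE)
    .setLanguage("en") // illustrative
    .create()
val localAnalyzer = MLAnalyzerFactory.getInstance().getLocalTextAnalyzer(localSetting)

// On-cloud (online) analyzer, which accepts a list of languages.
val remoteSetting = MLRemoteTextSetting.Factory()
    .setTextDensityScene(MLRemoteTextSetting.OCR_LOOSE_SCENE)
    .setLanguageList(listOf("en", "zh", "tr")) // illustrative
    .setBorderType(MLRemoteTextSetting.ARC)
    .create()
val remoteAnalyzer = MLAnalyzerFactory.getInstance().getRemoteTextAnalyzer(remoteSetting)
```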

The TextLanguage enum class encapsulates the language support in our domain. In the gist below, you can see that we restricted the sample app to only three languages.
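
Since the gist isn't embedded here, a plausible reconstruction of it; the three languages are my assumption for illustration, so check the repository for the actual set:

```kotlin
// Hypothetical reconstruction of the sample's TextLanguage enum;
// the actual languages in the repo may differ.
enum class TextLanguage(val code: String) {
    ENGLISH("en"),
    CHINESE("zh"),
    TURKISH("tr")
}
```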

After that, we provide the bitmap for the recognition process.

Finally, we show the result of both methods via a simple toast.
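
Put together, the static path might look like this sketch, where analyzer is one of the analyzers created above and bitmap is the image picked from the gallery:

```kotlin
import android.widget.Toast
import com.huawei.hms.mlsdk.common.MLFrame

// Wrap the selected bitmap in an MLFrame and analyze it asynchronously.
val frame = MLFrame.fromBitmap(bitmap)
analyzer.asyncAnalyseFrame(frame)
    .addOnSuccessListener { mlText ->
        // Show the recognized text via a simple toast.
        Toast.makeText(this, mlText.stringValue, Toast.LENGTH_LONG).show()
    }
    .addOnFailureListener { e ->
        Toast.makeText(this, e.localizedMessage, Toast.LENGTH_LONG).show()
    }
```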

## Language Codes

Below you can see all the supported languages and their corresponding language codes. The ones marked with an asterisk are not yet supported by the on-device method:

  • en (English)
  • zh (Chinese)
  • ja (Japanese)
  • ko (Korean)
  • ru (Russian)
  • de (German)
  • fr (French)
  • it (Italian)
  • pt (Portuguese)
  • es (Spanish)
  • pl (Polish)*
  • no (Norwegian)*
  • sv (Swedish)*
  • da (Danish)*
  • tr (Turkish)*
  • fi (Finnish)*
  • th (Thai)*
  • ar (Arabic)*
  • hi (Hindi)*

Before putting an end to the development phase, I'd like to remind you that you can find the package reference here if you want to explore it thoroughly.

## Test

⚠️ Every HMS integration requires the same initial steps. You can use this link to prepare your app before implementing features into it. Please don't skip this part; it is a mandatory phase, and HMS kits will not work as they should without it.

After reading it, you should do one or two more things to run the app. First, enable ML Kit under the Manage APIs tab in AppGallery Connect; you should see the image below after enabling it.

Then, download the generated agconnect-services.json file and place it under the app directory.
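
For reference, the module-level Gradle setup looks roughly like the sketch below (Kotlin DSL shown; replace <version> with the latest release from the HMS documentation):

```kotlin
// Module-level build.gradle.kts: apply the AGConnect plugin and add
// the OCR base SDK plus an on-device model package.
plugins {
    id("com.huawei.agconnect")
}

dependencies {
    implementation("com.huawei.hms:ml-computer-vision-ocr:<version>")
    // On-device Latin model; separate packages exist for CJK languages.
    implementation("com.huawei.hms:ml-computer-vision-ocr-latin-model:<version>")
}
```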

## GitHub Repository

HMS Text Recognition GitHub Link

That is it for this article. You can search for any question that comes to mind on the Huawei Developer Forum. And lastly, you can find lengthy, detailed videos on the Huawei Developers YouTube channel. These resources diversify the learning channels and make it easy to pick things up from a huge knowledge pool. In short, there is something for everybody here 😄. Please comment if you have any questions on your mind. Stay tuned for more HMS development resources. Thanks for reading. Be safe, folks.


© 2024 Yekta Sarioglu. All rights reserved.