Sometimes in web applications we come across pages that contain images with text on them. In these scenarios we cannot read the text inside the image with normal XPath selectors; XPath only helps to locate the image itself. To handle this scenario we can use OCR (Optical Character Recognition). In this post we use the OCR class from the Asprise OCR library for Java, which does the work for us.
To get started with OCR, first follow these steps.
- Download the "Asprise OCR" libraries for the operating system you are using.
- Unzip the downloaded folder and add the "aspriseOCR.jar" file to your working directory. If you want, you can download the single jar file from here.
- Set the path in the environment variables: My Computer --> Properties --> Advanced system settings --> Environment Variables --> create a new system variable as below.
- And now you are good to go.
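Before moving on, a quick sanity check of the setup can save some debugging time. The sketch below is not part of the Asprise library; the class name OcrSetupCheck and the method isOnClasspath are made up for illustration. It only checks whether the OCR class from the jar can be found on the classpath, and prints java.library.path on the assumption that this is where the JVM looks for native libraries on your machine.

```java
public class OcrSetupCheck {

    // Returns true if the named class can be found on the current classpath.
    // The class is looked up without being initialized, so no native code is loaded here.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, OcrSetupCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the import used later in this post.
        String ocrClass = "com.asprise.util.ocr.OCR";
        System.out.println(ocrClass + " on classpath: " + isOnClasspath(ocrClass));

        // Assumption: the native part of the library is resolved through this path.
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    }
}
```

If the first line prints false, the jar is most likely not on the classpath of the project yet.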
Look at the image below.
Now you can take the same code and try it yourself.
import java.awt.image.BufferedImage;
import java.net.URL;
import javax.imageio.ImageIO;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import com.asprise.util.ocr.OCR;

public class Test2 {
    // Read the text shown on an image
    public static void main(String[] args) throws Exception {
        // Create a WebDriver instance
        WebDriver driver = new FirefoxDriver();
        try {
            // Open the page
            driver.get("http://anjiztechshare.blogspot.com/2013/11/read-text-on-image.html");
            // Locate the image and store the value of its src attribute
            String imageUrl = driver.findElement(By.xpath("//*[@id='post-body-8001467531740453328']/div[4]/div[7]/a/img")).getAttribute("src");
            // Print the image source path
            System.out.println("Image source path : \n" + imageUrl);
            // Create a URL object from the image source path
            URL url = new URL(imageUrl);
            // Download and decode the image; ImageIO.read returns null if the format is not supported
            BufferedImage image = ImageIO.read(url);
            if (image == null) {
                throw new IllegalStateException("Could not decode image: " + imageUrl);
            }
            // BufferedImage implements RenderedImage, so it can be passed to the OCR directly
            String s = new OCR().recognizeCharacters(image);
            // Print the text on the image line by line
            System.out.println("Text From Image : \n" + s);
            // Print the length of the recognized text
            System.out.println("Length of total text : \n" + s.length());
        } finally {
            // Close the browser even if something above fails
            driver.quit();
        }
    }
}
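The image loading step can be tried on its own, without a browser or the OCR library. The following is only a sketch: the class ImageDecodeSketch and its methods decode and samplePng are names invented for this example. It builds a tiny PNG in memory, decodes it with ImageIO, and shows that the result can be used as a RenderedImage without a cast. It also shows the case where ImageIO cannot decode the data and returns null.

```java
import java.awt.image.BufferedImage;
import java.awt.image.RenderedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

public class ImageDecodeSketch {

    // Decodes raw bytes into an image and fails loudly when no ImageIO reader understands them.
    public static BufferedImage decode(byte[] data) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(data));
            if (image == null) {
                throw new IllegalArgumentException("Unsupported or corrupt image data");
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a small blank PNG in memory so the sketch runs without network access.
    public static byte[] samplePng(int width, int height) {
        try {
            BufferedImage source = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ImageIO.write(source, "png", out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // BufferedImage implements RenderedImage, so no cast is needed here.
        RenderedImage decoded = decode(samplePng(4, 2));
        System.out.println(decoded.getWidth() + "x" + decoded.getHeight()); // prints 4x2
    }
}
```

In the Selenium program the same idea applies: checking the decoded image for null before handing it to the OCR gives a clearer error than a NullPointerException later on.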
Here is the output of the program:
Image source path :
https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEioEQTjbpO6sn0p9vr7xJupgPuYllrM6ePy8Jii8w1Qzp5B_28MJOr-quBowTghGDqCD2kCQUgQxCWB2KM0Li5FhsHZ16SqNir1L423weXHswI1Inkg_GKgxljeyAC_hyQMx0ZPTt94hpPN/s320/love.jpg
Text From Image :
Never M2suse the O, ne
Who Likes You
Never Say Busy To Th,e One
Who Needs You
Never cheat The One
Who ReaZZy Trust You,
Never foJnget The One
Who Zways Remember You.
Length of total text :
175
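Note that s.length() counts every character in the recognized text, including spaces and line breaks. If you want to work with the text line by line, a small helper like the one below may be handy. It is just a sketch: OcrTextSketch, nonEmptyLines and visibleLength are names made up for this example, and it assumes the OCR returns its lines separated by ordinary line breaks, as the output above suggests.

```java
import java.util.ArrayList;
import java.util.List;

public class OcrTextSketch {

    // Splits the recognized text into trimmed lines and drops the empty ones.
    public static List<String> nonEmptyLines(String text) {
        List<String> lines = new ArrayList<>();
        // \R matches any line break (\n, \r\n, \r, ...)
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }

    // Counts only the characters that are not whitespace.
    public static long visibleLength(String text) {
        return text.chars().filter(c -> !Character.isWhitespace(c)).count();
    }

    public static void main(String[] args) {
        // Sample text standing in for the value returned by the OCR.
        String sample = "Who Likes You\r\n\r\nWho Needs You\n";
        System.out.println("Lines            : " + nonEmptyLines(sample));
        System.out.println("length()         : " + sample.length());
        System.out.println("visible characters: " + visibleLength(sample));
    }
}
```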
Next, let's see how we can read an image from the hard drive and print the letters in it. Please read my next blog post. Good luck for now.

