
Image Recognition in Artificial Intelligence


Academic year: 2021






[Term paper of Artificial Intelligence (CAP- 402)]

Topic:- Image Recognition in Artificial Intelligence.

Submitted TO: Submitted By:

Mrs. Charu Sharma

Aradhana katoch

Dept. CA

Class: BCA-MCA

Roll No. 07


Synopsis

Course Code: CAP402
Course Instructor: Ms. Navdeep
Course Tutor: _________
Student Roll Number: 07
Section: E3601


I declare that this Term paper is my individual work. I have not copied from any other student's work or from any other source except where due acknowledgement is made explicitly in the text, nor has any part been written for me by another person.

Aradhana katoch (Student Signature)

Evaluator's Comment: _________



Contents

Introduction to the topic

Image Processing

Example

Example of Image recognition

MODEL

Properties of the model

Motion analysis

Introduction to the topic:

"Image recognition is the research area that studies the design and operation of systems that recognize patterns in pictures. It is a long-standing challenge in science. An important application area is image analysis, through which we try to make computers recognize images the way the human mind does.


"For example, when we see a dog, first we recognize that it's an animal.... This recognition concept is simple and familiar to everybody in the real-world environment, but in the world of artificial intelligence, recognizing such objects is an amazing feat. The functionality of the human brain is amazing; it is not comparable with any artificial machine or software. When recognition is done by machines, applications include fingerprint identification, face recognition, character recognition, signature recognition, and classification of objects in scientific and research areas such as astronomy, engineering, statistics, medicine, machine learning and neural networks."

Topics in the field include statistical and structural pattern recognition; image analysis; computational models of vision; enhancement, restoration, segmentation, feature extraction, shape and texture analysis; and character and text recognition.

Current research on image recognition

It takes surprisingly few pixels of information to be able to identify the subject of an image. The discovery could lead to great advances in the automated identification of online images and, ultimately, provide a basis for computers to see like humans do. A team led by an MIT researcher has been trying to find out what is the smallest amount of information (that is, the shortest numerical representation) that can be derived from an image and still provide a useful indication of its content.

At present, the only ways to search for images are based on text captions that people have entered by hand for each picture, and many images lack such information. Automatic identification would also provide a way to index pictures people download from digital cameras onto their computers, without having to go through and caption each one by hand. And ultimately it could lead to true machine vision, which could someday allow robots to make sense of the data coming from their cameras and figure out where they are.

"We will try to find very short codes for images, so that if two images have a similar sequence, they are probably similar, composed of roughly the same object, in roughly the same configuration." If one image has been identified with a caption or title, then other images that match its numerical code would likely show the same object, and so the name associated with one picture can be transferred to the others.
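The idea of short codes and caption transfer can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a simple "average hash" (one bit per pixel, set when the pixel is brighter than the image mean) in place of whatever code the researchers actually use, and the thumbnails, captions and function names are hypothetical.

```python
def average_hash(image):
    """Return a short binary code: 1 where a pixel is brighter than the image mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)


def hamming(code_a, code_b):
    """Count the positions at which two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def transfer_label(query, labelled):
    """Give the query image the caption of the labelled image with the closest code."""
    query_code = average_hash(query)
    best = min(labelled, key=lambda item: hamming(query_code, average_hash(item[0])))
    return best[1]


# Hypothetical 4x4 grey-level thumbnails (values 0-255) with made-up captions.
bright_top = [[200, 210, 205, 198], [190, 201, 207, 199],
              [20, 30, 25, 18], [22, 19, 27, 24]]
bright_left = [[200, 210, 20, 18], [190, 201, 30, 25],
               [205, 198, 19, 27], [207, 199, 22, 24]]
labelled = [(bright_top, "sky over field"), (bright_left, "window beside wall")]

# A noisy variant of the first thumbnail has the same code, so it inherits that caption.
query = [[180, 220, 190, 205], [185, 195, 210, 188],
         [35, 28, 30, 40], [15, 25, 33, 21]]
print(transfer_label(query, labelled))  # sky over field
```

With only 16 bits per image the codes are very coarse; the point is that matching short codes is cheap enough to apply to very large image collections.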

Psychologists have proposed that many human-object interaction activities form unique classes of scenes. Recognizing these scenes is important for many social functions. Enabling a computer to do this is, however, a challenging task. Much of artificial intelligence deals with autonomous planning or deliberation for robotic systems that navigate through an environment. A detailed understanding of these environments is required to navigate through them. Information about the environment could be provided by a computer vision system, acting as a vision sensor and providing high-level information about the environment.


Image Processing

The fields most closely related to computer vision are image processing, image analysis and machine vision. There is a significant overlap in the range of techniques and applications that these cover. This implies that the basic techniques that are used and developed in these fields are more or less identical, which can be interpreted as there being only one field with different names. On the other hand, it appears to be necessary for research groups, scientific journals, conferences and companies to present or market themselves as belonging specifically to one of these fields and, hence, various characterizations which distinguish each of the fields from the others have been presented. The following characterizations appear relevant but should not be taken as universally accepted:

• Image processing and image analysis tend to focus on 2D images and on how to transform one image into another, e.g., by pixel-wise operations such as contrast enhancement, local operations such as edge extraction or noise removal, or geometrical transformations such as rotating the image. This characterisation implies that image processing/analysis neither requires assumptions nor produces interpretations about the image content.

• Computer vision tends to focus on the 3D scene projected onto one or several images, e.g., how to reconstruct structure or other information about the 3D scene from one or several images. Computer vision often relies on more or less complex assumptions about the scene depicted in an image.

• Machine vision tends to focus on applications, mainly in manufacturing, e.g., vision-based autonomous robots and systems for vision-based inspection or measurement. This implies that image sensor technologies and control theory are often integrated with the processing of image data to control a robot, and that real-time processing is emphasised by means of efficient implementations in hardware and software. It also implies that external conditions such as lighting can be, and often are, more controlled in machine vision than they are in general computer vision, which can enable the use of different algorithms.

• There is also a field called imaging, which primarily focuses on the process of producing images but sometimes also deals with the processing and analysis of images. For example, medical imaging contains a lot of work on the analysis of image data in medical applications.
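The image-processing operations named in the first characterization can be illustrated with a small sketch. It assumes a grey-level image stored as a list of rows; the function names are hypothetical. Neither operation needs to know what the image shows, which is the point made above.

```python
def stretch_contrast(image, low=0, high=255):
    """Pixel-wise operation: map the darkest pixel to `low` and the brightest to `high`."""
    flat = [p for row in image for p in row]
    darkest, brightest = min(flat), max(flat)
    if brightest == darkest:  # flat image: nothing to stretch
        return [row[:] for row in image]
    scale = (high - low) / (brightest - darkest)
    return [[round(low + (p - darkest) * scale) for p in row] for row in image]


def rotate_90_clockwise(image):
    """Geometrical transformation: rotate the pixel grid by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


image = [[100, 110],
         [120, 150]]
print(stretch_contrast(image))     # [[0, 51], [102, 255]]
print(rotate_90_clockwise(image))  # [[120, 100], [150, 110]]
```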

Typical tasks of image recognition:-The classical problem is determining whether or not the image data contains some specific object, feature, or activity. This task can normally be solved robustly and without effort by a human, but it is still not satisfactorily solved in computer vision for the general case of arbitrary objects in arbitrary situations. The existing methods can at best solve it only for specific objects, such as simple geometric objects (e.g., polyhedra), human faces, printed or hand-written characters, or vehicles, and in specific situations, typically described in terms of well-defined illumination, background, and pose of the object relative to the camera.

Different varieties of the recognition problem are described in the literature:

Object recognition: one or several pre-specified or learned objects or object classes can be recognized, usually together with their 2D positions in the image or 3D poses in the scene.

Identification: an individual instance of an object is recognized. Examples: identification of a specific person's face or fingerprint, or identification of a specific vehicle.

Detection: the image data is scanned for a specific condition. Examples: detection of possible abnormal cells or tissues in medical images, or detection of a vehicle in an automatic road toll system. Detection based on relatively simple and fast computations is sometimes used for finding smaller regions of interesting image data which can be further analysed by more computationally demanding techniques to produce a correct interpretation.
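The two-stage idea in the detection paragraph (a fast test to find candidate regions, then a more demanding test on those regions only) can be sketched as follows. The two tests, the thresholds and the tiny image are assumptions chosen for illustration: a mean-brightness check stands in for the fast computation, and a sum of squared differences against a template stands in for the costly one.

```python
def window(image, top, left, size):
    """Cut a size x size patch out of the image."""
    return [row[left:left + size] for row in image[top:top + size]]


def cheap_test(patch, min_mean=100):
    """Fast first stage: keep only windows that are bright on average."""
    values = [p for row in patch for p in row]
    return sum(values) / len(values) >= min_mean


def costly_test(patch, template, max_error=1000):
    """Slower second stage: sum of squared differences against a template."""
    error = sum((p - t) ** 2
                for patch_row, template_row in zip(patch, template)
                for p, t in zip(patch_row, template_row))
    return error <= max_error


def detect(image, template):
    """Slide a window over the image; run the costly test only where the cheap one passes."""
    size = len(template)
    hits, costly_calls = [], 0
    for top in range(len(image) - size + 1):
        for left in range(len(image[0]) - size + 1):
            patch = window(image, top, left, size)
            if not cheap_test(patch):
                continue
            costly_calls += 1
            if costly_test(patch, template):
                hits.append((top, left))
    return hits, costly_calls


image = [[10, 10, 10, 10, 10, 10],
         [10, 200, 50, 10, 180, 180],
         [10, 50, 200, 10, 180, 180],
         [10, 10, 10, 10, 10, 10]]
template = [[200, 50],
            [50, 200]]
print(detect(image, template))  # ([(1, 1)], 2)
```

Of the 15 window positions, only two pass the cheap test, so the costly comparison runs twice and accepts one of them.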

Several specialized tasks based on recognition exist, such as:

Content-based image retrieval: finding all images in a larger set of images which have a specific content. The content can be specified in different ways, for example in terms of similarity relative to a target image (give me all images similar to image X), or in terms of high-level search criteria given as text input (give me all images which contain many houses, are taken during winter, and have no cars in them).

Pose estimation: estimating the position or orientation of a specific object relative to the camera. An example application for this technique would be assisting a robot arm in retrieving objects from a conveyor belt in an assembly line situation.

Optical character recognition (OCR): identifying characters in images of printed or handwritten text, usually with a view to encoding the text in a format more amenable to editing or indexing.
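The "give me all images similar to image X" form of content-based retrieval can be sketched with a very simple similarity measure. This example assumes grey-level histograms compared by histogram intersection; practical systems use richer features, and the collection, image names and function names here are hypothetical.

```python
def histogram(image, bins=4, max_value=256):
    """Normalised grey-level histogram with `bins` equal-width bins."""
    counts = [0] * bins
    total = 0
    for row in image:
        for p in row:
            counts[p * bins // max_value] += 1
            total += 1
    return [c / total for c in counts]


def similarity(hist_a, hist_b):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def retrieve(target, collection, top_k=2):
    """Return the names of the `top_k` images most similar to the target image."""
    target_hist = histogram(target)
    ranked = sorted(collection.items(),
                    key=lambda item: similarity(target_hist, histogram(item[1])),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]


# Hypothetical 3x3 grey-level images.
collection = {
    "night_street": [[10, 20, 15], [30, 25, 12], [18, 22, 40]],
    "snow_field": [[240, 250, 245], [230, 235, 255], [248, 242, 238]],
    "dusk_road": [[20, 30, 70], [40, 90, 35], [28, 100, 50]],
}
image_x = [[12, 18, 22], [28, 33, 14], [21, 26, 45]]
print(retrieve(image_x, collection))  # ['night_street', 'dusk_road']
```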

Department of Computer Science and Engineering, Michigan State University. "The Pattern Recognition and Image Processing (PRIP) Lab faculty and students investigate the use of machines to recognize patterns or objects. Methods are developed to sense objects, to discover which of their features distinguish them from others, and to design algorithms which can be used by a machine to do the classification. ... Important applications include face recognition, fingerprint identification, document image analysis, 3D object model construction, robot navigation, and visualization/exploration of 3D volumetric data. Current research problems include biometric authentication, automatic surveillance and tracking, handless HCI, face modeling, digital watermarking and analyzing structure of online documents. Recent graduates of the lab have worked on handwriting recognition, signature verification, visual learning, and image retrieval."


Example:-It takes surprisingly few pixels of information to be able to identify the subject of an image, a team led by an MIT researcher has found. The discovery could lead to great advances in the automated identification of online images and, ultimately, provide a basis for computers to see like humans do.

Deriving such a short representation would be an important step toward making it possible to catalog the billions of images on the Internet automatically.

The aim is to find very short codes for images, "so that if two images have a similar sequence [of numbers], they are probably similar, composed of roughly the same object, in roughly the same configuration." If one image has been identified with a caption or title, then other images that match its numerical code would likely show the same object (such as a car, tree, or person), and so the name associated with one picture can be transferred to the others. "With very large amounts of images, even relatively simple algorithms are able to perform fairly well" in identifying images this way.


Face recognition (a part of image recognition):-Face recognition systems are becoming progressively more popular as a means of extracting biometric information. Face recognition has a critical role in biometric systems and is attractive for numerous applications, including visual surveillance and security. Because of the general public acceptance of face images on various documents, face recognition has great potential to become the next-generation biometric technology of choice. Face images are also the only biometric information available in some legacy databases and international terrorist watch-lists, and they can be acquired even without the subjects' cooperation.

Example of Image recognition:-Detecting objects in cluttered scenes and estimating articulated human body parts are two challenging problems in computer vision. The difficulty is particularly pronounced in activities involving human-object interactions (e.g. playing tennis), where the relevant object tends to be small or only partially visible, and the human body parts are often self-occluded. We observe, however, that objects and human poses can serve as mutual context to each other: recognizing one facilitates the recognition of the other. We then cast the model learning task as a structure learning problem, in which the structural connectivity between the object, the overall human pose, and different body parts is estimated through a structure search approach, and the parameters of the model are estimated by a new max-margin algorithm. On a sports data set of six classes of human-object interactions, the model improves the performance of both object detection and pose estimation.

Introduction:-Using context to aid visual recognition has recently been receiving more and more attention. Psychology experiments show that context plays an important role in recognition in the human visual system. In computer vision, context has been used in problems such as object detection and recognition, scene recognition, action classification, and segmentation. While the idea of using context is clearly a good one, a curious observation is that most of the context information has contributed relatively little to boosting performance in recognition tasks: the difference between context-based methods and sliding-window-based methods for object detection is only a small margin.


MODEL:-Objects and human poses can serve as mutual context to facilitate the recognition of each other. The human pose is better estimated by seeing the cricket bat, from which we can have a strong prior of the pose of the human, and the cricket ball is detected by understanding the human pose of throwing the ball. One reason for the relatively small margin gained from context in earlier work is, in our opinion, the lack of strong context. While it is nice to detect cars in the context of roads, powerful car detectors can nevertheless detect cars with high accuracy whether they are on the road or not. Indeed, for the human visual system, detecting visual abnormality out of context is crucial for survival and social activities. Many important image recognition tasks nevertheless rely critically on context. One such scenario is the problem of human pose estimation and object detection in human-object interaction (HOI) activities. The two tasks are difficult on their own, but they can benefit greatly from serving as context for each other. The goal of this paper is to model the mutual context of objects and human poses in HOI activities so that each can facilitate the recognition of the other. Given a set of training images, our model automatically discovers the relevant poses for each type of HOI activity, and furthermore the connectivity and spatial relationships between the objects and body parts. We formulate this task as a structure learning problem, in which the connectivity is learned by a structure search approach and the model parameters are discriminatively estimated by a novel max-margin approach. By modeling the mutual co-occurrence and spatial relations of objects and human poses, we show that our algorithm significantly improves the performance of both object detection and pose estimation on a dataset of sports images.

Some techniques have been proposed to avoid exhaustively searching the image, which makes the algorithm more efficient. While the most popular detectors are still based on sliding windows, more recent work has tried to integrate context to obtain better performance. However, in most of these works the performance is improved by a relatively small margin. It is out of the scope of this paper to develop an object detection or pose estimation method that generally applies to all situations. Instead, we focus on the role of context in these problems. In most of these works, one type of scene information serves as contextual facilitation to a main recognition problem. For example, ground planes and horizons can help to refine pedestrian detections.

Properties of the model:-

Co-occurrence context for the activity class, object, and human pose. Given the presence of a tennis racket, the human pose is more likely to be playing tennis instead of playing croquet. That is to say, co-occurrence information can be beneficial for coherently modeling the object, the human pose, and the activity class.

Multiple types of human poses for each activity. Our model allows each activity to consist of more than one human pose. Treating the pose as a hidden variable, our model automatically discovers the possible poses from training images. This gives us more flexibility to deal with situations where the human poses in the same activity are inconsistent.
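The two properties can be illustrated with a toy scoring scheme. This is not the model described above (which is learned from training images with structure search and a max-margin method); it is a hand-written sketch in which every table, score and name is a made-up assumption. It shows only how co-occurrence lets an object detection change the chosen pose, and how a hidden pose is handled by keeping the best-scoring pose for each activity.

```python
# Hypothetical log-compatibility tables; all numbers are made up for illustration.
POSES_FOR_ACTIVITY = {
    "tennis": {"forehand": 0.0, "serve": -0.2},
    "croquet": {"bent_over_swing": 0.0},
}
OBJECT_GIVEN_ACTIVITY = {
    "tennis": {"tennis_racket": 0.0, "croquet_mallet": -3.0},
    "croquet": {"tennis_racket": -3.0, "croquet_mallet": 0.0},
}
MISSING = -10.0  # score used when a detector gave no evidence at all


def best_interpretation(pose_scores, object_scores):
    """Pick the (activity, pose, object) triple with the highest combined score."""
    best = None
    for activity, poses in POSES_FOR_ACTIVITY.items():
        # The pose is treated as hidden: keep the best pose this activity allows.
        pose, pose_total = max(
            ((p, prior + pose_scores.get(p, MISSING)) for p, prior in poses.items()),
            key=lambda pair: pair[1])
        for obj, cooccurrence in OBJECT_GIVEN_ACTIVITY[activity].items():
            total = pose_total + cooccurrence + object_scores.get(obj, MISSING)
            if best is None or total > best[0]:
                best = (total, activity, pose, obj)
    return best[1:]


# The pose detector alone slightly prefers the croquet pose...
pose_scores = {"forehand": -1.0, "serve": -2.5, "bent_over_swing": -0.9}
# ...but the object detector clearly sees a tennis racket.
object_scores = {"tennis_racket": -0.3, "croquet_mallet": -2.0}
print(best_interpretation(pose_scores, object_scores))
# ('tennis', 'forehand', 'tennis_racket')
```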

Image Recognition Systems

Motion analysis

Several tasks relate to motion estimation, where an image sequence is processed to produce an estimate of the velocity either at each point in the image or in the 3D scene, or even of the camera that produces the images. Examples of such tasks are:

Egomotion: determining the 3D rigid motion (rotation and translation) of the camera from an image sequence produced by the camera.

Tracking: following the movements of a (usually) smaller set of interest points or objects (e.g., vehicles or humans) in the image sequence.

Optical flow: determining, for each point in the image, how that point is moving relative to the image plane, i.e., its apparent motion. This motion is a result both of how the corresponding 3D point is moving in the scene and of how the camera is moving relative to the scene.
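Apparent motion between two frames can be estimated in many ways. The sketch below uses block matching, one simple option and an assumption of this example rather than a method named above: a small block from the first frame is compared against shifted positions in the second frame, and the shift with the smallest difference is taken as the motion. The frames and function names are hypothetical.

```python
def block_error(prev, curr, top, left, dy, dx, size):
    """Sum of absolute differences between a block in `prev` and the shifted block in `curr`."""
    return sum(abs(prev[top + r][left + c] - curr[top + dy + r][left + dx + c])
               for r in range(size) for c in range(size))


def estimate_motion(prev, curr, top, left, size=2, search=1):
    """Return the (dy, dx) shift within +/-`search` pixels that best matches the block."""
    height, width = len(curr), len(curr[0])
    candidates = []
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= top + dy and top + dy + size <= height
                      and 0 <= left + dx and left + dx + size <= width)
            if inside:
                error = block_error(prev, curr, top, left, dy, dx, size)
                # Ties are broken in favour of the smaller displacement.
                candidates.append((error, abs(dy) + abs(dx), (dy, dx)))
    return min(candidates)[2]


# A bright 2x2 object moves one pixel to the right between two frames.
prev_frame = [[0, 0, 0, 0],
              [9, 9, 0, 0],
              [9, 9, 0, 0],
              [0, 0, 0, 0]]
curr_frame = [[0, 0, 0, 0],
              [0, 9, 9, 0],
              [0, 9, 9, 0],
              [0, 0, 0, 0]]
print(estimate_motion(prev_frame, curr_frame, top=1, left=0))  # (0, 1)
```

Repeating the estimate for every block gives a coarse motion field; following one block from frame to frame is a basic form of tracking.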

Scene reconstruction

Given one or (typically) more images of a scene, or a video, scene reconstruction aims at computing a 3D model of the scene. In the simplest case the model can be a set of 3D points. More sophisticated methods produce a complete 3D surface model.
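The simplest case, a model made of 3D points, can be sketched for one common setup. The example assumes two identical, rectified cameras side by side (so a scene point appears on the same image row in both views), with a known focal length in pixels, baseline in metres and principal point; all of these parameters and the function name are assumptions of the sketch. Depth then follows from the disparity as Z = f * B / d.

```python
def triangulate(x_left, x_right, y, focal_px, baseline_m, cx, cy):
    """Recover a 3D point (in metres, left-camera frame) from a rectified stereo match."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("a visible point must have positive disparity")
    z = focal_px * baseline_m / disparity  # depth from disparity
    x = (x_left - cx) * z / focal_px       # back-project the pixel column
    y_3d = (y - cy) * z / focal_px         # back-project the pixel row
    return (x, y_3d, z)


# A point seen at column 330 in the left image and column 320 in the right image.
point = triangulate(x_left=330, x_right=320, y=240,
                    focal_px=500, baseline_m=0.1, cx=320, cy=240)
print(point)  # approximately (0.1, 0.0, 5.0)
```

Applying this to every matched pixel pair produces a set of 3D points of the kind mentioned above.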

Image restoration

The aim of image restoration is the removal of noise (sensor noise, motion blur, etc.) from images. The simplest approach to noise removal is to apply filters such as low-pass filters or median filters. More sophisticated methods assume a model of what the local image structures look like, a model which distinguishes them from the noise. By first analysing the image data in terms of the local image structures, such as lines or edges, and then controlling the filtering based on local information from the analysis step, a better level of noise removal is usually obtained compared to the simpler approaches. An example in this field is inpainting.
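A median filter, one of the simple approaches mentioned above, can be sketched as follows. The example assumes a grey-level image stored as a list of rows and uses a smaller neighbourhood at the image border, which is one of several possible border conventions; the function name is hypothetical.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 neighbourhood."""
    height, width = len(image), len(image[0])
    result = []
    for r in range(height):
        row = []
        for c in range(width):
            # Near the border the neighbourhood is clipped to the image.
            neighbours = [image[rr][cc]
                          for rr in range(max(0, r - radius), min(height, r + radius + 1))
                          for cc in range(max(0, c - radius), min(width, c + radius + 1))]
            row.append(median_low(neighbours))
        result.append(row)
    return result


# A flat grey image with one "salt" noise pixel in the middle.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
print(median_filter(noisy))  # [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
```

Unlike an averaging (low-pass) filter, the median does not smear the outlier into its neighbours; it simply discards it.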

Some systems are stand-alone applications which solve a specific measurement or detection problem, while others constitute a sub-system of a larger design which, for example, also contains sub-systems for control of mechanical actuators, planning, information databases, man-machine interfaces, etc. The specific implementation of a computer vision system also depends on whether its functionality is pre-specified or whether some part of it can be learned or modified during operation. There are, however, typical functions which are found in many computer vision systems.


Image acquisition:

A digital image is produced by one or several image sensors, which, besides various types of light-sensitive cameras, include range sensors, tomography devices, radar, ultra-sonic cameras, etc. Depending on the type of sensor, the resulting image data is an ordinary 2D image, a 3D volume, or an image sequence. The pixel values typically correspond to light intensity in one or several spectral bands (gray images or colour images), but can also be related to various physical measures, such as depth, absorption or reflectance of sonic or electromagnetic waves, or nuclear magnetic resonance.


Pre-processing:

Before a computer vision method can be applied to image data in order to extract some specific piece of information, it is usually necessary to process the data in order to assure that it satisfies certain assumptions implied by the method. Examples are

o Re-sampling in order to assure that the image coordinate system is correct.

o Noise reduction in order to assure that sensor noise does not introduce false information.

o Contrast enhancement to assure that relevant information can be detected.

o Scale-space representation to enhance image structures at locally appropriate scales.

Feature extraction:

Image features at various levels of complexity are extracted from the image data. Typical examples of such features are

o Lines, edges and ridges.

o Localized interest points such as corners, blobs or points.

More complex features may be related to texture, shape or motion.
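Edge features can be extracted with a gradient operator. The sketch below assumes the Sobel operator, a common choice that is not named in the text above, applied to a grey-level image stored as a list of rows; border pixels are simply left at zero. Names are hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # responds to horizontal intensity change
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # responds to vertical intensity change


def gradient_magnitude(image):
    """Sobel gradient magnitude for interior pixels; border pixels stay 0."""
    height, width = len(image), len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            result[r][c] = math.hypot(gx, gy)
    return result


# A vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 100, 100, 100] for _ in range(5)]
magnitude = gradient_magnitude(step)
edge_pixels = [(r, c) for r in range(5) for c in range(5) if magnitude[r][c] > 200]
print(edge_pixels)  # [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)]
```

Thresholding the magnitude, as in the last lines, is the crudest way to turn the gradient into a set of edge points.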


Detection/segmentation:

At some point in the processing a decision is made about which image points or regions of the image are relevant for further processing. Examples are

o Selection of a specific set of interest points.

o Segmentation of one or multiple image regions which contain a specific object of interest.
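Segmentation into regions can be sketched with the simplest possible rule. The example assumes that objects are brighter than a fixed threshold and that a region is a group of 4-connected bright pixels; both are assumptions of the sketch, and the function name is hypothetical.

```python
def segment(image, threshold):
    """Label 4-connected regions of pixels brighter than `threshold`; 0 is background."""
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    region_count = 0
    for r in range(height):
        for c in range(width):
            if image[r][c] <= threshold or labels[r][c]:
                continue
            # Start a new region and flood-fill it with an explicit stack.
            region_count += 1
            labels[r][c] = region_count
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not labels[ny][nx] and image[ny][nx] > threshold):
                        labels[ny][nx] = region_count
                        stack.append((ny, nx))
    return labels, region_count


image = [[200, 200, 0, 0],
         [200, 0, 0, 180],
         [0, 0, 180, 180]]
labels, region_count = segment(image, threshold=128)
print(labels)        # [[1, 1, 0, 0], [1, 0, 0, 2], [0, 0, 2, 2]]
print(region_count)  # 2
```

Each labelled region can then be passed on as "an image region which is assumed to contain a specific object".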

High-level processing:

At this step the input is typically a small set of data, for example a set of points or an image region which is assumed to contain a specific object. The remaining processing deals with, for example:

o Verification that the data satisfy model-based and application-specific assumptions.

o Estimation of application-specific parameters, such as object pose or object size.

o Classifying a detected object into different categories.
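The last item, classifying a detected object into categories, can be sketched with a nearest-centroid rule. This is a minimal illustration, not a recommended classifier: the two features (relative area and width-to-height ratio), the training values and the category names are all hypothetical.

```python
def centroid(vectors):
    """Mean feature vector of a list of equal-length feature vectors."""
    return [sum(values) / len(vectors) for values in zip(*vectors)]


def train(examples):
    """examples maps a category name to a list of feature vectors."""
    return {category: centroid(vectors) for category, vectors in examples.items()}


def classify(model, features):
    """Return the category whose centroid is closest to the feature vector."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda category: squared_distance(model[category], features))


# Hypothetical features measured on segmented regions: [relative area, width/height].
examples = {
    "coin": [[1.0, 0.9], [1.1, 1.0]],
    "pen": [[5.0, 0.2], [6.0, 0.1]],
}
model = train(examples)
print(classify(model, [1.2, 0.8]))  # coin
print(classify(model, [5.5, 0.3]))  # pen
```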

So, image processing helps an AI system to identify an image and to respond according to that identification.

