AN IMPROVED MACHINE LEARNING FRAMEWORK FOR CONSTRUCTING THE RADIO ASTRONOMY IN SPECTRACAL


K. Mahalakshmi¹, K.P.S. Sathya Priya² and T. Senthil Prakash³

¹ PG Student, Department of CSE, Shree Venkateshwara Hi-Tech Engineering College, Gobi, India.

² Assistant Professor, Department of CSE, Shree Venkateshwara Hi-Tech Engineering College, Gobi, India.

³ Head of the Department, Department of CSE, Shree Venkateshwara Hi-Tech Engineering College, Gobi, India.

Article Received: 30 January 2018 Article Accepted: 27 March 2018 Article Published: 10 June 2018

1. INTRODUCTION

Digital image processing is the use of computer algorithms to perform image processing on digital images. As a

subcategory or field of digital signal processing, digital image processing has many advantages over analog image

processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such

as the build-up of noise and signal distortion during processing. Since images are defined over two dimensions

(perhaps more) digital image processing may be modeled in the form of multidimensional systems.

What is an image?

An image is an array, or a matrix, of square pixels (picture elements) arranged in columns and rows.

Figure 1: An image — an array or a matrix of pixels arranged in columns and rows

In an 8-bit greyscale image each picture element has an assigned intensity that ranges from 0 to 255. A greyscale image is what people normally call a black and white image, but the name emphasizes that such an image will also include many shades of grey.
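As a minimal sketch of the definition above, an 8-bit greyscale image can be held as a list of rows, each row a list of intensities. The pixel values and the helper names `dimensions` and `is_valid_8bit` are illustrative assumptions, not part of any particular imaging library.

```python
# A tiny 8-bit greyscale image: rows of pixel intensities (0 = black, 255 = white).
# The values are made up purely for illustration.
image = [
    [0, 64, 128],
    [64, 128, 192],
    [128, 192, 255],
]


def dimensions(img):
    """Return (rows, columns) of a rectangular pixel matrix."""
    return len(img), len(img[0])


def is_valid_8bit(img):
    """Check that every pixel fits in the 8-bit range 0..255."""
    return all(0 <= p <= 255 for row in img for p in row)


rows, cols = dimensions(image)
print(rows, cols, is_valid_8bit(image))
```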


Some greyscale images have more grey levels, for instance 16 bit = 65,536 levels. In principle, three 16-bit greyscale images can be combined to form an image with 281,474,976,710,656 distinct values. There are two general groups of 'images': vector graphics (or line art) and bitmaps (pixel-based 'images'). Some of the most common file formats are:

GIF: an 8-bit (256 colour), non-destructively compressed bitmap format. Mostly used for the web. Has several sub-standards, one of which is the animated GIF.

JPEG: a very efficient (i.e. much information per byte), destructively compressed 24-bit (16 million colours) bitmap format. Widely used, especially for the web and the Internet (bandwidth-limited).

TIFF: the standard 24-bit publication bitmap format. Compresses non-destructively with, for instance, Lempel-Ziv-Welch (LZW) compression.

PS: PostScript, a standard vector format. Has numerous sub-standards and can be difficult to transport across platforms and operating systems.

PSD: a dedicated Photoshop format that keeps all the information in an image, including all the layers.
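The bit depths quoted above (8-bit, 16-bit, 24-bit, and three combined 16-bit channels) all follow from powers of two. The short check below reproduces the figures; the function name `levels` is chosen here only for illustration.

```python
def levels(bits):
    """Number of distinct values representable with the given bit depth."""
    return 2 ** bits


print(levels(8))        # 256: 8-bit greyscale or a GIF palette
print(levels(16))       # 65536: 16-bit greyscale
print(levels(24))       # 16777216: 24-bit colour, about 16 million
print(levels(16) ** 3)  # 281474976710656: three 16-bit channels combined
```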

Colours

For science communication, the two main colour spaces are RGB and CMYK.

RGB

The RGB colour model relates very closely to the way we perceive colour with the r, g and b receptors in our

retinas. RGB uses additive colour mixing and is the basic colour model used in television or any other medium that

projects colour with light. It is the basic colour model used in computers and for web graphics, but it cannot be

used for print production. The secondary colours of RGB – cyan, magenta, and yellow – are formed by mixing two

of the primary colours (red, green or blue) and excluding the third colour. Red and green combine to make yellow,

green and blue to make cyan, and blue and red form magenta. The combination of red, green, and blue at full intensity makes white; the intensities mix together according to the additive colour mixing model. This is analogous to stacking slide images on top of each other and shining light through them.
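The additive mixing rules above can be checked with a few lines of code. This is a simplified sketch that assumes 8-bit channels and models mixing as a per-channel sum clamped at 255; `add_colours` is a hypothetical helper name.

```python
def add_colours(*colours):
    """Additively mix 8-bit RGB triples, clamping each channel at 255."""
    return tuple(min(255, sum(c[i] for c in colours)) for i in range(3))


RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

yellow = add_colours(RED, GREEN)       # (255, 255, 0)
cyan = add_colours(GREEN, BLUE)        # (0, 255, 255)
magenta = add_colours(BLUE, RED)       # (255, 0, 255)
white = add_colours(RED, GREEN, BLUE)  # (255, 255, 255)
print(yellow, cyan, magenta, white)
```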

CMYK

The 4-colour CMYK model used in printing lays down overlapping layers of varying percentages of transparent

cyan (C), magenta (M) and yellow (Y) inks. In addition a layer of black (K) ink can be added. The CMYK model

uses the subtractive colour model.
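To illustrate the subtractive model, the sketch below converts an 8-bit RGB triple to CMYK ink fractions using the common naive formula. This is an assumption-laden simplification: real print workflows rely on device colour profiles, and the function name `rgb_to_cmyk` is chosen here for illustration only.

```python
def rgb_to_cmyk(r, g, b):
    """Naive conversion of 8-bit RGB to CMYK ink fractions in the range 0..1."""
    if (r, g, b) == (0, 0, 0):
        return 0.0, 0.0, 0.0, 1.0  # pure black: only the K ink is used
    # Each ink subtracts its complementary primary from white paper.
    c, m, y = 1 - r / 255, 1 - g / 255, 1 - b / 255
    k = min(c, m, y)  # the shared dark component is moved to the black ink
    return tuple((x - k) / (1 - k) for x in (c, m, y)) + (k,)


print(rgb_to_cmyk(255, 0, 0))      # red: magenta and yellow inks, no cyan
print(rgb_to_cmyk(255, 255, 255))  # white: no ink at all
```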


GAMUT

The range, or gamut, of human colour perception is quite large. The two colour spaces discussed here span only a

fraction of the colours we can see. Furthermore the two spaces do not have the same gamut, meaning that

converting from one colour space to the other may cause problems for colours in the outer regions of the gamuts.

2. ANALOG IMAGE PROCESSING

Analog Image Processing refers to the alteration of an image through electrical means. The most common example is

the television image. The television signal is a voltage level which varies in amplitude to represent brightness

through the image. By electrically varying the signal, the displayed image appearance is altered. The brightness and

contrast controls on a TV set serve to adjust the amplitude and reference of the video signal, resulting in the

brightening, darkening and alteration of the brightness range of the displayed image.
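The effect of the contrast and brightness controls can be imitated numerically as a gain and an offset applied to the signal level. The sketch below is a digital stand-in for the analog behaviour, assuming signal levels normalised to 0..1 and clipped at the ends of that range; `adjust_signal` and its parameters are illustrative names.

```python
def adjust_signal(samples, gain=1.0, offset=0.0, lo=0.0, hi=1.0):
    """Scale (contrast) and shift (brightness) signal levels, clipping to [lo, hi]."""
    return [min(hi, max(lo, gain * s + offset)) for s in samples]


signal = [0.0, 0.5, 1.0]
print(adjust_signal(signal, offset=0.25))  # brighter: every level is raised
print(adjust_signal(signal, gain=2.0))     # more contrast: levels spread and clip
```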

3. K-ANONYMIZATION AS SPATIAL INDEXING

The connection between anonymization and spatial indexing is perhaps not entirely surprising, as earlier work used a new special-purpose spatial index structure (the "pyramid tree") to anonymize objects moving in the spatial domain. The approach considered here instead makes use of a classical spatial index that is already implemented and distributed in commercial and open-source RDBMS products. The indexing-anonymization connection gives us a different perspective on the k-anonymization domain, has several advantages over previously proposed k-anonymization algorithms, and unifies several desired goals for anonymization into a single approach. An R-tree index-based approach to k-anonymization furnishes us with efficient index-construction algorithms that enable faster bulk anonymization than previous techniques, even for memory-resident data sets. Applying R-tree bulk-loading algorithms to anonymization also yields algorithms that perform well even on data sets much larger than main memory, which enables us to anonymize a data set of 100,000,000 records. Minimal bounding boxes from the indexing domain suggest anonymizations that leave gaps in the domain. This can yield far more precise anonymizations than previously proposed techniques, none of which consider leaving gaps, and it opens up an interesting and novel aspect of the ever-present tension between anonymization and precision that has not previously been explored in the k-anonymization literature.
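The idea of publishing minimal bounding boxes can be sketched as follows. This is not an R-tree bulk-loading algorithm: it is a deliberately simplified stand-in that sorts records, packs them into consecutive groups of at least k, and reports each group's bounding box. The records (age, postal code) and the function names are hypothetical.

```python
def pack_groups(records, k):
    """Sort records and pack them into consecutive groups of at least k."""
    if len(records) < k:
        raise ValueError("need at least k records")
    ordered = sorted(records)
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    if len(groups[-1]) < k:  # fold a short tail into the previous group
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def bounding_box(group):
    """Minimum bounding box of a group: one (min, max) pair per attribute."""
    return tuple((min(col), max(col)) for col in zip(*group))


# Hypothetical quasi-identifiers: (age, postal code).
records = [(25, 53706), (27, 53703), (31, 53715), (52, 53710),
           (55, 53711), (58, 53719), (60, 53712)]
boxes = [bounding_box(g) for g in pack_groups(records, k=3)]
print(boxes)
```

In this toy data the two published boxes cover ages 25 to 31 and 52 to 60, so the range 32 to 51 is left as a gap rather than being absorbed into a wider, less precise generalization.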

Spatial indexes are well suited to exploiting anticipated workloads while anonymizing data sets. Selecting specific quasi-identifier attributes on which to build an index and using biased splitting algorithms are two ways to incorporate query workloads into the anonymization. A database owner may wish to distribute anonymized tables of different "granularity" to separate groups, reflecting her trust in each. For example, she may deliver a 5-anonymization of her table to a medical research group while delivering a 10-anonymous version to an insurance research group. Rather than re-anonymizing the original table for each group, and facing the danger of privacy violation in the presence of collusion, we can exploit the tree structure of a spatial index to automatically generate multi-granular anonymized data sets that preserve k-anonymity. Since database indexes are specifically designed for record insertions, deletions and updates, using them for anonymization automatically provides a mechanism for incremental anonymization. However, incremental anonymization raises issues with respect to the preservation of privacy. If an attacker has external knowledge of which individuals' records are being inserted, deleted or updated in a data set, then the attacker may be able to issue a series of queries over time and deduce sensitive information. While providing an incrementally updatable anonymization technique does not solve the inference problem, it is a much better platform for updates than current techniques, which could potentially require re-anonymization of the entire data set after each update.
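The multi-granular idea can be illustrated by treating leaf-level groups as the finer release and merging sibling groups to obtain a coarser one, much like moving one level up an index tree. The sketch below assumes a fan-out of two and uses integer record identifiers as placeholders; `coarsen` and `min_group_size` are hypothetical names, not the source's algorithm.

```python
def coarsen(groups, fanout=2):
    """Merge every `fanout` consecutive groups, like moving one level up a tree."""
    merged = [sum(groups[i:i + fanout], []) for i in range(0, len(groups), fanout)]
    if len(merged) > 1 and len(groups) % fanout:  # fold an incomplete last node
        tail = merged.pop()
        merged[-1].extend(tail)
    return merged


def min_group_size(groups):
    """The k for which a grouping is k-anonymous: the size of its smallest group."""
    return min(len(g) for g in groups)


# Four leaf groups of five placeholder record ids each (a 5-anonymous release).
leaves = [list(range(i, i + 5)) for i in range(0, 20, 5)]
coarse = coarsen(leaves)
print(min_group_size(leaves), min_group_size(coarse))
```

Because every coarse group is a union of whole fine groups, the two releases are nested rather than independently partitioned, which is the property the tree structure provides.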

Finally, the index-based approach to anonymization can exploit the efficiency inherent in index update and bulk-loading algorithms. Previous research in k-anonymizing algorithms has focused almost exclusively on the quality of the resulting anonymization, rather than on the speed with which that anonymization is achieved. An exception is the Mondrian algorithm, whose authors present a polynomial-time algorithm, thus making it practical to consider anonymizing large data sets. While absolute performance was not the goal of that paper, it is interesting to note that the approach suggested there constitutes a top-down multidimensional spatial partitioning algorithm, whereas spatial index building algorithms represent a bottom-up spatial partitioning approach.

To investigate the quality and efficiency of both approaches, we reimplemented the Mondrian algorithm and compared it to bottom-up index-based algorithms. The bottom-up approach gave better quality as measured by the discernibility penalty, KL divergence and the "certainty metric". Furthermore, experiments with our implementation also showed that the bottom-up approach adopted by index bulk-loading algorithms is an order of magnitude faster than the top-down Mondrian approach. It is an interesting area for future research to determine whether this is a fundamental property of all top-down and bottom-up approaches.
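Of the quality measures named above, the discernibility penalty is the simplest to state. The sketch below uses its commonly cited form, in which each record is charged the size of the group it is placed in, so the total is the sum of squared group sizes. It is assumed, not confirmed by the text, that this is the exact variant used in the experiments.

```python
def discernibility_penalty(groups):
    """Sum of squared group sizes: each record is charged the size of its group."""
    return sum(len(g) ** 2 for g in groups)


fine = [[1, 2, 3], [4, 5, 6, 7]]   # two groups of sizes 3 and 4
coarse = [[1, 2, 3, 4, 5, 6, 7]]   # one group of size 7
print(discernibility_penalty(fine), discernibility_penalty(coarse))
```

Lower values indicate that records are hidden in smaller groups, so a partitioning that keeps groups close to the minimum size k scores better.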

4. BIG UNIVERSE BIG DATA

Large-Scale Data Analysis in Astronomy

Machine learning methods can uncover the relation between input data (galaxy images) and outputs (physical properties of galaxies) based on input/output samples, and they've already proved successful in various astrophysical contexts. For example, Daniel Mortlock and colleagues used Bayesian analysis to find the most distant quasar to date. These extremely bright objects form at the centers of large galaxies and are very rare. Bayesian comparison has helped scientists select the few most likely objects for re-observation from thousands of candidates.
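The kind of Bayesian comparison described above can be sketched for a single candidate with two competing hypotheses. The two-hypothesis setup (quasar versus a contaminating star), the function name, and all numbers below are assumptions made for illustration; they are not taken from the cited analysis.

```python
def posterior_quasar(like_quasar, like_star, prior_quasar):
    """Posterior probability of the quasar hypothesis in a two-model comparison."""
    numerator = like_quasar * prior_quasar
    evidence = numerator + like_star * (1.0 - prior_quasar)
    return numerator / evidence


# Made-up inputs: quasars are rare (prior 0.001), but the observed data are
# far more probable under the quasar model than under the star model.
p = posterior_quasar(like_quasar=0.9, like_star=0.0001, prior_quasar=0.001)
print(round(p, 3))
```

Ranking thousands of candidates by such a posterior is one way to pick a short list for follow-up observation, since a strong likelihood ratio can outweigh a very small prior.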

In astronomy, distances from Earth to galaxies are measured by their redshifts, but accurate estimations require

expensive spectroscopy. Getting accurate redshifts from photometry alone is an essential but unsolved task for

which machine learning methods are widely applied.
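As one simple example of such a method, a k-nearest-neighbour regressor estimates a galaxy's redshift from the known redshifts of training galaxies with similar photometric colours. The choice of k-nearest neighbours is an illustrative assumption, and the training values below are synthetic placeholders rather than survey data.

```python
import math


def knn_redshift(train, query, k=3):
    """Mean redshift of the k training galaxies nearest to `query` in colour space."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return sum(z for _, z in nearest) / len(nearest)


# Synthetic training set: ((colour_1, colour_2), spectroscopic redshift).
train = [((0.0, 0.0), 0.1), ((0.1, 0.0), 0.2),
         ((1.0, 1.0), 0.9), ((1.1, 1.0), 1.1)]

estimate = knn_redshift(train, query=(0.05, 0.0), k=2)
print(round(estimate, 2))
```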

Another application is the measurement of galaxy morphologies. Usually, we assign a galaxy a class based on its appearance, traditionally via visual inspection. Lately, this has been accelerated by the citizen science project Galaxy Zoo [3], which aims to involve the public in classifying galaxies. Volunteers have contributed more than 100 million classifications, which allow astrophysicists to look for links between galaxies' appearances (morphology) and internal and external properties.

Several discoveries have been made through the use of data from Galaxy Zoo, and the classifications have provided numerous hints to the correlations between various processes governing galaxy evolution. A galaxy's morphology is difficult to quantify in a concise manner, and automated methods are high on the wish list of astrophysicists. There exists some work on reproducing the classifications using machine learning alone [4], but better systems will be necessary when dealing with the data products of next-generation telescopes.

5. CONCLUSION

Compressed sensing offers a way to address the scalability challenges of existing algorithms in astronomy. By establishing a direct link between sampling and sparsity, compressed sensing has had a huge impact in many scientific fields, especially in astronomy. By emphasizing so rigorously the importance of sparsity, compressed sensing has also shed light on all work related to sparse data representation (such as the wavelet transform, curvelet transform, etc.). Indeed, a signal is generally not sparse in direct space (i.e., pixel space) but can be very sparse after being decomposed on a specific set of functions. For inverse problems, compressed sensing gives theoretical support to methods that seek a sparse solution, since such a solution may be (under appropriate conditions) the exact one. Similar results are hardly accessible with other regularization methods. This explains why wavelets and curvelets are so successful for astronomical image denoising, deconvolution, and inpainting.
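The claim that a signal can be dense in direct space yet sparse after decomposition can be seen with a small Haar wavelet example. This is a minimal sketch assuming a one-dimensional, piecewise-constant signal whose length is a power of two; the function names are illustrative, and the Haar basis is only one of the function sets the text has in mind.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (averages, details)."""
    s = math.sqrt(2.0)
    avg = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    det = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return avg, det


def haar_transform(signal):
    """Full Haar decomposition of a signal whose length is a power of two."""
    coeffs, current = [], list(signal)
    while len(current) > 1:
        current, det = haar_step(current)
        coeffs = det + coeffs  # coarser details go in front of finer ones
    return current + coeffs


def count_nonzero(values, tol=1e-12):
    """Number of entries whose magnitude exceeds a small tolerance."""
    return sum(1 for v in values if abs(v) > tol)


signal = [4.0] * 4 + [1.0] * 4  # every sample is non-zero in direct space
coeffs = haar_transform(signal)
print(count_nonzero(signal), count_nonzero(coeffs))  # 8 2
```

All eight samples are non-zero, but only two Haar coefficients are: the overall average and the single detail that captures the step.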

REFERENCES

1. Arel, I., Rose, D. C., & Karnowski, T. P. 2010, IEEE Computational Intelligence Magazine, 5, 13.

2. Baldi, R. D., Capetti, A., & Giovannini, G. 2016, Astronomische Nachrichten, 337, 114.

3. Banfield, J. K., Wong, O. I., Willett, K. W., et al. 2015, MNRAS, 453, 2326.

4. Banfield, J. K., Andernach, H., Kapińska, A. D., et al. 2016, MNRAS, 460, 2376.

5. Bates, S. D., Bailes, M., Barsdell, B. R., et al. 2012, MNRAS, 427, 1052.