Top PDF Signal processing methods for genomic sequence analysis

Signal processing methods for genomic sequence analysis

Malvar during the first two internships and with Dr. Ivan Tashev during the third one. Rico was an extremely good mentor, who showed me what it means to maintain a perfect balance between theoretical research and practical applications. Although he was really busy with his work (well, you may not believe how many meetings he has as a director of MSR), he always found time for discussions and informal chats. I met Ivan during my third internship, where we worked on a project to develop a new adaptive microphone array processing algorithm. Ivan was a passionate mentor and he was very supportive and helpful throughout my internship. In fact, he taught me from A to Z about microphone arrays, although he himself was occupied with so many other things. During the project, I had a chance to build a linear microphone array by myself (it involved designing the array, cutting the plastic board using a laser cutter, soldering the wires, etc.), which was great fun! I am truly grateful to my MSR mentors, Rico and Ivan, who gave me these invaluable experiences that I will never forget.

Complete sequence and genomic analysis of murine gammaherpesvirus 68.

homologs of HVS genes, all of which are also present in the KSHV genome and many of which are present in the EBV genome (Table 1). One ORF, K3, appears to be present in the KSHV and gHV68 genomes but not in the HVS or EBV genome. The overall identity of gHV68 ORFs to their homologs in the other gammaherpesviruses indicates that gHV68 is more closely related to HVS and KSHV than to EBV (Table 1). However, at the level of individual ORFs, there is considerable variability in the extent of homology, and there are several ORFs for which the putative gHV68 ORF product is most closely related to the EBV gene product (see below). It should be noted that we assigned gHV68 genes as homologs of other gammaherpesvirus genes only when BLASTP analyses identified regions of homology (see Materials and Methods). Thus, for example, the gHV68 M9 ORF may be the homolog of HVS and KSHV ORF 65 and/or EBV BFRF3, but none of these gene products were sufficiently homologous to the putative gHV68 M9 gene product to score in the BLASTP search. There was generally good size conservation between gHV68 ORFs and those in KSHV, HVS, and EBV (Table 1). One exception is ORF 57 of gHV68, which is significantly shorter than the KSHV, HVS, and EBV homologs. On further analysis, we found that ORF 57 is open upstream of the ATG at position 76662, which was used to calculate the size of the putative ORF 57 protein (Table 1). A BLASTP search of the ORF upstream of position 76662 revealed significant homology to HVS gene 57. In addition, alignment of this upstream region demonstrated significant homology with portions of the HVS and KSHV gene 57 proteins and the EBV BMLF1 gene. This finding suggests that gHV68 ORF 57 may be a spliced gene, with the ATG-initiated ORF in Table 1 present as the C-terminal portion of a longer ORF 57 product.

Analysis of Genetic Translation using Signal Processing

The utility of our model from the mechanistic perspective is that it suggests how both reading frame maintenance and reading frame shifts could be encoded in mRNA sequences using translational speed to modulate positional accuracy. The model captures the idea that the instantaneous component of hybridization energy, D_k (whose amount is a function of the mRNA sequence), is available to the ribosomal complex to adjust the position of the mRNA relative to the ribosomal decoding center by an amount that is proportional to the time required for a tRNA or release factor to fully occupy the A-site. The model implies that the codon bias of mRNAs could reflect the existence of a position-adjusting mechanism to maintain reading frame. Through codon selection, each mRNA sequence carries the information to fine-tune the position of each codon in the decoding center taking into consideration variable translational speed.

Principal Component Analysis in ECG Signal Processing

Since a wide range of clinical examinations involves ECG signals, huge amounts of data are produced not only for immediate scrutiny, but also for database storage for future retrieval and review. Although hard disk technology has undergone dramatic improvements in recent years, increased disk size is paralleled by the ever-increasing wish of physicians to store more information. In particular, the inclusion of additional ECG leads, the use of higher sampling rates and finer amplitude resolution, the inclusion of noncardiac signals such as blood pressure and respiration, and so on, lead to rapidly increasing demands on disk size. An important driving force behind the development of methods for data compression is the transmission of ECG signals across public telephone networks, cellular networks, intrahospital networks, and wireless communication systems. Transmission of uncompressed data is today too slow, making it incompatible with real-time demands that often accompany many ECG applications.

Entropy And Power Analysis Of Brain Signal Data By EEG Signal Processing

Some entropy methods have been successfully used in EEG feature extraction for epilepsy detection, such as Sample, Approximate, and Spectral entropy, and for motor imagery, such as Approximate, Kolmogorov, and Spectral entropy. We believe that entropy contains useful information and features that are unique to individuals, and hence entropy can be used for person identification. The performance of a person identification system can be measured based on its accuracy and efficiency (identification speed).

Sequence analysis and editing for bisulphite genomic sequencing projects

Bisulphite genomic sequencing is a widely used technique for detailed analysis of the methylation status of a region of DNA. It relies upon the selective deamination of unmethylated cytosine to uracil after treatment with sodium bisulphite, usually followed by PCR amplification of the chosen target region. Since this two-step procedure replaces all unmethylated cytosine bases with thymine, PCR products derived from unmethylated templates contain only three types of nucleotide, in unequal proportions. This can create a number of technical difficulties (e.g. for some base-calling methods) and impedes manual analysis of sequencing results (since the long runs of T or A residues are difficult to align visually with the parent sequence). To facilitate the detailed analysis of bisulphite PCR products (particularly using multiple cloned templates), we have developed a visually intuitive program that identifies the methylation status of CpG dinucleotides by analysis of raw sequence data files produced by MegaBace or ABI sequencers as well as Staden SCF trace files and plain text files. The program then also collates and presents data derived from independent templates (e.g. separate clones). This results in a considerable reduction in the time required for completion of a detailed genomic methylation project.
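
The conversion rule described above (unmethylated C reads as T after bisulphite treatment and PCR, methylated C stays C) can be illustrated with a minimal sketch. This is not the program from the excerpt: the function name `cpg_methylation`, the example sequences, and the assumption that the read is already aligned base-for-base to the parent strand are all hypothetical.

```python
def cpg_methylation(parent, read):
    """Classify the cytosine of every CpG in `parent` from an aligned bisulphite read.

    Assumes (hypothetically) that `read` is already aligned to `parent`,
    base for base, on the same strand.
    """
    if len(parent) != len(read):
        raise ValueError("parent and read must have the same length")
    parent, read = parent.upper(), read.upper()
    calls = {}
    for i in range(len(parent) - 1):
        if parent[i:i + 2] == "CG":          # CpG dinucleotide in the parent
            base = read[i]
            if base == "C":                  # protected from conversion
                calls[i] = "methylated"
            elif base == "T":                # converted C -> U -> T
                calls[i] = "unmethylated"
            else:                            # sequencing error or misalignment
                calls[i] = "ambiguous"
    return calls


# Hypothetical parent sequence and bisulphite-converted read.
parent = "ACGTTCGACCGA"
read = "ACGTTTGATTGA"
print(cpg_methylation(parent, read))
```

A real tool would also have to handle the reverse strand (where G reads as A), base-calling quality, and alignment gaps, none of which this sketch attempts.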

Analysis of Genomic Sequence Using DSP Techniques in LABVIEW

chronic systemic inflammatory disease involving primarily the peripheral synovial joints is the disease taken for analysis. Many genes responsible for RA were identified, and the genomic sequence of each of these genes was obtained from databases such as KEGG (Kyoto Encyclopedia of Genes and Genomes) and the National Center for Biotechnology Information (NCBI). Along with the abnormal genes, a few normal genes were also taken. The normal and abnormal genes were then compared using digital signal processing (DSP) techniques. Here, the Fast Fourier Transform (FFT) is applied to achieve the comparison. The FFT tool is available in the LabVIEW 2011 software. Code was written to extract a string sequence, convert it into a numeric sequence, and then apply the FFT to both the normal and abnormal sequences. The gene sequences were given as inputs to the code. The spectra obtained for the normal and abnormal sequences were analyzed by computing the mean amplitude. Separate code was written and implemented to calculate the mean amplitude.
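
The pipeline described above (string sequence, numeric sequence, spectrum, mean amplitude) can be sketched in Python rather than LabVIEW. Two things here are assumptions and not taken from the excerpt: the base-to-number mapping `BASE_TO_NUMBER` is hypothetical, since the excerpt does not say which encoding was used, and a plain discrete Fourier transform is used in place of an FFT library call (an FFT computes the same spectrum, only faster).

```python
import cmath

# Hypothetical encoding; the excerpt does not specify the numeric mapping.
BASE_TO_NUMBER = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def encode(sequence):
    """Convert a nucleotide string into a numeric sequence."""
    return [BASE_TO_NUMBER[base] for base in sequence.upper()]


def dft(signal):
    """Naive O(N^2) discrete Fourier transform (same result as an FFT)."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def mean_amplitude(sequence):
    """Mean magnitude of the spectrum of the encoded sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    spectrum = dft(encode(sequence))
    return sum(abs(c) for c in spectrum) / len(spectrum)


# Compare two short, made-up sequences by their mean spectral amplitude.
print(mean_amplitude("ACAC"), mean_amplitude("AAAA"))
```

Note that the mean amplitude depends on the chosen encoding and on sequence length, so values are only comparable between sequences processed in the same way.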

Sparse machine learning methods with applications in multivariate signal processing

ML is a relatively young field that can be considered an extension of traditional statistics, with influences from optimisation, artificial intelligence, and theoretical computer science (to name but a few). One of the fundamental tenets of ML is statistical inference and decision making, with a focus on prediction performance of inferred models and exploratory data analysis. In contrast to traditional statistics, there is less focus on issues such as coverage (i.e. the interval for which it can be stated with a given level of confidence contains at least a specified proportion of the sample). In statistics, classical methods rely heavily on assumptions which are often not met in practice. In particular, it is often assumed that the data residuals are normally distributed, at least approximately, or that the central limit theorem can be relied on to produce normally distributed estimates. Unfortunately, when there are outliers in the data, classical (linear) methods often have very poor performance. This calls for theoretically justified non-linear methods which require fewer assumptions. This is the approach that will be taken throughout this thesis, with a focus on developing a computational methodology for efficient inference with empirical evaluation. This will be backed up through analysis drawn from statistical learning theory, which allows us to make guarantees about the generalisation performance (or other relevant properties) of particular algorithms given certain assumptions on the classes of data.

Signal Processing for Radar-Based Gait Analysis

is able to correctly estimate the number of persons, and track the change in the number of individuals as we observe more data. [TSM+18]
• Robust and sequential detection of gait asymmetry: Combining ideas of robust statistics and sequential analysis, an online approach for radar-based gait analysis is proposed. To this end, the radar micro-Doppler signatures are analyzed and salient features are extracted, which quantify the (dis)similarity between consecutive steps, and, thus, gait (a)symmetry. The distributions of these features under each gait class are obtained by means of kernel density estimation. To account for the inaccuracies of the density estimation, an uncertainty model is used to obtain a set of most similar distributions. These distributions are then employed to construct a sequential probability ratio test to render a fast and reliable decision about the underlying gait class. Based on real radar data, it is shown that the proposed approach can achieve high detection rates at reduced measurement durations. [SRZA19]
• Robust genetic algorithm for the prediction of the maximal knee angle during walking: Given a set of radar features, the maximal knee angle during walking is predicted using a new robust genetic algorithm based nonlinear regression method. It simultaneously performs feature selection, parameter optimization for the support vector machine and outlier rejection by encoding these aspects into the chromosome design. Using experimental radar data, it is shown that the proposed algorithm significantly outperforms competing methods in terms of prediction accuracy.
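
To make the sequential probability ratio test mentioned above more concrete, here is a minimal sketch of the classical (non-robust) Wald test. It is not the method of [SRZA19]: it assumes, purely for illustration, that the feature follows a known Gaussian density under each gait class, whereas the excerpt describes kernel density estimates combined with an uncertainty model. The names `sprt` and `gaussian_logpdf` and all parameter values are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a normal distribution (illustrative stand-in for a KDE)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def sprt(samples, h0, h1, alpha=0.05, beta=0.05):
    """Wald's sequential probability ratio test between two Gaussian hypotheses.

    h0 and h1 are (mean, std) pairs; alpha and beta are the target error rates.
    Returns the decision and the number of samples consumed.
    """
    upper = math.log((1 - beta) / alpha)   # cross this: decide H1
    lower = math.log(beta / (1 - alpha))   # cross this: decide H0
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        llr += gaussian_logpdf(x, *h1) - gaussian_logpdf(x, *h0)
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(samples)


# Hypothetical step-dissimilarity features: H0 = symmetric gait, H1 = asymmetric gait.
print(sprt([1.5, 1.5, 1.5, 1.5], h0=(0.0, 1.0), h1=(1.0, 1.0)))
```

The test stops as soon as the accumulated log-likelihood ratio crosses either threshold, which is what allows a decision at reduced measurement durations.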

Classification and segmentation methods with application in audio and acoustic signal processing

Envelope Analysis and Data-Driven Approaches to Acoustic Feature Extraction for Predicting the Remaining Useful Life of Rotating Machinery, Proceedings of the IEEE International Confer[r]

Signal Processing Methods for Quantitative Power Doppler Microvascular Angiography

The WFSC method in [22] was extended to process 3-D quadrature demodulated (IQ) data instead of the scan-converted images analyzed in our previous work. Use of unfiltered IQ data enables retrospective application of any number of wall filter cut-off frequencies to construct the selection curves. Power Doppler processing software was developed in MATLAB R2013a (The MathWorks, Inc., Natick, MA, USA). For each image plane of a 3-D volume, Doppler IQ data were filtered using a third-order Type I Chebychev IIR wall filter at 100 increments of cut-off frequency from 0.005 to 0.5 times the pulse repetition frequency (PRF), which was set to 4 kHz. A third-order Chebychev filter was selected based on the analysis in [29, 30]. Each image plane was divided into adjacent, non-overlapping rectangular subregions of equal dimensions using an automated method detailed in [22]. The smallest rectangular window that maintains CPD < 0.80 in all subregions is used, so the number and size of the subregions varies among ROIs. Color pixel densities were computed after filtering at each cut-off frequency to construct WFSCs for each subregion. Characteristic intervals within these WFSCs were automatically identified as ranges of cut-off frequency that are bounded by local maxima in the normalized absolute first difference of the CPD, |∆CPD|_norm, using the algorithm
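
The cut-off frequency sweep and the quantity |∆CPD|_norm can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of [22]: the excerpt's "100 increments" is read as 100 evenly spaced values including both end points, normalization is assumed to be division by the largest absolute difference, and the function names and CPD values are hypothetical. The wall filtering itself is not reproduced.

```python
PRF_HZ = 4000.0  # pulse repetition frequency stated in the excerpt


def cutoff_grid(n=100, lo=0.005, hi=0.5, prf=PRF_HZ):
    """Cut-off frequencies in Hz, evenly spaced from lo*PRF to hi*PRF (assumed inclusive)."""
    step = (hi - lo) / (n - 1)
    return [(lo + i * step) * prf for i in range(n)]


def normalized_abs_diff(cpd):
    """Absolute first difference of a CPD curve, scaled so its maximum is 1 (assumed)."""
    diffs = [abs(b - a) for a, b in zip(cpd, cpd[1:])]
    peak = max(diffs)
    return [d / peak for d in diffs] if peak > 0 else diffs


def local_maxima(values):
    """Indices of interior points that rise from the left and do not rise to the right."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    ]


# Hypothetical CPD values for one subregion at six consecutive cut-off frequencies.
cpd = [0.9, 0.8, 0.5, 0.45, 0.2, 0.19]
print(local_maxima(normalized_abs_diff(cpd)))
```

In this reading, the indices returned by `local_maxima` mark the boundaries between characteristic intervals of the selection curve.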

Advanced Signal Processing Methods for Planetary Radar Sounders Data

A Bat Inspired Model for Clutter Reduction in Radar Sounder Systems

Due to the radar sounder's wide antenna beam, off-nadir surface reflections (i.e. clutter) of the transmitted signal can compete with echoes coming from the subsurface, thus masking them. Different strategies have been adopted for clutter mitigation. However, none of them has proved to be the final solution for this specific problem. In this context, we took inspiration from bats to study effective clutter detection strategies. Bats are very well known for their ability to discriminate between prey and unwanted clutter (e.g. foliage) by effectively employing their sonar. According to recent studies, big brown bats can discriminate clutter by transmitting two different carrier frequencies. Most interestingly, there are many striking analogies between the characteristics of the bat sonar and those of a radar sounder. Among the most important ones, they share the same nadir acquisition geometry and transmitted signal type (i.e. linear frequency modulation). In this thesis, we explore the feasibility of exploiting frequency diversity for the purpose of clutter discrimination in radar sounding by mimicking the unique signal processing strategies of bats. Accordingly, we propose a frequency diversity clutter reduction method based on specific mathematical conditions that, if verified, allow disambiguation between the clutter and the subsurface signal. These analytic conditions depend on factors such as the difference in central carrier frequencies, surface roughness and subsurface material properties. One of the main strengths of the proposed technique is that it does not rely on any a priori information such as a digital elevation model of the surface, which is not always available. The method's performance has been evaluated on experimental Mars data acquired by SHARAD, proving its effectiveness. In particular, the results obtained on radargrams of icy and volcanic regions of Mars, which were previously analyzed in time-consuming studies by planetary geophysicists, proved that the method can effectively discriminate between genuine subsurface reflections and off-nadir surface clutter, thus paving the way for a new generation of multi-frequency radar sounders with clutter discrimination capabilities.

Statistical Methods for Signal Processing with Application to Automatic Accent Recognition

through cepstral analysis. In detail, we examine the prediction ability of the classifiers with different numbers of MFCCs, varying from as small as 12 to as large as 39. The number of filters in the filter bank is chosen to be 40 in order to extract rich information from the signal. In terms of classification, we apply linear discriminant function, quadratic discriminant function, SVM with linear, RBF, and 2nd order polynomial kernels, and k-NN. Table 4.4 provides the performance results on frequency domain. In each cell, the first value represents the training accuracy, the second value (in italic) the mean prediction accuracy, and the third value (in parentheses) the standard deviation of the prediction accuracy.

Functional Analysis of the Murine Coronavirus Genomic RNA Packaging Signal

Our demonstration of the principal, and possibly exclusive, role of the MHV PS in the selection of gRNA for virion assembly raises a number of further issues. One of the most immediate questions is the identity of the interacting partner that carries out the selection of gRNA. The obvious candidate for this activity is the nucleocapsid (N) protein, which is the only known protein

FIG 8 Relative fitness of PS mutants. (A) Monolayers of 17Cl1 cells were coinfected with ΔPS and wild-type viruses at an initial input PFU ratio of 1:1, 10:1, or 100:1, as detailed in Materials and Methods. Harvested released virus was serially propagated for a total of five passages. At each passage, RNA was isolated from infected cells and analyzed by RT-PCR, using a pair of primers flanking the central region of the nsp15 ORF to assay the presence or absence of the PS. PCR products were analyzed by agarose gel electrophoresis; the sizes of DNA markers are indicated on the left of each gel. Control lanes show RT-PCR products obtained from infections with wild-type virus or the ΔPS mutant alone or from uninfected cells (mock). (B) Competition between silPS and silPS-PS2 viruses was evaluated by the same procedure, except that RT-PCR was carried out with a pair of primers flanking the intergenic region between the replicase and the S ORFs to assay the presence or absence of the transposed PS element (PS2). The asterisk to the right of each agarose gel marks the position of an artifactual heteroduplex band formed by opposite strands of the 501-bp silPS-PS2 product and the 341-bp silPS product. The positions and sizes of PCR primers in the schematics are not to scale.

Analysis of ECG Signal for Detecting Heart Blocks Using Signal Processing Techniques

Peter Kovacs [4] presents an algorithm that generates realistic synthetic ECG signals; among other uses, this algorithm can be used to test new methods in ECG processing. Using numerical and geometrical parameters that are diagnostically important, the generated signal can be interpreted as a biomedical signal with diagnostically important intervals such as QRS, QT, and PR. In addition, this method gives strict mathematical control over the signal. The details and importance of the ECG waveform are given in Tables 1.1 and 1.2; based on these data, the radiologist can assess the human heart. The proposed method consists of simulating the generation of ECG waves in order to study heart blocks.

Analyzing the Effects of Graph Construction Methods on Image using Graph Signal Processing

The representation, analysis, and compression of such data is a challenging task that requires the development of new tools, such as graph signal processing, that can identify and properly exploit data structures. The data domain, in these cases, is defined by a graph. The graph consists of vertices, where the data values are defined or sensed, and the edges connecting these vertices. The graph captures the fundamental relations among the data based on their relevant properties. The processing of signals whose sensing domains are defined by graphs has made graph data processing an emerging field [1][2] in big data signal processing. This is a big step forward from classical time (or space) series data analysis.
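
As a small illustration of a signal defined on a graph, the sketch below treats each pixel of an image as a vertex, connects every pixel to its horizontal and vertical neighbours, and applies the combinatorial graph Laplacian to the pixel values. The 4-connected grid with unit edge weights is only one possible construction and is an assumption here; the function names are hypothetical.

```python
def grid_graph(height, width):
    """Adjacency lists of a 4-connected pixel grid, vertices numbered row-major."""
    neighbors = {r * width + c: [] for r in range(height) for c in range(width)}
    for r in range(height):
        for c in range(width):
            v = r * width + c
            if c + 1 < width:                 # edge to the right-hand pixel
                neighbors[v].append(v + 1)
                neighbors[v + 1].append(v)
            if r + 1 < height:                # edge to the pixel below
                neighbors[v].append(v + width)
                neighbors[v + width].append(v)
    return neighbors


def laplacian_apply(neighbors, signal):
    """Compute (L x)_v = sum over neighbours u of (x_v - x_u), with unit weights."""
    return [
        sum(signal[v] - signal[u] for u in neighbors[v])
        for v in sorted(neighbors)
    ]


# A hypothetical 2x2 image flattened row by row; one pixel differs from the rest.
image = [1, 1, 1, 5]
graph = grid_graph(2, 2)
print(laplacian_apply(graph, image))
```

Large Laplacian values flag vertices whose signal differs strongly from their neighbours, which is why the choice of graph construction affects what the analysis sees as smooth or abrupt.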

Development of Digital Signal Processing and Statistical Classification Methods for Distinguishing Nasal Consonants

In the past few years, researchers from the Voice I/O Lab of the Department of Computer Science at North Carolina State University have attempted to identify nasal consonants. They designed algorithms using the first two moments: mean and variance ([15][8]). In this chapter, those algorithms are briefly reviewed and their performance is re-evaluated. Although those algorithms do not provide convincing results by themselves, the idea of moment-space analysis does bring more information on the spectra of the nasals and will help design new approaches in the future.

Integrated genomic analysis of mitochondrial RNA processing in human cancers

Previous studies have highlighted that sequencing mismatches observed in RNA sequencing data at particular positions in the mitochondrial genome represent post-transcriptional modification events [17, 18, 28]. The assumption behind this approach is that chemical modifications of RNA either act as a road-block to the reverse transcription enzyme during library preparation or cause the enzyme to mis-incorporate nucleotides, resulting in sequencing errors [29]. Recent work by Mercer et al. [28], Sanchez et al. [17], and ourselves [18] has shown that sequence mismatches occur at a high rate at the ninth position of different mitochondrial tRNAs, which are positions that are known to be post-transcriptionally methylated, and subsequent experiments by ourselves [18] have shown that the proportion of mismatches at these sites is systematic and repeatable across replication experiments. Recently, new sequencing methods have confirmed the presence of post-transcriptional methylation events at 19/22 p9 sites by comparing RNA sequencing data from samples that have been treated with demethylation enzymes against matched untreated samples [30]. Within this work, the general quantitative nature of using mismatch and strand termination events as a proxy for post-transcriptional modification was also shown, and although the ratio of these events does not perfectly match post-transcriptional modification levels (since the reference allele is sometimes incorporated), the two levels are highly similar. Under this model, we inferred the level of p9 site methylation as the proportion of non-reference alleles for the 11 positions identified as undergoing post-transcriptional methylation in our previous study
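
The proxy described above, the proportion of non-reference alleles at a site, reduces to a simple ratio of read counts. The sketch below is illustrative only: the function name and the base counts are hypothetical, and it ignores filtering steps (base quality, mapping quality, strand termination events) that a real analysis would need.

```python
def non_reference_proportion(base_counts, reference):
    """Fraction of reads at one position whose base differs from the reference allele."""
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("no reads cover this position")
    return (total - base_counts.get(reference, 0)) / total


# Hypothetical read counts at a p9 site whose reference allele is A.
counts = {"A": 60, "T": 25, "C": 10, "G": 5}
print(non_reference_proportion(counts, "A"))
```

Because the reference allele is sometimes incorporated even at a modified site, this ratio tracks the modification level without matching it exactly, as the excerpt notes.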

Elementary Time Frequency Analysis of EEG Signal Processing

performed using the discrete Fourier transform (DFT), the discrete cosine transform (DCT), etc. However, these analysis tools have the disadvantage that the spectrum of the signal is degraded by short-term windowing and fixed transforms. Parametric estimation methods for EEG signals, such as autoregressive (AR) models, have the advantage over the DCT of representing the frequency domain correctly, but the disadvantage that the model parameters are estimated poorly, since the measured signal is of very limited length. EEG signals are statistically non-stationary because many abnormal events occur while they are captured. [1]

Digital signal processing approaches to bird song analysis

A more complex method of VAD, which results in a similar style of output (song start and end times), uses HMMs. HMMs have been widely used for sound sequence analysis in speech and have the particular appeal of temporal flexibility that goes beyond temporal matching [148]. Duan et al. in [48] used Song Scope software [1], which utilizes HMMs for bird detection, in a comparison against the threshold methods used by Raven. Song Scope is aimed at detecting call structures, which is a different approach to Raven. The Song Scope classification algorithms are based on HMMs using spectral feature vectors similar to MFCCs, as these methods have been proven to work effectively in robust speech recognition applications [2, 48]. However, this approach is very sensitive to the purity of syllables. If syllables are polluted by non-target species or background noise, recognition accuracy suffers. Using Song Scope effectively requires some background knowledge of signal processing to understand and set up the parameters. Song Scope also supports batch processing to deal with large datasets [48]. Graciarena et al. [57] used a simple VAD system, with acoustic models trained with bird vocalization data, to segment their bird recordings. Bird vocalization segments were extracted from CD waveforms. Segments of bird vocalizations were identified, with background noise and very short call segments discarded. Wei Chu [31] also used HMMs to detect Robin syllable boundaries.
